The Data Flow Library

Learn about the Data Flow Library, including reusable Spark application templates and application security. Also learn how to create, view, edit, and delete applications, and how to apply arguments or parameters.

Reusable Spark Application Templates

Data Flow Applications are infinitely reusable Spark application templates. They consist of a Spark application, its dependencies, default parameters, and a default run-time resource specification. Once a Spark developer creates a Data Flow Application, anyone can use it without worrying about the complexities of deploying, configuring, or running it. They can use it through Spark analytics in custom dashboards, reports, scripts, or REST API calls.

Every time you invoke a Data Flow Application, you create a Data Flow Run. It fills in the details of the application template and launches it on a specific set of IaaS resources.
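
For example, invoking an Application programmatically with the OCI Python SDK looks roughly like the following sketch. It assumes the SDK is installed and configured, and the OCIDs are placeholders:

# Create a Data Flow Run from an existing Application (a hedged sketch;
# the OCIDs below are placeholders).
import oci

config = oci.config.from_file()
df_client = oci.data_flow.DataFlowClient(config)

run_details = oci.data_flow.models.CreateRunDetails(
    compartment_id="ocid1.compartment.oc1..example",
    application_id="ocid1.dataflowapplication.oc1..example",
    display_name="my-first-run",
)

run = df_client.create_run(run_details).data
print(run.id, run.lifecycle_state)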

View Data Flow Applications

View Applications in Data Flow

From the Oracle Cloud Infrastructure Console, click Data Flow and then Applications, or from the Data Flow Dashboard, click Applications in the left-hand menu. The resulting page displays a table of the applications. It lists the name of each application, along with the language (Python, SQL, Java, or Scala), the owner, when it was created, and when it was last updated. To Edit an Application, or just to see more detailed information about it, either click the name of the application, or select Edit from the menu at the end of that application's row in the table. This menu also has options to Run or Delete the application.

If you click the Application name, the Details page for that application is displayed. The Detail Information tab displays the resource configuration and application configuration details. There is also a Tags tab that displays the tags applied to the application. You can use Edit, Run, or Delete to change the application.

Filter Applications in Data Flow
You can filter, search, and sort the list of applications in a number of ways:
  • In the left-hand menu is a filters section. You can filter on the State of your applications and the Language used; both are drop-down lists. You can also set a date range during which applications were updated, using the Updated Start Date and Updated End Date fields. Finally, you can enter all or part of an application name in the Name Prefix field to filter by application name. You can filter on one, some, or all of these options, and you can clear all these fields to remove the filters.
  • The State filter gives the options of Any state, Accepted, In Progress, Canceling, Canceled, Failed, or Succeeded.
  • The Language filter lets you filter by All, Java, Python, SQL, or Scala.
  • The Updated Start Date and Updated End Date fields let you pick a date from a calendar, along with a time (UTC). The calendar displays the current month, but lets you navigate to previous months. There are also quick links for choosing today's date, yesterday's date, or the past three days.
  • If you have applied tags to your applications, you can filter on these tags in the Tag Filters section. You can clear these tag filters.

These filtering options also help you find an application when you can't remember its specifics. For example, you know it was created last week but can't remember exactly when.

Sort Applications in Data Flow

You can sort the list of applications by Created date, either ascending or descending.
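
If you prefer to work programmatically, the same list can be retrieved with the OCI Python SDK. This is a hedged sketch; the compartment OCID is a placeholder, and the exact summary field names can vary slightly between SDK versions:

import oci

config = oci.config.from_file()
df_client = oci.data_flow.DataFlowClient(config)

# Each application summary carries the details shown in the console table:
# name, language, owner, and creation time.
apps = df_client.list_applications(
    compartment_id="ocid1.compartment.oc1..example"
).data

for app in apps:
    print(app.display_name, app.language, app.time_created)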

Create a Java or Scala Data Flow Application

  1. Upload the JAR artifact to Oracle Object Storage.
  2. On the Applications page, click Create.
  3. In the Create Application panel, select JAVA or SCALA as appropriate from the Language options.
  4. Create the application and any necessary items.
    1. Provide a Name for the application.
    2. (Optional) Enter a Description which will help you search for it.
    3. Select the Spark Version from the drop-down list.
    4. Pick the Driver Type from the drop-down list.
    5. Select the Executor Type from the drop-down list.
    6. Enter the Number of Executors you need.
    Tips For Your Application's Default Size helps you calculate the number of resources needed.
  5. Configure the Application.
    1. Enter the File URL to the application, using this format:
       oci://<bucket_name>@<objectstore_namespace>/<file_name>
    2. Enter the Main Class Name.
    3. (Optional) Enter any Arguments. For example, if the Spark application expects two command-line parameters, one for the input and one for the output, enter the following in the Arguments field:
      ${input} ${output}
      You are prompted for default values. It's a good idea to enter them now.
      Note

      Do not include either "$" or "/" characters in the parameter name or value.
  6. (Optional) Add advanced configuration options.
    1. Click Show Advanced Options.
    2. Enter the Key of the configuration property and a Value.
    3. Click + Another Property to add another configuration property.
    4. Repeat the previous two steps until you've added all the configuration properties.
    5. Populate Application Log Location with where you want to save the log files.
    6. Choose Network Access.
      1. If you are Attaching a Private Endpoint to Data Flow, click the Secure Access to Private Subnet radio button. Select the private endpoint from the resulting drop-down list. Click Change Compartment if it is in a different compartment to the Application.
        Note

        You cannot use an IP address to connect to the private endpoint; you must use the FQDN.
      2. If you are not using a private endpoint, click the Internet Access (No Subnet) radio button.
  7. Click Create.

If you want to change the values for Language, Name, or File URL in the future, you can Edit an Application to do this. You can only change Language between Java and Scala; you cannot change it to Python or SQL.
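
The console fields described above map onto the Data Flow API. The following is a hedged sketch using the OCI Python SDK; the OCIDs, Spark version, shapes, and class name are placeholders, and field names can differ slightly between SDK versions:

import oci

config = oci.config.from_file()
df_client = oci.data_flow.DataFlowClient(config)

# Sketch of a Java Application definition; Arguments and Parameters can
# also be supplied here (see Application Parameters later in this topic).
app_details = oci.data_flow.models.CreateApplicationDetails(
    compartment_id="ocid1.compartment.oc1..example",
    display_name="my-java-app",
    language="JAVA",
    spark_version="3.2.1",                 # a version from the Spark Version list
    driver_shape="VM.Standard2.1",         # the Driver Type
    executor_shape="VM.Standard2.1",       # the Executor Type
    num_executors=2,                       # the Number of Executors
    file_uri="oci://<bucket_name>@<objectstore_namespace>/<file_name>",
    class_name="example.MainClass",        # the Main Class Name (placeholder)
)

app = df_client.create_application(app_details).data
print(app.id)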

Create a PySpark Data Flow Application

  1. Upload a PySpark script to Oracle Object Storage.
  2. On the Applications page, click Create.
  3. In the Create Application panel, select PYTHON from the Language options.
  4. Create a Python application in Data Flow.
    1. Enter a Name for the application.
    2. (Optional) Enter an appropriate Description which will help you search for it.
    3. Select the Spark Version from the drop-down list.
    4. Select the Driver Type from the drop-down list.
    5. Select the Executor Type from the drop-down list.
    6. Enter the Number of Executors you need.
    Tips For Your Application's Default Size helps you calculate the number of resources needed.
  5. Configure the application.
    1. Enter the File URL to the application, using this format:
       oci://<bucket_name>@<objectstore_namespace>/<file_name>
    2. Enter the name of the Main Python File.
    3. (Optional) Add any Arguments.
      Note

      Do not include either "$" or "/" characters in the parameter name or value.
  6. (Optional) Add advanced configuration options.
    1. Click Show Advanced Options.
    2. Enter the Key of the configuration property and a Value.
    3. Click + Another Property to add another configuration property.
    4. Repeat the previous two steps until you've added all the configuration properties.
    5. Populate Application Log Location with where you want to save the log files.
    6. Choose Network Access.
      1. If you are Attaching a Private Endpoint to Data Flow, click the Secure Access to Private Subnet radio button. Select the private endpoint from the resulting drop-down list. Click Change Compartment if it is in a different compartment to the Application.
        Note

        You cannot use an IP address to connect to the private endpoint; you must use the FQDN.
      2. If you are not using a private endpoint, click the Internet Access (No Subnet) radio button.
  7. Click Create.

If you want to change the values for Name and File URL in the future you can Edit an Application to do this. You cannot change Language if Python is selected.
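
For reference, here is a minimal sketch of what the Main Python File itself might look like. It assumes the Arguments field contains ${input} ${output}, so the script receives the resolved values as ordinary command-line arguments; the CSV-to-Parquet logic is only illustrative:

# example_app.py: minimal PySpark script for a Data Flow Application.
import sys

from pyspark.sql import SparkSession


def main():
    # Data Flow passes the resolved Arguments as command-line arguments.
    input_path, output_path = sys.argv[1], sys.argv[2]

    spark = SparkSession.builder.appName("example-dataflow-app").getOrCreate()

    df = spark.read.csv(input_path, header=True)        # assumes CSV input
    df.write.mode("overwrite").parquet(output_path)      # writes Parquet output

    spark.stop()


if __name__ == "__main__":
    main()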

Create a SQL Data Flow Application

  1. Upload a SQL script to Oracle Object Storage.
  2. On the Applications page, click Create.
  3. In the Create Application panel, select SQL from the Language options.
  4. Create the application.
    1. Enter a Name.
    2. (Optional) Enter a Description with which you can easily find the application and its runs.
    3. Select the Spark Version from the drop-down list.
    4. Select the Driver Type from the drop-down list.
    5. Select the Executor Type from the drop-down list.
    6. Enter the Number of Executors you need.
    Tips For Your Application's Default Size helps you calculate the number of resources needed.
  5. Configure the application.
    1. Enter the File URL to the SQL script using the format:
       oci://<bucket_name>@<objectstore_namespace>/<file_name>
    2. (Optional) Create any Parameters.
      1. Enter the Name and the Value of the parameter.
        Note

        Do not include either "$" or "/" characters in the parameter name or value.
      2. Click + Another Parameter to add another parameter.
      3. Repeat the previous two steps until you've added all the parameters.
  6. (Optional) Add advanced configuration options.
    1. Click Show Advanced Options.
    2. Enter the Key of the configuration property and a Value.
    3. Click + Another Property to add another configuration property.
    4. Repeat the previous two steps until you've added all the configuration properties.
    5. Populate Application Log Location with where you want to save the log files.
    6. Choose Network Access.
      1. If you are Attaching a Private Endpoint to Data Flow, click the Secure Access to Private Subnet radio button. Select the private endpoint from the resulting drop-down list.
        Note

        You cannot use an IP address to connect to the private endpoint; you must use the FQDN.
      2. If you are not using a private endpoint, click the Internet Access (No Subnet) radio button.
  7. Click Create.

If you want to change the values for Name and File URL in the future you can Edit an Application to do this. You cannot change Language if SQL is selected.

Application Parameters

If the Spark application requires one or more parameters at run time, Data Flow lets you provide the parameter name and a default value when you create the application.

For Java, Python, and Scala, during the application creation step, enter the parameter name in the Arguments field, enclose it in curly brackets and precede it with the $ symbol. For example:
${MyParameter}
Once you provide the parameter name in the Arguments field, a Parameters section with two new fields appears below it: Name and Default Value. The Name field is not editable, and contains the name of your parameter. The Default Value field is editable, and you can enter a default value for the parameter. The default value can be overridden at run time.
If you have more than one parameter, enter each parameter name, one after the other, in the Arguments field. Data Flow expects a space between each one to delimit the argument values, but be careful: the parameters are still parsed if you leave out the space. For example, if you have three parameters called Parameter1, Parameter2, and Parameter3, with values of Value1, Value2, and Value3, and enter them as follows:
${Parameter1}${Parameter2} ${Parameter3}
then the resulting arguments provided to Data Flow have only two values:
Value1Value2 Value3
which might not be what you want.

Each parameter has its own Name and Default Value fields.

For SQL applications, parameter entry does not use the ${MyParameter} format. Instead, the Parameters section provides a Name text field and a corresponding Value field. Enter the name of the parameter and its default value in the corresponding fields. If you need to add multiple parameters, click the +Add Parameters button.

Application Parameters when Running Applications

Application Parameters allow you to re-use your Data Flow Applications in a variety of different ways. See Run Applications for more information.
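
For example, assuming an Application that declares input and output parameters, a run-time override through the OCI Python SDK might look like the following sketch; the OCIDs are placeholders and the paths use the same placeholder format as above:

import oci

config = oci.config.from_file()
df_client = oci.data_flow.DataFlowClient(config)

# Override the Application's default parameter values for this Run only.
run_details = oci.data_flow.models.CreateRunDetails(
    compartment_id="ocid1.compartment.oc1..example",
    application_id="ocid1.dataflowapplication.oc1..example",
    display_name="run-with-overrides",
    parameters=[
        oci.data_flow.models.ApplicationParameter(
            name="input",
            value="oci://<bucket_name>@<objectstore_namespace>/<file_name>"),
        oci.data_flow.models.ApplicationParameter(
            name="output",
            value="oci://<bucket_name>@<objectstore_namespace>/<file_name>"),
    ],
)

run = df_client.create_run(run_details).data
print(run.lifecycle_state)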

Tips For Your Application's Default Size

Every time you run a Data Flow Application you specify a size and number of executors which, in turn, determine the number of OCPUs used to run your Spark application. An OCPU is equivalent to a CPU core, which itself is equivalent to 2 vCPUs. Refer to Compute Shapes for more information on how many OCPUs each shape contains.

A rough rule of thumb is to assume 10 GB of data processed per OCPU per hour. Optimized data formats like Parquet appear to run much faster because only a small subset of the data needs to be processed.

As an example, if we want to process 1 TB of data with an SLA of 30 minutes, we should expect to use about 200 OCPUs: 1,000 GB at 10 GB per OCPU per hour is 100 OCPU-hours, and 100 OCPU-hours divided by 0.5 hours is 200 OCPUs.

There are many ways of allocating 200 OCPUs. For example, you can select an executor shape of VM.Standard2.8 and 25 total executors for 8 * 25 = 200 total OCPUs.

This formula is a rough estimate, and your run times will vary. You can better estimate your actual workload's processing rate by loading your Application and viewing the history of Application Runs. This history shows the number of OCPUs used, the total data processed, and the run time, which lets you estimate the resources you need to meet your SLAs. From there, it is up to you to estimate the amount of data a Run will process and size the Run appropriately.
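
The rule of thumb above reduces to simple arithmetic. The sketch below only encodes the stated assumption of 10 GB per OCPU per hour; the 8-OCPU shape is the VM.Standard2.8 example from above:

# Back-of-the-envelope sizing based on 10 GB processed per OCPU per hour.
def estimate_ocpus(data_gb, sla_hours, gb_per_ocpu_hour=10):
    ocpu_hours = data_gb / gb_per_ocpu_hour
    return ocpu_hours / sla_hours

ocpus = estimate_ocpus(data_gb=1000, sla_hours=0.5)   # 1 TB in 30 minutes
print(ocpus)          # 200.0 OCPUs
print(ocpus / 8)      # 25.0 executors of an 8-OCPU shape such as VM.Standard2.8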

Develop Data Flow-compatible Spark Applications

Data Flow supports running ordinary Spark applications and has no special design-time requirements. We recommend that you develop your Spark application using Spark local mode on your laptop or similar environment. When development is complete, upload the application to Oracle Cloud Infrastructure Object Storage, and run it at scale using Data Flow.
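
For example, during development you can run the same logic in Spark local mode against a small sample before uploading it to Object Storage; the sample paths below are placeholders:

# Develop and test locally using Spark local mode before running in Data Flow.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("local[*]")                # local mode: use all local cores
         .appName("local-dev-test")
         .getOrCreate())

df = spark.read.csv("./sample_input.csv", header=True)    # small local sample
df.show(5)
df.write.mode("overwrite").parquet("./sample_output")

spark.stop()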

Delete Applications

To delete an Application in Data Flow,
  1. Open the Applications page.
  2. Find the application to be deleted, and select the Delete menu item from the menu on the Application's row in the table.
Alternatively, you can click the Application's name and, on the Application's Details page, click Delete. You must own the Application or be a Data Flow Administrator to delete the Application.

When you delete an Application, its associated Runs are not deleted; the output and logs for those Runs remain available.

Application Security

Spark applications running in Data Flow use the same IAM permissions as the user who initiates the run. The Data Flow service creates a security token in the Spark cluster that allows it to assume the identity of the running user. This means the Spark application can transparently access data based on the end user's IAM permissions. There is no need to hard-code credentials in your Spark application when you access IAM-compatible systems.

If the service you are contacting is not IAM-compatible, you need to use a credential management or key management solution such as Oracle Cloud Infrastructure Key Management.
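
For example, rather than hard-coding a database password, your application can fetch it from an OCI Vault secret at run time. The sketch below uses the OCI Python SDK Secrets service and authenticates with a local configuration file for simplicity; the secret OCID is a placeholder and error handling is omitted:

import base64

import oci

config = oci.config.from_file()
secrets_client = oci.secrets.SecretsClient(config)

# Fetch the current version of a secret stored in OCI Vault.
bundle = secrets_client.get_secret_bundle(
    secret_id="ocid1.vaultsecret.oc1..example"
).data

# The secret content is returned base64-encoded.
password = base64.b64decode(bundle.secret_bundle_content.content).decode("utf-8")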

Learn more about Oracle Cloud Infrastructure IAM in the IAM Documentation.

Adding Third-Party Libraries to Data Flow Applications

Your PySpark applications might need custom dependencies in the form of Python wheels or virtual environments. Your Java or Scala applications might need extra JAR files that you can't, or don't want to, bundle in a fat JAR. You might also want to include native code or other assets to make available within your Spark runtime.

Data Flow allows you to provide a ZIP archive in addition to your application. It is installed on all Spark nodes before launching the application. If you construct it properly, the Python libraries will be added to your runtime, and the JAR files will be added to the Spark classpath. The libraries added are completely isolated to one Run. That means they do not interfere with other concurrent Runs or subsequent Runs. Only one archive can be provided per Run.

Anything in the archive must be compatible with the Data Flow runtime. For example, Data Flow runs on Oracle Linux using particular versions of Java and Python. Binary code compiled for other operating systems, or JAR files compiled for other Java versions, might cause your Run to crash. Data Flow provides tools to help you build archives with compatible software. However, these archives are ordinary Zip files, so you are free to create them any way you want. If you use your own tools, you are responsible for ensuring compatibility.

Dependency archives, like your Spark applications, are uploaded to Oracle Object Storage. Your Data Flow Application definition contains a link to this archive, which can be overridden at run time. When you run your Application, the archive is downloaded and installed before the Spark job runs. The archive is completely private to the Run. This means, for example, that you can concurrently run two different instances of the same Application, with different dependencies, without any conflicts. Dependencies do not persist between Runs, so there won't be any problems with conflicting versions for other Spark applications that you run.
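
The archive link can also be set, or overridden, when you create a Run. The following sketch assumes the OCI Python SDK exposes this as archive_uri on CreateRunDetails, which you should confirm against your SDK version; the OCIDs and object path are placeholders:

import oci

config = oci.config.from_file()
df_client = oci.data_flow.DataFlowClient(config)

# Point this Run at a specific dependency archive in Object Storage.
run_details = oci.data_flow.models.CreateRunDetails(
    compartment_id="ocid1.compartment.oc1..example",
    application_id="ocid1.dataflowapplication.oc1..example",
    display_name="run-with-custom-archive",
    archive_uri="oci://<bucket_name>@<objectstore_namespace>/archive.zip",
)

run = df_client.create_run(run_details).data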

Build a Dependency Archive Using the Data Flow Dependency Packager
  1. Download and install Docker.
  2. Download the packager tool image:
    docker pull phx.ocir.io/oracle/dataflow/dependency-packager:1.0.0
  3. For Python dependencies, create a requirements.txt file. For example, it might look like:
    numpy==1.18.1
    pandas==1.0.3
    pyarrow==0.14.0
    Note

    Do not include pyspark or py4j. These dependencies are provided by Data Flow, and including them causes your Runs to fail.
    The Data Flow Dependency Packager uses Python's pip tool to install all dependencies. If you have Python wheels that can't be downloaded from public sources, place them in a directory beneath where you build the package. Refer to them in requirements.txt with a prefix of /opt/dataflow/. For example:
    /opt/dataflow/<my-python-wheel.whl>

    where <my-python-wheel.whl> represents the name of your Python wheel. Pip sees it as a local file and installs it normally.

  4. For Java dependencies, create a file called packages.txt. For example, it might look like:
    ml.dmlc:xgboost4j:0.90
    ml.dmlc:xgboost4j-spark:0.90
    https://repo1.maven.org/maven2/com/nimbusds/nimbus-jose-jwt/8.11/nimbus-jose-jwt-8.11.jar

    The Data Flow Dependency Packager uses Apache Maven to download dependency JAR files. If you have JAR files that cannot be downloaded from public sources, place them in a local directory beneath where you build the package. Any JAR files in any subdirectory where you build the package are included in the archive.

  5. Use the Docker container to create the archive. Use this command on MacOS or Linux:
    docker run --rm -v $(pwd):/opt/dataflow -it phx.ocir.io/oracle/dataflow/dependency-packager:1.0.0
    If using Windows command prompt as the Administrator, use this command:
    docker run --rm -v %CD%:/opt/dataflow -it phx.ocir.io/oracle/dataflow/dependency-packager:1.0.0
    If using Windows Powershell as the Administrator, use this command:
    docker run --rm -v ${PWD}:/opt/dataflow -it phx.ocir.io/oracle/dataflow/dependency-packager:1.0.0
    These commands create a file called archive.zip.
  6. (Optional) Add static content. You might want to include other content in your archive, for example a data file, an ML model file, or an executable that your Spark program calls at runtime. You do this by adding files to archive.zip after you created it in the previous step.
    For Java applications:
    1. Unzip archive.zip.
    2. Add the JAR files to the java/ directory only.
    3. Zip the file.
    4. Upload it to your object store.
    For Python applications:
    1. Unzip archive.zip.
    2. Add your local modules to only these three subdirectories of the python/ directory:
       python/lib
       python/lib32
       python/lib64
    3. Zip the file.
    4. Upload it to your object store.
    Note

    Only these four directories are allowed for storing your Java and Python dependencies.
    When your Data Flow application runs, the static content is available on any node under the directory where you chose to place it. For example, if you added files under python/lib/ in your archive, they are available in the /opt/dataflow/python/lib/ directory on any node.
  7. Upload archive.zip to your object store.
  8. Add the library to your application. See Create a Java or Scala Data Flow Application or Create a PySpark Data Flow Application for how to do this.
The Structure of the Dependency Archive

Dependency archives are ordinary ZIP files. Advanced users might choose to build archives with their own tools rather than using the Data Flow Dependency Packager. A properly constructed dependency archive has this general outline:

python
python/lib
python/lib/python3.6/<your_library1>
python/lib/python3.6/<your_library2>
python/lib/python3.6/<...>
python/lib/python3.6/<your_libraryN>
python/lib/user
python/lib/user/<your_static_data>
java
java/<your_jar_file1>
java/<...>
java/<your_jar_fileN>
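
If you build the archive with your own tools, a minimal sketch of assembling the layout above in Python might look like this; the local file names are placeholders, and you remain responsible for runtime compatibility:

# Assemble a dependency archive by hand, following the layout above.
import zipfile

with zipfile.ZipFile("archive.zip", "w", zipfile.ZIP_DEFLATED) as archive:
    # Python libraries go under python/lib/python3.6/
    archive.write("my_local_module.py",
                  arcname="python/lib/python3.6/my_local_module.py")
    # Static data goes under python/lib/user/
    archive.write("model.bin", arcname="python/lib/user/model.bin")
    # JAR dependencies go under java/
    archive.write("my-library.jar", arcname="java/my-library.jar")
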
Example Requirements.txt and Packages.txt Files
This example of a requirements.txt file includes the Oracle Cloud Infrastructure SDK for Python version 2.14.3 in a Data Flow Application:
-i https://pypi.org/simple
certifi==2020.4.5.1
cffi==1.14.0
configparser==4.0.2
cryptography==2.8
oci==2.14.3
pycparser==2.20
pyopenssl==19.1.0
python-dateutil==2.8.1
pytz==2020.1
six==1.15.0
This example of a requirements.txt file includes a mix of PyPI sources, web sources, and local sources for Python wheel files:
-i https://pypi.org/simple
blis==0.4.1
catalogue==1.0.0
certifi==2020.4.5.1
chardet==3.0.4
cymem==2.0.3
https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.2.0/en_core_web_sm-2.2.0.tar.gz#egg=en-core-web-sm
idna==2.9
importlib-metadata==1.6.0 ; python_version < '3.8'
murmurhash==1.0.2
numpy==1.18.3
plac==1.1.3
preshed==3.0.2
requests==2.23.0
spacy==2.2.4
srsly==1.0.2
thinc==7.4.0
tqdm==4.45.0
urllib3==1.25.9
wasabi==0.6.0
zipp==3.1.0
/opt/dataflow/mywheel-0.1-py3-none-any.whl
To connect to Oracle databases such as ADW, you need to include Oracle JDBC JAR files. Download and extract the compatible driver JAR files into a directory below where you build the package. For example, to package the Oracle 18.3 (18c) JDBC driver, ensure all these JAR files are present:
ojdbc8-18.3.jar
oraclepki-18.3.jar
osdt_cert-18.3.jar
osdt_core-18.3.jar
ucp-18.3.jar