com.coxautodata.waimak.dataflow.spark
Label to write
Base location of temporary folder
Destination path to put files in
Number of files to generate
Prefix of name of the file up to the filenumber and extension
Format to write (e.g. parquet, csv)
Options to pass to the DataFrameWriter
For representing the action
Destination path to put files in
Prefix of name of the file up to the filenumber and extension
An action is responsible for assessing itself and producing a DataFlowActionState, which the executors use to determine whether they can call performAction. It can also be used for progress monitoring. This allows for more custom actions without modifying the executors.
- the action will study the state of its inputs in order to generate a self-assessment
- an instance of DataFlowActionState
Format to write (e.g. parquet, csv)
Unique id of the action. Using it for adding behaviours can be problematic, because Interceptors are defined at a much later stage; for that reason, ActionSchedulers must NOT use this guid.
The unique identifiers for the inputs to this action
Label to write
Number of files to generate
Options to pass to the DataFrameWriter
The unique identifiers for the outputs to this action
Perform the action
the DataFlowEntities corresponding to the inputLabels
context of the flow in which this action runs
the action outputs (these must be declared in the same order as their labels in outputLabels)
If true, this action can only be executed when none of its inputs are empty (an input can be explicitly marked as empty). If false, execution can start with one or more empty inputs.
Interceptors must not override this property, as certain behaviours of the data flow (e.g. execution pools) will be associated with this scheduling guid. The ActionScheduler will also use this guid to track scheduled actions.
Base location of temporary folder
Write a file or files with a specific filename to a folder. Allows you to control the final output filename without the Spark-generated part UUIDs. The filename will be $filenamePrefix.extension if the number of files is 1, otherwise $filenamePrefix.$fileNumber.extension, where the file number is incremental and zero-padded.
Label to write
Base location of temporary folder
Destination path to put files in
Number of files to generate
Prefix of name of the file up to the filenumber and extension
Format to write (e.g. parquet, csv)
Options to pass to the DataFrameWriter
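The filename scheme described above can be sketched as a small pure function. This is an illustrative assumption, not the action's actual implementation; in particular, the padding width (three digits here) and the starting file number (0 here) are guesses:

```scala
// Sketch of the named-file scheme: one file gets "$filenamePrefix.$extension",
// multiple files get "$filenamePrefix.$fileNumber.$extension" with a
// zero-padded, incremental file number. Padding width and start index are
// assumptions for illustration only.
object NamedFileScheme {
  def fileNames(filenamePrefix: String, extension: String, numberOfFiles: Int): Seq[String] =
    if (numberOfFiles == 1) Seq(s"$filenamePrefix.$extension")
    else (0 until numberOfFiles).map(i => f"$filenamePrefix.$i%03d.$extension")
}
```

For example, a prefix of "report" with format csv and 3 files would yield report.000.csv, report.001.csv, report.002.csv under these assumptions.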