io.smartdatalake.workflow.action.sparktransformer
Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
the factory (object) for this class.
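The companion-object pattern described above can be sketched as follows. Note that `Config`, `FromConfigFactory` and `ParsableFromConfig` here are simplified, self-contained stand-ins for the SmartDataLakeBuilder originals (the real `fromConfig` also takes an implicit instance registry), so this is an illustration of the pattern, not the exact API.

```scala
// Simplified stand-ins for the SmartDataLakeBuilder traits (not the real API).
case class Config(values: Map[String, String])

trait FromConfigFactory[CO] {
  def fromConfig(config: Config): CO
}

trait ParsableFromConfig[CO] {
  // Returns the factory that can parse this type; typically the companion object.
  def factory: FromConfigFactory[CO]
}

case class MyTransformer(name: String) extends ParsableFromConfig[MyTransformer] {
  // The factory is the companion object of the implementing class.
  override def factory: FromConfigFactory[MyTransformer] = MyTransformer
}

// The companion object in turn implements FromConfigFactory.
object MyTransformer extends FromConfigFactory[MyTransformer] {
  override def fromConfig(config: Config): MyTransformer =
    MyTransformer(config.values.getOrElse("name", "unnamed"))
}
```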
Optional function to implement validations in the prepare phase.
Function to be implemented to define the transformation between an input and an output DataFrame (1:1)
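A 1:1 transform in this spirit can be sketched as follows. To keep the example self-contained, a "DataFrame" is modelled here as a plain Seq of rows instead of a Spark DataFrame, and the `filterCol` option key is purely hypothetical; the real trait method also receives a SparkSession and the id of the input DataObject.

```scala
object TransformSketch {
  // Stand-in for a DataFrame row; the real transform works on Spark DataFrames.
  type Row = Map[String, String]

  def transform(options: Map[String, String], df: Seq[Row], dataObjectId: String): Seq[Row] = {
    // Illustrative transformation: keep only rows where the column named by
    // the (hypothetical) option "filterCol" has the value "true".
    val filterCol = options.getOrElse("filterCol", "active")
    df.filter(row => row.get(filterCol).contains("true"))
  }
}
```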
Optional function to define the transformation of input to output partition values. This enables, for example, implementing aggregations where multiple input partitions are combined into one output partition. Note that the default is input = output partition values, which is correct for most use cases.
id of the action which executes this transformation. This is mainly used to prefix error messages.
partition values to transform
Map of input to output partition values. This allows mapping partition values forward and backward, which is needed in some execution modes. Return None if the mapping is 1:1.
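An aggregating partition mapping of this kind can be sketched as follows; `PartitionValues` here is a minimal stand-in for the SmartDataLakeBuilder class of the same name, and the `dt`/`month` partition columns are assumptions for the example.

```scala
// Minimal stand-in for SmartDataLakeBuilder's PartitionValues.
case class PartitionValues(elements: Map[String, String])

object PartitionMappingSketch {
  // Aggregation example: map daily input partitions (dt=yyyy-mm-dd) to monthly
  // output partitions (month=yyyy-mm), so several input partitions share one
  // output partition. A function like this would return None for a 1:1 mapping.
  def dayToMonth(partitionValues: Seq[PartitionValues]): Option[Map[PartitionValues, PartitionValues]] =
    Some(partitionValues.map { pv =>
      val day = pv.elements("dt")                        // e.g. "2021-06-15"
      pv -> PartitionValues(Map("month" -> day.take(7))) // -> "2021-06"
    }.toMap)
}
```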
Options specified in the configuration for this transformation, including evaluated runtimeOptions
Configuration of a custom Spark DataFrame transformation between one input and one output (1:1), implemented as a Java/Scala class. Define a transform function which receives a DataObjectId, a DataFrame and a map of options, and returns a DataFrame. The Java/Scala class has to implement the interface CustomDfTransformer.
name of the transformer
Optional description of the transformer
class name implementing trait CustomDfTransformer
Options to pass to the transformation
Optional tuples of [key, Spark SQL expression] to be added as additional options when executing the transformation. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
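Putting the parameters together, a transformer of this type might be configured roughly as follows inside an action's transformers list. This is a hedged sketch: the class name and option keys are placeholders, and the runtime expression assumes DefaultExpressionData exposes a runId field.

```hocon
transformers = [{
  type = ScalaClassDfTransformer
  name = myTransformer
  description = "example custom transformation"
  className = com.example.MyCustomTransformer       # placeholder class
  options = { filterCol = active }                  # hypothetical option key
  runtimeOptions = { runId = "runId" }              # Spark SQL expression over DefaultExpressionData
}]
```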