io.smartdatalake.workflow.action.sparktransformer
Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
the factory (object) for this class.
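The companion-object pattern described above can be sketched as follows. This is a minimal illustration, not the framework's actual API: the `Config` alias, the simplified `FromConfigFactory` and `ParsableFromConfig` traits, and the `MyTransformer` class are hypothetical stand-ins so that the example compiles on its own.

```scala
object FactorySketch {
  // Stand-in for a parsed configuration (assumption; the framework uses its own config type).
  type Config = Map[String, String]

  // Hypothetical, simplified stand-in for FromConfigFactory.
  trait FromConfigFactory[CO] {
    def fromConfig(config: Config): CO
  }

  // Hypothetical stand-in for the trait that declares `factory`.
  trait ParsableFromConfig[CO] {
    def factory: FromConfigFactory[CO]
  }

  // Example class: `factory` returns its own companion object.
  case class MyTransformer(name: String, options: Map[String, String])
      extends ParsableFromConfig[MyTransformer] {
    override def factory: FromConfigFactory[MyTransformer] = MyTransformer
  }

  // The companion object implements the factory and knows how to parse the class.
  object MyTransformer extends FromConfigFactory[MyTransformer] {
    override def fromConfig(config: Config): MyTransformer =
      MyTransformer(
        name = config.getOrElse("name", "myTransformer"),
        options = config - "name" // every remaining key is treated as an option
      )
  }
}
```

Returning the companion object keeps parsing logic next to the class it builds, so a generic config parser only needs an instance's `factory` to find the right parser.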
Function to be implemented to define the transformation between many inputs and many outputs (n:m). See also DfsTransformer.transform().
Options specified in the configuration for this transformation, including evaluated runtimeOptions
Optional function to implement validations in prepare phase.
Function to be implemented to define the transformation between many inputs and many outputs (n:m)
id of the action which executes this transformation. This is mainly used to prefix error messages.
partition values to transform
Map of (dataObjectId, DataFrame) tuples available as input
Map of transformed (dataObjectId, DataFrame) tuples
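To make the n:m shape concrete, here is a small sketch of a transformation that takes two inputs and produces two outputs, driven by an option. It is an illustration under stated assumptions: `DataFrame` is a plain-collection stand-in (so the example runs without Spark), `PartitionValues` is a simplified stand-in, and the function name `transformSketch`, the data object ids, and the `minAmount` option are all hypothetical.

```scala
object TransformSketch {
  // Stand-in for a Spark DataFrame so the sketch runs without Spark (assumption).
  type DataFrame = Seq[Map[String, Any]]
  // Simplified stand-in for the framework's partition values type (assumption).
  case class PartitionValues(elements: Map[String, Any])

  // Hypothetical n:m transformation: two inputs ("orders", "customers"),
  // two outputs ("validOrders", "rejectedOrders").
  def transformSketch(
      actionId: String,
      partitionValues: Seq[PartitionValues],
      dfs: Map[String, DataFrame],
      options: Map[String, String]
  ): Map[String, DataFrame] = {
    // Look up an input by dataObjectId; the action id prefixes the error message.
    def input(id: String): DataFrame =
      dfs.getOrElse(id, throw new IllegalArgumentException(s"($actionId) missing input '$id'"))

    val minAmount = options.getOrElse("minAmount", "0").toInt
    val knownCustomers = input("customers").map(_("id")).toSet

    // An order is valid if its customer exists and its amount reaches the threshold.
    val (valid, rejected) = input("orders").partition { row =>
      knownCustomers.contains(row("customerId")) &&
        row("amount").asInstanceOf[Int] >= minAmount
    }

    // Outputs are keyed by dataObjectId, mirroring the input map.
    Map("validOrders" -> valid, "rejectedOrders" -> rejected)
  }
}
```

With real Spark DataFrames the body would use joins and filters instead of collection operations, but the contract stays the same: a map of inputs by id goes in, a map of outputs by id comes out.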
Optional function to define the transformation of input to output partition values. For example, this makes it possible to implement aggregations where multiple input partitions are combined into one output partition. Note that by default the output partition values equal the input partition values, which should be correct for most use cases.
id of the action which executes this transformation. This is mainly used to prefix error messages.
partition values to transform
Map of input to output partition values. This allows partition values to be mapped forward and backward, which is needed in execution modes. Return None if the mapping is 1:1.
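The aggregation case mentioned above can be sketched like this: several daily input partitions are mapped to one monthly output partition. The sketch rests on assumptions: `PartitionValues` is a simplified stand-in, and the function name `transformPartitionValuesSketch`, the partition columns `dt` and `month`, and the option keys `inputCol` and `outputCol` are hypothetical.

```scala
object PartitionMappingSketch {
  // Simplified stand-in for the framework's partition values type (assumption).
  case class PartitionValues(elements: Map[String, Any])

  // Maps each daily input partition (dt=yyyyMMdd) to its monthly
  // output partition (month=yyyyMM).
  def transformPartitionValuesSketch(
      actionId: String,
      partitionValues: Seq[PartitionValues],
      options: Map[String, String]
  ): Option[Map[PartitionValues, PartitionValues]] = {
    val inputCol = options.getOrElse("inputCol", "dt")
    val outputCol = options.getOrElse("outputCol", "month")

    val mapping = partitionValues.map { pv =>
      val day = pv.elements
        .getOrElse(
          inputCol,
          throw new IllegalArgumentException(
            s"($actionId) partition column '$inputCol' missing in $pv"
          )
        )
        .toString
      // The first six characters of yyyyMMdd are the month (yyyyMM).
      pv -> PartitionValues(Map(outputCol -> day.take(6)))
    }.toMap

    // A non-1:1 mapping is returned explicitly; a 1:1 mapping would return None.
    Some(mapping)
  }
}
```

Because the result is keyed by input partition, it can be read forward (input to output) directly and backward (output to inputs) by grouping the entries by value.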
Optional function to define the transformation of input to output partition values. For example, this makes it possible to implement aggregations where multiple input partitions are combined into one output partition. Note that by default the output partition values equal the input partition values, which should be correct for most use cases. See also DfsTransformer.transformPartitionValues().
Options specified in the configuration for this transformation, including evaluated runtimeOptions
Interface to implement Spark-DataFrame transformers working with many inputs and many outputs (n:m). This trait extends DfSparkTransformer to pass a map of options as a parameter to the transform function. It is mainly used by custom transformers.