Package io.smartdatalake.workflow.action

package action

Type Members

  1. case class ActionMetadata(name: Option[String] = None, description: Option[String] = None, feed: Option[String] = None, tags: Seq[String] = Seq()) extends Product with Serializable

    Additional metadata for an Action. A usage sketch follows below.

    name: readable name of the Action
    description: description of the content of the Action
    feed: name of the feed this Action belongs to
    tags: optional custom tags for this object
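
    The following is a minimal Scala sketch of such metadata; the name, description, feed and tag values are hypothetical.

      import io.smartdatalake.workflow.action.ActionMetadata

      // hypothetical metadata values attached to an Action definition
      val metadata = ActionMetadata(
        name = Some("copy-customers"),
        description = Some("Copies raw customer files from stage to integration"),
        feed = Some("customers"),
        tags = Seq("daily", "pii")
      )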

  2. case class CopyAction(id: ActionId, inputId: DataObjectId, outputId: DataObjectId, deleteDataAfterRead: Boolean = false, transformer: Option[CustomDfTransformerConfig] = None, columnBlacklist: Option[Seq[String]] = None, columnWhitelist: Option[Seq[String]] = None, additionalColumns: Option[Map[String, String]] = None, filterClause: Option[String] = None, standardizeDatatypes: Boolean = false, breakDataFrameLineage: Boolean = false, persist: Boolean = false, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None)(implicit instanceRegistry: InstanceRegistry) extends SparkSubFeedAction with Product with Serializable

    Action to copy files (e.g. from stage to integration). A construction sketch follows below.

    inputId: input DataObject
    outputId: output DataObject
    deleteDataAfterRead: a flag to enable deletion of input partitions after copying
    transformer: optional custom transformation to apply
    columnBlacklist: remove all columns in the blacklist from the dataframe
    columnWhitelist: keep only columns in the whitelist in the dataframe
    additionalColumns: optional tuples of [column name, Spark SQL expression] to be added as additional columns to the dataframe. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
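
    A minimal Scala sketch of constructing a CopyAction programmatically. It assumes that ActionId and DataObjectId are the string-based id types from io.smartdatalake.config.SdlConfigObject, that InstanceRegistry has a no-argument constructor, and that the referenced DataObjects are registered in that registry; all ids are hypothetical. In a typical setup the action is declared in the configuration and created through the CopyAction companion object (a FromConfigFactory, see Value Members).

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.CopyAction

      // registry holding the configured DataObjects; assumed to be populated beforehand
      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // copy "stg-customers" to "int-customers", dropping a technical column on the way
      val copyCustomers = CopyAction(
        id = ActionId("copy-customers"),
        inputId = DataObjectId("stg-customers"),
        outputId = DataObjectId("int-customers"),
        columnBlacklist = Some(Seq("_corrupt_record"))
      )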

  3. case class CustomFileAction(id: ActionId, inputId: DataObjectId, outputId: DataObjectId, transformer: CustomFileTransformerConfig, deleteDataAfterRead: Boolean = false, filesPerPartition: Int = 10, breakFileRefLineage: Boolean = false, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None)(implicit instanceRegistry: InstanceRegistry) extends FileSubFeedAction with SmartDataLakeLogger with Product with Serializable

    Action to transform files between two Hadoop DataObjects. The transformation is executed in distributed mode on the Spark executors. A custom file transformer must be given that reads a file from Hadoop and writes it back to Hadoop. A construction sketch follows below.

    inputId: input DataObject
    outputId: output DataObject
    transformer: a custom file transformer that reads a file from a HadoopFileDataObject and writes it back to another HadoopFileDataObject
    deleteDataAfterRead: whether the input files should be deleted after successful processing
    filesPerPartition: number of files per Spark partition
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
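
    A sketch of constructing a CustomFileAction, under the same assumptions as the CopyAction sketch above. It additionally assumes that CustomFileTransformerConfig (from the customlogic package listed below) can reference a transformer implementation via a className option; that parameter name, the class name and all ids are assumptions.

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.CustomFileAction
      import io.smartdatalake.workflow.action.customlogic.CustomFileTransformerConfig

      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // transform each input file on the Spark executors using a custom transformer class
      val decompress = CustomFileAction(
        id = ActionId("decompress-files"),
        inputId = DataObjectId("raw-files"),
        outputId = DataObjectId("decompressed-files"),
        // className is assumed to point at an implementation of the custom file transformer interface
        transformer = CustomFileTransformerConfig(className = Some("com.example.DecompressTransformer")),
        filesPerPartition = 5
      )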

  4. case class CustomSparkAction(id: ActionId, inputIds: Seq[DataObjectId], outputIds: Seq[DataObjectId], transformer: CustomDfsTransformerConfig, breakDataFrameLineage: Boolean = false, persist: Boolean = false, mainInputId: Option[DataObjectId] = None, mainOutputId: Option[DataObjectId] = None, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None, recursiveInputIds: Seq[DataObjectId] = Seq(), inputIdsToIgnoreFilter: Seq[DataObjectId] = Seq())(implicit instanceRegistry: InstanceRegistry) extends SparkSubFeedsAction with Product with Serializable

    Action to transform data according to a custom transformer. Allows transforming multiple input and output dataframes. A construction sketch follows below.

    inputIds: input DataObjects
    outputIds: output DataObjects
    transformer: custom transformation to apply to multiple dataframes
    mainInputId: optional selection of the main inputId used for execution mode and partition values propagation. Only needed if there are multiple input DataObjects.
    mainOutputId: optional selection of the main outputId used for execution mode and partition values propagation. Only needed if there are multiple output DataObjects.
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
    recursiveInputIds: outputs of this action that are used as inputs of the same action
    inputIdsToIgnoreFilter: optional list of input ids for which filters (partition values & filter clause) are ignored
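
    A sketch of constructing a CustomSparkAction, under the same assumptions as the CopyAction sketch above. It additionally assumes that CustomDfsTransformerConfig (from the customlogic package) can reference a transformer implementation via a className option; that parameter name, the class name and all ids are assumptions.

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.CustomSparkAction
      import io.smartdatalake.workflow.action.customlogic.CustomDfsTransformerConfig

      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // join two inputs into one output dataframe using a custom transformer class;
      // the main input drives execution mode and partition value propagation
      val joinOrders = CustomSparkAction(
        id = ActionId("join-orders-customers"),
        inputIds = Seq(DataObjectId("int-orders"), DataObjectId("int-customers")),
        outputIds = Seq(DataObjectId("btl-orders-enriched")),
        transformer = CustomDfsTransformerConfig(className = Some("com.example.JoinOrdersTransformer")),
        mainInputId = Some(DataObjectId("int-orders"))
      )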

  5. case class DeduplicateAction(id: ActionId, inputId: DataObjectId, outputId: DataObjectId, transformer: Option[CustomDfTransformerConfig] = None, columnBlacklist: Option[Seq[String]] = None, columnWhitelist: Option[Seq[String]] = None, additionalColumns: Option[Map[String, String]] = None, filterClause: Option[String] = None, standardizeDatatypes: Boolean = false, ignoreOldDeletedColumns: Boolean = false, ignoreOldDeletedNestedColumns: Boolean = true, breakDataFrameLineage: Boolean = false, persist: Boolean = false, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None)(implicit instanceRegistry: InstanceRegistry) extends SparkSubFeedAction with Product with Serializable

    Action to deduplicate a subfeed. Deduplication keeps the last record for every key, even after it has been deleted in the source. It needs a transactional table with defined primary keys as output. A construction sketch follows below.

    inputId: input DataObject
    outputId: output DataObject
    transformer: optional custom transformation to apply
    columnBlacklist: remove all columns in the blacklist from the dataframe
    columnWhitelist: keep only columns in the whitelist in the dataframe
    additionalColumns: optional tuples of [column name, Spark SQL expression] to be added as additional columns to the dataframe. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
    ignoreOldDeletedColumns: if true, remove columns that no longer exist during schema evolution
    ignoreOldDeletedNestedColumns: if true, remove columns that no longer exist from nested data types during schema evolution. Keeping deleted columns in complex data types has a performance impact, as all future data has to be converted by a complex function.
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
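
    A sketch of constructing a DeduplicateAction, under the same assumptions as the CopyAction sketch above; the output DataObject is assumed to be a transactional table with primary keys defined, and all ids are hypothetical.

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.DeduplicateAction

      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // keep the last record per primary key in the output table
      val dedupCustomers = DeduplicateAction(
        id = ActionId("dedup-customers"),
        inputId = DataObjectId("stg-customers"),
        outputId = DataObjectId("int-customers"),
        // drop columns that have disappeared from the source schema, including nested ones
        ignoreOldDeletedColumns = true,
        ignoreOldDeletedNestedColumns = true
      )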

  6. abstract class FileSubFeedAction extends Action

  7. case class FileTransferAction(id: ActionId, inputId: DataObjectId, outputId: DataObjectId, deleteDataAfterRead: Boolean = false, overwrite: Boolean = true, breakFileRefLineage: Boolean = false, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None)(implicit instanceRegistry: InstanceRegistry) extends FileSubFeedAction with Product with Serializable

    Action to transfer files between SFTP, Hadoop and the local file system. A construction sketch follows below.

    inputId: input DataObject
    outputId: output DataObject
    deleteDataAfterRead: whether the input files should be deleted after successful processing
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
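
    A sketch of constructing a FileTransferAction, under the same assumptions as the CopyAction sketch above; the SFTP and staging DataObject ids are hypothetical.

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.FileTransferAction

      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // download files from an SFTP DataObject into a staging DataObject and
      // delete the source files after successful processing
      val downloadInvoices = FileTransferAction(
        id = ActionId("download-invoices"),
        inputId = DataObjectId("sftp-invoices"),
        outputId = DataObjectId("stg-invoices"),
        deleteDataAfterRead = true
      )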

  8. case class HistorizeAction(id: ActionId, inputId: DataObjectId, outputId: DataObjectId, transformer: Option[CustomDfTransformerConfig] = None, columnBlacklist: Option[Seq[String]] = None, columnWhitelist: Option[Seq[String]] = None, additionalColumns: Option[Map[String, String]] = None, standardizeDatatypes: Boolean = false, filterClause: Option[String] = None, historizeBlacklist: Option[Seq[String]] = None, historizeWhitelist: Option[Seq[String]] = None, ignoreOldDeletedColumns: Boolean = false, ignoreOldDeletedNestedColumns: Boolean = true, breakDataFrameLineage: Boolean = false, persist: Boolean = false, executionMode: Option[ExecutionMode] = None, executionCondition: Option[Condition] = None, metricsFailCondition: Option[String] = None, metadata: Option[ActionMetadata] = None)(implicit instanceRegistry: InstanceRegistry) extends SparkSubFeedAction with Product with Serializable

    Action to historize a subfeed. Historization creates a technical history of data by adding valid-from/valid-to columns. It needs a transactional table with defined primary keys as output. A construction sketch follows below.

    inputId: input DataObject
    outputId: output DataObject
    transformer: optional custom transformation to apply
    columnBlacklist: remove all columns in the blacklist from the dataframe
    columnWhitelist: keep only columns in the whitelist in the dataframe
    additionalColumns: optional tuples of [column name, Spark SQL expression] to be added as additional columns to the dataframe. The Spark SQL expressions are evaluated against an instance of DefaultExpressionData.
    filterClause: filter for the data to be processed by historization. It can be used to exclude historical data that is not needed to create the new history, for performance reasons.
    historizeBlacklist: optional list of columns to ignore when comparing two records in historization. Cannot be used together with historizeWhitelist.
    historizeWhitelist: optional final list of columns to use when comparing two records in historization. Cannot be used together with historizeBlacklist.
    ignoreOldDeletedColumns: if true, remove columns that no longer exist during schema evolution
    ignoreOldDeletedNestedColumns: if true, remove columns that no longer exist from nested data types during schema evolution. Keeping deleted columns in complex data types has a performance impact, as all future data has to be converted by a complex function.
    executionMode: optional execution mode for this Action
    executionCondition: optional Spark SQL expression evaluated against SubFeedsExpressionData. If it evaluates to true the Action is executed, otherwise it is skipped. See Condition for details.
    metricsFailCondition: optional Spark SQL expression evaluated as a where-clause against the dataframe of metrics. Available columns are dataObjectId, key and value. If any rows pass the where-clause, a MetricCheckFailed exception is thrown.
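
    A sketch of constructing a HistorizeAction, under the same assumptions as the CopyAction sketch above; the ids and the technical column name in historizeBlacklist are hypothetical.

      import io.smartdatalake.config.InstanceRegistry
      import io.smartdatalake.config.SdlConfigObject.{ActionId, DataObjectId}
      import io.smartdatalake.workflow.action.HistorizeAction

      implicit val registry: InstanceRegistry = new InstanceRegistry()

      // build a valid-from/valid-to history of the customer table,
      // ignoring a technical load timestamp when comparing records
      val historizeCustomers = HistorizeAction(
        id = ActionId("historize-customers"),
        inputId = DataObjectId("int-customers"),
        outputId = DataObjectId("btl-customers-history"),
        historizeBlacklist = Some(Seq("load_ts"))
      )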

  9. case class Metric(dataObjectId: String, key: Option[String], value: Option[String]) extends Product with Serializable

  10. case class NoDataToProcessDontStopWarning(actionId: NodeId, msg: String, results: Option[Seq[SubFeed]] = None) extends TaskSkippedDontStopWarning[SubFeed] with Product with Serializable

    Execution modes can throw this exception to indicate that there is no data to process, but dependent Actions should be executed nevertheless.

    Annotations
    @DeveloperApi()
  11. case class NoDataToProcessWarning(actionId: NodeId, msg: String) extends TaskSkippedWarning with Product with Serializable

    Execution modes can throw this exception to indicate that there is no data to process and that dependent Actions should not be executed.

    Annotations
    @DeveloperApi()
  12. abstract class SparkSubFeedAction extends SparkAction

  13. abstract class SparkSubFeedsAction extends SparkAction

  14. case class SubFeedExpressionData(partitionValues: Seq[Map[String, String]], isDAGStart: Boolean, isSkipped: Boolean) extends Product with Serializable

  15. case class SubFeedsExpressionData(inputSubFeeds: Map[String, SubFeedExpressionData]) extends Product with Serializable

Value Members

  1. object CopyAction extends FromConfigFactory[Action] with Serializable

  2. object CustomFileAction extends FromConfigFactory[Action] with Serializable

  3. object CustomSparkAction extends FromConfigFactory[Action] with Serializable

  4. object DeduplicateAction extends FromConfigFactory[Action] with Serializable

  5. object FileTransferAction extends FromConfigFactory[Action] with Serializable

  6. object HistorizeAction extends FromConfigFactory[Action] with Serializable

  7. object SubFeedsExpressionData extends Serializable

  8. package customlogic
