Package

io.smartdatalake

workflow

package workflow

Type Members

  1. case class DAG[N <: DAGNode] extends SmartDataLakeLogger with Product with Serializable

    A generic directed acyclic graph (DAG) consisting of DAGNodes interconnected with directed DAGEdges.

    This DAG can have multiple start nodes and multiple end nodes as well as disconnected parts.

  2. case class FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues], processedInputFileRefs: Option[Seq[FileRef]] = None) extends SubFeed with Product with Serializable

    A FileSubFeed is used to transport references to files between Actions.

    fileRefs

    references (paths) to the files to be processed

    dataObjectId

    id of the DataObject this SubFeed corresponds to

    partitionValues

    partition values transported by this SubFeed

    processedInputFileRefs

    used to remember processed input FileRefs for post-processing (e.g. deleting files after they are read)

  3. case class InitSubFeed(dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues]) extends SubFeed with Product with Serializable

    An InitSubFeed is used to initialize the first nodes of a DAG.

    dataObjectId

    id of the DataObject this SubFeed corresponds to

    partitionValues

    partition values transported by this SubFeed

  4. class PrimaryKeyConstraintViolationException extends RuntimeException

  5. class ProcessingLogicException extends RuntimeException

    Exception to signal that a configured pipeline can't be executed properly

  6. case class SparkSubFeed(dataFrame: Option[DataFrame], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues]) extends SubFeed with Product with Serializable

    A SparkSubFeed is used to transport DataFrames between Actions.

    dataFrame

    Spark DataFrame to be processed. DataFrame should not be saved to state (@transient).

    dataObjectId

    id of the DataObject this SubFeed corresponds to

    partitionValues

    partition values transported by this SubFeed

  7. trait SubFeed extends DAGResult

    A SubFeed transports references to data between Actions. The data can be represented by different technologies, such as files or DataFrames.
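
The relationship between the SubFeed trait and the implementations listed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's source. DataObjectId, PartitionValues and FileRef are simplified stand-ins, SparkSubFeed is omitted to avoid a Spark dependency, the trait is sealed only for the purposes of the sketch, and the describe helper is hypothetical.

```scala
// Simplified stand-ins for illustration only; NOT the real io.smartdatalake types.
final case class DataObjectId(id: String)
final case class PartitionValues(elements: Map[String, Any])
final case class FileRef(fullPath: String)

// Common contract assumed here: every SubFeed names the DataObject it
// corresponds to and carries the partition values it transports.
sealed trait SubFeed {
  def dataObjectId: DataObjectId
  def partitionValues: Seq[PartitionValues]
}

// Used to initialize the first nodes of a DAG; it carries no data references.
final case class InitSubFeed(
  dataObjectId: DataObjectId,
  partitionValues: Seq[PartitionValues]
) extends SubFeed

// Transports references to files between Actions.
final case class FileSubFeed(
  fileRefs: Option[Seq[FileRef]],
  dataObjectId: DataObjectId,
  partitionValues: Seq[PartitionValues],
  processedInputFileRefs: Option[Seq[FileRef]] = None
) extends SubFeed

object SubFeedSketch {
  // Hypothetical helper: summarizes a SubFeed by pattern matching on its type.
  def describe(subFeed: SubFeed): String = subFeed match {
    case InitSubFeed(id, pvs) =>
      s"init ${id.id}: ${pvs.size} partition value(s)"
    case FileSubFeed(refs, id, _, _) =>
      s"files ${id.id}: ${refs.map(_.size).getOrElse(0)} file reference(s)"
  }
}
```

Under these assumptions, a FileSubFeed can be built without processedInputFileRefs (it defaults to None) and later updated with copy(processedInputFileRefs = ...) once the input files have been read.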

Value Members

  1. object DAG extends SmartDataLakeLogger with Serializable
  2. object FileSubFeed extends Serializable
  3. object SparkSubFeed extends Serializable
  4. package action
  5. package connection
  6. package dataobject
