io.smartdatalake.workflow

FileSubFeed

case class FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues], isDAGStart: Boolean = false, isSkipped: Boolean = false, fileRefMapping: Option[Seq[FileRefMapping]] = None) extends SubFeed with Product with Serializable

A FileSubFeed is used to transport references to files between Actions.

fileRefs

references to the files to be processed

dataObjectId

id of the DataObject this SubFeed corresponds to

partitionValues

Values of Partitions transported by this SubFeed

isDAGStart

true if this SubFeed is a start node of the DAG

isSkipped

true if this SubFeed is the result of a skipped Action

fileRefMapping

Stores the mapping of input to output file references. This is also used for post-processing (e.g. delete after read).
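
The parameters above can be illustrated with a minimal, self-contained sketch. The types `DataObjectId`, `PartitionValues`, `FileRef` and `FileRefMapping` below are simplified stand-ins whose fields are assumptions made for this example; the library's own definitions may differ. Only the parameter names, types and defaults of `FileSubFeed` are taken from the signature above, and the paths and ids are hypothetical.

```scala
// Simplified stand-ins for the types used in the signature above.
// Their fields are assumptions for illustration, not the library's definitions.
object FileSubFeedConstructionSketch {
  final case class DataObjectId(id: String)
  final case class PartitionValues(elements: Map[String, Any])
  final case class FileRef(fullPath: String, fileName: String, partitionValues: PartitionValues)
  final case class FileRefMapping(src: FileRef, tgt: FileRef)

  // Mirrors the documented constructor parameters and their defaults.
  final case class FileSubFeed(
    fileRefs: Option[Seq[FileRef]],
    dataObjectId: DataObjectId,
    partitionValues: Seq[PartitionValues],
    isDAGStart: Boolean = false,
    isSkipped: Boolean = false,
    fileRefMapping: Option[Seq[FileRefMapping]] = None
  )

  // Hypothetical partition and input file.
  val day: PartitionValues = PartitionValues(Map("dt" -> "20200101"))
  val input: FileRef = FileRef("/landing/dt=20200101/a.csv", "a.csv", day)

  // Only the first three parameters are required; the rest use their defaults.
  val subFeed: FileSubFeed =
    FileSubFeed(Some(Seq(input)), DataObjectId("landing-files"), Seq(day))

  // Record where the input file was written to (hypothetical target path).
  val output: FileRef = input.copy(fullPath = "/stage/dt=20200101/a.csv")
  val mapped: FileSubFeed =
    subFeed.copy(fileRefMapping = Some(Seq(FileRefMapping(input, output))))

  // Source paths a post-processing step such as "delete after read" could act on.
  val sourcesToCleanUp: Seq[String] =
    mapped.fileRefMapping.getOrElse(Seq.empty).map(_.src.fullPath)

  def main(args: Array[String]): Unit = {
    println(subFeed)
    println(sourcesToCleanUp)
  }
}
```

Because the sketch models `FileSubFeed` as an immutable case class, the mapping is attached with `copy` rather than by mutating the existing instance.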

Linear Supertypes
Serializable, Serializable, Product, Equals, SubFeed, SmartDataLakeLogger, DAGResult, AnyRef, Any

Instance Constructors

  1. new FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: DataObjectId, partitionValues: Seq[PartitionValues], isDAGStart: Boolean = false, isSkipped: Boolean = false, fileRefMapping: Option[Seq[FileRefMapping]] = None)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def applyExecutionModeResultForInput(result: ExecutionModeResult, mainInputId: DataObjectId)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  5. def applyExecutionModeResultForOutput(result: ExecutionModeResult)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  6. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  7. def breakLineage(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Break lineage. This means discarding an existing DataFrame or list of FileRefs so that it is requested again from the DataObject. On the one hand, this can be used to break long DataFrame lineages across multiple Actions and instead reread the data from an intermediate table. On the other hand, it is needed if partition values or the filter condition have changed.

    Definition Classes
    FileSubFeed → SubFeed
  8. def checkPartitionValuesColsExisting(partitions: Set[String]): Boolean

  9. def clearDAGStart(): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  10. def clearPartitionValues(breakLineageOnChange: Boolean = true)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  11. def clearSkipped(): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. val dataObjectId: DataObjectId

    id of the DataObject this SubFeed corresponds to

    Definition Classes
    FileSubFeed → SubFeed
  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. val fileRefMapping: Option[Seq[FileRefMapping]]

    Stores the mapping of input to output file references. This is also used for post-processing (e.g. delete after read).

  16. val fileRefs: Option[Seq[FileRef]]

    references to the files to be processed

  17. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  18. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  19. val isDAGStart: Boolean

    true if this SubFeed is a start node of the DAG

    Definition Classes
    FileSubFeed → SubFeed
  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. val isSkipped: Boolean

    true if this SubFeed is the result of a skipped Action

    Definition Classes
    FileSubFeed → SubFeed
  22. lazy val logger: Logger

    Attributes
    protected
    Definition Classes
    SmartDataLakeLogger
  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. val partitionValues: Seq[PartitionValues]

    Values of Partitions transported by this SubFeed

    Definition Classes
    FileSubFeed → SubFeed
  27. def resultId: String

    Definition Classes
    SubFeed → DAGResult
  28. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  29. def toOutput(dataObjectId: DataObjectId): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  30. def union(other: SubFeed)(implicit session: SparkSession, context: ActionPipelineContext): SubFeed

    Definition Classes
    FileSubFeed → SubFeed
  31. def unionPartitionValues(otherPartitionValues: Seq[PartitionValues]): Seq[PartitionValues]

    Definition Classes
    SubFeed
  32. def updatePartitionValues(partitions: Seq[String], breakLineageOnChange: Boolean = true, newPartitionValues: Option[Seq[PartitionValues]] = None)(implicit session: SparkSession, context: ActionPipelineContext): FileSubFeed

    Definition Classes
    FileSubFeed → SubFeed
  33. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
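
The description of `breakLineage` in the member list above can be sketched as follows. This is a simplified model under stated assumptions, not the library's implementation: the implicit `SparkSession` and `ActionPipelineContext` parameters are omitted, `dataObjectId` is reduced to a plain `String`, "discarding" is modelled as resetting `fileRefs` to `None`, and `listFiles` and `resolve` are hypothetical helpers standing in for requesting the files from the DataObject again.

```scala
object BreakLineageSketch {
  // Simplified stand-ins; field names are assumptions for illustration.
  final case class FileRef(fullPath: String)

  final case class FileSubFeed(fileRefs: Option[Seq[FileRef]], dataObjectId: String) {
    // Assumption: breaking lineage discards the transported file references.
    def breakLineage: FileSubFeed = copy(fileRefs = None)
  }

  // Hypothetical helper: asks the DataObject which files it currently holds.
  def listFiles(dataObjectId: String): Seq[FileRef] =
    Seq(FileRef(s"/data/$dataObjectId/current.csv"))

  // Use the transported references if present, otherwise request them again.
  def resolve(subFeed: FileSubFeed): Seq[FileRef] =
    subFeed.fileRefs.getOrElse(listFiles(subFeed.dataObjectId))

  val transported: FileSubFeed =
    FileSubFeed(Some(Seq(FileRef("/data/stage/old.csv"))), "stage")
  val broken: FileSubFeed = transported.breakLineage

  def main(args: Array[String]): Unit = {
    println(resolve(transported)) // references carried over from the previous Action
    println(resolve(broken))      // references requested again via listFiles
  }
}
```

In this model a consumer cannot tell a SubFeed with broken lineage from one that never carried references, which is what forces the files to be looked up again.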
