io.smartdatalake.definitions

SparkIncrementalMode

case class SparkIncrementalMode(compareCol: String, alternativeOutputId: Option[DataObjectId] = None) extends ExecutionMode with ExecutionModeWithMainInputOutput with Product with Serializable

Compares the max entry in the "compare column" between mainOutput and mainInput and incrementally loads the delta. This mode works only with SparkSubFeeds. The filter is not propagated to subsequent actions.

compareCol

a comparable column name existing in mainInput and mainOutput, used to identify the delta. Column values must increase for newer records (e.g. a timestamp).

alternativeOutputId

optional alternative outputId of a DataObject later in the DAG. It replaces mainOutputId and can be used to ensure that all partitions are processed over multiple actions in case of errors.
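As an illustration, this execution mode is configured on an action in the SDL HOCON configuration. The following sketch is hypothetical: the action name `copyDeltaAction`, the DataObject ids `stg-table` and `int-table`, and the column `last_updated` are made-up examples, not names from this codebase:

```hocon
actions {
  copyDeltaAction {
    type = CopyAction
    inputId = stg-table
    outputId = int-table
    executionMode {
      type = SparkIncrementalMode
      # column existing in both input and output; newer records have larger values
      compareCol = last_updated
    }
  }
}
```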

Linear Supertypes
Serializable, Serializable, Product, Equals, ExecutionModeWithMainInputOutput, ExecutionMode, SmartDataLakeLogger, AnyRef, Any

Instance Constructors

  1. new SparkIncrementalMode(compareCol: String, alternativeOutputId: Option[DataObjectId] = None)

    compareCol

    a comparable column name existing in mainInput and mainOutput, used to identify the delta. Column values must increase for newer records (e.g. a timestamp).

    alternativeOutputId

    optional alternative outputId of a DataObject later in the DAG. It replaces mainOutputId and can be used to ensure that all partitions are processed over multiple actions in case of errors.
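The comparison this mode performs can be sketched with plain Spark. This is an illustrative sketch of the principle only, not SDL's actual implementation; it assumes spark-sql on the classpath, and the column name `tstamp` and table contents are invented:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, max}

object SparkIncrementalSketch extends App {
  val spark = SparkSession.builder().master("local[1]").appName("incremental-sketch").getOrCreate()
  import spark.implicits._

  // mainOutput already contains records up to 2024-01-02
  val dfOutput = Seq((1, "2024-01-01"), (2, "2024-01-02")).toDF("id", "tstamp")
  // mainInput additionally contains one newer record
  val dfInput = Seq((1, "2024-01-01"), (2, "2024-01-02"), (3, "2024-01-03")).toDF("id", "tstamp")

  // max entry of the compare column on the output side ...
  val maxExisting = dfOutput.agg(max(col("tstamp"))).head.getString(0)
  // ... selects only strictly newer records (the delta) from the input side
  val delta = dfInput.filter(col("tstamp") > maxExisting)
  delta.show() // only the record with id 3 remains

  spark.stop()
}
```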

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def alternativeOutput(implicit context: ActionPipelineContext): Option[DataObject]

    Definition Classes
    ExecutionModeWithMainInputOutput
  5. val alternativeOutputId: Option[DataObjectId]

    optional alternative outputId of a DataObject later in the DAG. It replaces mainOutputId and can be used to ensure that all partitions are processed over multiple actions in case of errors.

    Definition Classes
    SparkIncrementalMode → ExecutionModeWithMainInputOutput
  6. def apply(actionId: ActionObjectId, mainInput: DataObject, mainOutput: DataObject, subFeed: SubFeed)(implicit session: SparkSession, context: ActionPipelineContext): Option[(Seq[PartitionValues], Option[String])]

    Definition Classes
    SparkIncrementalMode → ExecutionMode
  7. def applyCondition: Option[String]

    Definition Classes
    ExecutionMode
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. val compareCol: String

    a comparable column name existing in mainInput and mainOutput, used to identify the delta. Column values must increase for newer records (e.g. a timestamp).

  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. final def evaluateApplyCondition(actionId: ActionObjectId, subFeed: SubFeed)(implicit session: SparkSession, context: ActionPipelineContext): Option[Boolean]

    Definition Classes
    ExecutionMode
  13. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. lazy val logger: Logger

    Attributes
    protected
    Definition Classes
    SmartDataLakeLogger
  17. def mainInputOutputNeeded: Boolean

    Definition Classes
    SparkIncrementalMode → ExecutionMode
  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def prepare(actionId: ActionObjectId)(implicit session: SparkSession, context: ActionPipelineContext): Unit

    Definition Classes
    SparkIncrementalMode → ExecutionMode
  22. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  23. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )

Inherited from Serializable

Inherited from Serializable

Inherited from Product

Inherited from Equals

Inherited from ExecutionModeWithMainInputOutput

Inherited from ExecutionMode

Inherited from SmartDataLakeLogger

Inherited from AnyRef

Inherited from Any
