RepartitionTransformer

Instance Constructors

new RepartitionTransformer(name: String = "repartition", description: Option[String] = None, numberOfTasksPerPartition: Int, keyCols: Seq[String] = Seq())

name
Name of the transformer
description
Optional description of the transformer
numberOfTasksPerPartition
Number of Spark tasks to create per partition value by repartitioning the DataFrame.
keyCols
Optional key columns to distribute records over Spark tasks inside a partition value.
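
The constructor signature above can be sketched as follows. This is only an illustration of how the defaults and the required parameter interact: RepartitionTransformerSketch is a hypothetical stand-in that mirrors the documented signature, not the library class, and it performs no repartitioning. The expectedTasks helper and the column name "customer_id" are assumptions; the helper encodes one reading of "tasks per partition value" (tasks per value multiplied by the number of partition values).

```scala
// Stand-in that mirrors the documented constructor signature.
// It is NOT the library class and performs no repartitioning.
final case class RepartitionTransformerSketch(
  name: String = "repartition",
  description: Option[String] = None,
  numberOfTasksPerPartition: Int, // no default: must always be supplied
  keyCols: Seq[String] = Seq()
) {
  // Assumption: total tasks = tasks per partition value * number of partition values.
  def expectedTasks(partitionValueCount: Int): Int =
    numberOfTasksPerPartition * partitionValueCount
}

object ConstructorSketch {
  // Only the required parameter is given; the others take their defaults.
  val minimal: RepartitionTransformerSketch =
    RepartitionTransformerSketch(numberOfTasksPerPartition = 4)

  // "customer_id" is a hypothetical column name used for illustration.
  val withKeys: RepartitionTransformerSketch =
    RepartitionTransformerSketch(
      description = Some("spread each partition value over 8 tasks"),
      numberOfTasksPerPartition = 8,
      keyCols = Seq("customer_id")
    )
}
```

Because numberOfTasksPerPartition follows parameters that have defaults, passing it by name is the most readable way to construct an instance without repeating the defaults.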

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
val description: Option[String]

Optional description of the transformer

Definition Classes
RepartitionTransformer → DfTransformer
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def factory: FromConfigFactory[ParsableDfTransformer]

Returns the factory that can parse this type (that is, type CO).
Typically, implementations of this method should return the companion object of the implementing class. The companion object in turn should implement FromConfigFactory.
returns
the factory (object) for this class.

Definition Classes
RepartitionTransformer → ParsableFromConfig
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
val keyCols: Seq[String]

Optional key columns to distribute records over Spark tasks inside a partition value.
val name: String

Name of the transformer

Definition Classes
RepartitionTransformer → DfTransformer
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
val numberOfTasksPerPartition: Int

Number of Spark tasks to create per partition value by repartitioning the DataFrame.
def prepare(actionId: ActionId)(implicit session: SparkSession, context: ActionPipelineContext): Unit

Optional function to implement validations in the prepare phase.

Definition Classes
DfTransformer
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def transform(actionId: ActionId, partitionValues: Seq[PartitionValues], df: DataFrame, dataObjectId: DataObjectId)(implicit session: SparkSession, context: ActionPipelineContext): DataFrame

Function to be implemented to define the transformation between an input and output DataFrame (1:1)

Definition Classes
RepartitionTransformer → DfTransformer
def transformPartitionValues(actionId: ActionId, partitionValues: Seq[PartitionValues])(implicit session: SparkSession, context: ActionPipelineContext): Option[Map[PartitionValues, PartitionValues]]

Optional function to define the transformation of input to output partition values. For example, this makes it possible to implement aggregations in which multiple input partitions are combined into one output partition. By default the output partition values equal the input partition values, which should be correct for most use cases.
actionId
id of the action which executes this transformation. This is mainly used to prefix error messages.
partitionValues
partition values to transform
returns
Map of input to output partition values. This allows partition values to be mapped forward and backward, which is needed in execution modes. Returns None if the mapping is 1:1.

Definition Classes
PartitionValueTransformer
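
The aggregation case described for transformPartitionValues can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: PartitionValuesSketch is a hypothetical stand-in for PartitionValues (assumed here to wrap a map of partition column to value), and hourlyToDaily, "dt" and "hour" are made-up names. It maps each (dt, hour) input partition to the dt-only output partition it is aggregated into, and returns None when nothing changes.

```scala
// Hypothetical stand-in for the library's PartitionValues type;
// assumed here to wrap a map of partition column -> value.
final case class PartitionValuesSketch(elements: Map[String, Any])

object PartitionMappingSketch {
  // Map every input partition to the output partition that keeps only the "dt" column.
  // Several hourly inputs therefore point to the same daily output.
  def hourlyToDaily(
    input: Seq[PartitionValuesSketch]
  ): Option[Map[PartitionValuesSketch, PartitionValuesSketch]] = {
    val mapping = input.map { pv =>
      pv -> PartitionValuesSketch(pv.elements.filter { case (col, _) => col == "dt" })
    }.toMap
    // Follow the documented contract: None signals a 1:1 mapping.
    if (mapping.forall { case (in, out) => in == out }) None else Some(mapping)
  }
}
```

With such a map, the forward direction is a plain lookup, and the backward direction (which inputs feed a given output) can be obtained by grouping the entries by their value.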
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Docs: object RepartitionTransformer | package sparktransformer

case class RepartitionTransformer(name: String = "repartition", description: Option[String] = None, numberOfTasksPerPartition: Int, keyCols: Seq[String] = Seq()) extends ParsableDfTransformer with Product with Serializable

Inherited from Serializable

Inherited from Product

Inherited from Equals

Inherited from ParsableDfTransformer

Inherited from ParsableFromConfig[ParsableDfTransformer]

Inherited from DfTransformer

Inherited from PartitionValueTransformer

Inherited from AnyRef

Inherited from Any
