org.apache.spark.sql.execution.datasources

WriteTaskStatsTracker

trait WriteTaskStatsTracker extends AnyRef

A trait for classes that are capable of collecting statistics on data that's being processed by a single write task in FileFormatWriter - i.e. there should be one instance per executor.

This trait is coupled with the way FileFormatWriter works, in the sense that its methods will be called according to how tuples are being written out to disk, namely in sorted order according to partitionValue(s), then bucketId.

As such, a typical call scenario is:

newPartition -> newBucket -> newFile -> newRow

After newRow, the next event may be another newRow, or a newFile, newBucket, or newPartition.

newPartition and newBucket events are only triggered if the relation to be written out is partitioned and/or bucketed, respectively.
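
The call order described above can be illustrated with a small, self-contained sketch. It does not use Spark: `Record`, `EventSink`, `RecordingTracker`, `CallOrderSketch` and the file path format are all hypothetical stand-ins, and the loop is only an approximation of how a writer might emit events for rows sorted by partition, then bucket.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for a data row, so the sketch runs without Spark.
final case class Record(partition: String, bucket: Int, value: Long)

// Simplified local copy of the callbacks documented on this page (assumed types).
trait EventSink {
  def newPartition(partitionValues: String): Unit
  def newBucket(bucketId: Int): Unit
  def newFile(filePath: String): Unit
  def newRow(row: Record): Unit
}

// Records each event as a string so the call order can be inspected.
final class RecordingTracker extends EventSink {
  val events = ArrayBuffer.empty[String]
  def newPartition(partitionValues: String): Unit = events += s"newPartition($partitionValues)"
  def newBucket(bucketId: Int): Unit = events += s"newBucket($bucketId)"
  def newFile(filePath: String): Unit = events += s"newFile($filePath)"
  def newRow(row: Record): Unit = events += s"newRow(${row.value})"
}

object CallOrderSketch {
  // Hypothetical writer loop for a relation that is both partitioned and bucketed.
  def write(rows: Seq[Record], tracker: EventSink): Unit = {
    var current: Option[(String, Int)] = None
    // Rows are processed sorted by partition value, then bucket id.
    rows.sortBy(r => (r.partition, r.bucket)).foreach { row =>
      val key = (row.partition, row.bucket)
      if (!current.contains(key)) {
        // A partition event fires only when the partition value changes.
        if (!current.exists(_._1 == row.partition)) tracker.newPartition(row.partition)
        tracker.newBucket(row.bucket)
        // The path format is invented for this sketch.
        tracker.newFile(s"${row.partition}/bucket-${row.bucket}")
        current = Some(key)
      }
      tracker.newRow(row)
    }
  }
}
```

Feeding this loop two rows for partition `a`, bucket `0`, followed by a row for bucket `1`, produces one newPartition event, then newBucket, newFile and two newRow events, then a second newBucket and newFile before the third newRow.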

Linear Supertypes
AnyRef, Any

Abstract Value Members

  1. abstract def getFinalStats(): WriteTaskStats

    Returns the final statistics computed so far.

    returns

    An object of subtype of WriteTaskStats, to be sent to the driver.

    Note

    This may only be called once. Further use of the object may lead to undefined behavior.

  2. abstract def newBucket(bucketId: Int): Unit

    Process the fact that a new bucket is about to be written. Only triggered when the relation is bucketed by a (non-empty) sequence of columns.

    bucketId

    The bucket number.

  3. abstract def newFile(filePath: String): Unit

    Process the fact that a new file is about to be written.

    filePath

    Path of the file into which future rows will be written.

  4. abstract def newPartition(partitionValues: InternalRow): Unit

    Process the fact that a new partition is about to be written. Only triggered when the relation is partitioned by a (non-empty) sequence of columns.

    partitionValues

    The values that define this new partition.

  5. abstract def newRow(row: InternalRow): Unit

    Process a new row and update the tracked statistics accordingly. The row will be written to the most recently witnessed file (via newFile).

    row

    Current data row to be processed.

    Note

    Any overhead here is incurred once per row, so implementations should be as lightweight as possible.
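
As an illustration of the abstract members above, here is a minimal sketch of a counting tracker. It is written against local stand-in types rather than Spark's own: `SizedRow` stands in for InternalRow, `TaskStats` for WriteTaskStats, and `StatsTracker`, `BasicStats` and `CountingTracker` are hypothetical names. The `require` guard is one possible way to surface a second call to getFinalStats(); the trait itself only states that further use is undefined.

```scala
// Stand-in types so the sketch runs without Spark on the classpath (assumptions).
final case class SizedRow(sizeInBytes: Long)
trait TaskStats
final case class BasicStats(numPartitions: Int, numFiles: Int, numRows: Long, numBytes: Long)
  extends TaskStats

// Simplified local copy of the abstract members documented above.
trait StatsTracker {
  def newPartition(partitionValues: SizedRow): Unit
  def newBucket(bucketId: Int): Unit
  def newFile(filePath: String): Unit
  def newRow(row: SizedRow): Unit
  def getFinalStats(): TaskStats
}

final class CountingTracker extends StatsTracker {
  private var partitions = 0
  private var files = 0
  private var rows = 0L
  private var bytes = 0L
  private var finished = false

  def newPartition(partitionValues: SizedRow): Unit = partitions += 1

  // Bucket changes are not counted in this sketch.
  def newBucket(bucketId: Int): Unit = ()

  def newFile(filePath: String): Unit = files += 1

  // Per-row hot path: two additions and no allocation.
  def newRow(row: SizedRow): Unit = {
    rows += 1
    bytes += row.sizeInBytes
  }

  // Fails on a second call instead of returning stale or inconsistent values.
  def getFinalStats(): TaskStats = {
    require(!finished, "getFinalStats() may only be called once")
    finished = true
    BasicStats(partitions, files, rows, bytes)
  }
}
```

Keeping newRow down to primitive counter updates follows the note on per-row overhead; anything more expensive, such as building the result object, is deferred to getFinalStats().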

Concrete Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  10. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  11. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  12. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  13. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  14. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  15. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  16. def toString(): String
    Definition Classes
    AnyRef → Any
  17. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  18. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  19. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
