
class ApplyInPandasWithStateWriter extends AnyRef

This class abstracts the complexity of constructing Arrow RecordBatches for data and state with bin-packing and chunking. The caller only needs to call the public methods of this class (startNewGroup, writeRow, finalizeGroup, finalizeData); the class writes the data and state into Arrow RecordBatches, performing bin-packing and chunking internally.

This class requires that the parameter root has been initialized with an Arrow schema laid out as follows:

  • data fields
  • state field
      • nested schema (refer to ApplyInPandasWithStateWriter.STATE_METADATA_SCHEMA)

Please refer to the code comments in the implementation to see how writes of data and state against Arrow RecordBatches work under bin-packing and chunking.
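The call order described above (startNewGroup, then writeRow for each row, then finalizeGroup, and finalizeData once at the end) can be illustrated with a toy model. This is a minimal sketch and not Spark's implementation: ToyGroupWriter, ChunkMeta, Batch, and the fields startOffset, numRows, and isLastChunk are hypothetical names chosen for illustration, rows are plain strings instead of InternalRow, and batches are collected in memory instead of being written to Arrow vectors. It only shows the general idea of packing several groups into one batch and splitting a large group across batches.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical per-group chunk descriptor: which rows of a batch belong to
// which group, and whether this is the group's final chunk.
final case class ChunkMeta(key: String, startOffset: Int, numRows: Int, isLastChunk: Boolean)

// Hypothetical stand-in for one Arrow RecordBatch.
final case class Batch(rows: Vector[String], chunks: Vector[ChunkMeta])

// Toy writer (an assumption, not the real class) with the same call protocol.
class ToyGroupWriter(maxRecordsPerBatch: Int) {
  require(maxRecordsPerBatch > 0, "maxRecordsPerBatch must be positive")

  val batches = ArrayBuffer.empty[Batch]
  private val rows = ArrayBuffer.empty[String]
  private val chunks = ArrayBuffer.empty[ChunkMeta]
  private var currentKey: Option[String] = None
  private var chunkStart = 0

  def startNewGroup(key: String): Unit = {
    require(currentKey.isEmpty, "previous group was not finalized")
    currentKey = Some(key)
    chunkStart = rows.size // bin-packing: the group starts where the last one ended
  }

  def writeRow(row: String): Unit = {
    val key = activeKey()
    // Chunking: the batch is full, so close a non-final chunk and flush.
    if (rows.size == maxRecordsPerBatch) {
      closeChunk(key, isLast = false)
      flush()
    }
    rows += row
  }

  def finalizeGroup(): Unit = {
    closeChunk(activeKey(), isLast = true)
    currentKey = None
    if (rows.size == maxRecordsPerBatch) flush()
  }

  def finalizeData(): Unit = {
    require(currentKey.isEmpty, "current group was not finalized")
    if (chunks.nonEmpty) flush() // emit whatever is still buffered
  }

  private def activeKey(): String =
    currentKey.getOrElse(throw new IllegalStateException("no group started"))

  private def closeChunk(key: String, isLast: Boolean): Unit =
    chunks += ChunkMeta(key, chunkStart, rows.size - chunkStart, isLast)

  private def flush(): Unit = {
    batches += Batch(rows.toVector, chunks.toVector)
    rows.clear()
    chunks.clear()
    chunkStart = 0
  }
}

object ToyGroupWriterDemo {
  // Group "A" has 3 rows and group "B" has 1 row, with at most 2 rows per batch.
  def run(): Vector[Batch] = {
    val w = new ToyGroupWriter(maxRecordsPerBatch = 2)
    w.startNewGroup("A")
    Seq("a1", "a2", "a3").foreach(w.writeRow)
    w.finalizeGroup()
    w.startNewGroup("B")
    w.writeRow("b1")
    w.finalizeGroup()
    w.finalizeData()
    w.batches.toVector
  }
}
```

In this toy run, group A is chunked across two batches, and its last row shares the second batch with group B. The real class takes a VectorSchemaRoot, an ArrowStreamWriter, and arrowMaxRecordsPerBatch, and its internal layout may differ from this model.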

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new ApplyInPandasWithStateWriter(root: VectorSchemaRoot, writer: ArrowStreamWriter, arrowMaxRecordsPerBatch: Int)

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##: Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  5. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.CloneNotSupportedException]) @native()
  6. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  7. def equals(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.Throwable])
  9. def finalizeData(): Unit

    Indicates to the writer that all groups have been processed.

  10. def finalizeGroup(): Unit

    Indicates to the writer that the current group is finalized and that no further rows will be bound to it.

  11. final def getClass(): Class[_ <: AnyRef]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. def startNewGroup(keyRow: UnsafeRow, groupState: GroupStateImpl[Row]): Unit

    Instructs the writer to start a new group with the given grouping key.

    keyRow

    The grouping key row for the current group.

    groupState

    The instance of GroupStateImpl for the current group.

  18. final def synchronized[T0](arg0: => T0): T0
    Definition Classes
    AnyRef
  19. def toString(): String
    Definition Classes
    AnyRef → Any
  20. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  21. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException])
  22. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws(classOf[java.lang.InterruptedException]) @native()
  23. def writeRow(dataRow: InternalRow): Unit

    Instructs the writer to write a row in the current group.

    dataRow

    The row to write in the current group.
