class DynamicPartitionDataSingleWriter extends BaseDynamicPartitionDataWriter
Dynamic partition writer that keeps only one writer open at any time. The records to be written must be sorted on the partition and/or bucket column(s) before writing.
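The sort requirement can be illustrated with a small standalone model. This is only a sketch, not Spark's implementation: `SingleWriterSketch`, `Record` and `openedPartitions` are hypothetical names, and a "writer" is reduced to the partition string it was opened for.

```scala
// Hypothetical, simplified model of a single-writer strategy (not Spark code).
object SingleWriterSketch {
  final case class Record(partition: String, value: Int)

  /** Returns the partitions for which a writer was opened, in opening order. */
  def openedPartitions(records: Seq[Record]): List[String] = {
    val opened = scala.collection.mutable.ListBuffer.empty[String]
    var current: Option[String] = None // at most one writer is open at a time
    for (r <- records) {
      if (!current.contains(r.partition)) {
        // Partition changed: the current writer would be closed here
        // and a new one opened for r.partition.
        current = Some(r.partition)
        opened += r.partition
      }
      // The record would be written with the current writer here.
    }
    opened.toList
  }
}
```

In this model, sorted input opens one writer per partition, while unsorted input reopens a writer for a partition that was already closed, which is the situation the sort requirement is meant to rule out.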
Instance Constructors
- new DynamicPartitionDataSingleWriter(description: WriteJobDescription, taskAttemptContext: TaskAttemptContext, committer: FileCommitProtocol, customMetrics: Map[String, SQLMetric] = Map.empty)
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- val MAX_FILE_COUNTER: Int
Max number of files a single task writes out due to file size. In most cases the number of files written should be very small. This is just a safeguard against really bad settings, e.g. maxRecordsPerFile = 1.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- def abort(): Unit
- Definition Classes
- FileFormatDataWriter → DataWriter
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- def close(): Unit
- Definition Classes
- FileFormatDataWriter → Closeable → AutoCloseable
- def commit(): WriteTaskResult
Returns a summary of relevant information, including the list of partition strings written out. The list of partitions is sent back to the driver and used to update the catalog. Other information is also sent back to the driver and used, e.g., to update the metrics in the UI.
- Definition Classes
- FileFormatDataWriter → DataWriter
- def currentMetricsValues(): Array[CustomTaskMetric]
- Definition Classes
- DataWriter
- var currentWriter: OutputWriter
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def equals(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef → Any
- var fileCounter: Int
File counter for writing the current partition or bucket. The same partition or bucket may have more than one file because of the limit on the number of records per file.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- lazy val getBucketId: (InternalRow) => Int
Given an input row, returns the corresponding bucketId.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- val getOutputRow: UnsafeProjection
Returns the data columns to be written given an input row.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- lazy val getPartitionValues: (InternalRow) => UnsafeRow
Extracts the partition values out of an input row.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- def hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- val isBucketed: Boolean
Flag saying whether or not the data to be written out is bucketed.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- val isPartitioned: Boolean
Flag saying whether or not the data to be written out is partitioned.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- var recordsInFile: Long
Number of records in current file.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- def releaseCurrentWriter(): Unit
Release resources of currentWriter.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- def releaseResources(): Unit
Release all resources.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- def renewCurrentWriter(partitionValues: Option[InternalRow], bucketId: Option[Int], closeCurrentWriter: Boolean): Unit
Opens a new OutputWriter given a partition key and/or a bucket id. If bucket id is specified, we will append it to the end of the file name, but before the file extension, e.g. part-r-00009-ea518ad4-455a-4431-b471-d24e03814677-00002.gz.parquet
- partitionValues
the partition which all tuples being written by this OutputWriter belong to
- bucketId
the bucket which all tuples being written by this OutputWriter belong to
- closeCurrentWriter
whether to close and release resources for the current writer
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
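The file-naming rule described for renewCurrentWriter can be sketched as follows. This is an illustration under assumptions: `BucketFileNameSketch` and `fileName` are hypothetical, and the `-` separator and five-digit zero padding are inferred only from the example name above, not taken from Spark's source.

```scala
// Hypothetical helper (not Spark's API): places the bucket id at the end of
// the file name, before the file extension.
object BucketFileNameSketch {
  def fileName(prefix: String, bucketId: Option[Int], ext: String): String = {
    // Assumption: "-" separator and 5-digit padding, mirroring the example name.
    val bucketPart = bucketId.map(id => f"-$id%05d").getOrElse("")
    s"$prefix$bucketPart$ext"
  }
}
```

For instance, `fileName("part-r-00009-ea518ad4-455a-4431-b471-d24e03814677", Some(2), ".gz.parquet")` reproduces the example name, and passing `None` leaves the bucket suffix out.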
- def renewCurrentWriterIfTooManyRecords(partitionValues: Option[InternalRow], bucketId: Option[Int]): Unit
Opens a new output writer when the number of records exceeds the limit.
- partitionValues
the partition which all tuples being written by this OutputWriter belong to
- bucketId
the bucket which all tuples being written by this OutputWriter belong to
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
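How fileCounter, recordsInFile and the file-count safeguard could interact is sketched below. It is a simplified model under assumptions, not the class's actual code: `RecordLimitSketch`, `Roller` and the `MaxFileCounter` value are hypothetical, and the writer itself is omitted so that only the counters remain.

```scala
// Hypothetical model of rolling to a new file on a per-file record limit.
object RecordLimitSketch {
  // Assumed cap, standing in for MAX_FILE_COUNTER; the real value is not stated here.
  val MaxFileCounter = 1000

  final class Roller(maxRecordsPerFile: Long) {
    var fileCounter = 0     // index of the current file within the partition or bucket
    var recordsInFile = 0L  // number of records written to the current file

    def write(): Unit = {
      // A non-positive limit is treated as "no limit" in this sketch.
      if (maxRecordsPerFile > 0 && recordsInFile >= maxRecordsPerFile) {
        fileCounter += 1
        if (fileCounter >= MaxFileCounter)
          throw new IllegalStateException(s"File counter $fileCounter is beyond max value $MaxFileCounter")
        recordsInFile = 0L  // a new output writer would be opened here
      }
      recordsInFile += 1    // the record would be written here
    }
  }
}
```

With a limit of 2 and five calls to `write()`, the model ends on its third file (`fileCounter == 2`) holding one record. With a limit of 1, every record starts a new file, which is the kind of setting the cap guards against.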
- val statsTrackers: Seq[WriteTaskStatsTracker]
Trackers for computing various statistics on the data as it's being written out.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- def toString(): String
- Definition Classes
- AnyRef → Any
- val updatedPartitions: Set[String]
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()
- def write(record: InternalRow): Unit
Writes a record.
- Definition Classes
- DynamicPartitionDataSingleWriter → FileFormatDataWriter → DataWriter
- def writeRecord(record: InternalRow): Unit
Writes the given record with the current writer.
- record
The record to write
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
- def writeWithIterator(iterator: Iterator[InternalRow]): Unit
Writes an iterator of records.
- Definition Classes
- FileFormatDataWriter
- def writeWithMetrics(record: InternalRow, count: Long): Unit
- Definition Classes
- FileFormatDataWriter