class DynamicPartitionDataSingleWriter extends BaseDynamicPartitionDataWriter
Dynamic partition writer that uses a single writer, meaning only one output writer is open at any time. The records to be written must be sorted on the partition and/or bucket column(s) before writing.
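The sorting requirement can be illustrated with a minimal, self-contained sketch. It does not use Spark's classes: `SingleWriterSketch`, `Record`, and `openedWriters` are hypothetical names, and a record is modelled as a plain (partition key, value) pair. The sketch only tracks when a new writer would be opened, assuming the writer is renewed whenever the partition key changes.

```scala
object SingleWriterSketch {
  // Hypothetical stand-in for a record: (partitionKey, value).
  type Record = (String, Int)

  /** Returns, in order, the partition keys for which a writer would be opened
    * when only one writer may be open at a time. */
  def openedWriters(records: Seq[Record]): List[String] = {
    var current: Option[String] = None
    val opened = scala.collection.mutable.ListBuffer.empty[String]
    records.foreach { case (key, _) =>
      if (!current.contains(key)) {
        // Key changed: the previous writer would be closed and a new one opened.
        current = Some(key)
        opened += key
      }
    }
    opened.toList
  }
}
```

With sorted input each partition is opened exactly once. With unsorted input the same partition is opened repeatedly, which is what the sorting requirement avoids.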
- DynamicPartitionDataSingleWriter
- BaseDynamicPartitionDataWriter
- FileFormatDataWriter
- DataWriter
- Closeable
- AutoCloseable
- AnyRef
- Any
Instance Constructors
- new DynamicPartitionDataSingleWriter(description: WriteJobDescription, taskAttemptContext: TaskAttemptContext, committer: FileCommitProtocol, customMetrics: Map[String, SQLMetric] = Map.empty)
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
val
MAX_FILE_COUNTER: Int
Maximum number of files a single task writes out due to file size. In most cases the number of files written should be very small. This is only a safeguard against badly chosen settings, e.g. maxRecordsPerFile = 1.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
def
abort(): Unit
- Definition Classes
- FileFormatDataWriter → DataWriter
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
close(): Unit
- Definition Classes
- FileFormatDataWriter → Closeable → AutoCloseable
-
def
commit(): WriteTaskResult
Returns a summary of the relevant information, including the list of partition strings written out. The list of partitions is sent back to the driver and used to update the catalog. Other information is also sent back to the driver and used, for example, to update the metrics in the UI.
- Definition Classes
- FileFormatDataWriter → DataWriter
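As a rough illustration of what such a summary might carry, the sketch below collects the distinct partition strings seen during a task and returns them on commit. `CommitSketch`, `TaskSummary`, and `RecordingWriter` are hypothetical names for this example only and do not reflect the actual shape of `WriteTaskResult`.

```scala
object CommitSketch {
  // Hypothetical, simplified stand-in for a task result.
  final case class TaskSummary(updatedPartitions: Set[String], numRecords: Long)

  // Records which partition each row went to; no files are actually written.
  final class RecordingWriter {
    private val partitions = scala.collection.mutable.Set.empty[String]
    private var count = 0L

    def write(partitionPath: String): Unit = {
      partitions += partitionPath
      count += 1
    }

    // The summary a driver could use to update a catalog and metrics.
    def commit(): TaskSummary = TaskSummary(partitions.toSet, count)
  }
}
```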
-
def
currentMetricsValues(): Array[CustomTaskMetric]
- Definition Classes
- DataWriter
-
var
currentWriter: OutputWriter
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
var
fileCounter: Int
File counter for writing the current partition or bucket. The same partition or bucket may produce more than one file because of the limit on the number of records per file.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
lazy val
getBucketId: (InternalRow) ⇒ Int
Given an input row, returns the corresponding bucketId.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
getOutputRow: UnsafeProjection
Returns the data columns to be written given an input row.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
lazy val
getPartitionValues: (InternalRow) ⇒ UnsafeRow
Extracts the partition values out of an input row.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
val
isBucketed: Boolean
Flag saying whether or not the data to be written out is bucketed.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
val
isPartitioned: Boolean
Flag saying whether or not the data to be written out is partitioned.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
var
recordsInFile: Long
Number of records in current file.
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
def
releaseCurrentWriter(): Unit
Release resources of currentWriter.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
def
releaseResources(): Unit
Release all resources.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
def
renewCurrentWriter(partitionValues: Option[InternalRow], bucketId: Option[Int], closeCurrentWriter: Boolean): Unit
Opens a new OutputWriter given a partition key and/or a bucket id. If a bucket id is specified, it is appended to the end of the file name, but before the file extension, e.g. part-r-00009-ea518ad4-455a-4431-b471-d24e03814677-00002.gz.parquet
- partitionValues
the partition which all tuples being written by this OutputWriter belong to
- bucketId
the bucket which all tuples being written by this OutputWriter belong to
- closeCurrentWriter
whether to close the current writer and release its resources
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
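The file name example above can be reproduced with a small sketch. This is only an assumption inferred from that single example (bucket id zero-padded to five digits and joined with a hyphen); the helper `fileName` and the object `BucketFileNameSketch` are hypothetical, and the naming actually used by a given Spark version may differ.

```scala
object BucketFileNameSketch {
  /** Builds a file name from a prefix, an optional bucket id, and an extension.
    * Assumption: the bucket id is zero-padded to five digits, as in the example. */
  def fileName(prefix: String, bucketId: Option[Int], extension: String): String =
    bucketId match {
      case Some(id) => f"$prefix-$id%05d$extension" // bucket id goes before the extension
      case None     => s"$prefix$extension"         // no bucket id: nothing is appended
    }
}
```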
-
def
renewCurrentWriterIfTooManyRecords(partitionValues: Option[InternalRow], bucketId: Option[Int]): Unit
Opens a new output writer when the number of records in the current file exceeds the limit.
- partitionValues
the partition which all tuples being written by this OutputWriter belong to
- bucketId
the bucket which all tuples being written by this OutputWriter belong to
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
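How fileCounter, recordsInFile, and MAX_FILE_COUNTER might interact can be sketched as a small simulation. This is not the Spark implementation: `FileRolloverSketch` and `filesUsed` are hypothetical, the cap is passed in as a parameter rather than taken from the real constant, and the sketch assumes that a non-positive `maxRecordsPerFile` means "no limit" and that one file is always open for the partition.

```scala
object FileRolloverSketch {
  /** Simulates writing `numRecords` records into one partition and returns
    * the number of files used. Fails if the file counter reaches `maxFileCounter`. */
  def filesUsed(numRecords: Long, maxRecordsPerFile: Long, maxFileCounter: Int): Int = {
    var fileCounter = 0    // index of the file currently being written
    var recordsInFile = 0L // records written to the current file
    var i = 0L
    while (i < numRecords) {
      if (maxRecordsPerFile > 0 && recordsInFile >= maxRecordsPerFile) {
        // Current file is full: move on to the next file for this partition.
        fileCounter += 1
        require(fileCounter < maxFileCounter,
          s"File counter $fileCounter is beyond max value $maxFileCounter")
        recordsInFile = 0L
      }
      recordsInFile += 1
      i += 1
    }
    fileCounter + 1
  }
}
```

For example, five records with a limit of two records per file end up in three files, while a limit of one record per file quickly runs into a small cap, which is the kind of setting the safeguard is meant to catch.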
-
val
statsTrackers: Seq[WriteTaskStatsTracker]
Trackers for computing various statistics on the data as it's being written out.
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
val
updatedPartitions: Set[String]
- Attributes
- protected
- Definition Classes
- FileFormatDataWriter
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
write(record: InternalRow): Unit
Writes a record.
- Definition Classes
- DynamicPartitionDataSingleWriter → FileFormatDataWriter → DataWriter
-
def
writeRecord(record: InternalRow): Unit
Writes the given record with current writer.
- record
The record to write
- Attributes
- protected
- Definition Classes
- BaseDynamicPartitionDataWriter
-
def
writeWithIterator(iterator: Iterator[InternalRow]): Unit
Write an iterator of records.
- Definition Classes
- FileFormatDataWriter
-
def
writeWithMetrics(record: InternalRow, count: Long): Unit
- Definition Classes
- FileFormatDataWriter