org.apache.spark.sql.execution.streaming
Store the metadata for the specified batchId and return true if successful. If the batchId's metadata has already been stored, this method will return false.
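The add-once behaviour described above can be sketched with a small in-memory stand-in. InMemoryMetadataLog is a hypothetical class written for illustration only; it is not Spark's implementation and it does not touch a file system.

```scala
import scala.collection.mutable

// Hypothetical in-memory stand-in for a metadata log (illustration only).
final class InMemoryMetadataLog[T] {
  private val entries = mutable.Map.empty[Long, T]

  /** Stores metadata for batchId; returns false if batchId is already stored. */
  def add(batchId: Long, metadata: T): Boolean =
    if (entries.contains(batchId)) false
    else {
      entries.put(batchId, metadata)
      true
    }

  /** Returns the stored metadata for batchId, or None if nothing is stored. */
  def get(batchId: Long): Option[T] = entries.get(batchId)
}
```

In this sketch a second add for the same batchId leaves the first entry untouched, so the first write wins.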
A PathFilter to filter only batch files.
Return metadata for batches between startId (inclusive) and endId (inclusive). If startId is None, just return all batches before endId (inclusive).
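The inclusive range rule can be illustrated over a plain Map. BatchRange.select is a hypothetical helper, not the method documented here; it assumes endId is always supplied and that results come back in ascending batch id order, neither of which is stated above.

```scala
// Hypothetical helper (illustration only): selects entries in [startId, endId].
object BatchRange {
  def select[T](
      entries: Map[Long, T],
      startId: Option[Long], // None means "no lower bound"
      endId: Long            // assumed mandatory in this sketch
  ): Seq[(Long, T)] =
    entries.toSeq
      .filter { case (id, _) => startId.forall(id >= _) && id <= endId }
      .sortBy(_._1) // ascending order is an assumption of this sketch
}
```

With batches 0 to 3 stored, select(entries, Some(1L), 2L) yields batches 1 and 2, while select(entries, None, 1L) yields batches 0 and 1.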
Return the metadata for the specified batchId if it's stored. Otherwise, return None.
Returns the deserialized metadata in a batch file, or None if the file does not exist.
Throws IllegalArgumentException when the path does not point to a batch file.
Return the latest batch ID and its metadata if they exist.
Get an array of [FileStatus] referencing batch files. The array is sorted from the most recent batch file to the oldest.
Removes all log entries earlier than thresholdBatchId (exclusive).
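Because the threshold is exclusive, the entry for thresholdBatchId itself survives a purge. A minimal sketch over an immutable Map, using a hypothetical PurgeSketch helper rather than the documented method:

```scala
// Hypothetical helper (illustration only): models an exclusive purge threshold.
object PurgeSketch {
  /** Keeps only entries whose batch id is >= thresholdBatchId. */
  def purge[T](entries: Map[Long, T], thresholdBatchId: Long): Map[Long, T] =
    entries.filter { case (id, _) => id >= thresholdBatchId }
}
```

Purging batches 0, 1 and 2 with a threshold of 1 removes only batch 0.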
Used to write log files that represent batch commit points in structured streaming. A commit log file will be written immediately after the successful completion of a batch, and before processing the next batch. Here is an execution summary:
- trigger batch 1
- obtain batch 1 offsets and write to offset log
- process batch 1
- write batch 1 to completion log
- trigger batch 2
- obtain batch 2 offsets and write to offset log
- process batch 2
- write batch 2 to completion log
....
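One consequence of this ordering is that a batch whose offsets were logged but which has no completion entry was started and never committed. The sketch below models that check with a hypothetical helper, uncommittedBatch, which takes the highest batch id found in each log. It is an illustration of the ordering only and is not taken from Spark's recovery code.

```scala
// Hypothetical helper (illustration only): compares the latest ids of both logs.
object CommitPointSketch {
  /**
   * Returns the batch that has an offset log entry but no completion log entry,
   * or None when every started batch has been committed.
   */
  def uncommittedBatch(
      latestOffsetId: Option[Long],
      latestCommitId: Option[Long]
  ): Option[Long] =
    latestOffsetId.filter(offsetId => latestCommitId.forall(_ < offsetId))
}
```

For example, if the offset log ends at batch 2 and the completion log ends at batch 1, the helper reports batch 2 as started but not committed.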
The current format of the batch completion log is:
line 1: version
line 2: metadata (optional JSON string)
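The two-line layout can be read with a few lines of string handling. CommitLogFormat.parse is a hypothetical helper, and the sample version string "v1" and the sample JSON used with it are assumptions for illustration; the metadata is returned as a raw string rather than being decoded.

```scala
// Hypothetical parser (illustration only) for the two-line layout above.
object CommitLogFormat {
  /** Returns (version, optional metadata JSON string) from the file content. */
  def parse(content: String): (String, Option[String]) =
    content.split("\n").toList match {
      case version :: rest if version.trim.nonEmpty =>
        // Line 2 is optional: a missing or blank line yields None.
        (version.trim, rest.headOption.map(_.trim).filter(_.nonEmpty))
      case _ =>
        throw new IllegalArgumentException("missing version line")
    }
}
```

Content holding only a version line parses to a version with no metadata, which matches the "optional" note on line 2.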