Package io.delta.kernel.internal
Class TableImpl
Object
io.delta.kernel.internal.TableImpl
- All Implemented Interfaces:
Table
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description:
- void checkpoint(Engine engine, long version): Checkpoint the table at the given version.
- void checksum: Computes and writes a checksum file for the table at the given version.
- createReplaceTableTransactionBuilder(Engine engine, String engineInfo)
- createTransactionBuilder(Engine engine, String engineInfo, Operation operation): Create a TransactionBuilder which can create a Transaction object to mutate the table.
- static Table forPath: Instantiate a table object for the Delta Lake table at the given path.
- static Table forPath: Instantiate a table object for the Delta Lake table at the given path.
- getChanges(Engine engine, long startVersion, long endVersion, Set<DeltaLogActionUtils.DeltaAction> actionSet): Returns delta actions for each version between startVersion and endVersion.
- getClock()
- getLatestSnapshot(Engine engine): Get the latest snapshot of the table.
- getPath: The fully qualified path of this Table instance.
- getSnapshotAsOfTimestamp(Engine engine, long millisSinceEpochUTC): Get the snapshot of the table at the given timestamp.
- getSnapshotAsOfVersion(Engine engine, long versionId): Get the snapshot at the given versionId.
- long getVersionAtOrAfterTimestamp(Engine engine, long millisSinceEpochUTC): Returns the earliest version that was committed at or after millisSinceEpochUTC.
- long getVersionBeforeOrAtTimestamp(Engine engine, long millisSinceEpochUTC): Returns the latest version that was committed before or at millisSinceEpochUTC.
-
Constructor Details
-
TableImpl
-
-
Method Details
-
forPath
Description copied from interface: Table
Instantiate a table object for the Delta Lake table at the given path.
- Behavior when the table location doesn't exist:
  - Reads will fail with a TableNotFoundException
  - Writes will create the location
- Behavior when the table location exists (with contents or not) but not a Delta table:
  - Reads will fail with a TableNotFoundException
  - Writes will create a Delta table at the given location. If there are any existing files in the location that are not already part of the Delta table, they will remain excluded from the Delta table.
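The read outcomes listed above can be modelled with a small, self-contained sketch. It does not call Delta Kernel. The class `TableLocationSketch`, the enum `LocationState`, and both methods are hypothetical names introduced only for illustration. As a simplifying assumption, a location counts as a Delta table when it contains a `_delta_log` directory; the real check may be stricter.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy model of the read outcomes listed above; not Delta Kernel code. */
class TableLocationSketch {

    /** The three location states the documentation distinguishes. */
    enum LocationState { MISSING, NOT_A_DELTA_TABLE, DELTA_TABLE }

    /**
     * Classifies a location. Assumption (simplified): a location is treated
     * as a Delta table when it contains a _delta_log directory.
     */
    static LocationState classify(Path location) {
        if (!Files.exists(location)) {
            return LocationState.MISSING;
        }
        return Files.isDirectory(location.resolve("_delta_log"))
                ? LocationState.DELTA_TABLE
                : LocationState.NOT_A_DELTA_TABLE;
    }

    /**
     * Reads succeed only on an existing Delta table. In the other two states
     * the documentation says reads fail with TableNotFoundException.
     */
    static boolean readWouldSucceed(LocationState state) {
        return state == LocationState.DELTA_TABLE;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("table-location-sketch");
        System.out.println(classify(dir.resolve("missing"))); // location does not exist
        System.out.println(classify(dir));                    // exists, no _delta_log
        Files.createDirectory(dir.resolve("_delta_log"));
        System.out.println(classify(dir));                    // now has a _delta_log
    }
}
```

The sketch covers reads only. Writes are documented to create the location or the Delta table in the first two states, which a pure classification function does not need to model.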
-
forPath
Instantiate a table object for the Delta Lake table at the given path. It takes an additional Clock parameter, which helps in testing.
-
getPath
Description copied from interface: Table
The fully qualified path of this Table instance.
-
getLatestSnapshot
Description copied from interface: Table
Get the latest snapshot of the table.
- Specified by:
getLatestSnapshot in interface Table
- Parameters:
engine - Engine instance to use in Delta Kernel.
- Returns:
an instance of Snapshot
- Throws:
TableNotFoundException - if the table is not found
-
getSnapshotAsOfVersion
Description copied from interface: Table
Get the snapshot at the given versionId.
- Specified by:
getSnapshotAsOfVersion in interface Table
- Parameters:
engine - Engine instance to use in Delta Kernel.
versionId - snapshot version to retrieve
- Returns:
an instance of Snapshot
- Throws:
TableNotFoundException - if the table is not found
-
getSnapshotAsOfTimestamp
public Snapshot getSnapshotAsOfTimestamp(Engine engine, long millisSinceEpochUTC) throws TableNotFoundException
Description copied from interface: Table
Get the snapshot of the table at the given timestamp. This is the latest version of the table that was committed before or at timestamp.
Specifically:
- If a commit version exactly matches the provided timestamp, we return the table snapshot at that version.
- Else, we return the table snapshot at the latest commit version with a timestamp less than the provided one.
- If the provided timestamp is less than the timestamp of every committed version, we throw an error.
- If the provided timestamp is after (strictly greater than) the timestamp of the latest version of the table, we throw an error.
- Specified by:
getSnapshotAsOfTimestamp in interface Table
- Parameters:
engine - Engine instance to use in Delta Kernel.
millisSinceEpochUTC - timestamp to fetch the snapshot for, in milliseconds since the Unix epoch
- Returns:
an instance of Snapshot
- Throws:
TableNotFoundException - if the table is not found
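The four timestamp rules for getSnapshotAsOfTimestamp can be checked against a small model. The following is a minimal sketch and does not call Delta Kernel. The class `SnapshotTimestampSketch` and its methods are hypothetical. It assumes commit timestamps are unique and increase with the version. It uses `IllegalArgumentException` as a stand-in for whatever error the real method throws.

```java
import java.util.TreeMap;

/** Toy model of the timestamp-to-version rules; not Delta Kernel code. */
class SnapshotTimestampSketch {

    /**
     * Resolves a timestamp to a commit version.
     * commitsByTimestamp maps commit timestamp (ms since epoch) to version.
     * Assumption: timestamps are unique and increase with the version.
     */
    static long resolveVersion(TreeMap<Long, Long> commitsByTimestamp, long millisSinceEpochUTC) {
        if (commitsByTimestamp.isEmpty()) {
            throw new IllegalStateException("the toy log has no commits");
        }
        // Rule 3: earlier than every commit is an error.
        if (millisSinceEpochUTC < commitsByTimestamp.firstKey()) {
            throw new IllegalArgumentException("timestamp is before the earliest commit");
        }
        // Rule 4: strictly later than the latest commit is an error.
        if (millisSinceEpochUTC > commitsByTimestamp.lastKey()) {
            throw new IllegalArgumentException("timestamp is after the latest commit");
        }
        // Rules 1 and 2: floorEntry returns the exact match if present,
        // otherwise the entry with the greatest smaller timestamp.
        return commitsByTimestamp.floorEntry(millisSinceEpochUTC).getValue();
    }

    /** Three commits: version 0 at t=1000, version 1 at t=2000, version 2 at t=3000. */
    static TreeMap<Long, Long> sampleLog() {
        TreeMap<Long, Long> log = new TreeMap<>();
        log.put(1_000L, 0L);
        log.put(2_000L, 1L);
        log.put(3_000L, 2L);
        return log;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> log = sampleLog();
        System.out.println(resolveVersion(log, 2_000L)); // exact match
        System.out.println(resolveVersion(log, 2_500L)); // falls between two commits
    }
}
```

In this model a timestamp equal to the latest commit's timestamp is accepted, because the fourth rule only rejects strictly greater values.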
-
checkpoint
public void checkpoint(Engine engine, long version) throws TableNotFoundException, CheckpointAlreadyExistsException, IOException
Description copied from interface: Table
Checkpoint the table at the given version. It writes a single checkpoint file.
- Specified by:
checkpoint in interface Table
- Parameters:
engine - Engine instance to use.
version - Version to checkpoint.
- Throws:
TableNotFoundException - if the table is not found
CheckpointAlreadyExistsException - if a checkpoint already exists at the given version
IOException - for any I/O error.
-
checksum
Description copied from interface: Table
Computes and writes a checksum file for the table at the given version. If a checksum file already exists, this method does nothing.
Note: For very large tables, this operation may be expensive as it requires scanning the log to compute table statistics.
- Specified by:
checksum in interface Table
- Parameters:
engine - Engine instance to use.
version - Version to generate checksum file for.
- Throws:
TableNotFoundException - if the table is not found
IOException - for any I/O error.
-
createTransactionBuilder
public TransactionBuilder createTransactionBuilder(Engine engine, String engineInfo, Operation operation)
Description copied from interface: Table
Create a TransactionBuilder which can create a Transaction object to mutate the table.
- Specified by:
createTransactionBuilder in interface Table
- Parameters:
engine - Engine instance to use.
engineInfo - information about the engine that is making the updates.
operation - metadata of the operation that is being performed, e.g. "insert", "delete".
- Returns:
TransactionBuilder instance to build the transaction.
-
createReplaceTableTransactionBuilder
-
getClock
-
getChanges
public CloseableIterator<ColumnarBatch> getChanges(Engine engine, long startVersion, long endVersion, Set<DeltaLogActionUtils.DeltaAction> actionSet)
Returns delta actions for each version between startVersion and endVersion. Only returns the actions requested in actionSet.
For the returned columnar batches:
- Each row within the same batch is guaranteed to have the same commit version
- The batch commit versions are monotonically increasing
- The top-level columns include "version", "timestamp", and the actions requested in actionSet. "version" and "timestamp" are the first and second columns in the schema, respectively. The remaining columns are based on the actions requested and each have the schema found in DeltaAction.schema.
- Parameters:
engine - Engine instance to use in Delta Kernel.
startVersion - start version (inclusive)
endVersion - end version (inclusive)
actionSet - the actions to read and return from the JSON log files
- Returns:
an iterator of batches where each row in the batch has exactly one non-null action and its commit version and timestamp
- Throws:
TableNotFoundException - if the table does not exist or if it is not a delta table
KernelException - if a commit file does not exist for any of the versions in the provided range
KernelException - if provided an invalid version range
KernelException - if the version range contains a version with reader protocol that is unsupported by Kernel
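The batch guarantees and range errors described for getChanges can be illustrated with a toy in-memory log. This is a sketch under stated assumptions, not Delta Kernel's implementation. `ChangesSketch`, `Action`, `Commit`, and `Row` are hypothetical stand-ins. The standard exceptions `IllegalArgumentException` and `IllegalStateException` replace `KernelException`. The sketch also assumes that versions with no requested actions produce no batch, which the documentation does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

/** Toy model of the getChanges contract; not Delta Kernel code. */
class ChangesSketch {

    /** Hypothetical stand-in for DeltaLogActionUtils.DeltaAction. */
    enum Action { ADD, REMOVE, METADATA }

    /** One commit in the toy log: its timestamp and the actions it contains. */
    static final class Commit {
        final long timestamp;
        final List<Action> actions;
        Commit(long timestamp, Action... actions) {
            this.timestamp = timestamp;
            this.actions = Arrays.asList(actions);
        }
    }

    /** One output row: commit version, commit timestamp, and exactly one action. */
    static final class Row {
        final long version;
        final long timestamp;
        final Action action;
        Row(long version, long timestamp, Action action) {
            this.version = version;
            this.timestamp = timestamp;
            this.action = action;
        }
    }

    /** Returns one batch per version in [startVersion, endVersion], in increasing version order. */
    static List<List<Row>> changes(TreeMap<Long, Commit> log, long startVersion,
                                   long endVersion, Set<Action> actionSet) {
        if (startVersion < 0 || endVersion < startVersion) {
            throw new IllegalArgumentException("invalid version range");
        }
        List<List<Row>> batches = new ArrayList<>();
        for (long v = startVersion; v <= endVersion; v++) { // both bounds inclusive
            Commit commit = log.get(v);
            if (commit == null) {
                throw new IllegalStateException("no commit file for version " + v);
            }
            List<Row> batch = new ArrayList<>();
            for (Action action : commit.actions) {
                if (actionSet.contains(action)) { // keep only the requested actions
                    batch.add(new Row(v, commit.timestamp, action));
                }
            }
            if (!batch.isEmpty()) { // assumption: skip versions with nothing requested
                batches.add(batch);
            }
        }
        return batches;
    }

    /** Versions 0..2 with timestamps 100, 200, 300. */
    static TreeMap<Long, Commit> sampleLog() {
        TreeMap<Long, Commit> log = new TreeMap<>();
        log.put(0L, new Commit(100L, Action.METADATA, Action.ADD));
        log.put(1L, new Commit(200L, Action.ADD, Action.REMOVE));
        log.put(2L, new Commit(300L, Action.REMOVE));
        return log;
    }

    public static void main(String[] args) {
        Set<Action> wanted = EnumSet.of(Action.ADD, Action.REMOVE);
        for (List<Row> batch : changes(sampleLog(), 0, 2, wanted)) {
            for (Row row : batch) {
                System.out.println(row.version + " " + row.timestamp + " " + row.action);
            }
        }
    }
}
```

Because each batch is built from a single commit, every row in a batch shares one version, and iterating versions in ascending order keeps batch versions monotonically increasing.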
-
getVersionBeforeOrAtTimestamp
Returns the latest version that was committed before or at millisSinceEpochUTC. If no version exists, throws a KernelException.
Specifically:
- If a commit version exactly matches the provided timestamp, we return it.
- Else, we return the latest commit version with a timestamp less than the provided one.
- If the provided timestamp is less than the timestamp of every committed version, we throw an error.
- Parameters:
millisSinceEpochUTC - the number of milliseconds since midnight, January 1, 1970 UTC
- Returns:
latest commit that happened before or at timestamp.
- Throws:
KernelException - if the timestamp is less than the timestamp of every committed version
TableNotFoundException - if no delta table is found
-
getVersionAtOrAfterTimestamp
Returns the earliest version that was committed at or after millisSinceEpochUTC. If no version exists, throws a KernelException.
Specifically:
- If a commit version exactly matches the provided timestamp, we return it.
- Else, we return the earliest commit version with a timestamp greater than the provided one.
- If the provided timestamp is larger than the timestamp of every committed version, we throw an error.
- Parameters:
millisSinceEpochUTC - the number of milliseconds since midnight, January 1, 1970 UTC
- Returns:
earliest commit that happened at or after timestamp.
- Throws:
KernelException - if the timestamp is greater than the timestamp of every committed version
TableNotFoundException - if no delta table is found
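The two version lookups mirror each other. Per the bullet rules, one takes the nearest commit at or before the timestamp and the other takes the nearest commit at or after it. A minimal sketch of that symmetry follows, using a sorted map rather than the Kernel API. `VersionLookupSketch` and its methods are hypothetical. It assumes commit timestamps are unique and increase with the version, and uses `IllegalArgumentException` as a stand-in for `KernelException`.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the two timestamp-to-version lookups; not Delta Kernel code. */
class VersionLookupSketch {

    /** Exact match if present, else the latest commit with a smaller timestamp. */
    static long versionBeforeOrAt(TreeMap<Long, Long> commitsByTimestamp, long millisSinceEpochUTC) {
        Map.Entry<Long, Long> entry = commitsByTimestamp.floorEntry(millisSinceEpochUTC);
        if (entry == null) {
            // The timestamp is earlier than every commit.
            throw new IllegalArgumentException("no commit at or before " + millisSinceEpochUTC);
        }
        return entry.getValue();
    }

    /** Exact match if present, else the earliest commit with a greater timestamp. */
    static long versionAtOrAfter(TreeMap<Long, Long> commitsByTimestamp, long millisSinceEpochUTC) {
        Map.Entry<Long, Long> entry = commitsByTimestamp.ceilingEntry(millisSinceEpochUTC);
        if (entry == null) {
            // The timestamp is later than every commit.
            throw new IllegalArgumentException("no commit at or after " + millisSinceEpochUTC);
        }
        return entry.getValue();
    }

    /** Three commits: version 0 at t=1000, version 1 at t=2000, version 2 at t=3000. */
    static TreeMap<Long, Long> sampleLog() {
        TreeMap<Long, Long> log = new TreeMap<>();
        log.put(1_000L, 0L);
        log.put(2_000L, 1L);
        log.put(3_000L, 2L);
        return log;
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> log = sampleLog();
        // A timestamp between two commits resolves to different versions
        // depending on the direction of the lookup.
        System.out.println(versionBeforeOrAt(log, 2_500L));
        System.out.println(versionAtOrAfter(log, 2_500L));
    }
}
```

For a timestamp that exactly matches a commit, both lookups return the same version. They differ only for timestamps that fall between commits or outside the committed range.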
-