Class TableImpl

Object
io.delta.kernel.internal.TableImpl
All Implemented Interfaces:
Table

public class TableImpl extends Object implements Table
  • Constructor Details

    • TableImpl

      public TableImpl(String tablePath, Clock clock)
  • Method Details

    • forPath

      public static Table forPath(Engine engine, String path)
      Description copied from interface: Table
      Instantiate a table object for the Delta Lake table at the given path.
      • Behavior when the table location doesn't exist:
        • Reads will fail with a TableNotFoundException
        • Writes will create the location
      • Behavior when the table location exists (with contents or not) but is not a Delta table:
        • Reads will fail with a TableNotFoundException
        • Writes will create a Delta table at the given location. If there are any existing files in the location that are not already part of the Delta table, they will remain excluded from the Delta table.
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      path - location of the table. The path is resolved to a fully qualified path using the given engine.
      Returns:
      an instance of Table representing the Delta table at the given path
    • forPath

      public static Table forPath(Engine engine, String path, Clock clock)
      Instantiate a table object for the Delta Lake table at the given path. This overload takes an additional Clock parameter, which lets tests control the time source.
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      path - location of the table.
      clock - Clock instance to use for time-related operations.
      Returns:
      an instance of Table representing the Delta table at the given path
    • getPath

      public String getPath(Engine engine)
      Description copied from interface: Table
      The fully qualified path of this Table instance.
      Specified by:
      getPath in interface Table
      Parameters:
      engine - Engine instance.
      Returns:
      the table path.
    • getLatestSnapshot

      public Snapshot getLatestSnapshot(Engine engine) throws TableNotFoundException
      Description copied from interface: Table
      Get the latest snapshot of the table.
      Specified by:
      getLatestSnapshot in interface Table
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      Returns:
      an instance of Snapshot
      Throws:
      TableNotFoundException - if the table is not found
    • getSnapshotAsOfVersion

      public Snapshot getSnapshotAsOfVersion(Engine engine, long versionId) throws TableNotFoundException
      Description copied from interface: Table
      Get the snapshot at the given versionId.
      Specified by:
      getSnapshotAsOfVersion in interface Table
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      versionId - snapshot version to retrieve
      Returns:
      an instance of Snapshot
      Throws:
      TableNotFoundException - if the table is not found
    • getSnapshotAsOfTimestamp

      public Snapshot getSnapshotAsOfTimestamp(Engine engine, long millisSinceEpochUTC) throws TableNotFoundException
      Description copied from interface: Table
      Get the snapshot of the table at the given timestamp. This is the latest version of the table that was committed at or before the given timestamp.

      Specifically:

      • If the timestamp of a commit version exactly matches the provided timestamp, we return the table snapshot at that version.
      • Otherwise, we return the snapshot at the latest commit version with a timestamp less than the provided one.
      • If the provided timestamp is less than the timestamp of every committed version, we throw an error.
      • If the provided timestamp is after (strictly greater than) the timestamp of the latest version of the table, we throw an error.
      Specified by:
      getSnapshotAsOfTimestamp in interface Table
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      millisSinceEpochUTC - timestamp to fetch the snapshot for, in milliseconds since the Unix epoch
      Returns:
      an instance of Snapshot
      Throws:
      TableNotFoundException - if the table is not found
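
      The four timestamp rules above can be modeled with a sorted map. The sketch below is a standalone illustration, not Kernel code: the class name, the map of commit timestamps to versions, and the use of IllegalArgumentException in place of Kernel's own error type are all assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Standalone model of the timestamp rules; it does not call Delta Kernel. */
final class SnapshotTimestampRules {

    /**
     * Resolves a timestamp to a commit version.
     *
     * @param commits hypothetical map of commit timestamp (ms since epoch) to version
     */
    static long resolveVersion(TreeMap<Long, Long> commits, long millisSinceEpochUTC) {
        if (commits.isEmpty()) {
            throw new IllegalArgumentException("no commits");
        }
        if (millisSinceEpochUTC > commits.lastKey()) {
            // Strictly after the latest commit: error.
            throw new IllegalArgumentException("timestamp is after the latest commit");
        }
        // floorEntry covers the first two rules: an exact match,
        // otherwise the latest commit with a smaller timestamp.
        Map.Entry<Long, Long> floor = commits.floorEntry(millisSinceEpochUTC);
        if (floor == null) {
            // Earlier than every commit: error.
            throw new IllegalArgumentException("timestamp is before the earliest commit");
        }
        return floor.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> commits = new TreeMap<>();
        commits.put(1_000L, 0L);
        commits.put(2_000L, 1L);
        commits.put(3_000L, 2L);
        System.out.println(resolveVersion(commits, 2_000L)); // exact match: prints 1
        System.out.println(resolveVersion(commits, 2_500L)); // falls back to version 1: prints 1
    }
}
```

      In this model, 999 and 3001 both fail: the first is earlier than every commit, the second is strictly after the latest one.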
    • checkpoint

      public void checkpoint(Engine engine, long version) throws TableNotFoundException, CheckpointAlreadyExistsException, IOException
      Description copied from interface: Table
      Checkpoint the table at the given version. It writes a single checkpoint file.
      Specified by:
      checkpoint in interface Table
      Parameters:
      engine - Engine instance to use.
      version - Version to checkpoint.
      Throws:
      TableNotFoundException - if the table is not found
      CheckpointAlreadyExistsException - if a checkpoint already exists at the given version
      IOException - for any I/O error.
    • checksum

      public void checksum(Engine engine, long version) throws TableNotFoundException, IOException
      Description copied from interface: Table
      Computes and writes a checksum file for the table at the given version. If a checksum file already exists, this method does nothing.

      Note: For very large tables, this operation may be expensive as it requires scanning the log to compute table statistics.

      Specified by:
      checksum in interface Table
      Parameters:
      engine - Engine instance to use.
      version - Version to generate checksum file for.
      Throws:
      TableNotFoundException - if the table is not found
      IOException - for any I/O error.
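
      The "does nothing if the file already exists" contract is a write-once pattern. The sketch below shows that pattern with java.nio on a local file system. It is an illustration only: the class name, method name, and file name are hypothetical, and it makes no claim about how Kernel actually writes checksum files.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Standalone write-once sketch; it does not call Delta Kernel. */
final class WriteOnceSketch {

    /** Returns true if the file was written, false if it already existed and was left untouched. */
    static boolean writeIfAbsent(Path file, byte[] content) throws IOException {
        try {
            // CREATE_NEW fails if the file is already present, so existing content is never overwritten.
            Files.write(file, content, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("checksum-sketch");
        Path file = dir.resolve("example.crc"); // hypothetical file name
        System.out.println(writeIfAbsent(file, new byte[] {1})); // first call writes: prints true
        System.out.println(writeIfAbsent(file, new byte[] {2})); // second call is a no-op: prints false
    }
}
```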
    • createTransactionBuilder

      public TransactionBuilder createTransactionBuilder(Engine engine, String engineInfo, Operation operation)
      Description copied from interface: Table
      Create a TransactionBuilder which can create a Transaction object to mutate the table.
      Specified by:
      createTransactionBuilder in interface Table
      Parameters:
      engine - Engine instance to use.
      engineInfo - information about the engine that is making the updates.
      operation - metadata of the operation that is being performed, e.g. "insert", "delete".
      Returns:
      TransactionBuilder instance to build the transaction.
    • createReplaceTableTransactionBuilder

      public TransactionBuilder createReplaceTableTransactionBuilder(Engine engine, String engineInfo)
    • getClock

      public Clock getClock()
    • getChanges

      public CloseableIterator<ColumnarBatch> getChanges(Engine engine, long startVersion, long endVersion, Set<DeltaLogActionUtils.DeltaAction> actionSet)
      Returns the Delta actions committed in each version from startVersion to endVersion, both inclusive. Only the actions requested in actionSet are returned.

      For the returned columnar batches:

      • Each row within the same batch is guaranteed to have the same commit version
      • The batch commit versions are monotonically increasing
      • The top-level columns include "version", "timestamp", and the actions requested in actionSet. "version" and "timestamp" are the first and second columns in the schema, respectively. The remaining columns are based on the actions requested and each have the schema found in DeltaAction.schema.
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      startVersion - start version (inclusive)
      endVersion - end version (inclusive)
      actionSet - the actions to read and return from the JSON log files
      Returns:
      an iterator of batches where each row in the batch has exactly one non-null action and its commit version and timestamp
      Throws:
      TableNotFoundException - if the table does not exist or if it is not a Delta table
      KernelException - if a commit file does not exist for any of the versions in the provided range
      KernelException - if provided an invalid version range
      KernelException - if the version range contains a version with reader protocol that is unsupported by Kernel
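
      A consumer can check the first two batch guarantees listed above (one commit version per batch, versions never decreasing across batches) as it iterates. The sketch below models each batch by its "version" column only, as a plain long[]; that representation and the class name are assumptions for illustration, and it treats "monotonically increasing" as non-strict so that consecutive batches may share a version.

```java
import java.util.Arrays;
import java.util.List;

/** Standalone check of the batch ordering guarantees; it does not call Delta Kernel. */
final class ChangeBatchInvariants {

    /**
     * @param batchVersionColumns hypothetical stand-in for the returned batches: one long[]
     *     per batch, holding the "version" column value of every row in that batch
     */
    static boolean holds(List<long[]> batchVersionColumns) {
        long previous = Long.MIN_VALUE;
        for (long[] versions : batchVersionColumns) {
            if (versions.length == 0) {
                continue; // an empty batch carries no version to compare
            }
            long batchVersion = versions[0];
            for (long v : versions) {
                if (v != batchVersion) {
                    return false; // rows of one batch must share a commit version
                }
            }
            if (batchVersion < previous) {
                return false; // batch versions must not go backwards
            }
            previous = batchVersion;
        }
        return true;
    }

    public static void main(String[] args) {
        List<long[]> ordered = Arrays.asList(new long[] {0, 0}, new long[] {1}, new long[] {1, 1});
        List<long[]> mixed = Arrays.asList(new long[] {0, 1});
        System.out.println(holds(ordered)); // prints true
        System.out.println(holds(mixed));   // prints false
    }
}
```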
    • getVersionBeforeOrAtTimestamp

      public long getVersionBeforeOrAtTimestamp(Engine engine, long millisSinceEpochUTC)
      Returns the latest version that was committed at or before millisSinceEpochUTC. If no such version exists, throws a KernelException.

      Specifically:

      • If the timestamp of a commit version exactly matches the provided timestamp, we return that version.
      • Otherwise, we return the latest commit version with a timestamp less than the provided one.
      • If the provided timestamp is less than the timestamp of every committed version, we throw an error.
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      millisSinceEpochUTC - the number of milliseconds since midnight, January 1, 1970 UTC
      Returns:
      the latest commit version at or before the given timestamp.
      Throws:
      KernelException - if the timestamp is less than the timestamp of every committed version
      TableNotFoundException - if no Delta table is found
    • getVersionAtOrAfterTimestamp

      public long getVersionAtOrAfterTimestamp(Engine engine, long millisSinceEpochUTC)
      Returns the earliest version that was committed at or after millisSinceEpochUTC. If no such version exists, throws a KernelException.

      Specifically:

      • If the timestamp of a commit version exactly matches the provided timestamp, we return that version.
      • Otherwise, we return the earliest commit version with a timestamp greater than the provided one.
      • If the provided timestamp is greater than the timestamp of every committed version, we throw an error.
      Parameters:
      engine - Engine instance to use in Delta Kernel.
      millisSinceEpochUTC - the number of milliseconds since midnight, January 1, 1970 UTC
      Returns:
      the earliest commit version at or after the given timestamp.
      Throws:
      KernelException - if the timestamp is greater than the timestamp of every committed version
      TableNotFoundException - if no Delta table is found
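
      The two version lookups differ only in search direction: "before or at" is a floor lookup over commit timestamps, and "at or after" is a ceiling lookup. The sketch below contrasts them on a sorted map. It is a standalone model, not Kernel code; the class and method names, the timestamp-to-version map, and the use of IllegalArgumentException in place of KernelException are assumptions for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Standalone floor/ceiling model of the two version lookups; it does not call Delta Kernel. */
final class VersionLookupSketch {

    /** Latest version whose commit timestamp is at or before ts. */
    static long versionBeforeOrAt(TreeMap<Long, Long> commits, long ts) {
        Map.Entry<Long, Long> entry = commits.floorEntry(ts);
        if (entry == null) {
            throw new IllegalArgumentException("timestamp is before every commit");
        }
        return entry.getValue();
    }

    /** Earliest version whose commit timestamp is at or after ts. */
    static long versionAtOrAfter(TreeMap<Long, Long> commits, long ts) {
        Map.Entry<Long, Long> entry = commits.ceilingEntry(ts);
        if (entry == null) {
            throw new IllegalArgumentException("timestamp is after every commit");
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        // Hypothetical commit timestamps (ms since epoch) mapped to versions.
        TreeMap<Long, Long> commits = new TreeMap<>();
        commits.put(1_000L, 0L);
        commits.put(2_000L, 1L);
        commits.put(3_000L, 2L);
        System.out.println(versionBeforeOrAt(commits, 2_500L)); // prints 1
        System.out.println(versionAtOrAfter(commits, 2_500L));  // prints 2
        System.out.println(versionAtOrAfter(commits, 2_000L));  // exact match: prints 1
    }
}
```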