Class TransactionBuilderImpl

Object
io.delta.kernel.internal.TransactionBuilderImpl
All Implemented Interfaces:
TransactionBuilder
Direct Known Subclasses:
ReplaceTableTransactionBuilderImpl

public class TransactionBuilderImpl extends Object implements TransactionBuilder
  • Constructor Details

  • Method Details

    • withSchema

      public TransactionBuilder withSchema(Engine engine, StructType newSchema)
      Description copied from interface: TransactionBuilder
      Set the schema of the table. If the schema is being set on an existing table as a schema evolution, column mapping must be enabled. This API preserves field metadata such as field IDs and physical names. A field without field metadata is treated as a new column and is assigned new IDs and physical names. The supported schema evolutions are column additions, removals, renames, and moves. If a schema evolution is performed, implementations must perform the following validations:
      • No duplicate columns are allowed
      • Column names contain only valid characters
      • Data types are supported
      • No new non-nullable fields are added
      • Physical column name consistency is preserved in the new schema
      • No type changes
      • ToDo: Nested IDs for array/map types are preserved in the new schema
      • ToDo: Validate invalid field reorderings
      Specified by:
      withSchema in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.
      newSchema - The new schema of the table.
      Returns:
      updated TransactionBuilder instance.
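
      A few of the validations listed above can be sketched in isolation. The example below is a minimal illustration, not the Kernel's implementation: the class SchemaEvolutionChecks and its Col type are hypothetical stand-ins for a flattened schema (the real API takes a StructType), a null field ID models "no field metadata", and comparing names case-insensitively is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical, flattened stand-in for a schema; this is NOT the Kernel's StructType.
final class SchemaEvolutionChecks {

    static final class Col {
        final Integer fieldId; // null models "no field metadata", i.e. a new column
        final String name;
        final String type;
        final boolean nullable;

        Col(Integer fieldId, String name, String type, boolean nullable) {
            this.fieldId = fieldId;
            this.name = name;
            this.type = type;
            this.nullable = nullable;
        }
    }

    // Returns one message per violated rule; an empty list means the evolution passed
    // the three checks modeled here (duplicates, new non-nullable fields, type changes).
    static List<String> validate(List<Col> current, List<Col> proposed) {
        List<String> violations = new ArrayList<>();

        // Existing columns are matched by field ID, so a rename keeps its identity.
        Map<Integer, Col> currentById = new HashMap<>();
        for (Col c : current) {
            currentById.put(c.fieldId, c);
        }

        Set<String> seenNames = new HashSet<>();
        for (Col c : proposed) {
            // Assumption: column names are compared case-insensitively.
            if (!seenNames.add(c.name.toLowerCase(Locale.ROOT))) {
                violations.add("duplicate column: " + c.name);
            }
            Col previous = (c.fieldId == null) ? null : currentById.get(c.fieldId);
            if (previous == null) {
                if (!c.nullable) {
                    violations.add("new non-nullable column: " + c.name);
                }
            } else if (!previous.type.equals(c.type)) {
                violations.add("type change on column: " + c.name);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<Col> current = List.of(new Col(1, "id", "long", false));
        // Rename of column 1 plus a new nullable column: no violations expected.
        List<Col> proposed = List.of(
                new Col(1, "user_id", "long", false),
                new Col(null, "note", "string", true));
        System.out.println(validate(current, proposed));
    }
}
```

      Matching on field ID rather than on name is what lets a rename pass while a type change on the same ID is still caught.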
    • withPartitionColumns

      public TransactionBuilder withPartitionColumns(Engine engine, List<String> partitionColumns)
      Description copied from interface: TransactionBuilder
      Set the list of partition columns when creating a new partitioned table.
      Specified by:
      withPartitionColumns in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.
      partitionColumns - The partition columns of the table. These should be a subset of the columns in the schema. Only top-level columns are allowed to be partitioned. Note: Clustering columns and partition columns cannot coexist in a table.
      Returns:
      updated TransactionBuilder instance.
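
      The constraints on partitionColumns can be pictured as a small pre-check. This is a sketch under stated assumptions, not Kernel code: PartitionColumnChecks is a hypothetical helper, columns are plain strings, and names are matched exactly.

```java
import java.util.List;

// Hypothetical helper; the Kernel performs its own validation internally.
final class PartitionColumnChecks {

    // Throws if the partition columns break one of the documented constraints.
    static void validate(
            List<String> topLevelColumns,
            List<String> partitionColumns,
            List<String> clusteringColumns) {
        // Clustering columns and partition columns cannot coexist in a table.
        if (!partitionColumns.isEmpty() && !clusteringColumns.isEmpty()) {
            throw new IllegalArgumentException(
                    "partition columns and clustering columns cannot coexist");
        }
        // Only top-level columns of the schema may be partition columns.
        // Assumption: exact, case-sensitive name matching.
        for (String column : partitionColumns) {
            if (!topLevelColumns.contains(column)) {
                throw new IllegalArgumentException(
                        "not a top-level column of the schema: " + column);
            }
        }
    }

    public static void main(String[] args) {
        List<String> schema = List.of("id", "event_date", "payload");
        validate(schema, List.of("event_date"), List.of()); // passes silently
        System.out.println("partition columns accepted");
    }
}
```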
    • withClusteringColumns

      public TransactionBuilder withClusteringColumns(Engine engine, List<Column> clusteringColumns)
      There are three possible cases when handling clustering columns via `withClusteringColumns`:
      • Clustering columns are not set (i.e., `withClusteringColumns` is not called):
        • No changes are made related to clustering.
        • For table creation, the table is initialized as a non-clustered table.
        • For table updates, the existing clustered or non-clustered state remains unchanged (i.e., no protocol or domain metadata updates).
      • Clustering columns are an empty list:
        • This is equivalent to executing `ALTER TABLE ... CLUSTER BY NONE` in Delta.
        • The table remains a clustered table, but its clustering domain metadata is updated to reflect an empty list of clustering columns.
      • Clustering columns are a non-empty list:
        • The table is treated as a clustered table.
        • We update the protocol (if needed) to include clustering writer support and set the clustering domain metadata accordingly.
      Specified by:
      withClusteringColumns in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.
      clusteringColumns - The clustering columns of the table. These should be a subset of the columns in the schema. Both top-level and nested columns are allowed to be clustered. Note: Clustering columns and partition columns cannot coexist in a table.
      Returns:
      updated TransactionBuilder instance.
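
      The three cases described above map naturally onto an optional list: absent, present but empty, and present and non-empty. The sketch below only illustrates that decision; ClusteringCases and its Action enum are hypothetical names, and columns are simplified to strings instead of Column.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical model of the three clustering cases; not part of the Kernel API.
final class ClusteringCases {

    enum Action {
        NO_CHANGE,        // withClusteringColumns was never called
        CLUSTER_BY_NONE,  // called with an empty list
        SET_CLUSTERING    // called with a non-empty list
    }

    static Action resolve(Optional<List<String>> clusteringColumns) {
        if (!clusteringColumns.isPresent()) {
            // Not set: no protocol or domain metadata updates related to clustering.
            return Action.NO_CHANGE;
        }
        if (clusteringColumns.get().isEmpty()) {
            // Empty list: still a clustered table, with an empty clustering column list.
            return Action.CLUSTER_BY_NONE;
        }
        // Non-empty list: clustered table with the given columns.
        return Action.SET_CLUSTERING;
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.empty()));
        System.out.println(resolve(Optional.of(Collections.<String>emptyList())));
        System.out.println(resolve(Optional.of(Arrays.asList("city", "address.zip"))));
    }
}
```

      Keeping "not set" distinct from "empty list" matters: collapsing the two would make it impossible to tell a request for CLUSTER BY NONE apart from a transaction that does not touch clustering at all.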
    • withTransactionId

      public TransactionBuilder withTransactionId(Engine engine, String applicationId, long transactionVersion)
      Description copied from interface: TransactionBuilder
      Set the transaction identifier for idempotent writes. Incremental processing systems (e.g., streaming systems) that track progress using their own application-specific versions need to record what progress has been made, in order to avoid duplicating data in the face of failures and retries during writes. By setting the transaction identifier, the Delta table can ensure that data with the same identifier is not written multiple times. For more information, refer to the Delta protocol section Transaction Identifiers.
      Specified by:
      withTransactionId in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.
      applicationId - The application ID that is writing to the table.
      transactionVersion - The version of the transaction. This should be monotonically increasing with each write for the same application ID.
      Returns:
      updated TransactionBuilder instance.
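
      The idea behind the transaction identifier can be shown with an in-memory model. This is a sketch only: IdempotentWriteSketch is a hypothetical class, the map stands in for what the table records per application ID, and returning false for a duplicate is an assumption made for the example (how the Kernel reports a duplicate is not described here).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of idempotent writes keyed by application ID.
final class IdempotentWriteSketch {

    // Last committed transaction version per application ID.
    private final Map<String, Long> lastCommitted = new HashMap<>();

    // Returns true if the write is applied, false if it is skipped because this
    // (applicationId, transactionVersion) was already covered by an earlier commit.
    boolean commit(String applicationId, long transactionVersion) {
        Long last = lastCommitted.get(applicationId);
        if (last != null && transactionVersion <= last) {
            return false; // retry of work that already landed: do not write again
        }
        lastCommitted.put(applicationId, transactionVersion);
        return true;
    }

    public static void main(String[] args) {
        IdempotentWriteSketch table = new IdempotentWriteSketch();
        System.out.println(table.commit("stream-a", 1L)); // first write of batch 1
        System.out.println(table.commit("stream-a", 1L)); // retry of batch 1
        System.out.println(table.commit("stream-a", 2L)); // next batch
    }
}
```

      This is why transactionVersion should increase monotonically per application ID: a version at or below the last recorded one is indistinguishable from a retry.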
    • withTableProperties

      public TransactionBuilder withTableProperties(Engine engine, Map<String,String> properties)
      Description copied from interface: TransactionBuilder
      Set the table properties for the table. If the table already contains a property with the same key, its value is replaced when the new value differs. Note that user properties (those without a 'delta.' prefix) are case-sensitive. Delta properties are case-insensitive and are normalized to their expected case before being written to the log.
      Specified by:
      withTableProperties in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.
      properties - The table properties to set. These are key-value pairs that can be used to configure the table; they are stored in the table metadata.
      Returns:
      updated TransactionBuilder instance.
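
      The difference in case handling between user properties and 'delta.' properties can be illustrated as follows. PropertyKeyNormalizer is a hypothetical helper, the two canonical keys are only an illustrative subset, and leaving an unrecognized 'delta.' key unchanged is an assumption of this sketch rather than documented Kernel behavior.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper illustrating key normalization; not the Kernel's code.
final class PropertyKeyNormalizer {

    // Lower-cased key -> expected casing. Illustrative subset only.
    private static final Map<String, String> CANONICAL = new HashMap<>();

    static {
        for (String key : new String[] {"delta.appendOnly", "delta.enableChangeDataFeed"}) {
            CANONICAL.put(key.toLowerCase(Locale.ROOT), key);
        }
    }

    static String normalize(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        if (!lower.startsWith("delta.")) {
            return key; // user property: case-sensitive, kept exactly as given
        }
        // Assumption: an unrecognized delta.* key is returned unchanged here.
        String canonical = CANONICAL.get(lower);
        return canonical != null ? canonical : key;
    }

    // Applies updates on top of existing properties; a matching key is overwritten.
    static Map<String, String> merge(Map<String, String> existing, Map<String, String> updates) {
        Map<String, String> result = new HashMap<>(existing);
        for (Map.Entry<String, String> e : updates.entrySet()) {
            result.put(normalize(e.getKey()), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> existing = new HashMap<>();
        existing.put("delta.appendOnly", "false");
        existing.put("owner", "team-a");

        Map<String, String> updates = new HashMap<>();
        updates.put("DELTA.APPENDONLY", "true"); // normalized, replaces the old value
        updates.put("Owner", "team-b");          // distinct from "owner"

        System.out.println(merge(existing, updates));
    }
}
```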
    • withTablePropertiesRemoved

      public TransactionBuilder withTablePropertiesRemoved(Set<String> propertyKeys)
      Description copied from interface: TransactionBuilder
      Unset the provided table properties on the table. If a property does not exist, this is a no-op. For now this is only supported for user properties (in other words, 'delta.' prefixed properties are not supported). An exception will be thrown upon calling TransactionBuilder.build(Engine) if the same key is both set and unset in the same transaction. Note that user properties (those without a 'delta.' prefix) are case-sensitive.
      Specified by:
      withTablePropertiesRemoved in interface TransactionBuilder
      Parameters:
      propertyKeys - the table property keys to unset (remove from the table properties)
      Returns:
      updated TransactionBuilder instance.
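
      The two restrictions described above (no 'delta.' keys, and no key that is both set and unset) amount to a check like the one below. PropertyConflictCheck is a hypothetical helper and IllegalArgumentException is a placeholder; the exception type the Kernel actually throws is not stated here.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical pre-build check for property removals; not the Kernel's code.
final class PropertyConflictCheck {

    static void validate(Map<String, String> propertiesToSet, Set<String> keysToUnset) {
        for (String key : keysToUnset) {
            // Only user properties can be unset.
            if (key.toLowerCase(Locale.ROOT).startsWith("delta.")) {
                throw new IllegalArgumentException(
                        "unsetting 'delta.' properties is not supported: " + key);
            }
            // User property keys are case-sensitive, so an exact match is a conflict.
            if (propertiesToSet.containsKey(key)) {
                throw new IllegalArgumentException(
                        "key is both set and unset in the same transaction: " + key);
            }
        }
    }

    public static void main(String[] args) {
        validate(Collections.singletonMap("owner", "team-a"),
                 Collections.singleton("Owner")); // different case: no conflict
        System.out.println("no conflict detected");
    }
}
```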
    • withMaxRetries

      public TransactionBuilder withMaxRetries(int maxRetries)
      Description copied from interface: TransactionBuilder
      Set the maximum number of times to retry a transaction if a concurrent write is detected. This defaults to 200.
      Specified by:
      withMaxRetries in interface TransactionBuilder
      Parameters:
      maxRetries - The number of times to retry
      Returns:
      updated TransactionBuilder instance
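
      What a retry bound means in practice can be sketched with a plain loop. RetrySketch is a hypothetical class, the IntPredicate stands in for a commit attempt that reports a conflict by returning false, and counting maxRetries as retries on top of one initial attempt is an assumption of this example.

```java
import java.util.function.IntPredicate;

// Hypothetical retry loop; the Kernel's own conflict handling is more involved.
final class RetrySketch {

    // tryCommit receives the 0-based attempt number and returns true on success,
    // false when a concurrent write was detected. Returns the successful attempt number.
    static int commitWithRetries(int maxRetries, IntPredicate tryCommit) {
        // Assumption: one initial attempt plus up to maxRetries retries.
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (tryCommit.test(attempt)) {
                return attempt;
            }
        }
        throw new IllegalStateException("gave up after " + maxRetries + " retries");
    }

    public static void main(String[] args) {
        // Simulate two conflicts followed by a successful commit.
        int attempt = commitWithRetries(200, n -> n == 2);
        System.out.println("committed on attempt " + attempt);
    }
}
```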
    • withLogCompactionInverval

      public TransactionBuilder withLogCompactionInverval(int logCompactionInterval)
      Description copied from interface: TransactionBuilder
      Set the number of commits between log compactions. Defaults to 0 (disabled). For more information see the Delta protocol section Log Compaction Files.
      Specified by:
      withLogCompactionInverval in interface TransactionBuilder
      Parameters:
      logCompactionInterval - The commits between log compactions
      Returns:
      updated TransactionBuilder instance
    • withDomainMetadataSupported

      public TransactionBuilder withDomainMetadataSupported()
      Description copied from interface: TransactionBuilder
      Enables support for Domain Metadata on this table if it is not supported already. The table feature _must_ be supported on the table to add or remove domain metadata using Transaction.addDomainMetadata(java.lang.String, java.lang.String) or Transaction.removeDomainMetadata(java.lang.String). See How does Delta Lake manage feature compatibility? for more details on table feature support.

      See the Delta protocol for more information on how to use Domain Metadata. This may break existing writers that do not support the Domain Metadata feature; readers will be unaffected.

      Specified by:
      withDomainMetadataSupported in interface TransactionBuilder
    • build

      public Transaction build(Engine engine)
      Description copied from interface: TransactionBuilder
      Build the transaction. Also validates the given info to ensure that a valid transaction can be created.
      Specified by:
      build in interface TransactionBuilder
      Parameters:
      engine - Engine instance to use.