Package io.delta.kernel.internal
Class TransactionBuilderImpl
Object
io.delta.kernel.internal.TransactionBuilderImpl
- All Implemented Interfaces:
TransactionBuilder
- Direct Known Subclasses:
ReplaceTableTransactionBuilderImpl
-
Constructor Summary
Constructors
TransactionBuilderImpl(TableImpl table, String engineInfo, Operation operation)
-
Method Summary
Method and Description
- build(Engine engine): Build the transaction.
- withClusteringColumns(Engine engine, List<Column> clusteringColumns): There are three possible cases when handling clustering columns via `withClusteringColumns`. Clustering columns are not set (i.e., `withClusteringColumns` is not called): No changes are made related to clustering.
- withDomainMetadataSupported(): Enables support for Domain Metadata on this table if it is not supported already.
- withLogCompactionInverval(int logCompactionInterval): Set the number of commits between log compactions.
- withMaxRetries(int maxRetries): Set the maximum number of times to retry a transaction if a concurrent write is detected.
- withPartitionColumns(Engine engine, List<String> partitionColumns): Set the list of partition columns when creating a new partitioned table.
- withSchema(Engine engine, StructType newSchema): Set the schema of the table.
- withTableProperties(Engine engine, Map<String, String> properties): Set the table properties for the table.
- withTablePropertiesRemoved(Set<String> propertyKeys): Unset the provided table properties on the table.
- withTransactionId(Engine engine, String applicationId, long transactionVersion): Set the transaction identifier for idempotent writes.
-
Constructor Details
-
TransactionBuilderImpl
-
-
Method Details
-
withSchema
Description copied from interface: TransactionBuilder
Set the schema of the table. If setting the schema on an existing table for a schema evolution, then column mapping must be enabled. This API preserves field metadata, such as field IDs and physical names. A field without field metadata is treated as a new column, and new IDs/physical names are assigned to it. The supported schema evolutions include column additions, removals, renames, and moves. If a schema evolution is performed, implementations must perform the following validations:
- No duplicate columns are allowed
- Column names contain only valid characters
- Data types are supported
- No new non-nullable fields are added
- Physical column name consistency is preserved in the new schema
- No type changes
- ToDo: Nested IDs for array/map types are preserved in the new schema
- ToDo: Validate invalid field reorderings
- Specified by: withSchema in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
newSchema - The new schema of the table.
- Returns: updated TransactionBuilder instance.
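Two of the validations listed above (no duplicate columns, no new non-nullable fields) can be sketched in isolation. This is a minimal illustration, not the Kernel implementation: `SketchField` and `SchemaEvolutionSketch` are hypothetical names, the `hasFieldMetadata` flag stands in for "field metadata was specified", and case-insensitive duplicate detection is an assumption.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical, simplified stand-in for a schema field; not the Kernel StructField class.
final class SketchField {
    final String name;
    final boolean nullable;
    // false models "no field metadata supplied", i.e. the field counts as a new column.
    final boolean hasFieldMetadata;

    SketchField(String name, boolean nullable, boolean hasFieldMetadata) {
        this.name = name;
        this.nullable = nullable;
        this.hasFieldMetadata = hasFieldMetadata;
    }
}

final class SchemaEvolutionSketch {
    // Returns one message per violated rule; an empty list means both checks passed.
    static List<String> validate(List<SketchField> newSchema) {
        List<String> errors = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (SketchField field : newSchema) {
            // Assumption: duplicate detection ignores case.
            if (!seen.add(field.name.toLowerCase(Locale.ROOT))) {
                errors.add("duplicate column: " + field.name);
            }
            // A field without metadata is new, and new fields must be nullable.
            if (!field.hasFieldMetadata && !field.nullable) {
                errors.add("new non-nullable field: " + field.name);
            }
        }
        return errors;
    }
}
```

The remaining validations (valid characters, supported data types, physical name consistency, no type changes) would need the previous schema as a second input and are left out of this sketch.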
-
withPartitionColumns
Description copied from interface: TransactionBuilder
Set the list of partition columns when creating a new partitioned table.
- Specified by: withPartitionColumns in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
partitionColumns - The partition columns of the table. These should be a subset of the columns in the schema. Only top-level columns can be used as partition columns. Note: Clustering columns and partition columns cannot coexist in a table.
- Returns: updated TransactionBuilder instance.
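The constraints on partitionColumns can be expressed as a small check. The sketch below is illustrative only: `PartitionColumnsSketch` is a hypothetical helper, the schema is reduced to a list of top-level column names, and exact (case-sensitive) name matching is an assumption.

```java
import java.util.List;

final class PartitionColumnsSketch {
    // Throws IllegalArgumentException when a rule from the parameter description is violated.
    static void validate(
            List<String> topLevelColumns,
            List<String> partitionColumns,
            List<String> clusteringColumns) {
        if (!partitionColumns.isEmpty() && !clusteringColumns.isEmpty()) {
            throw new IllegalArgumentException(
                "partition and clustering columns cannot coexist");
        }
        for (String column : partitionColumns) {
            // Nested fields never appear in the top-level list, so they are rejected here.
            if (!topLevelColumns.contains(column)) {
                throw new IllegalArgumentException(
                    "not a top-level schema column: " + column);
            }
        }
    }
}
```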
-
withClusteringColumns
There are three possible cases when handling clustering columns via `withClusteringColumns`:
- Clustering columns are not set (i.e., `withClusteringColumns` is not called):
  - No changes are made related to clustering.
  - For table creation, the table is initialized as a non-clustered table.
  - For table updates, the existing clustered or non-clustered state remains unchanged (i.e., no protocol or domain metadata updates).
- Clustering columns are an empty list:
  - This is equivalent to executing `ALTER TABLE ... CLUSTER BY NONE` in Delta.
  - The table remains a clustered table, but its clustering domain metadata is updated to reflect an empty list of clustering columns.
- Clustering columns are a non-empty list:
  - The table is treated as a clustered table.
  - We update the protocol (if needed) to include clustering writer support and set the clustering domain metadata accordingly.
- Specified by: withClusteringColumns in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
clusteringColumns - The clustering columns of the table. These should be a subset of the columns in the schema. Both top-level and nested columns are allowed to be clustered. Note: Clustering columns and partition columns cannot coexist in a table.
- Returns: updated TransactionBuilder instance.
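The three cases can be modeled as a decision on an optional list. In this sketch, `Optional.empty()` stands for "`withClusteringColumns` was never called"; the class and enum names are hypothetical and do not come from the Kernel code.

```java
import java.util.List;
import java.util.Optional;

final class ClusteringCasesSketch {
    enum Action {
        NO_CHANGE,              // builder method never called
        CLUSTER_BY_NONE,        // empty list: stays clustered, metadata lists no columns
        SET_CLUSTERING_COLUMNS  // non-empty list: protocol and domain metadata updated
    }

    static Action resolve(Optional<List<String>> clusteringColumns) {
        if (!clusteringColumns.isPresent()) {
            return Action.NO_CHANGE;
        }
        if (clusteringColumns.get().isEmpty()) {
            return Action.CLUSTER_BY_NONE;
        }
        return Action.SET_CLUSTERING_COLUMNS;
    }
}
```

Keeping "not set" distinct from "empty list" is the point of the model: a plain `List` that defaults to empty could not tell the first two cases apart.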
-
withTransactionId
public TransactionBuilder withTransactionId(Engine engine, String applicationId, long transactionVersion)
Description copied from interface: TransactionBuilder
Set the transaction identifier for idempotent writes. Incremental processing systems (e.g., streaming systems) that track progress using their own application-specific versions need to record what progress has been made, in order to avoid duplicating data in the face of failures and retries during writes. By setting the transaction identifier, the Delta table can ensure that data with the same identifier is not written multiple times. For more information, refer to the Delta protocol section "Transaction Identifiers".
- Specified by: withTransactionId in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
applicationId - The application ID that is writing to the table.
transactionVersion - The version of the transaction. This should be monotonically increasing with each write for the same application ID.
- Returns: updated TransactionBuilder instance.
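The idempotency idea can be illustrated with an in-memory model. This is a sketch under stated assumptions: `IdempotentWriteSketch` is hypothetical, and its map stands in for the transaction identifiers that the table records. It does not show how Kernel itself reports a repeated write.

```java
import java.util.HashMap;
import java.util.Map;

final class IdempotentWriteSketch {
    // Last committed transactionVersion per applicationId (stand-in for table state).
    private final Map<String, Long> lastCommitted = new HashMap<>();

    // Returns true if the write is applied, false if it is a replay and is skipped.
    boolean tryCommit(String applicationId, long transactionVersion) {
        Long last = lastCommitted.get(applicationId);
        if (last != null && transactionVersion <= last) {
            return false; // same or older version already written for this application
        }
        lastCommitted.put(applicationId, transactionVersion);
        return true;
    }
}
```

A streaming writer that retries batch 7 after a crash would call `tryCommit("my-app", 7)` again and get `false`, so the batch is not duplicated. Versions are tracked per application ID, which is why the version only needs to increase within one application.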
-
withTableProperties
Description copied from interface: TransactionBuilder
Set the table properties for the table. If the table already contains a property with the same key, its value is replaced unless it is already the same. Note: user-properties (those without a 'delta.' prefix) are case-sensitive. Delta-properties are case-insensitive and are normalized to their expected case before writing to the log.
- Specified by: withTableProperties in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
properties - The table properties to set. These are key-value pairs that can be used to configure the table. They are stored in the table metadata.
- Returns: updated TransactionBuilder instance.
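The difference between user-properties and Delta-properties can be sketched as a key normalization step followed by a merge. Everything here is illustrative: `TablePropertiesSketch` is a hypothetical class, the canonical key table lists only two Delta property names as a sample, and passing unknown 'delta.' keys through unchanged is an assumption rather than documented behavior.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class TablePropertiesSketch {
    // Lower-cased key -> expected case. Sample entries only, not a complete list.
    private static final Map<String, String> CANONICAL_DELTA_KEYS = new HashMap<>();
    static {
        for (String key : Arrays.asList("delta.appendOnly", "delta.checkpointInterval")) {
            CANONICAL_DELTA_KEYS.put(key.toLowerCase(Locale.ROOT), key);
        }
    }

    static String normalizeKey(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        if (lower.startsWith("delta.")) {
            // Assumption: unknown delta keys are passed through unchanged.
            return CANONICAL_DELTA_KEYS.getOrDefault(lower, key);
        }
        return key; // user-properties are case-sensitive and kept as written
    }

    // Returns a new map: existing properties with the updates applied on top.
    static Map<String, String> apply(Map<String, String> existing, Map<String, String> updates) {
        Map<String, String> result = new LinkedHashMap<>(existing);
        for (Map.Entry<String, String> entry : updates.entrySet()) {
            result.put(normalizeKey(entry.getKey()), entry.getValue());
        }
        return result;
    }
}
```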
-
withTablePropertiesRemoved
Description copied from interface: TransactionBuilder
Unset the provided table properties on the table. If a property does not exist, this is a no-op. For now this is only supported for user-properties (in other words, 'delta.' prefixed properties are not supported). An exception will be thrown upon calling TransactionBuilder.build(Engine) if the same key is both set and unset in the same transaction. Note: user-properties (those without a 'delta.' prefix) are case-sensitive.
- Specified by: withTablePropertiesRemoved in interface TransactionBuilder
- Parameters:
propertyKeys - The table property keys to unset (remove from the table properties).
- Returns: updated TransactionBuilder instance.
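The rules for unsetting properties can be sketched as one function that validates and then applies the change. `UnsetPropertiesSketch` is a hypothetical helper, and `IllegalArgumentException` is only a placeholder for whatever exception type the real build step throws.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class UnsetPropertiesSketch {
    static Map<String, String> apply(
            Map<String, String> existing, Map<String, String> toSet, Set<String> toUnset) {
        for (String key : toUnset) {
            if (key.toLowerCase(Locale.ROOT).startsWith("delta.")) {
                throw new IllegalArgumentException("cannot unset delta property: " + key);
            }
            // User keys are case-sensitive, so an exact match is the right comparison.
            if (toSet.containsKey(key)) {
                throw new IllegalArgumentException("key is both set and unset: " + key);
            }
        }
        Map<String, String> result = new LinkedHashMap<>(existing);
        result.putAll(toSet);
        // Removing a key that is absent is a no-op.
        result.keySet().removeAll(toUnset);
        return result;
    }
}
```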
-
withMaxRetries
Description copied from interface: TransactionBuilder
Set the maximum number of times to retry a transaction if a concurrent write is detected. Defaults to 200.
- Specified by: withMaxRetries in interface TransactionBuilder
- Parameters:
maxRetries - The number of times to retry.
- Returns: updated TransactionBuilder instance.
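A bounded retry loop shows what a retry limit means in practice. This is a generic sketch, not the Kernel commit path: `RetrySketch` and `ConflictException` are hypothetical, and counting the first attempt separately from the retries (so up to `maxRetries + 1` attempts in total) is an assumption.

```java
import java.util.function.Supplier;

final class RetrySketch {
    // Hypothetical exception standing in for "a concurrent write was detected".
    static final class ConflictException extends RuntimeException {
        ConflictException(String message) {
            super(message);
        }
    }

    // Runs the attempt once, then retries on conflict up to maxRetries more times.
    static <T> T commitWithRetries(Supplier<T> attempt, int maxRetries) {
        int retries = 0;
        while (true) {
            try {
                return attempt.get();
            } catch (ConflictException e) {
                if (retries >= maxRetries) {
                    throw e; // retry budget exhausted; surface the conflict
                }
                retries++;
            }
        }
    }
}
```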
-
withLogCompactionInverval
Description copied from interface: TransactionBuilder
Set the number of commits between log compactions. Defaults to 0 (disabled). For more information, see the Delta protocol section "Log Compaction Files".
- Specified by: withLogCompactionInverval in interface TransactionBuilder
- Parameters:
logCompactionInterval - The number of commits between log compactions.
- Returns: updated TransactionBuilder instance.
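One way to read "commits between log compactions" is as a periodic trigger on the commit version. The source does not say which commit version starts a compaction, so the modulo rule below is purely an assumption for illustration. Only the "0 means disabled" behavior comes from the description above. `LogCompactionSketch` is a hypothetical name.

```java
final class LogCompactionSketch {
    // Assumed trigger: every logCompactionInterval-th commit, never at version 0.
    static boolean shouldCompact(long commitVersion, int logCompactionInterval) {
        if (logCompactionInterval <= 0) {
            return false; // 0 (the default) disables log compaction
        }
        return commitVersion > 0 && commitVersion % logCompactionInterval == 0;
    }
}
```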
-
withDomainMetadataSupported
Description copied from interface: TransactionBuilder
Enables support for Domain Metadata on this table if it is not supported already. The table feature _must_ be supported on the table to add or remove domain metadata using Transaction.addDomainMetadata(java.lang.String, java.lang.String) or Transaction.removeDomainMetadata(java.lang.String). See "How does Delta Lake manage feature compatibility?" for more details on table feature support. See the Delta protocol for more information on how to use Domain Metadata. This may break existing writers that do not support the Domain Metadata feature; readers will be unaffected.
- Specified by: withDomainMetadataSupported in interface TransactionBuilder
-
build
Description copied from interface: TransactionBuilder
Build the transaction. Also validates the given info to ensure that a valid transaction can be created.
- Specified by: build in interface TransactionBuilder
- Parameters:
engine - Engine instance to use.
-