@Beta public interface PartitionedFileSet extends Dataset, InputFormatProvider, OutputFormatProvider
FileSetProperties
for details. If it is enabled for explore, a Hive external table will be created when the dataset is
created. The Hive table is partitioned by the same keys as this dataset.

Modifier and Type | Method and Description |
---|---|
void | addMetadata(PartitionKey key, Map<String,String> metadata): Adds a set of new metadata entries for a particular partition. |
void | addMetadata(PartitionKey key, String metadataKey, String metadataValue): Adds a new metadata entry for a particular partition. |
void | addPartition(PartitionKey key, String path): Add a partition for a given partition key, stored at a given path (relative to the file set's base path). |
void | addPartition(PartitionKey key, String path, Map<String,String> metadata): Add a partition for a given partition key, stored at a given path (relative to the file set's base path), with the given metadata. |
Future<Void> | concatenatePartition(PartitionKey key): Asynchronous operation to concatenate the partition in Hive. |
PartitionConsumerResult | consumePartitions(PartitionConsumerState partitionConsumerState): Incrementally consumes partitions. |
PartitionConsumerResult | consumePartitions(PartitionConsumerState partitionConsumerState, int limit, Predicate<PartitionDetail> predicate): Incrementally consumes partitions. |
void | dropPartition(PartitionKey key): Remove a partition for a given partition key, silently ignoring if the key is not found. |
FileSet | getEmbeddedFileSet() |
PartitionDetail | getPartition(PartitionKey key): Return the partition for a specific partition key, or null if key is not found. |
Partitioning | getPartitioning(): Get the partitioning declared for the file set. |
PartitionOutput | getPartitionOutput(PartitionKey key): Return a partition output for a specific partition key, in preparation for creating a new partition. |
Set<PartitionDetail> | getPartitions(PartitionFilter filter): Return all partitions matching the partition filter. |
Map<String,String> | getRuntimeArguments(): Allow direct access to the runtime arguments of this partitioned file set. |
void | removeMetadata(PartitionKey key, Set<String> metadataKeys): Removes a set of metadata entries for a particular partition. |
void | removeMetadata(PartitionKey key, String metadataKey): Removes a metadata entry for a particular partition. |
void | setMetadata(PartitionKey key, Map<String,String> metadata): Sets metadata entries for a particular partition. |
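Methods inherited from interface InputFormatProvider: getInputFormatClassName, getInputFormatConfiguration
Methods inherited from interface OutputFormatProvider: getOutputFormatClassName, getOutputFormatConfiguration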
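The summary lists both addMetadata and setMetadata. According to the method details, addMetadata raises a DataSetException when an attempt is made to update an existing entry, while setMetadata documents no such exception. The following is a minimal sketch of that difference, not the CDAP implementation: PartitionMetadataSketch is a hypothetical in-memory stand-in for one partition's metadata, IllegalStateException stands in for DataSetException, and the overwrite behavior of setMetadata is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the metadata of a single partition. */
final class PartitionMetadataSketch {
    private final Map<String, String> entries = new HashMap<>();

    /** Models addMetadata: only keys that do not exist yet are accepted. */
    void addMetadata(Map<String, String> metadata) {
        // Check every key first so that a rejected call changes nothing
        // (an assumption of this sketch, not documented behavior).
        for (String key : metadata.keySet()) {
            if (entries.containsKey(key)) {
                // The real interface documents DataSetException here;
                // IllegalStateException is only a stand-in.
                throw new IllegalStateException("Entry already exists: " + key);
            }
        }
        entries.putAll(metadata);
    }

    /** Models setMetadata as assumed here: creates or overwrites the given keys. */
    void setMetadata(Map<String, String> metadata) {
        entries.putAll(metadata);
    }

    /** Returns a copy of the current entries. */
    Map<String, String> snapshot() {
        return new HashMap<>(entries);
    }

    public static void main(String[] args) {
        PartitionMetadataSketch metadata = new PartitionMetadataSketch();
        metadata.addMetadata(Map.of("source", "ingest"));
        try {
            metadata.addMetadata(Map.of("source", "replay"));
        } catch (IllegalStateException e) {
            System.out.println("add rejected: " + e.getMessage());
        }
        metadata.setMetadata(Map.of("source", "replay"));
        System.out.println(metadata.snapshot());
    }
}
```

In this model, a caller that wants "create or replace" semantics would reach for setMetadata, and would reserve addMetadata for cases where overwriting an entry should be treated as an error.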
static final String TYPE
Partitioning getPartitioning()
void addPartition(PartitionKey key, String path)
PartitionAlreadyExistsException
- if a partition for the same key already exists
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void addPartition(PartitionKey key, String path, Map<String,String> metadata)
PartitionAlreadyExistsException
- if a partition for the same key already exists
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void addMetadata(PartitionKey key, String metadataKey, String metadataValue)
DataSetException
- when an attempt is made to update an existing entry
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void addMetadata(PartitionKey key, Map<String,String> metadata)
DataSetException
- when an attempt is made to update existing entries
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void setMetadata(PartitionKey key, Map<String,String> metadata)
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void removeMetadata(PartitionKey key, String metadataKey)
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void removeMetadata(PartitionKey key, Set<String> metadataKeys)
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
void dropPartition(PartitionKey key)
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
Future<Void> concatenatePartition(PartitionKey key)
Future which returns null, but may be used to await completion of the concatenation operation.
PartitionNotFoundException
- when a partition for the given key is not found
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
@Nullable PartitionDetail getPartition(PartitionKey key)
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
Set<PartitionDetail> getPartitions(@Nullable PartitionFilter filter)
filter
- If non-null, only partitions that match this filter are returned. If null, all partitions are returned.
PartitionConsumerResult consumePartitions(PartitionConsumerState partitionConsumerState)
partitionConsumerState
- the state from which to start consuming
PartitionConsumerResult which holds the state of consumption as well as an iterator to the consumed Partitions
PartitionConsumerResult consumePartitions(PartitionConsumerState partitionConsumerState, int limit, Predicate<PartitionDetail> predicate)
partitionConsumerState
- the state from which to start consuming
limit
- number of partitions which, once reached, stops further partitions committed by other transactions from being added; the limit is checked after consuming all partitions of a transaction, so the total number of consumed partitions may be greater than this limit
predicate
- a predicate which determines the partitions to be consumed
PartitionConsumerResult which holds the state of consumption as well as an iterator to the consumed Partitions
PartitionOutput getPartitionOutput(PartitionKey key)
PartitionOutput.addPartition() to add the partition to this dataset.
PartitionAlreadyExistsException
- if the partition already exists
IllegalArgumentException
- if the partition key does not match the partitioning of the dataset
FileSet getEmbeddedFileSet()
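The limit parameter of consumePartitions is described as being checked only after all partitions of a transaction have been consumed, so a call may return more partitions than the limit. The sketch below illustrates that rule with a self-contained model; it is not the CDAP implementation. ConsumeLimitSketch, its Result type, and the representation of transactions as lists of partition names are all hypothetical, and java.util.function.Predicate stands in for whatever Predicate type the real interface uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical model of limit handling during incremental partition consumption. */
final class ConsumeLimitSketch {

    /** Hypothetical result: consumed partitions plus the index of the next transaction to read. */
    static final class Result {
        final List<String> partitions;
        final int nextTransaction;

        Result(List<String> partitions, int nextTransaction) {
            this.partitions = partitions;
            this.nextTransaction = nextTransaction;
        }
    }

    /**
     * Consumes whole transactions, in commit order, starting at startTransaction.
     * The limit is compared against the running total only after a transaction
     * has been consumed completely, so the total may exceed the limit.
     */
    static Result consume(List<List<String>> transactions, int startTransaction,
                          int limit, Predicate<String> predicate) {
        List<String> consumed = new ArrayList<>();
        int tx = startTransaction;
        while (tx < transactions.size()) {
            for (String partition : transactions.get(tx)) {
                if (predicate.test(partition)) {
                    consumed.add(partition);
                }
            }
            tx++;
            if (consumed.size() >= limit) {
                break; // do not start on partitions from later transactions
            }
        }
        return new Result(consumed, tx);
    }

    public static void main(String[] args) {
        List<List<String>> transactions = List.of(
            List.of("p1", "p2", "p3"),
            List.of("p4"),
            List.of("p5", "p6"));

        // Limit 2, but the first transaction holds three partitions: all three are returned.
        Result first = consume(transactions, 0, 2, p -> true);
        System.out.println(first.partitions + " next=" + first.nextTransaction);

        // Resuming from the saved position picks up the remaining transactions.
        Result second = consume(transactions, first.nextTransaction, 2, p -> true);
        System.out.println(second.partitions + " next=" + second.nextTransaction);
    }
}
```

The returned position plays the role that PartitionConsumerState plays in the interface: passing it back into the next call resumes consumption where the previous call stopped.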
Copyright © 2022 Cask Data, Inc. Licensed under the Apache License, Version 2.0.