Class RocksDBKeyedStateBackend<K>
- java.lang.Object
  - org.apache.flink.runtime.state.AbstractKeyedStateBackend<K>
    - org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend<K>
- All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.flink.api.common.state.CheckpointListener, org.apache.flink.api.common.state.InternalCheckpointListener, org.apache.flink.runtime.state.CheckpointableKeyedStateBackend<K>, org.apache.flink.runtime.state.InternalKeyContext<K>, org.apache.flink.runtime.state.KeyedStateBackend<K>, org.apache.flink.runtime.state.KeyedStateFactory, org.apache.flink.runtime.state.PriorityQueueSetFactory, org.apache.flink.runtime.state.Snapshotable<org.apache.flink.runtime.state.SnapshotResult<org.apache.flink.runtime.state.KeyedStateHandle>>, org.apache.flink.runtime.state.TestableKeyedStateBackend<K>, org.apache.flink.util.Disposable
public class RocksDBKeyedStateBackend<K> extends org.apache.flink.runtime.state.AbstractKeyedStateBackend<K>
An AbstractKeyedStateBackend that stores its state in RocksDB and serializes state to streams provided by a CheckpointStreamFactory upon checkpointing. This state backend can store very large state that exceeds memory and spills to disk. Except for snapshotting, this class should be accessed as if it were not thread-safe.
This class follows the rules for closing/releasing native RocksDB resources as described in this document.
-
-
Nested Class Summary
Nested Classes
static class RocksDBKeyedStateBackend.RocksDbKvStateInfo
    RocksDB-specific information about the k/v states.
-
Nested classes/interfaces inherited from class org.apache.flink.runtime.state.AbstractKeyedStateBackend
org.apache.flink.runtime.state.AbstractKeyedStateBackend.PartitionStateFactory
-
Nested classes/interfaces inherited from interface org.apache.flink.runtime.state.KeyedStateBackend
org.apache.flink.runtime.state.KeyedStateBackend.KeySelectionListener<K extends Object>
-
-
Field Summary
Fields
protected org.rocksdb.RocksDB db
    Our RocksDB database; it is used by the concrete subclasses of AbstractRocksDBState to store state.
static String MERGE_OPERATOR_NAME
    The name of the merge operator in RocksDB.
-
Constructor Summary
Constructors
RocksDBKeyedStateBackend(ClassLoader userCodeClassLoader, File instanceBasePath, RocksDBResourceContainer optionsContainer, Function<String,org.rocksdb.ColumnFamilyOptions> columnFamilyOptionsFactory, org.apache.flink.runtime.query.TaskKvStateRegistry kvStateRegistry, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, org.apache.flink.api.common.ExecutionConfig executionConfig, org.apache.flink.runtime.state.ttl.TtlTimeProvider ttlTimeProvider, org.apache.flink.runtime.state.metrics.LatencyTrackingStateConfig latencyTrackingStateConfig, org.rocksdb.RocksDB db, LinkedHashMap<String,RocksDBKeyedStateBackend.RocksDbKvStateInfo> kvStateInformation, Map<String,org.apache.flink.runtime.state.heap.HeapPriorityQueueSnapshotRestoreWrapper<?>> registeredPQStates, int keyGroupPrefixBytes, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry, org.apache.flink.runtime.state.StreamCompressionDecorator keyGroupCompressionDecorator, org.apache.flink.util.ResourceGuard rocksDBResourceGuard, RocksDBSnapshotStrategyBase<K,?> checkpointSnapshotStrategy, RocksDBWriteBatchWrapper writeBatchWrapper, org.rocksdb.ColumnFamilyHandle defaultColumnFamilyHandle, RocksDBNativeMetricMonitor nativeMetricMonitor, org.apache.flink.runtime.state.SerializedCompositeKeyBuilder<K> sharedRocksKeyBuilder, org.apache.flink.runtime.state.PriorityQueueSetFactory priorityQueueFactory, RocksDbTtlCompactFiltersManager ttlCompactFiltersManager, org.apache.flink.runtime.state.InternalKeyContext<K> keyContext, long writeBatchSize, CompletableFuture<Void> asyncCompactFuture, RocksDBManualCompactionManager rocksDBManualCompactionManager)
-
Method Summary
Instance Methods (all concrete)
void compactState(org.apache.flink.api.common.state.StateDescriptor<?,?> stateDesc)
<T extends org.apache.flink.runtime.state.heap.HeapPriorityQueueElement & org.apache.flink.runtime.state.PriorityComparable<? super T> & org.apache.flink.runtime.state.Keyed<?>> org.apache.flink.runtime.state.KeyGroupedInternalPriorityQueue<T> create(String stateName, org.apache.flink.api.common.typeutils.TypeSerializer<T> byteOrderedElementSerializer)
<T extends org.apache.flink.runtime.state.heap.HeapPriorityQueueElement & org.apache.flink.runtime.state.PriorityComparable<? super T> & org.apache.flink.runtime.state.Keyed<?>> org.apache.flink.runtime.state.KeyGroupedInternalPriorityQueue<T> create(String stateName, org.apache.flink.api.common.typeutils.TypeSerializer<T> byteOrderedElementSerializer, boolean allowFutureMetadataUpdates)
<N,SV,SEV,S extends org.apache.flink.api.common.state.State,IS extends S> IS createOrUpdateInternalState(org.apache.flink.api.common.typeutils.TypeSerializer<N> namespaceSerializer, org.apache.flink.api.common.state.StateDescriptor<S,SV> stateDesc, org.apache.flink.runtime.state.StateSnapshotTransformer.StateSnapshotTransformFactory<SEV> snapshotTransformFactory)
<N,SV,SEV,S extends org.apache.flink.api.common.state.State,IS extends S> IS createOrUpdateInternalState(org.apache.flink.api.common.typeutils.TypeSerializer<N> namespaceSerializer, org.apache.flink.api.common.state.StateDescriptor<S,SV> stateDesc, org.apache.flink.runtime.state.StateSnapshotTransformer.StateSnapshotTransformFactory<SEV> snapshotTransformFactory, boolean allowFutureMetadataUpdates)
void dispose()
    Should only be called by one thread, and only after all accesses to the DB have completed.
Optional<CompletableFuture<Void>> getAsyncCompactAfterRestoreFuture()
int getKeyGroupPrefixBytes()
<N> Stream<K> getKeys(String state, N namespace)
<N> Stream<org.apache.flink.api.java.tuple.Tuple2<K,N>> getKeysAndNamespaces(String state)
org.rocksdb.ReadOptions getReadOptions()
org.rocksdb.WriteOptions getWriteOptions()
boolean isSafeToReuseKVState()
void notifyCheckpointAborted(long checkpointId)
void notifyCheckpointComplete(long completedCheckpointId)
int numKeyValueStateEntries()
boolean requiresLegacySynchronousTimerSnapshots(org.apache.flink.runtime.checkpoint.SnapshotType checkpointType)
org.apache.flink.runtime.state.SavepointResources<K> savepoint()
void setCurrentKey(K newKey)
void setCurrentKeyAndKeyGroup(K newKey, int newKeyGroupIndex)
RunnableFuture<org.apache.flink.runtime.state.SnapshotResult<org.apache.flink.runtime.state.KeyedStateHandle>> snapshot(long checkpointId, long timestamp, org.apache.flink.runtime.state.CheckpointStreamFactory streamFactory, org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions)
    Triggers an asynchronous snapshot of the keyed state backend from RocksDB.
-
Methods inherited from class org.apache.flink.runtime.state.AbstractKeyedStateBackend
applyToAllKeys, applyToAllKeys, close, deregisterKeySelectionListener, getCurrentKey, getCurrentKeyGroupIndex, getKeyContext, getKeyGroupCompressionDecorator, getKeyGroupRange, getKeySerializer, getLatencyTrackingStateConfig, getNumberOfKeyGroups, getOrCreateKeyedState, getPartitionedState, notifyCheckpointSubsumed, numKeyValueStatesByName, publishQueryableStateIfEnabled, registerKeySelectionListener, setCurrentKeyGroupIndex
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Field Detail
-
MERGE_OPERATOR_NAME
public static final String MERGE_OPERATOR_NAME
The name of the merge operator in RocksDB. Do not change this unless you know exactly what you are doing.
- See Also:
- Constant Field Values
-
db
protected final org.rocksdb.RocksDB db
Our RocksDB database; it is used by the concrete subclasses of AbstractRocksDBState to store state. The different k/v states do not each have their own RocksDB instance: they all write to this instance, each to its own column family.
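The layout described here (one database, one column family per k/v state) can be pictured with a small plain-Java model. This is only a sketch under stated assumptions: `SharedStore` and its methods are hypothetical, and in-memory maps stand in for RocksDB and its column families. None of it is Flink or RocksDB API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory stand-in for one database holding several named column families. */
final class SharedStore {
    // One sorted map per registered state name, mirroring "one column family per k/v state".
    private final Map<String, TreeMap<String, String>> columnFamilies = new LinkedHashMap<>();

    /** Writes into the column family of the given state, creating it on first use. */
    void put(String stateName, String key, String value) {
        columnFamilies.computeIfAbsent(stateName, n -> new TreeMap<>()).put(key, value);
    }

    /** Reads from the given state's column family; returns null if the state or key is unknown. */
    String get(String stateName, String key) {
        TreeMap<String, String> family = columnFamilies.get(stateName);
        return family == null ? null : family.get(key);
    }

    /** Number of column families created so far (one per state that has been written). */
    int columnFamilyCount() {
        return columnFamilies.size();
    }
}
```

Two states written under the same key do not collide in this model, because each state name selects its own map inside the single shared store.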
-
-
Constructor Detail
-
RocksDBKeyedStateBackend
public RocksDBKeyedStateBackend(ClassLoader userCodeClassLoader, File instanceBasePath, RocksDBResourceContainer optionsContainer, Function<String,org.rocksdb.ColumnFamilyOptions> columnFamilyOptionsFactory, org.apache.flink.runtime.query.TaskKvStateRegistry kvStateRegistry, org.apache.flink.api.common.typeutils.TypeSerializer<K> keySerializer, org.apache.flink.api.common.ExecutionConfig executionConfig, org.apache.flink.runtime.state.ttl.TtlTimeProvider ttlTimeProvider, org.apache.flink.runtime.state.metrics.LatencyTrackingStateConfig latencyTrackingStateConfig, org.rocksdb.RocksDB db, LinkedHashMap<String,RocksDBKeyedStateBackend.RocksDbKvStateInfo> kvStateInformation, Map<String,org.apache.flink.runtime.state.heap.HeapPriorityQueueSnapshotRestoreWrapper<?>> registeredPQStates, int keyGroupPrefixBytes, org.apache.flink.core.fs.CloseableRegistry cancelStreamRegistry, org.apache.flink.runtime.state.StreamCompressionDecorator keyGroupCompressionDecorator, org.apache.flink.util.ResourceGuard rocksDBResourceGuard, RocksDBSnapshotStrategyBase<K,?> checkpointSnapshotStrategy, RocksDBWriteBatchWrapper writeBatchWrapper, org.rocksdb.ColumnFamilyHandle defaultColumnFamilyHandle, RocksDBNativeMetricMonitor nativeMetricMonitor, org.apache.flink.runtime.state.SerializedCompositeKeyBuilder<K> sharedRocksKeyBuilder, org.apache.flink.runtime.state.PriorityQueueSetFactory priorityQueueFactory, RocksDbTtlCompactFiltersManager ttlCompactFiltersManager, org.apache.flink.runtime.state.InternalKeyContext<K> keyContext, @Nonnegative long writeBatchSize, @Nullable CompletableFuture<Void> asyncCompactFuture, RocksDBManualCompactionManager rocksDBManualCompactionManager)
-
-
Method Detail
-
getKeysAndNamespaces
public <N> Stream<org.apache.flink.api.java.tuple.Tuple2<K,N>> getKeysAndNamespaces(String state)
-
setCurrentKey
public void setCurrentKey(K newKey)
-
setCurrentKeyAndKeyGroup
public void setCurrentKeyAndKeyGroup(K newKey, int newKeyGroupIndex)
-
dispose
public void dispose()
Should only be called by one thread, and only after all accesses to the DB have completed.
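One way to make sure disposal happens only after all accesses to the DB is to count in-flight accesses and let the disposing thread wait for them. The sketch below is hypothetical plain Java: `LeaseGuard` and its methods are invented for illustration and are not the API of the `ResourceGuard` passed to the constructor, nor a description of how this class actually implements `dispose()`.

```java
/** Hypothetical guard that tracks in-flight accesses and blocks disposal until they finish. */
final class LeaseGuard {
    private int openLeases = 0;
    private boolean closed = false;

    /** Registers one in-flight access; rejected once disposal has begun. */
    synchronized void acquire() {
        if (closed) {
            throw new IllegalStateException("already disposed");
        }
        openLeases++;
    }

    /** Marks one access as finished and wakes a waiting disposer. */
    synchronized void release() {
        openLeases--;
        notifyAll();
    }

    /** Stops new accesses, then blocks until every outstanding access has been released. */
    synchronized void closeAndAwait() {
        closed = true;
        boolean interrupted = false;
        while (openLeases > 0) {
            try {
                wait();
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

In this model the disposing thread would call `closeAndAwait()` first and release native resources only after it returns, so no reader or writer can still be touching them.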
-
create
@Nonnull public <T extends org.apache.flink.runtime.state.heap.HeapPriorityQueueElement & org.apache.flink.runtime.state.PriorityComparable<? super T> & org.apache.flink.runtime.state.Keyed<?>> org.apache.flink.runtime.state.KeyGroupedInternalPriorityQueue<T> create(@Nonnull String stateName, @Nonnull org.apache.flink.api.common.typeutils.TypeSerializer<T> byteOrderedElementSerializer)
-
create
public <T extends org.apache.flink.runtime.state.heap.HeapPriorityQueueElement & org.apache.flink.runtime.state.PriorityComparable<? super T> & org.apache.flink.runtime.state.Keyed<?>> org.apache.flink.runtime.state.KeyGroupedInternalPriorityQueue<T> create(@Nonnull String stateName, @Nonnull org.apache.flink.api.common.typeutils.TypeSerializer<T> byteOrderedElementSerializer, boolean allowFutureMetadataUpdates)
-
getKeyGroupPrefixBytes
public int getKeyGroupPrefixBytes()
-
getWriteOptions
public org.rocksdb.WriteOptions getWriteOptions()
-
getReadOptions
public org.rocksdb.ReadOptions getReadOptions()
-
snapshot
@Nonnull public RunnableFuture<org.apache.flink.runtime.state.SnapshotResult<org.apache.flink.runtime.state.KeyedStateHandle>> snapshot(long checkpointId, long timestamp, @Nonnull org.apache.flink.runtime.state.CheckpointStreamFactory streamFactory, @Nonnull org.apache.flink.runtime.checkpoint.CheckpointOptions checkpointOptions) throws Exception
Triggers an asynchronous snapshot of the keyed state backend from RocksDB. This snapshot can be canceled and is also stopped when the backend is closed through dispose(). For each backend, this method must always be called by the same thread.
- Parameters:
checkpointId - The ID of the checkpoint.
timestamp - The timestamp of the checkpoint.
streamFactory - The factory that we can use for writing our state to streams.
checkpointOptions - Options for how to perform this checkpoint.
- Returns:
Future to the state handle of the snapshot data.
- Throws:
Exception - indicating a problem in the synchronous part of the checkpoint.
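How a caller might drive the returned RunnableFuture can be sketched with plain JDK types. The code below is an illustration under assumptions, not Flink code: `SnapshotSketch`, `triggerSnapshot`, and the `String` result are hypothetical stand-ins for the backend, `snapshot(...)`, and `SnapshotResult<KeyedStateHandle>`. It shows only the general pattern: a synchronous step on the calling thread, an asynchronous part that runs when the future is executed, and cancellation before execution.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableFuture;

final class SnapshotSketch {
    /** Hypothetical stand-in for snapshot(...): prepares synchronously, returns unstarted async work. */
    static RunnableFuture<String> triggerSnapshot(long checkpointId) {
        // Synchronous part: runs on the caller's thread before the future is returned.
        final String prepared = "chk-" + checkpointId;
        // Asynchronous part: runs only when someone executes the returned future.
        return new FutureTask<>(() -> prepared + "-written");
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            RunnableFuture<String> snapshot = triggerSnapshot(7L);
            executor.execute(snapshot);          // run the asynchronous part off-thread
            System.out.println(snapshot.get());  // blocks until the result is available

            RunnableFuture<String> abandoned = triggerSnapshot(8L);
            abandoned.cancel(true);              // canceled before run(), so the work never starts
            System.out.println(abandoned.isCancelled());
        } finally {
            executor.shutdown();
        }
    }
}
```

Returning an unstarted future leaves the choice of executor, and the decision to cancel, with the caller rather than with the code that produced the snapshot.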
-
savepoint
@Nonnull public org.apache.flink.runtime.state.SavepointResources<K> savepoint() throws Exception
- Throws:
Exception
-
notifyCheckpointComplete
public void notifyCheckpointComplete(long completedCheckpointId) throws Exception
- Throws:
Exception
-
notifyCheckpointAborted
public void notifyCheckpointAborted(long checkpointId) throws Exception
- Throws:
Exception
-
createOrUpdateInternalState
@Nonnull public <N,SV,SEV,S extends org.apache.flink.api.common.state.State,IS extends S> IS createOrUpdateInternalState(@Nonnull org.apache.flink.api.common.typeutils.TypeSerializer<N> namespaceSerializer, @Nonnull org.apache.flink.api.common.state.StateDescriptor<S,SV> stateDesc, @Nonnull org.apache.flink.runtime.state.StateSnapshotTransformer.StateSnapshotTransformFactory<SEV> snapshotTransformFactory) throws Exception
- Throws:
Exception
-
createOrUpdateInternalState
@Nonnull public <N,SV,SEV,S extends org.apache.flink.api.common.state.State,IS extends S> IS createOrUpdateInternalState(@Nonnull org.apache.flink.api.common.typeutils.TypeSerializer<N> namespaceSerializer, @Nonnull org.apache.flink.api.common.state.StateDescriptor<S,SV> stateDesc, @Nonnull org.apache.flink.runtime.state.StateSnapshotTransformer.StateSnapshotTransformFactory<SEV> snapshotTransformFactory, boolean allowFutureMetadataUpdates) throws Exception
- Throws:
Exception
-
numKeyValueStateEntries
@VisibleForTesting public int numKeyValueStateEntries()
-
requiresLegacySynchronousTimerSnapshots
public boolean requiresLegacySynchronousTimerSnapshots(org.apache.flink.runtime.checkpoint.SnapshotType checkpointType)
- Overrides:
requiresLegacySynchronousTimerSnapshots in class org.apache.flink.runtime.state.AbstractKeyedStateBackend<K>
-
isSafeToReuseKVState
public boolean isSafeToReuseKVState()
-
compactState
@VisibleForTesting public void compactState(org.apache.flink.api.common.state.StateDescriptor<?,?> stateDesc) throws org.rocksdb.RocksDBException
- Throws:
org.rocksdb.RocksDBException
-
getAsyncCompactAfterRestoreFuture
@VisibleForTesting public Optional<CompletableFuture<Void>> getAsyncCompactAfterRestoreFuture()
-
-