Package org.apache.cassandra.db.memtable
Class TrieMemtable
- java.lang.Object
-
- All Implemented Interfaces: java.lang.Comparable<Memtable>, Memtable, UnfilteredSource
public class TrieMemtable extends AbstractShardedMemtable
Trie memtable implementation. Improves memory usage, garbage collection efficiency and lookup performance. The implementation is described in detail in the paper: https://www.vldb.org/pvldb/vol15/p3359-lambov.pdf
The configuration takes a single parameter:
- shards: the number of shards to split into, defaulting to the number of CPU cores.
Also see Memtable_API.md.
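The shards default described above can be illustrated with a minimal sketch. ShardOptionSketch and resolveShardCount are hypothetical names invented for this example and are not part of Cassandra; the sketch only assumes that options arrive as a string map, as in the factory(...) signature, and that the key is called "shards".

```java
import java.util.Map;

// Hypothetical helper, not Cassandra code: resolves a "shards" option
// and falls back to the number of CPU cores when it is absent.
final class ShardOptionSketch {
    // Assumed option key, taken from the class description.
    static final String SHARDS_OPTION = "shards";

    static int resolveShardCount(Map<String, String> options) {
        String raw = options.get(SHARDS_OPTION);
        if (raw == null) {
            // Default stated in the class description: one shard per CPU core.
            return Runtime.getRuntime().availableProcessors();
        }
        int shards = Integer.parseInt(raw.trim());
        if (shards < 1) {
            throw new IllegalArgumentException("shards must be positive, got " + shards);
        }
        return shards;
    }

    public static void main(String[] args) {
        System.out.println(resolveShardCount(Map.of("shards", "8")));
        System.out.println(resolveShardCount(Map.of()));
    }
}
```

How the real factory validates or rejects unknown options is not stated on this page, so the error handling here is illustrative only.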
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.cassandra.db.memtable.AbstractMemtable
AbstractMemtable.AbstractFlushablePartitionSet<P extends Partition>, AbstractMemtable.ColumnsCollector, AbstractMemtable.StatsCollector
-
Nested classes/interfaces inherited from interface org.apache.cassandra.db.memtable.Memtable
Memtable.FlushablePartitionSet<P extends Partition>, Memtable.LastCommitLogPosition, Memtable.MemoryUsage, Memtable.Owner
-
-
Field Summary
Fields (Modifier and Type, Field, Description)
static BufferType BUFFER_TYPE
    Buffer type to use for memtable tries (on- vs off-heap).
static ByteComparable.Version BYTE_COMPARABLE_VERSION
    The byte-ordering conversion version to use for memtables.
static int MAX_RECURSIVE_KEY_LENGTH
    If a key is below this length, we will use a recursive procedure for inserting data in the memtable trie.
-
Fields inherited from class org.apache.cassandra.db.memtable.AbstractShardedMemtable
boundaries, SHARDED_MEMTABLE_CONFIG_OBJECT_NAME, SHARDS_OPTION
-
Fields inherited from class org.apache.cassandra.db.memtable.AbstractAllocatorMemtable
allocator, initialComparator, initialFactory, MEMORY_POOL, owner
-
Fields inherited from class org.apache.cassandra.db.memtable.AbstractMemtable
columnsCollector, currentOperations, metadata, minLocalDeletionTime, minTimestamp, statsCollector
-
Fields inherited from interface org.apache.cassandra.db.memtable.Memtable
NO_MIN_TIMESTAMP
-
-
Method Summary
Methods (Modifier and Type, Method, Description)
void discard()
    This memtable is no longer in use or required for outstanding flushes or operations.
static org.apache.cassandra.db.memtable.TrieMemtable.Factory factory(java.util.Map<java.lang.String,java.lang.String> optionsCopy)
Memtable.FlushablePartitionSet<org.apache.cassandra.db.memtable.TrieMemtable.MemtablePartition> getFlushSet(PartitionPosition from, PartitionPosition to)
    Get the collection of data between the given partition boundaries in a form suitable for flushing.
long getLiveDataSize()
    Technically, we should scatter-gather across all the core threads because the sizes read in the following calls are not using volatile variables, but for metrics purposes this should be good enough.
long getMinLocalDeletionTime()
    Minimum local deletion time in the memtable.
long getMinTimestamp()
    Returns the minTS if one is available, otherwise NO_MIN_TIMESTAMP.
boolean isClean()
    True if the memtable contains no data.
long operationCount()
    Number of "operations" (in the sense defined in PartitionUpdate.operationCount()) the memtable has executed.
long partitionCount()
    Number of partitions stored in the memtable.
org.apache.cassandra.db.memtable.TrieMemtable.MemtableUnfilteredPartitionIterator partitionIterator(ColumnFilter columnFilter, DataRange dataRange, SSTableReadsListener readsListener)
    Returns a partition iterator for the given data range.
long put(PartitionUpdate update, UpdateTransaction indexer, OpOrder.Group opGroup)
    Should only be called by ColumnFamilyStore.apply via Keyspace.apply, which supplies the appropriate OpOrdering.
UnfilteredRowIterator rowIterator(DecoratedKey key)
UnfilteredRowIterator rowIterator(DecoratedKey key, Slices slices, ColumnFilter selectedColumns, boolean reversed, SSTableReadsListener listener)
    Returns a row iterator for the given partition, applying the specified row and column filters.
long unusedReservedMemory()
-
Methods inherited from class org.apache.cassandra.db.memtable.AbstractShardedMemtable
getDefaultShardCount
-
Methods inherited from class org.apache.cassandra.db.memtable.AbstractAllocatorMemtable
addMemoryUsageTo, createMemtableAllocatorPoolInternal, flushLargestMemtable, getAllocator, localRangesUpdated, markExtraOffHeapUsed, markExtraOnHeapUsed, metadataUpdated, performSnapshot, shouldSwitch, switchOut, toString
-
Methods inherited from class org.apache.cassandra.db.memtable.AbstractMemtableWithCommitlog
accepts, getApproximateCommitLogLowerBound, getCommitLogLowerBound, getFinalCommitLogUpperBound, mayContainDataBefore
-
Methods inherited from class org.apache.cassandra.db.memtable.AbstractMemtable
getFlushTransaction, metadata, setFlushTransaction, updateMin, updateMin
-
-
-
-
Field Detail
-
BUFFER_TYPE
public static final BufferType BUFFER_TYPE
Buffer type to use for memtable tries (on- vs off-heap)
-
MAX_RECURSIVE_KEY_LENGTH
public static final int MAX_RECURSIVE_KEY_LENGTH
If a key is below this length, we will use a recursive procedure for inserting data in the memtable trie.
See Also: Constant Field Values
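As a rough illustration of a length threshold that switches between recursive and iterative insertion, the sketch below uses a naive map-based byte trie. It is not the memtable trie itself: RecursiveInsertSketch, its Node type and the threshold value 128 are all assumptions made for the example, and the actual constant value is listed under Constant Field Values. One plausible reason for such a cutoff is to bound recursion depth for long keys, though this page does not say so.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a naive byte trie that inserts short keys recursively
// and long keys iteratively. Not the data structure used by TrieMemtable.
final class RecursiveInsertSketch {
    // Hypothetical threshold chosen for the example.
    static final int MAX_RECURSIVE_KEY_LENGTH = 128;

    static final class Node {
        final Map<Byte, Node> children = new HashMap<>();
        String value; // null when no key ends at this node
    }

    final Node root = new Node();

    void put(byte[] key, String value) {
        if (key.length < MAX_RECURSIVE_KEY_LENGTH) {
            putRecursive(root, key, 0, value);   // stack depth bounded by the threshold
        } else {
            putIterative(key, value);            // constant stack depth for long keys
        }
    }

    private static void putRecursive(Node node, byte[] key, int depth, String value) {
        if (depth == key.length) {
            node.value = value;
            return;
        }
        Node child = node.children.computeIfAbsent(key[depth], b -> new Node());
        putRecursive(child, key, depth + 1, value);
    }

    private void putIterative(byte[] key, String value) {
        Node node = root;
        for (byte b : key) {
            node = node.children.computeIfAbsent(b, x -> new Node());
        }
        node.value = value;
    }

    String get(byte[] key) {
        Node node = root;
        for (byte b : key) {
            node = node.children.get(b);
            if (node == null) {
                return null;
            }
        }
        return node.value;
    }
}
```

Both paths produce the same trie; only the call pattern differs.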
-
BYTE_COMPARABLE_VERSION
public static final ByteComparable.Version BYTE_COMPARABLE_VERSION
The byte-ordering conversion version to use for memtables.
-
-
Method Detail
-
isClean
public boolean isClean()
Description copied from interface: Memtable
True if the memtable contains no data
-
discard
public void discard()
Description copied from interface: Memtable
This memtable is no longer in use or required for outstanding flushes or operations. All held memory must be released.
Specified by: discard in interface Memtable
Overrides: discard in class AbstractAllocatorMemtable
-
put
public long put(PartitionUpdate update, UpdateTransaction indexer, OpOrder.Group opGroup)
Should only be called by ColumnFamilyStore.apply via Keyspace.apply, which supplies the appropriate OpOrdering. commitLogSegmentPosition should only be null if this is a secondary index, in which case it is *expected* to be null.
Parameters:
update - the partition update, may be a new partition or an update to an existing one
indexer - receives information about the update's effect
opGroup - write operation group, used to permit the operation to complete if it is needed to complete a flush to free space
Returns:
the smallest timestamp delta between corresponding rows from existing and update. A timestamp delta is computed as the difference between the cells and DeletionTimes from any existing partition and those in update. See CASSANDRA-7979.
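The return value can be pictured with a simplified sketch. It assumes each row is reduced to a single timestamp keyed by a string, that the delta is the absolute difference, and that Long.MAX_VALUE is returned when nothing overlaps. These are assumptions for illustration; TimestampDeltaSketch is a hypothetical class, and the real computation works on cells and DeletionTimes.

```java
import java.util.Map;

// Simplified illustration of a "smallest timestamp delta" computation.
// Rows are modelled as name -> timestamp; this is not Cassandra's data model.
final class TimestampDeltaSketch {
    // Returns the smallest |existing - update| over rows present in both maps,
    // or Long.MAX_VALUE when no row overlaps (an assumed convention).
    static long minTimestampDelta(Map<String, Long> existing, Map<String, Long> update) {
        long min = Long.MAX_VALUE;
        for (Map.Entry<String, Long> e : update.entrySet()) {
            Long old = existing.get(e.getKey());
            if (old != null) {
                min = Math.min(min, Math.abs(old - e.getValue()));
            }
        }
        return min;
    }
}
```

For example, with existing rows {a=100, b=200} and an update {a=150, b=190}, the per-row deltas are 50 and 10, so the sketch returns 10.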
-
getLiveDataSize
public long getLiveDataSize()
Technically, we should scatter-gather across all the core threads because the sizes read in the following calls are not using volatile variables, but for metrics purposes this should be good enough.
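The trade-off in the comment above can be sketched as a plain sum over per-shard counters. LiveDataSizeSketch and its Shard type are hypothetical and only assume that each shard tracks its own size in a non-volatile field; a reader on another thread may therefore see slightly stale values, which is acceptable for a metric.

```java
// Illustrative only: sums per-shard sizes without any synchronization.
final class LiveDataSizeSketch {
    static final class Shard {
        long liveDataSize; // plain field, deliberately not volatile

        void add(long bytes) {
            liveDataSize += bytes; // assumed to be updated by the shard's writer only
        }
    }

    final Shard[] shards;

    LiveDataSizeSketch(int shardCount) {
        shards = new Shard[shardCount];
        for (int i = 0; i < shardCount; i++) {
            shards[i] = new Shard();
        }
    }

    // Approximate total: each read may lag behind concurrent writers.
    long getLiveDataSize() {
        long total = 0;
        for (Shard shard : shards) {
            total += shard.liveDataSize;
        }
        return total;
    }
}
```

An exact figure would require asking each shard's owning thread for its value, which is the scatter-gather the comment refers to.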
-
operationCount
public long operationCount()
Description copied from interface: Memtable
Number of "operations" (in the sense defined in PartitionUpdate.operationCount()) the memtable has executed.
Specified by: operationCount in interface Memtable
Overrides: operationCount in class AbstractMemtable
-
partitionCount
public long partitionCount()
Description copied from interface: Memtable
Number of partitions stored in the memtable
-
getMinTimestamp
public long getMinTimestamp()
Returns the minTS if one is available, otherwise NO_MIN_TIMESTAMP. EncodingStats uses a synthetic epoch TS at 2015. We don't want to leak that (CASSANDRA-18118), so we return NO_MIN_TIMESTAMP instead.
Specified by: getMinTimestamp in interface UnfilteredSource
Overrides: getMinTimestamp in class AbstractMemtable
Returns:
The minTS, or NO_MIN_TIMESTAMP if none is available
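The sentinel substitution described here amounts to a single comparison. In the sketch below both constant values are placeholders picked for the example (the real NO_MIN_TIMESTAMP and the EncodingStats epoch are defined elsewhere and not shown on this page), and MinTimestampSketch is a hypothetical class.

```java
// Illustrative only: hides a synthetic "no data yet" epoch behind a sentinel.
final class MinTimestampSketch {
    // Placeholder sentinel; the real NO_MIN_TIMESTAMP lives in the Memtable interface.
    static final long NO_MIN_TIMESTAMP = -1L;
    // Placeholder for the synthetic 2015 epoch (microseconds), assumed for the example.
    static final long SYNTHETIC_EPOCH = 1_442_880_000_000_000L;

    // trackedMin is the minimum seen so far, or SYNTHETIC_EPOCH if nothing was recorded.
    static long getMinTimestamp(long trackedMin) {
        return trackedMin != SYNTHETIC_EPOCH ? trackedMin : NO_MIN_TIMESTAMP;
    }
}
```

Callers then only need to check for NO_MIN_TIMESTAMP and never see the internal epoch.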
-
getMinLocalDeletionTime
public long getMinLocalDeletionTime()
Description copied from interface: UnfilteredSource
Minimum local deletion time in the memtable
Specified by: getMinLocalDeletionTime in interface UnfilteredSource
Overrides: getMinLocalDeletionTime in class AbstractMemtable
-
partitionIterator
public org.apache.cassandra.db.memtable.TrieMemtable.MemtableUnfilteredPartitionIterator partitionIterator(ColumnFilter columnFilter, DataRange dataRange, SSTableReadsListener readsListener)
Description copied from interface: UnfilteredSource
Returns a partition iterator for the given data range.
Parameters:
columnFilter - filter to apply to all returned partitions
dataRange - the partition and clustering range queried
readsListener - a listener used to handle internal read events
-
rowIterator
public UnfilteredRowIterator rowIterator(DecoratedKey key, Slices slices, ColumnFilter selectedColumns, boolean reversed, SSTableReadsListener listener)
Description copied from interface: UnfilteredSource
Returns a row iterator for the given partition, applying the specified row and column filters.
Parameters:
key - the partition key
slices - the row ranges to return
selectedColumns - filter to apply to all returned partitions
reversed - true if the content should be returned in reverse order
listener - a listener used to handle internal read events
-
rowIterator
public UnfilteredRowIterator rowIterator(DecoratedKey key)
-
getFlushSet
public Memtable.FlushablePartitionSet<org.apache.cassandra.db.memtable.TrieMemtable.MemtablePartition> getFlushSet(PartitionPosition from, PartitionPosition to)
Description copied from interface: Memtable
Get the collection of data between the given partition boundaries in a form suitable for flushing.
-
factory
public static org.apache.cassandra.db.memtable.TrieMemtable.Factory factory(java.util.Map<java.lang.String,java.lang.String> optionsCopy)
-
unusedReservedMemory
public long unusedReservedMemory()
-
-