Package org.apache.cassandra.db.memtable
Interface Memtable
-
- All Superinterfaces:
java.lang.Comparable<Memtable>, UnfilteredSource
- All Known Implementing Classes:
AbstractAllocatorMemtable, AbstractMemtable, AbstractMemtableWithCommitlog, AbstractShardedMemtable, ShardedSkipListMemtable, SkipListMemtable, TrieMemtable
public interface Memtable extends java.lang.Comparable<Memtable>, UnfilteredSource
Memtable interface. This defines the operations the ColumnFamilyStore can perform with memtables. They are of several types:
- construction factory interface
- write and read operations: put, rowIterator and partitionIterator
- statistics and features, including partition counts, data size, encoding stats, written columns
- memory usage tracking, including methods of retrieval and of adding extra allocated space (used by non-CFS secondary indexes)
- flush functionality, preparing the set of partitions to flush for given ranges
- lifecycle management, i.e. operations that prepare and execute the switch to a different memtable, together with ways of tracking the affected commit log spans
See Memtable_API.md for details on implementing and using alternative memtable implementations.
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Interface, Description
static interface
Memtable.Factory
Factory interface for constructing memtables, and querying write durability features.
static interface
Memtable.FlushablePartitionSet<P extends Partition>
A collection of partitions for flushing plus some information required for writing an sstable.
static class
Memtable.LastCommitLogPosition
Special commit log position marker used in the upper bound marker setting process (see ColumnFamilyStore.setCommitLogUpperBound(java.util.concurrent.atomic.AtomicReference<org.apache.cassandra.db.commitlog.CommitLogPosition>) and accepts(org.apache.cassandra.utils.concurrent.OpOrder.Group, org.apache.cassandra.db.commitlog.CommitLogPosition)).
static class
Memtable.MemoryUsage
static interface
Memtable.Owner
Interface for providing signals back and requesting information from the owner, i.e.
-
Field Summary
Fields
Modifier and Type, Field, Description
static long
NO_MIN_TIMESTAMP
-
Method Summary
Modifier and Type, Method, Description
boolean
accepts(OpOrder.Group opGroup, CommitLogPosition commitLogPosition)
Decide if this memtable should take a write with the given parameters, or if the write should go to the next memtable.
void
addMemoryUsageTo(Memtable.MemoryUsage usage)
Add this memtable's used memory to the given usage object.
default int
compareTo(Memtable that)
Order memtables by time as reflected in the commit log position at time of construction.
void
discard()
This memtable is no longer in use or required for outstanding flushes or operations.
CommitLogPosition
getApproximateCommitLogLowerBound()
Approximate commit log lower bound, <= getCommitLogLowerBound, used as a time stamp for ordering.
CommitLogPosition
getCommitLogLowerBound()
The commit log position at the time that this memtable was created.
Memtable.LastCommitLogPosition
getFinalCommitLogUpperBound()
The commit log position at the time that this memtable was switched out.
Memtable.FlushablePartitionSet<?>
getFlushSet(PartitionPosition from, PartitionPosition to)
Get the collection of data between the given partition boundaries in a form suitable for flushing.
LifecycleTransaction
getFlushTransaction()
long
getLiveDataSize()
Size of the data not accounting for any metadata / mapping overheads.
static Memtable.MemoryUsage
getMemoryUsage(Memtable memtable)
Shorthand for getting a given memtable's memory usage.
boolean
isClean()
True if the memtable contains no data.
void
localRangesUpdated()
Called when the known ranges have been updated and owner.localRangeSplits() may return different values.
void
markExtraOffHeapUsed(long additionalSpace, OpOrder.Group opGroup)
Adjust the used off-heap space by the given size (e.g. to reflect memory used by a non-table-based index).
void
markExtraOnHeapUsed(long additionalSpace, OpOrder.Group opGroup)
Adjust the used on-heap space by the given size (e.g. to reflect memory used by a non-table-based index).
boolean
mayContainDataBefore(CommitLogPosition position)
True if the memtable can contain any data that was written before the given commit log position.
TableMetadata
metadata()
The table's definition metadata.
void
metadataUpdated()
Called when the table's metadata is updated.
static Memtable.MemoryUsage
newMemoryUsage()
Creates a holder for memory usage collection.
long
operationCount()
Number of "operations" (in the sense defined in PartitionUpdate.operationCount()) the memtable has executed.
long
partitionCount()
Number of partitions stored in the memtable.
void
performSnapshot(java.lang.String snapshotName)
If the memtable needs to do some special action for snapshots (e.g. because it is persistent and does not want to flush), it should return false from shouldSwitch with reason SNAPSHOT and implement this method.
long
put(PartitionUpdate update, UpdateTransaction indexer, OpOrder.Group opGroup)
Put new data in the memtable.
LifecycleTransaction
setFlushTransaction(LifecycleTransaction transaction)
boolean
shouldSwitch(ColumnFamilyStore.FlushReason reason)
Decides whether the memtable should be switched/flushed for the passed reason.
void
switchOut(OpOrder.Barrier writeBarrier, java.util.concurrent.atomic.AtomicReference<CommitLogPosition> commitLogUpperBound)
Called to tell the memtable that it is being switched out and will be flushed (or dropped) and discarded.
-
Methods inherited from interface org.apache.cassandra.db.rows.UnfilteredSource
getMinLocalDeletionTime, getMinTimestamp, partitionIterator, rowIterator, rowIterator
-
-
-
-
Field Detail
-
NO_MIN_TIMESTAMP
static final long NO_MIN_TIMESTAMP
- See Also:
- Constant Field Values
-
-
Method Detail
-
put
long put(PartitionUpdate update, UpdateTransaction indexer, OpOrder.Group opGroup)
Put new data in the memtable. This operation may block until enough memory is available in the memory pool.
- Parameters:
update - the partition update, may be a new partition or an update to an existing one
indexer - receives information about the update's effect
opGroup - write operation group, used to permit the operation to complete if it is needed to complete a flush to free space.
- Returns:
- the smallest timestamp delta between corresponding rows from existing and update. A timestamp delta is computed as the difference between the cells and DeletionTimes from any existing partition and those in update. See CASSANDRA-7979.
-
partitionCount
long partitionCount()
Number of partitions stored in the memtable
-
getLiveDataSize
long getLiveDataSize()
Size of the data not accounting for any metadata / mapping overheads
-
operationCount
long operationCount()
Number of "operations" (in the sense defined in PartitionUpdate.operationCount()) the memtable has executed.
-
metadata
TableMetadata metadata()
The table's definition metadata. Note that this tracks the current state of the table and is not necessarily the same as what was used to create the memtable.
-
addMemoryUsageTo
void addMemoryUsageTo(Memtable.MemoryUsage usage)
Add this memtable's used memory to the given usage object. This can be used to retrieve a single memtable's usage as well as to combine the ones of related memtables (e.g. a table and its table-based secondary indexes).
-
newMemoryUsage
static Memtable.MemoryUsage newMemoryUsage()
Creates a holder for memory usage collection. This is used to track on- and off-heap memory, as well as the ratio to the total permitted memtable memory.
-
getMemoryUsage
static Memtable.MemoryUsage getMemoryUsage(Memtable memtable)
Shorthand for getting a given memtable's memory usage. Implemented as a static to prevent implementations altering expectations by e.g. returning a cached object.
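The accumulation pattern described for addMemoryUsageTo, newMemoryUsage and getMemoryUsage can be illustrated with a minimal sketch. It does not use the Cassandra classes: UsageSketch, ReportsMemory, FixedUsageMemtable and the field names onHeapBytes and offHeapBytes are hypothetical stand-ins, and the ownership ratio mentioned above is left out.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Memtable.MemoryUsage: a plain mutable accumulator.
final class UsageSketch {
    long onHeapBytes;
    long offHeapBytes;
}

// Hypothetical stand-in for the memory-reporting part of the Memtable interface.
interface ReportsMemory {
    // Each implementation adds its own usage to the holder it is given.
    void addMemoryUsageTo(UsageSketch usage);

    // Creates an empty holder.
    static UsageSketch newMemoryUsage() {
        return new UsageSketch();
    }

    // Static shorthand: always starts from a fresh holder, so an implementation
    // has no opportunity to hand back a cached or shared object.
    static UsageSketch getMemoryUsage(ReportsMemory memtable) {
        UsageSketch usage = newMemoryUsage();
        memtable.addMemoryUsageTo(usage);
        return usage;
    }
}

// Hypothetical memtable that reports fixed sizes, for illustration only.
final class FixedUsageMemtable implements ReportsMemory {
    private final long onHeap;
    private final long offHeap;

    FixedUsageMemtable(long onHeap, long offHeap) {
        this.onHeap = onHeap;
        this.offHeap = offHeap;
    }

    @Override
    public void addMemoryUsageTo(UsageSketch usage) {
        usage.onHeapBytes += onHeap;
        usage.offHeapBytes += offHeap;
    }
}

final class MemoryUsageDemo {
    // Combines the usage of several related memtables into one holder.
    static UsageSketch combined(List<ReportsMemory> memtables) {
        UsageSketch total = ReportsMemory.newMemoryUsage();
        for (ReportsMemory memtable : memtables)
            memtable.addMemoryUsageTo(total);
        return total;
    }

    public static void main(String[] args) {
        ReportsMemory base = new FixedUsageMemtable(100, 20);
        ReportsMemory index = new FixedUsageMemtable(30, 5);
        UsageSketch total = combined(Arrays.asList(base, index));
        System.out.println(total.onHeapBytes + " on-heap, " + total.offHeapBytes + " off-heap");
    }
}
```

Passing the holder in, rather than returning a new object from each memtable, is what lets one call site serve both the single-memtable case and the combined case.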
-
markExtraOnHeapUsed
void markExtraOnHeapUsed(long additionalSpace, OpOrder.Group opGroup)
Adjust the used on-heap space by the given size (e.g. to reflect memory used by a non-table-based index). This operation may block until enough memory is available in the memory pool.
- Parameters:
additionalSpace - the number of allocated bytes
opGroup - write operation group, used to permit the operation to complete if it is needed to complete a flush to free space.
-
markExtraOffHeapUsed
void markExtraOffHeapUsed(long additionalSpace, OpOrder.Group opGroup)
Adjust the used off-heap space by the given size (e.g. to reflect memory used by a non-table-based index). This operation may block until enough memory is available in the memory pool.
- Parameters:
additionalSpace - the number of allocated bytes
opGroup - write operation group, used to permit the operation to complete if it is needed to complete a flush to free space.
-
getFlushSet
Memtable.FlushablePartitionSet<?> getFlushSet(PartitionPosition from, PartitionPosition to)
Get the collection of data between the given partition boundaries in a form suitable for flushing.
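As a rough illustration of selecting the data between two partition boundaries, the sketch below takes a sub-range view of a sorted map. Everything in it is a stand-in: FlushSetSketch is hypothetical, long tokens replace PartitionPosition, strings replace partitions, and treating from as inclusive and to as exclusive is an assumption that this page does not state.

```java
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical memtable stand-in that keeps partitions sorted by token.
final class FlushSetSketch {
    private final ConcurrentSkipListMap<Long, String> partitions = new ConcurrentSkipListMap<>();

    void put(long token, String partition) {
        partitions.put(token, partition);
    }

    // Assumption: 'from' is inclusive and 'to' is exclusive, so that adjacent
    // ranges cover every partition exactly once.
    ConcurrentNavigableMap<Long, String> getFlushSet(long from, long to) {
        return partitions.subMap(from, true, to, false);
    }

    public static void main(String[] args) {
        FlushSetSketch memtable = new FlushSetSketch();
        memtable.put(10, "a");
        memtable.put(20, "b");
        memtable.put(30, "c");
        System.out.println(memtable.getFlushSet(10, 30).keySet()); // prints [10, 20]
        System.out.println(memtable.getFlushSet(30, 40).keySet()); // prints [30]
    }
}
```

With half-open ranges, splitting a flush across several writers needs no de-duplication at the boundaries.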
-
switchOut
void switchOut(OpOrder.Barrier writeBarrier, java.util.concurrent.atomic.AtomicReference<CommitLogPosition> commitLogUpperBound)
Called to tell the memtable that it is being switched out and will be flushed (or dropped) and discarded. Will be followed by a getFlushSet(org.apache.cassandra.db.PartitionPosition, org.apache.cassandra.db.PartitionPosition) call (if the table is not truncated or dropped), and a discard().
- Parameters:
writeBarrier - The barrier that will signal that all writes to this memtable have completed. That is, the point after which writes cannot be accepted by this memtable (it is permitted for writes before this barrier to go into the next; see accepts(org.apache.cassandra.utils.concurrent.OpOrder.Group, org.apache.cassandra.db.commitlog.CommitLogPosition)).
commitLogUpperBound - The upper commit log position for this memtable. The value may be modified after this call and will match the next memtable's lower commit log bound.
-
discard
void discard()
This memtable is no longer in use or required for outstanding flushes or operations. All held memory must be released.
-
accepts
boolean accepts(OpOrder.Group opGroup, CommitLogPosition commitLogPosition)
Decide if this memtable should take a write with the given parameters, or if the write should go to the next memtable. This enforces that no writes after the barrier set by switchOut(org.apache.cassandra.utils.concurrent.OpOrder.Barrier, java.util.concurrent.atomic.AtomicReference<org.apache.cassandra.db.commitlog.CommitLogPosition>) can be accepted, and is also used to define a shared commit log bound as the upper for this memtable and lower for the next.
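One way to picture the interplay between the barrier and the shared commit log bound is the simplified model below. It is a sketch under stated assumptions, not the project's implementation: AcceptsSketch and Bound are hypothetical, operation groups are reduced to ordinals (a group numbered below the barrier is taken to have started before it), commit log positions are plain longs, and the isFinal flag plays the role of the Memtable.LastCommitLogPosition marker.

```java
import java.util.concurrent.atomic.AtomicReference;

final class AcceptsSketch {
    // Hypothetical stand-in for a commit log position plus a "finalized" marker.
    static final class Bound {
        final long position;
        final boolean isFinal;

        Bound(long position, boolean isFinal) {
            this.position = position;
            this.isFinal = isFinal;
        }
    }

    // null until the memtable is switched out.
    private volatile Long barrierOrdinal = null;
    private final AtomicReference<Bound> upperBound = new AtomicReference<>(new Bound(0, false));

    // Stand-in for switchOut: records the barrier.
    void switchOut(long barrier) {
        barrierOrdinal = barrier;
    }

    // Stand-in for fixing the upper bound: never lower than a position already accepted.
    void finalizeUpperBound(long currentLogPosition) {
        upperBound.updateAndGet(b -> new Bound(Math.max(b.position, currentLogPosition), true));
    }

    boolean accepts(long opGroupOrdinal, long commitLogPosition) {
        Long barrier = barrierOrdinal;
        if (barrier == null)
            return true; // not switched out: this memtable takes every write
        if (opGroupOrdinal >= barrier)
            return false; // started after the barrier: goes to the next memtable
        while (true) {
            Bound current = upperBound.get();
            if (current.isFinal)
                return current.position >= commitLogPosition; // the bound is fixed: obey it
            if (current.position >= commitLogPosition)
                return true; // already covered by the provisional bound
            // Raise the provisional bound to cover this write; retry if we lost a race.
            if (upperBound.compareAndSet(current, new Bound(commitLogPosition, false)))
                return true;
        }
    }

    public static void main(String[] args) {
        AcceptsSketch memtable = new AcceptsSketch();
        System.out.println(memtable.accepts(5, 100)); // true: no barrier yet
        memtable.switchOut(10);
        System.out.println(memtable.accepts(12, 150)); // false: after the barrier
        System.out.println(memtable.accepts(7, 120)); // true: raises the bound to 120
        memtable.finalizeUpperBound(110);
        System.out.println(memtable.accepts(8, 130)); // false: beyond the final bound
    }
}
```

Because a write either raises the provisional bound or is judged against the final one, every commit log position ends up on exactly one side of the boundary shared by the two memtables.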
-
getApproximateCommitLogLowerBound
CommitLogPosition getApproximateCommitLogLowerBound()
Approximate commit log lower bound, <= getCommitLogLowerBound, used as a time stamp for ordering
-
getCommitLogLowerBound
CommitLogPosition getCommitLogLowerBound()
The commit log position at the time that this memtable was created
-
getFinalCommitLogUpperBound
Memtable.LastCommitLogPosition getFinalCommitLogUpperBound()
The commit log position at the time that this memtable was switched out
-
mayContainDataBefore
boolean mayContainDataBefore(CommitLogPosition position)
True if the memtable can contain any data that was written before the given commit log position
-
isClean
boolean isClean()
True if the memtable contains no data
-
setFlushTransaction
LifecycleTransaction setFlushTransaction(LifecycleTransaction transaction)
-
getFlushTransaction
LifecycleTransaction getFlushTransaction()
-
compareTo
default int compareTo(Memtable that)
Order memtables by time as reflected in the commit log position at time of construction.
- Specified by:
compareTo in interface java.lang.Comparable<Memtable>
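The ordering described here can be sketched with stand-in types. In the example below, Position is a hypothetical replacement for CommitLogPosition (assumed to order by segment id, then by offset within the segment), and Mem is a hypothetical memtable that compares by its approximate lower bound. The mayContainDataBefore check is included as one plausible reading of that method, not as the library's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class OrderingSketch {
    // Hypothetical stand-in for CommitLogPosition.
    static final class Position implements Comparable<Position> {
        final long segmentId;
        final int offset;

        Position(long segmentId, int offset) {
            this.segmentId = segmentId;
            this.offset = offset;
        }

        @Override
        public int compareTo(Position that) {
            int bySegment = Long.compare(segmentId, that.segmentId);
            return bySegment != 0 ? bySegment : Integer.compare(offset, that.offset);
        }
    }

    // Hypothetical memtable stand-in, ordered by the position recorded at construction.
    static final class Mem implements Comparable<Mem> {
        final String name;
        final Position approximateLowerBound;

        Mem(String name, Position approximateLowerBound) {
            this.name = name;
            this.approximateLowerBound = approximateLowerBound;
        }

        @Override
        public int compareTo(Mem that) {
            return approximateLowerBound.compareTo(that.approximateLowerBound);
        }

        // Assumption: data may predate 'position' only if the memtable was created before it.
        boolean mayContainDataBefore(Position position) {
            return approximateLowerBound.compareTo(position) < 0;
        }
    }

    // Returns the memtable names from oldest to newest.
    static List<String> oldestFirst(List<Mem> memtables) {
        List<Mem> sorted = new ArrayList<>(memtables);
        Collections.sort(sorted);
        List<String> names = new ArrayList<>();
        for (Mem memtable : sorted)
            names.add(memtable.name);
        return names;
    }

    public static void main(String[] args) {
        List<Mem> memtables = new ArrayList<>();
        memtables.add(new Mem("c", new Position(2, 0)));
        memtables.add(new Mem("a", new Position(1, 50)));
        memtables.add(new Mem("b", new Position(1, 900)));
        System.out.println(oldestFirst(memtables)); // prints [a, b, c]
    }
}
```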
-
shouldSwitch
boolean shouldSwitch(ColumnFamilyStore.FlushReason reason)
Decides whether the memtable should be switched/flushed for the passed reason. Normally this will return true, but e.g. persistent memtables may choose not to flush. Returning false will trigger further action for some reasons:
- SCHEMA_CHANGE will be followed by metadataUpdated().
- OWNED_RANGES_CHANGE will be followed by localRangesUpdated().
- SNAPSHOT will be followed by performSnapshot().
- STREAMING/REPAIR will be followed by creating a FlushSet for the streamed/repaired ranges. This data will be used to create sstables, which will be streamed and then deleted.
This will not be called to perform truncation or drop (in that case the memtable is unconditionally dropped), but a flush may nevertheless be requested in that case to prepare a snapshot.
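The follow-up actions listed above can be summarised as a small dispatch. The sketch below is illustrative only: SwitchSketch, Reason and SwitchPolicy are hypothetical stand-ins, Reason contains just the reasons named in the text plus a placeholder OTHER, and the returned strings merely label what the owner would do next.

```java
final class SwitchSketch {
    // Hypothetical subset of ColumnFamilyStore.FlushReason; OTHER is a placeholder.
    enum Reason { SCHEMA_CHANGE, OWNED_RANGES_CHANGE, SNAPSHOT, STREAMING, REPAIR, OTHER }

    // Hypothetical stand-in for the shouldSwitch part of the Memtable interface.
    interface SwitchPolicy {
        boolean shouldSwitch(Reason reason);
    }

    // Describes what the owner does after asking the memtable whether to switch.
    static String nextAction(SwitchPolicy memtable, Reason reason) {
        if (memtable.shouldSwitch(reason))
            return "switch and flush";
        switch (reason) {
            case SCHEMA_CHANGE:
                return "metadataUpdated()";
            case OWNED_RANGES_CHANGE:
                return "localRangesUpdated()";
            case SNAPSHOT:
                return "performSnapshot()";
            case STREAMING:
            case REPAIR:
                return "create a FlushSet for the affected ranges";
            default:
                return "no further action";
        }
    }

    public static void main(String[] args) {
        SwitchPolicy regular = reason -> true;     // the normal case described above
        SwitchPolicy persistent = reason -> false; // e.g. a memtable that chooses not to flush
        System.out.println(nextAction(regular, Reason.SNAPSHOT));    // prints switch and flush
        System.out.println(nextAction(persistent, Reason.SNAPSHOT)); // prints performSnapshot()
    }
}
```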
-
metadataUpdated
void metadataUpdated()
Called when the table's metadata is updated. The memtable's metadata reference now points to the new version. This will not be called if shouldSwitch(org.apache.cassandra.db.ColumnFamilyStore.FlushReason) (SCHEMA_CHANGE) returns true, as the memtable will be swapped out instead.
-
localRangesUpdated
void localRangesUpdated()
Called when the known ranges have been updated and owner.localRangeSplits() may return different values. This will not be called if shouldSwitch(org.apache.cassandra.db.ColumnFamilyStore.FlushReason) (OWNED_RANGES_CHANGE) returns true, as the memtable will be swapped out instead.
-
performSnapshot
void performSnapshot(java.lang.String snapshotName)
If the memtable needs to do some special action for snapshots (e.g. because it is persistent and does not want to flush), it should return false from shouldSwitch(org.apache.cassandra.db.ColumnFamilyStore.FlushReason) with reason SNAPSHOT and implement this method.
-
-