Package org.apache.cassandra.utils
Class DiagnosticSnapshotService
java.lang.Object
    org.apache.cassandra.utils.DiagnosticSnapshotService
public class DiagnosticSnapshotService extends java.lang.Object
Provides a means to take snapshots when triggered by anomalous events or when the breaking of invariants is detected. When debugging certain classes of problems, having access to the relevant set of sstables when the problem is detected (or as close to then as possible) can be invaluable.

This class performs two functions. The first applies on a replica where an anomaly is detected: the class provides methods to issue snapshot requests to a provided set of replicas. For instance, if rows with duplicate clusterings are detected (CASSANDRA-15789) during a read, a snapshot request will be issued to all participating replicas. If they are detected during compaction, only the replica itself will receive the request. Requests are issued at a maximum rate of 1 per minute for any given table. Any additional triggers for the same table during the 60-second window are dropped, regardless of the replica set. This window is configurable via a system property (cassandra.diagnostic_snapshot_interval_nanos), but this is intended for use in testing only and operators are not expected to override the default.

The second function is to handle snapshot requests on replicas. Snapshot names are prefixed with strings specific to the reason that triggered them. To manage consumption of disk space, replicas are restricted to taking a single snapshot for each prefix in a single calendar day. So if duplicate rows are detected by multiple coordinators during reads with the same replica set (or overlapping sets) on the same table, the coordinators may each issue snapshot requests, but the replicas will only accept the first one they receive. Further requests will be dropped on the replica side.
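
The per-table throttling described above can be illustrated with a small, self-contained sketch. This is not the Cassandra implementation: the class name SnapshotRateLimiter, the method tryAcquire, the string table key, and the injectable clock are all hypothetical names chosen for the example. It only shows one plausible way to drop triggers that arrive for the same table inside the configured window.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical sketch: allows at most one snapshot request per table per interval.
final class SnapshotRateLimiter {
    private final long intervalNanos;
    private final LongSupplier clock; // injectable for tests; System::nanoTime in practice
    private final ConcurrentHashMap<String, Long> lastIssued = new ConcurrentHashMap<>();

    SnapshotRateLimiter(long intervalNanos, LongSupplier clock) {
        this.intervalNanos = intervalNanos;
        this.clock = clock;
    }

    // Returns true if a request for this table may be issued now.
    // Triggers inside the window are dropped (false), whatever the replica set.
    boolean tryAcquire(String tableId) {
        long now = clock.getAsLong();
        boolean[] acquired = {false};
        // compute() runs atomically per key, so concurrent triggers cannot both win.
        lastIssued.compute(tableId, (key, last) -> {
            if (last == null || now - last >= intervalNanos) {
                acquired[0] = true;
                return now;
            }
            return last;
        });
        return acquired[0];
    }
}
```

With a 60-second interval, `new SnapshotRateLimiter(TimeUnit.SECONDS.toNanos(60), System::nanoTime)` would accept the first trigger for a table and reject later ones until the window has elapsed, while triggers for other tables are unaffected.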
-
-
Field Summary
Modifier and Type                   Field
static java.lang.String             DUPLICATE_ROWS_DETECTED_SNAPSHOT_PREFIX
static DiagnosticSnapshotService    instance
static java.lang.String             REPAIRED_DATA_MISMATCH_SNAPSHOT_PREFIX
-
Method Summary
Modifier and Type          Method
static void                duplicateRows(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas)
static java.lang.String    getSnapshotName(java.lang.String prefix)
static boolean             isDiagnosticSnapshotRequest(SnapshotCommand command)
static void                repairedDataMismatch(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas)
static void                repairedDataMismatch(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas, java.util.List<Range<Token>> ranges)
void                       shutdownAndWait(long timeout, java.util.concurrent.TimeUnit unit)
static java.lang.String    (see getSnapshotName above)
static void                snapshot(SnapshotCommand command, java.util.List<Range<Token>> ranges, InetAddressAndPort initiator)
-
-
-
Field Detail
-
instance
public static final DiagnosticSnapshotService instance
-
REPAIRED_DATA_MISMATCH_SNAPSHOT_PREFIX
public static final java.lang.String REPAIRED_DATA_MISMATCH_SNAPSHOT_PREFIX
- See Also:
- Constant Field Values
-
DUPLICATE_ROWS_DETECTED_SNAPSHOT_PREFIX
public static final java.lang.String DUPLICATE_ROWS_DETECTED_SNAPSHOT_PREFIX
- See Also:
- Constant Field Values
-
-
Method Detail
-
repairedDataMismatch
public static void repairedDataMismatch(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas)
-
repairedDataMismatch
public static void repairedDataMismatch(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas, java.util.List<Range<Token>> ranges)
-
duplicateRows
public static void duplicateRows(TableMetadata metadata, java.lang.Iterable<InetAddressAndPort> replicas)
-
isDiagnosticSnapshotRequest
public static boolean isDiagnosticSnapshotRequest(SnapshotCommand command)
-
snapshot
public static void snapshot(SnapshotCommand command, java.util.List<Range<Token>> ranges, InetAddressAndPort initiator)
-
getSnapshotName
public static java.lang.String getSnapshotName(java.lang.String prefix)
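
The class description states that snapshot names carry a reason-specific prefix and that a replica takes at most one snapshot per prefix per calendar day. The sketch below shows one way such a rule could be enforced, assuming the name is the prefix followed by the date. That naming scheme, the class DailySnapshotGuard, its methods, and the prefix "ExamplePrefix-" are assumptions made for illustration; this page does not document the actual format returned by getSnapshotName.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one snapshot per prefix per calendar day.
final class DailySnapshotGuard {
    // Names of snapshots already taken; a real replica might check the disk instead.
    private final Set<String> taken = ConcurrentHashMap.newKeySet();

    // Assumed naming scheme: prefix followed by the date as yyyyMMdd.
    static String snapshotName(String prefix, LocalDate date) {
        return prefix + DateTimeFormatter.BASIC_ISO_DATE.format(date);
    }

    // Returns true only for the first request with this prefix on this date.
    // Later requests map to the same name and are dropped.
    boolean tryTake(String prefix, LocalDate date) {
        return taken.add(snapshotName(prefix, date));
    }
}
```

Because the date is part of the name, requests from several coordinators on the same day collapse onto a single snapshot name, and a request on the following day produces a new name and is accepted.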
-
shutdownAndWait
public void shutdownAndWait(long timeout, java.util.concurrent.TimeUnit unit) throws java.lang.InterruptedException, java.util.concurrent.TimeoutException
- Throws:
java.lang.InterruptedException
java.util.concurrent.TimeoutException
-
-