Class DiagnosticSnapshotService


  • public class DiagnosticSnapshotService
    extends java.lang.Object
    Provides a means to take snapshots when triggered by anomalous events or when the breaking of invariants is detected. When debugging certain classes of problems, having access to the relevant set of sstables when the problem is detected (or as close to then as possible) can be invaluable. This class performs two functions; on a replica where an anomaly is detected, it provides methods to issue snapshot requests to a provided set of replicas. For instance, if rows with duplicate clusterings are detected (CASSANDRA-15789) during a read, a snapshot request will be issued to all participating replicas. If detected during compaction, only the replica itself will receive the request. Requests are issued at a maximum rate of 1 per minute for any given table. Any additional triggers for the same table during the 60 second window are dropped, regardless of the replica set. This window is configurable via a system property (cassandra.diagnostic_snapshot_interval_nanos), but this is intended for use in testing only and operators are not expected to override the default. The second function performed is to handle snapshot requests on replicas. Snapshot names are prefixed with strings specific to the reason which triggered them. To manage consumption of disk space, replicas are restricted to taking a single snapshot for each prefix in a single calendar day. So if duplicate rows are detected by multiple coordinators during reads with the same replica set (or overlapping sets) on the same table, the coordinators may each issue snapshot requests, but the replicas will only accept the first one they receive. Further requests will be dropped on the replica side.
    • Field Detail

      • REPAIRED_DATA_MISMATCH_SNAPSHOT_PREFIX

        public static final java.lang.String REPAIRED_DATA_MISMATCH_SNAPSHOT_PREFIX
        See Also:
        Constant Field Values
      • DUPLICATE_ROWS_DETECTED_SNAPSHOT_PREFIX

        public static final java.lang.String DUPLICATE_ROWS_DETECTED_SNAPSHOT_PREFIX
        See Also:
        Constant Field Values
    • Method Detail

      • isDiagnosticSnapshotRequest

        public static boolean isDiagnosticSnapshotRequest​(SnapshotCommand command)
      • getSnapshotName

        public static java.lang.String getSnapshotName​(java.lang.String prefix)
      • shutdownAndWait

        public void shutdownAndWait​(long timeout,
                                    java.util.concurrent.TimeUnit unit)
                             throws java.lang.InterruptedException,
                                    java.util.concurrent.TimeoutException
        Throws:
        java.lang.InterruptedException
        java.util.concurrent.TimeoutException