Class ConsistentSession

  • Direct Known Subclasses:
    CoordinatorSession, LocalSession

    public abstract class ConsistentSession
    extends java.lang.Object
    Base class for consistent Local and Coordinator sessions

    There are 4 stages to a consistent incremental repair.

    Repair prepare

    First, the normal ActiveRepairService.prepareForRepair(TimeUUID, InetAddressAndPort, Set, RepairOption, boolean, List) step runs, which sends out PrepareMessage and creates an ActiveRepairService.ParentRepairSession on the coordinator and each of the neighbors.

    Consistent prepare

    The consistent prepare step promotes the parent repair session to a consistent session, and isolates the sstables being repaired from other sstables. First, the coordinator sends a PrepareConsistentRequest message to each repair participant (including itself). When received, the node creates a LocalSession instance, sets its state to PREPARING, persists it, and begins preparing the tables for incremental repair, which segregates the data being repaired from the rest of the table data. When the preparation completes, the session state is set to PREPARED, and a PrepareConsistentResponse is sent to the coordinator indicating success or failure. If the pending anti-compaction fails, the local session state is set to FAILED.

    (see LocalSessions.handlePrepareMessage(org.apache.cassandra.net.Message))

    Once the coordinator receives positive PrepareConsistentResponse messages from all the participants, the coordinator begins the normal repair process.

    (see CoordinatorSession#handlePrepareResponse(InetAddressAndPort, boolean))
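
The coordinator side of the consistent prepare step can be pictured as waiting on one response per participant. The following is only a minimal sketch of that bookkeeping, not the actual CoordinatorSession code: the class name PrepareTracker, the Outcome enum, and the use of plain strings for participant addresses are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not part of Cassandra.
final class PrepareTracker {
    enum Outcome { WAITING, ALL_PREPARED, FAILED }

    // Participants that have not yet answered with a positive response.
    private final Set<String> pending;
    private boolean failed = false;

    PrepareTracker(Set<String> participants) {
        this.pending = new HashSet<>(participants);
    }

    // Records one prepare response and reports what the coordinator should do next.
    synchronized Outcome onPrepareResponse(String participant, boolean success) {
        if (failed) {
            return Outcome.FAILED;          // a failure is sticky
        }
        if (!success) {
            failed = true;                  // one negative response fails the session
            return Outcome.FAILED;
        }
        pending.remove(participant);
        // Repair may only start once every participant has answered positively.
        return pending.isEmpty() ? Outcome.ALL_PREPARED : Outcome.WAITING;
    }
}
```

In this sketch, repair would start only when onPrepareResponse returns ALL_PREPARED, and any negative response would route the session to the failure process.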

    Repair

    The coordinator runs the normal data repair process against the sstables segregated in the previous step. When a node receives a ValidationRequest, it sets its local session state to REPAIRING.

    If all of the RepairSessions complete successfully, the coordinator begins the Finalization process. Otherwise, it begins the Failure process.

    Finalization

    The finalization step finishes the session and promotes the sstables to repaired. The coordinator begins by sending FinalizePropose messages to each of the participants. Each participant will set its state to FINALIZE_PROMISED and respond with a FinalizePromise message. Once the coordinator has received promise messages from all participants, it will send a FinalizeCommit message to all of them, ending the coordinator session. When a node receives the FinalizeCommit message, it will set its session's state to FINALIZED, completing the LocalSession.

    For the sake of simplicity, finalization does not immediately mark pending repair sstables repaired because of potential conflicts with in-progress compactions. The sstables will be marked repaired as part of the normal compaction process.

    On the coordinator side, see CoordinatorSession.finalizePropose(), CoordinatorSession#handleFinalizePromise(InetAddressAndPort, boolean), & CoordinatorSession.finalizeCommit()

    On the local session side, see LocalSessions#handleFinalizeProposeMessage(InetAddressAndPort, FinalizePropose) & LocalSessions#handleFinalizeCommitMessage(InetAddressAndPort, FinalizeCommit)
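
The local session states named so far (PREPARING, PREPARED, REPAIRING, FINALIZE_PROMISED, FINALIZED, FAILED) form a small state machine. The sketch below is one way to write down the transitions implied by the description; the enum name SessionState and its helper methods are hypothetical, and the rule that every non-terminal state may move to FAILED is an assumption rather than something this page states outright.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; transition rules are inferred from the prose above.
enum SessionState {
    PREPARING, PREPARED, REPAIRING, FINALIZE_PROMISED, FINALIZED, FAILED;

    // States reachable directly from this one.
    Set<SessionState> next() {
        switch (this) {
            case PREPARING:         return EnumSet.of(PREPARED, FAILED);
            case PREPARED:          return EnumSet.of(REPAIRING, FAILED);
            case REPAIRING:         return EnumSet.of(FINALIZE_PROMISED, FAILED);
            case FINALIZE_PROMISED: return EnumSet.of(FINALIZED, FAILED); // FAILED here is an assumption
            default:                return EnumSet.noneOf(SessionState.class); // terminal states
        }
    }

    boolean canTransitionTo(SessionState target) {
        return next().contains(target);
    }

    // Mirrors the "completed" notion used on this page: FINALIZED or FAILED.
    boolean isCompleted() {
        return this == FINALIZED || this == FAILED;
    }
}
```

Encoding the transitions in one place makes it easy to reject out-of-order messages, for example a FinalizeCommit arriving at a session that never promised.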

    Failure

    If there are any failures or problems during the process above, the session will be failed. When a session is failed, the coordinator will send FailSession messages to each of the participants. In some cases (basically those not including Validation and Sync) errors are reported back to the coordinator by the local session, at which point the coordinator will send FailSession messages out.

    Just as with finalization, sstables aren't immediately moved back to unrepaired, but will be demoted as part of the normal compaction process.

    See LocalSessions#failSession(UUID, boolean) and CoordinatorSession.fail()

    Failure Recovery & Session Cleanup

    There are a few scenarios where sessions can get stuck. If a node fails mid-session, or it misses a FailSession or FinalizeCommit message, its session will never finish. To address this, there is a cleanup task that runs every 10 minutes that attempts to complete idle sessions.

    If a session is not completed (not FINALIZED or FAILED) and there's been no activity on the session for over an hour, the cleanup task will attempt to finish the session by learning the session state of the other participants. To do this, it sends a StatusRequest message to the other session participants. The participants respond with a StatusResponse message, notifying the sender of their state. If the sender receives a FAILED response from any of the participants, it fails the session locally. If it receives a FINALIZED response from any of the participants, it will set its state to FINALIZED as well. Since the coordinator won't finalize sessions until it's received FinalizePromise messages from all participants, this is safe.
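
The status exchange above amounts to adopting a terminal state reported by any peer. A minimal sketch of that rule follows; StatusResolver and its nested State enum are hypothetical names. Checking FAILED before FINALIZED when both appear is an assumption made here for determinism, since the page describes responses being handled as they arrive and does not define a precedence.

```java
import java.util.Collection;

// Hypothetical illustration only; not the LocalSessions implementation.
final class StatusResolver {
    enum State { PREPARING, PREPARED, REPAIRING, FINALIZE_PROMISED, FINALIZED, FAILED }

    // Returns the state the local session should end up in after seeing peer states.
    static State resolve(State local, Collection<State> peerStates) {
        if (peerStates.contains(State.FAILED)) {
            return State.FAILED;       // any failed peer fails the session locally
        }
        if (peerStates.contains(State.FINALIZED)) {
            return State.FINALIZED;    // safe: a peer only finalizes after all promised
        }
        return local;                  // nothing learned; keep waiting
    }
}
```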

    If a session is not completed, and hasn't had any activity for over a day, the session is auto-failed.

    Once a session has been completed for over 2 days, it's deleted.

    Operators can also manually fail sessions with nodetool repair_admin --cancel

    See LocalSessions.cleanup() and RepairAdmin
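
The time thresholds in this section (status check after an hour idle, auto-fail after a day idle, deletion after two days completed) can be summarized as a single decision function. This is a sketch under stated assumptions, not the logic of LocalSessions.cleanup(): CleanupPolicy, Action, and decide are hypothetical names, and the sketch treats "time since completion" as the time since the session's last update.

```java
import java.time.Duration;

// Hypothetical illustration only; thresholds are the ones quoted on this page.
final class CleanupPolicy {
    enum Action { NONE, REQUEST_STATUS, AUTO_FAIL, DELETE }

    static final Duration STATUS_CHECK_AFTER = Duration.ofHours(1);
    static final Duration AUTO_FAIL_AFTER = Duration.ofDays(1);
    static final Duration DELETE_AFTER = Duration.ofDays(2);

    // Decides what one pass of the cleanup task would do for a single session.
    static Action decide(boolean completed, Duration sinceLastUpdate) {
        if (completed) {
            // Completed sessions are only ever deleted, and only once old enough.
            return sinceLastUpdate.compareTo(DELETE_AFTER) > 0 ? Action.DELETE : Action.NONE;
        }
        if (sinceLastUpdate.compareTo(AUTO_FAIL_AFTER) > 0) {
            return Action.AUTO_FAIL;        // idle for over a day
        }
        if (sinceLastUpdate.compareTo(STATUS_CHECK_AFTER) > 0) {
            return Action.REQUEST_STATUS;   // idle for over an hour: ask the peers
        }
        return Action.NONE;
    }
}
```

The auto-fail check comes before the status check because a session idle for over a day is also idle for over an hour, and the stricter rule should win.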

    • Field Detail

      • sessionID

        public final TimeUUID sessionID
      • tableIds

        public final com.google.common.collect.ImmutableSet<TableId> tableIds
      • repairedAt

        public final long repairedAt
      • ranges

        public final com.google.common.collect.ImmutableSet<Range<Token>> ranges
      • participants

        public final com.google.common.collect.ImmutableSet<InetAddressAndPort> participants
    • Method Detail

      • isCompleted

        public boolean isCompleted()
      • intersects

        public boolean intersects​(java.lang.Iterable<Range<Token>> otherRanges)
      • equals

        public boolean equals​(java.lang.Object o)
        Overrides:
        equals in class java.lang.Object
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class java.lang.Object
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object