Class EntryJournalV1

java.lang.Object
org.opendaylight.controller.cluster.raft.spi.EntryJournalV1
All Implemented Interfaces:
AutoCloseable, EntryJournal

@NonNullByDefault public final class EntryJournalV1 extends Object implements EntryJournal, AutoCloseable
Baseline implementation of a RAFT log journal. The storage is a bit complicated, as we need to balance imperfect tools. The idea here is that we first recover a state snapshot and then replay any and all entries from the journal, up to the last observed commitIndex.

The second part is not something RAFT mandates. It is an artifact of our current implementation, originating from the initial design, in which stable storage was organized around Pekko persistence with all its semantics. This meant that we had no direct access to a specific entry, either for applying its state or for sending it to peers. Instead, all entries were replayed during actor recovery, i.e. before we start participating in the RAFT protocol, and were then kept on-heap.

In order to deal with all that we maintain a RaftJournal of metadata, tracking two indices:

  1. the first journalIndex to replay on recovery, and
  2. the journalIndex of the entry having the same index as the last observed RAFT commitIndex
We then use them to completely apply entries, effectively replaying our last actor state, which in turn allows us to keep the on-heap state minimal.
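
The role of the two indices can be illustrated with a minimal sketch. It is not the actual EntryJournalV1 code: the ReplaySketch class, the Metadata and Recovered records, and the in-memory list standing in for the entry journal are all hypothetical, and the sketch assumes that entries past the second index are replayed but left unapplied.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and types are not part of the real API.
final class ReplaySketch {
    // Assumed shape of the tracked metadata: the first journalIndex to replay,
    // and the journalIndex of the entry matching the last observed commitIndex.
    record Metadata(long replayFrom, long applyTo) {}

    // Entries applied to the state versus entries replayed but not yet committed.
    record Recovered(List<String> applied, List<String> pending) {}

    // The list stands in for the entry journal: element 0 holds journalIndex 1.
    static Recovered recover(List<String> journal, Metadata meta) {
        if (meta.replayFrom() < 1) {
            throw new IllegalArgumentException("journalIndex starts at 1");
        }
        List<String> applied = new ArrayList<>();
        List<String> pending = new ArrayList<>();
        for (long journalIndex = meta.replayFrom(); journalIndex <= journal.size(); journalIndex++) {
            String entry = journal.get((int) (journalIndex - 1));
            if (journalIndex <= meta.applyTo()) {
                applied.add(entry);   // at or below the commit point: apply
            } else {
                pending.add(entry);   // beyond the commit point: keep unapplied
            }
        }
        return new Recovered(applied, pending);
    }
}
```

With five entries and Metadata(2, 4), this sketch skips the first entry, applies the second through fourth, and leaves the fifth pending.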

The other part we maintain is a RaftJournal of individual entries, but there is a twist there as well: since each journal requires a fixed maximum entry size, we specify that as 256KiB and provide our own dance to store larger entries in separate files.
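
The size-based split described above could look roughly like the following sketch. The EntrySizeSketch class, its Placement enum and the placementFor method are hypothetical names. Whether the 256KiB limit is inclusive, and whether it counts any framing overhead, are assumptions made here for illustration.

```java
// Hypothetical illustration only; not the real storage logic.
final class EntrySizeSketch {
    // 256KiB, the fixed maximum entry size named in the description.
    static final int MAX_INLINE_BYTES = 256 * 1024;

    enum Placement { INLINE, SEPARATE_FILE }

    // Decide where a serialized entry of the given size would go.
    // Treating the limit as inclusive is an assumption of this sketch.
    static Placement placementFor(int serializedSize) {
        if (serializedSize < 0) {
            throw new IllegalArgumentException("Negative size " + serializedSize);
        }
        return serializedSize <= MAX_INLINE_BYTES ? Placement.INLINE : Placement.SEPARATE_FILE;
    }
}
```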

Note: this introduces the concept of journalIndex, which is currently defined to start at 1. Value 0 is explicitly reserved for unsigned long overflow. Negative values yield undefined results.
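
A caller that wants to stay within the defined range might guard its inputs along the following lines. This is only a sketch: JournalIndexSketch and requireValid are hypothetical names, and rejecting negative values outright is a stricter choice than the "undefined results" stated above.

```java
// Hypothetical illustration only; not part of the real API.
final class JournalIndexSketch {
    // journalIndex is defined to start at 1; 0 is reserved.
    static final long FIRST_JOURNAL_INDEX = 1;

    // Return the index unchanged if it is 1 or greater, otherwise throw.
    // Throwing for negative values is an assumption of this sketch.
    static long requireValid(long journalIndex) {
        if (journalIndex < FIRST_JOURNAL_INDEX) {
            throw new IllegalArgumentException("Invalid journalIndex " + journalIndex);
        }
        return journalIndex;
    }
}
```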