final class Replicator extends Actor with ActorLogging

Instance Constructors

Type Members

Value Members

final def !=(arg0: Any): Boolean
final def ##(): Int
def +(other: String): String
def ->[B](y: B): (Replicator, B)
final def ==(arg0: Any): Boolean
var allReachableClockTime: Long
def aroundPostRestart(reason: Throwable): Unit
def aroundPostStop(): Unit
def aroundPreRestart(reason: Throwable, message: Option[Any]): Unit
def aroundPreStart(): Unit
def aroundReceive(rcv: actor.Actor.Receive, msg: Any): Unit
final def asInstanceOf[T0]: T0
def clone(): AnyRef
val cluster: Cluster
def collectRemovedNodes(): Unit
var dataEntries: Map[KeyId, (DataEnvelope, Digest)]
def deleteObsoletePruningPerformed(): Unit
val deltaPropagationSelector: DeltaPropagationSelector { val gossipIntervalDivisor: Int }
def digest(envelope: DataEnvelope): Digest
val durableStore: ActorRef
val durableWildcards: Set[String]
def finalize(): Unit
def formatted(fmtstr: String): String
var fullStateGossipEnabled: Boolean
final def getClass(): Class[_]
def getData(key: KeyId): Option[DataEnvelope]
def getDigest(key: KeyId): Digest
def gossipTo(address: Address): Unit
val hasDurableKeys: Boolean
def hasSubscriber(subscriber: ActorRef): Boolean
def hashCode(): Int
def initRemovedNodePruning(): Unit
final def isInstanceOf[T0]: Boolean
def isLocalSender(): Boolean
var leader: TreeSet[Member]
def matchingRole(m: Member): Boolean
val maxPruningDisseminationNanos: Long
var nodes: Set[Address]
val normalReceive: Receive
final def notify(): Unit
final def notifyAll(): Unit
def performRemovedNodePruning(): Unit
def postRestart(reason: Throwable): Unit
def postStop(): Unit
def preRestart(reason: Throwable, message: Option[Any]): Unit
def preStart(): Unit
var previousClockTime: Long
def receive: actor.Actor.Receive
def receiveClockTick(): Unit
def receiveDeltaPropagation(fromNode: UniqueAddress, reply: Boolean, deltas: Map[KeyId, Delta]): Unit
def receiveDeltaPropagationTick(): Unit
def receiveFlushChanges(): Unit
def receiveGetKeyIds(): Unit
def receiveGetReplicaCount(): Unit
def receiveGossip(updatedData: Map[KeyId, DataEnvelope], sendBack: Boolean): Unit
def receiveGossipTick(): Unit
def receiveMemberRemoved(m: Member): Unit
def receiveMemberUp(m: Member): Unit
def receiveOtherMemberEvent(m: Member): Unit
def receiveReachable(m: Member): Unit
def receiveRead(key: KeyId): Unit
def receiveReadRepair(key: KeyId, writeEnvelope: DataEnvelope): Unit
def receiveRemovedNodePruningTick(): Unit
def receiveStatus(otherDigests: Map[KeyId, Digest], chunk: Int, totChunks: Int): Unit
def receiveSubscribe(key: KeyR, subscriber: ActorRef): Unit
def receiveTerminated(ref: ActorRef): Unit
def receiveUnreachable(m: Member): Unit
def receiveUnsubscribe(key: KeyR, subscriber: ActorRef): Unit
def receiveWeaklyUpMemberUp(m: Member): Unit
def receiveWrite(key: KeyId, envelope: DataEnvelope): Unit
implicit final val self: ActorRef
val selfAddress: Address
final def sender(): ActorRef
def setData(key: KeyId, envelope: DataEnvelope): DataEnvelope
var statusCount: Long
var statusTotChunks: Int
var subscriptionKeys: Map[KeyId, KeyR]
val supervisorStrategy: OneForOneStrategy
final def synchronized[T0](arg0: ⇒ T0): T0
def toString(): String
def unhandled(message: Any): Unit
final def wait(): Unit
final def wait(arg0: Long, arg1: Int): Unit
final def wait(arg0: Long): Unit
var weaklyUpNodes: Set[Address]
def write(key: KeyId, writeEnvelope: DataEnvelope): Option[DataEnvelope]
def writeAndStore(key: KeyId, writeEnvelope: DataEnvelope, reply: Boolean): Unit
def →[B](y: B): (Replicator, B)
A replicated in-memory data store supporting low latency and high availability requirements.

The `Replicator` actor takes care of direct replication and gossip-based dissemination of Conflict Free Replicated Data Types (CRDTs) to replicas in the cluster. The data types must be convergent CRDTs and implement `ReplicatedData`, i.e. they provide a monotonic merge function and the state changes always converge. You can use your own custom `ReplicatedData` or `DeltaReplicatedData` types, and several types are provided by this package, such as counters (`GCounter`, `PNCounter`), sets (`GSet`, `ORSet`), maps (`ORMap`, `ORMultiMap`, `LWWMap`, `PNCounterMap`) and registers (`LWWRegister`, `Flag`).

For a good introduction to the CRDT subject, watch the talks "The Final Causal Frontier" and "Eventually Consistent Data Structures" by Sean Cribbs and the talk by Mark Shapiro, and read the excellent paper "A comprehensive study of Convergent and Commutative Replicated Data Types" by Mark Shapiro et al.
The `Replicator` actor must be started on each node in the cluster, or on the group of nodes tagged with a specific role. It communicates with other `Replicator` instances with the same path (without address) that are running on other nodes. For convenience it can be used with the `DistributedData` extension, but it can also be started as an ordinary actor using `Replicator.props`. If it is started as an ordinary actor, it is important that it is given the same name, and started on the same path, on all nodes.

Delta State Replicated Data Types are supported. A delta-CRDT is a way to reduce the need for sending the full state for updates. For example, adding elements 'c' and 'd' to the set {'a', 'b'} would result in sending only the delta {'c', 'd'} and merging that with the state on the receiving side, resulting in the set {'a', 'b', 'c', 'd'}.
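The delta merge described above can be sketched with plain Scala sets; this is a stand-in illustration of the monotonic merge for a grow-only set, not the package's actual `GSet` implementation:

```scala
object DeltaMergeSketch {
  // Full state on the updating node before the change
  val state: Set[Char] = Set('a', 'b')
  // Only the delta is shipped to the other replicas
  val delta: Set[Char] = Set('c', 'd')

  // The delta is applied with the CRDT's monotonic merge: for a
  // grow-only set, merge is simply set union
  def merge(local: Set[Char], incoming: Set[Char]): Set[Char] =
    local union incoming

  def main(args: Array[String]): Unit =
    println(merge(state, delta).toSeq.sorted.mkString)  // prints "abcd"
}
```

Because union is commutative, associative and idempotent, replicas converge to the same set regardless of the order in which deltas arrive.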
The protocol for replicating the deltas supports causal consistency if the data type is marked with `RequiresCausalDeliveryOfDeltas`. Otherwise it is only eventually consistent. Without causal consistency, if elements 'c' and 'd' are added in two separate `Update` operations, these deltas may occasionally be propagated to nodes in a different order than the causal order of the updates. For this example that means the set {'a', 'b', 'd'} can be seen before element 'c' is seen. Eventually it will be {'a', 'b', 'c', 'd'}.

Update
To modify and replicate a `ReplicatedData` value you send a `Replicator.Update` message to the local `Replicator`. The current data value for the `key` of the `Update` is passed as parameter to the `modify` function of the `Update`. The function is supposed to return the new value of the data, which will then be replicated according to the given consistency level.

The `modify` function is called by the `Replicator` actor and must therefore be a pure function that only uses the data parameter and stable fields from the enclosing scope. It must, for example, not access the `sender()` reference of an enclosing actor. `Update` is intended to only be sent from an actor running in the same local `ActorSystem` as the `Replicator`, because the `modify` function is typically not serializable.

You supply a write consistency level, which has the following meaning:

- `WriteLocal`: the value will immediately only be written to the local replica, and later disseminated with gossip
- `WriteTo(n)`: the value will immediately be written to at least `n` replicas, including the local replica
- `WriteMajority`: the value will immediately be written to a majority of replicas, i.e. at least `N/2 + 1` replicas, where N is the number of nodes in the cluster (or cluster role group)
- `WriteAll`: the value will immediately be written to all nodes in the cluster (or all nodes in the cluster role group)

As reply to the `Update`, a `Replicator.UpdateSuccess` is sent to the sender of the `Update` if the value was successfully replicated according to the supplied consistency level within the supplied timeout. Otherwise a `Replicator.UpdateFailure` subclass is sent back. Note that a `Replicator.UpdateTimeout` reply does not mean that the update completely failed or was rolled back. It may still have been replicated to some nodes, and will eventually be replicated to all nodes with the gossip protocol.

You will always see your own writes. For example, if you send two `Update` messages changing the value of the same `key`, the `modify` function of the second message will see the change that was performed by the first `Update` message.

In the `Update` message you can pass an optional request context, which the `Replicator` does not care about but which is included in the reply messages. This is a convenient way to pass contextual information (e.g. the original sender) without having to use `ask` or local correlation data structures.
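The update flow can be sketched as a minimal actor using the classic Scala API (the key name is illustrative; the implicit `Cluster` required by `GCounter`'s `+` matches the 2.5-era API and differs in later versions):

```scala
import scala.concurrent.duration._
import akka.actor.Actor
import akka.cluster.Cluster
import akka.cluster.ddata.{ DistributedData, GCounter, GCounterKey }
import akka.cluster.ddata.Replicator._

class CounterUpdater extends Actor {
  val replicator = DistributedData(context.system).replicator
  implicit val cluster: Cluster = Cluster(context.system)
  val CounterKey = GCounterKey("my-counter") // illustrative key name

  def receive = {
    case "increment" =>
      // modify must be a pure function of the current value
      replicator ! Update(CounterKey, GCounter.empty, WriteMajority(3.seconds))(_ + 1)
    case _: UpdateSuccess[_] => // written per the requested consistency level
    case _: UpdateTimeout[_] => // not rolled back; gossip may still replicate it
  }
}
```

Note how the timed-out case is not treated as a failure to undo: per the semantics above, the write may still reach all nodes via gossip.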
Get

To retrieve the current value of a data entry you send a `Replicator.Get` message to the `Replicator`. You supply a consistency level, which has the following meaning:

- `ReadLocal`: the value will only be read from the local replica
- `ReadFrom(n)`: the value will be read and merged from `n` replicas, including the local replica
- `ReadMajority`: the value will be read and merged from a majority of replicas, i.e. at least `N/2 + 1` replicas, where N is the number of nodes in the cluster (or cluster role group)
- `ReadAll`: the value will be read and merged from all nodes in the cluster (or all nodes in the cluster role group)

As reply to the `Get`, a `Replicator.GetSuccess` is sent to the sender of the `Get` if the value was successfully retrieved according to the supplied consistency level within the supplied timeout. Otherwise a `Replicator.GetFailure` is sent. If the key does not exist the reply will be `Replicator.NotFound`.

You will always read your own writes. For example, if you send an `Update` message followed by a `Get` of the same `key`, the `Get` will retrieve the change that was performed by the preceding `Update` message. However, the order of the reply messages is not defined, i.e. in the previous example you may receive the `GetSuccess` before the `UpdateSuccess`.

In the `Get` message you can pass an optional request context in the same way as for the `Update` message, described above. For example, the original sender can be passed and replied to after receiving and transforming `GetSuccess`.
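A minimal read sketch, covering the three reply cases above (key name illustrative, same assumptions as the update sketch):

```scala
import scala.concurrent.duration._
import akka.actor.Actor
import akka.cluster.ddata.{ DistributedData, GCounterKey }
import akka.cluster.ddata.Replicator._

class CounterReader extends Actor {
  val replicator = DistributedData(context.system).replicator
  val CounterKey = GCounterKey("my-counter") // illustrative key name

  // Ask for the value, merged from a majority of replicas
  replicator ! Get(CounterKey, ReadMajority(3.seconds))

  def receive = {
    case g @ GetSuccess(CounterKey, _) =>
      val value = g.get(CounterKey).value // current merged counter value
      log(value)
    case NotFound(CounterKey, _) => // key has never been written
    case _: GetFailure[_]        => // consistency level not met within the timeout
  }

  def log(value: BigInt): Unit = println(s"counter = $value")
}
```

The second constructor-parameter position in `GetSuccess` and `NotFound` is the optional request context mentioned above.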
Subscribe

You may also register interest in change notifications by sending a `Replicator.Subscribe` message to the `Replicator`. It will send `Replicator.Changed` messages to the registered subscriber when the data for the subscribed key is updated. Subscribers will be notified periodically with the configured `notify-subscribers-interval`, and it is also possible to send an explicit `Replicator.FlushChanges` message to the `Replicator` to notify the subscribers immediately.

The subscriber is automatically removed if the subscriber is terminated. A subscriber can also be deregistered with the `Replicator.Unsubscribe` message.
Delete
A data entry can be deleted by sending a `Replicator.Delete` message to the local `Replicator`. As reply to the `Delete`, a `Replicator.DeleteSuccess` is sent to the sender of the `Delete` if the value was successfully deleted according to the supplied consistency level within the supplied timeout. Otherwise a `Replicator.ReplicationDeleteFailure` is sent. Note that `ReplicationDeleteFailure` does not mean that the delete completely failed or was rolled back. It may still have been replicated to some nodes, and may eventually be replicated to all nodes.

A deleted key cannot be reused, but it is still recommended to delete unused data entries because that reduces the replication overhead when new nodes join the cluster. Subsequent `Delete`, `Update` and `Get` requests will be replied to with `Replicator.DataDeleted`. Subscribers will receive `Replicator.Deleted`.

In the `Delete` message you can pass an optional request context in the same way as for the `Update` message, described above. For example, the original sender can be passed and replied to after receiving and transforming `DeleteSuccess`.
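The delete flow and its reply cases can be sketched as (key name illustrative, same assumptions as the earlier sketches):

```scala
import akka.actor.Actor
import akka.cluster.ddata.{ DistributedData, GCounterKey }
import akka.cluster.ddata.Replicator._

class CounterDeleter extends Actor {
  val replicator = DistributedData(context.system).replicator
  val CounterKey = GCounterKey("my-counter") // illustrative key name

  replicator ! Delete(CounterKey, WriteLocal)

  def receive = {
    case _: DeleteSuccess[_]            => // deleted per the consistency level
    case _: ReplicationDeleteFailure[_] => // timeout; may still replicate eventually
    case _: DataDeleted[_]              => // key was already deleted; it cannot be reused
  }
}
```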
CRDT Garbage

One thing that can be problematic with CRDTs is that some data types accumulate history (garbage). For example, a `GCounter` keeps track of one counter per node. If a `GCounter` has been updated from one node it will associate the identifier of that node forever. That can become a problem for long running systems with many cluster nodes being added and removed. To solve this problem the `Replicator` performs pruning of data associated with nodes that have been removed from the cluster. Data types that need pruning have to implement `RemovedNodePruning`. The pruning consists of several steps:

1. When a node has been removed from the cluster, pruning does not start before the `maxPruningDissemination` duration has elapsed, so that updates performed by the removed node have been disseminated to all other nodes. The time measurement is stopped when any replica is unreachable, but it is still recommended to configure this with a certain margin. It should be in the magnitude of minutes.
2. The leader initiates the pruning by adding a `PruningInitialized` marker in the data envelope. This is gossiped to all other nodes and they mark it as seen when they receive it.
3. When the leader sees that all other nodes have seen the `PruningInitialized` marker, the leader performs the pruning and changes the marker to `PruningPerformed` so that nobody else will redo the pruning. The data envelope with this pruning state is a CRDT itself. The pruning is typically performed by "moving" the part of the data associated with the removed node to the leader node. For example, a `GCounter` is a `Map` with the node as key and the counts done by that node as value. When pruning, the value of the removed node is moved to the entry owned by the leader node. See `RemovedNodePruning#prune`.
4. After another `maxPruningDissemination` duration after pruning the last entry from the removed node, the `PruningPerformed` markers in the data envelope are collapsed into a single tombstone entry, for efficiency. Clients may continue to use old data and therefore all data are always cleared from parts associated with tombstoned nodes.
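The "moving" performed in the pruning step can be sketched with a plain Scala `Map` standing in for a `GCounter`'s per-node counts; this is an illustration of the idea, not the actual `RemovedNodePruning#prune` implementation:

```scala
object PruneSketch {
  type NodeId = String

  // Simplified GCounter state: the count contributed by each node.
  // Pruning "moves" the removed node's contribution to the leader's
  // entry, so the total counter value is preserved.
  def prune(counts: Map[NodeId, Long], removed: NodeId, leader: NodeId): Map[NodeId, Long] =
    counts.get(removed) match {
      case Some(c) =>
        (counts - removed).updated(leader, counts.getOrElse(leader, 0L) + c)
      case None =>
        counts
    }

  def main(args: Array[String]): Unit = {
    val before = Map("nodeA" -> 2L, "nodeB" -> 3L)
    val after  = prune(before, removed = "nodeB", leader = "nodeA")
    println(after) // prints "Map(nodeA -> 5)"
  }
}
```

The sum of the counts is unchanged by pruning, which is why the observable counter value stays the same while the removed node's identifier disappears from the state.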