- All Implemented Interfaces:
Diffable<ClusterState>, Writeable, ToXContent, ToXContentFragment
Conceptually immutable, but in practice it has a few components like RoutingNodes which are pure functions of the immutable state but are expensive to compute, so they are built on demand if needed.
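The on-demand construction described above can be illustrated with a toy model. This is only a sketch and not the ClusterState implementation: the class LazyDerivedViewSketch, its string-based maps, and the computations counter are all hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model: an immutable input plus a derived view that is computed once, on first use. */
final class LazyDerivedViewSketch {
    // Immutable input: shard name -> node name (a stand-in for a routing table).
    private final Map<String, String> shardToNode;
    // Derived view: node name -> shard names. A pure function of shardToNode, built lazily.
    private volatile Map<String, List<String>> nodeToShards;
    // Hypothetical counter, present only so the caching is observable.
    int computations = 0;

    LazyDerivedViewSketch(Map<String, String> shardToNode) {
        this.shardToNode = Collections.unmodifiableMap(new TreeMap<>(shardToNode));
    }

    /** Returns the derived view, building it on the first call and reusing it afterwards. */
    Map<String, List<String>> nodeToShards() {
        Map<String, List<String>> view = nodeToShards;
        if (view == null) {
            synchronized (this) {
                view = nodeToShards;
                if (view == null) {
                    view = compute();
                    nodeToShards = view;
                }
            }
        }
        return view;
    }

    // The expensive part: invert the shard -> node mapping.
    private Map<String, List<String>> compute() {
        computations++;
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : shardToNode.entrySet()) {
            result.computeIfAbsent(entry.getValue(), k -> new ArrayList<>()).add(entry.getKey());
        }
        return Collections.unmodifiableMap(result);
    }
}
```

Because the derived view depends only on immutable input, caching it does not change the observable value of the state, which is why such a class can still be treated as conceptually immutable.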
The Metadata portion is written to disk on each update so it persists across full-cluster restarts. The rest of this data is maintained only in-memory and resets back to its initial state on a full-cluster restart, but it is held on all nodes so it persists across master elections (and therefore is preserved in a rolling restart).
Updates are triggered by submitting tasks to the MasterService on the elected master, typically using a TransportMasterNodeAction to route a request to the master on which the task is submitted with ClusterService.submitStateUpdateTask(java.lang.String, T, org.elasticsearch.cluster.ClusterStateTaskConfig, org.elasticsearch.cluster.ClusterStateTaskExecutor<T>). Submitted tasks have an associated ClusterStateTaskConfig which defines a priority and a timeout. Tasks are processed in priority order, so a flood of higher-priority tasks can starve lower-priority ones from running. Therefore, avoid priorities other than Priority.NORMAL where possible. Tasks associated with client actions should typically have a timeout, or otherwise be sensitive to client cancellations, to avoid surprises caused by the execution of stale tasks long after they are submitted (since clients themselves tend to time out). In contrast, internal tasks can reasonably have an infinite timeout, especially if a timeout would simply trigger a retry.
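The starvation risk mentioned above follows directly from priority ordering, and a small standalone model may make it concrete. The sketch below does not use the MasterService or any Elasticsearch class: PriorityTaskQueueSketch, its Level enum, and the submit and drain methods are hypothetical names, and the tie-break by submission order is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

/** Toy model of a task queue that always runs the highest-priority pending task first. */
final class PriorityTaskQueueSketch {
    // Hypothetical priority levels; an earlier constant runs first.
    enum Level { URGENT, HIGH, NORMAL, LOW }

    static final class Task implements Comparable<Task> {
        final String source;
        final Level level;
        final long sequence; // submission order, used to break ties within a level

        Task(String source, Level level, long sequence) {
            this.source = source;
            this.level = level;
            this.sequence = sequence;
        }

        @Override
        public int compareTo(Task other) {
            int byLevel = level.compareTo(other.level);
            return byLevel != 0 ? byLevel : Long.compare(sequence, other.sequence);
        }
    }

    private final PriorityQueue<Task> queue = new PriorityQueue<>();
    private long nextSequence = 0;

    void submit(String source, Level level) {
        queue.add(new Task(source, level, nextSequence++));
    }

    /** Runs at most maxTasks tasks and returns their sources in execution order. */
    List<String> drain(int maxTasks) {
        List<String> executed = new ArrayList<>();
        while (executed.size() < maxTasks && !queue.isEmpty()) {
            executed.add(queue.poll().source);
        }
        return executed;
    }

    int pending() {
        return queue.size();
    }
}
```

In this model a LOW task submitted first still runs last, and it never runs at all if higher-priority tasks keep arriving faster than they are drained. That is the behaviour the advice to prefer a single normal priority is meant to avoid.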
Tasks that share the same ClusterStateTaskExecutor instance are processed as a batch. Each batch of tasks yields a new ClusterState which is published to the cluster by ClusterStatePublisher.publish(org.elasticsearch.cluster.ClusterStatePublicationEvent, org.elasticsearch.action.ActionListener<java.lang.Void>, org.elasticsearch.cluster.coordination.ClusterStatePublisher.AckListener). Publication usually works by sending a diff, computed via the Diffable interface, rather than the full state, although it will fall back to sending the full state if the receiving node is new or it has missed out on an intermediate state for some reason. States and diffs are published using the transport protocol, i.e. the Writeable interface and friends.
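The diff-or-full-state choice described above can be modelled in a few lines. This is a simplified sketch rather than the actual publication code: DiffPublicationSketch, its State and Diff classes, and the deliver method are hypothetical, the state is reduced to a string map, and the sender is assumed to know the receiver's current state, which the real protocol does not require.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of publishing a diff when possible and the full state otherwise. */
final class DiffPublicationSketch {
    static final class State {
        final long version;
        final String uuid;
        final Map<String, String> data; // values are assumed to be non-null

        State(long version, String uuid, Map<String, String> data) {
            this.version = version;
            this.uuid = uuid;
            this.data = new HashMap<>(data);
        }
    }

    static final class Diff {
        final String fromUuid; // the state this diff must be applied to
        final long toVersion;
        final String toUuid;
        final Map<String, String> upserts;
        final Set<String> deletes;

        Diff(String fromUuid, long toVersion, String toUuid, Map<String, String> upserts, Set<String> deletes) {
            this.fromUuid = fromUuid;
            this.toVersion = toVersion;
            this.toUuid = toUuid;
            this.upserts = upserts;
            this.deletes = deletes;
        }
    }

    /** Computes the changes needed to turn previous into next. */
    static Diff diff(State previous, State next) {
        Map<String, String> upserts = new HashMap<>();
        for (Map.Entry<String, String> entry : next.data.entrySet()) {
            if (!entry.getValue().equals(previous.data.get(entry.getKey()))) {
                upserts.put(entry.getKey(), entry.getValue());
            }
        }
        Set<String> deletes = new HashSet<>(previous.data.keySet());
        deletes.removeAll(next.data.keySet());
        return new Diff(previous.uuid, next.version, next.uuid, upserts, deletes);
    }

    /** Applies a diff, refusing if the local state is not the one the diff was computed from. */
    static State apply(Diff diff, State local) {
        if (!diff.fromUuid.equals(local.uuid)) {
            throw new IllegalStateException("diff does not apply to state " + local.uuid);
        }
        Map<String, String> data = new HashMap<>(local.data);
        data.keySet().removeAll(diff.deletes);
        data.putAll(diff.upserts);
        return new State(diff.toVersion, diff.toUuid, data);
    }

    /** Returns what the receiver ends up with: a patched state, or the full state as a fallback. */
    static State deliver(State previous, State next, State receiverState) {
        boolean canUseDiff = receiverState != null && receiverState.uuid.equals(previous.uuid);
        return canUseDiff ? apply(diff(previous, next), receiverState) : next;
    }
}
```

A receiver with no state (a new node) or with a state other than previous (one that missed an intermediate update) gets the full next state. The identifier check in apply mirrors the role the page gives to stateUUID, namely making sure a diff is applied to the right previous state.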
When committed, the new state is applied which exposes it to the node via ClusterStateApplier and ClusterStateListener callbacks registered with the ClusterApplierService. The new state is also made available via ClusterService.state(). The appliers are notified (in no particular order) before ClusterService.state() is updated, and the listeners are notified (in no particular order) afterwards. Cluster state updates run in sequence, one-by-one, so they can be a performance bottleneck. See the JavaDocs on the linked classes and methods for more details.
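The ordering of appliers, the visible state, and listeners can be shown with a small model. The sketch below is not the ClusterApplierService: ApplierServiceSketch and its methods are hypothetical, states are plain strings, and callbacks are plain Consumer instances.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Toy model: appliers run before the new state becomes visible, listeners run after. */
final class ApplierServiceSketch {
    private final List<Consumer<String>> appliers = new ArrayList<>();
    private final List<Consumer<String>> listeners = new ArrayList<>();
    private volatile String state = "state-0";

    void addApplier(Consumer<String> applier) {
        appliers.add(applier);
    }

    void addListener(Consumer<String> listener) {
        listeners.add(listener);
    }

    /** The most recently applied state. */
    String state() {
        return state;
    }

    /** Applies one committed state; in this model updates are applied one at a time. */
    synchronized void applyCommitted(String newState) {
        // Callers should not rely on the order within each group.
        for (Consumer<String> applier : appliers) {
            applier.accept(newState); // state() still returns the previous state here
        }
        state = newState;
        for (Consumer<String> listener : listeners) {
            listener.accept(newState); // state() now returns newState
        }
    }
}
```

An applier that calls state() during its callback sees the old state, so in this model it has to work from the state passed to it, whereas a listener sees the new state either way.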
Cluster state updates can be used to trigger various actions via a ClusterStateListener rather than using a timer.
Implements ToXContentFragment to be exposed in REST APIs (e.g. GET _cluster/state and POST _cluster/reroute) and to be indexed by monitoring, mostly just for diagnostic purposes. The XContent representation does not need to be 100% faithful since we never reconstruct a cluster state from its XContent representation, but the more faithful it is the more useful it is for diagnostics. Note that the XContent representation of the Metadata portion does have to be faithful (in Metadata.XContentContext.GATEWAY context) since this is how it persists across full cluster restarts.
Security-sensitive data such as passwords or private keys should not be stored in the cluster state, since the contents of the cluster state are exposed in various APIs.
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class
static interface
static enum
Nested classes/interfaces inherited from interface org.elasticsearch.xcontent.ToXContent
ToXContent.DelegatingMapParams, ToXContent.MapParams, ToXContent.Params
Nested classes/interfaces inherited from interface org.elasticsearch.common.io.stream.Writeable
Writeable.Reader<V>, Writeable.Writer<V>
-
Field Summary
Fields
Fields inherited from interface org.elasticsearch.xcontent.ToXContent
EMPTY_PARAMS
-
Constructor Summary
Constructors
Constructor, Description
ClusterState(long version, String stateUUID, ClusterState state)
ClusterState(ClusterName clusterName, long version, String stateUUID, Metadata metadata, RoutingTable routingTable, DiscoveryNodes nodes, ClusterBlocks blocks, ImmutableOpenMap<String, ClusterState.Custom> customs, boolean wasReadFromDiff, RoutingNodes routingNodes)
-
Method Summary
Modifier and Type, Method, Description
blocks()
static ClusterState.Builder builder(ClusterName clusterName)
static ClusterState.Builder builder(ClusterState state)
copyAndUpdate(Consumer<ClusterState.Builder> updater)
copyAndUpdateMetadata(Consumer<Metadata.Builder> updater)
<T extends ClusterState.Custom> T custom
<T extends ClusterState.Custom> T custom
customs()
diff(ClusterState previousState): Returns serializable object representing differences between this and previousState
getNodes()
getRoutingNodes(): Returns a built (on demand) routing nodes view of the routing table.
long getVersion()
metadata()
mutableRoutingNodes(): Returns a fresh mutable copy of the routing nodes view.
nodes()
nodesIfRecovered(): Returns the set of nodes that should be exposed to things like REST handlers that behave differently depending on the nodes in the cluster and their versions.
static Diff<ClusterState> readDiffFrom(StreamInput in, DiscoveryNode localNode)
static ClusterState readFrom(StreamInput in, DiscoveryNode localNode)
stateUUID(): This stateUUID is automatically generated for each version of cluster state.
boolean supersedes(ClusterState other): A cluster state supersedes another state if they are from the same master and the version of this state is higher than that of the other state.
long term()
toString()
toXContent(XContentBuilder builder, ToXContent.Params params)
long version()
void writeTo(StreamOutput out): Write this into the StreamOutput.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface org.elasticsearch.xcontent.ToXContentFragment
isFragment
-
Field Details
-
EMPTY_STATE
-
UNKNOWN_UUID
-
UNKNOWN_VERSION
public static final long UNKNOWN_VERSION
-
-
Constructor Details
-
ClusterState
-
ClusterState
public ClusterState(ClusterName clusterName, long version, String stateUUID, Metadata metadata, RoutingTable routingTable, DiscoveryNodes nodes, ClusterBlocks blocks, ImmutableOpenMap<String, ClusterState.Custom> customs, boolean wasReadFromDiff, @Nullable RoutingNodes routingNodes)
-
-
Method Details
-
term
public long term()
-
version
public long version()
-
getVersion
public long getVersion()
-
stateUUID
This stateUUID is automatically generated for each version of cluster state. It is used to make sure that we are applying diffs to the right previous state.
-
nodes
-
getNodes
-
nodesIfRecovered
Returns the set of nodes that should be exposed to things like REST handlers that behave differently depending on the nodes in the cluster and their versions. Specifically, if the cluster has properly formed then this is the nodes in the last-applied cluster state, but if the cluster has not properly formed then no nodes are returned.
- Returns:
the nodes in the cluster if the cluster has properly formed, otherwise an empty set of nodes.
-
metadata
-
getMetadata
-
coordinationMetadata
-
routingTable
-
getRoutingTable
-
blocks
-
getBlocks
-
customs
-
getCustoms
-
custom
-
custom
-
getClusterName
-
getLastAcceptedConfiguration
-
getLastCommittedConfiguration
-
getVotingConfigExclusions
-
getRoutingNodes
Returns a built (on demand) routing nodes view of the routing table.
-
mutableRoutingNodes
Returns a fresh mutable copy of the routing nodes view.
-
toString
-
supersedes
A cluster state supersedes another state if they are from the same master and the version of this state is higher than that of the other state. In essence, that means that all the changes from the other cluster state are also reflected by the current one.
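The rule above can be written down as a short predicate. This is a sketch of the stated rule only, not the method's actual source: SupersedesSketch and its masterNodeId field are hypothetical, and treating a state with no known master as never superseding anything is an assumption made for the example.

```java
/** Toy model of the supersedes rule: same master and a strictly higher version. */
final class SupersedesSketch {
    final String masterNodeId; // hypothetical identifier; null when no master is known
    final long version;

    SupersedesSketch(String masterNodeId, long version) {
        this.masterNodeId = masterNodeId;
        this.version = version;
    }

    boolean supersedes(SupersedesSketch other) {
        // Assumption: without a known master the comparison is not meaningful, so return false.
        return masterNodeId != null
            && masterNodeId.equals(other.masterNodeId)
            && version > other.version;
    }
}
```

Versions from different masters are not compared at all in this model, which matches the wording "from the same master" in the description.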
-
toXContent
public XContentBuilder toXContent(XContentBuilder builder, ToXContent.Params params) throws IOException
- Specified by:
toXContent in interface ToXContent
- Throws:
IOException
-
builder
-
builder
-
copyAndUpdate
-
copyAndUpdateMetadata
-
diff
Description copied from interface: Diffable
Returns serializable object representing differences between this and previousState
- Specified by:
diff in interface Diffable<ClusterState>
-
readDiffFrom
public static Diff<ClusterState> readDiffFrom(StreamInput in, DiscoveryNode localNode) throws IOException
- Throws:
IOException
-
readFrom
- Throws:
IOException
-
writeTo
Description copied from interface: Writeable
Write this into the StreamOutput.
- Specified by:
writeTo in interface Writeable
- Throws:
IOException
-