Class BlobStoreRepository

All Implemented Interfaces:
Closeable, AutoCloseable, LifecycleComponent, Releasable, Repository
Direct Known Subclasses:
FsRepository, MeteredBlobStoreRepository

public abstract class BlobStoreRepository extends AbstractLifecycleComponent implements Repository
BlobStore-based implementation of a snapshot Repository.

This repository works with any BlobStore implementation. The blobStore can be, and preferably is, lazily initialized in createBlobStore().

For in-depth documentation on how exactly implementations of this class interact with the snapshot functionality, please refer to the documentation of the package org.elasticsearch.repositories.blobstore.
    • metadata

      protected volatile RepositoryMetadata metadata
    • threadPool

      protected final ThreadPool threadPool

      public static final String STATELESS_SHARD_READ_THREAD_NAME
      public static final String STATELESS_TRANSLOG_THREAD_NAME
      public static final String STATELESS_SHARD_WRITE_THREAD_NAME
      public static final String SNAPSHOT_PREFIX
      public static final String INDEX_FILE_PREFIX
      public static final String INDEX_LATEST_BLOB
      public static final String METADATA_PREFIX
      public static final String METADATA_NAME_FORMAT
      public static final String SNAPSHOT_NAME_FORMAT
      public static final String SNAPSHOT_INDEX_NAME_FORMAT
      public static final String UPLOADED_DATA_BLOB_PREFIX
      public static final String URL_REPOSITORY_TYPE
      public static final String READONLY_SETTING_KEY
      All BlobStoreRepository implementations can be made read-only by setting this key to true in their settings.
      public static final Setting<Boolean> COMPRESS_SETTING
      When set to true, metadata files are stored in compressed format. This setting doesn’t affect index files that are already compressed by default. Changing the setting does not invalidate existing files because reads do not observe the setting; instead, they examine the file to see whether it is compressed.
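
The idea that reads inspect the file rather than the setting can be illustrated with a small sketch. It uses GZIP and its two-byte magic number purely as a stand-in; the header that the repository actually checks is not described here, and `SelfDescribingBlob` and its methods are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of BlobStoreRepository.
final class SelfDescribingBlob {

    // Writer side: honours the flag (a stand-in for COMPRESS_SETTING).
    static byte[] write(String content, boolean compress) {
        byte[] raw = content.getBytes(StandardCharsets.UTF_8);
        if (!compress) {
            return raw;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reader side: ignores the flag and peeks at the first two bytes instead.
    static String read(byte[] blob) {
        try {
            PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(blob), 2);
            byte[] header = new byte[2];
            int n = in.readNBytes(header, 0, 2);
            if (n > 0) {
                in.unread(header, 0, n); // put the peeked bytes back
            }
            // 0x1F 0x8B is the GZIP magic number (assumed format for this sketch only).
            boolean gzipped = n == 2 && (header[0] & 0xFF) == 0x1F && (header[1] & 0xFF) == 0x8B;
            try (InputStream body = gzipped ? new GZIPInputStream(in) : in) {
                return new String(body.readAllBytes(), StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the reader decides per file, blobs written before and after a change of the flag can coexist and remain readable.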

      public static final Setting<Boolean> CACHE_REPOSITORY_DATA
      Setting to disable caching of the latest repository data.

      public static final Setting<ByteSizeValue> BUFFER_SIZE_SETTING
      Size hint for the IO buffer size to use when reading from and writing to the repository.

      public static final Setting<Boolean> SUPPORT_URL_REPO
      Setting to disable writing the index.latest blob. Writing this blob is what enables the contents of this repository to be used with a url-repository.

      public static final Setting<Integer> MAX_SNAPSHOTS_SETTING
      Setting that defines the maximum number of snapshots to which the repository may grow. Trying to create a snapshot into the repository that would move it above this size will throw an exception.

      public static final Setting<Boolean> USE_FOR_PEER_RECOVERY_SETTING
      Setting that defines if the repository should be used to recover index files during peer recoveries.
      protected final boolean supportURLRepo

      public static final ChecksumBlobStoreFormat<Metadata> GLOBAL_METADATA_FORMAT

      public static final ChecksumBlobStoreFormat<IndexMetadata> INDEX_METADATA_FORMAT

      public static final ChecksumBlobStoreFormat<SnapshotInfo> SNAPSHOT_FORMAT

      public static final ChecksumBlobStoreFormat<BlobStoreIndexShardSnapshot> INDEX_SHARD_SNAPSHOT_FORMAT

      public static final ChecksumBlobStoreFormat<BlobStoreIndexShardSnapshots> INDEX_SHARD_SNAPSHOTS_FORMAT

      public static final Setting<ByteSizeValue> MAX_SNAPSHOT_BYTES_PER_SEC

      public static final Setting<ByteSizeValue> MAX_RESTORE_BYTES_PER_SEC
      protected final BigArrays bigArrays
      protected final int bufferSize
      IO buffer size hint for reading and writing to the underlying blob store.
      protected void doStart()
      Description copied from class: AbstractLifecycleComponent
      Start this component. Typically that means doing things like launching background processes and registering listeners on other components. Other components have been initialized by this point, but may not yet be started.

      If this method throws an exception then the startup process will fail, but this component will not be stopped before it is closed.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is only called once in the lifetime of a component, although it may not be called at all if the startup process encountered some kind of fatal error, such as the failure of some other component to initialize or start.

      protected void doStop()
      Description copied from class: AbstractLifecycleComponent
      Stop this component. Typically that means doing the reverse of whatever AbstractLifecycleComponent.doStart() does.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is only called once in the lifetime of a component, after calling AbstractLifecycleComponent.doStart(), although it will not be called at all if this component did not successfully start.

      protected void doClose()
      Description copied from class: AbstractLifecycleComponent
      Close this component. Typically that means doing the reverse of whatever happened during initialization, such as releasing resources acquired there.

      This method is called while synchronized on AbstractLifecycleComponent.lifecycle. It is called once in the lifetime of a component. If the component was started then it will be stopped before it is closed, and once it is closed it will not be started or stopped.

      public void awaitIdle()
      Description copied from interface: Repository
      Block until all in-flight operations for this repository have completed. Must only be called after this instance has been closed by a call to Releasable.close(). Waiting for ongoing operations should be implemented here instead of in the LifecycleComponent.stop() or Releasable.close() hooks of this interface, as these are expected to be called on the cluster state applier thread (which must not block) if a repository is removed from the cluster. This method is intended to be called on node shutdown instead, as a means to ensure no repository operations are leaked.
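
A minimal sketch of the "block until idle" idea, assuming a simple counter of in-flight operations guarded by a monitor. `ActivityTracker` is a hypothetical name, and the real class may track its activity quite differently.

```java
// Hypothetical stand-in for illustration; not the actual tracking mechanism.
final class ActivityTracker {
    private int inFlight;

    // Call when a repository operation starts.
    synchronized void begin() {
        inFlight++;
    }

    // Call when a repository operation finishes, successfully or not.
    synchronized void end() {
        if (--inFlight == 0) {
            notifyAll(); // wake up anyone blocked in awaitIdle()
        }
    }

    synchronized int inFlight() {
        return inFlight;
    }

    // Blocks the calling thread until no operation is in flight.
    synchronized void awaitIdle() {
        try {
            while (inFlight > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for idle", e);
        }
    }
}
```

Keeping the blocking wait in a separate method, rather than in the stop or close hooks, is what lets those hooks stay non-blocking on the applier thread.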
      public void cloneShardSnapshot(SnapshotId source, SnapshotId target, RepositoryShardId shardId, @Nullable ShardGeneration shardGeneration, ActionListener<ShardSnapshotResult> listener)
      Clones a shard snapshot.
      cloneShardSnapshot in interface Repository
      source - source snapshot
      target - target snapshot
      shardId - shard id
      shardGeneration - shard generation in repo
      listener - listener to complete with new shard generation once clone has completed
      public boolean canUpdateInPlace(Settings updatedSettings, Set<String> ignoredSettings)
      Description copied from interface: Repository
      Check if this instance's Settings can be changed to the provided updated settings without recreating the repository.
      updatedSettings - new repository settings
      ignoredSettings - setting names to ignore even if changed
      true if the repository can be updated in place
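
One way to picture such a check is sketched below, assuming settings are plain string maps and that some known set of setting names is safe to change at runtime. The class name, the `dynamicKeys` parameter, and the setting names used in the usage note are assumptions for illustration; the criteria the real method applies are not spelled out here.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

// Hypothetical sketch; not the actual implementation.
final class InPlaceUpdateCheck {

    // Names of settings whose values differ between the two maps, minus the ignored ones.
    static Set<String> changedKeys(Map<String, String> current, Map<String, String> updated, Set<String> ignored) {
        Set<String> changed = new HashSet<>(current.keySet());
        changed.addAll(updated.keySet());
        changed.removeIf(key -> Objects.equals(current.get(key), updated.get(key)));
        changed.removeAll(ignored);
        return changed;
    }

    // True if every remaining change touches only a setting assumed to be updatable at runtime.
    static boolean canUpdateInPlace(
        Map<String, String> current,
        Map<String, String> updated,
        Set<String> ignored,
        Set<String> dynamicKeys
    ) {
        return dynamicKeys.containsAll(changedKeys(current, updated, ignored));
    }
}
```

For example, with `dynamicKeys = Set.of("max_snapshot_bytes_per_sec")`, changing only that value yields true, while changing a hypothetical `location` setting yields false unless `location` is listed in `ignored`.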
      public void updateState(ClusterState state)
      Description copied from interface: Repository
      Update the repository with the incoming cluster state. This method is invoked from RepositoriesService.applyClusterState(org.elasticsearch.cluster.ClusterChangedEvent) and thus the same semantics as with ClusterStateApplier.applyClusterState(org.elasticsearch.cluster.ClusterChangedEvent) apply for the ClusterState that is passed here.
      state - new cluster state
      public ThreadPool threadPool()
      protected BlobStore getBlobStore()
      protected BlobContainer blobContainer()
      Maintains a single lazy instance of BlobContainer.
      public BlobStore blobStore()
      Maintains a single lazy instance of BlobStore. Public for testing.
      protected abstract BlobStore createBlobStore() throws Exception
      Creates a new BlobStore to read and write data.
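
The "single lazy instance" pattern mentioned for blobStore() and createBlobStore() can be sketched with double-checked locking. `LazySingle` is a hypothetical generic holder; the real class may use a different mechanism and additional lifecycle checks.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical holder for illustration: creates its value at most once, on first use.
final class LazySingle<T> {
    private final Object lock = new Object();
    private final Supplier<T> factory; // plays the role of createBlobStore()
    private volatile T instance;

    LazySingle(Supplier<T> factory) {
        this.factory = Objects.requireNonNull(factory);
    }

    T get() {
        T local = instance;            // fast path: a single volatile read
        if (local == null) {
            synchronized (lock) {
                local = instance;      // re-check under the lock
                if (local == null) {
                    local = Objects.requireNonNull(factory.get(), "factory returned null");
                    instance = local;
                }
            }
        }
        return local;
    }
}
```

Deferring creation this way means that constructing the repository object never touches the underlying storage; the first caller of `get()` pays that cost.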
      public BlobPath basePath()
      Returns the base path of the repository. Public for testing.
      protected final boolean isCompress()
      Returns true if metadata and snapshot files should be compressed
      true if compression is needed
    • chunkSize

      protected ByteSizeValue chunkSize()
      Returns data file chunk size.

      This method should return null if no chunking is needed.

      chunk size
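
A sketch of how a nullable chunk size might translate into a number of parts per data file. It assumes sizes are plain byte counts and that an empty file still counts as one part; `Chunking` and its methods are hypothetical helpers, not part of this class.

```java
// Hypothetical helper for illustration only.
final class Chunking {

    // Number of parts a file of the given length is split into; a null chunk size means no chunking.
    static long partCount(long fileLength, Long chunkSizeBytes) {
        if (chunkSizeBytes == null || fileLength == 0) {
            return 1; // assumption: an unchunked or empty file is a single part
        }
        if (chunkSizeBytes <= 0) {
            throw new IllegalArgumentException("chunk size must be positive");
        }
        long full = fileLength / chunkSizeBytes;
        return fileLength % chunkSizeBytes == 0 ? full : full + 1; // ceiling division without overflow
    }

    // Length in bytes of the part with the given zero-based index.
    static long partLength(long fileLength, Long chunkSizeBytes, long part) {
        if (part < 0 || part >= partCount(fileLength, chunkSizeBytes)) {
            throw new IndexOutOfBoundsException("no such part: " + part);
        }
        long chunk = chunkSizeBytes == null ? fileLength : chunkSizeBytes;
        return Math.min(chunk, fileLength - part * chunk);
    }
}
```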
      public RepositoryMetadata getMetadata()
      Returns metadata about this repository.
      public RepositoryStats stats()
      Returns stats on the repository usage
      protected SnapshotDeleteListener wrapWithWeakConsistencyProtection(SnapshotDeleteListener snapshotDeleteListener)
      Some repositories (e.g. S3) run at extra risk of corruption when using the pre-7.6.0 repository format, against which we try to protect by adding some delays between operations so that things have a chance to settle down. This method is the hook that allows the delete process to add this protection when necessary.
      public void deleteSnapshots(Collection<SnapshotId> snapshotIds, long repositoryDataGeneration, IndexVersion minimumNodeVersion, SnapshotDeleteListener listener)
      Deletes snapshots
      Specified by:
      deleteSnapshots in interface Repository
      snapshotIds - snapshot ids to delete
      repositoryDataGeneration - the generation of the RepositoryData in the repository at the start of the deletion
      minimumNodeVersion - the minimum IndexVersion across the nodes in the cluster, with which the repository format must remain compatible
      listener - completion listener, see SnapshotDeleteListener.
      public void cleanup(long repositoryDataGeneration, IndexVersion repositoryFormatIndexVersion, ActionListener<DeleteResult> listener)
      Runs cleanup actions on the repository. Increments the repository state id by one before executing any modifications on the repository. The cleanup actions are:
      • Deleting stale indices
      • Deleting unreferenced root level blobs
      TODO: Add shard level cleanups
      TODO: Add unreferenced index metadata cleanup
      repositoryDataGeneration - Generation of RepositoryData at start of process
      repositoryFormatIndexVersion - Repository format version
      listener - Listener to complete when done
      public void finalizeSnapshot(FinalizeSnapshotContext finalizeSnapshotContext)
      Description copied from interface: Repository
      Finalizes snapshotting process

      This method is called on master after all shards are snapshotted.

      finalizeSnapshotContext - finalization context
      public void getSnapshotInfo(GetSnapshotInfoContext context)
      Reads snapshot descriptions from the repository.
      context - get-snapshot-info-context
      public Metadata getSnapshotGlobalMetadata(SnapshotId snapshotId)
      Returns global metadata associated with the snapshot.
      snapshotId - the snapshot id to load the global metadata from
      the global metadata about the snapshot
      public IndexMetadata getSnapshotIndexMetaData(RepositoryData repositoryData, SnapshotId snapshotId, IndexId index) throws IOException
      Returns the index metadata associated with the snapshot.
      repositoryData - current RepositoryData
      snapshotId - the snapshot id to load the index metadata from
      index - the IndexId to load the metadata from
      the index metadata about the given index for the given snapshot
      public BlobContainer shardContainer(IndexId indexId, int shardId)
      public long getSnapshotThrottleTimeInNanos()
      Returns snapshot throttle time in nanoseconds
      public long getRestoreThrottleTimeInNanos()
      Returns restore throttle time in nanoseconds
      protected void assertSnapshotOrGenericThread()
      public String startVerification()
      Description copied from interface: Repository
      Verifies repository on the master node and returns the verification token.

      If the verification token is not null, it is passed to all data nodes for verification. If it is null, no additional verification is required.

      verification token that should be passed to all Index Shard Repositories for additional verification or null
      public void endVerification(String seed)
      Description copied from interface: Repository
      Called at the end of the repository verification process.

      This method should perform all necessary cleanup of the temporary files created in the repository.

      seed - verification request generated by Repository.startVerification() command
      public void getRepositoryData(Executor responseExecutor, ActionListener<RepositoryData> listener)
      Description copied from interface: Repository
      Returns a RepositoryData to describe the data in the repository, including the snapshots and the indices across all snapshots found in the repository. Completes the listener with a RepositoryException if there was an error in reading the data.
      responseExecutor - Executor to use to complete the listener if not using the calling thread. Using EsExecutors.DIRECT_EXECUTOR_SERVICE means to complete the listener on the thread which ultimately resolved the RepositoryData, which might be a low-latency transport or cluster applier thread so make sure not to do anything slow or expensive in that case.
      listener - Listener which is either completed on the calling thread (if the RepositoryData is immediately available, e.g. from an in-memory cache), otherwise it is completed using responseExecutor.
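
The threading contract described for responseExecutor and listener can be sketched as follows: a cached value completes the listener directly on the calling thread, while a freshly loaded value completes it through the supplied executor. `CachedLoader` is a hypothetical class, and loading inline before handing off is a simplification.

```java
import java.util.concurrent.Executor;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical sketch of the completion contract; not the actual implementation.
final class CachedLoader<T> {
    private final Supplier<T> slowLoad; // stands in for reading from the blob store
    private volatile T cached;

    CachedLoader(Supplier<T> slowLoad) {
        this.slowLoad = slowLoad;
    }

    void get(Executor responseExecutor, Consumer<T> listener) {
        T local = cached;
        if (local != null) {
            listener.accept(local); // immediately available: complete on the calling thread
            return;
        }
        // Simplification: load inline, then hand completion to the response executor.
        T loaded = slowLoad.get();
        cached = loaded;
        responseExecutor.execute(() -> listener.accept(loaded));
    }
}
```

A caller that passes a direct executor accepts that the listener may run on whichever thread resolved the value, which is why the description above warns against slow work in that case.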
      public boolean isReadOnly()
      Returns true if the repository supports only read operations
      true if the repository is read-only
      protected void writeIndexGen(RepositoryData repositoryData, long expectedGen, IndexVersion version, Function<ClusterState,ClusterState> stateFilter, ActionListener<RepositoryData> listener)
      Writing a new index generation (root) blob is a three-step process. Typically, it starts from a stable state where the pending generation RepositoryMetadata.pendingGeneration() is equal to the safe generation RepositoryMetadata.generation(), but after a failure it may be that the pending generation starts out greater than the safe generation.
      1. We reserve ourselves a new root blob generation G, greater than RepositoryMetadata.pendingGeneration(), via a cluster state update which edits the RepositoryMetadata entry for this repository, increasing its pending generation to G without changing its safe generation.
      2. We write the updated RepositoryData to a new root blob with generation G.
      3. We mark the successful end of the update of the repository data with a cluster state update which edits the RepositoryMetadata entry for this repository again, increasing its safe generation to equal its pending generation G.
      We use this process to protect against problems such as a master failover part-way through. If a new master is elected while we're writing the root blob with generation G then we will fail to update the safe repository generation in the final step, and meanwhile the new master will choose a generation greater than G for all subsequent root blobs so there is no risk that we will clobber its writes. See the package level documentation for org.elasticsearch.repositories.blobstore for more details.

      Note that a failure here does not imply that the process was unsuccessful or the repository is unchanged. Once we have written the new root blob the repository is updated from the point of view of any other clusters reading from it, and if we performed a full cluster restart at that point then we would also pick up the new root blob. Writing the root blob may succeed without us receiving a successful response from the repository, leading us to report that the write operation failed. Updating the safe generation may likewise succeed on a majority of master-eligible nodes which does not include this one, again leading to an apparent failure.

      We therefore cannot safely clean up apparently-dangling blobs after a failure here. Instead, we defer any cleanup until after the next successful root-blob write, which may happen on a different master node or possibly even in a different cluster.

      repositoryData - RepositoryData to write
      expectedGen - expected repository generation at the start of the operation
      version - version of the repository metadata to write
      stateFilter - filter for the last cluster state update executed by this method
      listener - completion listener
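
The three-step process described above can be modelled in a few lines. This is a single-threaded, in-memory sketch: two long fields stand in for the safe and pending generations kept in the cluster state, and a map stands in for the root blobs. `RootBlobGenerations` and the `failBeforeFinalStep` flag are hypothetical and exist only to simulate a failure after step 2.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory model; not the actual implementation.
final class RootBlobGenerations {
    long safeGen;     // stand-in for RepositoryMetadata.generation()
    long pendingGen;  // stand-in for RepositoryMetadata.pendingGeneration()
    final Map<Long, String> rootBlobs = new TreeMap<>();

    // Returns the generation written, or throws IllegalStateException to simulate a failure.
    long write(String repositoryData, long expectedGen, boolean failBeforeFinalStep) {
        if (safeGen != expectedGen) {
            throw new IllegalArgumentException("expected generation " + expectedGen + " but was " + safeGen);
        }
        // Step 1: reserve a generation greater than the pending one; the safe generation is unchanged.
        long gen = pendingGen + 1;
        pendingGen = gen;
        // Step 2: write the new root blob under the reserved generation.
        rootBlobs.put(gen, repositoryData);
        if (failBeforeFinalStep) {
            throw new IllegalStateException("simulated failure after writing generation " + gen);
        }
        // Step 3: mark success by raising the safe generation to the pending one.
        safeGen = gen;
        return gen;
    }
}
```

After a simulated failure the blob for the reserved generation still exists, and the next write reserves a strictly greater generation, so it never overwrites that blob. This mirrors why apparently dangling blobs cannot be cleaned up immediately.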
      public void snapshotShard(SnapshotShardContext context)
      Description copied from interface: Repository
      Creates a snapshot of the shard referenced by the given SnapshotShardContext.

      As the snapshot process progresses, implementations of this method should update the IndexShardSnapshotStatus object returned by SnapshotShardContext.status() and call IndexShardSnapshotStatus.ensureNotAborted() to see if the snapshot process should be aborted.

      context - snapshot shard context that must be completed via SnapshotShardContext.onResponse(org.elasticsearch.repositories.ShardSnapshotResult) or DelegatingActionListener.onFailure(java.lang.Exception)
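
The "update the status and check for abort as you go" contract might look roughly like this. `Status` is a hypothetical stand-in for IndexShardSnapshotStatus with only an abort flag and a file counter, and the uploader callback is likewise an assumption.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch; the real status object carries far more state.
final class ShardUploadSketch {

    static final class Status {
        final AtomicBoolean aborted = new AtomicBoolean();
        final AtomicInteger processedFiles = new AtomicInteger();

        void ensureNotAborted() {
            if (aborted.get()) {
                throw new IllegalStateException("snapshot aborted");
            }
        }
    }

    // Uploads files one by one, checking for an abort before each file and recording progress after it.
    static void snapshotFiles(List<String> files, Status status, Consumer<String> uploader) {
        for (String file : files) {
            status.ensureNotAborted();
            uploader.accept(file);
            status.processedFiles.incrementAndGet();
        }
    }
}
```

Checking between units of work keeps an abort request from having to wait for the whole shard to finish uploading.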
      protected void snapshotFiles(SnapshotShardContext context, BlockingQueue<BlobStoreIndexShardSnapshot.FileInfo> filesToSnapshot, ActionListener<Collection<Void>> allFilesUploadedListener)
      public void restoreShard(Store store, SnapshotId snapshotId, IndexId indexId, ShardId snapshotShardId, RecoveryState recoveryState, ActionListener<Void> listener)
      Description copied from interface: Repository
      Restores snapshot of the shard.

      The index can be renamed on restore, hence different shardId and snapshotShardId are supplied.

      store - the store to restore the index into
      snapshotId - snapshot id
      indexId - id of the index in the repository from which the restore is occurring
      snapshotShardId - shard id (in the snapshot)
      recoveryState - recovery state
      listener - listener to invoke once done
      public InputStream maybeRateLimitRestores(InputStream stream)
      Wrap the restore rate limiter (controlled by the repository setting `max_restore_bytes_per_sec` and the cluster setting `indices.recovery.max_bytes_per_sec`) around the given stream. Any throttling is recorded in the value returned by getRestoreThrottleTimeInNanos().
      public InputStream maybeRateLimitRestores(InputStream stream, RateLimitingInputStream.Listener throttleListener)
      Wrap the restore rate limiter (controlled by the repository setting `max_restore_bytes_per_sec` and the cluster setting `indices.recovery.max_bytes_per_sec`) around the given stream. Any throttling is reported to the given listener and not otherwise recorded in the value returned by getRestoreThrottleTimeInNanos().
      public InputStream maybeRateLimitSnapshots(InputStream stream)
      Wrap the snapshot rate limiter around the given stream. Any throttling is recorded in the value returned by getSnapshotThrottleTimeInNanos(). Note that speed is throttled by the repository setting `max_snapshot_bytes_per_sec` and, if recovery node bandwidth settings have been set, additionally by the `indices.recovery.max_bytes_per_sec` speed.
      public InputStream maybeRateLimitSnapshots(InputStream stream, RateLimitingInputStream.Listener throttleListener)
      Wrap the snapshot rate limiter around the given stream. Any throttling is reported to the given listener and not otherwise recorded in the value returned by getSnapshotThrottleTimeInNanos(). Note that speed is throttled by the repository setting `max_snapshot_bytes_per_sec` and, if recovery node bandwidth settings have been set, additionally by the `indices.recovery.max_bytes_per_sec` speed.
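
A minimal sketch of wrapping a rate limiter around a stream and reporting throttle time. It assumes, as the restore descriptions suggest, that a caller-supplied listener receives the pause time instead of the internal counter. `ThrottleSketch`, the injected clock and sleeper, and the pacing formula are all hypothetical; the real limiter is not described here.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

// Hypothetical sketch; clock and sleeper are injected so the behaviour is deterministic.
final class ThrottleSketch {
    private final long bytesPerSec;
    private final LongSupplier nanoClock;
    private final LongConsumer sleeper;
    final AtomicLong throttleNanos = new AtomicLong(); // plays the role of the get...ThrottleTimeInNanos() value

    ThrottleSketch(long bytesPerSec, LongSupplier nanoClock, LongConsumer sleeper) {
        this.bytesPerSec = bytesPerSec;
        this.nanoClock = nanoClock;
        this.sleeper = sleeper;
    }

    // Overload without a listener: pauses are added to the internal counter.
    InputStream wrap(InputStream in) {
        return wrap(in, throttleNanos::addAndGet);
    }

    // Overload with a listener: pauses are reported only to the listener.
    InputStream wrap(InputStream in, LongConsumer listener) {
        final long start = nanoClock.getAsLong();
        return new FilterInputStream(in) {
            private long total;

            @Override
            public int read() throws IOException {
                int b = super.read();
                if (b >= 0) pace(1);
                return b;
            }

            @Override
            public int read(byte[] buf, int off, int len) throws IOException {
                int n = super.read(buf, off, len);
                if (n > 0) pace(n);
                return n;
            }

            private void pace(int n) {
                total += n;
                long target = total * 1_000_000_000L / bytesPerSec; // may overflow for very large streams
                long ahead = target - (nanoClock.getAsLong() - start);
                if (ahead > 0) {
                    sleeper.accept(ahead);
                    listener.accept(ahead);
                }
            }
        };
    }

    // Reads the stream to the end and returns the number of bytes read.
    static int drain(InputStream in) {
        try (in) {
            return in.readAllBytes().length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In production code the clock would typically be `System::nanoTime` and the sleeper a real sleep; injecting both keeps the sketch testable without waiting.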
      public IndexShardSnapshotStatus.Copy getShardSnapshotStatus(SnapshotId snapshotId, IndexId indexId, ShardId shardId)
      Retrieve shard snapshot status for the stored snapshot
      snapshotId - snapshot id
      indexId - the snapshotted index id for the shard to get status for
      shardId - shard id
      snapshot status
      public void verify(String seed, DiscoveryNode localNode)
      Verifies repository settings on data node.
      seed - value returned by Repository.startVerification()
      localNode - the local node information, for inclusion in verification errors
      public String toString()
      public BlobStoreIndexShardSnapshot loadShardSnapshot(BlobContainer shardContainer, SnapshotId snapshotId)
      Loads information about a shard snapshot.
      public BlobStoreIndexShardSnapshots getBlobStoreIndexShardSnapshots(IndexId indexId, int shardId, @Nullable ShardGeneration shardGen) throws IOException
      Loads all available snapshots in the repository using the given generation for a shard. When shardGen is null it tries to load it using the BwC mode, listing the available index- blobs in the shard container.
      protected void snapshotFile(SnapshotShardContext context, BlobStoreIndexShardSnapshot.FileInfo fileInfo) throws IOException
      Snapshots an individual file.
      fileInfo - file to snapshot
      public boolean supportURLRepo()
      public boolean hasAtomicOverwrites()
      Returns whether this repository performs overwrites atomically. In practice we only overwrite the `index.latest` blob so this is not very important, but the repository analyzer does test that overwrites happen atomically. It will skip those tests if the repository overrides this method to indicate that it does not support atomic overwrites.
      public int getReadBufferSizeInBytes()