Class ShuffleMasterSnapshotUtil


  • public class ShuffleMasterSnapshotUtil
    extends Object
    Utility class for handling cluster-level snapshots of the ShuffleMaster. It provides methods to write snapshots to the file system and to read them back. Snapshots are immutable, and these operations should be called only during the startup phase of a Flink cluster.

    Snapshots are written to and read from files in the specified working directory. The files created are named using a prefix followed by the cluster ID.
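
    The naming scheme described above can be sketched as follows. This is only an illustration: the class name, the method name, and the prefix value are hypothetical, because this page does not state the actual prefix or how the path is built.

```java
import java.nio.file.Path;

/** Illustrative sketch only; not part of Flink. */
final class SnapshotFileNaming {

    // Hypothetical prefix: the real value is not given in this documentation.
    static final String SNAPSHOT_FILE_PREFIX = "shuffle-master-snapshot-";

    /** Builds the snapshot path as workingDir/(prefix + clusterId). */
    static Path snapshotFile(Path workingDir, String clusterId) {
        return workingDir.resolve(SNAPSHOT_FILE_PREFIX + clusterId);
    }
}
```

    Because the cluster ID is part of the file name, a cluster restarted with the same ID and working directory would resolve to the same file under this scheme.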

    • Constructor Detail

      • ShuffleMasterSnapshotUtil

        public ShuffleMasterSnapshotUtil()
    • Method Detail

      • restoreOrSnapshotShuffleMaster

        public static void restoreOrSnapshotShuffleMaster(ShuffleMaster<?> shuffleMaster,
                                                          org.apache.flink.configuration.Configuration configuration,
                                                          Executor ioExecutor)
                                                   throws IOException
        Restores the state of the ShuffleMaster from a cluster-level snapshot if one is available. If no snapshot exists, a new one is created.

        This method first checks if job recovery is enabled and supported by the ShuffleMaster. It then attempts to locate and read an existing snapshot from the cluster storage. If a snapshot exists, the ShuffleMaster state is restored from it. If no snapshot is found, a new snapshot is taken and saved to the cluster storage asynchronously.

        Parameters:
        shuffleMaster - the shuffle master whose state needs to be restored or saved
        configuration - the configuration containing settings relevant to job recovery
        ioExecutor - an executor that handles the IO operations for snapshot creation
        Throws:
        IOException - if an error occurs while reading or writing the snapshot
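
        The restore-or-snapshot flow described for this method can be sketched in a self-contained way as follows. This is not Flink's implementation: the Snapshotable interface, the restoreOrSnapshot method, and the recoveryEnabled flag are hypothetical stand-ins for the ShuffleMaster, this method, and the configuration check. The sketch also assumes that nothing happens when recovery is disabled, which this page does not confirm.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.Executor;

/** Illustrative sketch only; all names here are hypothetical, not Flink classes. */
final class RestoreOrSnapshotSketch {

    /** Hypothetical stand-in for a component whose state can be captured and restored. */
    interface Snapshotable {
        byte[] snapshotState();

        void restoreState(byte[] snapshot);
    }

    /**
     * Restores the target from snapshotFile if the file exists; otherwise captures
     * a new snapshot and writes it on ioExecutor. The returned future completes
     * once the restore or the asynchronous write has finished.
     */
    static CompletableFuture<Void> restoreOrSnapshot(
            Snapshotable target, Path snapshotFile, boolean recoveryEnabled, Executor ioExecutor)
            throws IOException {
        // Assumption: with recovery disabled there is nothing to restore or save.
        if (!recoveryEnabled) {
            return CompletableFuture.completedFuture(null);
        }
        // An existing snapshot wins: read it synchronously and restore from it.
        if (Files.exists(snapshotFile)) {
            target.restoreState(Files.readAllBytes(snapshotFile));
            return CompletableFuture.completedFuture(null);
        }
        // No snapshot yet: capture the state now, write it in the background.
        byte[] state = target.snapshotState();
        return CompletableFuture.runAsync(
                () -> {
                    try {
                        Path parent = snapshotFile.getParent();
                        if (parent != null) {
                            Files.createDirectories(parent);
                        }
                        Files.write(snapshotFile, state);
                    } catch (IOException e) {
                        // Surface the I/O failure through the returned future.
                        throw new CompletionException(e);
                    }
                },
                ioExecutor);
    }
}
```

        In this sketch the read path is synchronous and can throw IOException directly, while a failed write surfaces through the returned future. Whether the real method reports write failures the same way is not stated on this page.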