Class FailureHandlingResult


  • public class FailureHandlingResult
    extends Object
    Result of handling a task failure. It carries the reason for the failure and, if the failure is recoverable, the vertices to restart. A failure is not recoverable when its type is non-recoverable or when restarting is suppressed by the restart strategy.
    • Method Detail

      • getVerticesToRestart

        public Set<ExecutionVertexID> getVerticesToRestart()
        Returns the tasks to restart.
        Returns:
        the tasks to restart
      • getRestartDelayMS

        public long getRestartDelayMS()
        Returns the delay before the restart, in milliseconds.
        Returns:
        the delay before the restart, in milliseconds
      • getFailedExecution

        public Optional<Execution> getFailedExecution()
        Returns an Optional with the Execution causing this failure or an empty Optional if it's a global failure.
        Returns:
        The Optional with the failed Execution or an empty Optional if it's a global failure.
      • getError

        @Nullable
        public Throwable getError()
        Returns the reason why the restart cannot be conducted.
        Returns:
        the reason why the restart cannot be conducted
      • getFailureLabels

        public CompletableFuture<Map<String,String>> getFailureLabels()
        Returns the future that holds the labels associated with the failure.
        Returns:
        the CompletableFuture of the Map of String labels
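Because the labels arrive as a CompletableFuture, a caller would typically attach a callback instead of blocking. The following is a minimal sketch using only JDK classes: the hand-built future stands in for the value returned by getFailureLabels(), and the class name, the describe helper, and the label key and value are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;

final class FailureLabelsSketch {

    // Formats the labels once they are available.
    // If the future completed exceptionally, an empty map is used instead.
    static CompletableFuture<String> describe(CompletableFuture<Map<String, String>> labelsFuture) {
        return labelsFuture
                .exceptionally(t -> Map.of())
                // TreeMap gives a stable key order for printing.
                .thenApply(labels -> "labels=" + new TreeMap<>(labels));
    }

    public static void main(String[] args) {
        // Stand-in for result.getFailureLabels(); the key/value pair is made up.
        CompletableFuture<Map<String, String>> labelsFuture =
                CompletableFuture.completedFuture(Map.of("type", "user"));
        System.out.println(describe(labelsFuture).join());
    }
}
```

Falling back to an empty map is one possible choice for this sketch; a caller that needs to distinguish "no labels" from "labelling failed" would handle the exceptional case separately.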
      • getTimestamp

        public long getTimestamp()
        Returns the time of the failure.
        Returns:
        The timestamp.
      • canRestart

        public boolean canRestart()
        Returns whether the restart can be conducted.
        Returns:
        whether the restart can be conducted
      • isGlobalFailure

        public boolean isGlobalFailure()
        Checks whether this failure was a global failure, i.e., one coming from a "safety net" failover that involved all tasks and should also reset components such as the coordinators.
      • isRootCause

        public boolean isRootCause()
        Returns:
        true if the current failure starts a new attempt; false if there was an earlier failure that has not been retried yet, in which case the current failure is merged into that earlier attempt and the merged exceptions are treated as concurrent exceptions.
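The accessors above are typically read together: a caller first checks canRestart() and then uses either the restart information or the error. The following is a minimal sketch under stated assumptions: ResultView is a hypothetical stand-in that mirrors a subset of the documented accessors, vertex IDs are plain strings rather than ExecutionVertexID, and the decide and view helpers are illustrative only.

```java
import java.util.Set;

final class RestartDecisionSketch {

    // Hypothetical stand-in exposing a subset of the documented accessors.
    interface ResultView {
        boolean canRestart();
        boolean isGlobalFailure();
        long getRestartDelayMS();
        Set<String> getVerticesToRestart();
        Throwable getError();
    }

    // Describes the action a caller might take for a given result.
    static String decide(ResultView result) {
        if (!result.canRestart()) {
            return "fail job: " + result.getError().getMessage();
        }
        String scope = result.isGlobalFailure() ? "global" : "local";
        return scope + " restart of " + result.getVerticesToRestart().size()
                + " task(s) after " + result.getRestartDelayMS() + " ms";
    }

    // Builds a fixed view for demonstration purposes.
    static ResultView view(boolean canRestart, boolean global, long delayMs,
                           Set<String> vertices, Throwable error) {
        return new ResultView() {
            public boolean canRestart() { return canRestart; }
            public boolean isGlobalFailure() { return global; }
            public long getRestartDelayMS() { return delayMs; }
            public Set<String> getVerticesToRestart() { return vertices; }
            public Throwable getError() { return error; }
        };
    }

    public static void main(String[] args) {
        System.out.println(decide(view(true, false, 1000L, Set.of("v1", "v2"), null)));
        System.out.println(decide(view(false, false, 0L, Set.of(),
                new IllegalStateException("restart suppressed"))));
    }
}
```

The sketch reads getError() only on the non-restartable path, where the unrecoverable factory requires a non-null error.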
      • restartable

        public static FailureHandlingResult restartable(@Nullable
                                                        Execution failedExecution,
                                                        @Nullable
                                                        Throwable cause,
                                                        long timestamp,
                                                        CompletableFuture<Map<String,String>> failureLabels,
                                                        Set<ExecutionVertexID> verticesToRestart,
                                                        long restartDelayMS,
                                                        boolean globalFailure,
                                                        boolean isRootCause)
        Creates a result containing the set of tasks to restart in order to recover from the failure.

        The result can be flagged to be from a global failure triggered by the scheduler, rather than from the failure of an individual task.

        Parameters:
        failedExecution - the Execution that the failure originates from. Passing null indicates that the failure was issued by Flink itself.
        cause - the reason for the failure
        timestamp - the time of the failure
        failureLabels - future of the map of labels characterizing the failure, as produced by the FailureEnrichers
        verticesToRestart - the task vertices to restart to recover from the failure. null indicates that the failure is not restartable.
        restartDelayMS - the delay before conducting the restart
        globalFailure - whether the result is flagged as coming from a global failure triggered by the scheduler
        isRootCause - whether the failure starts a new attempt rather than being merged into an earlier one
        Returns:
        result of a set of tasks to restart to recover from the failure
      • unrecoverable

        public static FailureHandlingResult unrecoverable(@Nullable
                                                          Execution failedExecution,
                                                          @Nonnull
                                                          Throwable error,
                                                          long timestamp,
                                                          CompletableFuture<Map<String,String>> failureLabels,
                                                          boolean globalFailure,
                                                          boolean isRootCause)
        Creates a result indicating that the failure is not recoverable and that no restart should be conducted.

        The result can be flagged to be from a global failure triggered by the scheduler, rather than from the failure of an individual task.

        Parameters:
        failedExecution - the Execution that the failure originates from. Passing null indicates that the failure was issued by Flink itself.
        error - the reason why the failure is not recoverable
        timestamp - the time of the failure
        failureLabels - future of the map of labels characterizing the failure, as produced by the FailureEnrichers
        Returns:
        result indicating the failure is not recoverable
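The two static factories follow a common pattern: the constructor is hidden, and each factory fixes which fields are populated. The sketch below illustrates that pattern with a simplified, hypothetical stand-in. It is not Flink's implementation: the Result class, its reduced parameter lists, the string vertex IDs, and the -1 delay sentinel are all assumptions made for illustration.

```java
import java.util.Objects;
import java.util.Set;

final class FactorySketch {

    // Simplified, hypothetical stand-in; this is not Flink's FailureHandlingResult.
    static final class Result {
        private final Set<String> verticesToRestart; // null means "not restartable"
        private final long restartDelayMs;
        private final Throwable error;

        // Private constructor: instances are created only through the factories below.
        private Result(Set<String> verticesToRestart, long restartDelayMs, Throwable error) {
            this.verticesToRestart = verticesToRestart;
            this.restartDelayMs = restartDelayMs;
            this.error = error;
        }

        // Recoverable case: the vertices and the delay are supplied by the caller.
        static Result restartable(Throwable cause, Set<String> verticesToRestart, long restartDelayMs) {
            Objects.requireNonNull(verticesToRestart, "verticesToRestart");
            return new Result(Set.copyOf(verticesToRestart), restartDelayMs, cause);
        }

        // Non-recoverable case: no vertices; -1 is a sentinel chosen for this sketch.
        static Result unrecoverable(Throwable error) {
            return new Result(null, -1L, Objects.requireNonNull(error, "error"));
        }

        boolean canRestart() { return verticesToRestart != null; }
        long getRestartDelayMs() { return restartDelayMs; }
        Throwable getError() { return error; }
    }

    public static void main(String[] args) {
        Result ok = Result.restartable(new RuntimeException("task failed"), Set.of("v1"), 1000L);
        Result fatal = Result.unrecoverable(new IllegalStateException("restart suppressed"));
        System.out.println(ok.canRestart() + " " + fatal.canRestart());
    }
}
```

Keeping the constructor private means a caller cannot build a result that is both restartable and missing its vertices, which is the main reason to prefer factories here.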