Class ExecutionFailureHandler


  • public class ExecutionFailureHandler
    extends Object
    Handles task failures and returns a FailureHandlingResult that identifies the tasks to restart in order to recover from the failure.
    • Constructor Detail

      • ExecutionFailureHandler

        public ExecutionFailureHandler(org.apache.flink.configuration.Configuration jobMasterConfig,
                                       SchedulingTopology schedulingTopology,
                                       FailoverStrategy failoverStrategy,
                                       RestartBackoffTimeStrategy restartBackoffTimeStrategy,
                                       org.apache.flink.runtime.concurrent.ComponentMainThreadExecutor mainThreadExecutor,
                                       Collection<org.apache.flink.core.failure.FailureEnricher> failureEnrichers,
                                       org.apache.flink.core.failure.FailureEnricher.Context taskFailureCtx,
                                       org.apache.flink.core.failure.FailureEnricher.Context globalFailureCtx,
                                       org.apache.flink.metrics.MetricGroup metricGroup)
        Creates the handler to deal with task failures.
        Parameters:
        schedulingTopology - contains the topology info for failover
        failoverStrategy - helps to decide tasks to restart on task failures
        restartBackoffTimeStrategy - helps to decide whether to restart failed tasks and the restarting delay
        mainThreadExecutor - the main thread executor of the job master
        failureEnrichers - a collection of FailureEnricher that enrich failures
        taskFailureCtx - the task failure context used by the FailureEnrichers
        globalFailureCtx - the global failure context used by the FailureEnrichers
    • Method Detail

      • getFailureHandlingResult

        public FailureHandlingResult getFailureHandlingResult(Execution failedExecution,
                                                              Throwable cause,
                                                              long timestamp)
        Returns the result of handling a task failure. The result either contains the set of task vertices to restart together with the restart delay, or indicates that the failure is not recoverable and gives the reason.
        Parameters:
        failedExecution - the failed execution
        cause - the cause of the task failure
        timestamp - the timestamp of the task failure
        Returns:
        result of the failure handling
      • getGlobalFailureHandlingResult

        public FailureHandlingResult getGlobalFailureHandlingResult(Throwable cause,
                                                                    long timestamp)
        Returns the result of handling a global failure. The result either contains the set of task vertices to restart together with the restart delay, or indicates that the failure is not recoverable and gives the reason.
        Parameters:
        cause - the cause of the global failure
        timestamp - the timestamp of the global failure
        Returns:
        result of the failure handling
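
        Both methods above describe a result with two possible shapes: a set of vertices to restart plus a delay, or an unrecoverable failure with its reason. The following is a minimal sketch of how a caller might branch on those two shapes. It does not use Flink's classes: SketchResult, its factory methods, and describe are hypothetical stand-ins invented for illustration, and vertex IDs are modelled as plain strings.

```java
import java.util.Set;

/** Entry point of the sketch: shows how a caller might branch on the two outcomes. */
final class FailureHandlingSketch {

    /** Turns a result into the action a scheduler-like caller might take. */
    static String describe(SketchResult result) {
        if (result.canRestart()) {
            return "restart " + result.verticesToRestart().size()
                    + " task(s) after " + result.restartDelayMs() + " ms";
        }
        return "fail job: " + result.error().getMessage();
    }

    public static void main(String[] args) {
        SketchResult restartable = SketchResult.restartable(Set.of("v1", "v2"), 500L);
        SketchResult unrecoverable =
                SketchResult.unrecoverable(new IllegalStateException("restart budget exhausted"));

        System.out.println(describe(restartable));   // restart 2 task(s) after 500 ms
        System.out.println(describe(unrecoverable)); // fail job: restart budget exhausted
    }
}

/** Hypothetical stand-in for the result type; NOT Flink's FailureHandlingResult. */
final class SketchResult {
    private final Set<String> verticesToRestart; // empty when the failure is unrecoverable
    private final long restartDelayMs;           // -1 when the failure is unrecoverable
    private final Throwable error;               // null when a restart is possible

    private SketchResult(Set<String> verticesToRestart, long restartDelayMs, Throwable error) {
        this.verticesToRestart = verticesToRestart;
        this.restartDelayMs = restartDelayMs;
        this.error = error;
    }

    /** Outcome 1: restart the given vertices after the given delay. */
    static SketchResult restartable(Set<String> vertices, long delayMs) {
        return new SketchResult(Set.copyOf(vertices), delayMs, null);
    }

    /** Outcome 2: the failure cannot be recovered from; keep the reason. */
    static SketchResult unrecoverable(Throwable error) {
        return new SketchResult(Set.of(), -1L, error);
    }

    boolean canRestart() { return error == null; }
    Set<String> verticesToRestart() { return verticesToRestart; }
    long restartDelayMs() { return restartDelayMs; }
    Throwable error() { return error; }
}
```

        Keeping the two outcomes in one immutable value lets the caller make a single decision per failure instead of querying the handler several times.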
      • isUnrecoverableError

        public static boolean isUnrecoverableError(Throwable cause)
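
        This page does not say how isUnrecoverableError classifies a Throwable. One plausible shape for such a check, shown here purely as an assumption, is to walk the cause chain and look for a marker exception type. NonRecoverableException and isUnrecoverable below are hypothetical names, not Flink APIs.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class UnrecoverableCheckSketch {

    /** Hypothetical marker type for errors that should never trigger a restart. */
    static final class NonRecoverableException extends RuntimeException {
        NonRecoverableException(String message) {
            super(message);
        }
    }

    /**
     * Returns true if the throwable or any of its causes is a NonRecoverableException.
     * An identity-based set guards against cyclic cause chains.
     */
    static boolean isUnrecoverable(Throwable cause) {
        Set<Throwable> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Throwable current = cause;
        while (current != null && seen.add(current)) {
            if (current instanceof NonRecoverableException) {
                return true;
            }
            current = current.getCause();
        }
        return false;
    }

    public static void main(String[] args) {
        Throwable wrapped =
                new RuntimeException("task failed", new NonRecoverableException("bad user code"));
        System.out.println(isUnrecoverable(wrapped));                                // true
        System.out.println(isUnrecoverable(new RuntimeException("transient error"))); // false
    }
}
```

        Checking the whole chain rather than only the top-level exception matters because failures usually arrive wrapped in one or more generic exceptions.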
      • getNumberOfRestarts

        public long getNumberOfRestarts()