Class StreamGraph

  • All Implemented Interfaces:
    Serializable, org.apache.flink.api.dag.Pipeline, ExecutionPlan

    @Internal
    public class StreamGraph
    extends Object
    implements org.apache.flink.api.dag.Pipeline, ExecutionPlan
    Class representing the streaming topology. It contains all the information necessary to build the JobGraph for execution.
    See Also:
    Serialized Form
    • Constructor Detail

      • StreamGraph

        public StreamGraph​(org.apache.flink.configuration.Configuration jobConfiguration,
                           org.apache.flink.api.common.ExecutionConfig executionConfig,
                           CheckpointConfig checkpointConfig,
                           SavepointRestoreSettings savepointRestoreSettings)
    • Method Detail

      • clear

        public void clear()
        Removes all registered nodes and related state.
      • getExecutionConfig

        public org.apache.flink.api.common.ExecutionConfig getExecutionConfig()
      • getJobConfiguration

        public org.apache.flink.configuration.Configuration getJobConfiguration()
        Description copied from interface: ExecutionPlan
        Gets the job configuration.
        Specified by:
        getJobConfiguration in interface ExecutionPlan
        Returns:
        the job configuration
      • getCheckpointingMode

        public org.apache.flink.core.execution.CheckpointingMode getCheckpointingMode()
      • getCheckpointingMode

        public static org.apache.flink.core.execution.CheckpointingMode getCheckpointingMode​(CheckpointConfig checkpointConfig)
      • addJar

        public void addJar​(org.apache.flink.core.fs.Path jar)
        Adds the path of a JAR file required to run the job on a task manager.
        Parameters:
        jar - path of the JAR file required to run the job on a task manager
      • getUserJars

        public List<org.apache.flink.core.fs.Path> getUserJars()
        Gets the list of assigned user jar paths.
        Specified by:
        getUserJars in interface ExecutionPlan
        Returns:
        The list of assigned user jar paths
      • createJobCheckpointingSettings

        public void createJobCheckpointingSettings()
      • getSerializedExecutionConfig

        public org.apache.flink.util.SerializedValue<org.apache.flink.api.common.ExecutionConfig> getSerializedExecutionConfig()
        Description copied from interface: ExecutionPlan
        Gets the serialized execution configuration.
        Specified by:
        getSerializedExecutionConfig in interface ExecutionPlan
        Returns:
        The serialized execution configuration object
      • getJobName

        public String getJobName()
      • setJobName

        public void setJobName​(String jobName)
      • setLineageGraph

        public void setLineageGraph​(LineageGraph lineageGraph)
      • setStateBackend

        public void setStateBackend​(StateBackend backend)
      • getStateBackend

        @VisibleForTesting
        public StateBackend getStateBackend()
      • setCheckpointStorage

        public void setCheckpointStorage​(CheckpointStorage checkpointStorage)
      • setGlobalStreamExchangeMode

        public void setGlobalStreamExchangeMode​(GlobalStreamExchangeMode globalExchangeMode)
      • setSlotSharingGroupResource

        public void setSlotSharingGroupResource​(Map<String,​ResourceProfile> slotSharingGroupResources)
      • hasFineGrainedResource

        public boolean hasFineGrainedResource()
      • setAllVerticesInSameSlotSharingGroupByDefault

        public void setAllVerticesInSameSlotSharingGroupByDefault​(boolean allVerticesInSameSlotSharingGroupByDefault)
        Sets whether to put all vertices into the same slot sharing group by default.
        Parameters:
        allVerticesInSameSlotSharingGroupByDefault - indicates whether to put all vertices into the same slot sharing group by default.
      • isAllVerticesInSameSlotSharingGroupByDefault

        public boolean isAllVerticesInSameSlotSharingGroupByDefault()
        Gets whether to put all vertices into the same slot sharing group by default.
        Returns:
        whether to put all vertices into the same slot sharing group by default.
      • isEnableCheckpointsAfterTasksFinish

        public boolean isEnableCheckpointsAfterTasksFinish()
      • setEnableCheckpointsAfterTasksFinish

        public void setEnableCheckpointsAfterTasksFinish​(boolean enableCheckpointsAfterTasksFinish)
      • isChainingEnabled

        public boolean isChainingEnabled()
      • isChainingOfOperatorsWithDifferentMaxParallelismEnabled

        public boolean isChainingOfOperatorsWithDifferentMaxParallelismEnabled()
      • isIterative

        public boolean isIterative()
      • addSource

        public <IN,​OUT> void addSource​(Integer vertexID,
                                             @Nullable
                                             String slotSharingGroup,
                                             @Nullable
                                             String coLocationGroup,
                                             SourceOperatorFactory<OUT> operatorFactory,
                                             org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo,
                                             org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                             String operatorName)
      • addLegacySource

        public <IN,​OUT> void addLegacySource​(Integer vertexID,
                                                   @Nullable
                                                   String slotSharingGroup,
                                                   @Nullable
                                                   String coLocationGroup,
                                                   StreamOperatorFactory<OUT> operatorFactory,
                                                   org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo,
                                                   org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                                   String operatorName)
      • addSink

        public <IN,​OUT> void addSink​(Integer vertexID,
                                           @Nullable
                                           String slotSharingGroup,
                                           @Nullable
                                           String coLocationGroup,
                                           StreamOperatorFactory<OUT> operatorFactory,
                                           org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo,
                                           org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                           String operatorName)
      • addOperator

        public <IN,​OUT> void addOperator​(Integer vertexID,
                                               @Nullable
                                               String slotSharingGroup,
                                               @Nullable
                                               String coLocationGroup,
                                               StreamOperatorFactory<OUT> operatorFactory,
                                               org.apache.flink.api.common.typeinfo.TypeInformation<IN> inTypeInfo,
                                               org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                               String operatorName)
      • addCoOperator

        public <IN1,​IN2,​OUT> void addCoOperator​(Integer vertexID,
                                                            String slotSharingGroup,
                                                            @Nullable
                                                            String coLocationGroup,
                                                            StreamOperatorFactory<OUT> taskOperatorFactory,
                                                            org.apache.flink.api.common.typeinfo.TypeInformation<IN1> in1TypeInfo,
                                                            org.apache.flink.api.common.typeinfo.TypeInformation<IN2> in2TypeInfo,
                                                            org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                                            String operatorName)
      • addMultipleInputOperator

        public <OUT> void addMultipleInputOperator​(Integer vertexID,
                                                   String slotSharingGroup,
                                                   @Nullable
                                                   String coLocationGroup,
                                                   StreamOperatorFactory<OUT> operatorFactory,
                                                   List<org.apache.flink.api.common.typeinfo.TypeInformation<?>> inTypeInfos,
                                                   org.apache.flink.api.common.typeinfo.TypeInformation<OUT> outTypeInfo,
                                                   String operatorName)
      • addVirtualSideOutputNode

        public void addVirtualSideOutputNode​(Integer originalId,
                                             Integer virtualId,
                                             org.apache.flink.util.OutputTag outputTag)
        Adds a new virtual node that is used to connect a downstream vertex to only the outputs with the selected side-output OutputTag.
        Parameters:
        originalId - ID of the node that should be connected to.
        virtualId - ID of the virtual node.
        outputTag - The selected side-output OutputTag.
      • addVirtualPartitionNode

        public void addVirtualPartitionNode​(Integer originalId,
                                            Integer virtualId,
                                            StreamPartitioner<?> partitioner,
                                            StreamExchangeMode exchangeMode)
        Adds a new virtual node that is used to connect a downstream vertex to an input with a certain partitioning.

        When an edge is added from the virtual node to a downstream node, the connection is made to the original node, but with the partitioning given here.

        Parameters:
        originalId - ID of the node that should be connected to.
        virtualId - ID of the virtual node.
        partitioner - The partitioner to apply to the connection.
        exchangeMode - The StreamExchangeMode to use for the connection.
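
        The resolution described for addVirtualPartitionNode can be illustrated with a small, self-contained toy model. This is a sketch only and not Flink's implementation: the class VirtualPartitionSketch, its Edge type, the use of plain strings for partitioners, and the "forward" placeholder default are all assumptions made for illustration. It also assumes that when several virtual nodes are chained, the one closest to the downstream node supplies the partitioner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of virtual partition nodes. Hypothetical; not Flink's implementation. */
public class VirtualPartitionSketch {

    /** Hypothetical edge between two physical nodes, tagged with a partitioner name. */
    public static final class Edge {
        public final int sourceId;
        public final int targetId;
        public final String partitioner;

        Edge(int sourceId, int targetId, String partitioner) {
            this.sourceId = sourceId;
            this.targetId = targetId;
            this.partitioner = partitioner;
        }
    }

    // virtual node ID -> ID of the node it stands in for
    private final Map<Integer, Integer> virtualToOriginal = new HashMap<>();
    // virtual node ID -> partitioner name carried by that virtual node
    private final Map<Integer, String> virtualPartitioner = new HashMap<>();
    private final List<Edge> edges = new ArrayList<>();

    /** Registers a virtual node; no physical node is created. */
    public void addVirtualPartitionNode(int originalId, int virtualId, String partitioner) {
        virtualToOriginal.put(virtualId, originalId);
        virtualPartitioner.put(virtualId, partitioner);
    }

    /** Adds an edge; a virtual upstream ID is resolved to the physical node behind it. */
    public Edge addEdge(int upstreamId, int downstreamId) {
        String partitioner = null;
        int source = upstreamId;
        // Walk back through virtual nodes until a physical node is reached.
        while (virtualToOriginal.containsKey(source)) {
            if (partitioner == null) {
                // Assumption: the virtual node nearest the downstream node wins.
                partitioner = virtualPartitioner.get(source);
            }
            source = virtualToOriginal.get(source);
        }
        if (partitioner == null) {
            partitioner = "forward"; // placeholder default for this sketch
        }
        Edge edge = new Edge(source, downstreamId, partitioner);
        edges.add(edge);
        return edge;
    }

    public List<Edge> getEdges() {
        return edges;
    }
}
```

        In this model, registering virtual node 100 for original node 1 and then adding an edge from 100 to 2 yields a single physical edge from 1 to 2 that carries the virtual node's partitioner.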
      • getSlotSharingGroup

        public String getSlotSharingGroup​(Integer id)
        Determines the slot sharing group of an operation across virtual nodes.
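
        One way to read "across virtual nodes" is that a lookup on a virtual node ID is forwarded to the node it stands in for until a physical node is found. The following sketch models that reading under stated assumptions: SlotSharingLookupSketch, addPhysicalNode, and addVirtualNode are hypothetical names, and the lookup logic is an illustration rather than Flink's code.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy lookup of a slot sharing group through virtual nodes. Hypothetical model. */
public class SlotSharingLookupSketch {

    // virtual node ID -> ID of the node it stands in for
    private final Map<Integer, Integer> virtualToOriginal = new HashMap<>();
    // physical node ID -> slot sharing group name
    private final Map<Integer, String> physicalGroups = new HashMap<>();

    public void addPhysicalNode(int id, String slotSharingGroup) {
        physicalGroups.put(id, slotSharingGroup);
    }

    public void addVirtualNode(int originalId, int virtualId) {
        virtualToOriginal.put(virtualId, originalId);
    }

    /** Returns the group of the physical node behind the given ID, or null if unknown. */
    public String getSlotSharingGroup(int id) {
        int current = id;
        // Follow virtual nodes back to the physical node they represent.
        while (virtualToOriginal.containsKey(current)) {
            current = virtualToOriginal.get(current);
        }
        return physicalGroups.get(current);
    }
}
```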
      • addEdge

        public void addEdge​(Integer upStreamVertexID,
                            Integer downStreamVertexID,
                            int typeNumber)
      • setParallelism

        public void setParallelism​(Integer vertexID,
                                   int parallelism)
      • isDynamic

        public boolean isDynamic()
        Description copied from interface: ExecutionPlan
        Checks if the execution plan is dynamic.
        Specified by:
        isDynamic in interface ExecutionPlan
        Returns:
        true if the execution plan is dynamic; false otherwise
      • isEmpty

        public boolean isEmpty()
        Description copied from interface: ExecutionPlan
        Checks if the execution plan is empty.
        Specified by:
        isEmpty in interface ExecutionPlan
        Returns:
        true if the plan is empty; false otherwise
      • setParallelism

        public void setParallelism​(Integer vertexId,
                                   int parallelism,
                                   boolean parallelismConfigured)
      • setDynamic

        public void setDynamic​(boolean dynamic)
      • setMaxParallelism

        public void setMaxParallelism​(int vertexID,
                                      int maxParallelism)
      • setResources

        public void setResources​(int vertexID,
                                 org.apache.flink.api.common.operators.ResourceSpec minResources,
                                 org.apache.flink.api.common.operators.ResourceSpec preferredResources)
      • setManagedMemoryUseCaseWeights

        public void setManagedMemoryUseCaseWeights​(int vertexID,
                                                   Map<org.apache.flink.core.memory.ManagedMemoryUseCase,​Integer> operatorScopeUseCaseWeights,
                                                   Set<org.apache.flink.core.memory.ManagedMemoryUseCase> slotScopeUseCases)
      • setOneInputStateKey

        public void setOneInputStateKey​(Integer vertexID,
                                        org.apache.flink.api.java.functions.KeySelector<?,​?> keySelector,
                                        org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
      • setTwoInputStateKey

        public void setTwoInputStateKey​(Integer vertexID,
                                        org.apache.flink.api.java.functions.KeySelector<?,​?> keySelector1,
                                        org.apache.flink.api.java.functions.KeySelector<?,​?> keySelector2,
                                        org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
      • setMultipleInputStateKey

        public void setMultipleInputStateKey​(Integer vertexID,
                                             List<org.apache.flink.api.java.functions.KeySelector<?,​?>> keySelectors,
                                             org.apache.flink.api.common.typeutils.TypeSerializer<?> keySerializer)
      • setBufferTimeout

        public void setBufferTimeout​(Integer vertexID,
                                     long bufferTimeout)
      • setSerializers

        public void setSerializers​(Integer vertexID,
                                   org.apache.flink.api.common.typeutils.TypeSerializer<?> in1,
                                   org.apache.flink.api.common.typeutils.TypeSerializer<?> in2,
                                   org.apache.flink.api.common.typeutils.TypeSerializer<?> out)
      • setInputFormat

        public void setInputFormat​(Integer vertexID,
                                   org.apache.flink.api.common.io.InputFormat<?,​?> inputFormat)
      • setOutputFormat

        public void setOutputFormat​(Integer vertexID,
                                    org.apache.flink.api.common.io.OutputFormat<?> outputFormat)
      • setTransformationUID

        public void setTransformationUID​(Integer nodeId,
                                         String transformationId)
      • getStreamEdges

        @VisibleForTesting
        public List<StreamEdge> getStreamEdges​(int sourceId)
      • getStreamEdges

        public List<StreamEdge> getStreamEdges​(int sourceId,
                                               int targetId)
      • getStreamEdgesOrThrow

        @VisibleForTesting
        @Deprecated
        public List<StreamEdge> getStreamEdgesOrThrow​(int sourceId,
                                                      int targetId)
        Deprecated.
      • getLoopTimeout

        public long getLoopTimeout​(Integer vertexID)
      • getJobGraph

        @VisibleForTesting
        public JobGraph getJobGraph()
        Gets the assembled JobGraph with a random JobID.
      • getJobGraph

        public JobGraph getJobGraph​(ClassLoader userClassLoader,
                                    @Nullable
                                    org.apache.flink.api.common.JobID jobID)
        Gets the assembled JobGraph with a specified JobID.
      • getStreamingPlanAsJSON

        public String getStreamingPlanAsJSON()
      • setJobType

        public void setJobType​(JobType jobType)
      • isAutoParallelismEnabled

        public boolean isAutoParallelismEnabled()
      • setAutoParallelismEnabled

        public void setAutoParallelismEnabled​(boolean autoParallelismEnabled)
      • getVertexDescriptionMode

        public org.apache.flink.configuration.PipelineOptions.VertexDescriptionMode getVertexDescriptionMode()
      • setVertexDescriptionMode

        public void setVertexDescriptionMode​(org.apache.flink.configuration.PipelineOptions.VertexDescriptionMode mode)
      • setVertexNameIncludeIndexPrefix

        public void setVertexNameIncludeIndexPrefix​(boolean includePrefix)
      • isVertexNameIncludeIndexPrefix

        public boolean isVertexNameIncludeIndexPrefix()
      • registerJobStatusHook

        public void registerJobStatusHook​(org.apache.flink.core.execution.JobStatusHook hook)
        Registers the JobStatusHook.
      • getJobStatusHooks

        public List<org.apache.flink.core.execution.JobStatusHook> getJobStatusHooks()
      • setSupportsConcurrentExecutionAttempts

        public void setSupportsConcurrentExecutionAttempts​(Integer vertexId,
                                                           boolean supportsConcurrentExecutionAttempts)
      • setAttribute

        public void setAttribute​(Integer vertexId,
                                 org.apache.flink.api.common.attribute.Attribute attribute)
      • setJobId

        public void setJobId​(org.apache.flink.api.common.JobID jobId)
      • getJobID

        public org.apache.flink.api.common.JobID getJobID()
        Description copied from interface: ExecutionPlan
        Gets the unique identifier of the job.
        Specified by:
        getJobID in interface ExecutionPlan
        Returns:
        the job id
      • setClasspath

        public void setClasspath​(List<URL> paths)
        Sets the classpath required to run the job on a task manager.
        Parameters:
        paths - paths of the directories/JAR files required to run the job on a task manager
      • getClasspath

        public List<URL> getClasspath()
      • getUserJarBlobKeys

        public List<PermanentBlobKey> getUserJarBlobKeys()
        Returns a list of BLOB keys referring to the JAR files required to run this job.
        Specified by:
        getUserJarBlobKeys in interface ExecutionPlan
        Returns:
        list of BLOB keys referring to the JAR files required to run this job
      • getClasspaths

        public List<URL> getClasspaths()
        Description copied from interface: ExecutionPlan
        Gets the classpath required for the job.
        Specified by:
        getClasspaths in interface ExecutionPlan
        Returns:
        a list of classpath URLs
      • addUserArtifact

        public void addUserArtifact​(String name,
                                    org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry file)
      • getUserArtifacts

        public Map<String,​org.apache.flink.api.common.cache.DistributedCache.DistributedCacheEntry> getUserArtifacts()
        Description copied from interface: ExecutionPlan
        Gets the user artifacts associated with the job.
        Specified by:
        getUserArtifacts in interface ExecutionPlan
        Returns:
        a map of user artifacts
      • setUserArtifactBlobKey

        public void setUserArtifactBlobKey​(String entryName,
                                           PermanentBlobKey blobKey)
                                    throws IOException
        Description copied from interface: ExecutionPlan
        Sets a user artifact blob key for a specified user artifact.
        Specified by:
        setUserArtifactBlobKey in interface ExecutionPlan
        Parameters:
        entryName - the name of the user artifact
        blobKey - the blob key corresponding to the user artifact
        Throws:
        IOException - if an error occurs during the operation
      • getMaximumParallelism

        public int getMaximumParallelism()
        Description copied from interface: ExecutionPlan
        Gets the maximum parallelism level for the job.
        Specified by:
        getMaximumParallelism in interface ExecutionPlan
        Returns:
        the maximum parallelism
      • setInitialClientHeartbeatTimeout

        public void setInitialClientHeartbeatTimeout​(long initialClientHeartbeatTimeout)
      • getInitialClientHeartbeatTimeout

        public long getInitialClientHeartbeatTimeout()
        Description copied from interface: ExecutionPlan
        Gets the initial client heartbeat timeout.
        Specified by:
        getInitialClientHeartbeatTimeout in interface ExecutionPlan
        Returns:
        the timeout duration in milliseconds
      • isPartialResourceConfigured

        public boolean isPartialResourceConfigured()
        Description copied from interface: ExecutionPlan
        Checks if partial resource configuration is specified.
        Specified by:
        isPartialResourceConfigured in interface ExecutionPlan
        Returns:
        true if partial resource configuration is set; false otherwise
      • serializeUserDefinedInstances

        public void serializeUserDefinedInstances()
                                           throws IOException
        Throws:
        IOException
      • getStreamNodesSortedTopologicallyFromSources

        public List<StreamNode> getStreamNodesSortedTopologicallyFromSources()
                                                                      throws org.apache.flink.api.common.InvalidProgramException
        Throws:
        org.apache.flink.api.common.InvalidProgramException
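
        As a rough illustration of ordering nodes topologically starting from the sources, the sketch below applies Kahn's algorithm to a plain adjacency map. It is not the method's actual implementation. TopologicalSortSketch and sortFromSources are hypothetical names, IllegalStateException stands in for InvalidProgramException, and it is an assumption that the exception signals a graph that cannot be ordered, such as one containing a cycle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Kahn's algorithm over an adjacency map. Hypothetical sketch, not Flink code. */
public class TopologicalSortSketch {

    /**
     * Sorts node IDs so that every node appears after all of its upstream nodes.
     *
     * @param edges node ID -> list of downstream node IDs
     * @throws IllegalStateException if the graph contains a cycle
     */
    public static List<Integer> sortFromSources(Map<Integer, List<Integer>> edges) {
        // Count incoming edges per node; TreeMap keeps the iteration order deterministic.
        Map<Integer, Integer> inDegree = new TreeMap<>();
        for (Integer node : edges.keySet()) {
            inDegree.put(node, 0);
        }
        for (List<Integer> targets : edges.values()) {
            for (Integer target : targets) {
                inDegree.merge(target, 1, Integer::sum);
            }
        }

        // Nodes without inputs are the sources and can be emitted first.
        Deque<Integer> ready = new ArrayDeque<>();
        for (Map.Entry<Integer, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }

        List<Integer> sorted = new ArrayList<>();
        while (!ready.isEmpty()) {
            Integer node = ready.poll();
            sorted.add(node);
            for (Integer target : edges.getOrDefault(node, Collections.<Integer>emptyList())) {
                // A node becomes ready once all of its inputs have been emitted.
                if (inDegree.merge(target, -1, Integer::sum) == 0) {
                    ready.add(target);
                }
            }
        }

        if (sorted.size() != inDegree.size()) {
            throw new IllegalStateException("graph contains a cycle");
        }
        return sorted;
    }
}
```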
      • serializeAndSaveWatermarkDeclarations

        public void serializeAndSaveWatermarkDeclarations()
      • getSerializedWatermarkDeclarations

        public byte[] getSerializedWatermarkDeclarations()
        Gets the serialized watermark declarations. Note that the result may be null.
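
        Because the returned array may be null, callers may want to make the absent case explicit. The helper below is a minimal sketch of one way to do that. NullableBytesSketch and its methods are hypothetical and not part of Flink; the byte array parameter merely stands in for the value returned by getSerializedWatermarkDeclarations().

```java
import java.util.Optional;

/** Hypothetical helper for handling a possibly-null serialized payload. */
public class NullableBytesSketch {

    /** Wraps a possibly-null byte array so the absent case must be handled explicitly. */
    public static Optional<byte[]> wrap(byte[] serialized) {
        return Optional.ofNullable(serialized);
    }

    /** Returns the payload length, treating a null payload as empty. */
    public static int lengthOrZero(byte[] serialized) {
        return wrap(serialized).map(bytes -> bytes.length).orElse(0);
    }
}
```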