@Experimental public final class DataStreamUtils extends Object
A collection of utilities for DataStreams.
Modifier and Type | Class and Description |
---|---|
static class | DataStreamUtils.ClientAndIterator<E> A pair of an Iterator to receive results from a streaming application and a JobClient to interact with the program. |
Modifier and Type | Method and Description |
---|---|
static <OUT> Iterator<OUT> | collect(DataStream<OUT> stream) Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream. |
static <OUT> Iterator<OUT> | collect(DataStream<OUT> stream, String executionJobName) Triggers the distributed execution of the streaming dataflow and returns an iterator over the elements of the given DataStream. |
static <E> List<E> | collectBoundedStream(DataStream<E> stream, String jobName) Collects the contents of the given DataStream into a list, assuming that the stream is a bounded stream. |
static <E> List<E> | collectRecordsFromUnboundedStream(DataStreamUtils.ClientAndIterator<E> client, int numElements) |
static <E> List<E> | collectUnboundedStream(DataStream<E> stream, int numElements, String jobName) Triggers execution of the DataStream application and collects the given number of records from the stream. |
static <OUT> DataStreamUtils.ClientAndIterator<OUT> | collectWithClient(DataStream<OUT> stream, String jobExecutionName) Starts the execution of the program and returns an iterator to read the result of the given data stream, plus a JobClient to interact with the application execution. |
static <T,K> KeyedStream<T,K> | reinterpretAsKeyedStream(DataStream<T> stream, org.apache.flink.api.java.functions.KeySelector<T,K> keySelector) |
static <T,K> KeyedStream<T,K> | reinterpretAsKeyedStream(DataStream<T> stream, org.apache.flink.api.java.functions.KeySelector<T,K> keySelector, org.apache.flink.api.common.typeinfo.TypeInformation<K> typeInfo) |
public static <OUT> Iterator<OUT> collect(DataStream<OUT> stream)
The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.
public static <OUT> Iterator<OUT> collect(DataStream<OUT> stream, String executionJobName)
The DataStream application is executed in the regular distributed manner on the target environment, and the events from the stream are polled back to this application process and thread through Flink's REST API.
public static <OUT> DataStreamUtils.ClientAndIterator<OUT> collectWithClient(DataStream<OUT> stream, String jobExecutionName) throws Exception
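Starts the execution of the program and returns an iterator to read the result of the given data stream, plus a JobClient to interact with the application execution.
Throws:
Exception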
public static <E> List<E> collectBoundedStream(DataStream<E> stream, String jobName) throws Exception
This method blocks until the job execution is complete. By the time the method returns, the job will have reached its FINISHED status.
Note that if the stream is unbounded, this method will never return and might fail with an Out-of-Memory Error because it attempts to collect an infinite stream into a list.
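Throws:
Exception - Exceptions that occur during the execution are forwarded.
public static <E> List<E> collectUnboundedStream(DataStream<E> stream, int numElements, String jobName) throws Exception
Triggers execution of the DataStream application and collects the given number of records from the stream.
Throws:
Exception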
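The difference between collecting a bounded stream and collecting a fixed number of records from an unbounded one can be illustrated without Flink. The following is a minimal plain-Java sketch, not Flink's implementation: the class CollectSketch and the helpers drainAll and takeFirst are hypothetical names, and a java.util.Iterator stands in for the result iterator. Returning a shorter list when the source ends early is a choice made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

/** Plain-Java illustration only; none of these helpers are part of Flink. */
final class CollectSketch {

    /** Drains the iterator completely; only terminates if the source is finite. */
    static <E> List<E> drainAll(Iterator<E> results) {
        List<E> out = new ArrayList<>();
        while (results.hasNext()) {
            out.add(results.next());
        }
        return out;
    }

    /** Stops after numElements records, so it also terminates on an endless source. */
    static <E> List<E> takeFirst(Iterator<E> results, int numElements) {
        if (numElements <= 0) {
            throw new IllegalArgumentException("numElements must be > 0");
        }
        List<E> out = new ArrayList<>(numElements);
        // Check the size first so the iterator is not polled once enough records exist.
        while (out.size() < numElements && results.hasNext()) {
            out.add(results.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Finite source: draining everything is safe.
        System.out.println(drainAll(List.of("a", "b", "c").iterator()));

        // Endless source: only the count-limited variant returns.
        Iterator<Integer> endless = Stream.iterate(0, i -> i + 1).iterator();
        System.out.println(takeFirst(endless, 5));
    }
}
```

Calling drainAll on the endless iterator would loop until memory runs out, which mirrors the warning given for collectBoundedStream on an unbounded stream.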
public static <E> List<E> collectRecordsFromUnboundedStream(DataStreamUtils.ClientAndIterator<E> client, int numElements)
public static <T,K> KeyedStream<T,K> reinterpretAsKeyedStream(DataStream<T> stream, org.apache.flink.api.java.functions.KeySelector<T,K> keySelector)
Reinterprets the given DataStream as a KeyedStream, which extracts keys with the given KeySelector.
IMPORTANT: For every partition of the base stream, the keys of events in the base stream must be partitioned exactly as they would be if the stream had been created through DataStream.keyBy(KeySelector).
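The partitioning precondition can be pictured as a check over the records of each partition. The following is a minimal plain-Java sketch under stated assumptions: the class KeyPartitioningSketch and both helpers are hypothetical, and partitionFor is a simple hash-modulo stand-in, not the key-to-partition assignment that Flink actually uses. It only shows the shape of the invariant: every record must already sit in the partition that its extracted key maps to.

```java
import java.util.List;
import java.util.function.Function;

/** Plain-Java illustration only; the partitioner here is a stand-in, not Flink's. */
final class KeyPartitioningSketch {

    /** Hypothetical stand-in for the partition a keyBy would choose for a key. */
    static int partitionFor(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** True if every record already sits in the partition its key maps to. */
    static <T, K> boolean isPartitionedLikeKeyBy(
            List<List<T>> partitions, Function<T, K> keySelector) {
        int parallelism = partitions.size();
        for (int p = 0; p < parallelism; p++) {
            for (T record : partitions.get(p)) {
                if (partitionFor(keySelector.apply(record), parallelism) != p) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Function<Integer, Integer> identityKey = x -> x;

        // Even keys in partition 0, odd keys in partition 1: consistent.
        System.out.println(isPartitionedLikeKeyBy(
                List.of(List.of(0, 2, 4), List.of(1, 3)), identityKey));

        // Key 3 sits in partition 0 but maps to partition 1: inconsistent.
        System.out.println(isPartitionedLikeKeyBy(
                List.of(List.of(0, 3), List.of(1)), identityKey));
    }
}
```

A stream that fails a check of this kind does not satisfy the precondition stated above and should be keyed with DataStream.keyBy(KeySelector) instead.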
Type Parameters:
T - Type of events in the data stream.
K - Type of the extracted keys.
Parameters:
stream - The data stream to reinterpret. For every partition, this stream must be partitioned exactly as it would be if it had been created through DataStream.keyBy(KeySelector).
keySelector - Function that defines how keys are extracted from the data stream.
Returns:
The reinterpretation of the DataStream as a KeyedStream.
public static <T,K> KeyedStream<T,K> reinterpretAsKeyedStream(DataStream<T> stream, org.apache.flink.api.java.functions.KeySelector<T,K> keySelector, org.apache.flink.api.common.typeinfo.TypeInformation<K> typeInfo)
Reinterprets the given DataStream as a KeyedStream, which extracts keys with the given KeySelector.
IMPORTANT: For every partition of the base stream, the keys of events in the base stream must be partitioned exactly as they would be if the stream had been created through DataStream.keyBy(KeySelector).
Type Parameters:
T - Type of events in the data stream.
K - Type of the extracted keys.
Parameters:
stream - The data stream to reinterpret. For every partition, this stream must be partitioned exactly as it would be if it had been created through DataStream.keyBy(KeySelector).
keySelector - Function that defines how keys are extracted from the data stream.
typeInfo - Explicit type information about the key type.
Returns:
The reinterpretation of the DataStream as a KeyedStream.
Copyright © 2014–2021 The Apache Software Foundation. All rights reserved.