T
- The type of the elements in the Keyed Stream.KEY
- The type of the key in the Keyed Stream.@Public public class KeyedStream<T,KEY> extends DataStream<T>
KeyedStream
represents a DataStream
on which operator state is
partitioned by key using a provided KeySelector
. Typical operations supported by a
DataStream
are also possible on a KeyedStream
, with the exception of
partitioning methods such as shuffle, forward and keyBy.
Reduce-style operations, such as reduce(org.apache.flink.api.common.functions.ReduceFunction<T>)
, sum(int)
and fold(R, org.apache.flink.api.common.functions.FoldFunction<T, R>)
work on
elements that have the same key.
Modifier and Type | Class and Description |
---|---|
static class |
KeyedStream.IntervalJoin<T1,T2,KEY>
Perform a join over a time interval.
|
static class |
KeyedStream.IntervalJoined<IN1,IN2,KEY>
IntervalJoined is a container for two streams that have keys for both sides as well as
the time boundaries over which elements should be joined.
|
environment, transformation
Constructor and Description |
---|
KeyedStream(DataStream<T> dataStream,
org.apache.flink.api.java.functions.KeySelector<T,KEY> keySelector)
Creates a new
KeyedStream using the given KeySelector
to partition operator state by key. |
KeyedStream(DataStream<T> dataStream,
org.apache.flink.api.java.functions.KeySelector<T,KEY> keySelector,
org.apache.flink.api.common.typeinfo.TypeInformation<KEY> keyType)
Creates a new
KeyedStream using the given KeySelector
to partition operator state by key. |
Modifier and Type | Method and Description |
---|---|
DataStreamSink<T> |
addSink(SinkFunction<T> sinkFunction)
Adds the given sink to this DataStream.
|
protected SingleOutputStreamOperator<T> |
aggregate(AggregationFunction<T> aggregate) |
QueryableStateStream<KEY,T> |
asQueryableState(String queryableStateName)
Publishes the keyed stream as queryable ValueState instance.
|
<ACC> QueryableStateStream<KEY,ACC> |
asQueryableState(String queryableStateName,
org.apache.flink.api.common.state.FoldingStateDescriptor<T,ACC> stateDescriptor)
Deprecated.
will be removed in a future version
|
QueryableStateStream<KEY,T> |
asQueryableState(String queryableStateName,
org.apache.flink.api.common.state.ReducingStateDescriptor<T> stateDescriptor)
Publishes the keyed stream as a queryable ReducingState instance.
|
QueryableStateStream<KEY,T> |
asQueryableState(String queryableStateName,
org.apache.flink.api.common.state.ValueStateDescriptor<T> stateDescriptor)
Publishes the keyed stream as a queryable ValueState instance.
|
WindowedStream<T,KEY,GlobalWindow> |
countWindow(long size)
Windows this
KeyedStream into tumbling count windows. |
WindowedStream<T,KEY,GlobalWindow> |
countWindow(long size,
long slide)
Windows this
KeyedStream into sliding count windows. |
protected <R> SingleOutputStreamOperator<R> |
doTransform(String operatorName,
org.apache.flink.api.common.typeinfo.TypeInformation<R> outTypeInfo,
StreamOperatorFactory<R> operatorFactory) |
<R> SingleOutputStreamOperator<R> |
fold(R initialValue,
org.apache.flink.api.common.functions.FoldFunction<T,R> folder)
Deprecated.
will be removed in a future version
|
org.apache.flink.api.java.functions.KeySelector<T,KEY> |
getKeySelector()
Gets the key selector that can get the key by which the stream if partitioned from the elements.
|
org.apache.flink.api.common.typeinfo.TypeInformation<KEY> |
getKeyType()
Gets the type of the key by which the stream is partitioned.
|
<T1> KeyedStream.IntervalJoin<T,T1,KEY> |
intervalJoin(KeyedStream<T1,KEY> otherStream)
Join elements of this
KeyedStream with elements of another KeyedStream over
a time interval that can be specified with KeyedStream.IntervalJoin.between(Time, Time) . |
SingleOutputStreamOperator<T> |
max(int positionToMax)
Applies an aggregation that gives the current maximum of the data stream
at the given position by the given key.
|
SingleOutputStreamOperator<T> |
max(String field)
Applies an aggregation that gives the current maximum of the
data stream at the given field expression by the given key.
|
SingleOutputStreamOperator<T> |
maxBy(int positionToMaxBy)
Applies an aggregation that gives the current element with the
maximum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
maxBy(int positionToMaxBy,
boolean first)
Applies an aggregation that gives the current element with the
maximum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
maxBy(String positionToMaxBy)
Applies an aggregation that gives the current element with the
maximum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
maxBy(String field,
boolean first)
Applies an aggregation that gives the current maximum element of the
data stream by the given field expression by the given key.
|
SingleOutputStreamOperator<T> |
min(int positionToMin)
Applies an aggregation that gives the current minimum of the data
stream at the given position by the given key.
|
SingleOutputStreamOperator<T> |
min(String field)
Applies an aggregation that gives the current minimum of the
data stream at the given field expression by the given key.
|
SingleOutputStreamOperator<T> |
minBy(int positionToMinBy)
Applies an aggregation that gives the current element with the
minimum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
minBy(int positionToMinBy,
boolean first)
Applies an aggregation that gives the current element with the
minimum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
minBy(String positionToMinBy)
Applies an aggregation that gives the current element with the
minimum value at the given position by the given key.
|
SingleOutputStreamOperator<T> |
minBy(String field,
boolean first)
Applies an aggregation that gives the current minimum element of the
data stream by the given field expression by the given key.
|
<R> SingleOutputStreamOperator<R> |
process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction)
Applies the given
KeyedProcessFunction on the input stream, thereby creating a transformed output stream. |
<R> SingleOutputStreamOperator<R> |
process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction,
org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
Applies the given
KeyedProcessFunction on the input stream, thereby creating a transformed output stream. |
<R> SingleOutputStreamOperator<R> |
process(ProcessFunction<T,R> processFunction)
Deprecated.
|
<R> SingleOutputStreamOperator<R> |
process(ProcessFunction<T,R> processFunction,
org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
Deprecated.
|
SingleOutputStreamOperator<T> |
reduce(org.apache.flink.api.common.functions.ReduceFunction<T> reducer)
Applies a reduce transformation on the grouped data stream grouped on by
the given key position.
|
protected DataStream<T> |
setConnectionType(StreamPartitioner<T> partitioner)
Internal function for setting the partitioner for the DataStream.
|
SingleOutputStreamOperator<T> |
sum(int positionToSum)
Applies an aggregation that gives a rolling sum of the data stream at the
given position grouped by the given key.
|
SingleOutputStreamOperator<T> |
sum(String field)
Applies an aggregation that gives the current sum of the data
stream at the given field by the given key.
|
WindowedStream<T,KEY,TimeWindow> |
timeWindow(Time size)
Windows this
KeyedStream into tumbling time windows. |
WindowedStream<T,KEY,TimeWindow> |
timeWindow(Time size,
Time slide)
Windows this
KeyedStream into sliding time windows. |
<W extends Window> |
window(WindowAssigner<? super T,W> assigner)
Windows this data stream to a
WindowedStream , which evaluates windows
over a key grouped stream. |
assignTimestampsAndWatermarks, assignTimestampsAndWatermarks, assignTimestampsAndWatermarks, broadcast, broadcast, clean, coGroup, connect, connect, countWindowAll, countWindowAll, filter, flatMap, flatMap, forward, getExecutionConfig, getExecutionEnvironment, getId, getMinResources, getParallelism, getPreferredResources, getTransformation, getType, global, iterate, iterate, join, keyBy, keyBy, keyBy, keyBy, map, map, partitionCustom, partitionCustom, partitionCustom, print, print, printToErr, printToErr, project, rebalance, rescale, shuffle, split, timeWindowAll, timeWindowAll, transform, transform, union, windowAll, writeAsCsv, writeAsCsv, writeAsCsv, writeAsText, writeAsText, writeToSocket, writeUsingOutputFormat
public KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T,KEY> keySelector)
KeyedStream
using the given KeySelector
to partition operator state by key.dataStream
- Base stream of datakeySelector
- Function for determining state partitionspublic KeyedStream(DataStream<T> dataStream, org.apache.flink.api.java.functions.KeySelector<T,KEY> keySelector, org.apache.flink.api.common.typeinfo.TypeInformation<KEY> keyType)
KeyedStream
using the given KeySelector
to partition operator state by key.dataStream
- Base stream of datakeySelector
- Function for determining state partitions@Internal public org.apache.flink.api.java.functions.KeySelector<T,KEY> getKeySelector()
@Internal public org.apache.flink.api.common.typeinfo.TypeInformation<KEY> getKeyType()
protected DataStream<T> setConnectionType(StreamPartitioner<T> partitioner)
DataStream
setConnectionType
in class DataStream<T>
partitioner
- Partitioner to set.protected <R> SingleOutputStreamOperator<R> doTransform(String operatorName, org.apache.flink.api.common.typeinfo.TypeInformation<R> outTypeInfo, StreamOperatorFactory<R> operatorFactory)
doTransform
in class DataStream<T>
public DataStreamSink<T> addSink(SinkFunction<T> sinkFunction)
DataStream
StreamExecutionEnvironment.execute()
method is called.addSink
in class DataStream<T>
sinkFunction
- The object containing the sink's invoke function.@Deprecated @PublicEvolving public <R> SingleOutputStreamOperator<R> process(ProcessFunction<T,R> processFunction)
process(KeyedProcessFunction)
ProcessFunction
on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero
or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing
of set timers the function can directly emit elements and/or register yet more timers.
process
in class DataStream<T>
R
- The type of elements emitted by the ProcessFunction
.processFunction
- The ProcessFunction
that is called for each element
in the stream.DataStream
.@Deprecated @Internal public <R> SingleOutputStreamOperator<R> process(ProcessFunction<T,R> processFunction, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
process(KeyedProcessFunction, TypeInformation)
ProcessFunction
on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero
or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing
of set timers the function can directly emit elements and/or register yet more timers.
process
in class DataStream<T>
R
- The type of elements emitted by the ProcessFunction
.processFunction
- The ProcessFunction
that is called for each element
in the stream.outputType
- TypeInformation
for the result type of the function.DataStream
.@PublicEvolving public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction)
KeyedProcessFunction
on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero
or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing
of set timers the function can directly emit elements and/or register yet more timers.
R
- The type of elements emitted by the KeyedProcessFunction
.keyedProcessFunction
- The KeyedProcessFunction
that is called for each element in the stream.DataStream
.@Internal public <R> SingleOutputStreamOperator<R> process(KeyedProcessFunction<KEY,T,R> keyedProcessFunction, org.apache.flink.api.common.typeinfo.TypeInformation<R> outputType)
KeyedProcessFunction
on the input stream, thereby creating a transformed output stream.
The function will be called for every element in the input streams and can produce zero
or more output elements. Contrary to the DataStream.flatMap(FlatMapFunction)
function, this function can also query the time and set timers. When reacting to the firing
of set timers the function can directly emit elements and/or register yet more timers.
R
- The type of elements emitted by the KeyedProcessFunction
.keyedProcessFunction
- The KeyedProcessFunction
that is called for each element in the stream.outputType
- TypeInformation
for the result type of the function.DataStream
.@PublicEvolving public <T1> KeyedStream.IntervalJoin<T,T1,KEY> intervalJoin(KeyedStream<T1,KEY> otherStream)
KeyedStream
with elements of another KeyedStream
over
a time interval that can be specified with KeyedStream.IntervalJoin.between(Time, Time)
.T1
- Type parameter of elements in the other streamotherStream
- The other keyed stream to join this keyed stream withKeyedStream.IntervalJoin
with this keyed stream and the other keyed streampublic WindowedStream<T,KEY,TimeWindow> timeWindow(Time size)
KeyedStream
into tumbling time windows.
This is a shortcut for either .window(TumblingEventTimeWindows.of(size))
or
.window(TumblingProcessingTimeWindows.of(size))
depending on the time characteristic
set using
StreamExecutionEnvironment.setStreamTimeCharacteristic(org.apache.flink.streaming.api.TimeCharacteristic)
size
- The size of the window.public WindowedStream<T,KEY,TimeWindow> timeWindow(Time size, Time slide)
KeyedStream
into sliding time windows.
This is a shortcut for either .window(SlidingEventTimeWindows.of(size, slide))
or
.window(SlidingProcessingTimeWindows.of(size, slide))
depending on the time
characteristic set using
StreamExecutionEnvironment.setStreamTimeCharacteristic(org.apache.flink.streaming.api.TimeCharacteristic)
size
- The size of the window.public WindowedStream<T,KEY,GlobalWindow> countWindow(long size)
KeyedStream
into tumbling count windows.size
- The size of the windows in number of elements.public WindowedStream<T,KEY,GlobalWindow> countWindow(long size, long slide)
KeyedStream
into sliding count windows.size
- The size of the windows in number of elements.slide
- The slide interval in number of elements.@PublicEvolving public <W extends Window> WindowedStream<T,KEY,W> window(WindowAssigner<? super T,W> assigner)
WindowedStream
, which evaluates windows
over a key grouped stream. Elements are put into windows by a WindowAssigner
. The
grouping of elements is done both by key and by window.
A Trigger
can be defined to
specify when windows are evaluated. However, WindowAssigners
have a default
Trigger
that is used if a Trigger
is not specified.
assigner
- The WindowAssigner
that assigns elements to windows.public SingleOutputStreamOperator<T> reduce(org.apache.flink.api.common.functions.ReduceFunction<T> reducer)
ReduceFunction
will receive input
values based on the key value. Only input values with the same key will
go to the same reducer.reducer
- The ReduceFunction
that will be called for every
element of the input values with the same key.@Deprecated public <R> SingleOutputStreamOperator<R> fold(R initialValue, org.apache.flink.api.common.functions.FoldFunction<T,R> folder)
FoldFunction
will receive input
values based on the key value. Only input values with the same key will
go to the same folder.folder
- The FoldFunction
that will be called for every element
of the input values with the same key.initialValue
- The initialValue passed to the folders for each key.public SingleOutputStreamOperator<T> sum(int positionToSum)
positionToSum
- The field position in the data points to sum. This is applicable to
Tuple types, basic and primitive array types, Scala case classes,
and primitive types (which is considered as having one field).public SingleOutputStreamOperator<T> sum(String field)
field
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).public SingleOutputStreamOperator<T> min(int positionToMin)
positionToMin
- The field position in the data points to minimize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).public SingleOutputStreamOperator<T> min(String field)
DataStream
's underlying type. A dot can be used to drill down into
objects, as in "field1.fieldxy"
.field
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).public SingleOutputStreamOperator<T> max(int positionToMax)
positionToMax
- The field position in the data points to maximize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).public SingleOutputStreamOperator<T> max(String field)
DataStream
's underlying type. A dot can be used to drill down into
objects, as in "field1.fieldxy"
.field
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).public SingleOutputStreamOperator<T> minBy(String field, boolean first)
DataStream
's underlying type. A dot can be used to drill down into
objects, as in "field1.fieldxy"
.field
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).first
- If True then in case of field equality the first object will
be returnedpublic SingleOutputStreamOperator<T> maxBy(String field, boolean first)
DataStream
's underlying type. A dot can be used to drill down into
objects, as in "field1.fieldxy"
.field
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).first
- If True then in case of field equality the first object will
be returnedpublic SingleOutputStreamOperator<T> minBy(int positionToMinBy)
positionToMinBy
- The field position in the data points to minimize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).public SingleOutputStreamOperator<T> minBy(String positionToMinBy)
positionToMinBy
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).public SingleOutputStreamOperator<T> minBy(int positionToMinBy, boolean first)
positionToMinBy
- The field position in the data points to minimize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).first
- If true, then the operator return the first element with the
minimal value, otherwise returns the lastpublic SingleOutputStreamOperator<T> maxBy(int positionToMaxBy)
positionToMaxBy
- The field position in the data points to minimize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).public SingleOutputStreamOperator<T> maxBy(String positionToMaxBy)
positionToMaxBy
- In case of a POJO, Scala case class, or Tuple type, the
name of the (public) field on which to perform the aggregation.
Additionally, a dot can be used to drill down into nested
objects, as in "field1.fieldxy"
.
Furthermore "*" can be specified in case of a basic type
(which is considered as having only one field).public SingleOutputStreamOperator<T> maxBy(int positionToMaxBy, boolean first)
positionToMaxBy
- The field position in the data points to minimize. This is applicable to
Tuple types, Scala case classes, and primitive types (which is considered
as having one field).first
- If true, then the operator return the first element with the
maximum value, otherwise returns the lastprotected SingleOutputStreamOperator<T> aggregate(AggregationFunction<T> aggregate)
@PublicEvolving public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName)
queryableStateName
- Name under which to the publish the queryable state instance@PublicEvolving public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ValueStateDescriptor<T> stateDescriptor)
queryableStateName
- Name under which to the publish the queryable state instancestateDescriptor
- State descriptor to create state instance from@PublicEvolving @Deprecated public <ACC> QueryableStateStream<KEY,ACC> asQueryableState(String queryableStateName, org.apache.flink.api.common.state.FoldingStateDescriptor<T,ACC> stateDescriptor)
queryableStateName
- Name under which to the publish the queryable state instancestateDescriptor
- State descriptor to create state instance from@PublicEvolving public QueryableStateStream<KEY,T> asQueryableState(String queryableStateName, org.apache.flink.api.common.state.ReducingStateDescriptor<T> stateDescriptor)
queryableStateName
- Name under which to the publish the queryable state instancestateDescriptor
- State descriptor to create state instance fromCopyright © 2014–2020 The Apache Software Foundation. All rights reserved.