Package | Description |
---|---|
org.apache.flink.api.java | |
org.apache.flink.api.java.operators | |
org.apache.flink.api.java.operators.join | |
org.apache.flink.api.java.utils | |
Modifier and Type | Method and Description |
---|---|
<X> DataSet<X> |
DataSet.runOperation(CustomUnaryOperation<T,X> operation)
Runs a CustomUnaryOperation on the data set. |
Modifier and Type | Method and Description |
---|---|
protected static void |
DataSet.checkSameExecutionContext(DataSet<?> set1,
DataSet<?> set2) |
<R> CoGroupOperator.CoGroupOperatorSets<T,R> |
DataSet.coGroup(DataSet<R> other)
Initiates a CoGroup transformation.
A CoGroup transformation combines the elements of two DataSets into one DataSet. |
<R> CrossOperator.DefaultCross<T,R> |
DataSet.cross(DataSet<R> other)
Initiates a Cross transformation.
A Cross transformation combines the elements of two DataSets into one DataSet. |
<R> CrossOperator.DefaultCross<T,R> |
DataSet.crossWithHuge(DataSet<R> other)
Initiates a Cross transformation.
A Cross transformation combines the elements of two DataSets into one DataSet. |
<R> CrossOperator.DefaultCross<T,R> |
DataSet.crossWithTiny(DataSet<R> other)
Initiates a Cross transformation.
A Cross transformation combines the elements of two DataSets into one DataSet. |
<R> JoinOperatorSetsBase<T,R> |
DataSet.fullOuterJoin(DataSet<R> other)
Initiates a Full Outer Join transformation.
An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of both DataSets that do not have a matching element on the opposing side are joined with null and emitted to the
resulting DataSet. |
<R> JoinOperatorSetsBase<T,R> |
DataSet.fullOuterJoin(DataSet<R> other,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint strategy)
Initiates a Full Outer Join transformation.
An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of both DataSets that do not have a matching element on the opposing side are joined with null and emitted to the
resulting DataSet. |
<R> DeltaIteration<T,R> |
DataSet.iterateDelta(DataSet<R> workset,
int maxIterations,
int... keyPositions)
Initiates a delta iteration.
|
<R> JoinOperator.JoinOperatorSets<T,R> |
DataSet.join(DataSet<R> other)
Initiates a Join transformation.
|
<R> JoinOperator.JoinOperatorSets<T,R> |
DataSet.join(DataSet<R> other,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint strategy)
Initiates a Join transformation.
|
<R> JoinOperator.JoinOperatorSets<T,R> |
DataSet.joinWithHuge(DataSet<R> other)
Initiates a Join transformation.
A Join transformation joins the elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. This method also gives the hint to the optimizer that the second DataSet to join is much larger than the first one. This method returns a JoinOperator.JoinOperatorSets on which one of the where methods
can be called to define the join key of the first joining (i.e., this) DataSet. |
<R> JoinOperator.JoinOperatorSets<T,R> |
DataSet.joinWithTiny(DataSet<R> other)
Initiates a Join transformation.
|
<R> JoinOperatorSetsBase<T,R> |
DataSet.leftOuterJoin(DataSet<R> other)
Initiates a Left Outer Join transformation.
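An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of the left DataSet (i.e. this) that do not have a matching element on the other side are joined with null and emitted to the
resulting DataSet. |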
<R> JoinOperatorSetsBase<T,R> |
DataSet.leftOuterJoin(DataSet<R> other,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint strategy)
Initiates a Left Outer Join transformation.
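An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of the left DataSet (i.e. this) that do not have a matching element on the other side are joined with null and emitted to the
resulting DataSet. |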
<R> JoinOperatorSetsBase<T,R> |
DataSet.rightOuterJoin(DataSet<R> other)
Initiates a Right Outer Join transformation.
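An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of the right DataSet (i.e. other) that do not have a matching element on the other side are joined with null and emitted to the
resulting DataSet. |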
<R> JoinOperatorSetsBase<T,R> |
DataSet.rightOuterJoin(DataSet<R> other,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint strategy)
Initiates a Right Outer Join transformation.
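An Outer Join transformation joins two elements of two DataSets on key equality and provides multiple ways to combine
joining elements into one DataSet. Elements of the right DataSet (i.e. other) that do not have a matching element on the other side are joined with null and emitted to the
resulting DataSet. |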
UnionOperator<T> |
DataSet.union(DataSet<T> other)
Creates a union of this DataSet with another DataSet.
|
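The join rows in the table above differ mainly in what happens to elements without a matching key on the other side. The following plain-Java sketch illustrates those semantics on in-memory lists; it does not use the Flink API and is not how Flink executes joins. The class `OuterJoinSketch`, the `join` helper, its `keepLeft`/`keepRight` flags, and the choice of the first character as the join key are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OuterJoinSketch {

    // Hypothetical key choice for illustration: the first character of each element.
    static char key(String element) {
        return element.charAt(0);
    }

    /**
     * Joins two lists on key equality. Unmatched left elements are emitted
     * (paired with null) only when keepLeft is true; unmatched right elements
     * only when keepRight is true.
     */
    static List<String> join(List<String> left, List<String> right,
                             boolean keepLeft, boolean keepRight) {
        List<String> result = new ArrayList<>();
        boolean[] rightMatched = new boolean[right.size()];
        for (String l : left) {
            boolean matched = false;
            for (int i = 0; i < right.size(); i++) {
                if (key(l) == key(right.get(i))) {
                    result.add(l + "," + right.get(i));
                    rightMatched[i] = true;
                    matched = true;
                }
            }
            if (!matched && keepLeft) {
                result.add(l + ",null");
            }
        }
        if (keepRight) {
            for (int i = 0; i < right.size(); i++) {
                if (!rightMatched[i]) {
                    result.add("null," + right.get(i));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("a1", "b1");
        List<String> right = Arrays.asList("a2", "c2");
        System.out.println("inner:       " + join(left, right, false, false));
        System.out.println("left outer:  " + join(left, right, true, false));
        System.out.println("right outer: " + join(left, right, false, true));
        System.out.println("full outer:  " + join(left, right, true, true));
    }
}
```

In this sketch, `keepLeft = true, keepRight = false` corresponds to a left outer join, the reverse to a right outer join, and both flags set to a full outer join.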
Modifier and Type | Class and Description |
---|---|
class |
AggregateOperator<IN>
This operator represents the application of an "aggregate" operation on a data set, and the
result data set produced by the function.
|
class |
BulkIterationResultSet<T> |
class |
CoGroupOperator<I1,I2,OUT>
A DataSet that is the result of a CoGroup transformation. |
class |
CoGroupRawOperator<I1,I2,OUT>
A DataSet that is the result of a CoGroup transformation. |
class |
CrossOperator<I1,I2,OUT>
A DataSet that is the result of a Cross transformation. |
static class |
CrossOperator.DefaultCross<I1,I2>
A Cross transformation that wraps pairs of crossed elements into Tuple2.
It also represents the DataSet that is the result of a Cross transformation. |
static class |
CrossOperator.ProjectCross<I1,I2,OUT extends org.apache.flink.api.java.tuple.Tuple>
A Cross transformation that projects crossing elements or fields of crossing Tuples
into result Tuples. |
class |
DataSource<OUT>
An operation that creates a new data set (data source).
|
static class |
DeltaIteration.SolutionSetPlaceHolder<ST>
A DataSet that acts as a placeholder for the solution set during the iteration. |
static class |
DeltaIteration.WorksetPlaceHolder<WT>
A DataSet that acts as a placeholder for the workset during the iteration. |
class |
DeltaIterationResultSet<ST,WT> |
class |
DistinctOperator<T>
This operator represents the application of a "distinct" function on a data set, and the
result data set produced by the function.
|
class |
FilterOperator<T>
This operator represents the application of a "filter" function on a data set, and the
result data set produced by the function.
|
class |
FlatMapOperator<IN,OUT>
This operator represents the application of a "flatMap" function on a data set, and the
result data set produced by the function.
|
class |
GroupCombineOperator<IN,OUT>
This operator behaves like the GroupReduceOperator with Combine but only runs the Combine part which reduces all data
locally in their partitions.
|
class |
GroupReduceOperator<IN,OUT>
This operator represents the application of a "reduceGroup" function on a data set, and the
result data set produced by the function.
|
class |
IterativeDataSet<T>
The IterativeDataSet represents the start of an iteration.
|
class |
JoinOperator<I1,I2,OUT>
A DataSet that is the result of a Join transformation. |
static class |
JoinOperator.DefaultJoin<I1,I2>
A Join transformation that wraps pairs of joining elements into Tuple2.
It also represents the DataSet that is the result of a Join transformation. |
static class |
JoinOperator.EquiJoin<I1,I2,OUT>
A Join transformation that applies a JoinFunction on each pair of joining elements.
It also represents the DataSet that is the result of a Join transformation. |
static class |
JoinOperator.ProjectJoin<I1,I2,OUT extends org.apache.flink.api.java.tuple.Tuple>
A Join transformation that projects joining elements or fields of joining Tuples
into result Tuples. |
class |
MapOperator<IN,OUT>
This operator represents the application of a "map" function on a data set, and the
result data set produced by the function.
|
class |
MapPartitionOperator<IN,OUT>
This operator represents the application of a "mapPartition" function on a data set, and the
result data set produced by the function.
|
class |
Operator<OUT,O extends Operator<OUT,O>>
Base class of all operators in the Java API.
|
class |
PartitionOperator<T>
This operator represents a partitioning.
|
class |
ProjectOperator<IN,OUT extends org.apache.flink.api.java.tuple.Tuple>
This operator represents the application of a projection operation on a data set, and the
result data set produced by the function.
|
class |
ReduceOperator<IN>
This operator represents the application of a "reduce" function on a data set, and the
result data set produced by the function.
|
class |
SingleInputOperator<IN,OUT,O extends SingleInputOperator<IN,OUT,O>>
Base class for operations that operate on a single input data set.
|
class |
SingleInputUdfOperator<IN,OUT,O extends SingleInputUdfOperator<IN,OUT,O>>
The SingleInputUdfOperator is the base class of all unary operators that execute
user-defined functions (UDFs).
|
class |
SortPartitionOperator<T>
This operator represents a DataSet with locally sorted partitions.
|
class |
TwoInputOperator<IN1,IN2,OUT,O extends TwoInputOperator<IN1,IN2,OUT,O>>
Base class for operations that operate on two input data sets.
|
class |
TwoInputUdfOperator<IN1,IN2,OUT,O extends TwoInputUdfOperator<IN1,IN2,OUT,O>>
The TwoInputUdfOperator is the base class of all binary operators that execute
user-defined functions (UDFs).
|
class |
UnionOperator<T>
Java API operator for the union of two data sets.
|
Modifier and Type | Field and Description |
---|---|
protected DataSet<T> |
Grouping.inputDataSet |
Modifier and Type | Method and Description |
---|---|
DataSet<ST> |
DeltaIteration.closeWith(DataSet<ST> solutionSetDelta,
DataSet<WT> newWorkset)
Closes the delta iteration.
|
DataSet<T> |
IterativeDataSet.closeWith(DataSet<T> iterationResult)
Closes the iteration.
|
DataSet<T> |
IterativeDataSet.closeWith(DataSet<T> iterationResult,
DataSet<?> terminationCriterion)
Closes the iteration and specifies a termination criterion.
|
DataSet<OUT> |
CustomUnaryOperation.createResult() |
DataSet<T> |
DataSink.getDataSet() |
DataSet<ST> |
DeltaIteration.getInitialSolutionSet()
Gets the initial solution set.
|
DataSet<WT> |
DeltaIteration.getInitialWorkset()
Gets the initial workset.
|
DataSet<IN> |
SingleInputOperator.getInput()
Gets the data set that this operation uses as its input.
|
DataSet<IN1> |
TwoInputOperator.getInput1()
Gets the data set that this operation uses as its first input.
|
DataSet<IN2> |
TwoInputOperator.getInput2()
Gets the data set that this operation uses as its second input.
|
DataSet<T> |
Grouping.getInputDataSet()
Returns the input DataSet of a grouping operation, that is the one before the grouping.
|
DataSet<T> |
BulkIterationResultSet.getNextPartialSolution() |
DataSet<ST> |
DeltaIterationResultSet.getNextSolutionSet() |
DataSet<WT> |
DeltaIterationResultSet.getNextWorkset() |
DataSet<?> |
BulkIterationResultSet.getTerminationCriterion() |
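The iteration entries above (`IterativeDataSet.closeWith`, `getNextPartialSolution`, `getTerminationCriterion`) describe a bulk iteration that repeats a step function up to `maxIterations` times and can stop early based on a termination criterion. The plain-Java sketch below illustrates that control flow on an in-memory list, assuming the usual Flink convention that the iteration ends once the termination criterion data set is empty. It does not use the Flink API; `BulkIterationSketch`, `iterate`, and the example step and criterion functions are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

class BulkIterationSketch {

    /**
     * Applies the step function repeatedly, at most maxIterations times.
     * Stops early when the termination criterion yields no elements.
     */
    static List<Integer> iterate(List<Integer> initial,
                                 int maxIterations,
                                 UnaryOperator<List<Integer>> step,
                                 UnaryOperator<List<Integer>> terminationCriterion) {
        List<Integer> partialSolution = initial;
        for (int i = 0; i < maxIterations; i++) {
            partialSolution = step.apply(partialSolution);
            if (terminationCriterion.apply(partialSolution).isEmpty()) {
                break; // empty criterion: nothing left to improve
            }
        }
        return partialSolution;
    }

    public static void main(String[] args) {
        // Step: halve every element, but never go below 1.
        UnaryOperator<List<Integer>> halve = xs ->
                xs.stream().map(x -> Math.max(1, x / 2)).collect(Collectors.toList());
        // Criterion: the elements that are still greater than 1.
        UnaryOperator<List<Integer>> stillLarge = xs ->
                xs.stream().filter(x -> x > 1).collect(Collectors.toList());

        System.out.println(iterate(Arrays.asList(8, 3), 10, halve, stillLarge));
        System.out.println(iterate(Arrays.asList(8, 3), 2, halve, stillLarge));
    }
}
```

The second call shows the `maxIterations` bound taking effect before the criterion becomes empty.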
Modifier and Type | Method and Description |
---|---|
Map<String,DataSet<?>> |
TwoInputUdfOperator.getBroadcastSets() |
Map<String,DataSet<?>> |
UdfOperator.getBroadcastSets()
Gets the broadcast sets (name and data set) that have been added to context of the UDF.
|
Map<String,DataSet<?>> |
SingleInputUdfOperator.getBroadcastSets() |
Modifier and Type | Method and Description |
---|---|
DataSet<ST> |
DeltaIteration.closeWith(DataSet<ST> solutionSetDelta,
DataSet<WT> newWorkset)
Closes the delta iteration.
|
DataSet<T> |
IterativeDataSet.closeWith(DataSet<T> iterationResult)
Closes the iteration.
|
DataSet<T> |
IterativeDataSet.closeWith(DataSet<T> iterationResult,
DataSet<?> terminationCriterion)
Closes the iteration and specifies a termination criterion.
|
void |
CustomUnaryOperation.setInput(DataSet<IN> inputData) |
O |
TwoInputUdfOperator.withBroadcastSet(DataSet<?> data,
String name) |
O |
UdfOperator.withBroadcastSet(DataSet<?> data,
String name)
Adds a certain data set as a broadcast set to this operator.
|
O |
SingleInputUdfOperator.withBroadcastSet(DataSet<?> data,
String name) |
Constructor and Description |
---|
AggregateOperator(DataSet<IN> input,
Aggregations function,
int field,
String aggregateLocationName)
Non-grouped aggregation.
|
CoGroupOperator(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.CoGroupFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
List<org.apache.commons.lang3.tuple.Pair<Integer,org.apache.flink.api.common.operators.Order>> groupSortKeyOrderFirst,
List<org.apache.commons.lang3.tuple.Pair<Integer,org.apache.flink.api.common.operators.Order>> groupSortKeyOrderSecond,
org.apache.flink.api.common.functions.Partitioner<?> customPartitioner,
String defaultName) |
CoGroupOperator(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.CoGroupFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.functions.Partitioner<?> customPartitioner,
String defaultName) |
CoGroupOperatorSets(DataSet<I1> input1,
DataSet<I2> input2) |
CoGroupRawOperator(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.CoGroupFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
String defaultName) |
CrossOperator(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.functions.CrossFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.CrossOperatorBase.CrossHint hint,
String defaultName) |
CrossProjection(DataSet<I1> ds1,
DataSet<I2> ds2,
int[] firstFieldIndexes,
int[] secondFieldIndexes,
org.apache.flink.api.common.operators.base.CrossOperatorBase.CrossHint hint) |
DataSink(DataSet<T> data,
org.apache.flink.api.common.io.OutputFormat<T> format,
org.apache.flink.api.common.typeinfo.TypeInformation<T> type) |
DefaultCross(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.base.CrossOperatorBase.CrossHint hint,
String defaultName) |
DefaultJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
String joinLocationName,
JoinType type) |
DeltaIteration(ExecutionEnvironment context,
org.apache.flink.api.common.typeinfo.TypeInformation<ST> type,
DataSet<ST> solutionSet,
DataSet<WT> workset,
org.apache.flink.api.common.operators.Keys<ST> keys,
int maxIterations) |
DistinctOperator(DataSet<T> input,
org.apache.flink.api.common.operators.Keys<T> keys,
String distinctLocationName) |
EquiJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.FlatJoinFunction<I1,I2,OUT> generatedFunction,
org.apache.flink.api.common.functions.JoinFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
String joinLocationName) |
EquiJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.FlatJoinFunction<I1,I2,OUT> generatedFunction,
org.apache.flink.api.common.functions.JoinFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
String joinLocationName,
JoinType type) |
EquiJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.FlatJoinFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
String joinLocationName) |
EquiJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.functions.FlatJoinFunction<I1,I2,OUT> function,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
String joinLocationName,
JoinType type) |
FilterOperator(DataSet<T> input,
org.apache.flink.api.common.functions.FilterFunction<T> function,
String defaultName) |
FlatMapOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType,
org.apache.flink.api.common.functions.FlatMapFunction<IN,OUT> function,
String defaultName) |
GroupCombineOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType,
org.apache.flink.api.common.functions.GroupCombineFunction<IN,OUT> function,
String defaultName)
Constructor for a non-grouped reduce (all reduce).
|
Grouping(DataSet<T> set,
org.apache.flink.api.common.operators.Keys<T> keys) |
GroupReduceOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType,
org.apache.flink.api.common.functions.GroupReduceFunction<IN,OUT> function,
String defaultName)
Constructor for a non-grouped reduce (all reduce).
|
IterativeDataSet(ExecutionEnvironment context,
org.apache.flink.api.common.typeinfo.TypeInformation<T> type,
DataSet<T> input,
int maxIterations) |
JoinOperator(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> returnType,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
JoinType type) |
JoinOperatorSets(DataSet<I1> input1,
DataSet<I2> input2) |
JoinOperatorSets(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint) |
JoinProjection(DataSet<I1> ds1,
DataSet<I2> ds2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
int[] firstFieldIndexes,
int[] secondFieldIndexes) |
MapOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType,
org.apache.flink.api.common.functions.MapFunction<IN,OUT> function,
String defaultName) |
MapPartitionOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType,
org.apache.flink.api.common.functions.MapPartitionFunction<IN,OUT> function,
String defaultName) |
PartitionOperator(DataSet<T> input,
org.apache.flink.api.common.operators.Keys<T> pKeys,
org.apache.flink.api.common.functions.Partitioner<?> customPartitioner,
String partitionLocationName) |
PartitionOperator(DataSet<T> input,
org.apache.flink.api.common.operators.Keys<T> pKeys,
org.apache.flink.api.common.functions.Partitioner<P> customPartitioner,
org.apache.flink.api.common.typeinfo.TypeInformation<P> partitionerTypeInfo,
String partitionLocationName) |
PartitionOperator(DataSet<T> input,
org.apache.flink.api.common.operators.base.PartitionOperatorBase.PartitionMethod pMethod,
org.apache.flink.api.common.operators.Keys<T> pKeys,
String partitionLocationName) |
PartitionOperator(DataSet<T> input,
org.apache.flink.api.common.operators.base.PartitionOperatorBase.PartitionMethod pMethod,
String partitionLocationName) |
ProjectCross(DataSet<I1> input1,
DataSet<I2> input2,
int[] fields,
boolean[] isFromFirst,
org.apache.flink.api.java.typeutils.TupleTypeInfo<OUT> returnType,
CrossOperator.CrossProjection<I1,I2> crossProjection,
org.apache.flink.api.common.operators.base.CrossOperatorBase.CrossHint hint) |
ProjectCross(DataSet<I1> input1,
DataSet<I2> input2,
int[] fields,
boolean[] isFromFirst,
org.apache.flink.api.java.typeutils.TupleTypeInfo<OUT> returnType,
org.apache.flink.api.common.operators.base.CrossOperatorBase.CrossHint hint) |
Projection(DataSet<T> ds,
int[] fieldIndexes) |
ProjectJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
int[] fields,
boolean[] isFromFirst,
org.apache.flink.api.java.typeutils.TupleTypeInfo<OUT> returnType) |
ProjectJoin(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.Keys<I1> keys1,
org.apache.flink.api.common.operators.Keys<I2> keys2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
int[] fields,
boolean[] isFromFirst,
org.apache.flink.api.java.typeutils.TupleTypeInfo<OUT> returnType,
JoinOperator.JoinProjection<I1,I2> joinProj) |
ProjectOperator(DataSet<IN> input,
int[] fields,
org.apache.flink.api.java.typeutils.TupleTypeInfo<OUT> returnType) |
ReduceOperator(DataSet<IN> input,
org.apache.flink.api.common.functions.ReduceFunction<IN> function,
String defaultName)
Constructor for the reduce-all case (in contrast to the reduce-per-group case).
|
SingleInputOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType) |
SingleInputUdfOperator(DataSet<IN> input,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType)
Creates a new operator with the given data set as input.
|
SortedGrouping(DataSet<T> set,
org.apache.flink.api.common.operators.Keys<T> keys,
int field,
org.apache.flink.api.common.operators.Order order) |
SortedGrouping(DataSet<T> set,
org.apache.flink.api.common.operators.Keys<T> keys,
org.apache.flink.api.common.operators.Keys.SelectorFunctionKeys<T,K> keySelector,
org.apache.flink.api.common.operators.Order order) |
SortedGrouping(DataSet<T> set,
org.apache.flink.api.common.operators.Keys<T> keys,
String field,
org.apache.flink.api.common.operators.Order order) |
SortPartitionOperator(DataSet<T> dataSet,
int sortField,
org.apache.flink.api.common.operators.Order sortOrder,
String sortLocationName) |
SortPartitionOperator(DataSet<T> dataSet,
org.apache.flink.api.common.operators.Keys.SelectorFunctionKeys<T,K> sortKey,
org.apache.flink.api.common.operators.Order sortOrder,
String sortLocationName) |
SortPartitionOperator(DataSet<T> dataSet,
String sortField,
org.apache.flink.api.common.operators.Order sortOrder,
String sortLocationName) |
TwoInputOperator(DataSet<IN1> input1,
DataSet<IN2> input2,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType) |
TwoInputUdfOperator(DataSet<IN1> input1,
DataSet<IN2> input2,
org.apache.flink.api.common.typeinfo.TypeInformation<OUT> resultType)
Creates a new operator with the two given data sets as inputs.
|
UnionOperator(DataSet<T> input1,
DataSet<T> input2,
String unionLocationName)
Create an operator that produces the union of the two given data sets.
|
UnsortedGrouping(DataSet<T> set,
org.apache.flink.api.common.operators.Keys<T> keys) |
Modifier and Type | Field and Description |
---|---|
protected DataSet<I1> |
JoinOperatorSetsBase.input1 |
protected DataSet<I2> |
JoinOperatorSetsBase.input2 |
Constructor and Description |
---|
JoinOperatorSetsBase(DataSet<I1> input1,
DataSet<I2> input2) |
JoinOperatorSetsBase(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint) |
JoinOperatorSetsBase(DataSet<I1> input1,
DataSet<I2> input2,
org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint hint,
JoinType type) |
Modifier and Type | Method and Description |
---|---|
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Integer,Long>> |
DataSetUtils.countElementsPerPartition(DataSet<T> input)
Method that goes over all the elements in each partition in order to retrieve
the total number of elements.
|
static <T> DataSet<T> |
DataSetUtils.sampleWithSize(DataSet<T> input,
boolean withReplacement,
int numSamples)
Generates a sample of a DataSet that contains a fixed number of elements.
|
static <T> DataSet<T> |
DataSetUtils.sampleWithSize(DataSet<T> input,
boolean withReplacement,
int numSamples,
long seed)
Generates a sample of a DataSet that contains a fixed number of elements.
|
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Long,T>> |
DataSetUtils.zipWithIndex(DataSet<T> input)
Method that assigns a unique Long value to all elements in the input data set. |
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Long,T>> |
DataSetUtils.zipWithUniqueId(DataSet<T> input)
Method that assigns a unique Long value to all elements in the input data set in the following way:
- a map function is applied to the input data set
- each map task holds a counter c which is increased for each record
- c is shifted by n bits where n = log2(number of parallel tasks)
- to create a unique ID among all tasks, the task id is added to the counter
- for each record, the resulting counter is collected
|
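The ID scheme described for `zipWithUniqueId` (shift a per-task counter left, then add the task id) can be illustrated with a few lines of plain Java. This is a minimal sketch of the arithmetic only, not Flink's implementation. The names `UniqueIdSketch`, `shiftBits`, and `uniqueId` are hypothetical, and rounding n up to the number of bits needed for the largest task id is an assumption made here so that non-power-of-two parallelism also yields distinct IDs.

```java
import java.util.HashSet;
import java.util.Set;

class UniqueIdSketch {

    /** Number of bits needed to represent the largest task id (parallelism - 1). */
    static int shiftBits(int parallelism) {
        return 64 - Long.numberOfLeadingZeros((long) parallelism - 1);
    }

    /** Shifts the per-task counter left and adds the task id in the low bits. */
    static long uniqueId(long counter, int taskId, int parallelism) {
        return (counter << shiftBits(parallelism)) + taskId;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        Set<Long> seen = new HashSet<>();
        // Each task counts its own records independently, starting at 0.
        for (int taskId = 0; taskId < parallelism; taskId++) {
            for (long counter = 0; counter < 3; counter++) {
                long id = uniqueId(counter, taskId, parallelism);
                seen.add(id);
                System.out.println("task " + taskId + ", record " + counter + " -> id " + id);
            }
        }
        System.out.println("distinct ids: " + seen.size());
    }
}
```

Because the task id occupies the low bits and the counter the high bits, two tasks never produce the same value, but the IDs are not consecutive; `zipWithIndex` is the method listed above for assigning a unique Long value per element without this scheme.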
Modifier and Type | Method and Description |
---|---|
static <T> Utils.ChecksumHashCode |
DataSetUtils.checksumHashCode(DataSet<T> input)
Convenience method to get the count (number of elements) of a DataSet
as well as the checksum (sum over element hashes).
|
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Integer,Long>> |
DataSetUtils.countElementsPerPartition(DataSet<T> input)
Method that goes over all the elements in each partition in order to retrieve
the total number of elements.
|
static <T> MapPartitionOperator<T,T> |
DataSetUtils.sample(DataSet<T> input,
boolean withReplacement,
double fraction)
Generates a sample of a DataSet in which each element is selected with probability fraction.
|
static <T> MapPartitionOperator<T,T> |
DataSetUtils.sample(DataSet<T> input,
boolean withReplacement,
double fraction,
long seed)
Generates a sample of a DataSet in which each element is selected with probability fraction.
|
static <T> DataSet<T> |
DataSetUtils.sampleWithSize(DataSet<T> input,
boolean withReplacement,
int numSamples)
Generates a sample of a DataSet that contains a fixed number of elements.
|
static <T> DataSet<T> |
DataSetUtils.sampleWithSize(DataSet<T> input,
boolean withReplacement,
int numSamples,
long seed)
Generates a sample of a DataSet that contains a fixed number of elements.
|
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Long,T>> |
DataSetUtils.zipWithIndex(DataSet<T> input)
Method that assigns a unique Long value to all elements in the input data set. |
static <T> DataSet<org.apache.flink.api.java.tuple.Tuple2<Long,T>> |
DataSetUtils.zipWithUniqueId(DataSet<T> input)
Method that assigns a unique Long value to all elements in the input data set in the following way:
- a map function is applied to the input data set
- each map task holds a counter c which is increased for each record
- c is shifted by n bits where n = log2(number of parallel tasks)
- to create a unique ID among all tasks, the task id is added to the counter
- for each record, the resulting counter is collected
|
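The `sample` overloads listed above take a `fraction` and an optional `seed`. As a rough illustration of what sampling by probability with a fixed seed means, the plain-Java sketch below performs Bernoulli sampling without replacement over an in-memory list. It is one common technique under stated assumptions, not Flink's sampler; the class `FractionSampleSketch` and its `sample` helper are hypothetical, and the `withReplacement` option is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class FractionSampleSketch {

    /**
     * Keeps each element independently with probability fraction.
     * The same seed always produces the same selection for the same input.
     */
    static <T> List<T> sample(List<T> input, double fraction, long seed) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        Random random = new Random(seed);
        List<T> result = new ArrayList<>();
        for (T element : input) {
            // nextDouble() is uniform in [0, 1), so fraction 1.0 keeps everything.
            if (random.nextDouble() < fraction) {
                result.add(element);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(input, 0.5, 42L));
        // Same seed, same input: the selection is reproducible.
        System.out.println(sample(input, 0.5, 42L));
    }
}
```

With this approach the size of the result varies from run to run unless the seed is fixed, which is the practical difference from the `sampleWithSize` methods that return a fixed number of elements.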
Copyright © 2014–2016 The Apache Software Foundation. All rights reserved.