T
- the type of the table aggregation resultACC
- the type of the table aggregation accumulator. The accumulator is used to keep the
aggregated values which are needed to compute an aggregation result.
TableAggregateFunction represents its state using accumulator, thereby the state of
the TableAggregateFunction must be put into the accumulator.@PublicEvolving public abstract class TableAggregateFunction<T,ACC> extends UserDefinedAggregateFunction<T,ACC>
The behavior of a TableAggregateFunction
can be defined by implementing a series of
custom methods. A TableAggregateFunction
needs at least three methods:
There is another method that can be optional to have:
All these methods must be declared publicly, not static, and named exactly as the names
mentioned above. The method UserDefinedAggregateFunction.createAccumulator()
is defined in
the UserDefinedAggregateFunction
functions, while other methods are explained below.
Processes the input values and update the provided accumulator instance. The method
accumulate can be overloaded with different custom types and arguments. A TableAggregateFunction
requires at least one accumulate() method.
param: accumulator the accumulator which contains the current aggregated results
param: [user defined inputs] the input value (usually obtained from a new arrived data).
public void accumulate(ACC accumulator, [user defined inputs])
Retracts the input values from the accumulator instance. The current design assumes the
inputs are the values that have been previously accumulated. The method retract can be
overloaded with different custom types and arguments.
param: accumulator the accumulator which contains the current aggregated results
param: [user defined inputs] the input value (usually obtained from a new arrived data).
public void retract(ACC accumulator, [user defined inputs])
Called every time when an aggregation result should be materialized. The returned value could
be either an early and incomplete result (periodically emitted as data arrive) or the final
result of the aggregation.
param: accumulator the accumulator which contains the current aggregated results
param: out the collector used to output data.
public void emitValue(ACC accumulator, Collector<T> out)
Called every time when an aggregation result should be materialized. The returned value could
be either an early and incomplete result (periodically emitted as data arrive) or the final
result of the aggregation.
Different from emitValue, emitUpdateWithRetract is used to emit values that have been updated.
This method outputs data incrementally in retract mode, i.e., once there is an update, we have
to retract old records before sending new updated ones. The emitUpdateWithRetract method will be
used in preference to the emitValue method if both methods are defined in the table aggregate
function, because the method is treated to be more efficient than emitValue as it can output
values incrementally.
param: accumulator the accumulator which contains the current aggregated results
param: out the retractable collector used to output data. Use collect method
to output(add) records and use retract method to retract(delete)
records.
public void emitUpdateWithRetract(ACC accumulator, RetractableCollector<T> out)
Modifier and Type | Class and Description |
---|---|
static interface |
TableAggregateFunction.RetractableCollector<T>
Collects a record and forwards it.
|
Constructor and Description |
---|
TableAggregateFunction() |
Modifier and Type | Method and Description |
---|---|
FunctionKind |
getKind()
Returns the kind of function this definition describes.
|
createAccumulator, getAccumulatorType, getResultType
close, functionIdentifier, open, toString
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
getRequirements, isDeterministic
public final FunctionKind getKind()
FunctionDefinition
Copyright © 2014–2019 The Apache Software Foundation. All rights reserved.