Interface ChangelogFunction
- All Superinterfaces:
FunctionDefinition
@PublicEvolving public interface ChangelogFunction extends FunctionDefinition
An extension that allows a process table function (PTF) to emit results with changelog semantics.
By default, a ProcessTableFunction can only emit insert-only (append-only) results. By implementing this interface, a function can declare the types of changes (e.g., inserts, updates, deletes) that it may emit, allowing the planner to make informed decisions during query planning.
Note: This interface is intended for advanced use cases and should be implemented with care. Emitting an incorrect changelog from the PTF may lead to undefined behavior in the overall query. The `on_time` argument is unsupported for updating PTFs.
The resulting changelog mode can be influenced by:
- The changelog mode of the input table arguments, accessible via ChangelogFunction.ChangelogContext.getTableChangelogMode(int).
- The changelog mode required by downstream operators, accessible via ChangelogFunction.ChangelogContext.getRequiredChangelogMode().
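The two influences listed above can be illustrated with a minimal, self-contained sketch. It does not use the Flink classes: `Kind`, `ModeSketch`, and `declareOutputKinds` are hypothetical stand-ins introduced only for illustration, and the policy shown (insert-only input stays insert-only, otherwise follow the downstream requirement) is just one possible choice, not something this interface prescribes.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-in for the row kinds named on this page; not Flink's RowKind class.
enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

final class ModeSketch {

    // Chooses the kinds to declare from the input kinds and the kinds required downstream.
    static Set<Kind> declareOutputKinds(Set<Kind> inputKinds, Set<Kind> requiredKinds) {
        // Policy assumed for this sketch: an insert-only input yields an insert-only output.
        if (inputKinds.equals(EnumSet.of(Kind.INSERT))) {
            return EnumSet.of(Kind.INSERT);
        }
        // Downstream requires UPDATE_BEFORE: declare retract mode (all four kinds).
        if (requiredKinds.contains(Kind.UPDATE_BEFORE)) {
            return EnumSet.allOf(Kind.class);
        }
        // Otherwise UPDATE_AFTER is sufficient: declare upsert mode.
        return EnumSet.of(Kind.INSERT, Kind.UPDATE_AFTER, Kind.DELETE);
    }

    public static void main(String[] args) {
        Set<Kind> updatingInput = EnumSet.allOf(Kind.class);
        Set<Kind> upsertRequired = EnumSet.of(Kind.INSERT, Kind.UPDATE_AFTER, Kind.DELETE);
        System.out.println(declareOutputKinds(EnumSet.of(Kind.INSERT), upsertRequired));
        System.out.println(declareOutputKinds(updatingInput, upsertRequired));
        System.out.println(declareOutputKinds(updatingInput, EnumSet.allOf(Kind.class)));
    }
}
```

In a real implementation, the two sets would presumably be derived from the values returned by getTableChangelogMode(int) and getRequiredChangelogMode(), and the result would be returned as a ChangelogMode.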
Changelog mode inference in the planner involves several steps. The getChangelogMode(ChangelogContext) method is called during each step:
- The planner checks whether the PTF emits updates or is insert-only.
- If updates are emitted, the planner determines whether the updates include RowKind.UPDATE_BEFORE messages (retract mode), or whether RowKind.UPDATE_AFTER messages are sufficient (upsert mode). For this, getChangelogMode(ChangelogContext) might be called twice to query both retract mode and upsert mode capabilities as indicated by ChangelogFunction.ChangelogContext.getRequiredChangelogMode().
- If in upsert mode, the planner checks whether RowKind.DELETE messages contain all fields (full deletes) or only key fields (partial deletes). In the case of partial deletes, only the upsert key fields are set when a row is removed; all non-key fields are null, regardless of nullability constraints. ChangelogFunction.ChangelogContext.getRequiredChangelogMode() indicates whether a downstream operator requires full deletes.
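The difference between full and partial deletes described in the last step can be sketched as follows. This is a plain-Java model under stated assumptions: a row is represented as an `Object[]`, and `DeleteSketch`, `deletePayload`, and `keyIndexes` are hypothetical names that are not part of the Flink API.

```java
import java.util.Arrays;

final class DeleteSketch {

    // Builds the field values of a DELETE message for `row` in this sketch's model.
    static Object[] deletePayload(Object[] row, int[] keyIndexes, boolean fullDeletesRequired) {
        if (fullDeletesRequired) {
            // Full delete: every field of the removed row is carried.
            return row.clone();
        }
        // Partial delete: only the upsert key fields are set, all other fields stay null.
        Object[] partial = new Object[row.length];
        for (int index : keyIndexes) {
            partial[index] = row[index];
        }
        return partial;
    }

    public static void main(String[] args) {
        Object[] row = {"user-1", 42, "gold"};
        int[] key = {0}; // assumed upsert key: the first field
        System.out.println(Arrays.toString(deletePayload(row, key, true)));
        System.out.println(Arrays.toString(deletePayload(row, key, false)));
    }
}
```

With the first field as the key, the partial delete carries `user-1` and leaves the two non-key fields null, which is why consumers of partial deletes cannot rely on nullability constraints for non-key fields.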
Emitting changelogs is only valid for PTFs that take table arguments with set semantics (see ArgumentTrait.SET_SEMANTIC_TABLE). In case of upserts, the upsert key must be equal to the PARTITION BY key.
It is perfectly valid for a ChangelogFunction implementation to return a fixed ChangelogMode, regardless of the ChangelogFunction.ChangelogContext. This approach may be appropriate when the PTF is designed for a specific scenario or pipeline setup, and does not need to adapt dynamically to different input modes. Note that in such cases, the PTF's applicability is limited, as it may only function correctly within the predefined context for which it was designed.
In some cases, this interface should be used in combination with SpecializedFunction to reconfigure the PTF after the final changelog mode for the specific call location has been determined. The final changelog mode is also available during runtime via ProcessTableFunction.Context.getChangelogMode().
- See Also:
ChangelogMode
Nested Class Summary
| Modifier and Type | Interface | Description |
| --- | --- | --- |
| static interface | ChangelogFunction.ChangelogContext | Context during changelog mode inference. |
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| ChangelogMode | getChangelogMode(ChangelogFunction.ChangelogContext changelogContext) | Returns the ChangelogMode of the PTF, taking into account the table arguments and the planner's requirements. |
Methods inherited from interface org.apache.flink.table.functions.FunctionDefinition
getKind, getRequirements, getTypeInference, isDeterministic, supportsConstantFolding
Method Detail
getChangelogMode
ChangelogMode getChangelogMode(ChangelogFunction.ChangelogContext changelogContext)
Returns the ChangelogMode of the PTF, taking into account the table arguments and the planner's requirements.