Interface ChangelogFunction

  • All Superinterfaces:
    FunctionDefinition

    @PublicEvolving
    public interface ChangelogFunction
    extends FunctionDefinition
    An extension that allows a process table function (PTF) to emit results with changelog semantics.

    By default, a ProcessTableFunction can only emit insert-only (append-only) results. By implementing this interface, a function can declare the types of changes (e.g., inserts, updates, deletes) that it may emit, allowing the planner to make informed decisions during query planning.

    Note: This interface is intended for advanced use cases and should be implemented with care. Emitting an incorrect changelog from the PTF may lead to undefined behavior in the overall query. The `on_time` argument is unsupported for updating PTFs.

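    The resulting changelog mode can be influenced by:

      • The changelog mode of the input table arguments.
      • The changelog mode required by downstream operators (see ChangelogFunction.ChangelogContext.getRequiredChangelogMode()).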

    Changelog mode inference in the planner involves several steps. The getChangelogMode(ChangelogContext) method is called during each step:

    1. The planner checks whether the PTF emits updates or is insert-only.
    2. If updates are emitted, the planner determines whether the updates include RowKind.UPDATE_BEFORE messages (retract mode), or whether RowKind.UPDATE_AFTER messages are sufficient (upsert mode). For this, getChangelogMode(ChangelogContext) might be called twice to query both retract mode and upsert mode capabilities as indicated by ChangelogFunction.ChangelogContext.getRequiredChangelogMode().
    3. If in upsert mode, the planner checks whether RowKind.DELETE messages contain all fields (full deletes) or only key fields (partial deletes). In the case of partial deletes, only the upsert key fields are set when a row is removed; all non-key fields are null, regardless of nullability constraints. ChangelogFunction.ChangelogContext.getRequiredChangelogMode() indicates whether a downstream operator requires full deletes.
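
    The three steps above can be pictured with a small self-contained model. This is a sketch under stated assumptions, not Flink's planner: Kind and Mode are simplified stand-ins for RowKind and ChangelogMode, and answer is a hypothetical counterpart of getChangelogMode(ChangelogContext) that looks only at the required mode.

```java
import java.util.EnumSet;
import java.util.Set;

class ChangelogNegotiationSketch {

    /** Stand-in for RowKind (assumption: a simplified copy, not the Flink enum). */
    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    /** Stand-in for ChangelogMode: the emitted kinds plus a partial-delete flag. */
    static final class Mode {
        final Set<Kind> kinds;
        final boolean keyOnlyDeletes;

        Mode(Set<Kind> kinds, boolean keyOnlyDeletes) {
            this.kinds = kinds;
            this.keyOnlyDeletes = keyOnlyDeletes;
        }

        static Mode retract() {
            return new Mode(EnumSet.allOf(Kind.class), false);
        }

        static Mode upsert(boolean keyOnlyDeletes) {
            return new Mode(
                    EnumSet.of(Kind.INSERT, Kind.UPDATE_AFTER, Kind.DELETE), keyOnlyDeletes);
        }
    }

    /** Hypothetical function side: answer each planner query with a matching mode. */
    static Mode answer(Mode required) {
        // Step 1 is answered implicitly: both results below contain update kinds.
        // Step 2: a consumer that asks for UPDATE_BEFORE needs retract mode.
        if (required.kinds.contains(Kind.UPDATE_BEFORE)) {
            return Mode.retract();
        }
        // Step 3: in upsert mode, send partial deletes only if the consumer accepts them.
        return Mode.upsert(required.keyOnlyDeletes);
    }

    public static void main(String[] args) {
        Mode forRetract = answer(Mode.retract());
        Mode forUpsert = answer(Mode.upsert(false));
        System.out.println("retract query -> " + forRetract.kinds);
        System.out.println("upsert query  -> " + forUpsert.kinds
                + ", keyOnlyDeletes=" + forUpsert.keyOnlyDeletes);
    }
}
```

    Because the method may be queried once per candidate mode, the model keeps answer free of side effects: the same required mode always yields the same result.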

    Emitting changelogs is only valid for PTFs that take table arguments with set semantics (see ArgumentTrait.SET_SEMANTIC_TABLE). In case of upserts, the upsert key must be equal to the PARTITION BY key.
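
    The partial-delete layout from step 3 and the key rule above can be illustrated without any Flink dependency. The sketch below uses plain Object[] rows and column-name lists; toPartialDelete and sameKey are hypothetical helpers, and comparing keys as unordered sets is an assumption rather than documented planner behavior.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

class PartialDeleteSketch {

    /** Copies only the key positions of a row; every other field stays null. */
    static Object[] toPartialDelete(Object[] fullRow, int... keyPositions) {
        Object[] partial = new Object[fullRow.length]; // new arrays start out all-null
        for (int pos : keyPositions) {
            partial[pos] = fullRow[pos];
        }
        return partial;
    }

    /** Assumption: key equality is checked on column names, ignoring order. */
    static boolean sameKey(List<String> upsertKey, List<String> partitionKey) {
        return new HashSet<>(upsertKey).equals(new HashSet<>(partitionKey));
    }

    public static void main(String[] args) {
        Object[] full = {"user-1", 42, "active"};
        // Only position 0 (the key) survives in the partial delete.
        System.out.println(Arrays.toString(toPartialDelete(full, 0)));
        System.out.println(sameKey(List.of("user_id"), List.of("user_id")));
    }
}
```

    A consumer of partial deletes must therefore read only the key fields of a DELETE message, since the remaining fields carry no information even when their declared type is NOT NULL.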

    It is perfectly valid for a ChangelogFunction implementation to return a fixed ChangelogMode, regardless of the ChangelogFunction.ChangelogContext. This approach may be appropriate when the PTF is designed for a specific scenario or pipeline setup, and does not need to adapt dynamically to different input modes. Note that in such cases, the PTF's applicability is limited, as it may only function correctly within the predefined context for which it was designed.

    In some cases, this interface should be used in combination with SpecializedFunction to reconfigure the PTF after the final changelog mode for the specific call location has been determined. The final changelog mode is also available during runtime via ProcessTableFunction.Context.getChangelogMode().
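
    One reason to reconfigure after the final mode is known is that the same logical update produces different messages in retract and upsert mode. The following is a minimal sketch of that idea with hypothetical names: Emitter and forMode stand in for whatever strategy object a PTF might build at specialization time and are not Flink APIs. Rows are rendered as strings using the usual -U / +U shorthand for UPDATE_BEFORE / UPDATE_AFTER.

```java
import java.util.List;

class ModeSpecificEmitterSketch {

    /** Hypothetical strategy chosen once, after the final changelog mode is known. */
    interface Emitter {
        List<String> update(String key, String oldValue, String newValue);
    }

    /** The flag stands in for "the final mode contains UPDATE_BEFORE". */
    static Emitter forMode(boolean retract) {
        if (retract) {
            // Retract mode: withdraw the old row, then publish the new one.
            return (key, oldValue, newValue) ->
                    List.of(
                            "-U(" + key + ", " + oldValue + ")",
                            "+U(" + key + ", " + newValue + ")");
        }
        // Upsert mode: the new row replaces the old one by key.
        return (key, oldValue, newValue) -> List.of("+U(" + key + ", " + newValue + ")");
    }

    public static void main(String[] args) {
        System.out.println(forMode(true).update("k1", "1", "2"));
        System.out.println(forMode(false).update("k1", "1", "2"));
    }
}
```

    Choosing the strategy once, rather than branching on the mode for every record, keeps the per-record path simple. Whether that matters in practice depends on the function.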

    See Also:
    ChangelogMode