Strategy that maps a flat mutation model to an insert/update/delete object that can be sent to the CDC plugin that writes to Delta Lake.
Implementation of the conversion that provides a compliant DataFrame for GoldenGateConversion#conversion. The implementation must comply with the official GoldenGate documentation.
NB: reason about changes to the primary key. See in particular the configuration key gg.handler.name.format.pkUpdateHandling in the GoldenGate documentation.
The configuration gg.handler.name.format.includePrimaryKeys must be set to true.
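For illustration, the two properties above would sit in the GoldenGate handler properties file; a minimal sketch is shown below, assuming a handler literally named "name" (the value delete-insert is one of the documented pkUpdateHandling options, chosen here so that a key change becomes a delete of the old row plus an insert of the new one):

```properties
# Hypothetical handler properties fragment; the handler name "name" is illustrative.
# Turn a primary-key update into a delete of the old key plus an insert of the new key,
# so the downstream conversion can handle key changes explicitly.
gg.handler.name.format.pkUpdateHandling=delete-insert
# Required: include primary-key metadata in every emitted mutation record.
gg.handler.name.format.includePrimaryKeys=true
```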
Case class representing an Oracle row mutation. Mutations can be mapped into this object via an org.apache.spark.sql.Encoder.
The class representing the structure of the table. NB: the field names are used by the encoder, so they must correspond to the names used in the mutation.
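As a minimal sketch of the name-matching constraint above (all class and field names here are hypothetical, not taken from the source), the table-structure class reuses the column names carried by the flat mutation, because a Spark Encoder binds columns to case-class fields by name:

```scala
// Hypothetical flat mutation as it would arrive from a GoldenGate topic.
case class RawMutation(
  op_type: String,   // "I" = insert, "U" = update, "D" = delete
  op_ts: String,     // operation timestamp
  CUST_CODE: String, // table columns, flattened into the mutation
  ORDER_ID: Long
)

// Hypothetical table-structure class: its field names must match the
// column names in the mutation, because the encoder binds by name.
case class OrderRow(CUST_CODE: String, ORDER_ID: Long)

object MutationShape {
  // Project the table columns out of a flat mutation.
  def toRow(m: RawMutation): OrderRow = OrderRow(m.CUST_CODE, m.ORDER_ID)
}
```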
Object containing non-trivial utility functions for CDC GoldenGate mutations. Avoid encoder-based mapping (org.apache.spark.sql.Encoder) unless it is strictly necessary, because it adds cost by inserting an extra mapping phase into the Spark pipeline.
Strategy that maps a flat mutation model to an insert/update/delete object that can be sent to the CDC plugin that writes to Delta Lake. Taking as input the raw flat mutations coming from a GoldenGate topic, it produces as output a DataFrame whose rows have the shape accepted as input by the CDC plugin.
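The core of such a strategy is mapping the GoldenGate operation code onto the insert/update/delete action expected downstream. A minimal sketch, assuming the single-letter op codes used by the GoldenGate flat formats and a hypothetical CdcAction type (not part of the source):

```scala
// Hypothetical target actions accepted by the CDC plugin.
sealed trait CdcAction
case object Insert extends CdcAction
case object Update extends CdcAction
case object Delete extends CdcAction

object OpMapping {
  // Map a GoldenGate flat-format op_type code to a CDC action.
  // "I"/"U"/"D" are the insert/update/delete codes; anything else is rejected.
  def toAction(opType: String): CdcAction = opType match {
    case "I"   => Insert
    case "U"   => Update
    case "D"   => Delete
    case other => throw new IllegalArgumentException(s"Unknown op_type: $other")
  }
}
```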
NB: in this case you need to add the following line to the configuration:
goldengate.key.fields=["CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID"]
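The value of goldengate.key.fields is a JSON-style array of key column names. For illustration only, a naive helper (hypothetical, not part of the source) could extract the column list like this:

```scala
object KeyFields {
  // Parse a JSON-style array of quoted column names, e.g.
  // ["CUST_CODE", "ORDER_DATE"] yields List("CUST_CODE", "ORDER_DATE").
  // Naive: assumes well-formed input with no embedded quotes or commas.
  def parse(value: String): List[String] =
    value.stripPrefix("[").stripSuffix("]")
      .split(",")
      .map(_.trim.stripPrefix("\"").stripSuffix("\""))
      .filter(_.nonEmpty)
      .toList
}
```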