cdc

Type Members

class DebeziumMutationStrategy extends Strategy with Logging
class GoldenGateAdapterFlatModelStrategy extends Strategy with Logging

Strategy that maps a flat mutation model to an insert/update/delete object that can be sent to the CDC plugin that writes to DeltaLake. Given as input the raw flat mutations coming from a GoldenGate topic, it produces as output a DataFrame whose rows have the shape that the CDC plugin accepts as input.
NB:
- This strategy maps mutations produced by what Oracle calls the Row Formatter. Messages produced by an Operation Formatter are not supported yet; mapping them requires a feature that is not available at this time. More details are available under: operation vs row formatter
- For the strategy to work correctly, the configuration at path goldengate.key.fields must be available at runtime. This configuration is required and contains the list of primary key fields of the mutated table. For example, suppose a table has the following structure:
SHOP_TABLE ===>
"PRODUCT_AMOUNT": Integer
"TRANSACTION_ID": Integer
"ORDER_DATE": Timestamp
"PRODUCT_PRICE": Char
"ORDER_ID": Integer
"CUST_CODE": Long
"PRODUCT_CODE": String
and the primary key of this table is composed of the fields:
- CUST_CODE
- ORDER_DATE
- PRODUCT_CODE
- ORDER_ID
In this case, add the following line to the configuration:
goldengate.key.fields=["CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID"]
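The role of the key fields list can be illustrated with a minimal plain-Scala sketch. It is not the strategy's implementation: the object KeyFieldsSketch, the method keyOf, and the representation of a flat mutation row as a Map are hypothetical names introduced only for illustration. It only shows why every configured key field must be present in each incoming row.

```scala
// Hypothetical sketch: not part of the cdc package API.
object KeyFieldsSketch {
  // Same list as in the goldengate.key.fields example above.
  val keyFields: Seq[String] =
    Seq("CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID")

  // Extracts the primary key values of a flat row, in the configured order.
  // Returns Left with the missing field names if the row lacks a key field.
  def keyOf(row: Map[String, Any]): Either[String, Seq[Any]] = {
    val missing = keyFields.filterNot(row.contains)
    if (missing.nonEmpty) Left(s"missing key fields: ${missing.mkString(", ")}")
    else Right(keyFields.map(row))
  }
}
```

A row that carries all four key fields yields their values in the configured order; a row that lacks any of them is rejected, which mirrors why the configuration is described as required.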
trait GoldenGateConversion extends CdcMapper

Implementation of the conversion that provides a compliant DataFrame for GoldenGateConversion#conversion. The implementation must be compliant with the GoldenGate documentation: Goldengate Doc
NB: take changes to the primary key into account. See in particular the configuration key gg.handler.name.format.pkUpdateHandling in the page gg docs.
The configuration gg.handler.name.format.includePrimaryKeys must be set to true.
final case class TableMutationFlatModel[A](table: String, op_type: String, op_ts: String, current_ts: String, pos: String, primary_keys: Seq[String], tokens: Map[String, String] = Map(), innerTable: A) extends Product with Serializable

Case class representing an Oracle row mutation. You can map the mutations to this object by calling org.apache.spark.sql.Encoder)
A
the class representing the structure of the table. NB: the field names are used by the encoder, so they must correspond to the names used in the mutation
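As a sketch of what a suitable type for A might look like, the following models the SHOP_TABLE example with field names identical to the column names of the mutation. The case class ShopTable, the object ShopTableExample, and all field values are hypothetical; TableMutationFlatModel is re-declared locally with the signature listed above only so that the snippet is self-contained. Mapping Char to String and using "I" as the insert op_type are assumptions, not something this page confirms.

```scala
import java.sql.Timestamp

// Local copy of the signature documented above, so the sketch compiles alone.
final case class TableMutationFlatModel[A](
    table: String,
    op_type: String,
    op_ts: String,
    current_ts: String,
    pos: String,
    primary_keys: Seq[String],
    tokens: Map[String, String] = Map(),
    innerTable: A
)

// Hypothetical table type: field names match the mutation's column names.
final case class ShopTable(
    PRODUCT_AMOUNT: Int,
    TRANSACTION_ID: Int,
    ORDER_DATE: Timestamp,
    PRODUCT_PRICE: String, // assumption: Char column represented as String
    ORDER_ID: Int,
    CUST_CODE: Long,
    PRODUCT_CODE: String
)

object ShopTableExample {
  // Illustrative values only; "I" as insert marker is an assumption.
  val mutation: TableMutationFlatModel[ShopTable] = TableMutationFlatModel(
    table = "SHOP_TABLE",
    op_type = "I",
    op_ts = "2020-01-01 10:00:00",
    current_ts = "2020-01-01 10:00:01",
    pos = "0001",
    primary_keys = Seq("CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID"),
    innerTable = ShopTable(3, 100, Timestamp.valueOf("2020-01-01 10:00:00"),
      "9.99", 7, 42L, "P1")
  )
}
```

Because tokens has a default value and precedes innerTable, named arguments are used so that the default can be kept.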

Value Members

object DebeziumConversion extends CdcMapper with Product with Serializable
object GoldenGateMutationUtils

Object containing non-trivial utility functions for CDC GoldenGate mutations. Avoid calling org.apache.spark.sql.Encoder) and org.apache.spark.sql.Encoder) unless strictly necessary, because they add cost by inserting an additional mapping phase in the Spark pipeline.
