Package

it.agilelab.bigdata.wasp.consumers.spark.strategies

cdc

package cdc

Type Members

  1. class DebeziumMutationStrategy extends Strategy with Logging

  2. class GoldenGateAdapterFlatModelStrategy extends Strategy with Logging

    Strategy that maps a flat mutation model to an insert/update/delete object that can be sent to the CDC plugin that writes to Delta Lake. Given as input the raw flat mutations coming from a GoldenGate topic, it produces as output a DataFrame whose rows have the shape that the CDC plugin accepts as input.

    NB:

    • This strategy maps mutations produced by what Oracle calls the Row Formatter. Messages produced by an Operation Formatter are not supported yet; mapping them requires a feature that is still to come. More details are available under: operation vs row formatter
    • For the strategy to work correctly, the configuration at path goldengate.key.fields must be available at runtime. This configuration is required and contains the list of primary key fields of the mutated table. Suppose, for example, a table SHOP_TABLE with the following structure:

      "PRODUCT_AMOUNT": Integer
      "TRANSACTION_ID": Integer
      "ORDER_DATE": Timestamp
      "PRODUCT_PRICE": Char
      "ORDER_ID": Integer
      "CUST_CODE": Long
      "PRODUCT_CODE": String

      whose primary key is composed of the fields:
      • CUST_CODE
      • ORDER_DATE
      • PRODUCT_CODE
      • ORDER_ID

    In this case, add the following line to the configuration:

    goldengate.key.fields=["CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID"]
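
    As a rough illustration of what the key list is for, the sketch below pulls the primary key values out of a single flat mutation. It uses plain Scala collections rather than Spark, and the names KeyFieldsSketch, primaryKeyOf, keyFields and mutation are hypothetical, introduced only for this example; they are not part of the strategy's API, and the row values are made up.

```scala
object KeyFieldsSketch {
  // Same list as in the goldengate.key.fields example above.
  val keyFields: Seq[String] =
    Seq("CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID")

  // Hypothetical helper: returns the key values in the configured order,
  // or None when the mutation lacks at least one of the key fields.
  def primaryKeyOf(mutation: Map[String, Any], keys: Seq[String]): Option[Seq[Any]] = {
    val values = keys.map(mutation.get)
    if (values.forall(_.isDefined)) Some(values.flatten) else None
  }

  // Illustrative SHOP_TABLE row (assumed values, not real data).
  val mutation: Map[String, Any] = Map(
    "PRODUCT_AMOUNT" -> 2,
    "TRANSACTION_ID" -> 500,
    "ORDER_DATE"     -> "2021-03-15 10:30:00",
    "PRODUCT_PRICE"  -> "9.99",
    "ORDER_ID"       -> 7,
    "CUST_CODE"      -> 1001L,
    "PRODUCT_CODE"   -> "P-42"
  )

  // The four key values that identify this row in the target table.
  val key: Option[Seq[Any]] = primaryKeyOf(mutation, keyFields)
}
```

    A mutation that is missing any configured key field yields None here, which mirrors why the configuration has to list exactly the primary key fields of the table.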

  3. trait GoldenGateConversion extends CdcMapper

    Implementation of the conversion that provides a DataFrame compliant with GoldenGateConversion#conversion. The implementation must comply with the GoldenGate documentation: Goldengate Doc

    NB: consider how changes to the primary key are handled. See in particular the configuration key gg.handler.name.format.pkUpdateHandling in the page gg docs

    The configuration gg.handler.name.format.includePrimaryKeys must be set to true.
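
    To see why primary key changes deserve attention, consider a target keyed by primary key. The sketch below is not the CDC plugin's or GoldenGate's implementation; it is a plain Scala toy model with hypothetical names (PkUpdateSketch, applyUpdateNaively, applyUpdateAsDeleteInsert) and made-up values. It only shows that an update which changes the key leaves a stale row behind unless it is treated as a delete of the old key followed by an insert of the new one.

```scala
object PkUpdateSketch {
  type Key = Seq[Any]
  type Row = Map[String, Any]

  // Naive upsert on the new key only: the row stored under the old key survives.
  def applyUpdateNaively(target: Map[Key, Row], newKey: Key, after: Row): Map[Key, Row] =
    target.updated(newKey, after)

  // Delete the old key, then insert under the new key.
  def applyUpdateAsDeleteInsert(
      target: Map[Key, Row], oldKey: Key, newKey: Key, after: Row): Map[Key, Row] =
    (target - oldKey).updated(newKey, after)

  // Assumed example: ORDER_ID is the only key field and changes from 7 to 8.
  val oldKey: Key = Seq(7)
  val newKey: Key = Seq(8)
  val before: Row = Map("ORDER_ID" -> 7, "PRODUCT_CODE" -> "P-42")
  val after: Row  = Map("ORDER_ID" -> 8, "PRODUCT_CODE" -> "P-42")

  val target: Map[Key, Row] = Map(oldKey -> before)

  val naive: Map[Key, Row]        = applyUpdateNaively(target, newKey, after)
  val deleteInsert: Map[Key, Row] = applyUpdateAsDeleteInsert(target, oldKey, newKey, after)
}
```

    In this toy model, naive ends up holding two rows (the stale one under key 7 and the new one under key 8), while deleteInsert holds only the row under key 8.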

  4. final case class TableMutationFlatModel[A](table: String, op_type: String, op_ts: String, current_ts: String, pos: String, primary_keys: Seq[String], tokens: Map[String, String] = Map(), innerTable: A) extends Product with Serializable

    Case class representing an Oracle row mutation. You can map the mutations into this object by calling org.apache.spark.sql.Encoder)

    A

    The class representing the structure of the table. NB: the field names are used by the encoder, so they must correspond to the names used in the mutation.
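
    As a minimal sketch of how the type parameter might be filled in, the example below defines a class for the SHOP_TABLE example and wraps one row in a TableMutationFlatModel. The case class is copied from the signature above so that the snippet compiles on its own without the WASP dependency. ShopTable, FlatModelSketch and all field values (including the "I" operation type, the timestamps and the position string) are assumptions made for illustration, and mapping the source's "Char" column to String is a guess.

```scala
import java.sql.Timestamp

// Copied from the signature above so the sketch is self-contained.
final case class TableMutationFlatModel[A](
  table: String,
  op_type: String,
  op_ts: String,
  current_ts: String,
  pos: String,
  primary_keys: Seq[String],
  tokens: Map[String, String] = Map(),
  innerTable: A)

// Hypothetical class for SHOP_TABLE. Field names match the column names
// used in the mutation, since the encoder relies on them.
final case class ShopTable(
  PRODUCT_AMOUNT: Int,
  TRANSACTION_ID: Int,
  ORDER_DATE: Timestamp,
  PRODUCT_PRICE: String, // "Char" in the table description; String is an assumption
  ORDER_ID: Int,
  CUST_CODE: Long,
  PRODUCT_CODE: String)

object FlatModelSketch {
  // All values below are made up for illustration.
  val mutation: TableMutationFlatModel[ShopTable] = TableMutationFlatModel(
    table = "SHOP_TABLE",
    op_type = "I", // assumed code for an insert
    op_ts = "2021-03-15 10:30:00.000000",
    current_ts = "2021-03-15T10:30:01.123000",
    pos = "00000000000000001444",
    primary_keys = Seq("CUST_CODE", "ORDER_DATE", "PRODUCT_CODE", "ORDER_ID"),
    innerTable = ShopTable(
      PRODUCT_AMOUNT = 2,
      TRANSACTION_ID = 500,
      ORDER_DATE = Timestamp.valueOf("2021-03-15 10:30:00"),
      PRODUCT_PRICE = "9.99",
      ORDER_ID = 7,
      CUST_CODE = 1001L,
      PRODUCT_CODE = "P-42")
  )
}
```

    The tokens field is left at its default empty map, as the signature allows.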

Value Members

  1. object DebeziumConversion extends CdcMapper with Product with Serializable

  2. object GoldenGateMutationUtils

    Object containing non-trivial utility functions for CDC GoldenGate mutations. Avoid calling org.apache.spark.sql.Encoder) and org.apache.spark.sql.Encoder) unless strictly necessary, because they add cost by inserting an extra mapping phase into the Spark pipeline.
