com.coxautodata.waimak.dataflow.spark.dataquality.deequ
Adds Deequ Checks for a label
the label to perform the validation for
first Deequ Check to perform
additional Deequ checks to perform
the alert handler to use for handling alerts for this check
additional alert handlers to use
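The split between a first check and additional checks (and likewise for alert handlers) suggests a signature that requires at least one of each. The method signature itself is not shown here, so the sketch below only illustrates the general Scala pattern; `NamedCheck` and `collectChecks` are hypothetical stand-ins and are not part of Deequ or Waimak.

```scala
object AtLeastOneSketch {
  // Hypothetical stand-in for a check; not a Deequ or Waimak type
  final case class NamedCheck(description: String)

  // One mandatory argument followed by varargs: callers cannot pass zero checks
  def collectChecks(first: NamedCheck, additional: NamedCheck*): Seq[NamedCheck] =
    first +: additional

  def main(args: Array[String]): Unit = {
    val single = collectChecks(NamedCheck("row count"))
    val several = collectChecks(NamedCheck("row count"), NamedCheck("completeness"))
    println(single.size)  // 1
    println(several.size) // 2
  }
}
```

With this shape, a call that supplies no check at all fails to compile rather than silently doing nothing.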
Add Deequ validation for a given label (https://github.com/awslabs/deequ)
the label to perform the validation for
the Deequ validation to perform e.g.
_.addCheck(
Check(CheckLevel.Error, "unit testing my data")
.hasSize(_ == 5) // we expect 5 rows
.isComplete("id") // should never be NULL
)
the alert handler to use for handling alerts for this check
additional alert handlers to use
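As a rough illustration of the example above, the sketch below applies the same check through Deequ's own VerificationSuite, outside of any Waimak flow. It assumes the validation parameter is a function over Deequ's VerificationRunBuilder, which the `_.addCheck(...)` form suggests but this documentation does not confirm. The sample DataFrame and the object name are hypothetical.

```scala
import com.amazon.deequ.{VerificationRunBuilder, VerificationSuite}
import com.amazon.deequ.checks.{Check, CheckLevel}
import org.apache.spark.sql.SparkSession

object DeequValidationSketch {
  // The check from the documentation example
  val check: Check = Check(CheckLevel.Error, "unit testing my data")
    .hasSize(_ == 5)   // we expect 5 rows
    .isComplete("id")  // should never be NULL

  // Assumed shape of the validation parameter: a function over Deequ's run builder
  val validation: VerificationRunBuilder => VerificationRunBuilder = _.addCheck(check)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-validation-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: five rows with a non-null id
    val df = Seq(1, 2, 3, 4, 5).toDF("id")

    // Apply the validation function to a builder for this data, then run it
    val result = validation(VerificationSuite().onData(df)).run()
    println(s"Check status: ${result.status}")

    spark.stop()
  }
}
```

Running this needs Spark and Deequ on the classpath. In a Waimak flow the data would come from the label rather than from a locally built DataFrame.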
Adds Deequ validation which uses a metrics repository (see Deequ docs on metrics repositories for more information)
N.B. For this to work, you must first set a metrics repository using setDeequMetricsRepository
or setDeequStorageLayerMetricsRepository
the label to perform the validation for
the Deequ validation to perform e.g.
_.addAnomalyCheck(RateOfChangeStrategy(maxRateDecrease = Some(0.2)), Completeness("col1"))
the alert handler to use for handling alerts for this check
additional alert handlers to use
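To show why a metrics repository is required, the sketch below runs the same anomaly check directly against Deequ with an in-memory repository. An anomaly check compares the current metric with values stored under earlier keys, so at least one earlier run must already be saved. This uses plain Deequ rather than Waimak; the sample data, key values, and object name are hypothetical.

```scala
import com.amazon.deequ.{VerificationResult, VerificationSuite}
import com.amazon.deequ.analyzers.Completeness
import com.amazon.deequ.anomalydetection.RateOfChangeStrategy
import com.amazon.deequ.repository.ResultKey
import com.amazon.deequ.repository.memory.InMemoryMetricsRepository
import org.apache.spark.sql.{DataFrame, SparkSession}

object DeequAnomalySketch {
  // Strategy taken from the documentation example
  val strategy: RateOfChangeStrategy = RateOfChangeStrategy(maxRateDecrease = Some(0.2))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("deequ-anomaly-sketch")
      .getOrCreate()
    import spark.implicits._

    // Metrics from each run are stored here, keyed by the ResultKey
    val repository = new InMemoryMetricsRepository()

    def verify(df: DataFrame, key: ResultKey): VerificationResult =
      VerificationSuite()
        .onData(df)
        .useRepository(repository)
        .saveOrAppendResult(key)
        .addAnomalyCheck(strategy, Completeness("col1"))
        .run()

    // Hypothetical data for two consecutive runs
    val earlier = Seq[(Int, String)]((1, "a"), (2, "b")).toDF("id", "col1")
    val later = Seq[(Int, String)]((1, "a"), (2, null)).toDF("id", "col1")

    val now = System.currentTimeMillis()
    val first = verify(earlier, ResultKey(now - 24L * 60 * 60 * 1000))
    val second = verify(later, ResultKey(now))

    println(s"First run: ${first.status}")
    println(s"Second run: ${second.status}")

    spark.stop()
  }
}
```

The second run can be compared with the first only because both results were saved to the same repository under different keys.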
Sets the Deequ metrics repository to use for Deequ checks using metrics, specified using addDeequValidationWithMetrics
Deequ metrics repository builder
the date time to be used as a key in the metrics repository (defaults to now)
Sets the Deequ metrics repository to use for Deequ checks using metrics, specified using addDeequValidationWithMetrics
The metrics repository will be a StorageLayerMetricsRepository
the base path to use for the StorageLayerMetricsRepository
the date time to be used as a key in the metrics repository (defaults to now)
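Deequ's ResultKey identifies a stored result by a Long timestamp, so a date time has to be reduced to a number before it can serve as a repository key. The sketch below shows one plausible conversion using only java.time. It assumes the date time is a ZonedDateTime and that the key is epoch milliseconds, neither of which is stated here; `toEpochMillisKey` is a hypothetical helper.

```scala
import java.time.{ZoneOffset, ZonedDateTime}

object AppendDateTimeSketch {
  // Hypothetical helper: reduce a date time to an epoch-millisecond key,
  // defaulting to the current time when no value is supplied
  def toEpochMillisKey(dateTime: ZonedDateTime = ZonedDateTime.now()): Long =
    dateTime.toInstant.toEpochMilli

  def main(args: Array[String]): Unit = {
    val fixed = ZonedDateTime.of(2020, 1, 1, 0, 0, 0, 0, ZoneOffset.UTC)
    println(toEpochMillisKey(fixed)) // 1577836800000
    println(toEpochMillisKey())      // current time in milliseconds
  }
}
```

Passing an explicit date time rather than relying on the default makes reruns reproducible: rerunning a flow for the same logical date then writes to the same key instead of creating a new one.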