Converts all measurements in an instance of a ControlMeasure object into strings so they won't cause confusion when deserialized downstream.
A control measure.
The converted control measure.
The method generates a temporary column name that does not exist in the specified DataFrame.
A column name as a string.
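The uniqueness guarantee can be illustrated without Spark: given the set of existing column names, probe candidate names until an unused one is found. A minimal sketch, not the library's actual implementation; the `tmp_col` prefix is made up here:

```scala
// Minimal sketch of generating a column name guaranteed not to clash with
// the existing columns of a DataFrame (represented here as a Set of names).
object TempColumnName {
  def generate(existingColumns: Set[String], prefix: String = "tmp_col"): String = {
    // Try prefix_0, prefix_1, ... until a candidate is not already taken.
    Iterator.from(0)
      .map(i => s"${prefix}_$i")
      .find(name => !existingColumns.contains(name))
      .get // always terminates: some suffix is eventually free
  }
}
```

With a real DataFrame, `existingColumns` would be `df.columns.toSet`.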
Get current time as a string formatted according to Control Framework format za.co.absa.atum.utils.controlmeasure.ControlMeasureUtils#timestampFormat.
The current timestamp as a string (e.g. "05-10-2017 09:43:50 +0200")
Get current date as a string formatted according to Control Framework format za.co.absa.atum.utils.controlmeasure.ControlMeasureUtils#dateFormat().
The current date as a string (e.g. "05-10-2017")
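The two example outputs above suggest the following `java.time` patterns; this is a sketch under the assumption that these match the Control Framework formats referenced in `ControlMeasureUtils` (the authoritative patterns live in that object):

```scala
import java.time.ZonedDateTime
import java.time.format.DateTimeFormatter

// Assumed patterns reconstructed from the documented example outputs
// "05-10-2017 09:43:50 +0200" and "05-10-2017".
object ControlFrameworkTime {
  private val timestampFormat = DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm:ss Z")
  private val dateFormat      = DateTimeFormatter.ofPattern("dd-MM-yyyy")

  // Current time as a string in the Control Framework timestamp format.
  def getTimestampAsString: String = ZonedDateTime.now().format(timestampFormat)

  // Current date as a string in the Control Framework date format.
  def getTodayAsString: String = ZonedDateTime.now().format(dateFormat)
}
```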
Normalizes all measurements in an instance of ControlMeasure object into standard values
A control measure.
The normalized control measure.
Writes the Control Measure cm as JSON to Hadoop FS (by default into the directory specified in cm.metadata.dataFileName, file name: _INFO).
The control measure to write.
The directory on outputFs; the usual choice is cm.metadata.dataFileName.
JsonType.Minified for compact JSON (no whitespace) or JsonType.Pretty for indented output.
Hadoop FS. For regular HDFS, use e.g. FileSystem.get(sparkSession.sparkContext.hadoopConfiguration), or your S3 FS (or rely on e.g. org.apache.hadoop.conf.Configuration).
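A hedged usage sketch of the call described above; it assumes a running Spark session and a ControlMeasure `cm` in scope, and the exact signature may differ between Atum versions:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import za.co.absa.atum.utils.controlmeasure.ControlMeasureUtils
import za.co.absa.atum.utils.controlmeasure.ControlMeasureUtils.JsonType

// Obtain the Hadoop FS from the active Spark session's configuration,
// as suggested in the parameter docs above.
implicit val outputFs: FileSystem =
  FileSystem.get(spark.sparkContext.hadoopConfiguration)

// Write cm as a minified _INFO file; the output dir follows the usual
// choice of cm.metadata.dataFileName.
ControlMeasureUtils.writeControlMeasureInfoFileToHadoopFs(
  cm,                                  // the ControlMeasure to persist
  new Path(cm.metadata.dataFileName),  // directory on outputFs
  JsonType.Minified                    // or JsonType.Pretty for indented JSON
)
```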
The method returns an arbitrary object as a JSON string. Calls za.co.absa.atum.utils.SerializationUtils#asJson(java.lang.Object).
A string representing the object in JSON format.
(Since version 3.3.0) Use SerializationUtils.asJson instead
The method returns an arbitrary object as a pretty JSON string. Calls za.co.absa.atum.utils.SerializationUtils#asJsonPretty(java.lang.Object).
A string representing the object in JSON format.
(Since version 3.3.0) Use SerializationUtils.asJsonPretty instead
The method creates an _INFO file for a given dataset. The row count measurement is added automatically. You can also specify aggregation columns for aggregation measurements.
The dataset for which the _INFO file is to be created.
The name of the application providing the data.
The path to the input file. Can be a folder with a file mask.
The date of the data generation (default = today).
The version of the data generation for the date, new versions replace old versions of data (default = 1).
Country name (default = "ZA").
History type (default = "Snapshot").
Source type (default = "Source").
The name of the initial checkpoint (default = "Source").
A workflow name to group several checkpoints in the chain (default = "Source").
A flag specifying whether the _INFO file should be saved to HDFS. If false, the _INFO file will not be saved to HDFS.
Output pretty JSON.
Numeric column names for which aggregation measurements are calculated.
The content of the _INFO file.
(Since version 3.4.0) Use ControlMeasureBuilder.forDf(...) ... .build & ControlMeasureUtils.writeControlMeasureInfoFileToHadoopFs(...) instead
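A hedged migration sketch following the deprecation note above; `forDf` and `build` come from that note, while the intermediate builder method name is an assumption and may differ per Atum version:

```scala
import org.apache.hadoop.fs.Path
import za.co.absa.atum.utils.controlmeasure.{ControlMeasureBuilder, ControlMeasureUtils}
import za.co.absa.atum.utils.controlmeasure.ControlMeasureUtils.JsonType

// Build the ControlMeasure from the dataset instead of calling the
// deprecated createInfoFile method. Row count is added automatically.
val cm = ControlMeasureBuilder
  .forDf(df)                            // the dataset the _INFO file describes
  .withAggregateColumns(Seq("amount"))  // hypothetical: numeric columns for aggregation measurements
  .build

// Then persist it with the writer documented above (requires an implicit
// Hadoop FileSystem in scope).
ControlMeasureUtils.writeControlMeasureInfoFileToHadoopFs(cm, new Path("/data/out"), JsonType.Pretty)
```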
The method returns an arbitrary object parsed from a JSON string. Calls za.co.absa.atum.utils.SerializationUtils#fromJson.
An object deserialized from the JSON string.
(Since version 3.3.0) Use SerializationUtils.fromJson instead
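A round-trip sketch using the SerializationUtils methods the deprecation notes point to; method names come from those notes, but the exact signatures are assumptions and may differ per Atum version:

```scala
import za.co.absa.atum.utils.SerializationUtils

// Hypothetical payload type for illustration.
case class Person(name: String, age: Int)

val person = Person("Ada", 36)

// Serialize: compact and pretty variants.
val json: String   = SerializationUtils.asJson(person)
val pretty: String = SerializationUtils.asJsonPretty(person)

// Deserialize back into an object (the Manifest context referenced in the
// docs above is supplied implicitly by the type parameter).
val parsed: Person = SerializationUtils.fromJson[Person](json)
```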
This object contains utilities used in Control Measurements processing.