: Domain name
: Schema
: stage
: Storage Handler
Function that retrieves the full metrics dataframe, combining the discrete and continuous metrics
: dataframe obtained from computeDiscretMetric() or computeContinuiousMetric()
: list of all variables
: list of columns
Dataframe : dataframe that contains the full metrics, with all variables and all metrics
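As a rough illustration of the combining step, here is a pandas sketch (the original operates on Spark DataFrames; the function and column names `full_metric`, `variableName`, `countDistinct`, `mean` are assumptions):

```python
import pandas as pd

def full_metric(discrete: pd.DataFrame, continuous: pd.DataFrame,
                variables: list, columns: list) -> pd.DataFrame:
    """Combine the discrete and continuous metrics dataframes into one,
    keeping only the requested variables and metric columns."""
    combined = pd.concat([discrete, continuous], ignore_index=True, sort=False)
    # keep rows for the requested variables only
    combined = combined[combined["variableName"].isin(variables)]
    # keep the requested metric columns that actually exist
    keep = [c for c in columns if c in combined.columns]
    return combined[keep]
```

Metrics a column does not define (e.g. `mean` for a discrete variable) simply come out as missing values in the unified frame.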
Function that builds the metrics save path
: path where metrics are stored
: path where the metrics for the specified schema are stored
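A minimal sketch of the path-building logic, assuming a `base/domain/schema` layout (the layout and the name `metrics_path` are assumptions for illustration):

```python
from pathlib import PurePosixPath

def metrics_path(base_path: str, domain: str, schema: str) -> PurePosixPath:
    """Build the directory where the metrics for a given
    domain/schema are stored."""
    return PurePosixPath(base_path) / domain / schema
```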
Partition a dataset using dataset columns. To partition the dataset using the ingestion time, use the reserved column names :
: Input dataset
: list of columns to use for partitioning.
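The partitioning step can be sketched in pandas as splitting the dataset by the distinct values of the partition columns (the original works on Spark DataFrames and additionally special-cases the reserved ingestion-time column names; `partition_dataset` as written here is an assumption):

```python
import pandas as pd

def partition_dataset(dataset: pd.DataFrame, partition_cols: list) -> dict:
    """Split a dataset into one sub-frame per distinct combination
    of the partition columns."""
    if not partition_cols:
        # no partitioning requested: single partition with everything
        return {(): dataset}
    return {key: group for key, group in dataset.groupby(partition_cols)}
```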
The Spark session used to run this job
Forces any Spark job to implement its entry point within the "run" method
: Spark session used for the job
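The pattern of forcing every job to define its entry point in "run" can be sketched with an abstract base class (the original is presumably a Scala trait; this Python version is illustrative):

```python
from abc import ABC, abstractmethod

class SparkJob(ABC):
    """Base class forcing every concrete job to implement
    its entry point in run()."""

    @abstractmethod
    def run(self):
        """Entry point of the job."""

class MyJob(SparkJob):
    def run(self):
        return "done"
```

Instantiating `SparkJob` directly fails, so a job without a `run` implementation cannot be created.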
Saves a dataset. If the path is empty (the first time metrics are computed for the schema), write directly.
If parquet files are already stored there, create a temporary directory to compute in, then flush the path and move the updated metrics into it
: dataset to be saved
: Path to save the file at
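The write-or-replace logic above can be sketched with plain filesystem operations (the original writes Spark parquet files; `save`, `_write` and the `part-0000.txt` name are assumptions for illustration):

```python
import os
import shutil
import tempfile

def _write(rows: list, directory: str) -> None:
    """Hypothetical writer standing in for the parquet write."""
    with open(os.path.join(directory, "part-0000.txt"), "w") as f:
        f.write("\n".join(map(str, rows)))

def save(rows: list, path: str) -> None:
    """Write directly if the path is empty or absent; otherwise compute
    in a temporary directory, flush the path, and move the updated
    result into place."""
    if os.path.isdir(path) and os.listdir(path):
        tmp = tempfile.mkdtemp()
        _write(rows, tmp)       # compute in a scratch location
        shutil.rmtree(path)     # flush the existing metrics
        shutil.move(tmp, path)  # move updated metrics into place
    else:
        os.makedirs(path, exist_ok=True)
        _write(rows, path)
```

Writing to a scratch directory first means the existing metrics are only removed once the updated result is fully materialized.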
Function that unifies the discrete and continuous metrics dataframes, then saves the result as parquet
: dataframe that contains all the discrete metrics
: dataframe that contains all the continuous metrics
: name of the domain
: schema of the initial data
: time corresponding to the ingestion
: stage (unit / global)
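A pandas sketch of the unify-and-tag step described above (the original unions Spark DataFrames and writes parquet; the function name and the `domain`/`schema`/`timestamp`/`stage` column names are assumptions):

```python
import pandas as pd

def unify_and_save(discrete: pd.DataFrame, continuous: pd.DataFrame,
                   domain: str, schema: str, ingestion_time: str,
                   stage: str) -> pd.DataFrame:
    """Union the discrete and continuous metrics, tag every row with
    the domain, schema, ingestion time and stage, and return the
    result ready to be written."""
    unified = pd.concat([discrete, continuous], ignore_index=True, sort=False)
    unified["domain"] = domain
    unified["schema"] = schema
    unified["timestamp"] = ingestion_time
    unified["stage"] = stage
    # the real job then writes the result, e.g. unified.to_parquet(save_path)
    return unified
```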