A UDAF for sketching a column of numeric ArrayData with an array of TDigest objects. Expected to be created using tdigestArrayUDAF.
The expected numeric type of the data (Double, Int, etc.)
The delta value to be used by the TDigest objects
The maxDiscrete value to be used by the TDigest objects
A UDAF for sketching a column of MLLib Vectors with an array of TDigest objects. Expected to be created using tdigestMLLibVecUDAF.
The delta value to be used by the TDigest object
The maxDiscrete value to be used by the TDigest object
A UDAF for sketching a column of ML Vectors with an array of TDigest objects. Expected to be created using tdigestMLVecUDAF.
The delta value to be used by the TDigest object
The maxDiscrete value to be used by the TDigest object
A base class that defines the common functionality for array sketching UDAFs
A UDAF for sketching numeric data with a TDigest. Expected to be created using tdigestUDAF.
The expected numeric type of the data (Double, Int, etc.)
For declaring implicit values that map numeric types to corresponding DataType values
Implicitly unpacks a TDigestArraySQL to extract its Array[TDigest] payload
Implicitly unpacks a TDigestSQL to extract its TDigest payload
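The implicit unpacking described above can be sketched as follows. This is a minimal illustration, not the library's own example: it assumes the UDAF-based release of isarn-sketches-spark is on the classpath, that the conversions come into scope through the `org.isarnproject.sketches.udaf._` wildcard import, and that it is pasted into spark-shell or run as a Scala script. The column name `x` and the sample values are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.isarnproject.sketches.TDigest
import org.isarnproject.sketches.udaf._
import org.apache.spark.isarnproject.sketches.udt._

// Local session for illustration; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[*]").appName("tdigest-implicit-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: a single numeric column named "x".
val data = Seq(1.0, 2.0, 3.0, 4.0, 5.0).toDF("x")

// Bind the UDAF to a val first, so the column is not mistaken for
// the factory's implicit parameter list.
val udafTD = tdigestUDAF[Double]

// The aggregate yields one row whose first field is a TDigestSQL wrapper.
val tdSQL = data.agg(udafTD($"x")).first.getAs[TDigestSQL](0)

// The type ascription asks the compiler to apply the implicit conversion,
// so no explicit `.tdigest` call is needed here.
val td: TDigest = tdSQL

// The unwrapped digest can be queried directly.
val fraction = td.cdf(2.0)
```

Writing `tdSQL.tdigest` explicitly is equivalent and may be easier to read in code where the implicit import is not obvious.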
Obtain a UDAF for sketching a numeric array-data Dataset column, using a t-digest for each element.
The numeric type of the array-data column (Double, Int, etc.)
A UDAF that can be applied to a Dataset array-data column
import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
// create a UDAF for a t-digest array, adding custom settings for delta and maxDiscrete
val udafTD = tdigestArrayUDAF[Double].delta(0.1).maxDiscrete(25)
// apply the UDAF to get an array of t-digests, one for each element in the array-data
val agg = data.agg(udafTD($"NumericArrayColumn"))
// extract the t-digest array from the single result row
val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
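As a hedged follow-up, the sketch below shows one way the resulting array of t-digests might be used: computing an approximate median per array position. It assumes the UDAF-based release of isarn-sketches-spark is on the classpath and that the code is pasted into spark-shell or run as a Scala script. The column name `arr` and the sample rows are hypothetical, and `cdfInverse` is the quantile method of the isarn-sketches `TDigest` class.

```scala
import org.apache.spark.sql.SparkSession
import org.isarnproject.sketches.udaf._
import org.apache.spark.isarnproject.sketches.udt._

// Local session for illustration; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[*]").appName("tdigest-array-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: each row holds a two-element numeric array.
val data = Seq(
  Array(1.0, 100.0),
  Array(2.0, 200.0),
  Array(3.0, 300.0)
).toDF("arr")

// Default settings; delta and maxDiscrete could be customized as shown above.
val udafTD = tdigestArrayUDAF[Double]

// One t-digest per array position, aggregated over all rows.
val tdArray = data.agg(udafTD($"arr")).first.getAs[TDigestArraySQL](0).tdigests

// Approximate median of each position.
val medians = tdArray.map(_.cdfInverse(0.5))
```

Each digest in `tdArray` summarizes one array position independently, so the rows are assumed to hold arrays of the same length.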
Obtain a UDAF for sketching an MLLib Vector Dataset column, using a t-digest for each element in the vector
A UDAF that can be applied to an MLLib Vector column
import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
// create a UDAF for a t-digest array, adding custom settings for delta and maxDiscrete
val udafTD = tdigestMLLibVecUDAF.delta(0.1).maxDiscrete(25)
// apply the UDAF to get an array of t-digests, one for each element in the vector
val agg = data.agg(udafTD($"MLLibVecColumn"))
// extract the t-digest array from the single result row
val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
Obtain a UDAF for sketching an ML Vector Dataset column, using a t-digest for each element in the vector
A UDAF that can be applied to an ML Vector column
import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
// create a UDAF for a t-digest array, adding custom settings for delta and maxDiscrete
val udafTD = tdigestMLVecUDAF.delta(0.1).maxDiscrete(25)
// apply the UDAF to get an array of t-digests, one for each element in the vector
val agg = data.agg(udafTD($"MLVecColumn"))
// extract the t-digest array from the single result row
val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
Obtain a UDAF for sketching a single numeric Dataset column using a t-digest
The numeric type of the column (Double, Int, etc.)
A UDAF that can be applied to a Dataset column
import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
// create a UDAF for a t-digest, adding custom settings for delta and maxDiscrete
val udafTD = tdigestUDAF[Double].delta(0.1).maxDiscrete(25)
// apply the UDAF to get a t-digest for a data column
val agg = data.agg(udafTD($"NumericColumn"))
// extract the t-digest from the single result row
val td = agg.first.getAs[TDigestSQL](0).tdigest
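Once the t-digest has been extracted, it can be queried for approximate quantiles and cumulative probabilities. The following is a minimal end-to-end sketch under stated assumptions: the UDAF-based release of isarn-sketches-spark is on the classpath, the code is pasted into spark-shell or run as a Scala script, and the column name `x` and the generated values are hypothetical. `cdf` and `cdfInverse` are methods of the isarn-sketches `TDigest` class.

```scala
import org.apache.spark.sql.SparkSession
import org.isarnproject.sketches.udaf._
import org.apache.spark.isarnproject.sketches.udt._

// Local session for illustration; in spark-shell a session already exists.
val spark = SparkSession.builder().master("local[*]").appName("tdigest-udaf-sketch").getOrCreate()
import spark.implicits._

// Hypothetical data: the values 1.0 through 1000.0 in a column named "x".
val data = (1 to 1000).map(_.toDouble).toDF("x")

// Default settings; delta and maxDiscrete could be customized as shown above.
val udafTD = tdigestUDAF[Double]

// Aggregate the column into a single t-digest and unwrap it.
val td = data.agg(udafTD($"x")).first.getAs[TDigestSQL](0).tdigest

// Approximate median of the column.
val median = td.cdfInverse(0.5)

// Approximate fraction of values at or below 250.
val fracBelow250 = td.cdf(250.0)
```

Both results are approximations whose accuracy depends on the sketch's compression setting (`delta`).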
Package-wide methods, implicits, and definitions for sketching UDAFs