Package

org.isarnproject.sketches

udaf

package udaf

Package-wide methods, implicits, and definitions for sketching UDAFs (user-defined aggregate functions).

Linear Supertypes
AnyRef, Any

Type Members

  1. case class TDigestArrayUDAF[N](deltaV: Double, maxDiscreteV: Int)(implicit num: Numeric[N], dataTpe: TDigestUDAFDataType[N]) extends TDigestMultiUDAF with Product with Serializable

    A UDAF for sketching a column of numeric ArrayData with an array of TDigest objects. Expected to be created using tdigestArrayUDAF.

    N

    the expected numeric type of the data; Double, Int, etc

    deltaV

    The delta value to be used by the TDigest objects

    maxDiscreteV

    The maxDiscrete value to be used by the TDigest objects
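The usage examples on this page chain `.delta(...)` and `.maxDiscrete(...)` on a UDAF case class that carries `deltaV` and `maxDiscreteV`. A minimal, dependency-free sketch of how such chained setters can work on an immutable case class follows. `SketchSettings` is a hypothetical stand-in, the `copy`-based setters are an assumption about the pattern rather than the library's confirmed implementation, and the starting values are arbitrary placeholders, not library defaults.

```scala
object SettingsSketch {
  // Hypothetical stand-in for a UDAF case class that carries its sketch settings
  final case class SketchSettings(deltaV: Double, maxDiscreteV: Int) {
    // Each setter returns a modified copy; the original instance is unchanged
    def delta(d: Double): SketchSettings = copy(deltaV = d)
    def maxDiscrete(m: Int): SketchSettings = copy(maxDiscreteV = m)
  }

  // Arbitrary placeholder values (not the library's defaults)
  val base: SketchSettings = SketchSettings(deltaV = 0.5, maxDiscreteV = 0)

  // Chained configuration, in the same style as the examples on this page
  val custom: SketchSettings = base.delta(0.1).maxDiscrete(25)
}
```

Because every setter returns a new value, a configured instance can be shared across aggregations without one caller's settings leaking into another's.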

  2. class TDigestDoubleUDAF extends UserDefinedAggregateFunction

  3. case class TDigestMLLibVecUDAF(deltaV: Double, maxDiscreteV: Int) extends TDigestMultiUDAF with Product with Serializable

    A UDAF for sketching a column of MLLib Vectors with an array of TDigest objects. Expected to be created using tdigestMLLibVecUDAF.

    deltaV

    The delta value to be used by the TDigest objects

    maxDiscreteV

    The maxDiscrete value to be used by the TDigest objects

  4. case class TDigestMLVecUDAF(deltaV: Double, maxDiscreteV: Int) extends TDigestMultiUDAF with Product with Serializable

    A UDAF for sketching a column of ML Vectors with an array of TDigest objects. Expected to be created using tdigestMLVecUDAF.

    deltaV

    The delta value to be used by the TDigest objects

    maxDiscreteV

    The maxDiscrete value to be used by the TDigest objects

  5. abstract class TDigestMultiUDAF extends UserDefinedAggregateFunction

    A base class that defines the common functionality for array sketching UDAFs
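To illustrate what "array sketching" means here, the following dependency-free sketch folds rows of arrays into one summary per array position. `Summary` is a hypothetical stand-in for a t-digest that tracks only a count and a sum, and the sketch assumes every row has the same length. It shows the per-position aggregation shape only, not the library's implementation.

```scala
object PerElementSketch {
  // Hypothetical stand-in for a t-digest: tracks only count and sum
  final case class Summary(n: Long, sum: Double) {
    def +(x: Double): Summary = Summary(n + 1, sum + x)
    def mean: Double = if (n == 0) 0.0 else sum / n
  }

  val empty: Summary = Summary(0L, 0.0)

  // Fold rows of equal-length arrays into one Summary per array position
  def sketchColumns(rows: Seq[Array[Double]]): Array[Summary] =
    if (rows.isEmpty) Array.empty[Summary]
    else rows.foldLeft(Array.fill(rows.head.length)(empty)) { (acc, row) =>
      acc.zip(row).map { case (s, x) => s + x }
    }

  // Two rows of two-element arrays yield two summaries
  val result: Array[Summary] =
    sketchColumns(Seq(Array(1.0, 10.0), Array(3.0, 30.0)))
}
```

The result has the same length as the input arrays, which matches the "array of TDigest objects" returned by the concrete UDAFs listed above.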

  6. class TDigestUDAF[N] extends UserDefinedAggregateFunction

    A UDAF for sketching numeric data with a TDigest. Expected to be created using tdigestUDAF.

    N

    the expected numeric type of the data; Double, Int, etc

  7. case class TDigestUDAFDataType[N](tpe: DataType) extends Product with Serializable

    For declaring implicit values that map numeric types to corresponding DataType values
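The mapping from a numeric type to a DataType value can be pictured as a small type-class lookup. The sketch below is dependency-free: `DataType`, `IntegerType`, and `DoubleType` are local stand-ins for Spark's types, and `UDAFDataType` and `columnTypeOf` are hypothetical names that only mirror the shape of `TDigestUDAFDataType[N]`.

```scala
object DataTypeSketch {
  // Local stand-ins for Spark's DataType hierarchy (assumption, to avoid a Spark dependency)
  sealed trait DataType
  case object IntegerType extends DataType
  case object DoubleType extends DataType

  // Hypothetical mirror of TDigestUDAFDataType[N]: N only tags the value at the type level
  final case class UDAFDataType[N](tpe: DataType)

  // One implicit value per supported numeric type
  implicit val intDataType: UDAFDataType[Int] = UDAFDataType[Int](IntegerType)
  implicit val doubleDataType: UDAFDataType[Double] = UDAFDataType[Double](DoubleType)

  // The compiler selects the implicit instance that matches N
  def columnTypeOf[N](implicit dt: UDAFDataType[N]): DataType = dt.tpe

  val intColumn: DataType = columnTypeOf[Int]
  val doubleColumn: DataType = columnTypeOf[Double]
}
```

With this pattern, requesting a type that has no implicit instance fails at compile time rather than at run time.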

Value Members

  1. object Static2TDigestDoubleUDAF extends UserDefinedAggregateFunction

  2. object StaticTDigestDoubleUDAF extends TDigestDoubleUDAF

  3. implicit def implicitTDigestArraySQLToTDigestArray(tdasql: TDigestArraySQL): Array[TDigest]

    implicitly unpack a TDigestArraySQL to extract its Array[TDigest] payload

  4. implicit def implicitTDigestSQLToTDigest(tdsql: TDigestSQL): TDigest

    implicitly unpack a TDigestSQL to extract its TDigest payload
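The two unpacking conversions above follow the usual Scala implicit-conversion pattern: a wrapper value is accepted wherever its payload type is expected. A minimal, dependency-free sketch of that pattern follows. `Digest` and `DigestSQL` are hypothetical stand-ins for `TDigest` and `TDigestSQL`, and `totalMass` is an illustrative function, not part of the library.

```scala
import scala.language.implicitConversions

object UnpackSketch {
  // Hypothetical stand-ins: a payload type and the SQL-facing wrapper around it
  final case class Digest(mass: Double)
  final case class DigestSQL(digest: Digest)

  // Same shape as implicitTDigestSQLToTDigest: wrapper in, payload out
  implicit def digestSQLToDigest(w: DigestSQL): Digest = w.digest

  // A function that only knows about the payload type
  def totalMass(d: Digest): Double = d.mass

  // The wrapper is passed directly; the compiler inserts the conversion
  val unpacked: Double = totalMass(DigestSQL(Digest(3.0)))
}
```

The conversion applies only where it is in implicit scope, which for this package would typically mean after importing the package's members.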

  5. object python

  6. implicit val tDigestUDAFDataTypeByte: TDigestUDAFDataType[Byte]

  7. implicit val tDigestUDAFDataTypeDouble: TDigestUDAFDataType[Double]

  8. implicit val tDigestUDAFDataTypeFloat: TDigestUDAFDataType[Float]

  9. implicit val tDigestUDAFDataTypeInt: TDigestUDAFDataType[Int]

  10. implicit val tDigestUDAFDataTypeLong: TDigestUDAFDataType[Long]

  11. implicit val tDigestUDAFDataTypeShort: TDigestUDAFDataType[Short]

  12. def tdigestArrayUDAF[N](implicit num: Numeric[N], dataType: TDigestUDAFDataType[N]): TDigestArrayUDAF[N]

    Obtain a UDAF for sketching a numeric array-data Dataset column, using a t-digest for each element.

    N

    The numeric type of the array-data column; Double, Int, etc

    returns

    A UDAF that can be applied to a Dataset array-data column

    Example:
      import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
      // assumes `data` is a DataFrame and spark.implicits._ is in scope for the $"..." syntax
      // create a UDAF for t-digest array, adding custom settings for delta and maxDiscrete
      val udafTD = tdigestArrayUDAF[Double].delta(0.1).maxDiscrete(25)
      // apply the UDAF to get an array of t-digests, one per element position in the array data
      val agg = data.agg(udafTD($"NumericArrayColumn"))
      // agg is a single-row DataFrame; take its first Row and extract the t-digest array
      val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
  13. def tdigestMLLibVecUDAF: TDigestMLLibVecUDAF

    Obtain a UDAF for sketching an MLLib Vector Dataset column, using a t-digest for each element in the vector

    returns

    A UDAF that can be applied to an MLLib Vector column

    Example:
      import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
      // assumes `data` is a DataFrame and spark.implicits._ is in scope for the $"..." syntax
      // create a UDAF for a t-digest array (no type parameter), with custom delta and maxDiscrete
      val udafTD = tdigestMLLibVecUDAF.delta(0.1).maxDiscrete(25)
      // apply the UDAF to get an array of t-digests, one per element of the MLLib vectors
      val agg = data.agg(udafTD($"MLLibVecColumn"))
      // agg is a single-row DataFrame; take its first Row and extract the t-digest array
      val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
  14. def tdigestMLVecUDAF: TDigestMLVecUDAF

    Obtain a UDAF for sketching an ML Vector Dataset column, using a t-digest for each element in the vector

    returns

    A UDAF that can be applied to an ML Vector column

    Example:
      import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
      // assumes `data` is a DataFrame and spark.implicits._ is in scope for the $"..." syntax
      // create a UDAF for a t-digest array (no type parameter), with custom delta and maxDiscrete
      val udafTD = tdigestMLVecUDAF.delta(0.1).maxDiscrete(25)
      // apply the UDAF to get an array of t-digests, one per element of the ML vectors
      val agg = data.agg(udafTD($"MLVecColumn"))
      // agg is a single-row DataFrame; take its first Row and extract the t-digest array
      val tdArray = agg.first.getAs[TDigestArraySQL](0).tdigests
  15. def tdigestUDAF[N](implicit num: Numeric[N], dataType: TDigestUDAFDataType[N]): TDigestUDAF[N]

    Obtain a UDAF for sketching a single numeric Dataset column using a t-digest

    N

    The numeric type of the column; Double, Int, etc

    returns

    A UDAF that can be applied to a Dataset column

    Example:
      import org.isarnproject.sketches.udaf._, org.apache.spark.isarnproject.sketches.udt._
      // assumes `data` is a DataFrame and spark.implicits._ is in scope for the $"..." syntax
      // create a UDAF for a t-digest, adding custom settings for delta and maxDiscrete
      val udafTD = tdigestUDAF[Double].delta(0.1).maxDiscrete(25)
      // apply the UDAF to get a t-digest for a data column
      val agg = data.agg(udafTD($"NumericColumn"))
      // agg is a single-row DataFrame; take its first Row and extract the t-digest
      val td = agg.first.getAs[TDigestSQL](0).tdigest
