org.isarnproject.sketches

TDigest

case class TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap) extends Serializable with Product

A t-digest sketch of sampled numeric data, as described in: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

import org.isarnproject.sketches.TDigest
val data = Vector.fill(10000) { scala.util.Random.nextGaussian() }
// sketch of some Gaussian data
val sketch = TDigest.sketch(data)
// the cumulative distribution function of the sketch; cdf(x) at x = 0
val cdf = sketch.cdf(0.0)
// inverse of the CDF, evaluated at q = 0.5
val cdfi = sketch.cdfInverse(0.5)
Linear Supertypes
Product, Equals, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap)

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. def +[N1, N2](xw: (N1, N2))(implicit num1: Numeric[N1], num2: Numeric[N2]): TDigest

    Returns a new t-digest with new pair (x, w) included in its sketch.

    xw

    A pair (x, w) where x is the numeric value and w is its weight

    returns

    the updated sketch

    Note

    This implements 'algorithm 1' from: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

  5. def +[N](x: N)(implicit num: Numeric[N]): TDigest

    Returns a new t-digest with value x included in its sketch; td + x is equivalent to td + (x, 1).

    x

    The numeric data value to include in the sketch

    returns

    the updated sketch
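
As a minimal sketch of how the two `+` overloads might be used together (the sample values are illustrative only; `TDigest.sketch` is the factory shown in the class example at the top of this page):

```scala
import org.isarnproject.sketches.TDigest

// Start from a sketch of a small data set.
val base = TDigest.sketch(Vector(1.0, 2.0, 3.0))

// Add a single value; per the description above this is equivalent to adding (4.0, 1).
val plusOne = base + 4.0

// Add a weighted pair (x, w): the value 5.0 with weight 2.5.
// The doubled parentheses make it explicit that a tuple is being passed.
val weighted = plusOne + ((5.0, 2.5))

// Each call returns a new TDigest; `base` and `plusOne` remain usable as they were.
val folded = Vector(6.0, 7.0, 8.0).foldLeft(weighted)(_ + _)
```

Because every update returns a new digest, a fold over the incoming data is a natural way to build a sketch incrementally.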

  6. def ++(that: TDigest): TDigest

    Merges this t-digest with another and returns the combined digest.

    that

    The right-hand t-digest operand

    returns

    the result of combining left and right digests
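
A hedged example of merging two independently built sketches with `++`, as one might do when data is sketched per partition and then combined. The two Gaussian data sets and the seed are assumptions chosen for illustration:

```scala
import org.isarnproject.sketches.TDigest

val rng = new scala.util.Random(42)

// Two sketches built separately, e.g. on different partitions of a data set.
val left  = TDigest.sketch(Vector.fill(5000) { rng.nextGaussian() })
val right = TDigest.sketch(Vector.fill(5000) { rng.nextGaussian() + 10.0 })

// Combine them into a single sketch covering both data sets.
val merged = left ++ right

// Roughly half of the combined data lies below the gap between the two modes.
val massBelowGap = merged.cdf(5.0)
```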

  7. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  8. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  9. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  10. def cdf[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch.

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  11. def cdfDiscrete[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  12. def cdfDiscreteInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the smallest value x such that q <= cdf(x)
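
The discrete variants can be illustrated with data that takes only a few distinct values. This is a sketch under assumptions: the `maxDiscrete` named argument to `TDigest.sketch` is assumed here and is not documented on this page, and the die-roll data is purely illustrative.

```scala
import org.isarnproject.sketches.TDigest

val rng = new scala.util.Random(7)

// Simulated rolls of a six-sided die: only six distinct values occur.
val rolls = Vector.fill(6000) { 1 + rng.nextInt(6) }

// ASSUMPTION: TDigest.sketch accepts a `maxDiscrete` argument; with at most
// 10 clusters allowed in discrete mode, the six die faces should stay discrete.
val dice = TDigest.sketch(rolls, maxDiscrete = 10)

// Estimated probability that a roll is <= 3 (about one half for a fair die).
val pAtMost3 = dice.cdfDiscrete(3)

// Smallest face x such that 0.5 <= cdf(x).
val medianFace = dice.cdfDiscreteInverse(0.5)
```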

  13. def cdfInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch.

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the value x such that cdf(x) = q

  14. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  15. val clusters: TDigestMap

  16. val delta: Double

  17. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  18. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  21. val maxDiscrete: Int

  22. val nclusters: Int

  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. def sample: Double

    Perform a random sampling from the distribution as sketched by this t-digest, using "discrete" (PMF) mode if the number of clusters <= maxDiscrete setting, and "density" (PDF) mode otherwise.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  27. def samplePDF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability density" mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  28. def samplePMF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability mass" (i.e. discrete) mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method
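
The sampling methods can be exercised as below. The second half illustrates the general idea of inverse transform sampling (draw a uniform q, then map it through the inverse CDF); it is an illustration of the technique named in the notes above, not necessarily how the library implements it internally. The seed and sample sizes are arbitrary assumptions.

```scala
import org.isarnproject.sketches.TDigest

val rng = new scala.util.Random(123)
val gaussian = TDigest.sketch(Vector.fill(10000) { rng.nextGaussian() })

// Draw values directly from the sketched distribution in density (PDF) mode.
val draws = Vector.fill(2000) { gaussian.samplePDF }
val drawMean = draws.sum / draws.size

// Inverse transform sampling by hand: uniform q on [0, 1) mapped through cdfInverse.
val manual = Vector.fill(2000) { gaussian.cdfInverse(rng.nextDouble()) }
```

Use `sample` instead of `samplePDF` to let the digest pick discrete or density mode based on its `maxDiscrete` setting.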

  29. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  30. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  31. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
