org.isarnproject.sketches

TDigest

Related Docs: object TDigest | package sketches

case class TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap) extends Serializable with Product

A t-digest sketch of sampled numeric data, as described in "Computing Extremely Accurate Quantiles Using t-Digests" by Ted Dunning and Otmar Ertl: https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

import org.isarnproject.sketches.TDigest
val data = Vector.fill(10000) { scala.util.Random.nextGaussian() }
// sketch of some Gaussian data
val sketch = TDigest.sketch(data)
// the cumulative distribution function of the sketch; cdf(x) at x = 0
val cdf = sketch.cdf(0.0)
// inverse of the CDF, evaluated at q = 0.5
val cdfi = sketch.cdfInverse(0.5)
Linear Supertypes
Product, Equals, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new TDigest(delta: Double, maxDiscrete: Int, nclusters: Int, clusters: TDigestMap)

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. def +[N1, N2](xw: (N1, N2))(implicit num1: Numeric[N1], num2: Numeric[N2]): TDigest

    Returns a new t-digest with the pair (x, w) included in its sketch.

    xw

    A pair (x, w) where x is the numeric value and w is its weight

    returns

    the updated sketch

    Note

    This implements 'algorithm 1' from: Computing Extremely Accurate Quantiles Using t-Digests, Ted Dunning and Otmar Ertl, https://github.com/tdunning/t-digest/blob/master/docs/t-digest-paper/histo.pdf

  4. def +[N](x: N)(implicit num: Numeric[N]): TDigest

    Returns a new t-digest with value x included in its sketch; td + x is equivalent to td + (x, 1).

    x

    The numeric data value to include in the sketch

    returns

    the updated sketch
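
    A minimal usage sketch of both + overloads, assuming the isarn-sketches library is on the classpath and reusing TDigest.sketch from the class-level example. It is written REPL-style, like that example. The pair is wrapped in its own parentheses so that it is passed as a single tuple argument; the sample data, the value 3.0, and the weight 5 are illustrative only.

    ```scala
    import org.isarnproject.sketches.TDigest

    val data = Vector.fill(1000) { scala.util.Random.nextGaussian() }
    val td = TDigest.sketch(data)

    // Each + returns a new sketch rather than updating td in place
    val td1 = td + 3.0          // single value, equivalent to weight 1
    val td2 = td + ((3.0, 5))   // the pair (x, w): value 3.0 with weight 5

    // The heavier weight at 3.0 shifts more probability mass toward that value
    println(td1.cdf(2.9))
    println(td2.cdf(2.9))
    ```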

  5. def ++(that: TDigest): TDigest

    Combines this t-digest with another, returning the merged sketch.

    that

    The right-hand t-digest operand

    returns

    the result of combining left and right digests
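
    A short sketch of how ++ might be used to merge sketches built over separate partitions of a data set, assuming the isarn-sketches library is on the classpath. The split into two halves and the fixed seed are illustrative choices, not anything the library prescribes.

    ```scala
    import org.isarnproject.sketches.TDigest

    val rng = new scala.util.Random(42)
    val data = Vector.fill(10000) { rng.nextGaussian() }
    val (left, right) = data.splitAt(5000)

    // Sketch each partition independently, then merge the two sketches
    val merged = TDigest.sketch(left) ++ TDigest.sketch(right)

    // For comparison: a single sketch over all of the data
    val whole = TDigest.sketch(data)

    // The two median estimates should be close, though not necessarily identical
    println(merged.cdfInverse(0.5))
    println(whole.cdfInverse(0.5))
    ```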

  6. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def cdf[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch.

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  9. def cdfDiscrete[N](x: N)(implicit num: Numeric[N]): Double

    Compute a cumulative probability (CDF) for a numeric value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).

    x

    a numeric value

    returns

    the cumulative probability that a random sample from the distribution is <= x

  10. def cdfDiscreteInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch, assuming the sketch is "discrete" (e.g. if the number of clusters <= the maxDiscrete setting).

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the smallest value x such that q <= cdf(x)
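
    The return value described above, the smallest x such that q <= cdf(x), can be illustrated without the library. The following is a minimal sketch over a small hypothetical set of weighted points; points, discreteCdf, and discreteCdfInverse are names invented for this illustration, and this is not the library's implementation.

    ```scala
    // Hypothetical discrete distribution given as (value, weight) pairs
    val points: Vector[(Double, Double)] = Vector((1.0, 2.0), (2.0, 1.0), (5.0, 1.0))
    val total = points.map(_._2).sum

    // cdf(x): the fraction of the total weight located at values <= x
    def discreteCdf(x: Double): Double =
      points.filter(_._1 <= x).map(_._2).sum / total

    // Inverse: scan the values in ascending order and return the first x with q <= cdf(x)
    def discreteCdfInverse(q: Double): Double = {
      require(q >= 0.0 && q <= 1.0, "q must lie in [0, 1]")
      points.map(_._1).sorted.find(x => q <= discreteCdf(x)).get
    }

    println(discreteCdfInverse(0.5))  // 1.0, because cdf(1.0) = 0.5
    println(discreteCdfInverse(0.6))  // 2.0, because cdf(1.0) = 0.5 < 0.6 <= cdf(2.0) = 0.75
    ```

    In a discrete distribution the CDF is a step function, so many values of q map to the same x; this is why the result is defined as a smallest value rather than as an exact solution of cdf(x) = q.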

  11. def cdfInverse[N](q: N)(implicit num: Numeric[N]): Double

    Compute the inverse cumulative probability (inverse-CDF) for a quantile value, from the estimated probability distribution represented by this t-digest sketch.

    q

    a quantile value, expected to lie in the interval [0, 1]

    returns

    the value x such that cdf(x) = q

  12. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  13. val clusters: TDigestMap

  14. val delta: Double

  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  19. val maxDiscrete: Int

  20. val nclusters: Int

  21. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  22. final def notify(): Unit

    Definition Classes
    AnyRef
  23. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  24. def sample: Double

    Perform a random sampling from the distribution as sketched by this t-digest, using "discrete" (PMF) mode if the number of clusters <= maxDiscrete setting, and "density" (PDF) mode otherwise.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  25. def samplePDF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability density" mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method

  26. def samplePMF: Double

    Perform a random sampling from the distribution as sketched by this t-digest, in "probability mass" (i.e. discrete) mode.

    returns

    A random number sampled from the sketched distribution

    Note

    uses the inverse transform sampling method
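
    The inverse transform sampling method named in the notes on the sampling methods above can be sketched independently of the library: draw u uniformly from [0, 1) and return the inverse CDF evaluated at u. The example below uses an exponential distribution because its inverse CDF has a closed form; expCdfInverse and sampleExp are hypothetical names. With a t-digest, the analogous step would presumably evaluate cdfInverse or cdfDiscreteInverse at u, but that is an inference from the notes rather than something this page confirms.

    ```scala
    import scala.util.Random

    // Inverse CDF of an exponential distribution with the given rate:
    // solving q = 1 - exp(-rate * x) for x gives x = -ln(1 - q) / rate
    def expCdfInverse(q: Double, rate: Double): Double =
      -math.log(1.0 - q) / rate

    // Inverse transform sampling: map a uniform draw from [0, 1) through the inverse CDF
    def sampleExp(rng: Random, rate: Double): Double =
      expCdfInverse(rng.nextDouble(), rate)

    val rng = new Random(1234)
    val draws = Vector.fill(100000)(sampleExp(rng, 2.0))

    // The sample mean should be near the theoretical mean 1 / rate = 0.5
    val mean = draws.sum / draws.size
    println(mean)
    ```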

  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  30. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
