com.twitter.algebird

object HyperLogLog

Implementation of the HyperLogLog approximate distinct-counting algorithm as a Monoid.

Linear Supertypes
AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. def alpha(bits: Int): Double

  5. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  6. def bitsForError(err: Double): Int

    Returns the number of bits to use to achieve a given standard error.

  7. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. def error(bits: Int): Double

    The true error is distributed like a Gaussian with this standard deviation. Let m = 2^bits. The size of the HLL is m bytes.

    bits | size (bytes) | error
       9 |          512 | 0.0460
      10 |         1024 | 0.0325
      11 |         2048 | 0.0230
      12 |         4096 | 0.0163
      13 |         8192 | 0.0115
      14 |        16384 | 0.0081
      15 |        32768 | 0.0057
      16 |        65536 | 0.0041
      17 |       131072 | 0.0029
      18 |       262144 | 0.0020
      19 |       524288 | 0.0014
      20 |      1048576 | 0.0010

    Keep in mind that storing N distinct longs exactly takes only 8N bytes. See SetSizeAggregator for an approach that keeps an exact set while the cardinality is small and switches to HLL once there are enough items. Ideally, you would keep the exact set until the HLL becomes the smaller representation (in practice, because the HLL is stored as a sparse vector, a small HLL takes far less space than the sizes listed above).

  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. def fromByteBuffer(bb: ByteBuffer): HLL

  13. def fromBytes(bytes: Array[Byte]): HLL

  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. def hash(input: Array[Byte]): Array[Byte]

  16. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  17. val hashSize: Int

  18. implicit def int2Bytes(i: Int): Array[Byte]

  19. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  20. def j(bsl: BitSetLite, bits: Int): Int

    The value 'j' is equal to <w_0, w_1 ... w_(bits-1)>. TODO: We could read in a byte at a time.

  21. def jRhoW(in: Array[Byte], bits: Int): (Int, Byte)

    Computes j and \rho(w) from the paper. (Sorry for the name, but it makes it easier to compare the code with the paper.) There is an extremely low probability that rho(w) (the position of the leftmost one bit) is > 127, so we use a Byte to store it. Given a hash <w_0, w_1, w_2 ... w_n>, the value 'j' is equal to <w_0, w_1 ... w_(bits-1)> and the value 'w' is equal to <w_bits ... w_n>. The function rho counts the number of leading zeroes in 'w'. We can calculate rho(w) at once with the method rhoW.

  22. implicit def long2Bytes(i: Long): Array[Byte]

  23. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  24. final def notify(): Unit

    Definition Classes
    AnyRef
  25. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  26. def rhoW(bsl: BitSetLite, bits: Int): Byte

    The value 'w' is equal to <w_bits ... w_n>. The function rho counts the number of leading zeroes in 'w'. We can calculate rho(w) at once with the method rhoW.

  27. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  28. def toBytes(h: HLL): Array[Byte]

  29. def toString(): String

    Definition Classes
    AnyRef → Any
  30. def twopow(i: Int): Double

    Annotations
    @inline()
  31. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  32. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  33. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
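
The bits / size / error values listed under `error`, the inverse lookup described under `bitsForError`, and the j / rho(w) split described under `jRhoW` can be sketched as follows. This is a minimal illustration, not Algebird's implementation. The object `HllSketchExample` and its methods `stdError`, `bitsFor`, `bit`, `j` and `rho` are hypothetical names. The 1.04 / sqrt(m) formula is the usual HyperLogLog approximation (it reproduces the listed error values), and the bit layout of the hash, the bit order of `j`, and the rounding in `bitsFor` are all assumptions that may differ from the library.

```scala
// Illustrative sketch only: every name here is hypothetical, not Algebird's API.
object HllSketchExample {
  // Standard error for m = 2^bits registers, using the common
  // 1.04 / sqrt(m) approximation for HyperLogLog.
  def stdError(bits: Int): Double = 1.04 / math.sqrt(math.pow(2.0, bits))

  // Smallest bit count whose standard error is at most `err`.
  // Assumption: the library may round differently.
  def bitsFor(err: Double): Int = {
    require(err > 0.0 && err < 1.0, "err must be in (0, 1)")
    math.ceil(2.0 * math.log(1.04 / err) / math.log(2.0)).toInt
  }

  // Bit w_i of the hash: byte i / 8, most significant bit first (assumed layout).
  def bit(hash: Array[Byte], i: Int): Boolean =
    ((hash(i / 8) >> (7 - i % 8)) & 1) == 1

  // j packs <w_0 ... w_(bits-1)> into a register index in [0, 2^bits).
  // Assumption: w_0 is treated as the lowest-order bit of j.
  def j(hash: Array[Byte], bits: Int): Int =
    (0 until bits).foldLeft(0)((acc, i) => if (bit(hash, i)) acc | (1 << i) else acc)

  // rho(w) in the paper's convention: the 1-based position of the leftmost
  // one bit in <w_bits ... w_n>, i.e. the number of leading zeroes plus one.
  def rho(hash: Array[Byte], bits: Int): Int = {
    val total = hash.length * 8
    var idx = bits
    while (idx < total && !bit(hash, idx)) idx += 1
    idx - bits + 1
  }

  def main(args: Array[String]): Unit = {
    // Print bits, register count and standard error for a few rows.
    for (bits <- 9 to 12)
      println(f"$bits%2d ${1 << bits}%5d ${stdError(bits)}%.4f")
    // Hash bits: 1010 0000 0001 0000
    val hash = Array[Byte](0xA0.toByte, 0x10.toByte)
    println(s"j = ${j(hash, 4)}, rho = ${rho(hash, 4)}")
  }
}
```

Since a Byte holds values up to 127, the result of `rho` fits in a Byte for any realistic hash width, which is consistent with the return type of `rhoW` above.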
