org.apache.cassandra.utils
Class StreamingHistogram

java.lang.Object
  extended by org.apache.cassandra.utils.StreamingHistogram

public class StreamingHistogram
extends java.lang.Object

A histogram that can be built incrementally from a stream of data. The algorithm is taken from the following paper: Yael Ben-Haim and Elad Tom-Tov, "A Streaming Parallel Decision Tree Algorithm" (2010), http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf


Nested Class Summary
static class StreamingHistogram.StreamingHistogramSerializer
           
 
Field Summary
static StreamingHistogram.StreamingHistogramSerializer serializer
           
 
Constructor Summary
StreamingHistogram(int maxBinSize)
          Creates a new histogram that holds at most maxBinSize bins.
 
Method Summary
 java.util.Map<java.lang.Double,java.lang.Long> getAsMap()
           
 void merge(StreamingHistogram other)
          Merges given histogram with this histogram.
 double sum(double b)
          Calculates estimated number of points in interval [-inf,b].
 void update(double p)
          Adds new point p to this histogram.
 void update(double p, long m)
          Adds new point p with value m to this histogram.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

serializer

public static final StreamingHistogram.StreamingHistogramSerializer serializer
Constructor Detail

StreamingHistogram

public StreamingHistogram(int maxBinSize)
Creates a new histogram that holds at most maxBinSize bins.

Parameters:
maxBinSize - maximum number of bins this histogram can have
Method Detail

update

public void update(double p)
Adds new point p to this histogram.

Parameters:
p - the point to add

update

public void update(double p,
                   long m)
Adds new point p with value m to this histogram.

Parameters:
p - the point to add
m - the value (count) to record for point p
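
The update step can be illustrated with a minimal sketch, assuming the bins are kept in a sorted map from point to count, as the return type of getAsMap() suggests. The class name HistogramSketch and the helper mergeClosestPair are hypothetical; this follows the update procedure of the cited paper and is not the Cassandra source.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the paper's update procedure; not the Cassandra source.
class HistogramSketch {
    private final TreeMap<Double, Long> bins = new TreeMap<>();
    private final int maxBinSize;

    HistogramSketch(int maxBinSize) {
        if (maxBinSize < 1) {
            throw new IllegalArgumentException("maxBinSize must be at least 1");
        }
        this.maxBinSize = maxBinSize;
    }

    // Adds point p with a count of 1.
    void update(double p) {
        update(p, 1L);
    }

    // Adds point p with count m, then shrinks back to at most maxBinSize bins.
    void update(double p, long m) {
        bins.merge(p, m, Long::sum);
        while (bins.size() > maxBinSize) {
            mergeClosestPair();
        }
    }

    // Replaces the two adjacent bins with the smallest gap by their weighted centroid.
    // Assumes finite points; NaN and infinite values are not handled in this sketch.
    private void mergeClosestPair() {
        double left = 0, right = 0, bestGap = Double.POSITIVE_INFINITY;
        Double prev = null;
        for (double key : bins.keySet()) {
            if (prev != null && key - prev < bestGap) {
                bestGap = key - prev;
                left = prev;
                right = key;
            }
            prev = key;
        }
        long leftCount = bins.remove(left);
        long rightCount = bins.remove(right);
        long total = leftCount + rightCount;
        double centroid = (left * leftCount + right * rightCount) / total;
        bins.merge(centroid, total, Long::sum);
    }

    // Returns a copy of the bins, sorted by point.
    Map<Double, Long> getAsMap() {
        return new TreeMap<>(bins);
    }
}
```

With maxBinSize set to 2, adding the points 1.0, 2.0 and 10.0 leaves two bins in this sketch: 1.5 with count 2 and 10.0 with count 1.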

merge

public void merge(StreamingHistogram other)
Merges given histogram with this histogram.

Parameters:
other - histogram to merge
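
A minimal sketch of the merge procedure described in the cited paper: take the union of both sets of bins, then repeatedly collapse the closest pair until the bin limit is met. The class MergeSketch, its method names, and the explicit maxBins parameter are assumptions made for illustration; the actual implementation may reach the same goal differently, for example by feeding each bin of the other histogram through update.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the paper's merge procedure; not the Cassandra source.
class MergeSketch {
    // Folds the bins of 'other' into 'target', keeping at most maxBins bins.
    static void merge(TreeMap<Double, Long> target, Map<Double, Long> other, int maxBins) {
        for (Map.Entry<Double, Long> e : other.entrySet()) {
            target.merge(e.getKey(), e.getValue(), Long::sum);
        }
        while (target.size() > maxBins && target.size() > 1) {
            collapseClosestPair(target);
        }
    }

    // Replaces the two adjacent bins with the smallest gap by their weighted centroid.
    // Assumes finite points; NaN and infinite values are not handled in this sketch.
    static void collapseClosestPair(TreeMap<Double, Long> bins) {
        double left = 0, right = 0, bestGap = Double.POSITIVE_INFINITY;
        Double prev = null;
        for (double key : bins.keySet()) {
            if (prev != null && key - prev < bestGap) {
                bestGap = key - prev;
                left = prev;
                right = key;
            }
            prev = key;
        }
        long leftCount = bins.remove(left);
        long rightCount = bins.remove(right);
        long total = leftCount + rightCount;
        double centroid = (left * leftCount + right * rightCount) / total;
        bins.merge(centroid, total, Long::sum);
    }
}
```

Merging never changes the total count; only the bin positions are approximated once the limit is exceeded.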

sum

public double sum(double b)
Calculates estimated number of points in interval [-inf,b].

Parameters:
b - upper bound of the interval to sum over
Returns:
estimated number of points in the interval [-inf,b].
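
The estimate can be sketched as follows, using the sum procedure from the cited paper: bins entirely below b are counted in full, half of the bin at or just below b is counted, and the remainder is interpolated linearly (a trapezoid) towards the next bin. The class SumSketch and the sorted-map representation of the bins are assumptions for illustration, not the Cassandra source.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the paper's sum procedure; not the Cassandra source.
class SumSketch {
    // Estimates the number of points in [-inf, b] from bins mapping point to count.
    static double sum(TreeMap<Double, Long> bins, double b) {
        Map.Entry<Double, Long> next = bins.higherEntry(b);
        if (next == null) {
            // b is at or beyond the last bin: every point is counted.
            double total = 0;
            for (long count : bins.values()) {
                total += count;
            }
            return total;
        }
        Map.Entry<Double, Long> cur = bins.floorEntry(b);
        if (cur == null) {
            // b is below the first bin: nothing is counted.
            return 0;
        }
        // Fraction of the way from the current bin to the next one.
        double weight = (b - cur.getKey()) / (next.getKey() - cur.getKey());
        // Interpolated count at b.
        double mb = cur.getValue() + (next.getValue() - cur.getValue()) * weight;
        // Trapezoid between the current bin and b.
        double sum = (cur.getValue() + mb) * weight / 2;
        // Half of the current bin lies to the left of its centroid.
        sum += cur.getValue() / 2.0;
        // All bins strictly below the current one are counted in full.
        for (long count : bins.headMap(cur.getKey(), false).values()) {
            sum += count;
        }
        return sum;
    }
}
```

For bins {1.0: 2, 3.0: 4, 5.0: 6}, this sketch returns 2.25 for b = 2.0, 0 for b = 0.0, and 12 for b = 10.0.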

getAsMap

public java.util.Map<java.lang.Double,java.lang.Long> getAsMap()


Copyright © 2013 The Apache Software Foundation