public class StreamingTombstoneHistogramBuilder
extends java.lang.Object
The original algorithm is taken from the following paper: Yael Ben-Haim and Elad Tom-Tov, "A Streaming Parallel Decision Tree Algorithm" (2010) http://jmlr.csail.mit.edu/papers/volume11/ben-haim10a/ben-haim10a.pdf
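Algorithm: the histogram is represented as a collection of {point, weight} pairs. When a new point p with weight m is added, its weight is accumulated into that collection, and the closest points are merged as needed so that the collection holds at most maxBinSize pairs.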
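The update step from the paper can be sketched as follows. This is a minimal illustration, not the implementation of this class: the class name `BinMergeSketch`, the use of a `TreeMap`, the linear scan for the closest pair, and the truncating integer weighted mean are all assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration class; not part of Cassandra. */
class BinMergeSketch {
    private final int maxBinSize;
    // Sorted map from point to accumulated weight.
    private final TreeMap<Integer, Integer> bins = new TreeMap<>();

    BinMergeSketch(int maxBinSize) {
        if (maxBinSize < 1) {
            throw new IllegalArgumentException("maxBinSize must be at least 1");
        }
        this.maxBinSize = maxBinSize;
    }

    /** Adds point p with weight m; merges the two closest points if the limit is exceeded. */
    void update(int p, int m) {
        if (m <= 0) {
            throw new IllegalArgumentException("weight must be positive");
        }
        bins.merge(p, m, Integer::sum);
        if (bins.size() > maxBinSize) {
            mergeClosestPair();
        }
    }

    private void mergeClosestPair() {
        boolean first = true;
        int prev = 0;
        int left = 0;
        int right = 0;
        long bestGap = Long.MAX_VALUE;
        // Keys are sorted, so the closest pair is always a pair of neighbours.
        for (int point : bins.keySet()) {
            if (!first && (long) point - prev < bestGap) {
                bestGap = (long) point - prev;
                left = prev;
                right = point;
            }
            prev = point;
            first = false;
        }
        long m1 = bins.remove(left);
        long m2 = bins.remove(right);
        // Weighted mean of the two points; integer division truncates.
        // Overflow of the summed weight is not handled in this sketch.
        int merged = (int) ((left * m1 + right * m2) / (m1 + m2));
        bins.merge(merged, (int) (m1 + m2), Integer::sum);
    }

    /** Returns a copy of the current {point, weight} pairs. */
    Map<Integer, Integer> bins() {
        return new TreeMap<>(bins);
    }
}
```

With a limit of 2, adding (10, 1), (20, 1), and (12, 3) merges the two closest points, 10 and 12, into a single point 11 with weight 4, leaving {11=4, 20=1}.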
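Several optimizations make the histogram builder faster. In particular, new points are first accumulated in a temporary spool of up to maxSpoolSize entries and only later drained into the final bins.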
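The idea of buffering points in a spool before touching the bins can be illustrated with a small sketch. Everything here is an assumption made for the example: the class name `SpoolSketch`, the `HashMap`-based spool, and the rule that the spool is drained when a new distinct point arrives while it already holds maxSpoolSize points. Merging of bins is left out to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration class; not part of Cassandra. */
class SpoolSketch {
    private final int maxSpoolSize;
    // Temporary buffer: repeated points only bump a counter here.
    private final Map<Integer, Integer> spool = new HashMap<>();
    // Final bins, sorted by point (no merging in this sketch).
    private final TreeMap<Integer, Integer> bins = new TreeMap<>();
    private int flushCount = 0;

    SpoolSketch(int maxSpoolSize) {
        if (maxSpoolSize < 1) {
            throw new IllegalArgumentException("maxSpoolSize must be at least 1");
        }
        this.maxSpoolSize = maxSpoolSize;
    }

    /** Adds point p with weight m to the spool, draining it first if it is full. */
    void update(int p, int m) {
        if (!spool.containsKey(p) && spool.size() == maxSpoolSize) {
            flushHistogram();
        }
        spool.merge(p, m, Integer::sum);
    }

    /** Drains the spool into the bins. */
    void flushHistogram() {
        if (spool.isEmpty()) {
            return;
        }
        spool.forEach((point, weight) -> bins.merge(point, weight, Integer::sum));
        spool.clear();
        flushCount++;
    }

    Map<Integer, Integer> bins() {
        return new TreeMap<>(bins);
    }

    int spoolSize() {
        return spool.size();
    }

    int flushCount() {
        return flushCount;
    }
}
```

The benefit in this sketch is that repeated updates for a point already in the spool cost one hash-map operation and never touch the sorted bins until a drain happens.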
Modifier and Type | Class and Description |
---|---|
static class | StreamingTombstoneHistogramBuilder.AddResult |
Constructor and Description |
---|
StreamingTombstoneHistogramBuilder(int maxBinSize, int maxSpoolSize, int roundSeconds) |
Modifier and Type | Method and Description |
---|---|
TombstoneHistogram | build() Creates a 'finished' snapshot of the current state of the histogram, but leaves this builder instance open for subsequent additions to the histogram. |
void | flushHistogram() Drains the temporary spool into the final bins. |
void | update(int p) Adds a new point p to this histogram. |
void | update(int p, int m) Adds a new point p with value m to this histogram. |
public StreamingTombstoneHistogramBuilder(int maxBinSize, int maxSpoolSize, int roundSeconds)
public void update(int p)
p - the point to add
public void update(int p, int m)
p - the point to add
m - the value (weight) of the point
public void flushHistogram()
public TombstoneHistogram build()
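The snapshot behaviour of build(), where the builder stays open for further updates, can be illustrated with a self-contained sketch. The class `SnapshotBuilderSketch` is hypothetical and uses a plain sorted map in place of TombstoneHistogram. It also assumes that update(int p) adds the point with a weight of 1, and it leaves out the spool and the bin limit.

```java
import java.util.Collections;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical illustration class; not part of Cassandra. */
class SnapshotBuilderSketch {
    private final TreeMap<Integer, Integer> bins = new TreeMap<>();

    /** Adds point p; a weight of 1 is an assumption of this sketch. */
    void update(int p) {
        update(p, 1);
    }

    /** Adds point p with value m. */
    void update(int p, int m) {
        bins.merge(p, m, Integer::sum);
    }

    /**
     * Returns an unmodifiable copy of the current state. The builder keeps
     * its own map, so later updates do not change earlier snapshots.
     */
    SortedMap<Integer, Integer> build() {
        return Collections.unmodifiableSortedMap(new TreeMap<>(bins));
    }
}
```

In this sketch, a snapshot taken after `update(5)` still reports weight 1 for point 5 after the builder later receives `update(5, 2)`, while a second call to build() reports weight 3.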
Copyright © 2009-2019 The Apache Software Foundation