io.epiphanous.flinkrunner.algorithm.membership
a Guava funnel for taking input
number of cells (see the paper; m is a Long, but m/floor(63/d) must fit in a 32-bit Int)
bits per cell (see the paper, should lie in [1,63] but often set to 1, 2 or 3)
expected false positive rate (should lie in (0,1))
number of hash functions used
total memory required
cell value to set upon insertion
number of cells to decrement on each insertion
Insert a stream element into the filter.
the item to insert
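The insertion rule described in the cited paper can be sketched as follows. This is a minimal toy version, not this class's implementation: it stores one Int per cell instead of packing cells into Longs, uses the standard library's MurmurHash3 instead of a Guava funnel and hasher, and picks the cells to decrement independently at random (so a cell may be picked twice). The class name and the parameters m, d, k and p are hypothetical stand-ins for the number of cells, bits per cell, number of hash functions, and number of cells decremented per insertion.

```scala
import scala.util.Random
import scala.util.hashing.MurmurHash3

// Toy stable Bloom filter: one Int per cell, no bit packing.
// All names are hypothetical and only illustrate the update rule.
final class ToyStableBloomFilter(m: Int, d: Int, k: Int, p: Int, seed: Long = 42L) {
  require(m > 0 && d >= 1 && d <= 30 && k > 0 && p >= 0)

  private val max   = (1 << d) - 1      // cell value set upon insertion
  private val cells = new Array[Int](m) // all cells start at zero
  private val rng   = new Random(seed)

  // k cell indices for an item, from differently seeded MurmurHash3 hashes
  private def indices(item: String): Seq[Int] =
    (0 until k).map(i => java.lang.Math.floorMod(MurmurHash3.stringHash(item, i), m))

  def add(item: String): Unit = {
    // 1. decrement p randomly chosen cells, never below zero
    (0 until p).foreach { _ =>
      val j = rng.nextInt(m)
      if (cells(j) > 0) cells(j) -= 1
    }
    // 2. set the item's k cells to max
    indices(item).foreach(j => cells(j) = max)
  }

  // true only if every one of the item's k cells is non-zero
  def mightContain(item: String): Boolean =
    indices(item).forall(j => cells(j) > 0)
}
```

Immediately after `add(item)`, `mightContain(item)` returns true. As more elements arrive, the random decrements gradually erase older entries, which is what lets the filter run over an unbounded stream.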
bits per cell (see the paper, should lie in [1,63] but often set to 1, 2 or 3)
a Guava funnel for taking input
Gets the current value of the i'th cell.
the cell to get (in [0, m))
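Reading a cell from Long-backed storage might look like the sketch below. The layout is an assumption, not taken from the source: floor(63/d) cells per Long, packed from the lowest bits upward. The object and method names are hypothetical.

```scala
object PackedCells {
  // Assumed layout: floor(63 / d) cells per Long, lowest bits first.
  def cellsPerLong(d: Int): Int = {
    require(d >= 1 && d <= 63, "bits per cell must lie in [1, 63]")
    63 / d
  }

  // Read the i'th d-bit cell.
  def get(storage: Array[Long], d: Int, i: Long): Long = {
    val per    = cellsPerLong(d)
    val unit   = (i / per).toInt      // which Long holds the cell
    val offset = (i % per).toInt * d  // bit offset inside that Long
    val mask   = (1L << d) - 1L
    (storage(unit) >>> offset) & mask
  }

  // Overwrite the i'th d-bit cell, leaving its neighbours untouched.
  def set(storage: Array[Long], d: Int, i: Long, value: Long): Unit = {
    val per    = cellsPerLong(d)
    val unit   = (i / per).toInt
    val offset = (i % per).toInt * d
    val mask   = (1L << d) - 1L
    storage(unit) = (storage(unit) & ~(mask << offset)) | ((value & mask) << offset)
  }
}
```

With d = 3, each Long holds 21 cells, so cell 22 lives in the second Long at bit offset 3.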
murmur3 128 guava hashing function generator
number of cells (see the paper; m is a Long, but m/floor(63/d) must fit in a 32-bit Int)
Merge another filter into this filter.
the other filter
Return true if this SBF might contain the requested item.
the item to check
random number generator for decrementing cells
heap storage for our bits
number of bits used per unit storage
number of longs used for storage
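The relationship between the number of cells, bits per cell, and the number of Longs can be sketched as below. It assumes floor(63/d) cells per Long and rounds the number of Longs up; the rounding and the names are assumptions made for illustration, not confirmed by the source.

```scala
object StorageSizing {
  // Cells that fit in one Long, assuming 63 usable bits per Long.
  def cellsPerUnit(d: Int): Int = {
    require(d >= 1 && d <= 63, "bits per cell must lie in [1, 63]")
    63 / d
  }

  // Number of Longs needed for m cells of d bits each (rounded up).
  def storageSize(m: Long, d: Int): Int = {
    require(m > 0, "number of cells must be positive")
    val per   = cellsPerUnit(d)
    val units = (m + per - 1) / per // ceiling division
    require(units <= Int.MaxValue, "m / floor(63 / d) must fit in a 32-bit Int")
    units.toInt
  }
}
```

For example, one million 3-bit cells at 21 cells per Long need 47620 Longs.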
Implements the stable Bloom filter from the paper: F. Deng and D. Rafiei. Approximately detecting duplicates for streaming data using stable Bloom filters. In SIGMOD, pages 25–36, 2006.
We use heap storage (an array of Longs). This implies M = m*d can be set as high as about 125 giga-bits.
the type of funnel used
a Guava funnel for taking input
number of cells (see the paper; m is a Long, but m/floor(63/d) must fit in a 32-bit Int)
bits per cell (see the paper, should lie in [1,63] but often set to 1, 2 or 3)
expected false positive rate (should lie in (0,1))
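The "about 125 giga-bits" bound in the class description can be checked with a little arithmetic. The sketch assumes the limit comes from a Long array indexed by a 32-bit Int with 63 usable bits per Long; the names are hypothetical.

```scala
object CapacityBound {
  // Largest array length addressable with a 32-bit Int index.
  val maxUnits: Long = Int.MaxValue.toLong

  // Assumed usable bits per Long, matching the floor(63 / d) packing rule.
  val usableBitsPerUnit: Long = 63L

  // Upper bound on M = m * d, in bits.
  val maxBits: Long = maxUnits * usableBitsPerUnit

  // The same bound in units of 2^30 bits.
  val gibibits: Double = maxBits.toDouble / (1L << 30)
}
```

Under these assumptions the bound is 135,291,469,761 bits, just under 126 × 2^30 bits. When d does not divide 63, only floor(63/d)*d bits of each Long are used (for example, 62 bits for d = 2), so the reachable M is slightly lower.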