Package org.apache.cassandra.utils
Class BloomCalculations
java.lang.Object
    org.apache.cassandra.utils.BloomCalculations
public class BloomCalculations extends java.lang.Object
The following calculations are taken from "Bloom Filters - the math": http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html. This class's static methods are meant to facilitate the use of the Bloom Filter class by helping to choose correct values of 'bits per element' and 'number of hash functions, k'.
-
-
Nested Class Summary
static class BloomCalculations.BloomSpecification
A wrapper class that holds two key parameters for a Bloom Filter: the number of hash functions used, and the number of buckets per element used.
-
Constructor Summary
BloomCalculations()
-
Method Summary
static BloomCalculations.BloomSpecification computeBloomSpec(int bucketsPerElement)
Given the number of buckets that can be used per element, return a specification that minimizes the false positive rate.
static BloomCalculations.BloomSpecification computeBloomSpec(int maxBucketsPerElement, double maxFalsePosProb)
Given a maximum tolerable false positive probability, compute a Bloom specification which will give less than the specified false positive rate, but minimize the number of buckets per element and the number of hash functions used.
static int maxBucketsPerElement(long numElements)
Calculates the maximum number of buckets per element that this implementation can support.
static double minSupportedBloomFilterFpChance()
Retrieves the minimum supported BloomFilterFpChance value.
-
-
-
Method Detail
-
computeBloomSpec
public static BloomCalculations.BloomSpecification computeBloomSpec(int bucketsPerElement)
Given the number of buckets that can be used per element, return a specification that minimizes the false positive rate.
Parameters:
bucketsPerElement - The number of buckets per element for the filter.
Returns:
A spec that minimizes the false positive rate.
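To illustrate what "minimizes the false positive rate" means here, the following is a minimal standalone sketch of the math from the referenced page. It is not this class's implementation. It assumes the standard approximation p = (1 - e^(-k/c))^k for c buckets per element and k hash functions; the class name BloomSpecSketch, the method names, and the maxK search bound are hypothetical and exist only for this example.

```java
// Standalone sketch; does not call or reproduce BloomCalculations.
final class BloomSpecSketch {

    /** Approximate false positive rate for c buckets per element and k hash functions. */
    public static double falsePositiveRate(int bucketsPerElement, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k / bucketsPerElement), k);
    }

    /** Hypothetical helper: try every k in 1..maxK and keep the one with the lowest rate. */
    public static int bestK(int bucketsPerElement, int maxK) {
        int best = 1;
        for (int k = 2; k <= maxK; k++) {
            if (falsePositiveRate(bucketsPerElement, k) < falsePositiveRate(bucketsPerElement, best)) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int c = 10;              // buckets per element
        int k = bestK(c, 20);    // 20 is an arbitrary search bound chosen for this sketch
        System.out.printf("c=%d k=%d p=%.5f%n", c, k, falsePositiveRate(c, k));
    }
}
```

The exhaustive search keeps the sketch easy to check by hand; the best k it finds lies close to c * ln 2, the continuous optimum of the same formula.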
-
computeBloomSpec
public static BloomCalculations.BloomSpecification computeBloomSpec(int maxBucketsPerElement, double maxFalsePosProb)
Given a maximum tolerable false positive probability, compute a Bloom specification whose false positive rate is below that maximum while minimizing the number of buckets per element and the number of hash functions used. Because bandwidth (and therefore total bitvector size) is considered more expensive than computing power, preference is given to minimizing buckets per element rather than the number of hash functions.
Parameters:
maxBucketsPerElement - The maximum number of buckets per element available for the filter.
maxFalsePosProb - The maximum tolerable false positive rate.
Returns:
A Bloom specification whose false positive rate is lower than maxFalsePosProb.
Throws:
java.lang.UnsupportedOperationException - if no filter satisfying the parameters can be constructed
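The preference order described above (fewest buckets first, then fewest hash functions) can be sketched as a nested search. This is an illustrative standalone example under stated assumptions, not the method's actual implementation: it uses the approximation p = (1 - e^(-k/c))^k, and the class BloomSearchSketch, the method smallestSpec, and the MAX_K bound are hypothetical names introduced here.

```java
// Standalone sketch; does not call or reproduce BloomCalculations.
final class BloomSearchSketch {

    /** Assumed upper bound on the number of hash functions considered by this sketch. */
    static final int MAX_K = 20;

    /** Approximate false positive rate for c buckets per element and k hash functions. */
    public static double falsePositiveRate(int bucketsPerElement, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k / bucketsPerElement), k);
    }

    /**
     * Returns {bucketsPerElement, k}. The outer loop grows the bucket count, so the
     * first hit uses the fewest buckets; the inner loop grows k, so within that
     * bucket count the first hit uses the fewest hash functions.
     */
    public static int[] smallestSpec(int maxBucketsPerElement, double maxFalsePosProb) {
        for (int c = 1; c <= maxBucketsPerElement; c++) {
            for (int k = 1; k <= MAX_K; k++) {
                if (falsePositiveRate(c, k) < maxFalsePosProb) {
                    return new int[] {c, k};
                }
            }
        }
        throw new UnsupportedOperationException(
            "No spec with at most " + maxBucketsPerElement
            + " buckets per element has a false positive rate below " + maxFalsePosProb);
    }

    public static void main(String[] args) {
        int[] spec = smallestSpec(20, 0.01);
        System.out.println("bucketsPerElement=" + spec[0] + ", k=" + spec[1]);
    }
}
```

Putting the bucket count in the outer loop is what encodes the stated trade-off: a larger k is accepted only when it avoids adding another bucket per element.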
-
maxBucketsPerElement
public static int maxBucketsPerElement(long numElements)
Calculates the maximum number of buckets per element that this implementation can support. Crucially, it will lower the bucket count if necessary to meet BitSet's size restrictions.
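As a rough illustration of lowering the bucket count to fit a size limit, the sketch below caps buckets per element by the total number of bits a filter may hold. Everything in it is an assumption made for the example: the class MaxBucketsSketch, the maxBits and tableLimit parameters, and the choice to throw when not even one bucket per element fits. The real limits are internal to this implementation and are not stated on this page.

```java
// Standalone sketch; does not call or reproduce BloomCalculations.
final class MaxBucketsSketch {

    /**
     * @param numElements expected number of elements in the filter
     * @param maxBits     assumed total bit budget of the underlying bit set
     * @param tableLimit  assumed largest bucket count the implementation supports
     */
    public static int maxBucketsPerElement(long numElements, long maxBits, int tableLimit) {
        long n = Math.max(1, numElements);   // guard against division by zero
        long affordable = maxBits / n;       // buckets per element that fit in the budget
        if (affordable < 1) {
            // Assumption of this sketch: fail when even one bucket per element does not fit.
            throw new UnsupportedOperationException(
                "Cannot fit " + numElements + " elements in " + maxBits + " bits");
        }
        return (int) Math.min(tableLimit, affordable);
    }

    public static void main(String[] args) {
        // Small filter: the supported limit applies, not the bit budget.
        System.out.println(maxBucketsPerElement(1_000L, 1_000_000L, 20));
        // Larger filter: the bit budget forces a lower bucket count.
        System.out.println(maxBucketsPerElement(100_000L, 1_000_000L, 20));
    }
}
```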
-
minSupportedBloomFilterFpChance
public static double minSupportedBloomFilterFpChance()
Retrieves the minimum supported BloomFilterFpChance value.
Returns:
Minimum supported value for BloomFilterFpChance
-
-