Class BloomCalculations


  • public class BloomCalculations
    extends java.lang.Object
    The calculations in this class are taken from "Bloom Filters - the math": http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html. The static methods help users of the Bloom Filter class choose suitable values for 'bits per element' and for 'number of hash functions, k'.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  BloomCalculations.BloomSpecification
      A wrapper class that holds two key parameters for a Bloom Filter: the number of hash functions used, and the number of buckets per element used.
    • Method Summary

      Modifier and Type Method Description
      static BloomCalculations.BloomSpecification computeBloomSpec(int bucketsPerElement)
      Given the number of buckets that can be used per element, return a specification that minimizes the false positive rate.
      static BloomCalculations.BloomSpecification computeBloomSpec(int maxBucketsPerElement, double maxFalsePosProb)
      Given a maximum tolerable false positive probability, compute a Bloom specification whose false positive rate is below that maximum while minimizing the number of buckets per element and the number of hash functions used.
      static int maxBucketsPerElement(long numElements)
      Calculates the maximum number of buckets per element that this implementation can support.
      static double minSupportedBloomFilterFpChance()
      Retrieves the minimum supported BloomFilterFpChance value.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BloomCalculations

        public BloomCalculations()
    • Method Detail

      • computeBloomSpec

        public static BloomCalculations.BloomSpecification computeBloomSpec(int bucketsPerElement)
        Given the number of buckets that can be used per element, return a specification that minimizes the false positive rate.
        Parameters:
        bucketsPerElement - The number of buckets per element for the filter.
        Returns:
        A spec that minimizes the false positive rate.
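        The trade-off behind this method can be sketched with the approximation given on the cited "Bloom Filters - the math" page: with c buckets per element and k hash functions, the false positive rate is roughly (1 - e^(-k/c))^k, which is minimized near k = c * ln 2. The class and method names below (BloomSpecSketch, falsePositiveRate, bestK) are hypothetical, and the brute-force search is an assumption made for illustration; it is not necessarily how BloomCalculations computes its result.

```java
// Illustrative sketch only; not the BloomCalculations implementation.
final class BloomSpecSketch {

    // Approximate false positive rate for c buckets per element and
    // k hash functions: (1 - e^(-k/c))^k.
    static double falsePositiveRate(int bucketsPerElement, int k) {
        return Math.pow(1.0 - Math.exp(-k / (double) bucketsPerElement), k);
    }

    // Returns the k in [1, bucketsPerElement] with the lowest approximate rate.
    // The continuous optimum c * ln 2 is always below c, so this range suffices.
    static int bestK(int bucketsPerElement) {
        if (bucketsPerElement < 1) {
            throw new IllegalArgumentException("bucketsPerElement must be >= 1");
        }
        int best = 1;
        for (int k = 2; k <= bucketsPerElement; k++) {
            if (falsePositiveRate(bucketsPerElement, k)
                    < falsePositiveRate(bucketsPerElement, best)) {
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int c = 10;
        int k = bestK(c);
        System.out.printf("bucketsPerElement=%d k=%d rate=%.5f%n",
                c, k, falsePositiveRate(c, k));
    }
}
```

        Under this approximation, 10 buckets per element are best served by 7 hash functions, for a false positive rate of roughly 0.8%.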
      • computeBloomSpec

        public static BloomCalculations.BloomSpecification computeBloomSpec(int maxBucketsPerElement,
                                                                            double maxFalsePosProb)
        Given a maximum tolerable false positive probability, compute a Bloom specification whose false positive rate is below that maximum while minimizing the number of buckets per element and the number of hash functions used. Because bandwidth (and therefore total bitvector size) is considered more expensive than computing power, minimizing buckets per element takes priority over minimizing the number of hash functions.
        Parameters:
        maxBucketsPerElement - The maximum number of buckets available for the filter.
        maxFalsePosProb - The maximum tolerable false positive rate.
        Returns:
        A Bloom specification whose false positive rate is less than maxFalsePosProb.
        Throws:
        java.lang.UnsupportedOperationException - if no specification can satisfy the given parameters
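        The stated preference (fewest buckets per element first, then fewest hash functions) can be sketched as a nested search over the same approximation, (1 - e^(-k/c))^k. Everything named here (BloomSpecSearchSketch, Spec, smallestSpec) is hypothetical and stands in for the real types; the actual method may obtain its answer differently, for example from precomputed values.

```java
// Illustrative sketch only; not the BloomCalculations implementation.
final class BloomSpecSearchSketch {

    // Hypothetical stand-in for a specification: hash count and buckets per element.
    static final class Spec {
        final int k;
        final int bucketsPerElement;

        Spec(int k, int bucketsPerElement) {
            this.k = k;
            this.bucketsPerElement = bucketsPerElement;
        }
    }

    // Approximate false positive rate: (1 - e^(-k/c))^k.
    static double falsePositiveRate(int bucketsPerElement, int k) {
        return Math.pow(1.0 - Math.exp(-k / (double) bucketsPerElement), k);
    }

    // Tries bucket counts in increasing order, and for each one tries hash
    // counts in increasing order, so the first hit uses the fewest buckets
    // and, for that bucket count, the fewest hash functions.
    static Spec smallestSpec(int maxBucketsPerElement, double maxFalsePosProb) {
        for (int c = 1; c <= maxBucketsPerElement; c++) {
            for (int k = 1; k <= c; k++) {
                if (falsePositiveRate(c, k) < maxFalsePosProb) {
                    return new Spec(k, c);
                }
            }
        }
        throw new UnsupportedOperationException(
                "No specification within " + maxBucketsPerElement
                + " buckets per element reaches a rate below " + maxFalsePosProb);
    }

    public static void main(String[] args) {
        Spec spec = smallestSpec(20, 0.01);
        System.out.println("bucketsPerElement=" + spec.bucketsPerElement + " k=" + spec.k);
    }
}
```

        With at most 20 buckets per element and a 1% target, this search settles on 10 buckets and 5 hash functions: 9 buckets cannot get below 1% for any k, and with 10 buckets, 4 hash functions still give about 1.2%.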
      • maxBucketsPerElement

        public static int maxBucketsPerElement(long numElements)
        Calculates the maximum number of buckets per element that this implementation can support. If necessary, the bucket count is lowered so that the filter stays within BitSet's size restrictions.
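        The idea of lowering the bucket count to respect a size limit can be sketched as a simple cap. This is a guess at the shape of the calculation, not the actual one: the class name, the method name, and both extra parameters (preferredBuckets, maxTotalBits) are hypothetical, and the real limits are internal to the implementation. Throwing when not even one bucket per element fits is likewise an assumption of this sketch.

```java
// Illustrative sketch only; the limits used by BloomCalculations are not shown here.
final class BucketCapSketch {

    // Caps a preferred bucket count so that numElements * result never
    // exceeds maxTotalBits (a hypothetical bit set capacity).
    static int cappedBucketsPerElement(long numElements, int preferredBuckets, long maxTotalBits) {
        long n = Math.max(1L, numElements); // guard against zero or negative counts
        long fit = maxTotalBits / n;        // buckets per element that fit in the bit set
        if (fit < 1) {
            // Assumption of this sketch: refuse rather than return zero buckets.
            throw new UnsupportedOperationException(
                    "Cannot fit one bucket per element for " + numElements + " elements");
        }
        return (int) Math.min((long) preferredBuckets, fit);
    }

    public static void main(String[] args) {
        System.out.println(cappedBucketsPerElement(1_000_000L, 20, 8_000_000L));
    }
}
```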
      • minSupportedBloomFilterFpChance

        public static double minSupportedBloomFilterFpChance()
        Retrieves the minimum supported BloomFilterFpChance value.
        Returns:
        Minimum supported value for BloomFilterFpChance