Computes the hashcode of a single string that can later be passed to mightContain(Long).
Returns true if the bloom filter might contain the string with the given hashcode.
This method can help improve performance when calling mightContain
with the same query string against multiple different bloom filters.
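
The pattern described above can be sketched as follows. This is a minimal illustration, not the actual implementation: `ToyLongBloomFilter` is a hypothetical stand-in for a real `BloomFilter[Long]`, and the use of `java.util.zip.CRC32` inside `computeHashCode` is an assumption made only so the example is self-contained.

```scala
import java.util.BitSet
import java.util.zip.CRC32

// Hypothetical stand-in for a BloomFilter[Long]: two probes into a fixed-size bit set.
final class ToyLongBloomFilter(bits: Int) {
  private val bitSet = new BitSet(bits)
  private def probes(hash: Long): Seq[Int] = {
    val h1 = hash.toInt
    val h2 = ((hash * 0x5DEECE66DL) >>> 16).toInt
    Seq(h1, h2).map(h => Math.floorMod(h, bits))
  }
  def put(hash: Long): Unit = probes(hash).foreach(i => bitSet.set(i))
  def mightContain(hash: Long): Boolean = probes(hash).forall(i => bitSet.get(i))
}

// Assumed hash function (CRC32 over both bytes of each char); the real one may differ.
def computeHashCode(query: CharSequence): Long = {
  val crc = new CRC32()
  query.toString.foreach { c =>
    crc.update(c >> 8)
    crc.update(c & 0xff)
  }
  crc.getValue
}

val filters = Seq.fill(3)(new ToyLongBloomFilter(1 << 12))
filters(1).put(computeHashCode("Stream"))

// Hash the query once, then reuse the Long for every filter.
val queryHash = computeHashCode("Stream")
val hits = filters.map(_.mightContain(queryHash))
```

The query string is traversed once regardless of how many filters are consulted; each `mightContain(Long)` call only derives probe positions from the precomputed value.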
Returns true if the bloom filter might contain the given string.
Inserts a new string into the bloom filter, formed by appending the given character to the string represented by the current hash value.
Use this method when inserting multiple strings that share the same prefix, for example to insert all prefixes of "Simple" you can do:
  putCharIncrementally('S') // insert S
  putCharIncrementally('i') // insert Si
  putCharIncrementally('m') // insert Sim
  putCharIncrementally('p') // insert Simp
  putCharIncrementally('l') // insert Simpl
  putCharIncrementally('e') // insert Simple
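
To make the incremental behaviour concrete, here is a small sketch under stated assumptions: `PrefixInserter` is a hypothetical class, a `mutable.Set[Long]` stands in for the bloom filter, and `java.util.zip.CRC32` is assumed as the streaming hash (the real hash function may differ). It shows that the hash recorded after each character equals the hash of the corresponding prefix computed from scratch.

```scala
import java.util.zip.CRC32
import scala.collection.mutable

// Feeds both bytes of a char into the running checksum.
def feed(crc: CRC32, c: Char): Unit = {
  crc.update(c >> 8)
  crc.update(c & 0xff)
}

// Hashes a whole string from scratch, for comparison only.
def hashOf(s: String): Long = {
  val crc = new CRC32()
  s.foreach(c => feed(crc, c))
  crc.getValue
}

// Hypothetical sketch: the set is a stand-in for a BloomFilter[Long].
final class PrefixInserter {
  private val crc = new CRC32()
  val inserted = mutable.Set.empty[Long]
  def putCharIncrementally(c: Char): Unit = {
    feed(crc, c)             // extend the current prefix by one character
    inserted += crc.getValue // record the hash of the extended prefix
  }
  def reset(): Unit = crc.reset()
  def value: Long = crc.getValue
}

val prefixes = new PrefixInserter
"Simple".foreach(c => prefixes.putCharIncrementally(c))
```

Calling `reset()` before the next word starts a fresh prefix chain, so unrelated strings do not share hash state.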
Inserts a single string into the bloom filter.
Resets the hash value.
Returns the current hash value.
A wrapper around a bloom filter that is optimized for fast insertions of strings with shared prefixes.
To index a classpath, Metals walks through all possible prefixes of a given classfile path. For example, given the string "InputStream.class", Metals builds a bloom filter containing the following strings: I, In, Inp, Inpu, Input, S, St, Str, Stre, Strea and Stream.
The naive approach to constructing a bloom filter of all those prefix strings is to create a BloomFilter[CharSequence] and insert every prefix. This approach has sub-optimal performance because it requires a quadratic number of iterations over the characters of the classfile path.
This class implements an optimized approach that builds the bloom filter in a single linear pass over the characters of the classfile path. The trick is to construct a BloomFilter[Long] instead of a BloomFilter[CharSequence] and incrementally build the hashcode of each prefix string as we iterate over each character in the string.
Additionally, this class exposes a mightContain(Long) method that speeds up search queries by allowing the client to pre-compute the hash of the query string once and re-use it across mightContain calls to multiple bloom filters. Every fuzzy symbol search (which happens on every scope completion request) usually performs several thousand bloom filter mightContain calls, so this should also help avoid a non-trivial amount of unnecessary hashing.
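
The quadratic-versus-linear claim can be checked with a short counting sketch. The function names below are hypothetical and only count how many characters each strategy would feed to a hash function; no real bloom filter or hash is involved.

```scala
// Naive strategy: every prefix is hashed from scratch, so a prefix of
// length k costs k character visits.
def naiveCharVisits(name: String): Int = {
  var visits = 0
  for (end <- 1 to name.length; _ <- name.substring(0, end)) visits += 1
  visits
}

// Incremental strategy: each character extends the running hash exactly once.
def incrementalCharVisits(name: String): Int = {
  var visits = 0
  for (_ <- name) visits += 1
  visits
}

val naive = naiveCharVisits("Stream")             // 1 + 2 + 3 + 4 + 5 + 6 = 21
val incremental = incrementalCharVisits("Stream") // 6
```

For a word of length n the naive count is n(n + 1) / 2 while the incremental count is n, which is where the saving on long classfile paths comes from.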