Class EWAHCompressedBitmap
- java.lang.Object
-
- com.googlecode.javaewah.EWAHCompressedBitmap
-
- All Implemented Interfaces:
BitmapStorage, LogicalElement<EWAHCompressedBitmap>, Externalizable, Serializable, Cloneable, Iterable<Integer>
public final class EWAHCompressedBitmap extends Object implements Cloneable, Externalizable, Iterable<Integer>, BitmapStorage, LogicalElement<EWAHCompressedBitmap>
This implements the patent-free(1) EWAH scheme. Roughly speaking, it is a 64-bit variant of the BBC compression scheme used by Oracle for its bitmap indexes.
The objective of this compression type is to provide some compression while using as few CPU cycles as possible.
Once constructed, the bitmap is essentially immutable unless you call a mutating method such as "set" or "add". A bitmap that is no longer modified can thus be safely used in multi-threaded programs.
This implementation being 64-bit, it assumes a 64-bit CPU together with a 64-bit Java Virtual Machine. This same code on a 32-bit machine may not be as fast. There is also a 32-bit version of this code in the class javaewah32.EWAHCompressedBitmap32.
Here is a code sample to illustrate usage:
EWAHCompressedBitmap ewahBitmap1 = EWAHCompressedBitmap.bitmapOf(0, 2, 55, 64, 1 << 30);
EWAHCompressedBitmap ewahBitmap2 = EWAHCompressedBitmap.bitmapOf(1, 3, 64, 1 << 30);
EWAHCompressedBitmap ewahBitmap3 = EWAHCompressedBitmap.bitmapOf(5, 55, 1 << 30);
EWAHCompressedBitmap ewahBitmap4 = EWAHCompressedBitmap.bitmapOf(4, 66, 1 << 30);
EWAHCompressedBitmap orbitmap = ewahBitmap1.or(ewahBitmap2);
EWAHCompressedBitmap andbitmap = ewahBitmap1.and(ewahBitmap2);
EWAHCompressedBitmap xorbitmap = ewahBitmap1.xor(ewahBitmap2);
andbitmap = EWAHCompressedBitmap.and(ewahBitmap1, ewahBitmap2, ewahBitmap3, ewahBitmap4);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ObjectOutputStream oo = new ObjectOutputStream(bos);
ewahBitmap1.writeExternal(oo);
oo.close();
ewahBitmap1 = null;
ewahBitmap1 = new EWAHCompressedBitmap();
ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
ewahBitmap1.readExternal(new ObjectInputStream(bis));
EWAHCompressedBitmap threshold2 = EWAHCompressedBitmap.threshold(2, ewahBitmap1, ewahBitmap2, ewahBitmap3, ewahBitmap4);
For more details, see the following papers:
- Daniel Lemire, Owen Kaser, Kamel Aouiche, Sorting improves word-aligned bitmap indexes. Data & Knowledge Engineering 69 (1), pages 3-28, 2010. http://arxiv.org/abs/0901.3751
- Owen Kaser and Daniel Lemire, Compressed bitmap indexes: beyond unions and intersections http://arxiv.org/abs/1402.4466
A 32-bit version of the compressed format was described by Wu et al. and named WBC:
- K. Wu, E. J. Otoo, A. Shoshani, H. Nordberg, Notes on design and implementation of compressed bit vectors, Tech. Rep. LBNL/PUB-3161, Lawrence Berkeley National Laboratory, available from http://crd.lbl.gov/~kewu/ps/PUB-3161.html (2001).
Probably, the best prior art is the Oracle bitmap compression scheme (BBC):
- G. Antoshenkov, Byte-Aligned Bitmap Compression, DCC'95, 1995.
1- The authors do not know of any patent infringed by the following implementation. However, similar schemes, like WAH, are covered by patents.
- Since:
- 0.1.0
- See Also:
EWAHCompressedBitmap32, Serialized Form
-
-
Field Summary
Fields (modifier and type, field, description)
static boolean ADJUST_CONTAINER_SIZE_WHEN_AGGREGATING
Whether we adjust after some aggregation by adding in zeroes.
static int WORD_IN_BITS
The constant WORD_IN_BITS represents the number of bits in a long.
-
Constructor Summary
Constructors (constructor, description)
EWAHCompressedBitmap()
Creates an empty bitmap (no bit set to true).
EWAHCompressedBitmap(int bufferSize)
Sets explicitly the buffer size (in 64-bit words).
EWAHCompressedBitmap(ByteBuffer buffer)
Creates a bitmap with the specified ByteBuffer backend.
EWAHCompressedBitmap(LongBuffer buffer)
Creates a bitmap with the specified java.nio.LongBuffer backend.
-
Method Summary
Methods (modifier and type, method, description)
void add(long newData)
Deprecated. Use addWord() instead.
void add(long newData, int bitsThatMatter)
Deprecated. Use addWord() instead.
void addLiteralWord(long newData)
Adding literal word directly to the bitmap (for expert use).
void addStreamOfEmptyWords(boolean v, long number)
For experts: You want to add many zeroes or ones? This is the method you use.
void addStreamOfLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
If you have several literal words to copy over, this might be faster.
void addStreamOfNegatedLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
Same as addStreamOfLiteralWords, but the words are negated.
void addWord(long newData)
Adding words directly to the bitmap (for expert use).
void addWord(long newData, int bitsThatMatter)
Adding words directly to the bitmap (for expert use).
EWAHCompressedBitmap and(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise AND values of the current bitmap with some other bitmap.
static EWAHCompressedBitmap and(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise AND values of the provided bitmaps.
int andCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise AND of the values of the current bitmap with some other bitmap.
static int andCardinality(EWAHCompressedBitmap... bitmaps)
Returns the cardinality of the result of a bitwise AND of the values of the provided bitmaps.
EWAHCompressedBitmap andNot(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise AND NOT values of the current bitmap with some other bitmap.
int andNotCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise AND NOT of the values of the current bitmap with some other bitmap.
void andNotToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise AND NOT values of the current bitmap with some other bitmap and stores the result in the container.
void andToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise AND values of the current bitmap with some other bitmap and stores the result in the container.
static void andWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
For internal use.
static EWAHCompressedBitmap bitmapOf(int... setBits)
Returns a bitmap with the bits set to true at the given positions.
int cardinality()
Reports the number of bits set to true.
ChunkIterator chunkIterator()
Iterator over the chunks of bits.
void clear()
Clears any set bits and sets the size in bits back to 0.
boolean clear(int i)
Sets the bit at position i to false.
IntIterator clearIntIterator()
Iterator over the clear bits.
EWAHCompressedBitmap clone()
EWAHCompressedBitmap compose(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the composition of the current bitmap with some other bitmap.
void composeToContainer(EWAHCompressedBitmap a, EWAHCompressedBitmap container)
Computes the composition of the current bitmap with some other bitmap and stores the result in the container.
void deserialize(DataInput in)
Deserialize.
boolean equals(Object o)
Checks whether the two compressed bitmaps contain the same set bits.
boolean get(int i)
Queries the value of a single bit.
EWAHIterator getEWAHIterator()
Gets an EWAHIterator over the data.
int getFirstSetBit()
A light-weight method that returns the location of the first set bit (=1), or -1 if there is none.
IteratingRLW getIteratingRLW()
Gets an IteratingRLW to iterate over the data.
List<Integer> getPositions()
Deprecated. Use toList() instead.
int hashCode()
Returns a customized hash code (based on Karp-Rabin).
boolean intersects(EWAHCompressedBitmap a)
Returns true if the two EWAHCompressedBitmap instances both have at least one true bit in the same position.
IntIterator intIterator()
Iterator over the set bits (this is what most people will want to use to browse the content if they want an iterator).
boolean isEmpty()
Checks whether this bitmap is empty (has a cardinality of zero).
Iterator<Integer> iterator()
Iterates over the positions of the true values.
void not()
Negates (bitwise) the current bitmap.
EWAHCompressedBitmap or(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise OR values of the current bitmap with some other bitmap.
static EWAHCompressedBitmap or(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise OR values of the provided bitmaps.
int orCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise OR of the values of the current bitmap with some other bitmap.
static int orCardinality(EWAHCompressedBitmap... bitmaps)
Returns the cardinality of the result of a bitwise OR of the values of the provided bitmaps.
void orToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise OR between the current bitmap and the bitmap "a".
static void orWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Uses an adaptive technique to compute the logical OR.
void readExternal(ObjectInput in)
IntIterator reverseIntIterator()
Iterator over the set bits in reverse order.
void serialize(DataOutput out)
Serialize.
int serializedSizeInBytes()
Reports the number of bytes required to serialize this bitmap. The current bitmap is not modified.
boolean set(int i)
Sets the bit at position i to true.
boolean setSizeInBits(int size, boolean defaultValue)
Changes the reported size in bits of the *uncompressed* bitmap represented by this compressed bitmap.
void setSizeInBitsWithinLastWord(int size)
Sets the size in bits of the bitmap as an *uncompressed* bitmap.
EWAHCompressedBitmap shift(int b)
Generates a new bitmap shifted by "b" bits.
int sizeInBits()
Returns the size in bits of the *uncompressed* bitmap represented by this compressed bitmap.
int sizeInBytes()
Reports the *compressed* size of the bitmap (equivalent to memory usage, after accounting for some overhead).
void swap(EWAHCompressedBitmap other)
Swaps the content of the bitmap with another.
static EWAHCompressedBitmap threshold(int t, EWAHCompressedBitmap... bitmaps)
Computes a Boolean threshold function: bits are true where at least t bitmaps have a true bit.
static void thresholdWithContainer(BitmapStorage container, int t, EWAHCompressedBitmap... bitmaps)
Computes a Boolean threshold function: bits are true where at least t bitmaps have a true bit.
int[] toArray()
Populates an array of sorted integers corresponding to the locations of the set bits.
String toDebugString()
A more detailed string describing the bitmap (useful for debugging).
List<Integer> toList()
Gets the locations of the true values as one list.
String toString()
A string describing the bitmap.
void trim()
Reduces the internal buffer to its minimal allowable size.
void writeExternal(ObjectOutput out)
EWAHCompressedBitmap xor(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise XOR values of the current bitmap with some other bitmap.
static EWAHCompressedBitmap xor(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise XOR values of the provided bitmaps.
int xorCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise XOR of the values of the current bitmap with some other bitmap.
void xorToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise XOR values of the current bitmap with some other bitmap and stores the result in the container.
static void xorWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Uses an adaptive technique to compute the logical XOR.
-
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
-
-
-
Field Detail
-
ADJUST_CONTAINER_SIZE_WHEN_AGGREGATING
public static final boolean ADJUST_CONTAINER_SIZE_WHEN_AGGREGATING
Whether we adjust after some aggregation by adding in zeroes.
- See Also:
- Constant Field Values
-
WORD_IN_BITS
public static final int WORD_IN_BITS
The constant WORD_IN_BITS represents the number of bits in a long.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
EWAHCompressedBitmap
public EWAHCompressedBitmap()
Creates an empty bitmap (no bit set to true).
-
EWAHCompressedBitmap
public EWAHCompressedBitmap(int bufferSize)
Sets explicitly the buffer size (in 64-bit words). The initial memory usage will be "bufferSize * 64" bits. For large, poorly compressible bitmaps, using large values may improve performance. If the requested bufferSize is less than 1, a value of 1 is used by default. In particular, negative values of bufferSize are effectively ignored.
- Parameters:
bufferSize
- number of 64-bit words reserved when the object is created
-
EWAHCompressedBitmap
public EWAHCompressedBitmap(ByteBuffer buffer)
Creates a bitmap with the specified ByteBuffer backend. It assumes that a bitmap was serialized at this location. It is effectively "deserialized", though the actual content is not copied. This might be useful for implementing memory-mapped bitmaps.
- Parameters:
buffer
- data source
-
EWAHCompressedBitmap
public EWAHCompressedBitmap(LongBuffer buffer)
Creates a bitmap with the specified java.nio.LongBuffer backend. The content of the LongBuffer is discarded.
- Parameters:
buffer
- data source
-
-
Method Detail
-
add
@Deprecated public void add(long newData)
Deprecated. Use addWord() instead.
- Parameters:
newData
- the word
-
add
@Deprecated public void add(long newData, int bitsThatMatter)
Deprecated. Use addWord() instead.
- Parameters:
newData
- the word
bitsThatMatter
- the number of significant bits (by default it should be 64)
-
addWord
public void addWord(long newData)
Adding words directly to the bitmap (for expert use). This method adds bits in words of 64 bits (8*8 bits). It is not to be confused with the set method, which sets individual bits. Most users will want the set method. Example: if you add word 321 to an empty bitmap, you have added (in binary notation) 0b101000001, so you have effectively called set(0), set(6), set(8) in sequence. Since this modifies the bitmap, this method is not thread-safe. API change: prior to version 0.8.3, this method was called add.
- Specified by:
addWord
in interface BitmapStorage
- Parameters:
newData
- the word
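To see why the word 321 corresponds to set(0), set(6), set(8), the following minimal sketch lists the positions of the bits set in a 64-bit word using only the standard library. The class WordBits and its method setBitPositions are illustrative names and are not part of JavaEWAH.

```java
import java.util.ArrayList;
import java.util.List;

final class WordBits {
    // Returns the positions (0 = least significant bit) of the bits set in a 64-bit word.
    static List<Integer> setBitPositions(long word) {
        List<Integer> positions = new ArrayList<>();
        while (word != 0) {
            positions.add(Long.numberOfTrailingZeros(word));
            word &= word - 1; // clear the lowest set bit
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(Long.toBinaryString(321L)); // 101000001
        System.out.println(setBitPositions(321L));     // [0, 6, 8]
    }
}
```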
-
addWord
public void addWord(long newData, int bitsThatMatter)
Adding words directly to the bitmap (for expert use). Since this modifies the bitmap, this method is not thread-safe. API change: prior to version 0.8.3, this method was called add.
- Parameters:
newData
- the word
bitsThatMatter
- the number of significant bits (by default it should be 64)
-
addLiteralWord
public void addLiteralWord(long newData)
Adding literal word directly to the bitmap (for expert use). Since this modifies the bitmap, this method is not thread-safe.
- Specified by:
addLiteralWord
in interface BitmapStorage
- Parameters:
newData
- the word
-
addStreamOfLiteralWords
public void addStreamOfLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
If you have several literal words to copy over, this might be faster. Since this modifies the bitmap, this method is not thread-safe.
- Specified by:
addStreamOfLiteralWords
in interface BitmapStorage
- Parameters:
buffer
- the buffer wrapping the literal words
start
- the starting point in the array
number
- the number of literal words to add
-
addStreamOfEmptyWords
public void addStreamOfEmptyWords(boolean v, long number)
For experts: You want to add many zeroes or ones? This is the method you use. Since this modifies the bitmap, this method is not thread-safe.
- Specified by:
addStreamOfEmptyWords
in interface BitmapStorage
- Parameters:
v
- the boolean value
number
- the number
-
addStreamOfNegatedLiteralWords
public void addStreamOfNegatedLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
Same as addStreamOfLiteralWords, but the words are negated. Since this modifies the bitmap, this method is not thread-safe.
- Specified by:
addStreamOfNegatedLiteralWords
in interface BitmapStorage
- Parameters:
buffer
- the buffer wrapping the literal words
start
- the starting point in the array
number
- the number of literal words to add
-
and
public EWAHCompressedBitmap and(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise AND values of the current bitmap with some other bitmap. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage. The current bitmap is not modified.
- Specified by:
and
in interface LogicalElement<EWAHCompressedBitmap>
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the EWAH compressed bitmap
- Since:
- 0.4.3
-
andToContainer
public void andToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise AND values of the current bitmap with some other bitmap and stores the result in the container. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). The current bitmap is not modified. The content of the container is overwritten.
- Parameters:
a
- the other bitmap (it will not be modified)
container
- where we store the result
- Since:
- 0.4.0
-
andCardinality
public int andCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise AND of the values of the current bitmap with some other bitmap. Avoids allocating an intermediate bitmap to hold the result of the AND. The current bitmap is not modified.
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the cardinality
- Since:
- 0.4.0
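The idea of counting without materializing the result can be illustrated on plain, uncompressed 64-bit words. This is only a sketch under that simplifying assumption: the library operates on compressed run-length words, and the class AndCardinalitySketch below is a hypothetical helper, not JavaEWAH code.

```java
final class AndCardinalitySketch {
    // Counts the bits set in (a AND b) word by word, without building the result bitmap.
    static int andCardinality(long[] a, long[] b) {
        // Words beyond the end of the shorter array are treated as zero,
        // so they cannot contribute to the AND.
        int n = Math.min(a.length, b.length);
        int count = 0;
        for (int i = 0; i < n; i++) {
            count += Long.bitCount(a[i] & b[i]);
        }
        return count;
    }

    public static void main(String[] args) {
        long[] a = {0b1011L, -1L};
        long[] b = {0b0110L, 0b1L, 7L};
        System.out.println(andCardinality(a, b)); // 2
    }
}
```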
-
andNot
public EWAHCompressedBitmap andNot(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise AND NOT values of the current bitmap with some other bitmap. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage. The current bitmap is not modified.
- Specified by:
andNot
in interface LogicalElement<EWAHCompressedBitmap>
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the EWAH compressed bitmap
-
andNotToContainer
public void andNotToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise AND NOT values of the current bitmap with some other bitmap and stores the result in the container. This method is expected to be faster than negating a clone of B and then calling A.and() on the negated clone. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). The current bitmap is not modified. The content of the container is overwritten.
- Parameters:
a
- the other bitmap (it will not be modified)
container
- where to store the result
- Since:
- 0.4.0
-
andNotCardinality
public int andNotCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise AND NOT of the values of the current bitmap with some other bitmap. Avoids allocating an intermediate bitmap to hold the result of the AND NOT. The current bitmap is not modified.
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the cardinality
- Since:
- 0.4.0
-
cardinality
public int cardinality()
Reports the number of bits set to true. Running time is proportional to compressed size (as reported by sizeInBytes).
- Returns:
- the number of bits set to true
-
clear
public void clear()
Clears any set bits and sets the size in bits back to 0.
- Specified by:
clear
in interface BitmapStorage
-
clone
public EWAHCompressedBitmap clone() throws CloneNotSupportedException
- Overrides:
clone
in class Object
- Throws:
CloneNotSupportedException
-
serialize
public void serialize(DataOutput out) throws IOException
Serialize. The current bitmap is not modified.
- Parameters:
out
- the DataOutput stream
- Throws:
IOException
- Signals that an I/O exception has occurred.
-
deserialize
public void deserialize(DataInput in) throws IOException
Deserialize.
- Parameters:
in
- the DataInput stream
- Throws:
IOException
- Signals that an I/O exception has occurred.
-
equals
public boolean equals(Object o)
Checks whether the two compressed bitmaps contain the same set bits.
- Overrides:
equals
in class Object
- See Also:
Object.equals(java.lang.Object)
-
getEWAHIterator
public EWAHIterator getEWAHIterator()
Gets an EWAHIterator over the data. This is a customized iterator which iterates over run length words. For experts only. The current bitmap is not modified.
- Returns:
- the EWAHIterator
-
getIteratingRLW
public IteratingRLW getIteratingRLW()
Gets an IteratingRLW to iterate over the data. For experts only. Note that the iterator does not know about the size in bits of the bitmap: the size in bits is effectively rounded up to the nearest multiple of 64. However, if you materialize a bitmap from an iterator, you can set the desired size in bits using the setSizeInBitsWithinLastWord method:
EWAHCompressedBitmap n = IteratorUtil.materialize(bitmap.getIteratingRLW());
n.setSizeInBitsWithinLastWord(bitmap.sizeInBits());
The current bitmap is not modified.
- Returns:
- the IteratingRLW iterator corresponding to this bitmap
-
getPositions
@Deprecated public List<Integer> getPositions()
Deprecated. Use toList() instead.
- Returns:
- a list
-
toList
public List<Integer> toList()
Gets the locations of the true values as one list. (May use more memory than iterator().) The current bitmap is not modified. API change: prior to version 0.8.3, this method was called getPositions.
- Returns:
- the positions in a list
-
hashCode
public int hashCode()
Returns a customized hash code (based on Karp-Rabin). Naturally, if the bitmaps are equal, they will hash to the same value. The current bitmap is not modified.
-
intersects
public boolean intersects(EWAHCompressedBitmap a)
Returns true if the two EWAHCompressedBitmap instances both have at least one true bit in the same position. Equivalently, you could call "and" and check whether there is a set bit, but intersects will run faster if you don't need the result of the "and" operation. The current bitmap is not modified.
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- whether they intersect
- Since:
- 0.3.2
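One reason an intersection test can be cheaper than a full "and" is that it may stop at the first shared bit and never allocates a result. The sketch below shows this early exit on plain, uncompressed 64-bit words; it is an illustration under that assumption, not the library's implementation, and IntersectsSketch is a hypothetical name.

```java
final class IntersectsSketch {
    // Returns true as soon as one word pair shares a set bit; no result bitmap is built.
    static boolean intersects(long[] a, long[] b) {
        int n = Math.min(a.length, b.length); // missing words count as zero
        for (int i = 0; i < n; i++) {
            if ((a[i] & b[i]) != 0) {
                return true; // early exit at the first common bit
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(intersects(new long[]{1L, 4L}, new long[]{2L, 4L})); // true
        System.out.println(intersects(new long[]{1L}, new long[]{2L}));         // false
    }
}
```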
-
intIterator
public IntIterator intIterator()
Iterator over the set bits (this is what most people will want to use to browse the content if they want an iterator). The location of the set bits is returned, in increasing order. The current bitmap is not modified.
- Returns:
- the int iterator
-
reverseIntIterator
public IntIterator reverseIntIterator()
Iterator over the set bits in reverse order. The current bitmap is not modified.
- Returns:
- the int iterator
-
isEmpty
public boolean isEmpty()
Checks whether this bitmap is empty (has a cardinality of zero).
- Returns:
- true if no bit is set
-
clearIntIterator
public IntIterator clearIntIterator()
Iterator over the clear bits. The location of the clear bits is returned, in increasing order. The current bitmap is not modified.
- Returns:
- the int iterator
-
chunkIterator
public ChunkIterator chunkIterator()
Iterator over the chunks of bits. The current bitmap is not modified.
- Returns:
- the chunk iterator
-
iterator
public Iterator<Integer> iterator()
Iterates over the positions of the true values. This is similar to intIterator(), but it uses Java generics. The current bitmap is not modified.
-
not
public void not()
Negate (bitwise) the current bitmap. To get a negated copy, do:
EWAHCompressedBitmap x = ((EWAHCompressedBitmap) mybitmap.clone());
x.not();
The running time is proportional to the compressed size (as reported by sizeInBytes()). Because this modifies the bitmap, this method is not thread-safe.
- Specified by:
not
in interface LogicalElement<EWAHCompressedBitmap>
-
or
public EWAHCompressedBitmap or(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise OR values of the current bitmap with some other bitmap. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage. The current bitmap is not modified.
- Specified by:
or
in interface LogicalElement<EWAHCompressedBitmap>
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the EWAH compressed bitmap
-
orToContainer
public void orToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise OR between the current bitmap and the bitmap "a". Stores the result in the container. The current bitmap is not modified. The content of the container is overwritten.
- Parameters:
a
- the other bitmap (it will not be modified)
container
- where we store the result
- Since:
- 0.4.0
-
orCardinality
public int orCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise OR of the values of the current bitmap with some other bitmap. Avoids allocating an intermediate bitmap to hold the result of the OR. The current bitmap is not modified.
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the cardinality
- Since:
- 0.4.0
-
readExternal
public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
- Specified by:
readExternal
in interface Externalizable
- Throws:
IOException
ClassNotFoundException
-
writeExternal
public void writeExternal(ObjectOutput out) throws IOException
- Specified by:
writeExternal
in interface Externalizable
- Throws:
IOException
-
serializedSizeInBytes
public int serializedSizeInBytes()
Reports the number of bytes required to serialize this bitmap. The current bitmap is not modified.
- Returns:
- the size in bytes
-
get
public boolean get(int i)
Query the value of a single bit. Relying on this method when speed is needed is discouraged. The complexity is linear with the size of the bitmap. (This implementation is based on zhenjl's Go version of JavaEWAH.) The current bitmap is not modified.
- Parameters:
i
- the bit we are interested in
- Returns:
- whether the bit is set to true
-
getFirstSetBit
public int getFirstSetBit()
getFirstSetBit is a light-weight method that returns the location of the first set bit (=1), or -1 if there is none.
- Returns:
- location of the first set bit or -1
-
clear
public boolean clear(int i)
Set the bit at position i to false. Though you can clear the bits in any order (e.g., clear(100), clear(10), clear(1)), you will typically get better performance if you clear the bits in increasing order (e.g., clear(1), clear(10), clear(100)). Clearing a bit at a position larger than that of the biggest bit is a constant time operation. Clearing a bit at a position smaller than that of the biggest bit can require time proportional to the compressed size of the bitmap, as the bitmap may need to be rewritten. Since this modifies the bitmap, this method is not thread-safe.
- Parameters:
i
- the index
- Returns:
- true if the value was unset
- Throws:
IndexOutOfBoundsException
- if i is negative or greater than Integer.MAX_VALUE - 64
-
set
public boolean set(int i)
Set the bit at position i to true. Though you can set the bits in any order (e.g., set(100), set(10), set(1)), you will typically get better performance if you set the bits in increasing order (e.g., set(1), set(10), set(100)). Setting a bit at a position larger than that of any currently set bit is a constant time operation. Setting a bit at a position smaller than that of an already set bit can require time proportional to the compressed size of the bitmap, as the bitmap may need to be rewritten. Since this modifies the bitmap, this method is not thread-safe.
- Parameters:
i
- the index
- Returns:
- true if the value was set
- Throws:
IndexOutOfBoundsException
- if i is negative or greater than Integer.MAX_VALUE - 64
-
setSizeInBitsWithinLastWord
public void setSizeInBitsWithinLastWord(int size)
Description copied from interface:BitmapStorage
Sets the size in bits of the bitmap as an *uncompressed* bitmap. Normally, this is used to reduce the size of the bitmaps within the scope of the last word. Specifically, this means that (sizeInBits()+63)/64 must be equal to (size +63)/64. If needed, the bitmap can be further padded with zeroes.
- Specified by:
setSizeInBitsWithinLastWord
in interface BitmapStorage
- Parameters:
size
- the size in bits
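The "within the last word" condition above can be checked with integer arithmetic. The following sketch evaluates (sizeInBits()+63)/64 == (size+63)/64 for a few values; LastWordCheck and its methods are illustrative names, not part of JavaEWAH, and sizes close to Integer.MAX_VALUE are assumed not to occur (the addition would overflow).

```java
final class LastWordCheck {
    static final int WORD_IN_BITS = 64;

    // Number of 64-bit words needed to hold sizeInBits bits (rounds up).
    static int wordsFor(int sizeInBits) {
        return (sizeInBits + WORD_IN_BITS - 1) / WORD_IN_BITS;
    }

    // True when newSize needs the same number of words as currentSize,
    // i.e. the change stays within the scope of the last word.
    static boolean withinLastWord(int currentSize, int newSize) {
        return wordsFor(currentSize) == wordsFor(newSize);
    }

    public static void main(String[] args) {
        System.out.println(withinLastWord(128, 100)); // true: both need 2 words
        System.out.println(withinLastWord(128, 64));  // false: 64 bits fit in 1 word
        System.out.println(withinLastWord(128, 129)); // false: 129 bits need 3 words
    }
}
```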
-
setSizeInBits
public boolean setSizeInBits(int size, boolean defaultValue)
Change the reported size in bits of the *uncompressed* bitmap represented by this compressed bitmap. It may change the underlying compressed bitmap. It is not possible to reduce the sizeInBits, but it can be extended. The new bits are set to false or true depending on the value of defaultValue. This method is not thread-safe.
- Parameters:
size
- the size in bits
defaultValue
- the default boolean value
- Returns:
- true if the update was possible
-
sizeInBits
public int sizeInBits()
Returns the size in bits of the *uncompressed* bitmap represented by this compressed bitmap. Initially, the sizeInBits is zero. It is extended automatically when you set bits to true. The current bitmap is not modified.
- Specified by:
sizeInBits
in interface LogicalElement<EWAHCompressedBitmap>
- Returns:
- the size in bits
-
sizeInBytes
public int sizeInBytes()
Report the *compressed* size of the bitmap (equivalent to memory usage, after accounting for some overhead).
- Specified by:
sizeInBytes
in interface LogicalElement<EWAHCompressedBitmap>
- Returns:
- the size in bytes
-
threshold
public static EWAHCompressedBitmap threshold(int t, EWAHCompressedBitmap... bitmaps)
Compute a Boolean threshold function: bits are true where at least t bitmaps have a true bit.
- Parameters:
t
- the threshold
bitmaps
- input data
- Returns:
- the aggregated bitmap
- Since:
- 0.8.1
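The semantics of the threshold function can be sketched without the library by counting, for each position, how many inputs contain it. Here each bitmap is represented simply as an array of its set-bit positions; ThresholdSketch is a hypothetical helper for illustration and says nothing about how JavaEWAH computes the result on compressed data. With t = 1 the function reduces to a union, and with t equal to the number of bitmaps it reduces to an intersection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ThresholdSketch {
    // Returns, in increasing order, the positions set in at least t of the inputs.
    // Each input array lists the set-bit positions of one bitmap, without duplicates.
    static List<Integer> threshold(int t, int[]... bitmaps) {
        Map<Integer, Integer> counts = new TreeMap<>(); // sorted by position
        for (int[] bitmap : bitmaps) {
            for (int position : bitmap) {
                counts.merge(position, 1, Integer::sum);
            }
        }
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Integer> entry : counts.entrySet()) {
            if (entry.getValue() >= t) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same positions as the four bitmaps in the class-level usage sample.
        int[] b1 = {0, 2, 55, 64, 1 << 30};
        int[] b2 = {1, 3, 64, 1 << 30};
        int[] b3 = {5, 55, 1 << 30};
        int[] b4 = {4, 66, 1 << 30};
        System.out.println(threshold(2, b1, b2, b3, b4)); // [55, 64, 1073741824]
    }
}
```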
-
thresholdWithContainer
public static void thresholdWithContainer(BitmapStorage container, int t, EWAHCompressedBitmap... bitmaps)
Compute a Boolean threshold function: bits are true where at least t bitmaps have a true bit. The content of the container is overwritten.
- Parameters:
t
- the threshold
bitmaps
- input data
container
- where we write the aggregated bitmap
- Since:
- 0.8.1
-
toArray
public int[] toArray()
Populates an array of sorted integers corresponding to the locations of the set bits.
- Returns:
- the array containing the location of the set bits
-
toDebugString
public String toDebugString()
A more detailed string describing the bitmap (useful for debugging).
- Returns:
- the string
-
toString
public String toString()
A string describing the bitmap.
-
swap
public void swap(EWAHCompressedBitmap other)
Swap the content of the bitmap with another.
- Parameters:
other
- bitmap to swap with
-
trim
public void trim()
Reduce the internal buffer to its minimal allowable size. This can free memory.
-
xor
public EWAHCompressedBitmap xor(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the bitwise XOR values of the current bitmap with some other bitmap. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage. The current bitmap is not modified.
- Specified by:
xor
in interface LogicalElement<EWAHCompressedBitmap>
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the EWAH compressed bitmap
-
xorToContainer
public void xorToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Computes the bitwise XOR values of the current bitmap with some other bitmap and stores the result in the container. The running time is proportional to the sum of the compressed sizes (as reported by sizeInBytes()). The current bitmap is not modified. The content of the container is overwritten.
- Parameters:
a
- the other bitmap (it will not be modified)
container
- where we store the result
- Since:
- 0.4.0
-
xorCardinality
public int xorCardinality(EWAHCompressedBitmap a)
Returns the cardinality of the result of a bitwise XOR of the values of the current bitmap with some other bitmap. Avoids allocating an intermediate bitmap to hold the result of the XOR. The current bitmap is not modified.
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the cardinality
- Since:
- 0.4.0
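The idea of counting without materializing the result can be sketched on uncompressed data. The sketch below is an assumption-laden illustration, not the library's implementation: it treats each long[] as a plain array of 64-bit words (no run-length compression), and the class name XorCardinalitySketch is hypothetical.

```java
// Illustrative stand-in: each long[] is an uncompressed bitmap made of
// 64-bit words, where bit k of word w represents position 64 * w + k.
final class XorCardinalitySketch {

    // Counts the positions set in exactly one of a and b,
    // without building the XOR bitmap itself.
    static int xorCardinality(long[] a, long[] b) {
        int words = Math.max(a.length, b.length);
        int count = 0;
        for (int i = 0; i < words; i++) {
            long wa = i < a.length ? a[i] : 0L; // missing words count as zero
            long wb = i < b.length ? b[i] : 0L;
            count += Long.bitCount(wa ^ wb);
        }
        return count;
    }
}
```

Only one running counter is kept, so memory use does not depend on the size of the result.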
-
compose
public EWAHCompressedBitmap compose(EWAHCompressedBitmap a)
Returns a new compressed bitmap containing the composition of the current bitmap with some other bitmap. The composition A.compose(B) is defined as follows: we retain the ith set bit of A only if the ith bit of B is set. For example, if you have the bitmap A = { 0, 1, 0, 1, 1, 0 } and want to keep only its second and third set bits, you can call A.compose(B) with B = { 0, 1, 1 } and you will get C = { 0, 0, 0, 1, 1, 0 }. If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage. The current bitmap is not modified.
- Specified by:
compose
in interface LogicalElement<EWAHCompressedBitmap>
- Parameters:
a
- the other bitmap (it will not be modified)
- Returns:
- the EWAH compressed bitmap
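The definition of composition can be checked against a minimal sketch. It models the operation on java.util.BitSet rather than on EWAHCompressedBitmap, so ComposeSketch and its methods are illustrative names and not part of the JavaEWAH API.

```java
import java.util.BitSet;

// Illustrative stand-in: models A.compose(B) on uncompressed
// java.util.BitSet values. Not part of JavaEWAH.
final class ComposeSketch {

    // Hypothetical helper: builds a BitSet with the given positions set.
    static BitSet of(int... positions) {
        BitSet bits = new BitSet();
        for (int p : positions) {
            bits.set(p);
        }
        return bits;
    }

    // Keeps the ith set bit of a only if bit i of b is set.
    static BitSet compose(BitSet a, BitSet b) {
        BitSet result = new BitSet(a.length());
        int rank = 0; // index of the current bit among the set bits of a
        for (int pos = a.nextSetBit(0); pos >= 0; pos = a.nextSetBit(pos + 1)) {
            if (b.get(rank)) {
                result.set(pos);
            }
            rank++;
        }
        return result;
    }
}
```

For A = { 0, 1, 0, 1, 1, 0 } the set positions are 1, 3 and 4, and B = { 0, 1, 1 } has positions 1 and 2 set. The set bits of A with rank 1 and 2 survive, giving positions 3 and 4, which is C = { 0, 0, 0, 1, 1, 0 }.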
-
composeToContainer
public void composeToContainer(EWAHCompressedBitmap a, EWAHCompressedBitmap container)
Computes a new compressed bitmap containing the composition of the current bitmap with some other bitmap. The composition A.compose(B) is defined as follows: we retain the ith set bit of A only if the ith bit of B is set. For example, if you have the bitmap A = { 0, 1, 0, 1, 1, 0 } and want to keep only its second and third set bits, you can call A.compose(B) with B = { 0, 1, 1 } and you will get C = { 0, 0, 0, 1, 1, 0 }. The current bitmap is not modified. The content of the container is overwritten.
- Parameters:
a
- the other bitmap (it will not be modified)
container
- where we store the result
-
andWithContainer
public static void andWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
For internal use. Computes the bitwise AND of the provided bitmaps and stores the result in the container. The content of the container is overwritten.
- Parameters:
container
- where the result is stored
bitmaps
- bitmaps to AND
- Since:
- 0.4.3
-
and
public static EWAHCompressedBitmap and(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise AND values of the provided bitmaps. It may or may not be faster than doing the aggregation two-by-two (A.and(B).and(C)). If only one bitmap is provided, it is returned as is. If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage.
- Parameters:
bitmaps
- bitmaps to AND together
- Returns:
- result of the AND
- Since:
- 0.4.3
-
andCardinality
public static int andCardinality(EWAHCompressedBitmap... bitmaps)
Returns the cardinality of the result of a bitwise AND of the values of the provided bitmaps. Avoids allocating an intermediate bitmap to hold the result of the AND.
- Parameters:
bitmaps
- bitmaps to AND
- Returns:
- the cardinality
- Since:
- 0.4.3
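A many-way cardinality can likewise be sketched on uncompressed words. This is an illustration under stated assumptions, not the library's code: each long[] is a plain array of 64-bit words, the class name AndCardinalitySketch is hypothetical, and returning 0 for an empty argument list is a choice made for this sketch only.

```java
// Illustrative stand-in: each long[] is an uncompressed bitmap made of
// 64-bit words, where bit k of word w represents position 64 * w + k.
final class AndCardinalitySketch {

    // Counts the positions set in every input, without building the AND bitmap.
    static int andCardinality(long[]... bitmaps) {
        if (bitmaps.length == 0) {
            return 0; // assumption for this sketch: no input means no set bits
        }
        // Beyond the shortest input, at least one bitmap is all zeros.
        int words = bitmaps[0].length;
        for (long[] b : bitmaps) {
            words = Math.min(words, b.length);
        }
        int count = 0;
        for (int i = 0; i < words; i++) {
            long w = ~0L; // start from all ones, then intersect
            for (long[] b : bitmaps) {
                w &= b[i];
            }
            count += Long.bitCount(w);
        }
        return count;
    }
}
```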
-
bitmapOf
public static EWAHCompressedBitmap bitmapOf(int... setBits)
Returns a bitmap with the bits set to true at the given positions. The positions should be given in sorted order. (This is a convenience method.)
- Parameters:
setBits
- list of set bit positions
- Returns:
- the bitmap
- Since:
- 0.4.5
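Because the positions should be given in sorted order, callers holding unsorted data may want to normalize it first. The helper below is a hypothetical sketch (SortedPositions and sortedDistinct are not part of JavaEWAH). It also removes duplicates, which is an assumption made here since the documentation does not say how repeated positions are handled.

```java
import java.util.Arrays;

// Hypothetical helper, not part of JavaEWAH: prepares positions
// for a method that expects them in sorted order.
final class SortedPositions {

    // Returns the positions in ascending order with duplicates removed.
    static int[] sortedDistinct(int... positions) {
        return Arrays.stream(positions).sorted().distinct().toArray();
    }
}
```

The resulting array can then be passed directly as the varargs argument, for example EWAHCompressedBitmap.bitmapOf(SortedPositions.sortedDistinct(64, 0, 55, 2)).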
-
orWithContainer
public static void orWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Uses an adaptive technique to compute the logical OR. Mostly for internal use. The content of the container is overwritten.
- Parameters:
container
- where the aggregate is written
bitmaps
- to be aggregated
-
xorWithContainer
public static void xorWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Uses an adaptive technique to compute the logical XOR. Mostly for internal use. The content of the container is overwritten.
- Parameters:
container
- where the aggregate is written
bitmaps
- to be aggregated
-
or
public static EWAHCompressedBitmap or(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise OR values of the provided bitmaps. This is typically faster than doing the aggregation two-by-two (A.or(B).or(C).or(D)). If only one bitmap is provided, it is returned as is. If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage.
- Parameters:
bitmaps
- bitmaps to OR together
- Returns:
- result of the OR
- Since:
- 0.4.0
-
xor
public static EWAHCompressedBitmap xor(EWAHCompressedBitmap... bitmaps)
Returns a new compressed bitmap containing the bitwise XOR values of the provided bitmaps. This is typically faster than doing the aggregation two-by-two (A.xor(B).xor(C).xor(D)). If only one bitmap is provided, it is returned as is. If you are not planning on adding to the resulting bitmap, you may call the trim() method to reduce memory usage.
- Parameters:
bitmaps
- bitmaps to XOR together
- Returns:
- result of the XOR
-
orCardinality
public static int orCardinality(EWAHCompressedBitmap... bitmaps)
Returns the cardinality of the result of a bitwise OR of the values of the provided bitmaps. Avoids allocating an intermediate bitmap to hold the result of the OR.
- Parameters:
bitmaps
- bitmaps to OR
- Returns:
- the cardinality
- Since:
- 0.4.0
-
shift
public EWAHCompressedBitmap shift(int b)
Generates a new bitmap shifted by "b" bits. If b is positive, the positions of all set bits are increased by b. The negative case is not supported.
- Parameters:
b
- number of bits
- Returns:
- new shifted bitmap
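The effect of a positive shift can be modeled with a minimal sketch on java.util.BitSet. ShiftSketch and its methods are illustrative names, not part of JavaEWAH. Throwing IllegalArgumentException for a negative argument is a choice made for this sketch; the documentation above only says that the negative case is not supported.

```java
import java.util.BitSet;

// Illustrative stand-in: models a positive shift on an uncompressed
// java.util.BitSet. Not part of JavaEWAH.
final class ShiftSketch {

    // Hypothetical helper: builds a BitSet with the given positions set.
    static BitSet of(int... positions) {
        BitSet bits = new BitSet();
        for (int p : positions) {
            bits.set(p);
        }
        return bits;
    }

    // Returns a new BitSet in which every set position is increased by b.
    static BitSet shift(BitSet bits, int b) {
        if (b < 0) {
            // Sketch-only behavior: reject the unsupported negative case.
            throw new IllegalArgumentException("negative shift is not supported");
        }
        BitSet result = new BitSet(bits.length() + b);
        for (int pos = bits.nextSetBit(0); pos >= 0; pos = bits.nextSetBit(pos + 1)) {
            result.set(pos + b);
        }
        return result;
    }
}
```

Shifting { 0, 2, 55 } by 10 yields { 10, 12, 65 }, and the input is left unchanged.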
-
-