public final class EWAHCompressedBitmap extends Object implements Cloneable, Externalizable, Iterable<Integer>, BitmapStorage, LogicalElement<EWAHCompressedBitmap>
This implements the patent-free(1) EWAH scheme. Roughly speaking, it is a 64-bit variant of the BBC compression scheme used by Oracle for its bitmap indexes.
The goal of this compression scheme is to provide some compression while using as few CPU cycles as possible.
Once constructed, the bitmap is essentially immutable unless you call a mutating method such as "set" or "add". As long as no thread calls such methods, it can be safely used in multi-threaded programs.
This implementation is 64-bit: it assumes a 64-bit CPU together with a 64-bit Java Virtual Machine, and the same code may not be as fast on a 32-bit machine. There is also a 32-bit version of this code in the class javaewah32.EWAHCompressedBitmap32.
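To give a rough feel for why a word-aligned scheme can compress while staying cheap to process, the sketch below counts the 64-bit words that a simplified run-length encoding would need. It is an illustration under stated assumptions, not this class's actual storage format: the class RunLengthSketch and its methods are hypothetical, a "clean" word is taken to be all zeroes or all ones, and each marker word is assumed to describe one run of identical clean words followed by a block of literal words, with no cap on run lengths.

```java
/** Hypothetical helper: estimates the size of a simplified word-aligned run-length encoding. */
final class RunLengthSketch {

    /** A clean word has all 64 bits equal (all zeroes or all ones). */
    static boolean isClean(long word) {
        return word == 0L || word == -1L;
    }

    /**
     * Counts the words needed when each marker word describes one run of
     * identical clean words followed by a block of literal words.
     * Assumption: marker fields are unbounded (a real format would cap them).
     */
    static int compressedWordCount(long[] words) {
        int count = 0;
        int i = 0;
        while (i < words.length) {
            count++; // one marker word for this (clean run, literal block) pair
            if (isClean(words[i])) {
                long runValue = words[i];
                while (i < words.length && words[i] == runValue) {
                    i++; // clean words inside the run cost nothing extra
                }
            }
            while (i < words.length && !isClean(words[i])) {
                count++; // each literal word is stored verbatim
                i++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] sparse = new long[1000]; // all zeroes
        sparse[500] = 0b1010L;          // a single literal word in the middle
        System.out.println(compressedWordCount(sparse) + " words instead of " + sparse.length);
    }
}
```

In this model a long run of empty words costs a single marker, while dense regions cost one word per literal plus a marker, so the encoding never needs bit-level decoding.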
Here is a code sample to illustrate usage:
    EWAHCompressedBitmap ewahBitmap1 = EWAHCompressedBitmap.bitmapOf(0, 2, 55, 64, 1 << 30);
    EWAHCompressedBitmap ewahBitmap2 = EWAHCompressedBitmap.bitmapOf(1, 3, 64, 1 << 30);
    EWAHCompressedBitmap ewahBitmap3 = EWAHCompressedBitmap.bitmapOf(5, 55, 1 << 30);
    EWAHCompressedBitmap ewahBitmap4 = EWAHCompressedBitmap.bitmapOf(4, 66, 1 << 30);
    // Pairwise logical operations return new bitmaps.
    EWAHCompressedBitmap orbitmap = ewahBitmap1.or(ewahBitmap2);
    EWAHCompressedBitmap andbitmap = ewahBitmap1.and(ewahBitmap2);
    EWAHCompressedBitmap xorbitmap = ewahBitmap1.xor(ewahBitmap2);
    // Aggregate AND over several bitmaps at once.
    andbitmap = EWAHCompressedBitmap.and(ewahBitmap1, ewahBitmap2, ewahBitmap3, ewahBitmap4);
    // Serialize the first bitmap to a byte array.
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream oo = new ObjectOutputStream(bos);
    ewahBitmap1.writeExternal(oo);
    oo.close();
    ewahBitmap1 = null;
    // Read it back into a fresh bitmap.
    ewahBitmap1 = new EWAHCompressedBitmap();
    ByteArrayInputStream bis = new ByteArrayInputStream(bos.toByteArray());
    ewahBitmap1.readExternal(new ObjectInputStream(bis));
    // Bits that are set in at least 2 of the 4 bitmaps.
    EWAHCompressedBitmap threshold2 = EWAHCompressedBitmap.threshold(2, ewahBitmap1, ewahBitmap2, ewahBitmap3, ewahBitmap4);
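To make the results of the sample above concrete, the following sketch models each bitmap as a plain sorted set of set-bit positions and applies the same operations. It deliberately does not use this class: SetSemanticsSketch, of and threshold are hypothetical names introduced for illustration, and the sketch only assumes the semantics stated in the method descriptions (OR as union, AND as intersection, XOR as symmetric difference, threshold as "set in at least t inputs").

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model: each bitmap is represented by the sorted set of its set-bit positions. */
final class SetSemanticsSketch {

    /** Builds a sorted set from the given positions. */
    static SortedSet<Integer> of(int... positions) {
        SortedSet<Integer> set = new TreeSet<>();
        for (int p : positions) {
            set.add(p);
        }
        return set;
    }

    /** Returns the positions that appear in at least t of the given sets. */
    static SortedSet<Integer> threshold(int t, List<SortedSet<Integer>> sets) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (SortedSet<Integer> set : sets) {
            for (int p : set) {
                counts.merge(p, 1, Integer::sum); // count how many sets contain p
            }
        }
        SortedSet<Integer> result = new TreeSet<>();
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            if (e.getValue() >= t) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        SortedSet<Integer> b1 = of(0, 2, 55, 64, 1 << 30);
        SortedSet<Integer> b2 = of(1, 3, 64, 1 << 30);
        SortedSet<Integer> b3 = of(5, 55, 1 << 30);
        SortedSet<Integer> b4 = of(4, 66, 1 << 30);
        List<SortedSet<Integer>> all = Arrays.asList(b1, b2, b3, b4);

        SortedSet<Integer> or = new TreeSet<>(b1);
        or.addAll(b2);       // union
        SortedSet<Integer> and = new TreeSet<>(b1);
        and.retainAll(b2);   // intersection
        SortedSet<Integer> xor = new TreeSet<>(or);
        xor.removeAll(and);  // symmetric difference

        System.out.println("or = " + or);
        System.out.println("and = " + and);
        System.out.println("xor = " + xor);
        System.out.println("and of all four = " + threshold(4, all));
        System.out.println("threshold 2 = " + threshold(2, all));
    }
}
```

In this model, a threshold of 1 coincides with the aggregate OR and a threshold equal to the number of inputs coincides with the aggregate AND; for the four sample bitmaps, a threshold of 2 keeps positions 55, 64 and 1 << 30.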
For more details, see the following papers:
A 32-bit version of the compressed format was described by Wu et al. and named WBC:
Probably, the best prior art is the Oracle bitmap compression scheme (BBC):
(1) The authors do not know of any patent infringed by the following implementation. However, similar schemes, such as WAH, are covered by patents.
See Also: EWAHCompressedBitmap32, Serialized Form

Modifier and Type | Field and Description |
---|---|
static boolean | ADJUST_CONTAINER_SIZE_WHEN_AGGREGATING: whether we adjust after some aggregation by adding in zeroes |
static int | WORD_IN_BITS: The Constant WORD_IN_BITS represents the number of bits in a long. |
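Because WORD_IN_BITS is the number of bits in a long, a bit position in an uncompressed bitmap can be split into a word index and an offset within that word. The sketch below shows that arithmetic on a plain long[] array. It is an assumption-laden illustration rather than a description of this class's internals: WordIndexSketch and its methods are hypothetical, and the constant is redefined locally from Long.SIZE.

```java
/** Hypothetical helper showing how a bit position maps onto 64-bit words. */
final class WordIndexSketch {

    /** Number of bits in a long, mirroring what WORD_IN_BITS is documented to represent. */
    static final int WORD_IN_BITS = Long.SIZE;

    /** Index of the word that holds the given bit position. */
    static int wordIndex(int bit) {
        return bit / WORD_IN_BITS;
    }

    /** Offset of the given bit position inside its word. */
    static int bitOffset(int bit) {
        return bit % WORD_IN_BITS;
    }

    /** Reads one bit from an uncompressed array of words. */
    static boolean get(long[] words, int bit) {
        return (words[wordIndex(bit)] & (1L << bitOffset(bit))) != 0;
    }

    /** Sets one bit in an uncompressed array of words. */
    static void set(long[] words, int bit) {
        words[wordIndex(bit)] |= 1L << bitOffset(bit);
    }

    public static void main(String[] args) {
        long[] words = new long[2]; // room for bit positions 0..127
        set(words, 55);
        set(words, 64);
        System.out.println("bit 55 lives in word " + wordIndex(55) + " at offset " + bitOffset(55));
        System.out.println("bit 64 lives in word " + wordIndex(64) + " at offset " + bitOffset(64));
        System.out.println("bit 63 set? " + get(words, 63));
    }
}
```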
Constructor and Description |
---|
EWAHCompressedBitmap(): Creates an empty bitmap (no bit set to true). |
EWAHCompressedBitmap(ByteBuffer buffer): Creates a bitmap with the specified ByteBuffer backend. |
EWAHCompressedBitmap(int bufferSize): Sets explicitly the buffer size (in 64-bit words). |
EWAHCompressedBitmap(LongBuffer buffer): Creates a bitmap with the specified java.nio.LongBuffer backend. |
Modifier and Type | Method and Description |
---|---|
void | add(long newData): Deprecated. Use addWord() instead. |
void | add(long newData, int bitsThatMatter): Deprecated. Use addWord() instead. |
void | addLiteralWord(long newData): Adds a literal word directly to the bitmap (for expert use). |
void | addStreamOfEmptyWords(boolean v, long number): For experts: use this method to add many zeroes or ones. |
void | addStreamOfLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number): If you have several literal words to copy over, this might be faster. |
void | addStreamOfNegatedLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number): Same as addStreamOfLiteralWords, but the words are negated. |
void | addWord(long newData): Adds a word directly to the bitmap (for expert use). |
void | addWord(long newData, int bitsThatMatter): Adds a word directly to the bitmap (for expert use). |
static EWAHCompressedBitmap | and(EWAHCompressedBitmap... bitmaps): Returns a new compressed bitmap containing the bitwise AND values of the provided bitmaps. |
EWAHCompressedBitmap | and(EWAHCompressedBitmap a): Returns a new compressed bitmap containing the bitwise AND values of the current bitmap with some other bitmap. |
static int | andCardinality(EWAHCompressedBitmap... bitmaps): Returns the cardinality of the result of a bitwise AND of the values of the provided bitmaps. |
int | andCardinality(EWAHCompressedBitmap a): Returns the cardinality of the result of a bitwise AND of the values of the current bitmap with some other bitmap. |
EWAHCompressedBitmap | andNot(EWAHCompressedBitmap a): Returns a new compressed bitmap containing the bitwise AND NOT values of the current bitmap with some other bitmap. |
int | andNotCardinality(EWAHCompressedBitmap a): Returns the cardinality of the result of a bitwise AND NOT of the values of the current bitmap with some other bitmap. |
void | andNotToContainer(EWAHCompressedBitmap a, BitmapStorage container): Computes the bitwise AND NOT values of the current bitmap with some other bitmap and stores the result in the container. |
void | andToContainer(EWAHCompressedBitmap a, BitmapStorage container): Computes the bitwise AND values of the current bitmap with some other bitmap and stores the result in the container. |
static void | andWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps): For internal use. |
static EWAHCompressedBitmap | bitmapOf(int... setBits): Returns a bitmap with the bits set to true at the given positions. |
int | cardinality(): Reports the number of bits set to true. |
ChunkIterator | chunkIterator(): Iterator over the chunks of bits. |
void | clear(): Clears any set bits and sets the size in bits back to 0. |
boolean | clear(int i): Sets the bit at position i to false. |
IntIterator | clearIntIterator(): Iterator over the clear bits. |
EWAHCompressedBitmap | clone() |
EWAHCompressedBitmap | compose(EWAHCompressedBitmap a): Returns a new compressed bitmap containing the composition of the current bitmap with some other bitmap. |
void | composeToContainer(EWAHCompressedBitmap a, EWAHCompressedBitmap container): Computes a new compressed bitmap containing the composition of the current bitmap with some other bitmap. |
void | deserialize(DataInput in): Deserialize. |
boolean | equals(Object o): Checks whether the two compressed bitmaps contain the same set bits. |
boolean | get(int i): Queries the value of a single bit. |
EWAHIterator | getEWAHIterator(): Gets an EWAHIterator over the data. |
int | getFirstSetBit(): A lightweight method that returns the location of the first set bit (=1), or -1 if there is none. |
IteratingRLW | getIteratingRLW(): Gets an IteratingRLW to iterate over the data. |
List<Integer> | getPositions(): Deprecated. Use toList() instead. |
int | hashCode(): Returns a customized hash code (based on Karp-Rabin). |
boolean | intersects(EWAHCompressedBitmap a): Returns true if the two EWAHCompressedBitmap both have at least one true bit in the same position. |
IntIterator | intIterator(): Iterator over the set bits (this is what most people will want to use to browse the content if they want an iterator). |
boolean | isEmpty(): Checks whether this bitmap is empty (has a cardinality of zero). |
Iterator<Integer> | iterator(): Iterates over the positions of the true values. |
void | not(): Negates (bitwise) the current bitmap. |
static EWAHCompressedBitmap | or(EWAHCompressedBitmap... bitmaps): Returns a new compressed bitmap containing the bitwise OR values of the provided bitmaps. |
EWAHCompressedBitmap | or(EWAHCompressedBitmap a): Returns a new compressed bitmap containing the bitwise OR values of the current bitmap with some other bitmap. |
static int | orCardinality(EWAHCompressedBitmap... bitmaps): Returns the cardinality of the result of a bitwise OR of the values of the provided bitmaps. |
int | orCardinality(EWAHCompressedBitmap a): Returns the cardinality of the result of a bitwise OR of the values of the current bitmap with some other bitmap. |
void | orToContainer(EWAHCompressedBitmap a, BitmapStorage container): Computes the bitwise OR between the current bitmap and the bitmap "a". |
static void | orWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps): Uses an adaptive technique to compute the logical OR. |
void | readExternal(ObjectInput in) |
IntIterator | reverseIntIterator(): Iterator over the set bits in reverse order. |
void | serialize(DataOutput out): Serialize. |
int | serializedSizeInBytes(): Reports the number of bytes required to serialize this bitmap. The current bitmap is not modified. |
boolean | set(int i): Sets the bit at position i to true. |
boolean | setSizeInBits(int size, boolean defaultValue): Changes the reported size in bits of the *uncompressed* bitmap represented by this compressed bitmap. |
void | setSizeInBitsWithinLastWord(int size): Sets the size in bits of the bitmap as an *uncompressed* bitmap. |
EWAHCompressedBitmap | shift(int b): Generates a new bitmap shifted by "b" bits. |
int | sizeInBits(): Returns the size in bits of the *uncompressed* bitmap represented by this compressed bitmap. |
int | sizeInBytes(): Reports the *compressed* size of the bitmap (equivalent to memory usage, after accounting for some overhead). |
void | swap(EWAHCompressedBitmap other): Swaps the content of the bitmap with another. |
static EWAHCompressedBitmap | threshold(int t, EWAHCompressedBitmap... bitmaps): Computes a Boolean threshold function: bits are true where at least t bitmaps have a true bit. |
static void | thresholdWithContainer(BitmapStorage container, int t, EWAHCompressedBitmap... bitmaps): Computes a Boolean threshold function: bits are true where at least t bitmaps have a true bit. |
int[] | toArray(): Populates an array of sorted integers corresponding to the locations of the set bits. |
String | toDebugString(): A more detailed string describing the bitmap (useful for debugging). |
List<Integer> | toList(): Gets the locations of the true values as one list. |
String | toString(): A string describing the bitmap. |
void | trim(): Reduces the internal buffer to its minimal allowable size. |
void | writeExternal(ObjectOutput out) |
static EWAHCompressedBitmap | xor(EWAHCompressedBitmap... bitmaps): Returns a new compressed bitmap containing the bitwise XOR values of the provided bitmaps. |
EWAHCompressedBitmap | xor(EWAHCompressedBitmap a): Returns a new compressed bitmap containing the bitwise XOR values of the current bitmap with some other bitmap. |
int | xorCardinality(EWAHCompressedBitmap a): Returns the cardinality of the result of a bitwise XOR of the values of the current bitmap with some other bitmap. |
void | xorToContainer(EWAHCompressedBitmap a, BitmapStorage container): Computes a new compressed bitmap containing the bitwise XOR values of the current bitmap with some other bitmap. |
static void | xorWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps): Uses an adaptive technique to compute the logical XOR. |
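The cardinality variants in the table above report how many bits the corresponding logical operation would set. The sketch below illustrates that relationship with java.util.BitSet as a stand-in: it is only a model of the documented semantics, it does not call this class, and CardinalitySketch and its helper methods are hypothetical names introduced here.

```java
import java.util.BitSet;

/** Hypothetical model of the *Cardinality methods, using java.util.BitSet as a stand-in. */
final class CardinalitySketch {

    /** Builds a BitSet with the given positions set to true. */
    static BitSet of(int... positions) {
        BitSet bits = new BitSet();
        for (int p : positions) {
            bits.set(p);
        }
        return bits;
    }

    static int andCardinality(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone(); // BitSet operations mutate, so work on a copy
        copy.and(b);
        return copy.cardinality();
    }

    static int orCardinality(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.or(b);
        return copy.cardinality();
    }

    static int xorCardinality(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.xor(b);
        return copy.cardinality();
    }

    static int andNotCardinality(BitSet a, BitSet b) {
        BitSet copy = (BitSet) a.clone();
        copy.andNot(b);
        return copy.cardinality();
    }

    public static void main(String[] args) {
        BitSet a = of(0, 2, 55, 64);
        BitSet b = of(1, 3, 64);
        System.out.println("and: " + andCardinality(a, b));
        System.out.println("or: " + orCardinality(a, b));
        System.out.println("xor: " + xorCardinality(a, b));
        System.out.println("andNot: " + andNotCardinality(a, b));
        // Two bitmaps intersect exactly when their AND has at least one set bit.
        System.out.println("intersects: " + (andCardinality(a, b) > 0));
    }
}
```

Under these semantics, two bitmaps intersect exactly when the cardinality of their AND is greater than zero, and the cardinalities of the OR and the AND always add up to the sum of the two input cardinalities.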
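Methods inherited from class java.lang.Object: finalize, getClass, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable: forEach, spliterator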
public static final boolean ADJUST_CONTAINER_SIZE_WHEN_AGGREGATING
public static final int WORD_IN_BITS
public EWAHCompressedBitmap()
public EWAHCompressedBitmap(int bufferSize)
Parameters:
bufferSize - number of 64-bit words reserved when the object is created
public EWAHCompressedBitmap(ByteBuffer buffer)
Parameters:
buffer - data source
public EWAHCompressedBitmap(LongBuffer buffer)
Parameters:
buffer - data source
@Deprecated public void add(long newData)
Parameters:
newData - the word
@Deprecated public void add(long newData, int bitsThatMatter)
Parameters:
newData - the word
bitsThatMatter - the number of significant bits (by default it should be 64)
public void addWord(long newData)
Specified by: addWord in interface BitmapStorage
Parameters:
newData - the word
public void addWord(long newData, int bitsThatMatter)
Parameters:
newData - the word
bitsThatMatter - the number of significant bits (by default it should be 64)
public void addLiteralWord(long newData)
Specified by: addLiteralWord in interface BitmapStorage
Parameters:
newData - the word
public void addStreamOfLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
Specified by: addStreamOfLiteralWords in interface BitmapStorage
Parameters:
buffer - the buffer wrapping the literal words
start - the starting point in the array
number - the number of literal words to add
public void addStreamOfEmptyWords(boolean v, long number)
Specified by: addStreamOfEmptyWords in interface BitmapStorage
Parameters:
v - the boolean value
number - the number
public void addStreamOfNegatedLiteralWords(com.googlecode.javaewah.Buffer buffer, int start, int number)
Specified by: addStreamOfNegatedLiteralWords in interface BitmapStorage
Parameters:
buffer - the buffer wrapping the literal words
start - the starting point in the array
number - the number of literal words to add
public EWAHCompressedBitmap and(EWAHCompressedBitmap a)
Specified by: and in interface LogicalElement<EWAHCompressedBitmap>
Parameters:
a - the other bitmap (it will not be modified)
public void andToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Parameters:
a - the other bitmap (it will not be modified)
container - where we store the result
public int andCardinality(EWAHCompressedBitmap a)
Parameters:
a - the other bitmap (it will not be modified)
public EWAHCompressedBitmap andNot(EWAHCompressedBitmap a)
Specified by: andNot in interface LogicalElement<EWAHCompressedBitmap>
Parameters:
a - the other bitmap (it will not be modified)
public void andNotToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Parameters:
a - the other bitmap (it will not be modified)
container - where to store the result
public int andNotCardinality(EWAHCompressedBitmap a)
Parameters:
a - the other bitmap (it will not be modified)
public int cardinality()
public void clear()
Specified by: clear in interface BitmapStorage
public EWAHCompressedBitmap clone() throws CloneNotSupportedException
Overrides: clone in class Object
Throws: CloneNotSupportedException
public void serialize(DataOutput out) throws IOException
Parameters:
out - the DataOutput stream
Throws: IOException - Signals that an I/O exception has occurred.
public void deserialize(DataInput in) throws IOException
Parameters:
in - the DataInput stream
Throws: IOException - Signals that an I/O exception has occurred.
public boolean equals(Object o)
Overrides: equals in class Object
See Also: Object.equals(java.lang.Object)
public EWAHIterator getEWAHIterator()
public IteratingRLW getIteratingRLW()
EWAHCompressedBitmap n = IteratorUtil.materialize(bitmap.getIteratingRLW());
n.setSizeInBitsWithinLastWord(bitmap.sizeInBits());
The current bitmap is not modified.
@Deprecated public List<Integer> getPositions()
public List<Integer> toList()
public int hashCode()
public boolean intersects(EWAHCompressedBitmap a)
Parameters:
a - the other bitmap (it will not be modified)
public IntIterator intIterator()
public IntIterator reverseIntIterator()
public boolean isEmpty()
public IntIterator clearIntIterator()
public ChunkIterator chunkIterator()
public Iterator<Integer> iterator()
public void not()
Specified by: not in interface LogicalElement<EWAHCompressedBitmap>
public EWAHCompressedBitmap or(EWAHCompressedBitmap a)
Specified by: or in interface LogicalElement<EWAHCompressedBitmap>
Parameters:
a - the other bitmap (it will not be modified)
public void orToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Parameters:
a - the other bitmap (it will not be modified)
container - where we store the result
public int orCardinality(EWAHCompressedBitmap a)
Parameters:
a - the other bitmap (it will not be modified)
public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
Specified by: readExternal in interface Externalizable
Throws: IOException, ClassNotFoundException
public void writeExternal(ObjectOutput out) throws IOException
Specified by: writeExternal in interface Externalizable
Throws: IOException
public int serializedSizeInBytes()
public boolean get(int i)
Parameters:
i - the bit we are interested in
public int getFirstSetBit()
public boolean clear(int i)
Parameters:
i - the index
Throws: IndexOutOfBoundsException - if i is negative or greater than Integer.MAX_VALUE - 64
public boolean set(int i)
Parameters:
i - the index
Throws: IndexOutOfBoundsException - if i is negative or greater than Integer.MAX_VALUE - 64
public void setSizeInBitsWithinLastWord(int size)
Description copied from interface: BitmapStorage
Specified by: setSizeInBitsWithinLastWord in interface BitmapStorage
Parameters:
size - the size in bits
public boolean setSizeInBits(int size, boolean defaultValue)
Parameters:
size - the size in bits
defaultValue - the default boolean value
public int sizeInBits()
Specified by: sizeInBits in interface LogicalElement<EWAHCompressedBitmap>
public int sizeInBytes()
Specified by: sizeInBytes in interface LogicalElement<EWAHCompressedBitmap>
public static EWAHCompressedBitmap threshold(int t, EWAHCompressedBitmap... bitmaps)
Parameters:
t - the threshold
bitmaps - input data
public static void thresholdWithContainer(BitmapStorage container, int t, EWAHCompressedBitmap... bitmaps)
Parameters:
t - the threshold
bitmaps - input data
container - where we write the aggregated bitmap
public int[] toArray()
public String toDebugString()
public String toString()
public void swap(EWAHCompressedBitmap other)
Parameters:
other - bitmap to swap with
public void trim()
public EWAHCompressedBitmap xor(EWAHCompressedBitmap a)
Specified by: xor in interface LogicalElement<EWAHCompressedBitmap>
Parameters:
a - the other bitmap (it will not be modified)
public void xorToContainer(EWAHCompressedBitmap a, BitmapStorage container)
Parameters:
a - the other bitmap (it will not be modified)
container - where we store the result
public int xorCardinality(EWAHCompressedBitmap a)
Parameters:
a - the other bitmap (it will not be modified)
public EWAHCompressedBitmap compose(EWAHCompressedBitmap a)
Specified by: compose in interface LogicalElement<EWAHCompressedBitmap>
Parameters:
a - the other bitmap (it will not be modified)
public void composeToContainer(EWAHCompressedBitmap a, EWAHCompressedBitmap container)
Parameters:
a - the other bitmap (it will not be modified)
container - where we store the result
public static void andWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Parameters:
container - where the result is stored
bitmaps - bitmaps to AND
public static EWAHCompressedBitmap and(EWAHCompressedBitmap... bitmaps)
Parameters:
bitmaps - bitmaps to AND together
public static int andCardinality(EWAHCompressedBitmap... bitmaps)
Parameters:
bitmaps - bitmaps to AND
public static EWAHCompressedBitmap bitmapOf(int... setBits)
Parameters:
setBits - list of set bit positions
public static void orWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Parameters:
container - where the aggregate is written.
bitmaps - to be aggregated
public static void xorWithContainer(BitmapStorage container, EWAHCompressedBitmap... bitmaps)
Parameters:
container - where the aggregate is written.
bitmaps - to be aggregated
public static EWAHCompressedBitmap or(EWAHCompressedBitmap... bitmaps)
Parameters:
bitmaps - bitmaps to OR together
public static EWAHCompressedBitmap xor(EWAHCompressedBitmap... bitmaps)
Parameters:
bitmaps - bitmaps to XOR together
public static int orCardinality(EWAHCompressedBitmap... bitmaps)
Parameters:
bitmaps - bitmaps to OR
public EWAHCompressedBitmap shift(int b)
Parameters:
b - number of bits
Copyright © 2016. All Rights Reserved.