Class ImmutableRoaringBitmap

  • All Implemented Interfaces:
    Cloneable, Iterable<Integer>, ImmutableBitmapDataProvider
    Direct Known Subclasses:
    MutableRoaringBitmap

    public class ImmutableRoaringBitmap
    extends Object
    implements Iterable<Integer>, Cloneable, ImmutableBitmapDataProvider
    ImmutableRoaringBitmap provides a compressed immutable (cannot be modified) bitmap. It is meant to be used with org.roaringbitmap.buffer.MutableRoaringBitmap, a derived class that adds methods to modify the bitmap. Because ImmutableRoaringBitmap is not final and has one derived class (org.roaringbitmap.buffer.MutableRoaringBitmap), the programmer can modify some ImmutableRoaringBitmap instances, but doing so invariably involves casting to other classes: if your code is written in terms of ImmutableRoaringBitmap instances, then your objects will be truly immutable, and thus easy to reason about.

    Pure (non-derived) instances of ImmutableRoaringBitmap have their data backed by a ByteBuffer. This has the benefit that they may be constructed from a ByteBuffer (useful for memory mapping). Objects of this class may reside almost entirely in memory-mapped files. That is the primary reason for them to be considered immutable, since no reallocation is possible when using memory-mapped files.

    From a language design point of view, instances of this class are immutable only when used as per the interface of the ImmutableRoaringBitmap class. Given that the class is not final, it is possible to modify instances through other interfaces. Thus we do not take the term "immutable" in a purist manner, but rather in a practical one.

    One of our motivations for this design, where MutableRoaringBitmap instances can be cast to ImmutableRoaringBitmap instances, is that bitmaps are often large, or used in a context where memory allocations are to be avoided, so we avoid forcing copies. Copies could otherwise be expected when one needs to mix and match ImmutableRoaringBitmap and MutableRoaringBitmap instances.
     
           import java.io.ByteArrayOutputStream;
           import java.io.DataOutputStream;
           import java.nio.ByteBuffer;
           import org.roaringbitmap.buffer.*;
    
           //...
    
           MutableRoaringBitmap rr1 = MutableRoaringBitmap.bitmapOf(1, 2, 3, 1000);
           MutableRoaringBitmap rr2 = MutableRoaringBitmap.bitmapOf(2, 3, 1010);
           ByteArrayOutputStream bos = new ByteArrayOutputStream();
           DataOutputStream dos = new DataOutputStream(bos);
           // could call "rr1.runOptimize()" and "rr2.runOptimize()" if
           // there were runs to compress
           rr1.serialize(dos);
           rr2.serialize(dos);
           dos.close();
           ByteBuffer bb = ByteBuffer.wrap(bos.toByteArray());
           // the first bitmap starts at the buffer's current position
           ImmutableRoaringBitmap rrback1 = new ImmutableRoaringBitmap(bb);
           // skip past the first bitmap to reach the second one
           bb.position(bb.position() + rrback1.serializedSizeInBytes());
           ImmutableRoaringBitmap rrback2 = new ImmutableRoaringBitmap(bb);
     
     
    See Also:
    MutableRoaringBitmap
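
    Since the description mentions memory mapping, the following is a minimal sketch of how a read-only ByteBuffer over a file can be obtained with the standard java.nio API. The class and method names (MappedBufferSketch, mapReadOnly, writeAndMap) are hypothetical and not part of the library. A buffer obtained this way could be handed to the ImmutableRoaringBitmap(ByteBuffer) constructor, assuming the file holds bitmaps serialized in the Roaring format.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, not part of the RoaringBitmap library.
final class MappedBufferSketch {
  // Maps the whole file read-only; the mapping remains valid after the channel is closed.
  static ByteBuffer mapReadOnly(Path file) {
    try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
      return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Round trip for illustration: write bytes to a temporary file, then map them back.
  static ByteBuffer writeAndMap(byte[] serialized) {
    try {
      Path tmp = Files.createTempFile("bitmaps", ".bin");
      tmp.toFile().deleteOnExit();
      Files.write(tmp, serialized);
      return mapReadOnly(tmp);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

    In this sketch the bytes stay in the mapped file rather than on the Java heap, which is the situation the class description has in mind.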
    • Constructor Detail

      • ImmutableRoaringBitmap

        protected ImmutableRoaringBitmap()
      • ImmutableRoaringBitmap

        public ImmutableRoaringBitmap​(ByteBuffer b)
        Constructs a new ImmutableRoaringBitmap starting at this ByteBuffer's position(). Only metadata is loaded into RAM; the rest is mapped to the ByteBuffer. The byte stream should abide by the format specification at https://github.com/RoaringBitmap/RoaringFormatSpec. It is not necessary that limit() on the input ByteBuffer indicates the end of the serialized data. After creating this ImmutableRoaringBitmap, you can advance to the rest of the data (if there is more) by setting b.position(b.position() + bitmap.serializedSizeInBytes()). Note that the input ByteBuffer is effectively copied (with the slice operation), so you should expect the provided ByteBuffer to remain unchanged. This constructor may throw an IndexOutOfBoundsException if the input is invalid or corrupted. It throws an InvalidRoaringFormat exception if the provided input does not have a valid cookie or suffers from similar problems.
        Parameters:
        b - data source
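
        The note about slicing can be illustrated with plain java.nio, independently of the library. The sketch below (hypothetical class SliceSketch) shows that a slice shares bytes with its source but has its own position and limit, which is why the caller must advance the source buffer explicitly. It models the behavior described above and is not the constructor's actual code.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of ByteBuffer.slice(); not library code.
final class SliceSketch {
  // Returns a view of the next `length` bytes; the source's position and limit are untouched.
  static ByteBuffer viewOfNext(ByteBuffer source, int length) {
    ByteBuffer view = source.slice(); // independent position/limit, shared content
    view.limit(length);
    return view;
  }

  // Moves the caller's cursor past a record of `length` bytes.
  static void skip(ByteBuffer source, int length) {
    source.position(source.position() + length);
  }
}
```

        With two records stored back to back, calling viewOfNext, then skip, then viewOfNext again yields one view per record, mirroring the position arithmetic in the constructor description.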
    • Method Detail

      • and

        public static MutableRoaringBitmap and​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                               long rangeStart,
                                               long rangeEnd)
        Computes AND between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
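
        Other entries in this class describe values as unsigned and mention the bound 0x100000000, so an exclusive range end can exceed what an int can hold. The sketch below (hypothetical class UnsignedRangeSketch) shows the arithmetic behind a long-valued [rangeStart, rangeEnd) range over unsigned 32-bit values; it is an illustration under that assumption, not the library's range-checking code.

```java
// Hypothetical illustration of unsigned 32-bit ranges expressed with longs.
final class UnsignedRangeSketch {
  // One past the largest unsigned 32-bit value; representable only as a long.
  static final long FULL_RANGE_END = 1L << 32; // 0x100000000

  // Maps a Java int (possibly negative) to its unsigned value in [0, 2^32).
  static long toUnsigned(int value) {
    return Integer.toUnsignedLong(value);
  }

  // True if `value`, read as unsigned, lies in [rangeStart, rangeEnd).
  static boolean inRange(int value, long rangeStart, long rangeEnd) {
    long v = toUnsigned(value);
    return rangeStart <= v && v < rangeEnd;
  }
}
```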
      • and

        @Deprecated
        public static MutableRoaringBitmap and​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                               int rangeStart,
                                               int rangeEnd)
        Deprecated.
        Use the version where longs specify the range. Negative range ends are illegal.
        Computes AND between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
      • xorCardinality

        public static int xorCardinality​(ImmutableRoaringBitmap x1,
                                         ImmutableRoaringBitmap x2)
        Cardinality of the bitwise XOR (symmetric difference) operation. The provided bitmaps are *not* modified. This operation is thread-safe as long as the provided bitmaps remain unchanged.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        Returns:
        cardinality of the symmetric difference
      • andNotCardinality

        public static int andNotCardinality​(ImmutableRoaringBitmap x1,
                                            ImmutableRoaringBitmap x2)
        Cardinality of the bitwise ANDNOT (left difference) operation. The provided bitmaps are *not* modified. This operation is thread-safe as long as the provided bitmaps remain unchanged.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        Returns:
        cardinality of the left difference
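
        Both cardinalities can be derived from the cardinality of the intersection without building the result set. The sketch below checks the two identities with java.util.BitSet; the class name CardinalitySketch and its helpers are hypothetical, and whether the library computes its results this way is not stated here.

```java
import java.util.BitSet;

// Hypothetical model using java.util.BitSet; not the library's implementation.
final class CardinalitySketch {
  static BitSet of(int... values) {
    BitSet bits = new BitSet();
    for (int v : values) {
      bits.set(v);
    }
    return bits;
  }

  // Works on a copy so that neither input is modified.
  static int andCardinality(BitSet a, BitSet b) {
    BitSet t = (BitSet) a.clone();
    t.and(b);
    return t.cardinality();
  }

  // |a XOR b| = |a| + |b| - 2 * |a AND b|
  static int xorCardinality(BitSet a, BitSet b) {
    return a.cardinality() + b.cardinality() - 2 * andCardinality(a, b);
  }

  // |a ANDNOT b| = |a| - |a AND b|
  static int andNotCardinality(BitSet a, BitSet b) {
    return a.cardinality() - andCardinality(a, b);
  }
}
```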
      • andNot

        public static MutableRoaringBitmap andNot​(ImmutableRoaringBitmap x1,
                                                  ImmutableRoaringBitmap x2,
                                                  long rangeStart,
                                                  long rangeEnd)
        Bitwise ANDNOT (difference) operation for the given range, rangeStart (inclusive) and rangeEnd (exclusive). The provided bitmaps are *not* modified. This operation is thread-safe as long as the provided bitmaps remain unchanged.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        rangeStart - beginning of the range (inclusive)
        rangeEnd - end of range (exclusive)
        Returns:
        result of the operation
      • andNot

        @Deprecated
        public static MutableRoaringBitmap andNot​(ImmutableRoaringBitmap x1,
                                                  ImmutableRoaringBitmap x2,
                                                  int rangeStart,
                                                  int rangeEnd)
        Deprecated.
        Use the version where longs specify the range. Negative values for range endpoints are not allowed.
        Bitwise ANDNOT (difference) operation for the given range, rangeStart (inclusive) and rangeEnd (exclusive). The provided bitmaps are *not* modified. This operation is thread-safe as long as the provided bitmaps remain unchanged.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        rangeStart - beginning of the range (inclusive)
        rangeEnd - end of range (exclusive)
        Returns:
        result of the operation
      • andNot

        public static MutableRoaringBitmap andNot​(ImmutableRoaringBitmap x1,
                                                  ImmutableRoaringBitmap x2)
        Bitwise ANDNOT (difference) operation. The provided bitmaps are *not* modified. This operation is thread-safe as long as the provided bitmaps remain unchanged.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        Returns:
        result of the operation
      • bitmapOf

        public static ImmutableRoaringBitmap bitmapOf​(int... data)
        Generate a bitmap with the specified values set to true. The provided integer values do not have to be sorted, but sorting them beforehand may improve performance. This function is equivalent to:
         
               (ImmutableRoaringBitmap) MutableRoaringBitmap.bitmapOf(data)
         
         
        Parameters:
        data - set values
        Returns:
        a new bitmap
      • flip

        public static MutableRoaringBitmap flip​(ImmutableRoaringBitmap bm,
                                                long rangeStart,
                                                long rangeEnd)
        Complements the bits in the given range, from rangeStart (inclusive) to rangeEnd (exclusive). The given bitmap is unchanged.
        Parameters:
        bm - bitmap being negated
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        a new bitmap
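
        The "input unchanged, new result" contract can be modeled with java.util.BitSet. The sketch below (hypothetical class FlipSketch) only illustrates the semantics on small non-negative indices and does not reflect how the library handles its compressed containers.

```java
import java.util.BitSet;

// Hypothetical model of a non-mutating range flip; not library code.
final class FlipSketch {
  // Returns a new set with the bits in [rangeStart, rangeEnd) complemented; `bm` is unchanged.
  static BitSet flip(BitSet bm, int rangeStart, int rangeEnd) {
    BitSet result = (BitSet) bm.clone();
    result.flip(rangeStart, rangeEnd);
    return result;
  }
}
```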
      • flip

        @Deprecated
        public static MutableRoaringBitmap flip​(ImmutableRoaringBitmap bm,
                                                int rangeStart,
                                                int rangeEnd)
        Deprecated.
        Use the version where longs specify the range.
        Complements the bits in the given range, from rangeStart (inclusive) to rangeEnd (exclusive). The given bitmap is unchanged.
        Parameters:
        bm - bitmap being negated
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        a new bitmap
      • intersects

        public static boolean intersects​(ImmutableRoaringBitmap x1,
                                         ImmutableRoaringBitmap x2)
        Checks whether the two bitmaps intersect. This can be much faster than calling "and" and checking the cardinality of the result.
        Parameters:
        x1 - first bitmap
        x2 - other bitmap
        Returns:
        true if they intersect
      • or

        public static MutableRoaringBitmap or​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                              long rangeStart,
                                              long rangeEnd)
        Computes OR between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
      • or

        @Deprecated
        public static MutableRoaringBitmap or​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                              int rangeStart,
                                              int rangeEnd)
        Deprecated.
        Use the version where longs specify the range. Negative range endpoints are forbidden.
        Computes OR between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
      • xor

        public static MutableRoaringBitmap xor​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                               long rangeStart,
                                               long rangeEnd)
        Computes XOR between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
      • xor

        @Deprecated
        public static MutableRoaringBitmap xor​(Iterator<? extends ImmutableRoaringBitmap> bitmaps,
                                               int rangeStart,
                                               int rangeEnd)
        Deprecated.
        Use the version where longs specify the range. Negative values are not allowed for rangeStart and rangeEnd.
        Computes XOR between input bitmaps in the given range, from rangeStart (inclusive) to rangeEnd (exclusive).
        Parameters:
        bitmaps - input bitmaps, these are not modified
        rangeStart - inclusive beginning of range
        rangeEnd - exclusive ending of range
        Returns:
        new result bitmap
      • contains

        public boolean contains​(int x)
        Checks whether the value is included, which is equivalent to checking if the corresponding bit is set (get in BitSet class).
        Specified by:
        contains in interface ImmutableBitmapDataProvider
        Parameters:
        x - integer value
        Returns:
        whether the integer value is included.
      • contains

        public boolean contains​(long minimum,
                                long supremum)
        Checks whether the bitmap contains every value in the range.
        Parameters:
        minimum - the inclusive lower bound of the range
        supremum - the exclusive upper bound of the range
        Returns:
        whether the bitmap contains the range
      • contains

        public boolean contains​(ImmutableRoaringBitmap subset)
        Checks whether the parameter is a subset of this bitmap.
        Parameters:
        subset - the potential subset
        Returns:
        true if the parameter is a subset of this bitmap
      • isHammingSimilar

        public boolean isHammingSimilar​(ImmutableRoaringBitmap other,
                                        int tolerance)
        Returns true if the other bitmap has no more than tolerance bits differing from this bitmap. The other may be transformed into a bitmap equal to this bitmap in no more than tolerance bit flips if this method returns true.
        Parameters:
        other - the bitmap to compare to
        tolerance - the maximum number of bits that may differ
        Returns:
        true if the number of differing bits is no more than tolerance
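
        The definition amounts to comparing the size of the symmetric difference with the tolerance. A minimal model with java.util.BitSet follows; the class HammingSketch is hypothetical, and the library may well reach the same answer without materializing the XOR.

```java
import java.util.BitSet;

// Hypothetical model of the Hamming-similarity test; not the library's implementation.
final class HammingSketch {
  // True if at most `tolerance` bit flips turn `b` into `a`.
  static boolean isHammingSimilar(BitSet a, BitSet b, int tolerance) {
    BitSet diff = (BitSet) a.clone();
    diff.xor(b); // bits that differ between the two sets
    return diff.cardinality() <= tolerance;
  }
}
```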
      • intersects

        public boolean intersects​(long minimum,
                                  long supremum)
        Checks if the range intersects with the bitmap.
        Parameters:
        minimum - the inclusive unsigned lower bound of the range
        supremum - the exclusive unsigned upper bound of the range
        Returns:
        whether the bitmap intersects with the range
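
        The range forms of contains and intersects differ in the same way "all" differs from "any". The sketch below models both on java.util.BitSet with int bounds; the class RangeQuerySketch is hypothetical, and the behavior for an empty range (minimum equal to supremum) is a choice of this sketch rather than something documented above.

```java
import java.util.BitSet;

// Hypothetical model of range queries; not the library's implementation.
final class RangeQuerySketch {
  // True if every value in [minimum, supremum) is set.
  static boolean containsRange(BitSet bits, int minimum, int supremum) {
    return bits.nextClearBit(minimum) >= supremum;
  }

  // True if at least one value in [minimum, supremum) is set.
  static boolean intersectsRange(BitSet bits, int minimum, int supremum) {
    int next = bits.nextSetBit(minimum); // -1 when no set bit remains
    return next >= 0 && next < supremum;
  }
}
```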
      • getLongCardinality

        public long getLongCardinality()
        Returns the number of distinct integers added to the bitmap (i.e., the number of bits set).
        Specified by:
        getLongCardinality in interface ImmutableBitmapDataProvider
        Returns:
        the cardinality
      • getCardinality

        public int getCardinality()
        Description copied from interface: ImmutableBitmapDataProvider
        Returns the number of distinct integers added to the bitmap (i.e., the number of bits set). Internally, this is computed as a 64-bit number.
        Specified by:
        getCardinality in interface ImmutableBitmapDataProvider
        Returns:
        the cardinality
      • forEach

        public void forEach​(IntConsumer ic)
        Description copied from interface: ImmutableBitmapDataProvider
        Visit all values in the bitmap and pass them to the consumer. Usage:
         
          bitmap.forEach(new IntConsumer() {
            @Override
            public void accept(int value) {
              // do something here
            }
          });
         
        Specified by:
        forEach in interface ImmutableBitmapDataProvider
        Parameters:
        ic - the consumer
      • getContainerPointer

        public MappeableContainerPointer getContainerPointer()
        Return a low-level container pointer that can be used to access the underlying data structure.
        Returns:
        container pointer
      • getLongSizeInBytes

        public long getLongSizeInBytes()
        Estimate of the memory usage of this data structure. This can be expected to be within 1% of the true memory usage in common usage scenarios. If exact measures are needed, we recommend using dedicated libraries such as ehcache-sizeofengine.

        When the bitmap is constructed from a ByteBuffer from a memory-mapped file, this estimate is invalid: we can expect the actual memory usage to be significantly (e.g., 10x) less.

        In adversarial cases, this estimate may be off from the actual memory usage by a factor of 10. For example, if you insert a single random value in a bitmap, then over 100 bytes may be used by the JVM whereas this function may return an estimate of 32 bytes. The same will be true in the "sparse" scenario where you have a small set of random-looking integers spanning a wide range of values. These are considered adversarial cases because, as a general rule, if your data looks like a set of random integers, Roaring bitmaps are probably not the right data structure.

        Note that you can serialize your Roaring bitmaps to disk and then construct ImmutableRoaringBitmap instances from a ByteBuffer. In such cases, the Java heap usage will be significantly less than what is reported.

        If your main goal is to compress arrays of integers, other libraries such as JavaFastPFOR may be more appropriate. Note, however, that in general, random integers (as produced by random number generators or hash functions) are not compressible. Trying to compress random data is an adversarial use case.
        Specified by:
        getLongSizeInBytes in interface ImmutableBitmapDataProvider
        Returns:
        estimated memory usage.
        See Also:
        JavaFastPFOR
      • getSizeInBytes

        public int getSizeInBytes()
        Estimate of the memory usage of this data structure. This can be expected to be within 1% of the true memory usage in common usage scenarios. If exact measures are needed, we recommend using dedicated libraries such as ehcache-sizeofengine.

        When the bitmap is constructed from a ByteBuffer from a memory-mapped file, this estimate is invalid: we can expect the actual memory usage to be significantly (e.g., 10x) less.

        In adversarial cases, this estimate may be off from the actual memory usage by a factor of 10. For example, if you insert a single random value in a bitmap, then over 100 bytes may be used by the JVM whereas this function may return an estimate of 32 bytes. The same will be true in the "sparse" scenario where you have a small set of random-looking integers spanning a wide range of values. These are considered adversarial cases because, as a general rule, if your data looks like a set of random integers, Roaring bitmaps are probably not the right data structure.

        Note that you can serialize your Roaring bitmaps to disk and then construct ImmutableRoaringBitmap instances from a ByteBuffer. In such cases, the Java heap usage will be significantly less than what is reported.

        If your main goal is to compress arrays of integers, other libraries such as JavaFastPFOR may be more appropriate. Note, however, that in general, random integers (as produced by random number generators or hash functions) are not compressible. Trying to compress random data is an adversarial use case.
        Specified by:
        getSizeInBytes in interface ImmutableBitmapDataProvider
        Returns:
        estimated memory usage.
        See Also:
        JavaFastPFOR
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object
      • hasRunCompression

        public boolean hasRunCompression()
        Check whether this bitmap has had its runs compressed.
        Returns:
        whether this bitmap has run compression
      • isEmpty

        public boolean isEmpty()
        Checks whether the bitmap is empty.
        Specified by:
        isEmpty in interface ImmutableBitmapDataProvider
        Returns:
        true if this bitmap contains no set bit
      • limit

        public MutableRoaringBitmap limit​(int maxcardinality)
        Create a new Roaring bitmap containing at most maxcardinality integers.
        Specified by:
        limit in interface ImmutableBitmapDataProvider
        Parameters:
        maxcardinality - maximal cardinality
        Returns:
        a new bitmap with cardinality no more than maxcardinality
      • rankLong

        public long rankLong​(int x)
        Returns the number of integers that are smaller than or equal to x (rankLong(infinity) would be getLongCardinality()).
        Specified by:
        rankLong in interface ImmutableBitmapDataProvider
        Parameters:
        x - upper limit
        Returns:
        the rank
        See Also:
        Ranking in statistics
      • rangeCardinality

        public long rangeCardinality​(long start,
                                     long end)
        Description copied from interface: ImmutableBitmapDataProvider
        Computes the number of values in the interval [start,end), where start is included and end is excluded. rangeCardinality(0,0x100000000) provides the total cardinality (getLongCardinality). The answer is a 64-bit value between 0 and 0x100000000.
        Specified by:
        rangeCardinality in interface ImmutableBitmapDataProvider
        Parameters:
        start - lower limit (included)
        end - upper limit (excluded)
        Returns:
        the number of elements in [start,end), between 0 and 0x100000000.
      • rank

        public int rank​(int x)
        Description copied from interface: ImmutableBitmapDataProvider
        Rank returns the number of integers that are smaller than or equal to x (rank(infinity) would be getCardinality()). The value is internally computed as a 64-bit number.
        Specified by:
        rank in interface ImmutableBitmapDataProvider
        Parameters:
        x - upper limit
        Returns:
        the rank
        See Also:
        Ranking in statistics
      • select

        public int select​(int j)
        Return the jth value stored in this bitmap, counting from zero. The provided value must be smaller than the cardinality; otherwise an IllegalArgumentException is thrown.
        Specified by:
        select in interface ImmutableBitmapDataProvider
        Parameters:
        j - index of the value
        Returns:
        the value
        See Also:
        Selection algorithm
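
        Rank and select are inverse-like operations: select(j) gives the jth smallest value and rank(x) counts the values not exceeding x. The sketch below models both on a sorted array of distinct values; the class RankSelectSketch is hypothetical, and it uses long values to sidestep unsigned comparisons. It illustrates the definitions only, not the library's container-based algorithm.

```java
import java.util.Arrays;

// Hypothetical model over a sorted array of distinct values; not library code.
final class RankSelectSketch {
  // Number of values in `sorted` that are smaller than or equal to x.
  static int rank(long[] sorted, long x) {
    int idx = Arrays.binarySearch(sorted, x);
    // Found at idx: idx + 1 values are <= x. Otherwise the insertion point counts values < x.
    return idx >= 0 ? idx + 1 : -idx - 1;
  }

  // The jth smallest value, counting from zero.
  static long select(long[] sorted, int j) {
    if (j < 0 || j >= sorted.length) {
      throw new IllegalArgumentException("j must be in [0, cardinality)");
    }
    return sorted[j];
  }
}
```

        In this model, rank(select(j)) equals j + 1 for every valid j.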
      • nextValue

        public long nextValue​(int fromValue)
        Description copied from interface: ImmutableBitmapDataProvider
        Returns the first value equal to or larger than the provided value (interpreted as an unsigned integer). If no such bit exists then -1 is returned. It is not necessarily a computationally efficient way to iterate through the values.
        Specified by:
        nextValue in interface ImmutableBitmapDataProvider
        Parameters:
        fromValue - the lower bound (inclusive)
        Returns:
        the smallest value larger than or equal to the specified value, or -1 if there is no such value
      • previousValue

        public long previousValue​(int fromValue)
        Description copied from interface: ImmutableBitmapDataProvider
        Returns the first value less than or equal to the provided value (interpreted as an unsigned integer). If no such bit exists then -1 is returned. It is not an efficient way to iterate through the values backwards.
        Specified by:
        previousValue in interface ImmutableBitmapDataProvider
        Parameters:
        fromValue - the upper bound (inclusive)
        Returns:
        the largest value less than or equal to the specified value, or -1 if there is no such value
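
        Because the argument is read as an unsigned integer while the result is a long, a result of -1 cannot be confused with a stored value. The sketch below models nextValue and previousValue on a TreeSet of unsigned values held as longs; the class NeighborSketch is hypothetical and is not how the library searches its containers.

```java
import java.util.TreeSet;

// Hypothetical model using a sorted set of unsigned values stored as longs.
final class NeighborSketch {
  // Smallest stored value >= fromValue (read as unsigned), or -1 if none.
  static long nextValue(TreeSet<Long> values, int fromValue) {
    Long v = values.ceiling(Integer.toUnsignedLong(fromValue));
    return v == null ? -1L : v;
  }

  // Largest stored value <= fromValue (read as unsigned), or -1 if none.
  static long previousValue(TreeSet<Long> values, int fromValue) {
    Long v = values.floor(Integer.toUnsignedLong(fromValue));
    return v == null ? -1L : v;
  }
}
```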
      • serialize

        public void serialize​(DataOutput out)
                       throws IOException
        Serialize this bitmap. See the format specification at https://github.com/RoaringBitmap/RoaringFormatSpec. Consider calling MutableRoaringBitmap.runOptimize() before serialization to improve compression if this is a MutableRoaringBitmap instance. The current bitmap is not modified. Advanced example: To serialize your bitmap to a ByteBuffer, you can do the following.
         
           // r is your bitmap
        
           // r.runOptimize(); // might improve compression, only if you have a
           // MutableRoaringBitmap instance.
           // next we create the ByteBuffer where the data will be stored
           ByteBuffer outbb = ByteBuffer.allocate(r.serializedSizeInBytes());
           // then we can serialize on a custom OutputStream
           r.serialize(new DataOutputStream(new OutputStream() {
             ByteBuffer mBB;
        
             OutputStream init(ByteBuffer mbb) {
               mBB = mbb;
               return this;
             }
        
             public void close() {}
        
             public void flush() {}
        
             public void write(int b) {
               mBB.put((byte) b);
             }
        
             public void write(byte[] b) {
               mBB.put(b);
             }
        
             public void write(byte[] b, int off, int l) {
               mBB.put(b, off, l);
             }
           }.init(outbb)));
           // outbb will now contain a serialized version of your bitmap
         
        Note: Java's data structures are in big endian format. Roaring serializes to a little endian format, so the bytes are flipped by the library during serialization to ensure that what is stored is in little endian, despite Java's big endianness. You can defeat this process by flipping the bytes again in a custom DataOutput, which could lead to serialized Roaring objects with an incorrect byte order.
        Specified by:
        serialize in interface ImmutableBitmapDataProvider
        Parameters:
        out - the DataOutput stream
        Throws:
        IOException - Signals that an I/O exception has occurred.
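
        The byte-order note can be made concrete with the standard library alone. The sketch below (hypothetical class EndianSketch) writes an int through a big-endian DataOutputStream after reversing its bytes, so the stored bytes are little endian, and reads it back with a little-endian ByteBuffer. It illustrates the general technique; the exact calls the library uses internally are not shown here.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration of little-endian output through a big-endian stream.
final class EndianSketch {
  // DataOutputStream.writeInt is big endian, so reversing the bytes first stores little endian.
  static byte[] writeLittleEndian(int value) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DataOutputStream dos = new DataOutputStream(bos)) {
      dos.writeInt(Integer.reverseBytes(value));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  // Reads the first four bytes as a little-endian int.
  static int readLittleEndian(byte[] bytes) {
    return ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN).getInt();
  }
}
```

        A custom DataOutput that reversed the bytes a second time would undo this and store big-endian data, which is the failure mode the note warns about.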
      • serializedSizeInBytes

        public int serializedSizeInBytes()
        Report the number of bytes required for serialization. This count will match the bytes written when calling the serialize method.
        Specified by:
        serializedSizeInBytes in interface ImmutableBitmapDataProvider
        Returns:
        the size in bytes
      • toArray

        public int[] toArray()
        Return the set values as an array if the cardinality is less than 2147483648. The integer values are in sorted order.
        Specified by:
        toArray in interface ImmutableBitmapDataProvider
        Returns:
        array representing the set values.
      • toMutableRoaringBitmap

        public MutableRoaringBitmap toMutableRoaringBitmap()
        Copies the content of this bitmap to a bitmap that can be modified.
        Returns:
        a mutable bitmap.
      • toRoaringBitmap

        public RoaringBitmap toRoaringBitmap()
        Copies this bitmap to a mutable RoaringBitmap.
        Returns:
        a copy of this bitmap as a RoaringBitmap.
      • toString

        public String toString()
        A string describing the bitmap.
        Overrides:
        toString in class Object
        Returns:
        the string