Class VectorTree<T extends ACell>

Type Parameters:
T - Type of Vector elements
All Implemented Interfaces:
IAssociative<CVMLong,T>, IValidated, IWriteable, Iterable<T>, Collection<T>, List<T>

public class VectorTree<T extends ACell> extends AVector<T>
Persistent Vector implemented as a Merkle tree of chunks. The shift indicates the level of the tree: 4 = first level, 8 = second level, and so on. Invariants:
  • All children except the last must be fully packed
  • Each non-terminal leaf chunk must be a tailless VectorLeaf of size 16
This implies that the size of the entire tree must be a multiple of 16. This is a desirable property, as we want dense trees in our canonical representation. Any extra elements must be stored in a VectorLeaf.
This structure supports fast, approximately O(log n) lookup and element update, and usually O(1) element addition and lookup at the end.
"Software gets slower faster than hardware gets faster" - Niklaus Wirth
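As an illustration of the size invariants above, the following sketch splits an element count into the part that could live in a dense tree of 16-element chunks and the remainder that would be left for a tail. The names `ChunkSplitSketch`, `treePart` and `tailPart` are hypothetical, and the handling of counts below 32 is an assumption based on the minimum size stated for create(...); this is not the library's code.

```java
// Hypothetical helper, not part of the library: splits a count into a
// chunk-aligned tree part and a tail of leftover elements.
final class ChunkSplitSketch {
    static final int CHUNK_SIZE = 16;   // leaf chunk size stated in the invariants
    static final int MINIMUM_SIZE = 32; // minimum tree size stated in the create(...) docs

    // Number of elements that fit into whole 16-element chunks,
    // or 0 if the count is too small to form a tree at all (assumption).
    static long treePart(long count) {
        long aligned = count & ~(long) (CHUNK_SIZE - 1); // round down to a multiple of 16
        return (aligned >= MINIMUM_SIZE) ? aligned : 0;
    }

    // Elements left over once the tree part is removed.
    static long tailPart(long count) {
        return count - treePart(count);
    }
}
```

Under these assumptions a count of 100 splits into a tree part of 96 and a tail of 4, while a count of 31 has no tree part.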
  • Field Details

    • MINIMUM_SIZE

      public static final int MINIMUM_SIZE
    • MAX_EMBEDDED_LENGTH

      public static final int MAX_EMBEDDED_LENGTH
    • MAX_ENCODING_LENGTH

      public static final int MAX_ENCODING_LENGTH
  • Method Details

    • computeShift

      public static int computeShift(long count)
      Computes the shift value for a VectorTree of the given count. Note: if this returns zero, the count cannot be supported by a valid VectorTree.
      Parameters:
      count - Number of elements
      Returns:
      Shift value
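      One reading of the shift that is consistent with the class description (4 = first level, 8 = second level) is sketched below. `ShiftSketch` is a hypothetical stand-in, and the behaviour at exact powers of 16 is an assumption; the real method may differ.

```java
// Hypothetical re-implementation for illustration only.
final class ShiftSketch {
    // Returns the smallest multiple of 4 such that 2^(shift + 4) >= count,
    // or 0 when the count fits in a single 16-element chunk (no tree needed).
    static int computeShift(long count) {
        int shift = 0;
        // The bound on shift keeps (shift + 4) below 64, so the left shift
        // never wraps around for very large counts.
        while (shift < 60 && (1L << (shift + 4)) < count) {
            shift += 4;
        }
        return shift;
    }
}
```

      The sketch does not enforce the minimum size or the multiple-of-16 rule; it only shows how a level could be derived from a count.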
    • create

      public static <T extends ACell> VectorTree<T> create(ACell[] things, int offset, int length)
      Creates a VectorTree with the specified elements. things must have at least 32 elements (the minimum VectorTree size), and the element count must be a whole multiple of 16 (complete chunks only).
      Parameters:
      things - Elements to include
      offset - Offset into element array
      length - Number of elements to include
      Returns:
      New VectorTree instance
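      The constraints on the arguments can be expressed as a simple range check. This is a sketch only: `CreateArgsSketch` and `isValidRange` are hypothetical names, and applying the size constraints to length (rather than to the whole array) is an assumption.

```java
// Hypothetical argument check, not part of the library.
final class CreateArgsSketch {
    // Returns true if (offset, length) selects a range inside an array of
    // 'arrayLength' elements that satisfies the documented size constraints.
    static boolean isValidRange(int arrayLength, int offset, int length) {
        if (offset < 0 || length < 0) return false;
        if ((long) offset + length > arrayLength) return false; // range must fit in the array
        if (length < 32) return false;                          // minimum size
        return (length % 16) == 0;                              // whole chunks only
    }
}
```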
    • get

      public T get(long i)
      Description copied from class: AVector
      Gets the element at the specified index in this vector
      Specified by:
      get in class AVector<T extends ACell>
      Parameters:
      i - The index of the element to get
      Returns:
      The element value at the specified index
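      To show how a shift-based tree of 16-way chunks could locate an element, the sketch below computes the child index selected at each level for a given element index. `IndexPathSketch` and `path` are hypothetical names, and the navigation scheme is an assumption inferred from the class description, not the library's implementation.

```java
// Hypothetical illustration of index navigation in a 16-way tree.
final class IndexPathSketch {
    // Returns the child index chosen at each level, from the root (at the
    // given shift) down to the position inside the 16-element leaf chunk.
    static int[] path(long index, int shift) {
        int levels = (shift / 4) + 1;
        int[] result = new int[levels];
        for (int level = 0; level < levels; level++) {
            int s = shift - 4 * level;                    // shift in effect at this level
            result[level] = (int) ((index >>> s) & 0xF);  // 4 bits select one of 16 children
        }
        return result;
    }
}
```

      For example, `IndexPathSketch.path(0x123, 8)` yields `[1, 2, 3]`: child 1 at the root, child 2 at the next level, and position 3 inside the leaf chunk.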
    • getElementRef

      public Ref<T> getElementRef(long i)
      Description copied from class: ASequence
      Gets the element Ref at the specified index
      Specified by:
      getElementRef in class ASequence<T extends ACell>
      Parameters:
      i - Index of element to get
      Returns:
      Ref to element at specified index
    • assoc

      public <R extends ACell> AVector<R> assoc(long i, R value)
      Description copied from class: ASequence
      Updates a value at the given position in the sequence.
      Specified by:
      assoc in class AVector<T extends ACell>
      Parameters:
      i - Index of element to update
      value - New element value
      Returns:
      Updated sequence, or null if index is out of range
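      A persistent update leaves the original vector untouched and shares unchanged structure with the result. The sketch below shows the idea on a plain table of 16-element chunks; `AssocSketch` is hypothetical and uses ordinary arrays rather than the library's cells and Refs.

```java
// Hypothetical copy-on-write update over a table of 16-element chunks.
final class AssocSketch {
    // Returns a new chunk table with element i replaced, sharing every
    // untouched chunk with the original. Returns null if i is out of range.
    static Object[][] assoc(Object[][] chunks, long i, Object value) {
        long count = chunks.length * 16L;  // assumes every chunk holds exactly 16 elements
        if (i < 0 || i >= count) return null;

        int chunkIndex = (int) (i >>> 4);  // which chunk
        int pos = (int) (i & 0xF);         // position within the chunk

        Object[][] newChunks = chunks.clone();          // shallow copy of the spine
        Object[] newChunk = chunks[chunkIndex].clone(); // copy only the affected chunk
        newChunk[pos] = value;
        newChunks[chunkIndex] = newChunk;
        return newChunks;
    }
}
```

      Only the spine and one chunk are copied, so the cost of an update does not depend on the number of untouched chunks.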
    • encode

      public int encode(byte[] bs, int pos)
      Description copied from class: ACell
      Writes this Cell's encoding to a byte array, including a tag byte which will be written first. Cell must be canonical, or else an error may occur.
      Specified by:
      encode in interface IWriteable
      Specified by:
      encode in class ACollection<T extends ACell>
      Parameters:
      bs - A byte array to which to write the encoding
      pos - The offset into the byte array
      Returns:
      New position after writing
    • encodeRaw

      public int encodeRaw(byte[] bs, int pos)
      Description copied from class: ACell
      Writes this Cell's encoding to a byte array, excluding the tag byte.
      Specified by:
      encodeRaw in class AVector<T extends ACell>
      Parameters:
      bs - A byte array to which to write the encoding
      pos - The offset into the byte array
      Returns:
      New position after writing
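      The relationship between encode and encodeRaw described above (tag byte first, then the raw body, each returning the new write position) can be sketched as follows. `EncodeSketch`, its payload field and the tag value are all hypothetical and chosen only for illustration; they do not reflect the actual encoding format.

```java
// Hypothetical illustration of the encode / encodeRaw split.
final class EncodeSketch {
    static final byte TAG = (byte) 0x80; // made-up tag value, for illustration only

    final byte[] payload;

    EncodeSketch(byte[] payload) {
        this.payload = payload;
    }

    // Writes the tag byte first, then delegates to encodeRaw for the body.
    int encode(byte[] bs, int pos) {
        bs[pos++] = TAG;
        return encodeRaw(bs, pos);
    }

    // Writes the body without the tag byte and returns the new position.
    int encodeRaw(byte[] bs, int pos) {
        System.arraycopy(payload, 0, bs, pos, payload.length);
        return pos + payload.length;
    }
}
```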
    • getEncodingLength

      public int getEncodingLength()
      Description copied from class: ACell
      Method to calculate the encoding length of a Cell. May be overridden to avoid creating encodings during memory size calculations. This reduces hashing!
      Overrides:
      getEncodingLength in class ACell
      Returns:
      Exact encoding length of this Cell
    • estimatedEncodingSize

      public int estimatedEncodingSize()
      Description copied from interface: IWriteable
      Estimates the encoded data size for this Cell. Used for quickly sizing buffers. Implementations should try to return a size that is highly likely to contain the entire object when encoded, including the tag byte. Should not traverse soft Refs, i.e. must be usable on arbitrary partial data structures.
      Returns:
      The estimated size for the binary representation of this object.
    • read

      public static <T extends ACell> VectorTree<T> read(ByteBuffer bb, long count) throws BadFormatException
      Reads a VectorTree from the provided ByteBuffer. Assumes the header byte and count have already been read.
      Parameters:
      bb - ByteBuffer to read from
      count - Number of elements, assumed to be valid
      Returns:
      VectorTree instance as read from the ByteBuffer
      Throws:
      BadFormatException - If encoding is invalid
    • read

      public static <T extends ACell> VectorTree<T> read(long count, Blob b, int pos) throws BadFormatException
      Throws:
      BadFormatException
    • appendChunk

      public VectorTree<T> appendChunk(VectorLeaf<T> b)
      Description copied from class: AVector
      Appends a VectorLeaf chunk to this vector. This vector must contain a whole number of chunks.
      Specified by:
      appendChunk in class AVector<T extends ACell>
      Parameters:
      b - A chunk to append. Must be a VectorLeaf of maximum size
      Returns:
      The updated vector, of the same type as this vector
    • isFullyPacked

      public boolean isFullyPacked()
      Description copied from class: AVector
      Returns true if this vector is a single fully packed tree, i.e. a full VectorLeaf or VectorTree.
      Specified by:
      isFullyPacked in class AVector<T extends ACell>
      Returns:
      true if fully packed, false otherwise
    • append

      public AVector<T> append(T value)
      Description copied from class: AVector
      Appends a single element to this vector
      Specified by:
      append in class AVector<T extends ACell>
      Parameters:
      value - Value to append
      Returns:
      Updated vector
    • concat

      public <R extends ACell> AVector<R> concat(ASequence<R> b)
      Description copied from class: ASequence
      Concatenates the elements from another sequence to the end of this sequence. Potentially O(n) in the size of the resulting sequence.
      Specified by:
      concat in class AVector<T extends ACell>
      Parameters:
      b - A sequence of values to concatenate.
      Returns:
      The concatenated sequence, of the same type as this sequence.
    • copyToArray

      protected <K> void copyToArray(K[] arr, int offset)
      Description copied from class: ACollection
      Copies the elements of this collection in order to an array at the specified offset
      Specified by:
      copyToArray in class ACollection<T extends ACell>
      Type Parameters:
      K - Type of array elements required
    • longIndexOf

      public long longIndexOf(Object o)
      Description copied from class: ASequence
      Gets the first long index at which the specified value appears in the sequence. Similar to Java's standard List.indexOf(...) but supports long indexes.
      Specified by:
      longIndexOf in class ASequence<T extends ACell>
      Parameters:
      o - Any value which could appear as an element of the sequence.
      Returns:
      Index of the value, or -1 if not found.
    • longLastIndexOf

      public long longLastIndexOf(Object o)
      Description copied from class: ASequence
      Gets the last long index at which the specified value appears in the sequence. Similar to Java's standard List.lastIndexOf(...) but supports long indexes.
      Specified by:
      longLastIndexOf in class ASequence<T extends ACell>
      Parameters:
      o - Any value which could appear as an element of the sequence.
      Returns:
      Index of the value, or -1 if not found.
    • listIterator

      public ListIterator<T> listIterator()
    • listIterator

      public ListIterator<T> listIterator(long index)
      Description copied from class: ASequence
      Gets the ListIterator for a long position
      Specified by:
      listIterator in class AVector<T extends ACell>
      Returns:
      ListIterator instance.
    • forEach

      public void forEach(Consumer<? super T> action)
      Specified by:
      forEach in interface Iterable<T extends ACell>
      Specified by:
      forEach in class ASequence<T extends ACell>
    • anyMatch

      public boolean anyMatch(Predicate<? super T> pred)
      Specified by:
      anyMatch in class AVector<T extends ACell>
    • allMatch

      public boolean allMatch(Predicate<? super T> pred)
      Specified by:
      allMatch in class AVector<T extends ACell>
    • map

      public <R extends ACell> AVector<R> map(Function<? super T,? extends R> mapper)
      Description copied from class: ACollection
      Maps a function over a collection, applying it to each element in turn.
      Specified by:
      map in class AVector<T extends ACell>
      Type Parameters:
      R - Type of element in resulting collection
      Parameters:
      mapper - Function to map over collection
      Returns:
      Collection after function applied to each element
    • visitElementRefs

      public void visitElementRefs(Consumer<Ref<T>> f)
      Description copied from class: ASequence
      Visits all elements in this sequence, calling the specified consumer for each.
      Specified by:
      visitElementRefs in class ASequence<T extends ACell>
      Parameters:
      f - Function to call for each element
    • reduce

      public <R> R reduce(BiFunction<? super R,? super T,? extends R> func, R value)
      Specified by:
      reduce in class AVector<T extends ACell>
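      This method carries no description here. Judging by the signature alone, it appears to be a fold that starts from value and combines it with each element in turn. The sketch below shows that interpretation over a plain Iterable; `ReduceSketch` is hypothetical, and left-to-right order is an assumption.

```java
// Hypothetical left fold with the same functional signature.
final class ReduceSketch {
    static <T, R> R reduce(Iterable<T> items,
            java.util.function.BiFunction<? super R, ? super T, ? extends R> func,
            R value) {
        R acc = value;                   // start from the initial value
        for (T item : items) {
            acc = func.apply(acc, item); // combine the accumulator with each element in order
        }
        return acc;
    }
}
```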
    • spliterator

      public Spliterator<T> spliterator(long position)
      Specified by:
      spliterator in class AVector<T extends ACell>
    • isCanonical

      public boolean isCanonical()
      Description copied from class: AVector
      Returns true if this vector is in canonical format, i.e. suitable as top-level serialised representation of a vector.
      Specified by:
      isCanonical in class AVector<T extends ACell>
      Returns:
      true if the vector is in canonical format, false otherwise
    • toCanonical

      public ACell toCanonical()
      Description copied from class: ACell
      Converts this Cell to its canonical version. Must return this Cell if already canonical; may be O(n) in size of value otherwise.
      Specified by:
      toCanonical in class ACell
      Returns:
      Canonical version of Cell
    • isCVMValue

      public final boolean isCVMValue()
      Description copied from class: ACell
      Returns true if this Cell represents a first class CVM Value. Sub-structural cells that are not themselves first class values should return false; pretty much everything else should return true. Note: CVM values might not be in a canonical format, e.g. temporary data structures.
      Specified by:
      isCVMValue in class ACell
      Returns:
      true if the object is a CVM Value, false otherwise
    • toVector

      public final <R extends ACell> AVector<R> toVector()
      Description copied from class: ACollection
      Converts this collection to a canonical vector of elements
      Specified by:
      toVector in class ACollection<T extends ACell>
      Returns:
      This collection coerced to a vector
    • getRefCount

      public int getRefCount()
      Description copied from class: ACell
      Gets the number of Refs contained within this Cell. This number is final / immutable for any given instance and is defined by the Cell encoding rules. WARNING: may not be valid if the Cell is not canonical. Contained Refs may be either external or embedded.
      Specified by:
      getRefCount in class ACell
      Returns:
      The number of Refs in this Cell
    • getRef

      public <R extends ACell> Ref<R> getRef(int i)
      Description copied from class: ACell
      Gets a numbered child Ref from within this Cell. WARNING: may be unreliable if the cell is not canonical.
      Overrides:
      getRef in class ACell
      Type Parameters:
      R - Type of referenced Cell
      Parameters:
      i - Index of ref to get
      Returns:
      The Ref at the specified index
    • updateRefs

      public VectorTree<T> updateRefs(IRefFunction func)
      Description copied from class: ACell
      Updates all Refs in this object using the given function. The function *must not* change the hash value of Refs, in order to ensure structural integrity of modified data structures. The implementation *should* re-attach any original encoding in order to prevent re-encoding or surplus hashing. This is a building block for a very sneaky trick that enables us to do a lot of efficient operations on large trees of smart references. Must return the same object if no Refs are altered.
      Specified by:
      updateRefs in class AVector<T extends ACell>
      Parameters:
      func - Ref update function
      Returns:
      Cell with updated Refs
    • commonPrefixLength

      public long commonPrefixLength(AVector<T> b)
      Description copied from class: AVector
      Computes the length of the longest common prefix of this vector and another vector.
      Specified by:
      commonPrefixLength in class AVector<T extends ACell>
      Parameters:
      b - Any vector
      Returns:
      Length of the longest common prefix
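      The quantity being computed can be illustrated with a straightforward element-by-element comparison. `PrefixSketch` is hypothetical and works on ordinary Java lists; it says nothing about how the tree-based implementation arrives at the same answer.

```java
// Hypothetical reference computation of a common prefix length.
final class PrefixSketch {
    static long commonPrefixLength(java.util.List<?> a, java.util.List<?> b) {
        int n = Math.min(a.size(), b.size());
        for (int i = 0; i < n; i++) {
            // Stop at the first position where the elements differ.
            if (!java.util.Objects.equals(a.get(i), b.get(i))) return i;
        }
        return n; // one list is a prefix of the other
    }
}
```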
    • getChunk

      public VectorLeaf<T> getChunk(long offset)
      Description copied from class: AVector
      Gets the VectorLeaf chunk at a given offset
      Specified by:
      getChunk in class AVector<T extends ACell>
      Parameters:
      offset - Offset into this vector. Must be a valid chunk start position
      Returns:
      The chunk referenced
    • subVector

      public AVector<T> subVector(long start, long length)
      Description copied from class: ASequence
      Gets a vector containing the specified subset of this sequence.
      Specified by:
      subVector in class ASequence<T extends ACell>
      Parameters:
      start - Start index of sub vector
      length - Length of sub vector to produce
      Returns:
      Sub-vector of this sequence
    • next

      public AVector<T> next()
      Description copied from class: ASequence
      Gets the sequence of all elements after the first, or null if no elements remain
      Specified by:
      next in class AVector<T extends ACell>
      Returns:
      Sequence following the first element
    • validate

      public void validate() throws InvalidDataException
      Description copied from interface: IValidated
      Validates the complete structure of this object. It is necessary to ensure all child Refs are validated, so the general contract for validate is:
      1. Call super.validate() - which will indirectly call validateCell()
      2. Call validate() on any contained cells in this class
      Specified by:
      validate in interface IValidated
      Overrides:
      validate in class ACell
      Throws:
      InvalidDataException - If the data Value is invalid in any way
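      The two-step contract above can be sketched with a small hypothetical class hierarchy. `ValidateSketch`, `Node`, `Leaf`, `Tree` and `InvalidData` are illustrative stand-ins for the library's types, and the specific checks (leaf size 16, between 2 and 16 children) are assumptions made for the example.

```java
// Hypothetical illustration of the validate() / validateCell() contract.
final class ValidateSketch {
    static class InvalidData extends Exception {
        InvalidData(String msg) { super(msg); }
    }

    static abstract class Node {
        // Base implementation: checks only this node's local invariants.
        void validate() throws InvalidData {
            validateCell();
        }
        abstract void validateCell() throws InvalidData;
    }

    static class Leaf extends Node {
        final int size;
        Leaf(int size) { this.size = size; }
        @Override void validateCell() throws InvalidData {
            if (size != 16) throw new InvalidData("leaf must hold 16 elements");
        }
    }

    static class Tree extends Node {
        final Node[] children;
        Tree(Node... children) { this.children = children; }
        @Override void validate() throws InvalidData {
            super.validate();                       // step 1: local checks via validateCell()
            for (Node c : children) c.validate();   // step 2: validate contained cells
        }
        @Override void validateCell() throws InvalidData {
            if (children.length < 2 || children.length > 16) {
                throw new InvalidData("bad child count");
            }
        }
    }

    // Convenience wrapper that turns the exception into a boolean.
    static boolean isValid(Node n) {
        try {
            n.validate();
            return true;
        } catch (InvalidData e) {
            return false;
        }
    }
}
```

      Keeping the local checks in validateCell() means they never need to follow child references, while validate() is free to recurse.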
    • validateCell

      public void validateCell() throws InvalidDataException
      Description copied from class: ACell
      Validates the local structure and invariants of this cell. Called by validate() super implementation. Should validate directly contained data, but should not validate all other structure of this cell. In particular, should not traverse potentially missing child Refs.
      Specified by:
      validateCell in class ACell
      Throws:
      InvalidDataException - If the Cell is invalid