Package

org.apache.spark.util

collection

package collection

Type Members

  1. class AppendOnlyMap[K, V] extends Iterable[(K, V)] with Serializable

    :: DeveloperApi :: A simple open hash table optimized for the append-only use case, where keys are never removed, but the value for each key may be changed.

    This implementation uses quadratic probing with a power-of-2 hash table size, which guarantees that the probe sequence for any key visits every slot in the table (see http://en.wikipedia.org/wiki/Quadratic_probing).
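
    The coverage guarantee can be checked with a small sketch. It assumes the probe step grows by one on each attempt (offsets 1, 3, 6, 10, ... from the starting slot), which is one common form of quadratic probing; ProbingSketch and probeSequence are hypothetical names, not part of this package.

```scala
object ProbingSketch {
  /** Slots visited, in order, when probing a table of `capacity` slots (assumed form of quadratic probing). */
  def probeSequence(startHash: Int, capacity: Int): Seq[Int] = {
    require(capacity > 0 && (capacity & (capacity - 1)) == 0, "capacity must be a power of 2")
    val mask = capacity - 1
    val visited = Seq.newBuilder[Int]
    var pos = startHash & mask
    var delta = 1
    var i = 0
    while (i < capacity) {
      visited += pos
      pos = (pos + delta) & mask // the step grows by one on every probe
      delta += 1
      i += 1
    }
    visited.result()
  }
}
```

    For a table of 8 slots starting at slot 0, the sequence is 0, 1, 3, 6, 2, 7, 5, 4: every slot appears exactly once, so an insert finds a free slot whenever one exists.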

    The map can support up to 375809638 (0.7 * 2 ^ 29) elements.
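
    The limit follows from a maximum table size of 2 ^ 29 slots and a load factor of 0.7. The arithmetic can be reproduced as below; the constant names are illustrative assumptions.

```scala
object CapacitySketch {
  // Assumed names for the two quantities quoted in the text above.
  val MaximumCapacity: Int = 1 << 29 // 536870912 slots
  val LoadFactor: Double = 0.7

  // Largest number of elements before the table would need to grow past its maximum size.
  val maxElements: Int = (LoadFactor * MaximumCapacity).toInt
}
```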

    TODO: Cache the hash values of each key? java.util.HashMap does that.

    Annotations
    @DeveloperApi()
  2. class BitSet extends Serializable

    A simple, fixed-size bit set implementation. This implementation is fast because it avoids safety/bound checking.
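
    A minimal sketch of the idea, not the class's actual source: bits are packed into an Array[Long] and the index is never validated against the declared size. SimpleBitSet and its members are hypothetical names.

```scala
final class SimpleBitSet(numBits: Int) {
  // One Long holds 64 bits; round up to cover numBits.
  private val words = new Array[Long]((numBits + 63) >> 6)

  // No explicit range check: an index >= numBits that still falls
  // inside the last word is accepted silently.
  def set(index: Int): Unit = words(index >> 6) |= (1L << (index & 63))

  def get(index: Int): Boolean = (words(index >> 6) & (1L << (index & 63))) != 0

  /** Number of bits currently set. */
  def cardinality: Int = words.map(w => java.lang.Long.bitCount(w)).sum
}
```

    Skipping the range check saves a comparison per call, at the cost of leaving index validation to the caller.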

  3. class ExternalAppendOnlyMap[K, V, C] extends Spillable[SizeTracker] with Serializable with Logging with Iterable[(K, C)]

    :: DeveloperApi :: An append-only map that spills sorted content to disk when there is insufficient space for it to grow.

    This map takes two passes over the data:

    (1) Values are merged into combiners, which are sorted and spilled to disk as necessary.
    (2) Combiners are read from disk and merged together.

    Setting the spill threshold involves a trade-off. If the threshold is too high, the in-memory map may occupy more memory than is available, resulting in an out-of-memory (OOM) error. If the threshold is too low, the map spills frequently and incurs unnecessary disk writes, which may make it slower than the non-spilling AppendOnlyMap.
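
    The two passes can be illustrated with an in-memory simulation. This is a sketch under simplifying assumptions: "spills" are sorted sequences kept in memory rather than files, the threshold counts entries rather than bytes, and the final merge uses a sorted map instead of a streaming merge. TwoPassCombineSketch and spillThreshold are hypothetical names; the three function parameters are named after the usual combiner roles.

```scala
import scala.collection.mutable

object TwoPassCombineSketch {
  def combineByKey[K: Ordering, V, C](
      records: Iterator[(K, V)],
      spillThreshold: Int)(
      createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C): Seq[(K, C)] = {
    val spills = mutable.ArrayBuffer.empty[Seq[(K, C)]] // stands in for files on disk
    var current = mutable.HashMap.empty[K, C]

    def spill(): Unit = {
      spills += current.toSeq.sortBy(_._1) // each spill is written in sorted order
      current = mutable.HashMap.empty[K, C]
    }

    // Pass 1: merge values into combiners, spilling when the map gets too large.
    for ((k, v) <- records) {
      current(k) = current.get(k) match {
        case Some(c) => mergeValue(c, v)
        case None => createCombiner(v)
      }
      if (current.size >= spillThreshold) spill()
    }
    if (current.nonEmpty) spill()

    // Pass 2: read the spilled combiners back and merge those that share a key.
    val merged = mutable.TreeMap.empty[K, C]
    for (run <- spills; (k, c) <- run) {
      merged(k) = merged.get(k).map(mergeCombiners(_, c)).getOrElse(c)
    }
    merged.toSeq
  }
}
```

    With a threshold of 2, the input ("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5) produces three spills, and summing combiners yields ("a", 4), ("b", 7), ("c", 4). Lowering the threshold increases the number of spills without changing the result, which is the disk-write cost described above.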

    Annotations
    @DeveloperApi()
  4. class OpenHashSet[T] extends Serializable

    A simple, fast hash set optimized for non-null insertion-only use case, where keys are never removed.

    The underlying implementation uses the Scala compiler's specialization to generate optimized storage for two primitive types (Long and Int). It is much faster than Java's standard HashSet while incurring much less memory overhead.

    This OpenHashSet is designed to serve as a building block for higher-level data structures such as an optimized hash map. Unlike standard hash set implementations, this class exposes callback interfaces (e.g. allocateFunc, moveFunc) and methods to retrieve the position of a key in the underlying array.

    It uses quadratic probing with a power-of-2 hash table size, which guarantees that the probe sequence for any key visits every slot in the table (see http://en.wikipedia.org/wiki/Quadratic_probing).
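
    To show how position lookups and growth callbacks let a map be layered on top of such a set, here is a simplified sketch. TinyOpenIntSet and TinyIntLongMap are hypothetical classes written for illustration; the callback parameters mirror the allocateFunc and moveFunc mentioned above, but their signatures here are assumptions, as are the 0.7 load factor and the probing step.

```scala
final class TinyOpenIntSet(initialCapacity: Int = 8) {
  require(initialCapacity > 0 && (initialCapacity & (initialCapacity - 1)) == 0)
  private var keys = new Array[Int](initialCapacity)
  private var used = new Array[Boolean](initialCapacity)
  private var count = 0

  def capacity: Int = keys.length
  def size: Int = count

  /** Slot holding `k`, or -1 if `k` is absent. */
  def getPos(k: Int): Int = {
    val mask = capacity - 1
    var pos = k & mask
    var delta = 1
    while (used(pos)) {
      if (keys(pos) == k) return pos
      pos = (pos + delta) & mask
      delta += 1
    }
    -1
  }

  /** Inserts `k` if absent and returns its slot; the callbacks fire only when the table grows. */
  def add(k: Int, allocate: Int => Unit, move: (Int, Int) => Unit): Int = {
    if ((count + 1) * 10 > capacity * 7) grow(allocate, move) // keep load below 0.7
    insert(k)
  }

  private def insert(k: Int): Int = {
    val mask = capacity - 1
    var pos = k & mask
    var delta = 1
    while (used(pos) && keys(pos) != k) {
      pos = (pos + delta) & mask
      delta += 1
    }
    if (!used(pos)) { used(pos) = true; keys(pos) = k; count += 1 }
    pos
  }

  private def grow(allocate: Int => Unit, move: (Int, Int) => Unit): Unit = {
    val oldKeys = keys
    val oldUsed = used
    keys = new Array[Int](oldKeys.length * 2)
    used = new Array[Boolean](oldKeys.length * 2)
    count = 0
    allocate(keys.length) // let the client allocate matching storage
    var i = 0
    while (i < oldKeys.length) {
      if (oldUsed(i)) move(i, insert(oldKeys(i))) // report old slot and new slot
      i += 1
    }
  }
}

final class TinyIntLongMap {
  private val keys = new TinyOpenIntSet()
  private var values = new Array[Long](keys.capacity)
  private var oldValues: Array[Long] = null

  private val allocate: Int => Unit = n => { oldValues = values; values = new Array[Long](n) }
  private val move: (Int, Int) => Unit = (from, to) => { values(to) = oldValues(from) }

  def size: Int = keys.size

  def update(k: Int, v: Long): Unit = {
    val pos = keys.add(k, allocate, move)
    values(pos) = v // values live at the same index as their key
  }

  def get(k: Int): Option[Long] = {
    val pos = keys.getPos(k)
    if (pos < 0) None else Some(values(pos))
  }
}
```

    Because the set reports each key's slot, the map stores values in a parallel array and needs no per-entry objects. The callbacks keep that array aligned with the keys when the set rehashes.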

    Annotations
    @Private()

Value Members

  1. package unsafe
