org.bdgenomics.adam.rdd

LeftOuterShuffleRegionJoin

case class LeftOuterShuffleRegionJoin[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$7: ClassTag[T], evidence$8: ClassTag[U]) extends ShuffleRegionJoin[T, U, T, Option[U]] with VictimlessSortedIntervalPartitionJoin[T, U, T, Option[U]] with Product with Serializable

Linear Supertypes
Product, Equals, VictimlessSortedIntervalPartitionJoin[T, U, T, Option[U]], ShuffleRegionJoin[T, U, T, Option[U]], RegionJoin[T, U, T, Option[U]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new LeftOuterShuffleRegionJoin(leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit arg0: ClassTag[T], arg1: ClassTag[U])

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def advanceCache(cache: SetTheoryCache[U, T, Option[U]], right: BufferedIterator[(ReferenceRegion, U)], until: ReferenceRegion): Unit

    Adds elements from right to cache based on the next region encountered.

    cache

    The cache for this partition.

    right

    The right iterator.

    until

    The next region to join with.

    Attributes
    protected
    Definition Classes
    VictimlessSortedIntervalPartitionJoin → ShuffleRegionJoin
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. def compute(): RDD[(T, Option[U])]

    Performs a region join between two RDDs (shuffle join). All data should be pre-shuffled and copartitioned.

    returns

    An RDD of joins (x, y), where x is from leftRdd, y is from rightRdd, and the region corresponding to x overlaps the region corresponding to y.

    Definition Classes
    ShuffleRegionJoin
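
The result shape `(T, Option[U])` can be illustrated without Spark. The following is a minimal sketch of left outer overlap-join semantics on in-memory sequences, not ADAM's shuffle implementation. `Region` is a hypothetical stand-in for `ReferenceRegion` and is assumed to be a half-open interval; `LeftOuterJoinModel` and `leftOuterJoin` are illustrative names only.

```scala
// Hypothetical stand-in for ReferenceRegion: half-open interval [start, end) on a named contig.
final case class Region(contig: String, start: Long, end: Long) {
  def overlaps(other: Region): Boolean =
    contig == other.contig && start < other.end && other.start < end
}

object LeftOuterJoinModel {
  // Naive O(n * m) left outer overlap join: every left value appears at least once.
  def leftOuterJoin[T, U](left: Seq[(Region, T)], right: Seq[(Region, U)]): Seq[(T, Option[U])] =
    left.flatMap { case (lRegion, l) =>
      // Collect every right value whose region overlaps the current left region.
      val hits = right.collect { case (rRegion, r) if lRegion.overlaps(rRegion) => r }
      if (hits.isEmpty) Seq((l, Option.empty[U]))      // unmatched left value is kept with None
      else hits.map(r => (l, Some(r): Option[U]))      // one output pair per overlapping right value
    }
}
```

A left value with no overlapping right region is paired with `None` rather than dropped, which is what distinguishes this result type from an inner join's `(T, U)`.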
  10. def emptyFn(left: Iterator[(ReferenceRegion, T)], right: Iterator[(ReferenceRegion, U)]): Iterator[(T, Option[U])]

    Handles the case where the left or the right iterator is empty.

    left

    The left iterator.

    right

    The right iterator.

    returns

    The iterator containing properly formatted tuples.

    Attributes
    protected
    Definition Classes
    LeftOuterShuffleRegionJoin → ShuffleRegionJoin
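
For a left outer join, one plausible reading of "properly formatted tuples" is that each remaining left value is emitted with `None`. The sketch below assumes that behaviour; it is not taken from the ADAM source. `EmptyFnSketch`, `emptyFnSketch`, and the nested `Region` type are hypothetical names.

```scala
object EmptyFnSketch {
  // Hypothetical stand-in for ReferenceRegion.
  final case class Region(contig: String, start: Long, end: Long)

  // Assumed behaviour: with nothing to match against, keep every left value and pair it with None.
  // The right iterator is accepted only to mirror the documented signature; it is not consumed.
  def emptyFnSketch[T, U](left: Iterator[(Region, T)],
                          right: Iterator[(Region, U)]): Iterator[(T, Option[U])] =
    left.map { case (_, value) => (value, Option.empty[U]) }
}
```

Under this assumption an empty left iterator yields an empty result, and an empty right iterator yields one `(value, None)` pair per left element.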
  11. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  12. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  13. def finalizeHits(cache: SetTheoryCache[U, T, Option[U]], right: BufferedIterator[(ReferenceRegion, U)]): Iterable[(T, Option[U])]

    Computes all victims for the partition. Note: these joins are victimless, so there are no victims to compute.

    cache

    The cache for this partition.

    right

    The right iterator.

    returns

    An empty iterator.

    Attributes
    protected
    Definition Classes
    VictimlessSortedIntervalPartitionJoin → ShuffleRegionJoin
  14. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  15. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  16. val leftRdd: RDD[(ReferenceRegion, T)]

  17. def makeIterator(leftIter: Iterator[(ReferenceRegion, T)], rightIter: Iterator[(ReferenceRegion, U)]): Iterator[(T, Option[U])]

    Attributes
    protected
    Definition Classes
    ShuffleRegionJoin
  18. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  19. final def notify(): Unit

    Definition Classes
    AnyRef
  20. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  21. def partitionAndJoin(left: RDD[(ReferenceRegion, T)], right: RDD[(ReferenceRegion, U)]): RDD[(T, Option[U])]

    Performs a region join between two RDDs.

    returns

    An RDD of pairs (x, y), where x is from left, y is from right, and the region corresponding to x overlaps the region corresponding to y.

    Definition Classes
    ShuffleRegionJoin → RegionJoin
  22. def postProcessHits(iter: Iterable[U], currentLeft: T): Iterable[(T, Option[U])]

    Computes post processing required to complete the join and properly format hits.

    iter

    The iterator of hits.

    currentLeft

    The current left value.

    returns

    The post-processed iterator.

    Attributes
    protected
    Definition Classes
    LeftOuterShuffleRegionJoin → ShuffleRegionJoin
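
Given the signature above and the `(T, Option[U])` result type, the formatting step for a left outer join could look like the following. This is a sketch under the assumption that hits are wrapped in `Some` and that a left value without hits is kept with `None`; `PostProcessSketch` and `postProcessHitsSketch` are hypothetical names.

```scala
object PostProcessSketch {
  // Assumed left outer formatting: wrap each hit in Some, or emit a single None when there are no hits.
  def postProcessHitsSketch[T, U](hits: Iterable[U], currentLeft: T): Iterable[(T, Option[U])] =
    if (hits.nonEmpty) hits.map(hit => (currentLeft, Some(hit): Option[U]))
    else Iterable((currentLeft, Option.empty[U]))
}
```

The `else` branch is what makes the join "outer": without it, a left value with no overlapping right values would produce no output at all.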
  23. def processHits(cache: SetTheoryCache[U, T, Option[U]], currentLeft: T, currentLeftRegion: ReferenceRegion): Iterable[(T, Option[U])]

    Processes hits for a given object in left.

    cache

    The cache containing potential hits.

    currentLeft

    The current object from the left.

    currentLeftRegion

    The ReferenceRegion of currentLeft.

    returns

    An iterator containing all hits, formatted by postProcessHits.

    Attributes
    protected
    Definition Classes
    ShuffleRegionJoin
  24. def pruneCache(cache: SetTheoryCache[U, T, Option[U]], to: ReferenceRegion): Unit

    Removes elements from cache in place that do not meet the condition for the next region.

    cache

    The cache for this partition.

    to

    The next region in the left iterator.

    Attributes
    protected
    Definition Classes
    VictimlessSortedIntervalPartitionJoin → ShuffleRegionJoin
    Note

    At one point these were all variables and we built new collections and reassigned the pointers every time. We fixed this by using trimStart() and ++=() to improve performance. Overall, we see roughly 25% improvement in runtime by doing things this way.
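
The interplay of pruning, advancing, and hit processing with a single buffer mutated in place can be sketched as a sweep over two inputs sorted by start position. This is a simplified model, not the ADAM implementation: it assumes a single contig and half-open intervals, and `Span`, `SweepJoinSketch`, and `leftOuterSweep` are hypothetical names. `remove(0, n)` is used for the in-place prefix drop, which has the same effect as `trimStart(n)`.

```scala
import scala.collection.mutable.ArrayBuffer

object SweepJoinSketch {
  // Hypothetical single-contig interval, half-open [start, end).
  final case class Span(start: Long, end: Long) {
    def overlaps(o: Span): Boolean = start < o.end && o.start < end
  }

  // Both inputs are assumed to be sorted by start position.
  def leftOuterSweep[T, U](left: Seq[(Span, T)], right: Seq[(Span, U)]): Seq[(T, Option[U])] = {
    val cache = ArrayBuffer.empty[(Span, U)]   // reused across all left elements
    val rightIter = right.iterator.buffered
    val out = ArrayBuffer.empty[(T, Option[U])]

    left.foreach { case (lSpan, l) =>
      // Prune: drop the leading run of cached entries that end at or before this left start.
      // Later left elements start no earlier, so those entries can never match again.
      val firstLive = cache.indexWhere { case (s, _) => s.end > lSpan.start }
      val dropCount = if (firstLive < 0) cache.length else firstLive
      if (dropCount > 0) cache.remove(0, dropCount)

      // Advance: pull right elements that start before this left element ends.
      while (rightIter.hasNext && rightIter.head._1.start < lSpan.end) cache += rightIter.next()

      // Process hits: the cache may still hold non-overlapping entries, so filter explicitly.
      val hits = cache.collect { case (s, r) if s.overlaps(lSpan) => r }
      if (hits.isEmpty) out += ((l, Option.empty[U]))
      else hits.foreach(r => out += ((l, Some(r): Option[U])))
    }
    out.toList
  }
}
```

Because only a prefix is dropped, a long cached interval can shield shorter, already-expired ones behind it; the explicit overlap filter keeps the result correct in that case at the cost of a few extra comparisons.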

  25. val rightRdd: RDD[(ReferenceRegion, U)]

  26. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  27. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  28. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  29. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
