Class

org.bdgenomics.adam.rdd

RightOuterShuffleRegionJoinAndGroupByLeft

case class RightOuterShuffleRegionJoinAndGroupByLeft[T, U](sd: SequenceDictionary, partitionSize: Long, sc: SparkContext)(implicit evidence$13: ClassTag[T], evidence$14: ClassTag[U]) extends ShuffleRegionJoin[T, U, Option[T], Iterable[U]] with Product with Serializable

Extends the ShuffleRegionJoin trait to implement a right outer join followed by grouping by all non-null left values.
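
The result shape (Option[T], Iterable[U]) can be illustrated on plain in-memory collections. The following is a minimal sketch, not ADAM's implementation: Region, overlaps and join are hypothetical names, coordinates are assumed to be half-open, and the handling of unmatched records (left values without a match are dropped, right values without a match are emitted one per row under None) is an assumption that the class itself may not share.

```scala
object RightOuterGroupByLeftSketch {
  // Hypothetical stand-in for a genomic interval; not ADAM's ReferenceRegion.
  // Assumes half-open coordinates [start, end).
  final case class Region(contig: String, start: Long, end: Long) {
    def overlaps(other: Region): Boolean =
      contig == other.contig && start < other.end && other.start < end
  }

  def join[T, U](
    left: Seq[(Region, T)],
    right: Seq[(Region, U)]): Seq[(Option[T], Iterable[U])] = {

    // Collect every overlapping right value under its left value.
    // Assumption: left values with no overlapping right value are dropped.
    val grouped: Seq[(Option[T], Iterable[U])] = left
      .map { case (lr, lv) =>
        (Some(lv), right.collect { case (rr, rv) if lr.overlaps(rr) => rv })
      }
      .filter(_._2.nonEmpty)

    // Assumption: right values that overlap no left value are kept,
    // each in its own row with None on the left.
    val unmatched: Seq[(Option[T], Iterable[U])] = right.collect {
      case (rr, rv) if !left.exists { case (lr, _) => lr.overlaps(rr) } =>
        (Option.empty[T], Iterable(rv))
    }

    grouped ++ unmatched
  }
}
```

Calling RightOuterGroupByLeftSketch.join with two small sequences of (Region, value) pairs returns one row per matched left value, followed by one row per unmatched right value.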

Linear Supertypes
Product, Equals, ShuffleRegionJoin[T, U, Option[T], Iterable[U]], RegionJoin[T, U, Option[T], Iterable[U]], Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new RightOuterShuffleRegionJoinAndGroupByLeft(sd: SequenceDictionary, partitionSize: Long, sc: SparkContext)(implicit arg0: ClassTag[T], arg1: ClassTag[U])

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. val bins: Broadcast[GenomeBins]

    Attributes
    protected
    Definition Classes
    ShuffleRegionJoin
  6. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  7. def emptyFn(left: Iterator[((ReferenceRegion, Int), T)], right: Iterator[((ReferenceRegion, Int), U)]): Iterator[(Option[T], Iterable[U])]

    Attributes
    protected
    Definition Classes
    RightOuterShuffleRegionJoinAndGroupByLeft → ShuffleRegionJoin
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  12. def makeIterator(region: ReferenceRegion, left: BufferedIterator[((ReferenceRegion, Int), T)], right: BufferedIterator[((ReferenceRegion, Int), U)]): Iterator[(Option[T], Iterable[U])]

    Attributes
    protected
    Definition Classes
    RightOuterShuffleRegionJoinAndGroupByLeft → ShuffleRegionJoin
  13. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  14. final def notify(): Unit

    Definition Classes
    AnyRef
  15. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  16. def partitionAndJoin(leftRDD: RDD[(ReferenceRegion, T)], rightRDD: RDD[(ReferenceRegion, U)]): RDD[(Option[T], Iterable[U])]

    Performs a region join between two RDDs (shuffle join).

    This implementation is shuffle-based, so it does not require collecting one side into memory as BroadcastRegionJoin does. It performs a global sort of each RDD by genome position and then a sort-merge join, similar to the chromsweep implementation in bedtools. More specifically, it first defines a set of bins across the genome, then assigns each object in the RDDs to every bin it overlaps (replicating the object if necessary), performs the shuffle, and sorts the objects in each bin. Finally, each bin independently performs a chromsweep sort-merge join.

    leftRDD

    The 'left' side of the join

    rightRDD

    The 'right' side of the join

    returns

    An RDD of pairs (x, y), where x is from leftRDD, y is from rightRDD, and the region corresponding to x overlaps the region corresponding to y.

    Definition Classes
    ShuffleRegionJoin → RegionJoin
  17. val partitionSize: Long

  18. val sc: SparkContext

  19. val sd: SequenceDictionary

  20. val seqLengths: Map[String, Long]

    Attributes
    protected
    Definition Classes
    ShuffleRegionJoin
  21. def sweep(leftIter: Iterator[((ReferenceRegion, Int), T)], rightIter: Iterator[((ReferenceRegion, Int), U)]): Iterator[(Option[T], Iterable[U])]

    Definition Classes
    ShuffleRegionJoin
  22. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  23. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  24. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
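
The steps described under partitionAndJoin (define bins, replicate each object into every bin it overlaps, shuffle, sort within each bin, then sweep each bin independently) can be sketched without Spark. This is an illustration under stated assumptions, not ADAM's code: Interval, binsFor, binned, sweep, join and the binSize parameter are hypothetical, coordinates are assumed to be half-open, groupBy stands in for the shuffle, and the join is a plain inner join. The rule that emits a replicated pair only in the bin where the overlap begins is one common way to avoid duplicates and is likewise an assumption.

```scala
object BinnedRegionJoinSketch {
  // Hypothetical stand-in for a genomic interval, half-open [start, end);
  // not ADAM's ReferenceRegion.
  final case class Interval(contig: String, start: Long, end: Long) {
    def overlaps(o: Interval): Boolean =
      contig == o.contig && start < o.end && o.start < end
  }

  type Bin = (String, Long)

  // Every fixed-width bin that the interval touches.
  def binsFor(iv: Interval, binSize: Long): Seq[Bin] =
    (iv.start / binSize to (iv.end - 1) / binSize).map(b => (iv.contig, b))

  // Replicate each record into all of its bins, then sort within each bin
  // by start position. groupBy stands in for the shuffle.
  def binned[V](xs: Seq[(Interval, V)], binSize: Long): Map[Bin, Vector[(Interval, V)]] =
    xs.flatMap { case (iv, v) => binsFor(iv, binSize).map(b => (b, (iv, v))) }
      .groupBy(_._1)
      .map { case (b, recs) => b -> recs.map(_._2).sortBy(_._1.start).toVector }

  // Sort-merge sweep over two start-sorted sequences from the same bin.
  def sweep[T, U](
    ls: Vector[(Interval, T)],
    rs: Vector[(Interval, U)]): Vector[((Interval, T), (Interval, U))] = {
    val out = Vector.newBuilder[((Interval, T), (Interval, U))]
    var active = Vector.empty[(Interval, U)]
    var i = 0
    for (l <- ls) {
      // Pull in right records that start before this left record ends.
      while (i < rs.length && rs(i)._1.start < l._1.end) {
        active = active :+ rs(i)
        i += 1
      }
      // Right records ending at or before this start cannot match later lefts.
      active = active.filter(_._1.end > l._1.start)
      for (r <- active if l._1.overlaps(r._1)) out += ((l, r))
    }
    out.result()
  }

  def join[T, U](
    left: Seq[(Interval, T)],
    right: Seq[(Interval, U)],
    binSize: Long): Seq[(T, U)] = {
    val l = binned(left, binSize)
    val r = binned(right, binSize)
    // Each bin is joined independently of the others.
    l.toSeq.sortBy(_._1).flatMap { case (bin, ls) =>
      sweep(ls, r.getOrElse(bin, Vector.empty[(Interval, U)]))
        // Emit a replicated pair only in the bin where the overlap begins.
        .filter { case ((li, _), (ri, _)) =>
          math.max(li.start, ri.start) / binSize == bin._2
        }
        .map { case ((_, lv), (_, rv)) => (lv, rv) }
    }
  }
}
```

Because both records of an overlapping pair always cover the position where their overlap begins, the pair is seen in that bin and is emitted there exactly once, however many bins the two records share.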
