Package

org.bdgenomics.adam

rdd

package rdd

Type Members

  1. class ADAMContext extends Serializable with Logging

    The ADAMContext provides functions on top of a SparkContext for loading genomic data.

  2. trait ADAMSaveAnyArgs extends SaveArgs

    Argument configuration for saving any output format.

  3. abstract class AvroGenomicRDD[T, U <: Product, V <: AvroGenomicRDD[T, U, V]] extends ADAMRDDFunctions[T] with GenomicDataset[T, U, V]

    An abstract class that extends GenomicRDD and where the underlying data are Avro IndexedRecords. This abstract class provides methods for saving to Parquet, and provides hooks for writing the metadata.

  4. abstract class AvroRecordGroupGenomicRDD[T, U <: Product, V <: AvroRecordGroupGenomicRDD[T, U, V]] extends AvroGenomicRDD[T, U, V] with GenomicRDDWithLineage[T, V]

    An abstract class describing a GenomicRDD where:

    * The data are Avro IndexedRecords.
    * The data are associated with record groups (i.e., they are reads or fragments).

  5. case class FullOuterShuffleRegionJoin[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$11: ClassTag[T], evidence$12: ClassTag[U]) extends ShuffleRegionJoin[T, U, Option[T], Option[U]] with SortedIntervalPartitionJoinWithVictims[T, U, Option[T], Option[U]] with Product with Serializable

  6. case class GenericGenomicRDD[T](rdd: RDD[T], sequences: SequenceDictionary, regionFn: (T) ⇒ Seq[ReferenceRegion], optPartitionMap: Option[Array[Option[(ReferenceRegion, ReferenceRegion)]]] = None)(implicit tTag: ClassTag[T]) extends GenomicRDD[T, GenericGenomicRDD[T]] with Product with Serializable

  7. case class GenomeBins(binSize: Long, seqLengths: Map[String, Long]) extends Serializable with Product

    Partition a genome into a set of bins.

    This class does not tolerate invalid input, so filter the input before using it.

    binSize

    The size of each bin in nucleotides

    seqLengths

    A map containing the length of each contig
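
The binning described above can be sketched in plain Scala. This is an illustration under stated assumptions, not ADAM's GenomeBins implementation: the object name, the helper names, the name-sorted contig order, and the choice to return None for invalid input are all hypothetical.

```scala
// Illustrative sketch only; not ADAM's GenomeBins implementation.
object GenomeBinsSketch {
  // Number of bins needed to cover one contig (ceiling division).
  def binsOnContig(contigLength: Long, binSize: Long): Long =
    (contigLength + binSize - 1) / binSize

  // Index of the first bin of each contig, with contigs ordered by name
  // (the ordering is an assumption made for this sketch).
  def binOffsets(binSize: Long, seqLengths: Map[String, Long]): Map[String, Long] = {
    val names = seqLengths.keys.toSeq.sorted
    val counts = names.map(n => binsOnContig(seqLengths(n), binSize))
    names.zip(counts.scanLeft(0L)(_ + _)).toMap
  }

  // Global bin index for a 0-based position. Unknown contigs and
  // out-of-range positions yield None instead of failing.
  def binOf(binSize: Long, seqLengths: Map[String, Long],
            contig: String, pos: Long): Option[Long] =
    seqLengths.get(contig)
      .filter(len => pos >= 0 && pos < len)
      .map(_ => binOffsets(binSize, seqLengths)(contig) + pos / binSize)
}
```

With a bin size of 100 and contigs of length 250 and 100, the first contig occupies three bins and the second contig starts at bin 3.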

  8. trait GenomicDataset[T, U <: Product, V <: GenomicDataset[T, U, V]] extends GenomicRDD[T, V]

    A trait describing a GenomicRDD that also supports the Spark SQL APIs.

  9. trait GenomicDatasetConversion[T <: Product, U <: GenomicDataset[_, T, U], X <: Product, Y <: GenomicDataset[_, X, Y]] extends Function2[U, Dataset[X], Y]

  10. case class GenomicPositionPartitioner(numParts: Int, seqLengths: Map[String, Long]) extends Partitioner with Logging with Product with Serializable

    GenomicPositionPartitioner partitions ReferencePosition objects into separate, spatially-coherent regions of the genome.

    This can be used to organize genomic data for computation that is spatially distributed (e.g. GATK and Queue's "scatter-and-gather" for locus-parallelizable walkers).

    numParts

    The number of equally-sized regions into which the total genomic space is partitioned; the total number of partitions is numParts + 1, with the "+1" resulting from one extra partition that is used to capture null or UNMAPPED values of the ReferencePosition type.

    seqLengths

    A map from sequence name to length that lists every extant sequence in the genome.
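
The "numParts + 1" layout can be illustrated with a small standalone sketch. It assumes a genome linearised by concatenating contigs in name order and models a position as an Option of (contig, offset); both choices, and all names below, are hypothetical and do not reflect how ADAM implements the partitioner.

```scala
// Illustrative sketch only; not ADAM's GenomicPositionPartitioner.
object PositionPartitionerSketch {
  // Start offset of each contig in a linearised genome
  // (contigs ordered by name; this ordering is an assumption).
  def cumulativeOffsets(seqLengths: Map[String, Long]): Map[String, Long] = {
    val names = seqLengths.keys.toSeq.sorted
    names.zip(names.map(seqLengths).scanLeft(0L)(_ + _)).toMap
  }

  // position is Some((contig, pos)) for mapped data and None for unmapped data.
  // Mapped positions land in partitions 0 until numParts; everything else
  // (unmapped, unknown contig, out-of-range offset) lands in partition numParts.
  def partitionOf(numParts: Int, seqLengths: Map[String, Long],
                  position: Option[(String, Long)]): Int = {
    val total = seqLengths.values.sum
    position match {
      case Some((contig, pos))
          if seqLengths.get(contig).exists(len => pos >= 0 && pos < len) =>
        val global = cumulativeOffsets(seqLengths)(contig) + pos
        ((global * numParts) / total).toInt
      case _ => numParts // the one extra partition
    }
  }
}
```

Because the linearised coordinate is always strictly less than the total length, mapped positions can never reach index numParts, which keeps the extra partition reserved.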

  11. trait GenomicRDD[T, U <: GenomicRDD[T, U]] extends Logging

    A trait that wraps an RDD of genomic data with helpful metadata.

    T

    The type of the data in the wrapped RDD.

    U

    The type of this GenomicRDD.

  12. trait GenomicRDDWithLineage[T, U <: GenomicRDDWithLineage[T, U]] extends GenomicRDD[T, U]

  13. case class GenomicRegionPartitioner(partitionSize: Long, seqLengths: Map[String, Long], start: Boolean = true) extends Partitioner with Logging with Product with Serializable

    A partitioner for ReferenceRegion-keyed data.

    partitionSize

    The number of bases per partition.

    seqLengths

    A map between contig names and contig lengths.

    start

    If true, use the start position (instead of the end position) to decide which partition a key belongs to.
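
The effect of the start flag can be shown with a rough sketch. The Region case class, the half-open coordinate convention, the name-sorted contig order, and the use of the last covered base as the "end position" are assumptions made for this example and are not taken from ADAM.

```scala
// Illustrative sketch only; not ADAM's GenomicRegionPartitioner.
object RegionPartitionerSketch {
  // Hypothetical stand-in for a reference region, half-open: [start, end).
  final case class Region(contig: String, start: Long, end: Long)

  // Returns the partition index, or None when the contig is unknown.
  def partitionOf(partitionSize: Long, seqLengths: Map[String, Long],
                  region: Region, start: Boolean = true): Option[Int] = {
    val names = seqLengths.keys.toSeq.sorted
    // Partitions needed per contig (ceiling division).
    val perContig = names.map(n => (seqLengths(n) + partitionSize - 1) / partitionSize)
    val offsets = names.zip(perContig.scanLeft(0L)(_ + _)).toMap
    // Key on the first base, or on the last base the region covers.
    val key = if (start) region.start else region.end - 1
    offsets.get(region.contig).map(off => (off + key / partitionSize).toInt)
  }
}
```

A region that straddles a partition boundary, such as [95, 105) with 100 bases per partition, is assigned to partition 0 when keyed by start and to partition 1 when keyed by end.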

  14. trait InFormatter[T, U <: GenomicRDD[T, U], V <: InFormatter[T, U, V]] extends Serializable

    Formats data going into a pipe to an invoked process.

    T

    The type of records being formatted.

  15. trait InFormatterCompanion[T, U <: GenomicRDD[T, U], V <: InFormatter[T, U, V]] extends AnyRef

    A trait for singleton objects that build an InFormatter from a GenomicRDD.

    Often, when creating an output stream, we need to add metadata to the output that is not attached to individual records. An example is writing a header with contig, read group, and format information, as is done for SAM, BAM, and VCF.

    T

    The type of the records this InFormatter writes out.

    U

    The type of the GenomicRDD this companion object understands.

    V

    The type of InFormatter this companion object creates.
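
The split between a companion that gathers dataset-level metadata and a formatter that writes records can be sketched without Spark or ADAM. Everything below is hypothetical: LineFormatter, its write method, and the "@SQ" header line only mimic the shape of the idea and are not ADAM's InFormatter API.

```scala
import java.io.{OutputStream, PrintWriter}

// Illustrative sketch only; not ADAM's InFormatter or InFormatterCompanion.
object HeaderedFormatterSketch {
  // Hypothetical formatter: holds the header built from dataset-level
  // metadata and writes it before any per-record output.
  final class LineFormatter(header: Seq[String]) {
    def write(os: OutputStream, records: Iterator[String]): Unit = {
      val out = new PrintWriter(os)
      header.foreach(h => out.print(h + "\n"))
      records.foreach(r => out.print(r + "\n"))
      out.flush()
    }
  }

  // Hypothetical companion: builds a formatter from metadata that is not
  // attached to individual records (here, just contig names).
  object LineFormatter {
    def apply(contigNames: Seq[String]): LineFormatter =
      new LineFormatter(contigNames.map(n => s"@SQ\tSN:$n"))
  }
}
```

Keeping header construction in the companion means the per-record write path never needs access to the whole dataset.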

  16. case class InnerShuffleRegionJoin[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$3: ClassTag[T], evidence$4: ClassTag[U]) extends ShuffleRegionJoin[T, U, T, U] with VictimlessSortedIntervalPartitionJoin[T, U, T, U] with Product with Serializable

  17. case class InnerShuffleRegionJoinAndGroupByLeft[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$5: ClassTag[T], evidence$6: ClassTag[U]) extends ShuffleRegionJoin[T, U, T, Iterable[U]] with VictimlessSortedIntervalPartitionJoin[T, U, T, Iterable[U]] with Product with Serializable

  18. case class InnerTreeRegionJoin[T, U]()(implicit evidence$1: ClassTag[T], evidence$2: ClassTag[U]) extends RegionJoin[T, U, T, U] with TreeRegionJoin[T, U, T, U] with Product with Serializable

    Implements an inner region join where the left side of the join is broadcast.

  19. case class InnerTreeRegionJoinAndGroupByRight[T, U]()(implicit evidence$5: ClassTag[T], evidence$6: ClassTag[U]) extends RegionJoin[T, U, Iterable[T], U] with TreeRegionJoin[T, U, Iterable[T], U] with Product with Serializable

    Performs an inner region join, followed logically by grouping by the right value. This is implemented without any shuffling; the join naturally returns values on the left grouped by the right value.

  20. case class LeftOuterShuffleRegionJoin[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$7: ClassTag[T], evidence$8: ClassTag[U]) extends ShuffleRegionJoin[T, U, T, Option[U]] with VictimlessSortedIntervalPartitionJoin[T, U, T, Option[U]] with Product with Serializable

  21. case class LeftOuterShuffleRegionJoinAndGroupByLeft[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$9: ClassTag[T], evidence$10: ClassTag[U]) extends ShuffleRegionJoin[T, U, T, Iterable[U]] with VictimlessSortedIntervalPartitionJoin[T, U, T, Iterable[U]] with Product with Serializable

  22. abstract class MultisampleAvroGenomicRDD[T, U <: Product, V <: MultisampleAvroGenomicRDD[T, U, V]] extends AvroGenomicRDD[T, U, V] with MultisampleGenomicRDD[T, V]

    An abstract class that extends the MultisampleGenomicRDD trait, where the data are Avro IndexedRecords.

  23. trait MultisampleGenomicRDD[T, U <: MultisampleGenomicRDD[T, U]] extends GenomicRDD[T, U]

    A trait describing a GenomicRDD with data from multiple samples.

  24. trait OutFormatter[T] extends Serializable

    Deserializes data coming out of a pipe from an invoked process.

    T

    The type of records being formatted.

  25. case class ReferencePartitioner(sd: SequenceDictionary) extends Partitioner with Product with Serializable

    Repartitions objects that are keyed by a ReferencePosition or ReferenceRegion into a single partition per contig.
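
A one-partition-per-contig scheme amounts to a lookup from contig name to index. The sketch below assumes the sequence dictionary can be reduced to an ordered list of contig names; ContigPartitioner and its members are hypothetical and do not extend Spark's Partitioner.

```scala
// Illustrative sketch only; not ADAM's ReferencePartitioner.
object ContigPartitionerSketch {
  final class ContigPartitioner(contigNames: Seq[String]) {
    // Each contig gets the index of its position in the dictionary order.
    private val index: Map[String, Int] = contigNames.zipWithIndex.toMap

    val numPartitions: Int = contigNames.size

    // Keys on an unknown contig are rejected rather than silently misplaced.
    def getPartition(contig: String): Int =
      index.getOrElse(contig,
        throw new IllegalArgumentException(s"Unknown contig: $contig"))
  }
}
```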

  26. abstract class RegionJoin[T, U, RT, RU] extends Serializable

    An abstract class describing a join in the genomic coordinate space between two RDDs where the values are keyed by a ReferenceRegion.

    T

    The type of the left RDD.

    U

    The type of the right RDD.

    RT

    The type of data yielded by the left RDD at the output of the join. This may not match T if the join is an outer join, etc.

    RU

    The type of data yielded by the right RDD at the output of the join.
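
How RT and RU vary with the join type can be seen in a small in-memory sketch. It uses plain Seq values in place of RDDs, a hypothetical Region case class with half-open coordinates, and a quadratic nested loop; none of this is ADAM's implementation.

```scala
// Illustrative sketch only; not ADAM's RegionJoin.
object RegionJoinSketch {
  // Hypothetical stand-in for a reference region, half-open: [start, end).
  final case class Region(contig: String, start: Long, end: Long) {
    def overlaps(o: Region): Boolean =
      contig == o.contig && start < o.end && o.start < end
  }

  // Inner join: RT = T and RU = U.
  def inner[T, U](left: Seq[(Region, T)], right: Seq[(Region, U)]): Seq[(T, U)] =
    for {
      (lr, l) <- left
      (rr, r) <- right
      if lr.overlaps(rr)
    } yield (l, r)

  // Left outer join: RT = T and RU = Option[U], because a left record
  // with no overlapping right record is still emitted.
  def leftOuter[T, U](left: Seq[(Region, T)],
                      right: Seq[(Region, U)]): Seq[(T, Option[U])] =
    left.flatMap { case (lr, l) =>
      val hits: Seq[Option[U]] =
        right.collect { case (rr, r) if lr.overlaps(rr) => Some(r) }
      (if (hits.isEmpty) Seq(Option.empty[U]) else hits).map(h => (l, h))
    }
}
```

The same pattern explains the other variants listed on this page: a right outer join wraps the left side in Option, and the group-by variants replace one side with an Iterable.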

  27. case class RightOuterShuffleRegionJoinAndGroupByLeft[T, U](leftRdd: RDD[(ReferenceRegion, T)], rightRdd: RDD[(ReferenceRegion, U)])(implicit evidence$13: ClassTag[T], evidence$14: ClassTag[U]) extends ShuffleRegionJoin[T, U, Option[T], Iterable[U]] with SortedIntervalPartitionJoinWithVictims[T, U, Option[T], Iterable[U]] with Product with Serializable

  28. case class RightOuterTreeRegionJoin[T, U]()(implicit evidence$3: ClassTag[T], evidence$4: ClassTag[U]) extends RegionJoin[T, U, Option[T], U] with TreeRegionJoin[T, U, Option[T], U] with Product with Serializable

    Implements a right outer region join where the left side of the join is broadcast.

  29. case class RightOuterTreeRegionJoinAndGroupByRight[T, U]()(implicit evidence$7: ClassTag[T], evidence$8: ClassTag[U]) extends RegionJoin[T, U, Iterable[T], U] with TreeRegionJoin[T, U, Iterable[T], U] with Product with Serializable

    Performs a right outer region join, followed logically by grouping by the right value. This is implemented without any shuffling; the join naturally returns values on the left grouped by the right value. In this implementation, empty collections on the left side of the join are kept.

  30. sealed abstract class ShuffleRegionJoin[T, U, RT, RU] extends RegionJoin[T, U, RT, RU]

    An abstract class describing join implementations that are based on a sort-merge join.

    T

    The type of the left RDD.

    U

    The type of the right RDD.

    RT

    The type of data yielded by the left RDD at the output of the join. This may not match T if the join is an outer join, etc.

    RU

    The type of data yielded by the right RDD at the output of the join.

  31. sealed trait SortedIntervalPartitionJoinWithVictims[T, U, RT, RU] extends ShuffleRegionJoin[T, U, RT, RU]

  32. trait TreeRegionJoin[T, U, RT, RU] extends RegionJoin[T, U, RT, RU]

    Implements a shuffle free (broadcast) region join.

    The broadcast values are stored in a sorted array. The original plan was to use an ensemble of interval trees, but that approach did not work out.
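
A join against a sorted, broadcast-style lookup structure can be sketched as follows. This is a guess at the general technique, not ADAM's code: the Index class, the per-contig Vector sorted by start (standing in for the sorted array), and the binary search followed by a linear filter are all assumptions made for illustration.

```scala
// Illustrative sketch only; not ADAM's TreeRegionJoin.
object SortedArrayJoinSketch {
  // Hypothetical stand-in for a reference region, half-open: [start, end).
  final case class Region(contig: String, start: Long, end: Long)

  // The "broadcast" side: one start-sorted Vector per contig.
  final class Index[T](entries: Seq[(Region, T)]) {
    private val byContig: Map[String, Vector[(Region, T)]] =
      entries.groupBy(_._1.contig).map { case (c, es) =>
        c -> es.sortBy(_._1.start).toVector
      }

    // Binary search for the first entry whose start is >= bound.
    private def lowerBound(sorted: Vector[(Region, T)], bound: Long): Int = {
      var lo = 0
      var hi = sorted.length
      while (lo < hi) {
        val mid = (lo + hi) >>> 1
        if (sorted(mid)._1.start < bound) lo = mid + 1 else hi = mid
      }
      lo
    }

    // Entries starting at or after query.end cannot overlap, so only the
    // prefix before that point is scanned and filtered by end coordinate.
    def overlapping(query: Region): Seq[T] = {
      val sorted = byContig.getOrElse(query.contig, Vector.empty[(Region, T)])
      sorted.take(lowerBound(sorted, query.end))
        .collect { case (r, v) if r.end > query.start => v }
    }
  }

  // Inner join grouped by the right value: each right record is paired
  // with every overlapping left value, and no shuffle-like step is needed.
  def innerGroupByRight[T, U](left: Seq[(Region, T)],
                              right: Seq[(Region, U)]): Seq[(Iterable[T], U)] = {
    val index = new Index(left)
    right.map { case (r, u) => (index.overlapping(r), u) }.filter(_._1.nonEmpty)
  }
}
```

Dropping the final filter would keep right records whose left group is empty, which matches the behaviour described for the right outer group-by variant.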

  33. sealed trait VictimlessSortedIntervalPartitionJoin[T, U, RT, RU] extends ShuffleRegionJoin[T, U, RT, RU]

Value Members

  1. object ADAMContext extends Serializable

    This singleton provides an implicit conversion from a SparkContext to the ADAMContext, as well as implicit functions for the Pipe API.
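
The implicit-conversion pattern mentioned here can be sketched with stand-in types. BaseContext, RichContext, and loadLines are hypothetical placeholders for a SparkContext, the enriched context, and a loading method; they are not Spark or ADAM classes.

```scala
import scala.language.implicitConversions

// Illustrative sketch only; the types below are hypothetical stand-ins.
object ImplicitContextSketch {
  // Plays the role of the underlying context.
  final class BaseContext(val appName: String)

  // Plays the role of the enriched context that adds loading functions.
  final class RichContext(val base: BaseContext) {
    def loadLines(text: String): Seq[String] = text.split("\n").toSeq
  }

  // The singleton holds the implicit conversion; importing its members
  // makes the extra methods available on a plain BaseContext.
  object RichContext {
    implicit def baseToRich(base: BaseContext): RichContext =
      new RichContext(base)
  }
}
```

After `import ImplicitContextSketch.RichContext._`, a call such as `new BaseContext("demo").loadLines("a\nb")` compiles because the compiler inserts the conversion at the call site.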

  2. object GenomicPositionPartitioner extends Serializable

    Helper for creating genomic position partitioners.

  3. object GenomicRegionPartitioner extends Serializable

    Helper object for creating GenomicRegionPartitioners.

  4. package contig

  5. package feature

  6. package fragment

  7. package read

  8. package variant
