
org.bdgenomics.adam.rdd

ADAMContext

class ADAMContext extends Serializable with Logging

Linear Supertypes
Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new ADAMContext(sc: SparkContext)

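As a minimal sketch, an ADAMContext wraps an existing SparkContext; the application name and master URL below are placeholders:

```scala
import org.apache.spark.{ SparkConf, SparkContext }
import org.bdgenomics.adam.rdd.ADAMContext

// Build a SparkContext as usual, then wrap it in an ADAMContext
// to get access to the load* methods documented below.
val conf = new SparkConf().setAppName("adam-example").setMaster("local[*]")
val sc = new SparkContext(conf)
val ac = new ADAMContext(sc)
```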

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  7. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  8. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  9. def findFiles(path: Path, regex: String): Seq[Path]

    Searches a path recursively, returning the names of all directories in the tree whose name matches the given regex.

    path: The path to begin the search at.
    regex: A regular expression.
    returns: A sequence of Path objects corresponding to the identified directories.
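For illustration, assuming `ac` is an ADAMContext, findFiles might be used to locate ADAM Parquet directories under a base path (the layout shown is hypothetical):

```scala
import org.apache.hadoop.fs.Path

// Recursively collect all directories under /data whose names
// end in ".adam"; the regex is matched against directory names.
val adamDirs: Seq[Path] = ac.findFiles(new Path("/data"), ".*\\.adam$")
```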

  10. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  11. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  12. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  13. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  14. def loadAlignments(filePath: String, projection: Option[Schema] = None, filePath2Opt: Option[String] = None, recordGroupOpt: Option[String] = None, stringency: ValidationStringency = ValidationStringency.STRICT): AlignmentRecordRDD

    Loads alignments from a given path, and infers the input type.

    This method can load:

    * AlignmentRecords via Parquet (default)
    * SAM/BAM (.sam, .bam)
    * FASTQ (interleaved, single end, paired end) (.ifq, .fq/.fastq)
    * FASTA (.fa, .fasta)
    * NucleotideContigFragments via Parquet (.contig.adam)

    As hinted above, the input type is inferred from the file path extension.

    filePath: Path to load data from.
    projection: The fields to project; ignored if not Parquet.
    filePath2Opt: The path to load a second end of FASTQ data from. Ignored if not FASTQ.
    recordGroupOpt: Optional record group name to set if loading FASTQ.
    stringency: Validation stringency used on FASTQ import/merging.
    returns: An AlignmentRecordRDD wrapping the RDD of reads, the sequence dictionary representing the contigs these reads are aligned to (if the reads are aligned), and the record group dictionary for the reads (if one is available).

    See also: loadFasta, loadFastq, loadInterleavedFastq, loadParquetAlignments, loadBam
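A sketch of the extension-driven dispatch, assuming `ac` is an ADAMContext and the file paths are hypothetical:

```scala
// The ".bam" extension routes this call through loadBam.
val bamReads = ac.loadAlignments("sample.bam")

// Paired-end FASTQ: the second end is passed via filePath2Opt,
// and the reads are tagged with the given record group name.
val fqReads = ac.loadAlignments(
  "sample_1.fq",
  filePath2Opt = Some("sample_2.fq"),
  recordGroupOpt = Some("sample"))
```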

  15. def loadAlignmentsFromPaths(paths: Seq[Path]): AlignmentRecordRDD

    Takes a sequence of Path objects and loads alignments from those paths.

    This infers the type of each path, and thus can be used to load a mixture of different files from disk. I.e., if you want to load 2 BAM files and 3 Parquet files, this is the method you are looking for!

    The RDDs obtained from loading each file are simply unioned together, while the record group dictionaries are naively merged. The sequence dictionaries are merged in a way that dedupes the sequence records in each dictionary.

    paths: The locations of the files to load.
    returns: An AlignmentRecordRDD wrapping the RDD of reads, the sequence dictionary representing the contigs these reads are aligned to (if the reads are aligned), and the record group dictionary for the reads (if one is available).

    See also: loadAlignments
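For example (hypothetical paths, assuming `ac` is an ADAMContext), mixing BAM and Parquet inputs in one call:

```scala
import org.apache.hadoop.fs.Path

// Each path's format is inferred separately; the per-file RDDs are
// unioned and the dictionaries merged as described above.
val merged = ac.loadAlignmentsFromPaths(Seq(
  new Path("runA.bam"),
  new Path("runB.bam"),
  new Path("runC.alignments.adam")))
```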

  16. def loadBam(filePath: String, validationStringency: ValidationStringency = ValidationStringency.STRICT): AlignmentRecordRDD

    Loads a SAM/BAM file.

    This reads the sequence and record group dictionaries from the SAM/BAM file header. SAMRecords are read from the file and converted to the AlignmentRecord schema.

    filePath: Path to the file on disk.
    returns: An AlignmentRecordRDD wrapping the RDD of reads, the sequence dictionary representing the contigs these reads are aligned to (if the reads are aligned), and the record group dictionary for the reads (if one is available).

    See also: loadAlignments
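A brief sketch (hypothetical paths, `ac` an ADAMContext):

```scala
import htsjdk.samtools.ValidationStringency

// Strict validation (the default).
val reads = ac.loadBam("sample.bam")

// Relax validation for files with minor header or record issues.
val lenient = ac.loadBam("legacy.sam", ValidationStringency.LENIENT)
```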

  17. def loadBed(filePath: String, minPartitions: Option[Int] = None): FeatureRDD

  18. def loadCoverage(filePath: String): CoverageRDD

    Loads a Parquet file of Features into a CoverageRDD. Coverage is stored in the score attribute of Feature.

    filePath: File path to load coverage from.
    returns: A CoverageRDD containing an RDD of Coverage.
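For example (hypothetical path, `ac` an ADAMContext):

```scala
// Features are loaded from Parquet and projected to Coverage;
// per-position depth comes from each Feature's score attribute.
val coverage = ac.loadCoverage("sample.coverage.adam")
```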

  19. def loadDictionary[T](filePath: String)(implicit ev1: (T) ⇒ SpecificRecord, ev2: Manifest[T]): SequenceDictionary

    Creates a new SequenceDictionary from any Parquet file whose records have the requisite reference{Name,Id,Length,Url} fields.

    (If the path is a BAM or SAM file, and the implicit type is a Read, then it just defaults to reading the SequenceDictionary out of the BAM header in the normal way.)

    T: The type of records to return.
    filePath: The path to the input data.
    returns: A SequenceDictionary containing the names and indices of all the sequences to which the records in the corresponding file are aligned.
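A sketch, assuming `ac` is an ADAMContext and the path is hypothetical:

```scala
import org.bdgenomics.formats.avro.AlignmentRecord

// Build a SequenceDictionary from the reference fields of the
// AlignmentRecords stored in a Parquet directory.
val dict = ac.loadDictionary[AlignmentRecord]("reads.adam")
```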

  20. def loadFasta(filePath: String, fragmentLength: Long): NucleotideContigFragmentRDD

  21. def loadFastq(filePath1: String, filePath2Opt: Option[String], recordGroupOpt: Option[String] = None, stringency: ValidationStringency = ValidationStringency.STRICT): AlignmentRecordRDD

  22. def loadFeatures(filePath: String, projection: Option[Schema] = None, minPartitions: Option[Int] = None): FeatureRDD

  23. def loadFeatures(filePath: String, projection: Option[Schema], minPartitions: Int): FeatureRDD

  24. def loadFragments(filePath: String): FragmentRDD

  25. def loadGenes(filePath: String, projection: Option[Schema] = None): GeneRDD

  26. def loadGenotypes(filePath: String, projection: Option[Schema] = None): GenotypeRDD

  27. def loadGff3(filePath: String, minPartitions: Option[Int] = None): FeatureRDD

  28. def loadGtf(filePath: String, minPartitions: Option[Int] = None): FeatureRDD

  29. def loadIndexedBam(filePath: String, viewRegions: Iterable[ReferenceRegion])(implicit s: DummyImplicit): AlignmentRecordRDD

    Functions like loadBam, but uses BAM index files to look at fewer blocks, and only returns records within the specified ReferenceRegions. A BAM index file is required.

    filePath: The path to the input data. Currently this path must correspond to a single BAM file. The associated BAM index file must have the same name.
    viewRegions: Iterable of ReferenceRegions to filter on.

  30. def loadIndexedBam(filePath: String, viewRegion: ReferenceRegion): AlignmentRecordRDD

    Functions like loadBam, but uses BAM index files to look at fewer blocks, and only returns records within a specified ReferenceRegion. A BAM index file is required.

    filePath: The path to the input data. Currently this path must correspond to a single BAM file. The associated BAM index file must have the same name.
    viewRegion: The ReferenceRegion to filter on.
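Both overloads can be sketched as follows (hypothetical paths, `ac` an ADAMContext); sample.bam.bai must sit alongside sample.bam:

```scala
import org.bdgenomics.adam.models.ReferenceRegion

// Single region: only BAM blocks overlapping chr1:100000-200000 are read.
val slice = ac.loadIndexedBam(
  "sample.bam",
  ReferenceRegion("chr1", 100000L, 200000L))

// Multiple regions via the Iterable overload.
val slices = ac.loadIndexedBam(
  "sample.bam",
  Iterable(
    ReferenceRegion("chr1", 0L, 1000000L),
    ReferenceRegion("chr2", 0L, 1000000L)))
```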

  31. def loadIndexedVcf(filePath: String, viewRegions: Iterable[ReferenceRegion])(implicit s: DummyImplicit): VariantContextRDD

    Loads a VCF file indexed by a tabix (tbi) file into an RDD.

    filePath: The file to load.
    viewRegions: Iterable of ReferenceRegions to filter on.
    returns: A VariantContextRDD.

  32. def loadIndexedVcf(filePath: String, viewRegion: ReferenceRegion): VariantContextRDD

    Loads a VCF file indexed by a tabix (tbi) file into an RDD.

    filePath: The file to load.
    viewRegion: The ReferenceRegion to filter on.
    returns: A VariantContextRDD.
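For example (hypothetical paths, `ac` an ADAMContext); the tabix index sample.vcf.gz.tbi must sit alongside the VCF:

```scala
import org.bdgenomics.adam.models.ReferenceRegion

// Only variant contexts overlapping chr20:1000000-2000000 are returned.
val vcs = ac.loadIndexedVcf(
  "sample.vcf.gz",
  ReferenceRegion("chr20", 1000000L, 2000000L))
```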

  33. def loadInterleavedFastq(filePath: String): AlignmentRecordRDD

  34. def loadInterleavedFastqAsFragments(filePath: String): FragmentRDD

  35. def loadIntervalList(filePath: String, minPartitions: Option[Int] = None): FeatureRDD

  36. def loadNarrowPeak(filePath: String, minPartitions: Option[Int] = None): FeatureRDD

  37. def loadPairedFastq(filePath1: String, filePath2: String, recordGroupOpt: Option[String], stringency: ValidationStringency): AlignmentRecordRDD

  38. def loadParquet[T](filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None)(implicit ev1: (T) ⇒ SpecificRecord, ev2: Manifest[T]): RDD[T]

    Loads records of type T from a Parquet file into a new RDD.

    T: The type of records to return.
    filePath: The path to the input data.
    predicate: An optional pushdown predicate to use when reading the data.
    projection: An optional projection schema to use when reading the data.
    returns: An RDD with records of the specified type.
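A sketch of a projected load, assuming `ac` is an ADAMContext; the Projection helper and field names are taken from org.bdgenomics.adam.projections and may differ across versions:

```scala
import org.bdgenomics.adam.projections.{ AlignmentRecordField, Projection }
import org.bdgenomics.formats.avro.AlignmentRecord

// Project only the fields we need; unprojected fields are not
// deserialized, which cuts I/O on wide Parquet schemas.
val proj = Projection(
  AlignmentRecordField.contigName,
  AlignmentRecordField.start,
  AlignmentRecordField.sequence)
val rdd = ac.loadParquet[AlignmentRecord]("reads.adam", projection = Some(proj))
```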

  39. def loadParquetAlignments(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): AlignmentRecordRDD

    Loads alignment data from a Parquet file.

    filePath: The path of the file to load.
    predicate: An optional predicate to push down into the file.
    projection: An optional schema designating the fields to project.
    returns: An AlignmentRecordRDD wrapping the RDD of reads, the sequence dictionary representing the contigs these reads are aligned to (if the reads are aligned), and the record group dictionary for the reads (if one is available).

    Note: The sequence dictionary is read from an Avro file stored at filePath/_seqdict.avro, and the record group dictionary is read from an Avro file stored at filePath/_rgdict.avro. These files are plain Avro, not Parquet.

    See also: loadAlignments
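For example (hypothetical path, `ac` an ADAMContext):

```scala
// Loads the Parquet reads along with the sidecar Avro dictionaries
// (_seqdict.avro and _rgdict.avro) stored under the same path.
val reads = ac.loadParquetAlignments("reads.adam")
```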

  40. def loadParquetContigFragments(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): NucleotideContigFragmentRDD

  41. def loadParquetFeatures(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): FeatureRDD

  42. def loadParquetFragments(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): FragmentRDD

  43. def loadParquetGenotypes(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): GenotypeRDD

  44. def loadParquetVariantAnnotations(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): DatabaseVariantAnnotationRDD

  45. def loadParquetVariants(filePath: String, predicate: Option[FilterPredicate] = None, projection: Option[Schema] = None): VariantRDD

  46. def loadReferenceFile(filePath: String, fragmentLength: Long): ReferenceFile

  47. def loadSequences(filePath: String, projection: Option[Schema] = None, fragmentLength: Long = 10000): NucleotideContigFragmentRDD

  48. def loadUnpairedFastq(filePath: String, recordGroupOpt: Option[String] = None, setFirstOfPair: Boolean = false, setSecondOfPair: Boolean = false, stringency: ValidationStringency = ValidationStringency.STRICT): AlignmentRecordRDD

  49. def loadVariantAnnotations(filePath: String, projection: Option[Schema] = None): DatabaseVariantAnnotationRDD

  50. def loadVariants(filePath: String, projection: Option[Schema] = None): VariantRDD

  51. def loadVcf(filePath: String): VariantContextRDD

    Loads a VCF file into an RDD.

    filePath: The file to load.
    returns: A VariantContextRDD.
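For example (hypothetical path, `ac` an ADAMContext):

```scala
// Parses the VCF and wraps the result in a VariantContextRDD, from
// which genotype- or variant-level views can be derived.
val vcs = ac.loadVcf("sample.vcf")
```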

  52. def loadVcfAnnotations(filePath: String): DatabaseVariantAnnotationRDD

  53. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  54. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  55. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  56. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  57. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  58. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  59. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  60. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  61. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  62. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  63. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  64. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  65. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  66. final def notify(): Unit

    Definition Classes
    AnyRef
  67. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  68. val sc: SparkContext

  69. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  70. def toString(): String

    Definition Classes
    AnyRef → Any
  71. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  72. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  73. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
