Loads and extracts sequences directly from indexed fasta or fa files.
Implements a traversable collection that is backed by a Parquet file.
A broadcastable ReferenceFile backed by a map containing contig name -> Seq[NucleotideContigFragment] pairs.
a map containing a Seq of contig fragments per contig.
A file containing a reference assembly that can be broadcast.
Represents a set of reference sequences backed by a .2bit file.
See http://genome.ucsc.edu/FAQ/FAQformat.html#format7 for the spec.
AttributeUtils is a utility object for parsing optional fields from a BAM file, or the attributes column from an ADAM file.
Utility singleton for flattening down nested Avro records.
When we refer to a schema as flat, we mean that there are no nested records. We do not mean that the schema does not contain maps or arrays.
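The definition of "flat" above can be illustrated with a minimal sketch. It does not use the Avro API: the `FieldType` hierarchy, the `isFlat` and `flatten` helpers, and the `__` name separator are all hypothetical, chosen only to make the idea concrete. The check looks only at fields whose type is directly a record, so maps and arrays are allowed to remain.

```scala
object FlatSchemaSketch {
  // Hypothetical stand-in for a schema model; this is not the Avro Schema API.
  sealed trait FieldType
  case object Primitive extends FieldType
  final case class ArrayOf(element: FieldType) extends FieldType
  final case class MapOf(value: FieldType) extends FieldType
  final case class Record(fields: List[(String, FieldType)]) extends FieldType

  // A schema is flat when none of its fields is itself a record.
  // Map and array fields do not make a schema non-flat.
  def isFlat(schema: Record): Boolean =
    schema.fields.forall { case (_, fieldType) => !fieldType.isInstanceOf[Record] }

  // Lifts the fields of nested records to the top level, prefixing each
  // lifted field with its parent's name and an assumed "__" separator.
  def flatten(schema: Record, prefix: String = ""): Record =
    Record(schema.fields.flatMap {
      case (name, nested: Record) => flatten(nested, prefix + name + "__").fields
      case (name, other)          => List((prefix + name, other))
    })
}
```

In this sketch, a record with a nested `contig` record holding `name` and `length` fields flattens to top-level fields named `contig__name` and `contig__length`, while an array field such as `tags` is carried over unchanged.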
Helper object for setting the logging level for Parquet.
Helper singleton for converting Phred scores to/from probabilities.
As a reminder, given an error probability \epsilon, the Phred score q is:
q = -10 \log_{10} \epsilon
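The formula and its inverse, \epsilon = 10^{-q/10}, can be sketched in a few lines. The object and method names below are hypothetical and are not claimed to match the helper singleton's actual API.

```scala
object PhredSketch {
  // q = -10 * log10(epsilon), defined for error probabilities in (0, 1].
  def errorProbabilityToPhred(epsilon: Double): Double = {
    require(epsilon > 0.0 && epsilon <= 1.0, "error probability must be in (0, 1]")
    -10.0 * math.log10(epsilon)
  }

  // Inverse of the formula above: epsilon = 10^(-q / 10).
  def phredToErrorProbability(q: Double): Double =
    math.pow(10.0, -q / 10.0)

  // Probability that the call is correct, i.e. 1 - epsilon.
  def phredToSuccessProbability(q: Double): Double =
    1.0 - phredToErrorProbability(q)
}
```

For example, an error probability of 0.001 corresponds to a Phred score of 30, and a Phred score of 20 corresponds to an error probability of 0.01.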
Companion object for creating a ReferenceContigMap from an RDD of contig fragments.
Loads and extracts sequences directly from indexed fasta or fa files. The file at filePath requires a fai index in the same directory, following the same naming convention.
path to fasta or fa index
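The index requirement can be checked before loading. The sketch below assumes the usual `samtools faidx` convention, in which the index sits next to the sequence file and is named by appending `.fai` to the full file name; the object and helper names are hypothetical.

```scala
import java.nio.file.{Files, Path, Paths}

object FaiIndexSketch {
  // Assumption: the index for "ref.fa" is "ref.fa.fai" in the same directory.
  def expectedIndexPath(fastaPath: String): Path =
    Paths.get(fastaPath + ".fai")

  // True only if the expected index exists as a regular file.
  def hasIndex(fastaPath: String): Boolean =
    Files.isRegularFile(expectedIndexPath(fastaPath))
}
```

Calling `FaiIndexSketch.hasIndex` on the fasta path before constructing the loader gives a clearer failure message than letting the load fail on a missing index.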