package util
Type Members
- class DnaFoldPrediction extends AnyRef
-
class
DnaFoldPredictor
extends AnyRef
Class that wraps ViennaRNA's RNAFold utility, which can be used to estimate the minimum free energy secondary structure of DNA and RNA molecules. On construction, a background process running RNAFold is started; calls to predict() then pipe input and output through that process.
-
trait
GenomicSpan
extends Locatable
A simple trait that extends htsjdk.samtools.util.Locatable but provides a few easier-to-use methods for getting the location.
-
case class
IndexMetric
(index: String, source: String, min_mismatches: Int, indices_at_min_mismatches: Count, gc: Proportion, longest_homopolymer: Int, worst_structure_seq: Option[String], worst_structure_dbn: Option[String], worst_structure_delta_g: Option[Double]) extends Metric with Product with Serializable
Output produced by PickLongIndices to describe the set of molecular indices picked by the program.
- index
The sequence of the molecular index (i.e. the set of bases).
- source
The source of the sequence: either existing for indices provided to the program or novel for indices generated by the program.
- min_mismatches
The smallest number of mismatches to any other reported molecular index.
- indices_at_min_mismatches
The number of other reported molecular indices with min_mismatches differences to this index.
- gc
The fraction of the sequence composed of G and C bases.
- longest_homopolymer
The length of the longest homopolymer within the index sequence.
- worst_structure_seq
The sequence (including adapter plus index) that generated the worst (lowest energy) structure for this index.
- worst_structure_dbn
The lowest energy structure in dot-bracket notation.
- worst_structure_delta_g
The deltaG of the lowest energy structure.
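As an illustration of how several of these fields could be derived from a set of index sequences, here is a minimal sketch in plain Scala. It is not the PickLongIndices implementation: the object IndexMetricSketch and its methods are hypothetical, and counting mismatches as the Hamming distance between equal-length sequences is an assumption.

```scala
object IndexMetricSketch {
  /** Fraction of bases that are G or C (case-insensitive); 0.0 for an empty sequence. */
  def gcFraction(bases: String): Double =
    if (bases.isEmpty) 0.0
    else bases.count(b => "GCgc".indexOf(b) >= 0).toDouble / bases.length

  /** Length of the longest run of one repeated base. */
  def longestHomopolymer(bases: String): Int = {
    var longest = 0
    var run     = 0
    for (i <- bases.indices) {
      run = if (i > 0 && bases(i) == bases(i - 1)) run + 1 else 1
      longest = math.max(longest, run)
    }
    longest
  }

  /** Number of positions at which two equal-length sequences differ (assumed definition). */
  def mismatches(a: String, b: String): Int = {
    require(a.length == b.length, "sequences must be the same length")
    a.zip(b).count { case (x, y) => x != y }
  }

  /** Returns (min_mismatches, indices_at_min_mismatches) for `index` against `others`. */
  def minMismatches(index: String, others: Seq[String]): (Int, Int) = {
    require(others.nonEmpty, "need at least one other index")
    val distances = others.map(other => mismatches(index, other))
    val min       = distances.min
    (min, distances.count(_ == min))
  }
}
```

For example, `IndexMetricSketch.minMismatches("AAAA", Seq("AAAT", "AATT", "TAAA"))` reports a minimum of one mismatch, shared by two of the other indices.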
-
class
Io
extends IoUtil
Provides common IO utility methods. Can be instantiated to create a custom factory, or the companion object can be used as a singleton version.
-
trait
Metric
extends Product with Iterable[(String, String)]
Base trait for metrics.
All classes extending this trait should be case classes. By convention, all field names should be lower case with words separated by underscores.
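The naming convention can be illustrated with a small sketch. This is not fgbio's Metric trait: ExampleMetric and nameValuePairs are hypothetical, and the sketch assumes Scala 2.13 or later, where `Product.productElementNames` is available.

```scala
object MetricSketch {
  /** Hypothetical metric: a case class whose fields are lower case, separated by underscores. */
  final case class ExampleMetric(sample_name: String, total_reads: Long, fraction_aligned: Double)

  /** Pairs each field name with its value rendered as a string, in declaration order. */
  def nameValuePairs(metric: Product): Seq[(String, String)] =
    metric.productElementNames.zip(metric.productIterator.map(_.toString)).toSeq
}
```

Because the field names double as column headers in such a scheme, keeping them in one consistent style makes the written output predictable.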
-
class
PickIlluminaIndices
extends FgBioTool
Program for picking sets of indices of arbitrary length that meet certain constraints and attempt to maximize the edit distance between all members of the set picked.
- Annotations
- @ClpAnnotation()
-
class
PickLongIndices
extends FgBioTool with LazyLogging
- Annotations
- @ClpAnnotation()
-
case class
ProgressLogger
(logger: Logger, noun: String = "records", verb: String = "processed", unit: Int = 1000 * 1000) extends AbstractProgressLogger with Product with Serializable
A subclass of HTSJDK's progress logger that uses fgbio's logging system.
-
case class
ReadSegment
(offset: Int, length: Option[Int], kind: SegmentType) extends Product with Serializable
Encapsulates all the information about a segment within a read structure. A segment can either have a definite length, in which case length must be Some(Int), or an indefinite length (can be any length, 0 or more) in which case length must be None.
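The distinction between definite and indefinite lengths can be sketched as follows. The Segment class and extract method below are hypothetical stand-ins, not the fgbio ReadSegment API, and the segment kind is omitted for brevity.

```scala
object ReadSegmentSketch {
  /** Hypothetical segment: `length = None` means "any length, 0 or more". */
  final case class Segment(offset: Int, length: Option[Int])

  /** Returns the bases covered by the segment, or None if a fixed-length
    * segment does not fit within the read. Assumes a non-negative offset. */
  def extract(bases: String, segment: Segment): Option[String] = segment.length match {
    case Some(len) =>
      val end = segment.offset + len
      if (end <= bases.length) Some(bases.substring(segment.offset, end)) else None
    case None =>
      // Indefinite length: take whatever remains after the offset, possibly nothing.
      Some(bases.drop(segment.offset))
  }
}
```

Modelling the length as an Option makes the "unknown length" case explicit in the type, rather than relying on a sentinel value such as -1.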
-
class
ReadStructure
extends Seq[ReadSegment]
Describes the structure of a given read. A read contains one or more read segments. A read segment describes a contiguous stretch of bases of the same type (e.g. template bases) of some length and some offset from the start of the read.
-
class
RefFlatSource
extends Iterable[Gene] with Closeable with LazyLogging
Reads gene annotation information from a RefFlat file.
Skips genes on unrecognized chromosomes if a sequence dictionary is provided.
The format is described here: http://genome.ucsc.edu/goldenPath/gbdDescriptionsOld.html#RefFlat
A Picard-style header is also supported (GENE_NAME, TRANSCRIPT_NAME, ...).
-
case class
SampleBarcodeMetric
(barcode_name: String = "", library_name: String = "", barcode: String = "", templates: Count = 0, pf_templates: Count = 0, perfect_matches: Count = 0, pf_perfect_matches: Count = 0, one_mismatch_matches: Count = 0, pf_one_mismatch_matches: Count = 0, fraction_matches: Proportion = 0d, ratio_this_barcode_to_best_barcode: Proportion = 0d, pf_fraction_matches: Proportion = 0d, pf_ratio_this_barcode_to_best_barcode: Proportion = 0d, pf_normalized_matches: Proportion = 0d) extends Metric with Product with Serializable
Metrics for matching templates to sample barcodes primarily used in com.fulcrumgenomics.fastq.DemuxFastqs.
The number of templates will match the number of reads for an Illumina single-end sequencing run, while the number of templates will be half the number of reads for an Illumina paired-end sequencing run (i.e. R1 & R2 observe the same template).
- barcode_name
the name for the sample barcode, typically the sample name from the SampleSheet.
- library_name
the name of the library, typically the library identifier from the SampleSheet.
- barcode
the sample barcode bases. Dual index barcodes will have two sample barcode sequences delimited by a dash.
- templates
the total number of templates matching the given barcode.
- pf_templates
the total number of pass-filter templates matching the given barcode.
- perfect_matches
the number of templates that perfectly match the given barcode.
- pf_perfect_matches
the number of pass-filter templates that perfectly match the given barcode.
- one_mismatch_matches
the number of templates that match the given barcode with exactly one mismatch.
- pf_one_mismatch_matches
the number of pass-filter templates that match the given barcode with exactly one mismatch.
- fraction_matches
the fraction of all templates that match the given barcode.
- ratio_this_barcode_to_best_barcode
the ratio of all templates matching this barcode to all templates matching the most prevalent barcode. For the most prevalent barcode this will be 1; for all others it will be less than 1 (except when there are more unmatched templates than for any other barcode, in which case the value may be arbitrarily large). One over the lowest number in this column gives you the fold-difference in representation between barcodes.
- pf_fraction_matches
the fraction of all pass-filter templates that match the given barcode.
- pf_ratio_this_barcode_to_best_barcode
the ratio of all pass-filter templates matching this barcode to all templates matching the most prevalent barcode. For the most prevalent barcode this will be 1; for all others it will be less than 1 (except when there are more unmatched templates than for any other barcode, in which case the value may be arbitrarily large). One over the lowest number in this column gives you the fold-difference in representation between barcodes.
- pf_normalized_matches
The "normalized" matches to each barcode. This is calculated as the number of pass-filter templates matching this barcode over the mean of all pass-filter templates matching any barcode (excluding unmatched). If all barcodes are represented equally this will be 1.
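The derived fields can be illustrated with a short sketch that computes them from raw counts. This is not the DemuxFastqs implementation: the Counts and Derived classes are hypothetical, and the sketch assumes that "all templates" includes unmatched templates and that the "best" barcode is chosen among matched barcodes only.

```scala
object BarcodeMetricSketch {
  /** Hypothetical per-barcode counts; not the fgbio SampleBarcodeMetric class. */
  final case class Counts(barcode: String, templates: Long, pfTemplates: Long)

  /** Hypothetical holder for three of the derived fields. */
  final case class Derived(fractionMatches: Double, ratioToBest: Double, pfNormalizedMatches: Double)

  /** Derives the fields for every barcode, including the unmatched entry.
    * Zero totals are not guarded against and would yield NaN or Infinity. */
  def derive(matched: Seq[Counts], unmatched: Counts): Map[String, Derived] = {
    require(matched.nonEmpty, "need at least one sample barcode")
    val all    = matched :+ unmatched
    val total  = all.map(_.templates).sum.toDouble                    // assumed to include unmatched
    val best   = matched.map(_.templates).max.toDouble                // most prevalent matched barcode
    val pfMean = matched.map(_.pfTemplates).sum.toDouble / matched.size // mean excludes unmatched
    all.map { c =>
      c.barcode -> Derived(c.templates / total, c.templates / best, c.pfTemplates / pfMean)
    }.toMap
  }
}
```

With two barcodes holding 100 and 50 templates and 50 unmatched templates, the first barcode has a fraction of 0.5 and a ratio of 1, while the second has a fraction of 0.25 and a ratio of 0.5.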
-
sealed abstract
class
SegmentType
extends AnyRef
Sealed class hierarchy for the types of segments that can show up in a read structure.
-
class
Sorter
[A, B <: Ordered[B]] extends Iterable[A] with Writer[A]
An implementation of a disk-backed sorting system. The implementation requires two things:
1. An implementation of Codec that can serialize and deserialize objects.
2. A function that creates an Ordered key object for each object being sorted.
Both must be thread-safe, as they may be invoked across threads without external synchronization.
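The two requirements can be sketched in miniature. The code below is an in-memory stand-in, not the fgbio Sorter: the Codec trait, StringCodec, LengthThenText and sortViaCodec are all hypothetical names, and nothing is written to disk. It only shows how a codec and a key function fit together.

```scala
import java.nio.charset.StandardCharsets

object SorterSketch {
  /** Hypothetical codec: turns an object into bytes and back. */
  trait Codec[A] {
    def encode(a: A): Array[Byte]
    def decode(bs: Array[Byte]): A
  }

  /** Stateless, and therefore safe to call from several threads. */
  object StringCodec extends Codec[String] {
    def encode(a: String): Array[Byte]  = a.getBytes(StandardCharsets.UTF_8)
    def decode(bs: Array[Byte]): String = new String(bs, StandardCharsets.UTF_8)
  }

  /** Hypothetical key that orders itself, satisfying a `B <: Ordered[B]` bound. */
  final case class LengthThenText(text: String) extends Ordered[LengthThenText] {
    def compare(that: LengthThenText): Int = {
      val byLength = Integer.compare(text.length, that.text.length)
      if (byLength != 0) byLength else text.compareTo(that.text)
    }
  }

  /** Encodes each item alongside its key, sorts by key, then decodes. */
  def sortViaCodec[A, B <: Ordered[B]](items: Seq[A], codec: Codec[A])(keyOf: A => B): Seq[A] =
    items
      .map(a => (keyOf(a), codec.encode(a)))
      .sortWith((x, y) => x._1 < y._1)
      .map { case (_, bytes) => codec.decode(bytes) }
}
```

Keeping the key separate from the serialized payload means items only need to be decoded once, after their final order is known.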
Value Members
-
object
GeneAnnotations
Contains classes useful for storing annotation information for genes and their transcripts and exons.
-
object
IlluminaAdapters
An object providing access to various Illumina adapter sequences.
-
object
Io
extends Io
Singleton object that can be used when the default buffer size and compression are desired.
-
object
MathUtil
Some simple utility methods for various mathematical operations that are implemented in efficient, albeit non-idiomatic, Scala.
- object Metric
-
object
NumericTypes
Container object for a set of numeric types for working with common probability scalings.
- object PickLongIndices
- object ReadSegment extends Serializable
-
object
ReadStructure
Companion object for ReadStructure that provides factory methods.
- object RefFlatSource
-
object
Rscript
extends LazyLogging
Object that enables running of R scripts via the Rscript command line tool.
- object SampleBarcodeMetric extends Serializable
- object SegmentType
-
object
Sequences
Utility methods for working with DNA or RNA sequences.
-
object
Sorter
Companion object for Sorter that contains various types used.