Creates a consensus read from the given records. If no consensus read was created, None is returned.
Calculates the length of the consensus read that should be produced. The length is calculated as the maximum length at which minReads reads still have bases.
the set of reads being fed into the consensus
the minimum number of reads required
the length of consensus read that should be created
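The length rule above can be illustrated with a short Python sketch. This is not the library's implementation; the function name `consensus_read_length` and the choice to return 0 when fewer than `min_reads` reads are supplied are assumptions made for illustration.

```python
def consensus_read_length(read_lengths, min_reads):
    """Return the longest length L such that at least `min_reads`
    reads have a length of L or more.

    Assumption: when fewer than `min_reads` reads are given, no position
    is covered by enough reads, so 0 is returned.
    """
    if min_reads < 1:
        raise ValueError("min_reads must be at least 1")
    if len(read_lengths) < min_reads:
        return 0
    # Sorted longest-first, the min_reads-th entry is the greatest length
    # that is still reached by min_reads reads.
    return sorted(read_lengths, reverse=True)[min_reads - 1]
```

For reads of length 10, 8, 8 and 5 with `min_reads` set to 2, the second-longest read has 8 bases, so the consensus would be 8 bases long.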
Returns the number of consensus reads constructed by this caller.
Takes in all the reads for a source molecule and, if possible, generates one or more output consensus reads as SAM records.
the full set of source SamRecords for a source molecule
a seq of consensus SAM records, may be empty
Takes in all the SamRecords for a single source molecule and produces consensus records.
the full set of source SamRecords for a source molecule
a seq of consensus SAM records, may be empty
Creates a SamRecord from the called consensus bases and qualities.
Takes in a non-empty seq of SamRecords and filters them such that the returned seq only contains those reads that share the most common alignment of the read sequence to the reference. If two or more different alignments share equal numbers of reads, the 'most common' will be an arbitrary pick amongst those alignments, and the group of reads with that alignment will be returned.
For the purposes of this method all that is implied by "same alignment" is that any insertions or deletions are at the same position and of the same length. This is done to allow for differential read length (either due to sequencing or untracked hard-clipping of adapters) and for differential soft-clipping at the starts and ends of reads.
NOTE: filtered-out reads are sent to the rejectRecords method and need no further handling.
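As a rough illustration of the "same alignment" idea, the Python sketch below groups reads by the reference position and length of their insertions and deletions and keeps the largest group. It is a simplification, not the library's algorithm. The record layout (a dict with `start` and `cigar` keys) and both function names are hypothetical. A read that ends before an indel is grouped separately from reads that span it, which the real method may well treat as compatible.

```python
from collections import defaultdict

def indel_signature(start, cigar):
    """Return (reference position, operator, length) for each indel.

    `start` is the reference position of the first aligned base and
    `cigar` is a list of (operator, length) pairs, for example
    [("S", 3), ("M", 10), ("I", 2), ("M", 5)]. Soft and hard clips do not
    consume reference bases, so they never change the signature.
    """
    ref_pos = start
    indels = []
    for op, length in cigar:
        if op in ("I", "D"):
            indels.append((ref_pos, op, length))
        if op in ("M", "D", "N", "=", "X"):  # operators that consume reference
            ref_pos += length
    return tuple(indels)

def filter_to_most_common_alignment(reads):
    """Split reads into (kept, rejected) by the most common indel signature.

    Ties are broken by whichever signature was seen first, which is one
    way of making the arbitrary pick described above.
    """
    groups = defaultdict(list)
    for read in reads:
        groups[indel_signature(read["start"], read["cigar"])].append(read)
    best = max(groups.values(), key=len)
    rejected = [r for g in groups.values() if g is not best for r in g]
    return best, rejected
```

In this sketch the rejected reads are simply returned to the caller; in the documented method they are passed to rejectRecords instead.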
Logs statistics about how many reads were seen, and how many were filtered/discarded due to various filters.
An RG.ID to apply to all generated reads.
A prefix to use on all read names. If None then a suitable prefix will be synthesized.
Returns the number of raw reads filtered out due to there being insufficient reads present to build the necessary set of consensus reads.
Returns the number of raw reads filtered out because their alignment disagreed with the majority alignment of all raw reads for the same source molecule.
If a reject writer was provided, emit the reads to that writer.
Returns the value of the SAM tag directly.
a SamRecord
an identifier for the source molecule
Split records into those that should make a single-end consensus read, first of pair consensus read, and second of pair consensus read, respectively. The default method is to use the SAM flag to find unpaired reads, first of pair reads, and second of pair reads.
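The default flag-based split can be sketched in Python as follows. The bit values come from the SAM specification (0x1 paired, 0x40 first of pair, 0x80 second of pair). The record layout (a dict with a `flag` key) and the function name `split_by_flag` are assumptions for illustration, as is raising an error for a paired read with neither pair bit set.

```python
# Flag bits as defined by the SAM specification.
PAIRED = 0x1
FIRST_OF_PAIR = 0x40
SECOND_OF_PAIR = 0x80

def split_by_flag(records):
    """Return (fragments, first_of_pair, second_of_pair) lists."""
    fragments, first_of_pair, second_of_pair = [], [], []
    for rec in records:
        flag = rec["flag"]
        if not flag & PAIRED:
            fragments.append(rec)          # unpaired (single-end) read
        elif flag & FIRST_OF_PAIR:
            first_of_pair.append(rec)
        elif flag & SECOND_OF_PAIR:
            second_of_pair.append(rec)
        else:
            # Assumption: a paired read with neither bit set is an input error.
            raise ValueError(f"paired read with no pair bit set: flag={flag}")
    return fragments, first_of_pair, second_of_pair
```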
Sums a short array into an Int to avoid overflow.
Converts from a SamRecord into a SourceRead. During conversion the record is end-trimmed to remove Ns and bases below the minBaseQuality. Remaining bases that are below minBaseQuality are then masked to Ns.
Some(SourceRead) if there are any called bases with quality > minBaseQuality, else None
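The trim-then-mask conversion can be sketched in Python as below. This is an illustration under stated assumptions, not the library's code: the function name `to_source_read` is hypothetical, a read is represented as a base string plus a list of integer qualities, "end-trimmed" is taken to mean trimming only the trailing end of the read, and a base counts as low quality when its quality is strictly below `min_base_quality`.

```python
def to_source_read(bases, quals, min_base_quality):
    """Trim trailing Ns and low-quality bases, then mask the rest.

    Returns (masked_bases, quals) or None if nothing usable remains.
    """
    if len(bases) != len(quals):
        raise ValueError("bases and quals must have the same length")

    def is_low(i):
        return bases[i] in "Nn" or quals[i] < min_base_quality

    # Step 1: trim from the end while the last base is an N or low quality.
    end = len(bases)
    while end > 0 and is_low(end - 1):
        end -= 1
    if end == 0:
        return None  # no called base of sufficient quality is left

    # Step 2: mask remaining low-quality bases to N; qualities are kept as-is.
    masked = "".join(
        "N" if quals[i] < min_base_quality else bases[i] for i in range(end)
    )
    return masked, list(quals[:end])
```

Because trimming stops at the first acceptable base from the end, any read that survives it has at least one called base.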
Returns the total number of reads filtered for any reason.
Returns the total number of input reads examined by the consensus caller so far.
Calls consensus reads by grouping consecutive reads with the same SAM tag.
Consecutive reads with the same SAM tag value are partitioned into fragment, first of pair, and second of pair reads, and a consensus read is created for each partition. A consensus read may not be returned for a given partition if any of the required conditions is not met (e.g. minimum number of reads, minimum mean consensus base quality, ...).
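Grouping consecutive reads by tag value can be sketched in Python with `itertools.groupby`. The record layout (a dict with a `tags` mapping), the function name and the tag name passed by the caller are hypothetical. Only adjacent records are grouped, so this kind of grouping relies on the input already being ordered with each source molecule's reads together.

```python
from itertools import groupby

def group_consecutive_by_tag(records, tag):
    """Yield (tag_value, records) for each run of adjacent records
    that share the same value for `tag`."""
    for value, group in groupby(records, key=lambda rec: rec["tags"].get(tag)):
        yield value, list(group)
```

Each yielded group would then be split into fragment, first of pair and second of pair reads before consensus calling.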