Package htsjdk.samtools
Interface SamReader
- All Superinterfaces:
AutoCloseable, Closeable, Iterable<SAMRecord>
- All Known Implementing Classes:
SamReader.PrimitiveSamReaderToSamReaderAdapter
Describes functionality for objects that produce SAMRecords and associated information.
Currently, only deprecated readers implement this directly; actual readers implement this via SamReader.ReaderImplementation and SamReader.PrimitiveSamReader, which SamReaderFactory converts into full readers by using SamReader.PrimitiveSamReaderToSamReaderAdapter.
-
Nested Class Summary
Nested Classes
Modifier and Type, Interface, Description:
- static class SamReader.AssertingIterator
- static interface SamReader.Indexing: Facet for index-related operations.
- static interface SamReader.PrimitiveSamReader: The minimal subset of functionality needed for a SAMRecord data source.
- static class SamReader.PrimitiveSamReaderToSamReaderAdapter: Decorator for a SamReader.PrimitiveSamReader that expands its functionality into a SamReader, given the backing SamInputResource.
- static class SamReader.ReaderImplementation: Internal interface for SAM/BAM/CRAM file reader implementations, as distinct from non-file-based readers.
- static class SamReader.Type: Describes a type of SAM file.
-
Method Summary
Modifier and Type, Method, Description:
- SAMFileHeader getFileHeader()
- String getResourceDescription()
- boolean hasIndex()
- SamReader.Indexing indexing(): Exposes the SamReader.Indexing facet of this SamReader.
- default boolean isQueryable()
- SAMRecordIterator iterator(): Iterate through file in order.
- SAMRecordIterator query(QueryInterval[] intervals, boolean contained): Iterate over records that match one of the given intervals.
- SAMRecordIterator query(String sequence, int start, int end, boolean contained): Iterate over records that match the given interval.
- SAMRecordIterator queryAlignmentStart(String sequence, int start): Iterate over records that map to the given sequence and start at the given position.
- SAMRecordIterator queryContained(QueryInterval[] intervals): Iterate over records that are contained in any of the given intervals.
- SAMRecordIterator queryContained(String sequence, int start, int end): Iterate over records that are contained in the given interval.
- SAMRecord queryMate(SAMRecord rec): Fetch the mate for the given read.
- SAMRecordIterator queryOverlapping(QueryInterval[] intervals): Iterate over records that overlap any of the given intervals.
- SAMRecordIterator queryOverlapping(String sequence, int start, int end): Iterate over records that overlap the given interval.
- SAMRecordIterator queryUnmapped()
- SamReader.Type type()
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Method Details
-
getFileHeader
SAMFileHeader getFileHeader() -
type
SamReader.Type type()
- Returns:
- the SamReader.Type of this SamReader
-
getResourceDescription
String getResourceDescription()
- Returns:
- a human-readable description of the resource backing this SAM reader
-
isQueryable
default boolean isQueryable()
- Returns:
- true if this source can be queried by interval, regardless of whether it has an index
-
hasIndex
boolean hasIndex()
- Returns:
- true if this is a BAM file and it has an index
-
indexing
SamReader.Indexing indexing()
Exposes the SamReader.Indexing facet of this SamReader.
- Throws:
UnsupportedOperationException - If hasIndex() returns false.
-
iterator
SAMRecordIterator iterator()
Iterate through file in order. For a SamReader constructed from an InputStream, and for any SAM file, a second iteration starts where the first one left off. For a BAM constructed from a SeekableStream or File, each new iteration starts at the first record.
Only a single open iterator on a SAM or BAM file may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
-
query
SAMRecordIterator query(String sequence, int start, int end, boolean contained)
Iterate over records that match the given interval. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
sequence - Reference sequence of interest.
start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
contained - If true, each SAMRecord returned will have its alignment completely contained in the interval of interest. If false, the alignment of the returned SAMRecords need only overlap the interval of interest.
- Returns:
- Iterator over the SAMRecords matching the interval.
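The difference between the two values of contained, together with the zero conventions for start and end, can be modelled with a small predicate. This is only a sketch of the documented semantics: IntervalMatch and matches are hypothetical names, the coordinates are plain ints rather than SAMRecords, and none of this is htsjdk's own code.

```java
/** Hypothetical helper modelling the documented interval semantics; not part of htsjdk. */
final class IntervalMatch {
    private IntervalMatch() {}

    /**
     * @param alnStart  1-based, inclusive alignment start of a record
     * @param alnEnd    1-based, inclusive alignment end of a record
     * @param start     1-based, inclusive query start; 0 means start of the reference
     * @param end       1-based, inclusive query end; 0 means end of the reference
     * @param contained true to require containment, false to accept any overlap
     */
    static boolean matches(int alnStart, int alnEnd, int start, int end, boolean contained) {
        // Translate the documented zero conventions into open-ended bounds.
        int lo = (start == 0) ? 1 : start;
        int hi = (end == 0) ? Integer.MAX_VALUE : end;
        if (contained) {
            // The whole alignment must lie inside [lo, hi].
            return alnStart >= lo && alnEnd <= hi;
        }
        // Overlap: the alignment and [lo, hi] share at least one base.
        return alnStart <= hi && alnEnd >= lo;
    }

    public static void main(String[] args) {
        System.out.println(matches(100, 150, 90, 200, true));  // prints true
        System.out.println(matches(80, 120, 90, 200, true));   // prints false
        System.out.println(matches(80, 120, 90, 200, false));  // prints true
        System.out.println(matches(10, 50, 90, 0, false));     // prints false
    }
}
```

A record spanning 80 to 120 is rejected by a contained query on 90 to 200 but accepted by an overlapping one, which is the distinction the contained parameter controls.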
-
queryOverlapping
SAMRecordIterator queryOverlapping(String sequence, int start, int end)
Iterate over records that overlap the given interval. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
sequence - Reference sequence of interest.
start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
- Returns:
- Iterator over the SAMRecords overlapping the interval.
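The single-open-iterator rule is easiest to honour by scoping each iterator with try-with-resources. The sketch below models the rule with a hypothetical SingleIteratorSource over strings. It is not htsjdk code, and the assumption that a second open is rejected with an IllegalStateException belongs to this model only, not to any documented reader behaviour.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical record source that permits only one open iterator at a time; not htsjdk code. */
final class SingleIteratorSource {
    private final List<String> records;
    private boolean iteratorOpen = false;

    SingleIteratorSource(List<String> records) {
        this.records = records;
    }

    /** Iterator that must be closed before the source hands out another one. */
    final class ClosingIterator implements Iterator<String>, AutoCloseable {
        private final Iterator<String> delegate = records.iterator();

        @Override public boolean hasNext() { return delegate.hasNext(); }
        @Override public String next() { return delegate.next(); }
        @Override public void close() { iteratorOpen = false; }
    }

    /** Opens a new iterator, or fails if the previous one has not been closed. */
    ClosingIterator open() {
        if (iteratorOpen) {
            throw new IllegalStateException("close the current iterator before opening another");
        }
        iteratorOpen = true;
        return new ClosingIterator();
    }

    public static void main(String[] args) {
        SingleIteratorSource source = new SingleIteratorSource(Arrays.asList("r1", "r2"));
        // try-with-resources closes the first iterator before the second is opened.
        try (ClosingIterator first = source.open()) {
            while (first.hasNext()) {
                System.out.println(first.next());
            }
        }
        try (ClosingIterator second = source.open()) {
            System.out.println(second.hasNext()); // prints true
        }
    }
}
```

The same shape applies to any closeable iterator: open it in a try-with-resources header, consume it in the body, and start the next query only after the block exits.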
-
queryContained
SAMRecordIterator queryContained(String sequence, int start, int end)
Iterate over records that are contained in the given interval. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
sequence - Reference sequence of interest.
start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
- Returns:
- Iterator over the SAMRecords contained in the interval.
-
query
SAMRecordIterator query(QueryInterval[] intervals, boolean contained)
Iterate over records that match one of the given intervals. This may be more efficient than querying each interval separately, because multiple reads of the same SAMRecords are avoided. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match an interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
contained - If true, each SAMRecord returned will have its alignment completely contained in one of the intervals of interest. If false, the alignment of the returned SAMRecords need only overlap one of the intervals of interest.
- Returns:
- Iterator over the SAMRecords matching one of the intervals.
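To make "optimized" concrete, the sketch below sorts a set of intervals and merges those that overlap or abut. It illustrates the stated requirement only and is not the implementation of QueryInterval.optimizeIntervals: Span is a hypothetical stand-in type, and the sketch assumes every end coordinate is positive, so it does not handle the zero-means-end-of-reference convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for an interval on a reference sequence; not htsjdk's QueryInterval. */
final class Span {
    final int referenceIndex;
    final int start; // 1-based, inclusive
    final int end;   // 1-based, inclusive; assumed > 0 in this sketch

    Span(int referenceIndex, int start, int end) {
        this.referenceIndex = referenceIndex;
        this.start = start;
        this.end = end;
    }

    /** Returns the spans in order, with overlapping and abutting spans merged. */
    static List<Span> optimize(Span[] input) {
        Span[] sorted = input.clone();
        Arrays.sort(sorted, Comparator
                .comparingInt((Span s) -> s.referenceIndex)
                .thenComparingInt(s -> s.start));
        List<Span> merged = new ArrayList<>();
        for (Span next : sorted) {
            Span last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            // Same reference, and next begins no later than one base past last: overlap or abutment.
            if (last != null && last.referenceIndex == next.referenceIndex
                    && next.start <= last.end + 1) {
                merged.set(merged.size() - 1,
                        new Span(last.referenceIndex, last.start, Math.max(last.end, next.end)));
            } else {
                merged.add(next);
            }
        }
        return merged;
    }

    @Override
    public String toString() {
        return referenceIndex + ":" + start + "-" + end;
    }

    public static void main(String[] args) {
        Span[] raw = {
            new Span(0, 500, 600), new Span(0, 100, 200), new Span(0, 201, 300),
            new Span(0, 250, 400), new Span(1, 1, 50)
        };
        System.out.println(optimize(raw)); // prints [0:100-400, 0:500-600, 1:1-50]
    }
}
```

Here 100-200 and 201-300 abut and 250-400 overlaps the result, so all three collapse into 100-400, while intervals on a different reference are never merged.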
-
queryOverlapping
SAMRecordIterator queryOverlapping(QueryInterval[] intervals)
Iterate over records that overlap any of the given intervals. This may be more efficient than querying each interval separately, because multiple reads of the same SAMRecords are avoided. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
-
queryContained
SAMRecordIterator queryContained(QueryInterval[] intervals)
Iterate over records that are contained in any of the given intervals. This may be more efficient than querying each interval separately, because multiple reads of the same SAMRecords are avoided. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.
- Parameters:
intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
- Returns:
- Iterator over the SAMRecords contained in any of the intervals.
-
queryUnmapped
SAMRecordIterator queryUnmapped() -
queryAlignmentStart
SAMRecordIterator queryAlignmentStart(String sequence, int start)
Iterate over records that map to the given sequence and start at the given position. Only valid to call this if hasIndex() == true.
Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.
Note that indexed lookup is not perfectly efficient in terms of disk I/O, i.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.
Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that matches the arguments.
- Parameters:
sequence - Reference sequence of interest.
start - Alignment start of interest.
- Returns:
- Iterator over the SAMRecords with the given alignment start.
-
queryMate
SAMRecord queryMate(SAMRecord rec)
Fetch the mate for the given read. Only valid to call this if hasIndex() == true. This will work whether the mate has a coordinate or not, so long as the given read has correct mate information.
This method iterates over the SAM file, so there must not be an unclosed iterator on the SAM file when this method is called.
Note that it is not possible to call queryMate while iterating over the SamReader, because queryMate requires its own iteration, and there cannot be two simultaneous iterations on the same SamReader. The workaround is to open a second SamReader on the same input file, and call queryMate on the second reader.
- Parameters:
rec - Record for which mate is sought. Must be a paired read.
- Returns:
- rec's mate, or null if it cannot be found.
-