Interface SamReader

All Superinterfaces:
AutoCloseable, Closeable, Iterable<SAMRecord>
All Known Implementing Classes:
SamReader.PrimitiveSamReaderToSamReaderAdapter

public interface SamReader extends Iterable<SAMRecord>, Closeable
Describes functionality for objects that produce SAMRecords and associated information. Currently, only deprecated readers implement this directly; actual readers implement this via SamReader.ReaderImplementation and SamReader.PrimitiveSamReader, which SamReaderFactory converts into full readers by using SamReader.PrimitiveSamReaderToSamReaderAdapter.
  • Method Details

    • getFileHeader

      SAMFileHeader getFileHeader()
    • type

      SamReader.Type type()
      Returns:
      the SamReader.Type of this SamReader
    • getResourceDescription

      String getResourceDescription()
      Returns:
      a human readable description of the resource backing this sam reader
    • isQueryable

      default boolean isQueryable()
      Returns:
      true if this source can be queried by interval, regardless of whether it has an index
    • hasIndex

      boolean hasIndex()
      Returns:
      true if this is a BAM file and has an index
    • indexing

      SamReader.Indexing indexing()
      Exposes the SamReader.Indexing facet of this SamReader.
      Throws:
      UnsupportedOperationException - If hasIndex() returns false.
    • iterator

      SAMRecordIterator iterator()
      Iterate through file in order. For a SamReader constructed from an InputStream, and for any SAM file, a 2nd iteration starts where the 1st one left off. For a BAM constructed from a SeekableStream or File, each new iteration starts at the first record.

      Only a single open iterator on a SAM or BAM file may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Specified by:
      iterator in interface Iterable<SAMRecord>
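
      The single-open-iterator rule can be illustrated with a toy model that uses only the JDK. This is a sketch, not htsjdk code: SingleIteratorSource, CloseableIter, the String records, and the choice of IllegalStateException are all assumptions made for illustration. It shows why closing the first iterator (here via try-with-resources) matters before starting another.

```java
import java.io.Closeable;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy stand-in for a reader that allows only one open iterator at a time.
// Hypothetical names throughout; the exception type is an assumption.
final class SingleIteratorSource {
    private final List<String> records;
    private boolean iteratorOpen = false;

    SingleIteratorSource(List<String> records) {
        this.records = records;
    }

    // Hands out an iterator positioned at the first record,
    // refusing if a previous iterator has not been closed.
    CloseableIter iterator() {
        if (iteratorOpen) {
            throw new IllegalStateException("an iterator is already open; close it first");
        }
        iteratorOpen = true;
        return new CloseableIter();
    }

    final class CloseableIter implements Iterator<String>, Closeable {
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < records.size();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return records.get(next++);
        }

        // Closing releases the source so that a new iteration may start.
        @Override
        public void close() {
            iteratorOpen = false;
        }
    }

    // Counts records; try-with-resources guarantees the iterator is closed.
    static int count(SingleIteratorSource source) {
        int n = 0;
        try (CloseableIter it = source.iterator()) {
            while (it.hasNext()) {
                it.next();
                n++;
            }
        }
        return n;
    }
}
```

      With a real SamReader the same pattern applies: hold each SAMRecordIterator in a try-with-resources statement, and open a second reader when two iterations must run at once.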
    • query

      SAMRecordIterator query(String sequence, int start, int end, boolean contained)
      Iterate over records that match the given interval. Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      sequence - Reference sequence of interest.
      start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
      end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
      contained - If true, each SAMRecord returned will have its alignment completely contained in the interval of interest. If false, the alignment of the returned SAMRecords need only overlap the interval of interest.
      Returns:
      Iterator over the SAMRecords matching the interval.
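
      The difference between contained and overlapping matches, and the meaning of a zero start or end, can be sketched as a small predicate. This is a simplified model under stated assumptions, not the library's implementation: IntervalMatch and matches are hypothetical names, and the alignment is reduced to a pair of 1-based, inclusive coordinates on the queried sequence.

```java
// Simplified model of interval matching; not htsjdk's implementation.
final class IntervalMatch {
    private IntervalMatch() {}

    // Returns true if an alignment spanning [alnStart, alnEnd] (1-based, inclusive)
    // matches the query interval [start, end] under the given mode.
    static boolean matches(int alnStart, int alnEnd, int start, int end, boolean contained) {
        // Zero means "from the start" or "to the end" of the reference sequence.
        int lo = (start == 0) ? 1 : start;
        int hi = (end == 0) ? Integer.MAX_VALUE : end;
        if (contained) {
            // The whole alignment must lie inside the interval.
            return alnStart >= lo && alnEnd <= hi;
        }
        // Otherwise sharing at least one position is enough.
        return alnStart <= hi && alnEnd >= lo;
    }
}
```

      For example, an alignment at 100-150 overlaps the interval 120-200 but is not contained in it, whereas an alignment at 130-150 satisfies both modes.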
    • queryOverlapping

      SAMRecordIterator queryOverlapping(String sequence, int start, int end)
      Iterate over records that overlap the given interval. Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      sequence - Reference sequence of interest.
      start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
      end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
      Returns:
      Iterator over the SAMRecords overlapping the interval.
    • queryContained

      SAMRecordIterator queryContained(String sequence, int start, int end)
      Iterate over records that are contained in the given interval. Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      sequence - Reference sequence of interest.
      start - 1-based, inclusive start of interval of interest. Zero implies start of the reference sequence.
      end - 1-based, inclusive end of interval of interest. Zero implies end of the reference sequence.
      Returns:
      Iterator over the SAMRecords contained in the interval.
    • query

      SAMRecordIterator query(QueryInterval[] intervals, boolean contained)
      Iterate over records that match one of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords multiple times.

      Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first. You can use a second SamReader to iterate in parallel over the same underlying file.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match an interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
      contained - If true, each SAMRecord returned will have its alignment completely contained in one of the intervals of interest. If false, the alignment of the returned SAMRecords need only overlap one of the intervals of interest.
      Returns:
      Iterator over the SAMRecords matching any of the intervals.
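
      What "optimized" means for the intervals argument (sorted, with overlapping and abutting intervals merged) can be sketched with plain arrays. This is an illustrative model only, not the code behind QueryInterval.optimizeIntervals: IntervalMerge and optimize are hypothetical names, each interval is represented as {referenceIndex, start, end}, and the sketch assumes explicit positive ends rather than the zero-means-end convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative model of interval optimization; not htsjdk's implementation.
final class IntervalMerge {
    private IntervalMerge() {}

    // Each interval is {referenceIndex, start, end}, 1-based and inclusive, with end > 0.
    // Returns a new sorted array in which overlapping and abutting intervals are merged.
    static int[][] optimize(int[][] intervals) {
        // Copy so that the caller's arrays are left untouched.
        int[][] sorted = new int[intervals.length][];
        for (int i = 0; i < intervals.length; i++) {
            sorted[i] = intervals[i].clone();
        }
        Arrays.sort(sorted, Comparator
                .<int[]>comparingInt(iv -> iv[0])
                .thenComparingInt(iv -> iv[1])
                .thenComparingInt(iv -> iv[2]));

        List<int[]> merged = new ArrayList<>();
        for (int[] cur : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            // Same reference, and cur starts inside last or immediately after it:
            // extend last instead of adding a new interval.
            if (last != null && last[0] == cur[0] && (long) cur[1] <= (long) last[2] + 1) {
                last[2] = Math.max(last[2], cur[2]);
            } else {
                merged.add(cur);
            }
        }
        return merged.toArray(new int[0][]);
    }
}
```

      In real code, prefer the library's own QueryInterval.optimizeIntervals, which the parameter description above names as the supported way to prepare the array.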
    • queryOverlapping

      SAMRecordIterator queryOverlapping(QueryInterval[] intervals)
      Iterate over records that overlap any of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords multiple times.

      Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
    • queryContained

      SAMRecordIterator queryContained(QueryInterval[] intervals)
      Iterate over records that are contained in any of the given intervals. This may be more efficient than querying each interval separately, because it avoids reading the same SAMRecords multiple times.

      Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that is in the query region.

      Parameters:
      intervals - Intervals to be queried. The intervals must be optimized, i.e. in order, with overlapping and abutting intervals merged. This can be done with QueryInterval.optimizeIntervals(htsjdk.samtools.QueryInterval[])
      Returns:
      Iterator over the SAMRecords contained in any of the intervals.
    • queryUnmapped

      SAMRecordIterator queryUnmapped()
    • queryAlignmentStart

      SAMRecordIterator queryAlignmentStart(String sequence, int start)
      Iterate over records that map to the given sequence and start at the given position. Only valid to call this if hasIndex() == true.

      Only a single open iterator on a given SamReader may be extant at any one time. If you want to start a second iteration, the first one must be closed first.

      Note that indexed lookup is not perfectly efficient in terms of disk I/O. I.e. some SAMRecords may be read and then discarded because they do not match the interval of interest.

      Note that an unmapped read will be returned by this call if it has a coordinate for the purpose of sorting that matches the arguments.

      Parameters:
      sequence - Reference sequence of interest.
      start - Alignment start of interest.
      Returns:
      Iterator over the SAMRecords with the given alignment start.
    • queryMate

      SAMRecord queryMate(SAMRecord rec)
      Fetch the mate for the given read. Only valid to call this if hasIndex() == true. This will work whether the mate has a coordinate or not, so long as the given read has correct mate information. This method iterates over the SAM file, so there must not be an unclosed iterator on the SAM file when this method is called.

      Note that it is not possible to call queryMate when iterating over the SamReader, because queryMate requires its own iteration, and there cannot be two simultaneous iterations on the same SamReader. The work-around is to open a second SamReader on the same input file, and call queryMate on the second reader.

      Parameters:
      rec - Record for which mate is sought. Must be a paired read.
      Returns:
      rec's mate, or null if it cannot be found.