htsjdk.samtools.cram.build.SliceFactory

public final class SliceFactory extends Object

Factory for creating Slices when writing a CRAM stream. It determines when to emit a Slice based on three inputs: the rules implemented by this class, the accumulated SliceEntry state objects, and the parameter values in the provided CRAMEncodingStrategy object.

Constructor Summary

Constructors

Constructor

Description

SliceFactory(CRAMEncodingStrategy cramEncodingStrategy, CRAMReferenceSource cramReferenceSource, SAMFileHeader samFileHeader, long globalRecordCounter)
Method Summary

Modifier and Type

Method

Description

long

createNewSliceEntry(int currentReferenceContextID, List<SAMRecord> sliceSAMRecords)

Adds a new slice entry and returns the number of slice entries.

List<Slice>

createSlices(CompressionHeader compressionHeader, long containerByteOffset)

Returns a list of Slices created from the records accumulated by the factory, and resets the factory state.

List<CRAMCompressionRecord>

getCRAMRecordsForAllSlices()

Get all CRAM records accumulated by the factory.

int

getNumberOfSliceEntries()

int

getUpdatedReferenceContext(int currentReferenceContext, int nextReferenceIndex, int numberOfSAMRecords)

Decide if the current records should be flushed based on the current reference context, the reference context for the next record to be written, and the number of records seen so far.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- SliceFactory
  
  public SliceFactory(CRAMEncodingStrategy cramEncodingStrategy, CRAMReferenceSource cramReferenceSource, SAMFileHeader samFileHeader, long globalRecordCounter)
  
  Parameters:
  
  cramEncodingStrategy - CRAMEncodingStrategy to use for Slices that are created
  
  cramReferenceSource - CRAMReferenceSource to use for Slices that are created
  
  samFileHeader - for input records, used for finding read groups, sort order, etc.
  
  globalRecordCounter - initial global record counter for Slices that are created
Method Details
- createNewSliceEntry
  
  public long createNewSliceEntry(int currentReferenceContextID, List<SAMRecord> sliceSAMRecords)
  
  Adds a new slice entry and returns the number of slice entries.
  
  Parameters:
  
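  currentReferenceContextID - the reference context ID for the records in the new slice entry
  
  sliceSAMRecords - the SAM records that make up the new slice entry
  
  Returns:
  
  the number of slice entries accumulated by the factory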
- getCRAMRecordsForAllSlices
  
  public List<CRAMCompressionRecord> getCRAMRecordsForAllSlices()
  
  Get all CRAM records accumulated by the factory. These are the records that will be used to create one or more slices when createSlices(htsjdk.samtools.cram.structure.CompressionHeader, long) is called.
  
  Returns:
  
  the list of all CRAMRecords
- getNumberOfSliceEntries
  
  public int getNumberOfSliceEntries()
  
  Returns the number of slice entries accumulated by the factory.
- createSlices
  
  public List<Slice> createSlices(CompressionHeader compressionHeader, long containerByteOffset)
  
  Returns a list of Slices created from the records accumulated by the factory, and resets the factory state.
  
  Parameters:
  
  compressionHeader - the compression header to use to create the Slices
  
  containerByteOffset - the container byte offset to use for the newly created Slices
  
  Returns:
  
  List of Slices created from the accumulated state of this SliceFactory
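
The accumulate-then-drain behavior described for createSlices can be illustrated with a minimal, self-contained sketch. It does not use htsjdk: the class `DrainingFactorySketch`, its methods, and the string stand-ins for records and slices are all hypothetical, chosen only to show that creating the output also resets the accumulated state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the accumulate-then-drain pattern; this is not the htsjdk class.
final class DrainingFactorySketch {
    // Each pending entry holds the records destined for one future slice.
    private final List<List<String>> pendingEntries = new ArrayList<>();

    // Adds one entry and returns the number of entries accumulated so far.
    long addEntry(final List<String> records) {
        pendingEntries.add(new ArrayList<>(records));
        return pendingEntries.size();
    }

    int getNumberOfEntries() {
        return pendingEntries.size();
    }

    // Builds one output item per pending entry, then clears the accumulated state.
    List<String> createAll(final long byteOffset) {
        final List<String> created = new ArrayList<>();
        for (final List<String> entry : pendingEntries) {
            created.add("slice(offset=" + byteOffset + ", records=" + entry.size() + ")");
        }
        pendingEntries.clear();
        return created;
    }

    public static void main(final String[] args) {
        final DrainingFactorySketch factory = new DrainingFactorySketch();
        factory.addEntry(Arrays.asList("read1", "read2"));
        factory.addEntry(Arrays.asList("read3"));
        System.out.println(factory.createAll(0L));
        System.out.println(factory.getNumberOfEntries());
    }
}
```

A caller following this pattern would typically check the entry count before draining, since a second drain with no new entries yields an empty list.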
- getUpdatedReferenceContext
  
  public int getUpdatedReferenceContext(int currentReferenceContext, int nextReferenceIndex, int numberOfSAMRecords)
  
  Decide if the current records should be flushed based on the current reference context, the reference context for the next record to be written, and the number of records seen so far.
  
  Slices with the Multiple Reference flag (-2) set as the sequence ID in the header may contain reads mapped to multiple external references, including unmapped reads (placed on these references or unplaced), but multiple embedded references cannot be combined in this way. When multiple references are used, the RI data series will be used to determine the reference sequence ID for each record. This data series is not present when only a single reference is used within a slice.
  
  The Unmapped (-1) sequence ID in the header is for slices containing only unplaced unmapped reads.
  
  A slice containing data that does not use the external reference in any sequence may set the reference MD5 sum to zero. This can happen because the data is unmapped or the sequence has been stored verbatim instead of via reference-differencing. This latter scenario is recommended for unsorted or non-coordinate-sorted data.
  
  Parameters:
  
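  currentReferenceContext - reference context of the records accumulated so far
  
  nextReferenceIndex - reference index of the next record to be emitted
  
  numberOfSAMRecords - number of SAM records seen so far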
  
  Returns:
  
  ReferenceContext.UNINITIALIZED_REFERENCE_ID if a current slice should be flushed and subsequent records should go into a new slice; otherwise the updated reference context.
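
The kind of decision this method makes can be sketched as a small, self-contained function. This is an illustration under stated assumptions, not the htsjdk implementation: the class `ReferenceContextSketch`, the sentinel value -3 for the uninitialized context, and both thresholds are hypothetical. Only the values -2 (multiple reference) and -1 (unmapped) come from the description above.

```java
// Hypothetical sketch of a flush-or-continue decision; not the htsjdk implementation.
final class ReferenceContextSketch {
    static final int UNMAPPED_UNPLACED = -1;   // from the description above
    static final int MULTIPLE_REFERENCE = -2;  // from the description above
    static final int UNINITIALIZED = -3;       // assumed sentinel meaning "start a new slice"

    // Assumed thresholds; in practice such values would come from an encoding strategy.
    static final int READS_PER_SLICE = 10_000;
    static final int MIN_SINGLE_REF_RECORDS = 1_000;

    static int updatedContext(final int current, final int nextReferenceIndex, final int recordCount) {
        if (current == UNINITIALIZED) {
            return nextReferenceIndex;      // the first record defines the context
        }
        if (recordCount >= READS_PER_SLICE) {
            return UNINITIALIZED;           // slice is full: flush
        }
        if (current == MULTIPLE_REFERENCE) {
            return MULTIPLE_REFERENCE;      // a multi-reference slice accepts any reference
        }
        if (current == nextReferenceIndex) {
            return current;                 // same reference (or still unmapped): keep going
        }
        // The reference changed before the slice filled up: a small slice is
        // converted to multi-reference, a large one is flushed.
        return recordCount < MIN_SINGLE_REF_RECORDS ? MULTIPLE_REFERENCE : UNINITIALIZED;
    }

    public static void main(final String[] args) {
        System.out.println(updatedContext(UNINITIALIZED, 5, 0));
        System.out.println(updatedContext(5, 6, 100));
        System.out.println(updatedContext(5, 6, 5_000));
    }
}
```

In this sketch the caller treats a return value of `UNINITIALIZED` as the signal to flush the accumulated records and start a new slice, mirroring the return contract described above.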
