Package htsjdk.samtools.cram.structure
Class CompressionHeaderEncodingMap
java.lang.Object
htsjdk.samtools.cram.structure.CompressionHeaderEncodingMap
Maintains a map of DataSeries to EncodingDescriptor, and a second map that contains the compressor to use
for each EncodingDescriptor that represents an EXTERNAL encoding.
There are two constructors: one populates the map from scratch using the default encodings chosen by
this (htsjdk) implementation, and is used when writing a new CRAM; the other populates the map from a serialized
CRAM stream, resulting in the encodings chosen by the implementation that wrote that CRAM.
Although the CRAM spec defines a fixed list of data series, individual CRAM implementations
may choose to use only a subset of these. Therefore, the actual set of encodings that are
instantiated can vary depending on the source.
Notes on the htsjdk CRAM write implementation: This implementation encodes ALL DataSeries to external
blocks (although some of the external encodings split the data between core and external; see
ByteArrayLenEncoding), and does not use the 'BB' or 'QQ'
DataSeries when writing CRAM at all. It relies heavily on GZIP and RANS for compression.
See EncodingFactory for details on how an EncodingDescriptor is mapped to the codec that actually transfers data to and from underlying Slice blocks.
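The two-map layout described above can be pictured with a minimal sketch. Everything in it is a hypothetical stand-in written for illustration: `Series`, `Descriptor`, and `Compressor` are not the htsjdk `DataSeries`, `EncodingDescriptor`, or `ExternalCompressor` types, and keying the compressor map by content ID is an assumption based on the method descriptions on this page rather than a statement about the actual implementation.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative model only: these types are stand-ins, not htsjdk classes.
class EncodingMapSketch {
    enum Series { BF, CF, RL }        // stand-in for a few data series
    enum Compressor { GZIP, RANS }    // stand-in for an external compressor

    // Stand-in for an encoding descriptor: an encoding kind plus the
    // external block (content ID) that the encoding writes to.
    static final class Descriptor {
        final String kind;
        final int contentId;

        Descriptor(String kind, int contentId) {
            this.kind = kind;
            this.contentId = contentId;
        }
    }

    // First map: data series -> encoding descriptor.
    private final Map<Series, Descriptor> encodings = new EnumMap<>(Series.class);
    // Second map: content ID of an external block -> compressor (assumed keying).
    private final Map<Integer, Compressor> compressors = new LinkedHashMap<>();

    // Register an external encoding and its compressor in one step,
    // so the two maps cannot drift out of sync.
    void putExternalEncoding(Series series, int contentId, Compressor compressor) {
        encodings.put(series, new Descriptor("EXTERNAL", contentId));
        compressors.put(contentId, compressor);
    }

    // Returns null when the series has no encoding, mirroring the point that
    // an implementation may use only a subset of the defined data series.
    Descriptor descriptorFor(Series series) {
        return encodings.get(series);
    }

    Compressor compressorFor(int contentId) {
        return compressors.get(contentId);
    }
}
```

Registering both entries through a single method is one way to keep the descriptor map and the compressor map consistent; whether htsjdk enforces this in the same way is not stated here.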
Field Summary
DATASERIES_NOT_READ_BY_HTSJDK
Constructor Summary
CompressionHeaderEncodingMap(CRAMEncodingStrategy encodingStrategy)
    Constructor used to create the default encoding map for writing CRAMs.
CompressionHeaderEncodingMap(InputStream inputStream)
    Constructor used to discover an encoding map from a serialized CRAM stream.
Method Summary
Block createCompressedBlockForStream(Integer contentId, ByteArrayOutputStream outputStream)
boolean equals(Object o)
ExternalCompressor getBestExternalCompressor(byte[] data, CRAMEncodingStrategy encodingStrategy)
    Return the best external compressor to use for the provided byte array (compressor that results in the smallest compressed size).
EncodingDescriptor getEncodingDescriptorForDataSeries(DataSeries dataSeries)
    Get the encoding params that should be used for a given DataSeries.
List<Integer> getExternalIDs()
    Get a list of all external IDs for this encoding map.
int hashCode()
void putExternalEncoding(DataSeries dataSeries, EncodingDescriptor encodingDescriptor, ExternalCompressor compressor)
void putTagBlockCompression(int tagId, ExternalCompressor compressor)
    Add an external compressor for a tag block.
void write(OutputStream outputStream)
    Write the encoding map out to a CRAM stream.
Field Details
DATASERIES_NOT_READ_BY_HTSJDK
Constructor Details
CompressionHeaderEncodingMap
Constructor used to create the default encoding map for writing CRAMs. The parameter values in the encoding strategy are used to set compression levels and similar options, but any encoding map embedded in the strategy is ignored, since this constructor uses the default strategy.
Parameters:
encodingStrategy - CRAMEncodingStrategy containing parameter values to use when creating the encoding map
CompressionHeaderEncodingMap
Constructor used to discover an encoding map from a serialized CRAM stream.
Parameters:
inputStream - the CRAM input stream to be consumed
Method Details
putTagBlockCompression
Add an external compressor for a tag block.
Parameters:
tagId - the tag as a content ID
compressor - compressor to be used for this tag block
getEncodingDescriptorForDataSeries
Get the encoding params that should be used for a given DataSeries.
Parameters:
dataSeries - the DataSeries to look up
Returns:
EncodingDescriptor for the DataSeries
getExternalIDs
Get a list of all external IDs for this encoding map.
Returns:
list of all external IDs for this encoding map
createCompressedBlockForStream
Given a content ID, return a Block for that ID by obtaining the contents of the stream, compressing it using the compressor for that content ID, and converting the result to a Block.
Parameters:
contentId - content ID to use
outputStream - stream to compress
Returns:
Block containing the compressed contents of the stream
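The three steps in this description (obtain the stream contents, compress them, wrap the result) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the htsjdk code: `SketchBlock` is a hypothetical stand-in for `Block`, and GZIP is hard-wired here where the real method would look up the compressor registered for the content ID.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustrative model only: SketchBlock is a stand-in, not the htsjdk Block.
class CompressedBlockSketch {
    static final class SketchBlock {
        final int contentId;
        final byte[] compressed;
        final int rawSize;

        SketchBlock(int contentId, byte[] compressed, int rawSize) {
            this.contentId = contentId;
            this.compressed = compressed;
            this.rawSize = rawSize;
        }
    }

    // Step 1: take the stream contents. Step 2: compress them (GZIP is an
    // assumption here). Step 3: wrap the result together with its content ID.
    static SketchBlock createCompressedBlock(int contentId, ByteArrayOutputStream stream) {
        byte[] raw = stream.toByteArray();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the sink holds the full output.
        return new SketchBlock(contentId, sink.toByteArray(), raw.length);
    }

    // Inverse operation, included so the round trip can be checked.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Keeping the uncompressed size next to the compressed bytes is a design choice of this sketch; it lets a reader of the block allocate the right buffer before decompressing.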
write
Write the encoding map out to a CRAM stream.
Parameters:
outputStream - stream to write
Throws:
IOException
getBestExternalCompressor
public ExternalCompressor getBestExternalCompressor(byte[] data, CRAMEncodingStrategy encodingStrategy)
Return the best external compressor to use for the provided byte array (the compressor that results in the smallest compressed size). Note that this is not necessarily the best compression to use for the source data series: it does not consider the size of the alphabet (2 byte int, 4 byte int), since it is only choosing from EXTERNAL compressors.
Parameters:
data - byte array to compress
encodingStrategy - encoding strategy parameters to use
Returns:
the best ExternalCompressor to use for this data
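The selection rule described here (try each candidate and keep the one with the smallest output) can be sketched as follows. This is a minimal illustration under assumptions: the candidates are `java.util.zip.Deflater` settings chosen only because they ship with the JDK, and the names `bestCompressorName` and `defaultCandidates` are hypothetical. The set of compressors that htsjdk actually compares is not listed on this page.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.zip.Deflater;

// Illustrative model only: the candidates are JDK stand-ins, not htsjdk compressors.
class BestCompressorSketch {
    // Compress the data with java.util.zip.Deflater at the given level.
    static byte[] deflate(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Run every candidate over the data and return the name of the one with
    // the smallest output. On a tie, the candidate seen first is kept.
    static String bestCompressorName(byte[] data,
                                     Map<String, Function<byte[], byte[]>> candidates) {
        String best = null;
        int bestSize = Integer.MAX_VALUE;
        for (Map.Entry<String, Function<byte[], byte[]>> entry : candidates.entrySet()) {
            int size = entry.getValue().apply(data).length;
            if (size < bestSize) {
                bestSize = size;
                best = entry.getKey();
            }
        }
        return best;
    }

    // Hypothetical candidate set; insertion order decides ties.
    static Map<String, Function<byte[], byte[]>> defaultCandidates() {
        Map<String, Function<byte[], byte[]>> candidates = new LinkedHashMap<>();
        candidates.put("stored", d -> deflate(d, Deflater.NO_COMPRESSION));
        candidates.put("deflate-fast", d -> deflate(d, Deflater.BEST_SPEED));
        candidates.put("deflate-best", d -> deflate(d, Deflater.BEST_COMPRESSION));
        return candidates;
    }
}
```

Because every candidate compresses the full array, this approach trades encoding time for output size, which is presumably why the encoding strategy parameters are passed in alongside the data.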
putExternalEncoding
public void putExternalEncoding(DataSeries dataSeries, EncodingDescriptor encodingDescriptor, ExternalCompressor compressor)
equals
hashCode
public int hashCode()