public class MySequenceFile extends Object
MySequenceFiles are flat files consisting of binary key/value pairs.
MySequenceFile provides MySequenceFile.Writer, MySequenceFile.Reader and MySequenceFile.Sorter classes for writing, reading and sorting respectively.
There are three MySequenceFile Writers, based on the MySequenceFile.CompressionType used to compress key/value pairs:
- Writer: Uncompressed records.
- RecordCompressWriter: Record-compressed files; only values are compressed.
- BlockCompressWriter: Block-compressed files; keys and values are collected in 'blocks' separately and compressed. The size of the 'block' is configurable.
The actual compression algorithm used to compress keys and/or values can be specified by using the appropriate CompressionCodec.
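To make the block-compressed idea concrete, here is a minimal sketch in which keys and values are gathered into separate buffers and each buffer is compressed on its own. It is not the MySequenceFile implementation: the class `BlockCompressionSketch` and its methods are hypothetical, DEFLATE from `java.util.zip` stands in for a configured CompressionCodec, and the per-entry lengths that a real file would need are left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Sketch only: DEFLATE stands in for whatever CompressionCodec is configured. */
final class BlockCompressionSketch {
    // Keys and values accumulate in separate buffers until the block is compressed.
    private final ByteArrayOutputStream keyBlock = new ByteArrayOutputStream();
    private final ByteArrayOutputStream valueBlock = new ByteArrayOutputStream();

    /** Adds one pair to the current block (entry lengths are omitted in this sketch). */
    void append(String key, String value) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        keyBlock.write(k, 0, k.length);
        valueBlock.write(v, 0, v.length);
    }

    /** Compresses all buffered keys as one unit. */
    byte[] compressedKeys() {
        return deflate(keyBlock.toByteArray());
    }

    /** Compresses all buffered values as one unit, independently of the keys. */
    byte[] compressedValues() {
        return deflate(valueBlock.toByteArray());
    }

    static byte[] deflate(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream dos = new DeflaterOutputStream(out)) {
            dos.write(raw, 0, raw.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] inflate(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

A record-compressed writer would, by contrast, compress each value individually and leave keys as they are.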
The recommended way is to use the static createWriter methods provided by MySequenceFile to choose the preferred format.
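The relationship between compression type and writer kind can be sketched as a simple dispatch. This is an illustration only: the enum constants and the helper `writerFor` are hypothetical stand-ins, and the real createWriter methods take the Hadoop parameters listed in the method summary rather than returning a name.

```java
/** Sketch only: maps a compression type to the writer kind named in the list of writers. */
final class WriterChoiceSketch {

    /** Hypothetical stand-in for MySequenceFile.CompressionType. */
    enum CompressionType { NONE, RECORD, BLOCK }

    /** Returns the name of the writer kind that would handle the given compression type. */
    static String writerFor(CompressionType type) {
        switch (type) {
            case NONE:
                return "Writer";                // uncompressed records
            case RECORD:
                return "RecordCompressWriter";  // only values are compressed
            case BLOCK:
                return "BlockCompressWriter";   // keys and values compressed in separate blocks
            default:
                throw new IllegalArgumentException("Unknown compression type: " + type);
        }
    }

    public static void main(String[] args) {
        for (CompressionType type : CompressionType.values()) {
            System.out.println(type + " -> " + writerFor(type));
        }
    }
}
```

Keeping this choice inside a factory method means callers only state the compression type they want and never name a concrete writer class.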
The MySequenceFile.Reader acts as the bridge and can read any of the above MySequenceFile formats.
Essentially there are 3 different formats for MySequenceFiles depending on the CompressionType specified. All of them share a common header described below.
Among other fields, the header records:
- The CompressionCodec class which is used for compression of keys and/or values (if compression is enabled).
- The MySequenceFile.Metadata for this file.

Each of the three formats writes a sync-marker every few 100 bytes or so.

The compressed blocks of key lengths and value lengths consist of the actual lengths of individual keys/values encoded in ZeroCompressedInteger format.

See Also: CompressionCodec
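The ZeroCompressedInteger format mentioned above is a variable-length integer encoding in which small magnitudes occupy fewer bytes. The sketch below follows the scheme of Hadoop's `WritableUtils.writeVLong` and `readVLong` as I understand it; that MySequenceFile uses exactly this byte layout is an assumption, and the class `ZeroCompressedIntSketch` is hypothetical.

```java
import java.io.ByteArrayOutputStream;

/** Sketch only: a zero-compressed (variable-length) long encoding. */
final class ZeroCompressedIntSketch {

    /** Encodes a long so that values of small magnitude take fewer bytes. */
    static byte[] encode(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        if (value >= -112 && value <= 127) {
            out.write((int) value);          // small values are stored directly in one byte
            return out.toByteArray();
        }
        int len = -112;                      // marker base for non-negative values
        if (value < 0) {
            value = ~value;                  // one's complement, so high-order bytes become zero
            len = -120;                      // marker base for negative values
        }
        for (long tmp = value; tmp != 0; tmp >>= 8) {
            len--;                           // one step per significant byte
        }
        out.write(len);                      // first byte carries the sign and the byte count
        int count = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = count; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.write((int) ((value >> shift) & 0xFF));  // payload, most significant byte first
        }
        return out.toByteArray();
    }

    /** Decodes a value produced by encode. */
    static long decode(byte[] bytes) {
        byte first = bytes[0];
        if (first >= -112) {
            return first;                    // the single byte is the value itself
        }
        boolean negative = first < -120;
        int count = negative ? -(first + 120) : -(first + 112);
        long value = 0;
        for (int i = 1; i <= count; i++) {
            value = (value << 8) | (bytes[i] & 0xFF);
        }
        return negative ? ~value : value;
    }
}
```

Under this scheme any length from -112 to 127 costs a single byte, which suits blocks of key and value lengths where most entries are short.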
Modifier and Type | Class and Description |
---|---|
static class | MySequenceFile.CompressionType Deprecated. The compression type used to compress key/value pairs in the MySequenceFile. |
static class | MySequenceFile.Metadata Deprecated. The class encapsulating the metadata of a file. |
static class | MySequenceFile.Reader Deprecated. Reads key/value pairs from a sequence-format file. |
static class | MySequenceFile.Sorter Deprecated. Sorts key/value pairs in a sequence-format file. |
static interface | MySequenceFile.ValueBytes Deprecated. The interface to 'raw' values of SequenceFiles. |
static class | MySequenceFile.Writer Deprecated. Writes key/value pairs to a sequence-format file. |
Modifier and Type | Field and Description |
---|---|
static int | SYNC_INTERVAL Deprecated. The number of bytes between sync points. |
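SYNC_INTERVAL is described only as the number of bytes between sync points. One way a writer could honor such an interval is sketched below: before each record, it checks how many bytes have been written since the last marker and emits a new marker once the interval is reached. The interval value of 100, the 16-byte marker, and the class `SyncPointSketch` are all hypothetical; the actual constant and marker layout are not stated on this page.

```java
import java.io.ByteArrayOutputStream;
import java.util.List;

/** Sketch only: emits a marker once enough bytes have been written since the last one. */
final class SyncPointSketch {
    static final int SYNC_INTERVAL = 100;              // hypothetical value
    static final byte[] SYNC_MARKER = new byte[16];    // hypothetical 16-byte marker

    /** Writes the records to an in-memory buffer and returns how many markers were inserted. */
    static int countSyncPoints(List<byte[]> records) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int lastSync = 0;   // position just after the most recent marker
        int syncs = 0;
        for (byte[] record : records) {
            if (out.size() >= lastSync + SYNC_INTERVAL) {
                out.write(SYNC_MARKER, 0, SYNC_MARKER.length);
                lastSync = out.size();
                syncs++;
            }
            out.write(record, 0, record.length);
        }
        return syncs;
    }
}
```

Because the check runs only between records, a marker never splits a record, and the gap between markers can exceed the interval by up to one record's length.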
Modifier and Type | Method and Description |
---|---|
static MySequenceFile.Writer | createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FSDataOutputStream out, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec) Deprecated. Construct the preferred type of 'raw' MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FSDataOutputStream out, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, MySequenceFile.Metadata metadata) Deprecated. Construct the preferred type of 'raw' MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, int bufferSize, short replication, long blockSize, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, MySequenceFile.Metadata metadata) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, MySequenceFile.Metadata metadata) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.Writer | createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.util.Progressable progress) Deprecated. Construct the preferred type of MySequenceFile Writer. |
static MySequenceFile.CompressionType | getCompressionType(org.apache.hadoop.conf.Configuration job) Deprecated. Use SequenceFileOutputFormat.getOutputCompressionType(org.apache.hadoop.mapred.JobConf) to get MySequenceFile.CompressionType for job-outputs. |
static void | setCompressionType(org.apache.hadoop.conf.Configuration job, MySequenceFile.CompressionType val) Deprecated. Use one of the many MySequenceFile.createWriter methods to specify the MySequenceFile.CompressionType while creating the MySequenceFile to specify the MySequenceFile.CompressionType for job-outputs. |
public static final int SYNC_INTERVAL
@Deprecated public static MySequenceFile.CompressionType getCompressionType(org.apache.hadoop.conf.Configuration job)
Deprecated. Use SequenceFileOutputFormat.getOutputCompressionType(org.apache.hadoop.mapred.JobConf) to get MySequenceFile.CompressionType for job-outputs.
job - the job config to look in

@Deprecated public static void setCompressionType(org.apache.hadoop.conf.Configuration job, MySequenceFile.CompressionType val)
Deprecated. Use one of the many MySequenceFile.createWriter methods to specify the MySequenceFile.CompressionType while creating the MySequenceFile to specify the MySequenceFile.CompressionType for job-outputs.
job - the configuration to modify
val - the new compression type (none, block, record)

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.util.Progressable progress) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
progress - The Progressable object to track progress.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
codec - The compression codec.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, MySequenceFile.Metadata metadata) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
metadata - The metadata of the file.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, int bufferSize, short replication, long blockSize, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress, MySequenceFile.Metadata metadata) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
bufferSize - The buffer size for the underlying output stream.
replication - The replication factor for the file.
blockSize - The block size for the file.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
metadata - The metadata of the file.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path name, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, org.apache.hadoop.util.Progressable progress) throws IOException
fs - The configured filesystem.
conf - The configuration.
name - The name of the file.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
codec - The compression codec.
progress - The Progressable object to track progress.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FSDataOutputStream out, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec, MySequenceFile.Metadata metadata) throws IOException
conf - The configuration.
out - The stream on top of which the writer is to be constructed.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
codec - The compression codec.
metadata - The metadata of the file.
Throws: IOException

public static MySequenceFile.Writer createWriter(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.FSDataOutputStream out, Class keyClass, Class valClass, MySequenceFile.CompressionType compressionType, org.apache.hadoop.io.compress.CompressionCodec codec) throws IOException
conf - The configuration.
out - The stream on top of which the writer is to be constructed.
keyClass - The 'key' type.
valClass - The 'value' type.
compressionType - The compression type.
codec - The compression codec.
Throws: IOException

Copyright © 2012 The Apache Software Foundation. All Rights Reserved.