@InterfaceAudience.Public @InterfaceStability.Stable public class FixedLengthInputFormat extends FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable> implements JobConfigurable
See Also: FixedLengthRecordReader
Nested classes/interfaces inherited from class FileInputFormat: FileInputFormat.Counter
Modifier and Type | Field and Description |
---|---|
static String | FIXED_RECORD_LENGTH |
Fields inherited from class FileInputFormat: INPUT_DIR_RECURSIVE, LOG, NUM_INPUT_FILES
Constructor and Description |
---|
FixedLengthInputFormat() |
Modifier and Type | Method and Description |
---|---|
void | configure(JobConf conf) Initializes a new instance from a JobConf. |
static int | getRecordLength(org.apache.hadoop.conf.Configuration conf) Get record length value |
RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable> | getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) Get the RecordReader for the given InputSplit. |
protected boolean | isSplitable(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file) Is the given filename splittable? Usually, true, but if the file is stream compressed, it will not be. |
static void | setRecordLength(org.apache.hadoop.conf.Configuration conf, int recordLength) Set the length of each record |
Methods inherited from class FileInputFormat: addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getInputPathFilter, getInputPaths, getSplitHosts, getSplits, listStatus, makeSplit, makeSplit, setInputPathFilter, setInputPaths, setInputPaths, setMinSplitSize
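To make the record model concrete, the following plain-Java sketch cuts a byte array into records of one fixed length and pairs each record with a `long` key and its raw bytes, mirroring the `LongWritable`/`BytesWritable` types in the class signature. It does not use the Hadoop classes. `FixedLengthRecords`, `Record`, and `split` are hypothetical names, and treating the key as the record's starting byte offset is an assumption that this page does not state.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; it does not use or reproduce the Hadoop classes.
final class FixedLengthRecords {

    /** One record: a long key (assumed here to be the starting byte offset) and the raw bytes. */
    static final class Record {
        final long offset;
        final byte[] bytes;

        Record(long offset, byte[] bytes) {
            this.offset = offset;
            this.bytes = bytes;
        }
    }

    /** Cuts data into consecutive records of exactly recordLength bytes. */
    static List<Record> split(byte[] data, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        if (data.length % recordLength != 0) {
            throw new IllegalArgumentException("input ends with a partial record");
        }
        List<Record> records = new ArrayList<>();
        for (int pos = 0; pos < data.length; pos += recordLength) {
            records.add(new Record(pos, Arrays.copyOfRange(data, pos, pos + recordLength)));
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII);
        for (Record r : split(data, 4)) {
            // prints "0 -> AAAA", "4 -> BBBB", "8 -> CCCC"
            System.out.println(r.offset + " -> " + new String(r.bytes, StandardCharsets.US_ASCII));
        }
    }
}
```

In an actual job the record length would be supplied through `setRecordLength(conf, recordLength)` rather than passed as a method argument, and the records need not be text.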
public static final String FIXED_RECORD_LENGTH
public static void setRecordLength(org.apache.hadoop.conf.Configuration conf, int recordLength)
Set the length of each record
Parameters:
conf - configuration
recordLength - the length of a record
public static int getRecordLength(org.apache.hadoop.conf.Configuration conf)
Get record length value
Parameters:
conf - configuration
public void configure(JobConf conf)
Description copied from interface: JobConfigurable
Initializes a new instance from a JobConf.
Specified by: configure in interface JobConfigurable
Parameters:
conf - the configuration
public RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable> getRecordReader(InputSplit genericSplit, JobConf job, Reporter reporter) throws IOException
Description copied from interface: InputFormat
Get the RecordReader for the given InputSplit.
It is the responsibility of the RecordReader to respect record boundaries while processing the logical split to present a record-oriented view to the individual task.
Specified by: getRecordReader in interface InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
Specified by: getRecordReader in class FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
Parameters:
genericSplit - the InputSplit
job - the job that this split belongs to
Returns: a RecordReader
Throws: IOException
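The statement that the RecordReader must respect record boundaries can be illustrated with a little arithmetic. The sketch below shows one way a reader could align a split that starts at an arbitrary byte offset to fixed-length record boundaries: it skips any partial record at the start of the split and takes ownership of every record that begins inside the split. `SplitAlignment`, `firstRecordStart`, and `recordsInSplit` are hypothetical names, and the ownership rule is an assumption made for illustration, not a description of the FixedLengthRecordReader source.

```java
// Hypothetical helper, not part of Hadoop: one possible way to align a
// byte-range split to fixed-length record boundaries.
final class SplitAlignment {

    /** Offset of the first record that starts at or after splitStart. */
    static long firstRecordStart(long splitStart, int recordLength) {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        long partial = splitStart % recordLength;
        // If the split begins mid-record, skip the remainder of that record.
        return partial == 0 ? splitStart : splitStart + (recordLength - partial);
    }

    /** Number of records that start inside [splitStart, splitStart + splitLength). */
    static long recordsInSplit(long splitStart, long splitLength, int recordLength) {
        long first = firstRecordStart(splitStart, recordLength);
        long end = splitStart + splitLength;
        if (first >= end) {
            return 0;
        }
        // Ceiling division: a record that starts in this split is counted here
        // even if its last bytes lie in the next split.
        return (end - first + recordLength - 1) / recordLength;
    }

    public static void main(String[] args) {
        // A 100-byte file of 10-byte records, cut into splits of 45 and 55 bytes.
        System.out.println(recordsInSplit(0, 45, 10));  // prints 5 (records at 0..40)
        System.out.println(recordsInSplit(45, 55, 10)); // prints 5 (records at 50..90)
    }
}
```

Under this rule every record is read by exactly one split: the record that straddles byte 45 belongs to the first split, and the second split starts reading at byte 50.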
protected boolean isSplitable(org.apache.hadoop.fs.FileSystem fs, org.apache.hadoop.fs.Path file)
Description copied from class: FileInputFormat
Is the given filename splittable? Usually, true, but if the file is stream compressed, it will not be.
The default implementation in FileInputFormat always returns true. Implementations that may deal with non-splittable files must override this method.
FileInputFormat implementations can override this and return false to ensure that individual input files are never split up, so that Mappers process entire files.
Overrides: isSplitable in class FileInputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.BytesWritable>
Parameters:
fs - the file system that the file is on
file - the file name to check
Copyright © 2018 Apache Software Foundation. All Rights Reserved.