org.apache.hadoop.mapreduce.lib.input
Class CombineFileSplit

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputSplit
      extended by org.apache.hadoop.mapreduce.lib.input.CombineFileSplit
All Implemented Interfaces:
org.apache.hadoop.io.Writable
Direct Known Subclasses:
org.apache.hadoop.mapred.lib.CombineFileSplit

@InterfaceAudience.Public
@InterfaceStability.Stable
public class CombineFileSplit
extends InputSplit
implements org.apache.hadoop.io.Writable

A sub-collection of input files. Unlike FileSplit, the CombineFileSplit class does not represent a split of a single file, but a split of the input files into smaller sets. A split may contain blocks from different files, but all the blocks in the same split are probably local to some rack.
CombineFileSplit can be used to implement RecordReaders that read one record per file.
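
The idea can be illustrated with a plain-Java sketch that models a combined split as parallel arrays of paths, start offsets and lengths. The class CombinedSplitSketch, its String paths and the sample file names are hypothetical stand-ins so the example runs without Hadoop on the classpath; it is not the Hadoop implementation.

```java
// Hypothetical stand-in for illustration only; this is not the Hadoop class.
final class CombinedSplitSketch {
    private final String[] paths;        // one entry per chunk (Hadoop uses Path here)
    private final long[] startOffsets;   // byte offset of each chunk within its file
    private final long[] lengths;        // byte length of each chunk
    private final long totalLength;      // sum of all chunk lengths

    CombinedSplitSketch(String[] paths, long[] startOffsets, long[] lengths) {
        if (paths.length != startOffsets.length || paths.length != lengths.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        this.paths = paths.clone();
        this.startOffsets = startOffsets.clone();
        this.lengths = lengths.clone();
        long sum = 0;
        for (long len : lengths) {
            sum += len;
        }
        this.totalLength = sum;
    }

    int getNumPaths() { return paths.length; }
    String getPath(int i) { return paths[i]; }
    long getOffset(int i) { return startOffsets[i]; }
    long getLength(int i) { return lengths[i]; }
    long getLength() { return totalLength; }

    public static void main(String[] args) {
        CombinedSplitSketch split = new CombinedSplitSketch(
            new String[] {"/data/a.log", "/data/b.log"},
            new long[] {0L, 128L},
            new long[] {100L, 50L});
        // Walk the chunks the way a per-file record reader might.
        for (int i = 0; i < split.getNumPaths(); i++) {
            System.out.println(split.getPath(i)
                + " offset=" + split.getOffset(i)
                + " length=" + split.getLength(i));
        }
        System.out.println("total bytes: " + split.getLength());
    }
}
```

The index-based accessors mirror the method names listed in the Method Summary, so a loop over 0..getNumPaths()-1 visits each chunk in turn.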

See Also:
FileSplit, CombineFileInputFormat

Constructor Summary
CombineFileSplit()
          Default constructor
CombineFileSplit(CombineFileSplit old)
          Copy constructor
CombineFileSplit(org.apache.hadoop.fs.Path[] files, long[] lengths)
           
CombineFileSplit(org.apache.hadoop.fs.Path[] files, long[] start, long[] lengths, String[] locations)
           
 
Method Summary
 long getLength()
          Get the size of the split, so that the input splits can be sorted by size.
 long getLength(int i)
          Returns the length of the ith Path
 long[] getLengths()
          Returns an array containing the lengths of the files in the split
 String[] getLocations()
          Returns the locations (node names) where this input split resides
 int getNumPaths()
          Returns the number of Paths in the split
 long getOffset(int i)
          Returns the start offset of the ith Path
 org.apache.hadoop.fs.Path getPath(int i)
          Returns the ith Path
 org.apache.hadoop.fs.Path[] getPaths()
          Returns all the Paths in the split
 long[] getStartOffsets()
          Returns an array containing the start offsets of the files in the split
 void readFields(DataInput in)
           
 String toString()
           
 void write(DataOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

CombineFileSplit

public CombineFileSplit()
Default constructor


CombineFileSplit

public CombineFileSplit(org.apache.hadoop.fs.Path[] files,
                        long[] start,
                        long[] lengths,
                        String[] locations)

CombineFileSplit

public CombineFileSplit(org.apache.hadoop.fs.Path[] files,
                        long[] lengths)

CombineFileSplit

public CombineFileSplit(CombineFileSplit old)
                 throws IOException
Copy constructor

Throws:
IOException
Method Detail

getLength

public long getLength()
Description copied from class: InputSplit
Get the size of the split, so that the input splits can be sorted by size.

Specified by:
getLength in class InputSplit
Returns:
the number of bytes in the split
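
A minimal sketch of sorting splits by size, assuming a largest-first ordering. SplitSizeOrdering and SizedSplit are hypothetical names used in place of real InputSplit objects, and the ordering policy is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SplitSizeOrdering {
    // Hypothetical minimal split: a label plus the byte length of each chunk.
    static final class SizedSplit {
        final String name;
        final long[] lengths;

        SizedSplit(String name, long... lengths) {
            this.name = name;
            this.lengths = lengths.clone();
        }

        // Total size of the split: the sum of its chunk lengths.
        long getLength() {
            long total = 0;
            for (long len : lengths) {
                total += len;
            }
            return total;
        }
    }

    // Returns a new list ordered by total size, largest first.
    static List<SizedSplit> largestFirst(List<SizedSplit> splits) {
        List<SizedSplit> sorted = new ArrayList<>(splits);
        sorted.sort((a, b) -> Long.compare(b.getLength(), a.getLength()));
        return sorted;
    }

    public static void main(String[] args) {
        List<SizedSplit> splits = Arrays.asList(
            new SizedSplit("small", 10L),
            new SizedSplit("big", 100L, 50L),
            new SizedSplit("mid", 60L));
        for (SizedSplit s : largestFirst(splits)) {
            System.out.println(s.name + " " + s.getLength());
        }
    }
}
```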

getStartOffsets

public long[] getStartOffsets()
Returns an array containing the start offsets of the files in the split


getLengths

public long[] getLengths()
Returns an array containing the lengths of the files in the split


getOffset

public long getOffset(int i)
Returns the start offset of the ith Path


getLength

public long getLength(int i)
Returns the length of the ith Path


getNumPaths

public int getNumPaths()
Returns the number of Paths in the split


getPath

public org.apache.hadoop.fs.Path getPath(int i)
Returns the ith Path


getPaths

public org.apache.hadoop.fs.Path[] getPaths()
Returns all the Paths in the split


getLocations

public String[] getLocations()
                      throws IOException
Returns the locations (node names) where this input split resides

Specified by:
getLocations in class InputSplit
Returns:
a new array of the node names.
Throws:
IOException

readFields

public void readFields(DataInput in)
                throws IOException
Specified by:
readFields in interface org.apache.hadoop.io.Writable
Throws:
IOException

write

public void write(DataOutput out)
           throws IOException
Specified by:
write in interface org.apache.hadoop.io.Writable
Throws:
IOException
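
The readFields and write pair can be pictured as a round trip through java.io.DataOutput and java.io.DataInput. The sketch below uses a hypothetical WritableSplitSketch class with String paths; the field order and encoding are illustrative assumptions, not Hadoop's actual wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in; the serialized layout is an assumption for illustration.
final class WritableSplitSketch {
    String[] paths = new String[0];
    long[] startOffsets = new long[0];
    long[] lengths = new long[0];

    // No-arg constructor: an empty instance is created first, then filled by readFields.
    WritableSplitSketch() {}

    WritableSplitSketch(String[] paths, long[] startOffsets, long[] lengths) {
        this.paths = paths.clone();
        this.startOffsets = startOffsets.clone();
        this.lengths = lengths.clone();
    }

    // Writes the chunk count followed by (path, offset, length) per chunk.
    void write(DataOutput out) throws IOException {
        out.writeInt(paths.length);
        for (int i = 0; i < paths.length; i++) {
            out.writeUTF(paths[i]);
            out.writeLong(startOffsets[i]);
            out.writeLong(lengths[i]);
        }
    }

    // Reads fields in exactly the order write emitted them.
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        paths = new String[n];
        startOffsets = new long[n];
        lengths = new long[n];
        for (int i = 0; i < n; i++) {
            paths[i] = in.readUTF();
            startOffsets[i] = in.readLong();
            lengths[i] = in.readLong();
        }
    }

    // Serializes to a byte buffer and deserializes into a fresh instance.
    static WritableSplitSketch roundTrip(WritableSplitSketch original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            WritableSplitSketch copy = new WritableSplitSketch();
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableSplitSketch copy = roundTrip(new WritableSplitSketch(
            new String[] {"/data/a.log", "/data/b.log"},
            new long[] {0L, 64L},
            new long[] {10L, 20L}));
        System.out.println(copy.paths.length + " chunks restored");
    }
}
```

The essential contract is symmetry: readFields must consume the same fields, in the same order, that write produced.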

toString

public String toString()
Overrides:
toString in class Object


Copyright © 2013 Apache Software Foundation. All Rights Reserved.