Class AccumuloRowInputFormat
- java.lang.Object
-
- org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
-
- org.apache.accumulo.hadoop.mapreduce.AccumuloRowInputFormat
-
public class AccumuloRowInputFormat extends org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
This class allows MapReduce jobs to use Accumulo as the source of data. This InputFormat provides row names as Text keys, and a corresponding PeekingIterator as a value, which in turn makes the Key/Value pairs for that row available to the Map function. Configure the job using the configure() method, which provides a fluent API. For example:

  AccumuloRowInputFormat.configure()
      .clientProperties(props)
      .table(name)                      // required
      .auths(auths)
      .addIterator(iter1)
      .ranges(ranges)
      .fetchColumns(columns)
      .executionHints(hints)
      .samplerConfiguration(sampleConf)
      .autoAdjustRanges(false)          // enabled by default
      .scanIsolation(true)              // not available with batchScan()
      .offlineScan(true)                // not available with batchScan()
      .store(job);

For descriptions of all options see InputFormatBuilder.InputFormatOptions.
- Since:
- 2.0
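As an illustration of the key/value pairing described above, a Mapper consuming this InputFormat receives one Text row name and one PeekingIterator per call. A minimal sketch (the class name, output types, and counting logic are assumptions for illustration, not part of the API; requires Hadoop and Accumulo on the classpath):

```java
import java.io.IOException;
import java.util.Map;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.util.PeekingIterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper: emits each row name with the number of
// Key/Value pairs the row contains.
public class RowSizeMapper
    extends Mapper<Text, PeekingIterator<Map.Entry<Key,Value>>, Text, IntWritable> {

  @Override
  protected void map(Text row, PeekingIterator<Map.Entry<Key,Value>> entries,
      Context context) throws IOException, InterruptedException {
    int count = 0;
    // The iterator yields every Key/Value pair belonging to this row.
    while (entries.hasNext()) {
      entries.next();
      count++;
    }
    context.write(row, new IntWritable(count));
  }
}
```

Because the iterator supports peek(), a mapper can also inspect the next entry without consuming it, which is useful when grouping columns within a row.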
-
-
Constructor Summary
Constructor
  AccumuloRowInputFormat()
-
Method Summary
Modifier and Type / Method / Description

static InputFormatBuilder.ClientParams<org.apache.hadoop.mapreduce.Job>
  configure()
  Sets all the information required for this MapReduce job.

org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
  createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)

List<org.apache.hadoop.mapreduce.InputSplit>
  getSplits(org.apache.hadoop.mapreduce.JobContext context)
  Gets the splits of the tables that have been set on the job by reading the metadata table for the specified ranges.
-
-
-
Method Detail
-
createRecordReader
public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
- Specified by:
createRecordReader
in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
-
getSplits
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException
Gets the splits of the tables that have been set on the job by reading the metadata table for the specified ranges.
- Specified by:
getSplits
in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
- Returns:
- the splits from the tables based on the ranges.
- Throws:
IOException
- if a table set on the job doesn't exist or an error occurs initializing the tablet locator
-
configure
public static InputFormatBuilder.ClientParams<org.apache.hadoop.mapreduce.Job> configure()
Sets all the information required for this MapReduce job.
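To show how configure() ties into a job, here is a minimal driver sketch. The instance name, ZooKeeper hosts, credentials, table name, and authorizations are all placeholder assumptions; substitute real values (requires Hadoop and Accumulo on the classpath):

```java
import java.util.Properties;

import org.apache.accumulo.core.client.Accumulo;
import org.apache.accumulo.core.security.Authorizations;
import org.apache.accumulo.hadoop.mapreduce.AccumuloRowInputFormat;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class RowScanDriver {
  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "row-scan");
    job.setInputFormatClass(AccumuloRowInputFormat.class);

    // Placeholder connection details for illustration only.
    Properties props = Accumulo.newClientProperties()
        .to("myinstance", "zk1:2181")
        .as("user", "password")
        .build();

    AccumuloRowInputFormat.configure()
        .clientProperties(props)
        .table("mytable")                      // required
        .auths(new Authorizations("public"))
        .store(job);                           // writes config into the Job

    // ... set the mapper, reducer, and output format here,
    // then submit with job.waitForCompletion(true).
  }
}
```

The store(job) call at the end of the fluent chain is what persists the settings into the Job's configuration; nothing is applied until it is invoked.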
-
-