Class AccumuloRowInputFormat

java.lang.Object
org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
org.apache.accumulo.hadoop.mapreduce.AccumuloRowInputFormat

public class AccumuloRowInputFormat extends org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
This class allows MapReduce jobs to use Accumulo as the source of data. This InputFormat supplies each row ID as a Text key and a corresponding PeekingIterator as the value, which in turn makes the Key/Value pairs for that row available to the map function. Configure the job using the configure() method, which provides a fluent API. For example:
 AccumuloRowInputFormat.configure().clientProperties(props).table(name) // required
     .auths(auths).addIterator(iter1).ranges(ranges).fetchColumns(columns).executionHints(hints)
     .samplerConfiguration(sampleConf).autoAdjustRanges(false) // enabled by default
     .scanIsolation(true) // not available with batchScan()
     .offlineScan(true) // not available with batchScan()
     .store(job);
 
For descriptions of all options, see InputFormatBuilder.InputFormatOptions.
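The record shape described above (one row ID as the key, an iterator over that row's Key/Value pairs as the value) can be illustrated with a minimal plain-Java sketch. It uses no Accumulo or Hadoop classes: CellKey, cell, sampleData, groupByRow, and countCells are hypothetical stand-ins invented for this illustration, and a plain java.util.Iterator takes the place of PeekingIterator. It only approximates what a map function would see and is not how the library implements row grouping.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only: CellKey and every method here are hypothetical
// stand-ins, not Accumulo or Hadoop classes.
final class RowGroupingSketch {

    /** Hypothetical stand-in for an Accumulo Key, reduced to row, family, and qualifier. */
    static final class CellKey {
        final String row;
        final String family;
        final String qualifier;

        CellKey(String row, String family, String qualifier) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /** Builds one key/value entry; values are plain Strings in this sketch. */
    static Map.Entry<CellKey, String> cell(String row, String family, String qualifier, String value) {
        return new SimpleImmutableEntry<>(new CellKey(row, family, qualifier), value);
    }

    /** Sample entries, already sorted by row (an assumption of this sketch). */
    static List<Map.Entry<CellKey, String>> sampleData() {
        return Arrays.asList(
            cell("row1", "attr", "name", "alice"),
            cell("row1", "attr", "zip", "12345"),
            cell("row2", "attr", "name", "bob"));
    }

    /** Groups row-sorted entries into one list per row, preserving the input order. */
    static Map<String, List<Map.Entry<CellKey, String>>> groupByRow(
            List<Map.Entry<CellKey, String>> sortedCells) {
        Map<String, List<Map.Entry<CellKey, String>>> rows = new LinkedHashMap<>();
        for (Map.Entry<CellKey, String> entry : sortedCells) {
            rows.computeIfAbsent(entry.getKey().row, r -> new ArrayList<>()).add(entry);
        }
        return rows;
    }

    /** Example of per-row work a map function might do: count the cells in one row. */
    static int countCells(Iterator<Map.Entry<CellKey, String>> rowCells) {
        int count = 0;
        while (rowCells.hasNext()) {
            rowCells.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Each (row, cells) pair plays the role of one record handed to the map function.
        groupByRow(sampleData()).forEach((row, cells) ->
            System.out.println(row + ": " + countCells(cells.iterator()) + " cell(s)"));
    }
}
```

The point of the sketch is that the map function is invoked once per row rather than once per Key/Value pair, so any per-row logic can consume the whole row through the iterator it receives.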
Since:
2.0
  • Constructor Details

    • AccumuloRowInputFormat

      public AccumuloRowInputFormat()
  • Method Details

    • createRecordReader

      public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
      Specified by:
      createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
    • getSplits

      public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context) throws IOException
      Gets the splits of the tables that have been set on the job by reading the metadata table for the specified ranges.
      Specified by:
      getSplits in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,PeekingIterator<Map.Entry<Key,Value>>>
      Returns:
      the splits from the tables based on the ranges.
      Throws:
      IOException - if a table set on the job doesn't exist or an error occurs initializing the tablet locator
    • configure

      public static InputFormatBuilder.ClientParams<org.apache.hadoop.mapreduce.Job> configure()
      Returns a fluent builder used to set all the information required for this MapReduce job.