java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<K,V>
  - org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable>
    - com.yahoo.vespa.hadoop.mapreduce.VespaSimpleJsonInputFormat

```
public class VespaSimpleJsonInputFormat
extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable>
```
Simple JSON reader which splits the input file along JSON object boundaries. Two cases are handled:

1. Each line contains a JSON object, i.e. { ... }
2. The file contains an array of objects with arbitrary line breaks, i.e. [ {...}, {...} ]

Not suitable for cases where you want to extract objects from some other arbitrary structure.

TODO: Support a config option that points to an array in the JSON as the starting point for object extraction, similar to how it is done in VespaHttpClient.parseResultJson, i.e. support rootNode config.
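To make the two supported layouts concrete, here is a minimal, self-contained sketch of splitting text along top-level JSON object boundaries by tracking brace depth. It is an illustration only, not the reader's actual implementation, which this page does not describe: the class `JsonObjectSplitter` and the method `splitObjects` are hypothetical names, and the sketch works on an in-memory string rather than a Hadoop input split.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of the Vespa Hadoop library.
public class JsonObjectSplitter {

    /** Returns each top-level JSON object found in the input as its own string. */
    public static List<String> splitObjects(String input) {
        List<String> objects = new ArrayList<>();
        int depth = 0;            // current nesting depth of '{' ... '}'
        int start = -1;           // index where the current top-level object began
        boolean inString = false; // true while inside a JSON string literal
        boolean escaped = false;  // true directly after a backslash inside a string

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inString) {
                // Braces inside string literals must not affect the depth.
                if (escaped) {
                    escaped = false;
                } else if (c == '\\') {
                    escaped = true;
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') {
                inString = true;
            } else if (c == '{') {
                if (depth == 0) {
                    start = i;
                }
                depth++;
            } else if (c == '}' && depth > 0) {
                depth--;
                if (depth == 0) {
                    // A top-level object just closed: emit it as one record.
                    objects.add(input.substring(start, i + 1));
                    start = -1;
                }
            }
            // Everything else outside objects ('[', ',', ']', whitespace) is skipped.
        }
        return objects;
    }

    public static void main(String[] args) {
        // Case 1: one JSON object per line.
        String lines = "{\"id\": 1}\n{\"id\": 2}\n";
        // Case 2: an array of objects with arbitrary line breaks.
        String array = "[ {\"id\": 1,\n  \"tags\": {\"a\": \"}\"}},\n  {\"id\": 2} ]";

        splitObjects(lines).forEach(System.out::println);
        splitObjects(array).forEach(System.out::println);
    }
}
```

Because the sketch skips everything outside a top-level object, the same loop covers both layouts. It also shows why input with objects nested under some other structure is a poor fit: the enclosing object would be returned as a single record.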

Author: lesters

Nested Class Summary

Nested Classes
Modifier and Type	Class	Description
`static class`	`VespaSimpleJsonInputFormat.VespaJsonRecordReader`
- Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
  org.apache.hadoop.mapreduce.lib.input.FileInputFormat.Counter

Field Summary
- Fields inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
  DEFAULT_LIST_STATUS_NUM_THREADS, INPUT_DIR, INPUT_DIR_RECURSIVE, LIST_STATUS_NUM_THREADS, NUM_INPUT_FILES, PATHFILTER_CLASS, SPLIT_MAXSIZE, SPLIT_MINSIZE

Constructor Summary

Constructors
Constructor Description

VespaSimpleJsonInputFormat()

Method Summary

Modifier and Type	Method	Description
`org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable>`	`createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)`

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, getSplits, isSplitable, listStatus, makeSplit, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Detail
- VespaSimpleJsonInputFormat
```
public VespaSimpleJsonInputFormat()
```

Method Detail

createRecordReader

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context)
        throws java.io.IOException, java.lang.InterruptedException

Specified by: createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.Text,org.apache.hadoop.io.NullWritable>
Throws: java.io.IOException, java.lang.InterruptedException
