All Implemented Interfaces:
- org.apache.hadoop.mapred.InputFormat<java.nio.ByteBuffer,java.util.SortedMap<CellName,Cell>>
public class ColumnFamilyInputFormat
extends AbstractColumnFamilyInputFormat<java.nio.ByteBuffer,java.util.SortedMap<CellName,Cell>>
Hadoop InputFormat allowing map/reduce against Cassandra rows within one ColumnFamily.
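The key/value type parameters above are exactly what each map() call receives: the row key as a ByteBuffer, and that row's columns as a SortedMap<CellName,Cell>. Below is a minimal mapper sketch, assuming Cassandra 2.1-era classes (org.apache.cassandra.db.Cell, org.apache.cassandra.db.composites.CellName) and the mapreduce-style Hadoop API; the class name and output types are illustrative only:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.Cell;
import org.apache.cassandra.db.composites.CellName;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical mapper counting rows; the generics mirror the InputFormat's
// key/value types: the row key comes in as the key, the row's columns as the value.
public class RowCountMapper extends Mapper<ByteBuffer, SortedMap<CellName, Cell>, Text, LongWritable>
{
    private static final Text KEY = new Text("rows");
    private static final LongWritable ONE = new LongWritable(1);

    @Override
    protected void map(ByteBuffer rowKey, SortedMap<CellName, Cell> columns, Context context)
            throws IOException, InterruptedException
    {
        // Each call to map() sees exactly one Cassandra row; a real job would
        // inspect columns.values() for the cells selected by the slice predicate.
        context.write(KEY, ONE);
    }
}
```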
At minimum, you need to set the column family and a predicate (a description of the columns to extract from each row) in your Hadoop job Configuration. The ConfigHelper class is provided to make this simple:
- ConfigHelper.setInputColumnFamily
- ConfigHelper.setInputSlicePredicate
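As a sketch of that minimal setup, assuming the Thrift-era org.apache.cassandra.hadoop and org.apache.cassandra.thrift classes and a Hadoop 2.x-style Job; the keyspace/column family names, contact address, and partitioner are placeholders for your cluster:

```java
import java.nio.ByteBuffer;

import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CassandraInputSetup
{
    public static Job buildJob(Configuration conf) throws Exception
    {
        // Required: which column family to read. Names are placeholders.
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyColumnFamily");

        // Required: which columns to extract from each row. This predicate
        // selects every column; narrow the range (or name columns explicitly)
        // to fetch less data per row.
        SliceRange allColumns = new SliceRange(
                ByteBuffer.wrap(new byte[0]),  // start of range (unbounded)
                ByteBuffer.wrap(new byte[0]),  // end of range (unbounded)
                false,                         // not reversed
                Integer.MAX_VALUE);            // max columns per row
        ConfigHelper.setInputSlicePredicate(conf,
                new SlicePredicate().setSlice_range(allColumns));

        // Also needed at runtime so input splits can be computed: a contact
        // point and the cluster's partitioner (values here are placeholders).
        ConfigHelper.setInputInitialAddress(conf, "127.0.0.1");
        ConfigHelper.setInputPartitioner(conf, "Murmur3Partitioner");

        Job job = Job.getInstance(conf, "cassandra-input-example");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        return job;
    }
}
```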
You can also configure the number of rows per InputSplit with ConfigHelper.setInputSplitSize.
This should be "as big as possible, but no bigger." Each InputSplit is read from Cassandra with multiple get_range_slices calls, and the per-call overhead of get_range_slices is high, so larger split sizes are better -- but if the split is too large, the map task will run out of memory.
The default split size is 64k rows.
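As a concrete illustration of that trade-off, the sketch below raises the limit to 256k rows per split; the figure is arbitrary, chosen only to show the call, and conf is the same Hadoop Configuration used in the setup above:

```java
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;

public class SplitSizeTuning
{
    public static void tune(Configuration conf)
    {
        // 256k rows per split (vs. the 64k default): fewer, larger splits
        // amortize the per-call query overhead across more rows, but each
        // map task must have enough memory to buffer its batches of rows.
        ConfigHelper.setInputSplitSize(conf, 256 * 1024);
    }
}
```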