- public abstract class AbstractColumnFamilyOutputFormat<K,Y>
extends org.apache.hadoop.mapreduce.OutputFormat<K,Y>
implements org.apache.hadoop.mapred.OutputFormat<K,Y>
ColumnFamilyOutputFormat acts as a Hadoop-specific OutputFormat that allows reduce tasks to store keys (and corresponding values) as Cassandra rows (and respective columns) in a given ColumnFamily.
As is the case with the ColumnFamilyInputFormat, you need to set the Keyspace and ColumnFamily in your Hadoop job Configuration. The ConfigHelper class, through its ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String) method, is provided to make this simple.
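For illustration, a minimal job-setup sketch follows. The keyspace and column family names, the job name, and the use of ConfigHelper.setOutputKeyspace alongside the setOutputColumnFamily overload named above are assumptions for the example, not details taken from this page.

```java
import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CassandraOutputJobSetup
{
    public static Job configure() throws Exception
    {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "write-to-cassandra"); // hypothetical job name

        // Route reducer output through the Cassandra output format.
        job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

        // Placeholder keyspace/column family names; setOutputKeyspace is assumed
        // to be available alongside the setOutputColumnFamily overload above.
        ConfigHelper.setOutputKeyspace(job.getConfiguration(), "MyKeyspace");
        ConfigHelper.setOutputColumnFamily(job.getConfiguration(), "MyColumnFamily");

        return job;
    }
}
```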
For the sake of performance, this class employs a lazy write-back caching mechanism: its record writer batches the mutations generated from the reducer's inputs in a task-specific map, and periodically makes the changes official by sending a batch mutate request to Cassandra.
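For illustration, a hedged sketch of tuning this batching through the two configuration constants listed in the field summary below; the numeric values are arbitrary examples, and treating the constants as keys set directly on the job Configuration is an assumption about how they are consumed.

```java
import org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat;
import org.apache.hadoop.conf.Configuration;

public class BatchTuning
{
    // Illustrative values only; both keys are assumed to be read by the
    // record writer's write-back cache described above.
    public static void tune(Configuration conf)
    {
        conf.set(AbstractColumnFamilyOutputFormat.BATCH_THRESHOLD, "64");  // mutations sent per batch mutate request
        conf.set(AbstractColumnFamilyOutputFormat.QUEUE_SIZE, "256");      // mutations queued per task before the writer blocks
    }
}
```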
Modifier and Type | Field and Description |
---|---|
static java.lang.String | BATCH_THRESHOLD |
static java.lang.String | QUEUE_SIZE |
Constructor and Description |
---|
AbstractColumnFamilyOutputFormat() |
Modifier and Type | Method and Description |
---|---|
protected void | checkOutputSpecs(org.apache.hadoop.conf.Configuration conf) |
void | checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job) Deprecated. |
void | checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context) Check for validity of the output-specification for the job. |
static org.apache.cassandra.thrift.Cassandra.Client | createAuthenticatedClient(java.lang.String host, int port, org.apache.hadoop.conf.Configuration conf) Connects to the given server:port and returns a client based on the given socket that points to the configured keyspace, and is logged in with the configured credentials. |
org.apache.hadoop.mapreduce.OutputCommitter | getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) The OutputCommitter for this format does not write any data to the DFS. |
public static final java.lang.String BATCH_THRESHOLD
public static final java.lang.String QUEUE_SIZE
public void checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)
protected void checkOutputSpecs(org.apache.hadoop.conf.Configuration conf)
@Deprecated public void checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job) throws java.io.IOException
public org.apache.hadoop.mapreduce.OutputCommitter getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws java.io.IOException, java.lang.InterruptedException
public static org.apache.cassandra.thrift.Cassandra.Client createAuthenticatedClient(java.lang.String host, int port, org.apache.hadoop.conf.Configuration conf) throws java.lang.Exception
Parameters:
host - fully qualified host name to connect to
port - RPC port of the server
conf - a job configuration
Throws:
java.lang.Exception - the set of thrown exceptions may be implementation-defined, depending on the used transport factory
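For illustration, a hedged sketch of calling createAuthenticatedClient directly; the host name and port are placeholders, and the Configuration is assumed to already carry the output keyspace and any credentials (for example set through ConfigHelper).

```java
import org.apache.cassandra.hadoop.AbstractColumnFamilyOutputFormat;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.hadoop.conf.Configuration;

public class ClientExample
{
    public static Cassandra.Client connect(Configuration conf) throws Exception
    {
        // Placeholder host and port; conf is assumed to already hold the
        // configured keyspace and login credentials.
        return AbstractColumnFamilyOutputFormat.createAuthenticatedClient("cassandra-node1", 9160, conf);
    }
}
```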