ColumnFamilyOutputFormat (apache-cassandra API)

org.apache.cassandra.hadoop
Class ColumnFamilyOutputFormat

java.lang.Object
  org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>
      org.apache.cassandra.hadoop.ColumnFamilyOutputFormat

All Implemented Interfaces: org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>

public class ColumnFamilyOutputFormat
extends org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>
implements org.apache.hadoop.mapred.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>

The ColumnFamilyOutputFormat acts as a Hadoop-specific OutputFormat that allows reduce tasks to store keys (and corresponding values) as Cassandra rows (and respective columns) in a given ColumnFamily.

As is the case with the ColumnFamilyInputFormat, you need to set the Keyspace and ColumnFamily in your Hadoop job Configuration. The ConfigHelper class, through its ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String, java.lang.String) method, is provided to make this simple.
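The job setup described above might look roughly like the following sketch. It assumes the Hadoop and Cassandra jars are on the classpath; the class name `OutputJobSetup`, the job name, and the keyspace and column family names are hypothetical placeholders. A real job will likely need further connection settings that this page does not describe, and they are omitted here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.hadoop.ColumnFamilyOutputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Hypothetical helper class, for illustration only.
class OutputJobSetup {
    static Job configure(Configuration conf) throws IOException {
        // This constructor is deprecated in later Hadoop releases.
        Job job = new Job(conf, "cassandra-output-example");

        // Route reducer output through ColumnFamilyOutputFormat.
        job.setOutputFormatClass(ColumnFamilyOutputFormat.class);

        // Key and value types match the class declaration:
        // ByteBuffer row keys and List<Mutation> values.
        job.setOutputKeyClass(ByteBuffer.class);
        job.setOutputValueClass(List.class);

        // Keyspace and column family names are placeholders (assumptions).
        ConfigHelper.setOutputColumnFamily(
                job.getConfiguration(), "ExampleKeyspace", "ExampleColumnFamily");
        return job;
    }
}
```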

For performance, this class uses a lazy write-back caching mechanism: its record writer collects the mutations generated from the reduce inputs in a task-specific map, and periodically applies them by sending a batch mutate request to Cassandra.
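The write-back idea can be illustrated with a minimal, JDK-only sketch. This is not the actual `ColumnFamilyRecordWriter`: the class `LazyBatchWriter`, its threshold parameter, and the `sink` callback (standing in for the batch mutate request) are all hypothetical, and the real writer's flushing policy may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration of lazy write-back batching; not Cassandra code.
class LazyBatchWriter<K, V> {
    private final int batchThreshold;              // flush once this many values are pending
    private final Consumer<Map<K, List<V>>> sink;  // stands in for a batch mutate request
    private Map<K, List<V>> pending = new LinkedHashMap<>();
    private int pendingCount = 0;

    LazyBatchWriter(int batchThreshold, Consumer<Map<K, List<V>>> sink) {
        if (batchThreshold < 1) {
            throw new IllegalArgumentException("batchThreshold must be >= 1");
        }
        this.batchThreshold = batchThreshold;
        this.sink = sink;
    }

    // Buffer the values under their key; send only when the threshold is reached.
    void write(K key, List<V> values) {
        pending.computeIfAbsent(key, k -> new ArrayList<>()).addAll(values);
        pendingCount += values.size();
        if (pendingCount >= batchThreshold) {
            flush();
        }
    }

    // Hand the buffered map to the sink as one batch and start a fresh buffer.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<K, List<V>> batch = pending;
        pending = new LinkedHashMap<>();
        pendingCount = 0;
        sink.accept(batch);
    }

    // Flush whatever is still buffered when the task finishes.
    void close() {
        flush();
    }

    public static void main(String[] args) {
        List<Map<String, List<String>>> sent = new ArrayList<>();
        LazyBatchWriter<String, String> writer =
                new LazyBatchWriter<String, String>(3, batch -> sent.add(batch));
        writer.write("row1", Arrays.asList("a", "b")); // 2 pending, nothing sent yet
        writer.write("row2", Arrays.asList("c"));      // 3 pending, first batch sent
        writer.write("row1", Arrays.asList("d"));      // 1 pending
        writer.close();                                // remainder sent as second batch
        System.out.println(sent.size() + " batches: " + sent);
    }
}
```

Grouping by key before sending means several writes to the same row travel in one request, which is the main saving a write-back buffer offers over issuing one request per record.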

Nested Class Summary
`static class`	`ColumnFamilyOutputFormat.NullOutputCommitter` An `OutputCommitter` that does nothing.

Field Summary
`static java.lang.String`	`BATCH_THRESHOLD`
`static java.lang.String`	`QUEUE_SIZE`

Constructor Summary
`ColumnFamilyOutputFormat()`

Method Summary
`void`	`checkOutputSpecs(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job)` Deprecated.
`void`	`checkOutputSpecs(org.apache.hadoop.mapreduce.JobContext context)` Check for validity of the output-specification for the job.
`static org.apache.cassandra.thrift.Cassandra.Client`	`createAuthenticatedClient(org.apache.thrift.transport.TSocket socket, org.apache.hadoop.conf.Configuration conf)` Return a client based on the given socket that points to the configured keyspace, and is logged in with the configured credentials.
`org.apache.hadoop.mapreduce.OutputCommitter`	`getOutputCommitter(org.apache.hadoop.mapreduce.TaskAttemptContext context)` The OutputCommitter for this format does not write any data to the DFS.
`org.apache.cassandra.hadoop.ColumnFamilyRecordWriter`	`getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress)` Deprecated.
`org.apache.cassandra.hadoop.ColumnFamilyRecordWriter`	`getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context)` Get the `RecordWriter` for the given task.

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

BATCH_THRESHOLD

public static final java.lang.String BATCH_THRESHOLD

See Also: Constant Field Values

QUEUE_SIZE

public static final java.lang.String QUEUE_SIZE

See Also: Constant Field Values

Constructor Detail