public class ColumnFamilyOutputFormat extends AbstractColumnFamilyOutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>
ColumnFamilyOutputFormat acts as a Hadoop-specific OutputFormat that allows reduce tasks to store keys (and corresponding values) as Cassandra rows (and respective columns) in a given ColumnFamily.
As is the case with the ColumnFamilyInputFormat, you need to set the Keyspace and ColumnFamily in your Hadoop job Configuration. The ConfigHelper class, through its ConfigHelper.setOutputColumnFamily(org.apache.hadoop.conf.Configuration, java.lang.String) method, is provided to make this simple.
For performance, this class uses a lazy write-back caching mechanism: its record writer batches the mutations created from the reduce's inputs (in a task-specific map) and periodically commits them by sending a batch mutate request to Cassandra.
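The write-back batching described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the ColumnFamilyRecordWriter implementation: the class `BatchingWriterSketch`, its `batchThreshold` parameter, and the use of strings in place of row keys and Mutation objects are all hypothetical, and the "batch mutate request" is simulated by appending to an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of lazy write-back batching; names are illustrative only.
class BatchingWriterSketch {
    // Assumed flush trigger: number of buffered mutations that forces a send.
    private final int batchThreshold;
    // Task-local buffer: row key -> mutations waiting to be sent.
    private final Map<String, List<String>> pending = new LinkedHashMap<>();
    private int pendingCount = 0;
    // Stand-in for the remote store: each element is one simulated batch request.
    private final List<Map<String, List<String>>> sentBatches = new ArrayList<>();

    BatchingWriterSketch(int batchThreshold) {
        if (batchThreshold < 1) {
            throw new IllegalArgumentException("batchThreshold must be >= 1");
        }
        this.batchThreshold = batchThreshold;
    }

    // Buffer the mutations for a row; send a batch once the threshold is reached.
    void write(String rowKey, List<String> mutations) {
        if (mutations.isEmpty()) {
            return;
        }
        pending.computeIfAbsent(rowKey, k -> new ArrayList<>()).addAll(mutations);
        pendingCount += mutations.size();
        if (pendingCount >= batchThreshold) {
            flush();
        }
    }

    // Send everything buffered so far as one batch, then reset the buffer.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sentBatches.add(new LinkedHashMap<>(pending));
        pending.clear();
        pendingCount = 0;
    }

    // Closing the writer must flush, or the tail of the output would be lost.
    void close() {
        flush();
    }

    List<Map<String, List<String>>> sentBatches() {
        return sentBatches;
    }

    public static void main(String[] args) {
        BatchingWriterSketch writer = new BatchingWriterSketch(3);
        writer.write("row1", Arrays.asList("a", "b")); // buffered only
        writer.write("row2", Arrays.asList("c"));      // threshold reached, batch sent
        writer.write("row1", Arrays.asList("d"));      // buffered only
        writer.close();                                 // remaining mutation sent
        System.out.println("batches sent: " + writer.sentBatches().size());
    }
}
```

The point of the design is that individual writes stay cheap while the number of round trips is bounded by the threshold. The cost is that buffered mutations are not visible in the store until a flush happens, which is why closing the writer flushes unconditionally.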
Fields inherited from class AbstractColumnFamilyOutputFormat: BATCH_THRESHOLD, QUEUE_SIZE
Constructor and Description |
---|
ColumnFamilyOutputFormat() |
Modifier and Type | Method and Description |
---|---|
org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress) Deprecated. |
org.apache.cassandra.hadoop.ColumnFamilyRecordWriter | getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context) Get the RecordWriter for the given task. |
Methods inherited from class AbstractColumnFamilyOutputFormat: checkOutputSpecs, checkOutputSpecs, checkOutputSpecs, createAuthenticatedClient, getOutputCommitter, login
@Deprecated public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.fs.FileSystem filesystem, org.apache.hadoop.mapred.JobConf job, java.lang.String name, org.apache.hadoop.util.Progressable progress)
public org.apache.cassandra.hadoop.ColumnFamilyRecordWriter getRecordWriter(org.apache.hadoop.mapreduce.TaskAttemptContext context) throws java.lang.InterruptedException
Get the RecordWriter for the given task.
Specified by: getRecordWriter in class org.apache.hadoop.mapreduce.OutputFormat<java.nio.ByteBuffer,java.util.List<org.apache.cassandra.thrift.Mutation>>
Parameters: context - the information about the current task.
Returns: a RecordWriter to write the output for the job.
Throws: java.io.IOException
java.lang.InterruptedException
Copyright © 2015 The Apache Software Foundation