org.apache.accumulo.core.client.mapreduce.lib.util
Class FileOutputConfigurator

java.lang.Object
  extended by org.apache.accumulo.core.client.mapreduce.lib.util.ConfiguratorBase
      extended by org.apache.accumulo.core.client.mapreduce.lib.util.FileOutputConfigurator

public class FileOutputConfigurator
extends ConfiguratorBase

Since:
1.5.0

Nested Class Summary
static class FileOutputConfigurator.Opts
          Configuration keys for AccumuloConfiguration.
 
Nested classes/interfaces inherited from class org.apache.accumulo.core.client.mapreduce.lib.util.ConfiguratorBase
ConfiguratorBase.ConnectorInfo, ConfiguratorBase.GeneralOpts, ConfiguratorBase.InstanceOpts
 
Constructor Summary
FileOutputConfigurator()
           
 
Method Summary
static AccumuloConfiguration getAccumuloConfiguration(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf)
          Returns an AccumuloConfiguration object built from the Accumulo defaults, with any Accumulo properties stored in the Job's configuration applied as overrides.
protected static Boolean isSupportedAccumuloProperty(Property property)
          Checks whether the given property is one of the supported Accumulo properties that this OutputFormat sets to change the behavior of the RecordWriter.
These properties correspond to the public static setter methods of this class.
static void setCompressionType(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, String compressionType)
          Sets the compression type to use for data blocks.
static void setDataBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long dataBlockSize)
          Sets the size for data blocks within each file.
A data block is a span of key/value pairs stored in the file that is compressed and indexed as a group.
static void setFileBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long fileBlockSize)
          Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
static void setIndexBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long indexBlockSize)
          Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one.
static void setReplication(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, int replication)
          Sets the file system replication factor for the resulting file, overriding the file system default.
 
Methods inherited from class org.apache.accumulo.core.client.mapreduce.lib.util.ConfiguratorBase
enumToConfKey, getInstance, getLogLevel, getPrincipal, getToken, getTokenClass, isConnectorInfoSet, setConnectorInfo, setLogLevel, setMockInstance, setZooKeeperInstance
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileOutputConfigurator

public FileOutputConfigurator()

Method Detail

isSupportedAccumuloProperty

protected static Boolean isSupportedAccumuloProperty(Property property)
Checks whether the given property is one of the supported Accumulo properties that this OutputFormat sets to change the behavior of the RecordWriter.
These properties correspond to the public static setter methods of this class.

Parameters:
property - the Accumulo property to check
Since:
1.5.0

getAccumuloConfiguration

public static AccumuloConfiguration getAccumuloConfiguration(Class<?> implementingClass,
                                                             org.apache.hadoop.conf.Configuration conf)
Returns an AccumuloConfiguration object built from the Accumulo defaults, with any Accumulo properties stored in the Job's configuration applied as overrides.

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object from which the stored properties are read
Since:
1.5.0
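
The layering described for getAccumuloConfiguration (defaults first, then overrides stored in the Job's configuration under a class-specific prefix) can be sketched with plain maps. This is only an illustration and does not use the real AccumuloConfiguration or Hadoop Configuration classes. The class name LayeredConfigSketch, the key format "<SimpleClassName>.<property>", and the sample property values are assumptions, not the library's actual key layout.

```java
import java.util.HashMap;
import java.util.Map;

public class LayeredConfigSketch {

    // Hypothetical key format: "<SimpleClassName>.<property>".
    // The prefix that the real configurator builds may look different.
    static String prefixedKey(Class<?> implementingClass, String property) {
        return implementingClass.getSimpleName() + "." + property;
    }

    // Copy the defaults, then apply any override found under the class-specific prefix.
    static Map<String, String> effectiveConfig(Class<?> implementingClass,
                                               Map<String, String> defaults,
                                               Map<String, String> jobConf) {
        Map<String, String> result = new HashMap<>(defaults);
        for (String property : defaults.keySet()) {
            String override = jobConf.get(prefixedKey(implementingClass, property));
            if (override != null) {
                result.put(property, override);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample defaults, chosen for illustration only.
        Map<String, String> defaults = new HashMap<>();
        defaults.put("table.file.compress.type", "gz");
        defaults.put("table.file.replication", "0");

        // Stand-in for the Job's configuration, holding a single override.
        Map<String, String> jobConf = new HashMap<>();
        jobConf.put(prefixedKey(LayeredConfigSketch.class, "table.file.compress.type"), "snappy");

        Map<String, String> effective =
            effectiveConfig(LayeredConfigSketch.class, defaults, jobConf);
        System.out.println(effective.get("table.file.compress.type")); // overridden value
        System.out.println(effective.get("table.file.replication"));   // default value
    }
}
```

Because the prefix is derived from implementingClass, two output formats writing to the same job configuration would not overwrite each other's settings in this model.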

setCompressionType

public static void setCompressionType(Class<?> implementingClass,
                                      org.apache.hadoop.conf.Configuration conf,
                                      String compressionType)
Sets the compression type to use for data blocks. Specifying a compression type may require additional libraries to be available to your Job.

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object to configure
compressionType - one of "none", "gz", "lzo", or "snappy"
Since:
1.5.0
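
Since compressionType is passed as a plain String, a caller may want to check the value before configuring a job. The sketch below tests a name against the four values listed in the parameter description. The class CompressionTypeCheck is hypothetical, and whether setCompressionType itself validates its argument is not stated here, so treat this as an optional guard and not as a description of the library's behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class CompressionTypeCheck {

    // The four names listed in the compressionType parameter description.
    static final Set<String> DOCUMENTED_TYPES = Collections.unmodifiableSet(
        new HashSet<>(Arrays.asList("none", "gz", "lzo", "snappy")));

    // Exact, case-sensitive match against the documented names.
    static boolean isDocumentedType(String compressionType) {
        return compressionType != null && DOCUMENTED_TYPES.contains(compressionType);
    }

    public static void main(String[] args) {
        System.out.println(isDocumentedType("gz"));    // true
        System.out.println(isDocumentedType("bzip2")); // false
    }
}
```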

setDataBlockSize

public static void setDataBlockSize(Class<?> implementingClass,
                                    org.apache.hadoop.conf.Configuration conf,
                                    long dataBlockSize)
Sets the size for data blocks within each file.
A data block is a span of key/value pairs stored in the file that is compressed and indexed as a group.

Making this value smaller may increase seek performance, but at the cost of increasing the size of the indexes (which can also affect seek performance).

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object to configure
dataBlockSize - the block size, in bytes
Since:
1.5.0
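
The trade-off between data block size and index size can be made concrete with a rough count. The sketch below assumes one index entry per data block and blocks that are filled exactly to dataBlockSize; it ignores compression and the fact that real blocks end on key/value boundaries. The class DataBlockEstimate and its method are hypothetical helpers, not part of Accumulo.

```java
public class DataBlockEstimate {

    // Number of data blocks needed to hold dataBytes, rounded up.
    // Under the one-entry-per-block assumption this is also the index entry count.
    static long dataBlockCount(long dataBytes, long dataBlockSize) {
        if (dataBytes < 0 || dataBlockSize <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and block size positive");
        }
        long fullBlocks = dataBytes / dataBlockSize;
        return dataBytes % dataBlockSize == 0 ? fullBlocks : fullBlocks + 1;
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024L * 1024L;
        // Shrinking the block size by 10x grows the number of index entries by about 10x.
        System.out.println(dataBlockCount(oneGiB, 100L * 1024L)); // 10486
        System.out.println(dataBlockCount(oneGiB, 10L * 1024L));  // 104858
    }
}
```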

setFileBlockSize

public static void setFileBlockSize(Class<?> implementingClass,
                                    org.apache.hadoop.conf.Configuration conf,
                                    long fileBlockSize)
Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object to configure
fileBlockSize - the block size, in bytes
Since:
1.5.0

setIndexBlockSize

public static void setIndexBlockSize(Class<?> implementingClass,
                                     org.apache.hadoop.conf.Configuration conf,
                                     long indexBlockSize)
Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one. This can affect the performance of queries.

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object to configure
indexBlockSize - the block size, in bytes
Since:
1.5.0
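
To see why smaller index blocks lead to a deeper hierarchy, consider a simplified model in which every index block holds the same number of fixed-size entries and each level of the index points at the blocks of the level below. The sketch counts how many levels are needed before a single root block remains. The class IndexDepthEstimate, the fixed average entry size, and the uniform fan-out are all assumptions made for illustration; the actual file format may organize its index differently.

```java
public class IndexDepthEstimate {

    // Ceiling division for positive operands.
    static long ceilDiv(long a, long b) {
        return a / b + (a % b == 0 ? 0 : 1);
    }

    // Estimated number of index levels for the given number of data blocks.
    static int indexLevels(long dataBlocks, long indexBlockSize, long avgIndexEntryBytes) {
        if (dataBlocks < 1 || indexBlockSize < 1 || avgIndexEntryBytes < 1) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        long fanout = indexBlockSize / avgIndexEntryBytes; // entries per index block
        if (fanout < 2) {
            throw new IllegalArgumentException("an index block must hold at least two entries");
        }
        int levels = 1;
        long blocksAtLevel = ceilDiv(dataBlocks, fanout); // index blocks pointing at data blocks
        while (blocksAtLevel > 1) {
            // Add a level whose entries point at the index blocks below it.
            blocksAtLevel = ceilDiv(blocksAtLevel, fanout);
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        // 1000 data blocks, 100-byte index entries (hypothetical figures).
        System.out.println(indexLevels(1000, 1_000, 100));   // fan-out 10   -> 3
        System.out.println(indexLevels(1000, 10_000, 100));  // fan-out 100  -> 2
        System.out.println(indexLevels(1000, 100_000, 100)); // fan-out 1000 -> 1
    }
}
```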

setReplication

public static void setReplication(Class<?> implementingClass,
                                  org.apache.hadoop.conf.Configuration conf,
                                  int replication)
Sets the file system replication factor for the resulting file, overriding the file system default.

Parameters:
implementingClass - the class whose name will be used as a prefix for the property configuration key
conf - the Hadoop configuration object to configure
replication - the number of replicas for produced files
Since:
1.5.0
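
When choosing a replication factor it can help to estimate the raw storage the produced files will occupy. The sketch below assumes that every replica is a full copy of the file, which holds for plain block replication but not for schemes such as erasure coding. The class ReplicationFootprint is hypothetical, and it only accepts explicit factors of 1 or more; how the library treats other values is not covered here.

```java
public class ReplicationFootprint {

    // Raw bytes consumed across the cluster when each replica is a full copy.
    static long rawBytes(long fileBytes, int replication) {
        if (fileBytes < 0 || replication < 1) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and replication >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(fileBytes, (long) replication);
    }

    public static void main(String[] args) {
        long fileBytes = 512L * 1024L * 1024L; // a 512 MiB output file
        System.out.println(rawBytes(fileBytes, 1));
        System.out.println(rawBytes(fileBytes, 3));
    }
}
```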


Copyright © 2013 Apache Accumulo Project. All Rights Reserved.