Class FileOutputConfigurator

java.lang.Object
org.apache.accumulo.hadoopImpl.mapreduce.lib.ConfiguratorBase
org.apache.accumulo.hadoopImpl.mapreduce.lib.FileOutputConfigurator

public class FileOutputConfigurator extends ConfiguratorBase
Since:
1.6.0
  • Constructor Details

    • FileOutputConfigurator

      public FileOutputConfigurator()
  • Method Details

    • isSupportedAccumuloProperty

      protected static Boolean isSupportedAccumuloProperty(Property property)
      Checks whether the given Accumulo property is one of the properties that this OutputFormat sets to change the behavior of the RecordWriter.
      The supported properties correspond to the public static setter methods available in this class.
      Parameters:
      property - the Accumulo property to check
      Since:
      1.6.0
    • getAccumuloConfiguration

      public static AccumuloConfiguration getAccumuloConfiguration(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf)
      This helper method provides an AccumuloConfiguration object constructed from the Accumulo defaults, and overridden with Accumulo properties that have been stored in the Job's configuration.
      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to read the stored properties from
      Since:
      1.6.0
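      The "defaults, then overrides stored under a class-specific prefix" behavior described above can be modeled with plain maps. This is a minimal sketch, not the Accumulo implementation: the class ConfigOverlaySketch, the prefix layout in prefixFor, and the property names are all hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfigOverlaySketch {

    // Hypothetical key prefix; the real key layout is an internal detail of Accumulo.
    static String prefixFor(Class<?> implementingClass) {
        return implementingClass.getSimpleName() + ".accumuloProperties.";
    }

    // Start from the defaults, then let any prefixed job entries override them.
    static Map<String, String> effectiveProperties(Class<?> implementingClass,
            Map<String, String> defaults, Map<String, String> jobConf) {
        Map<String, String> effective = new HashMap<>(defaults);
        String prefix = prefixFor(implementingClass);
        for (Map.Entry<String, String> entry : jobConf.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                // Strip the prefix so the override lands on the plain property name.
                effective.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return effective;
    }

    public static void main(String[] args) {
        // Illustrative property names and values only.
        Map<String, String> defaults = new HashMap<>();
        defaults.put("compression", "gz");
        defaults.put("replication", "0");

        Map<String, String> jobConf = new HashMap<>();
        jobConf.put(prefixFor(String.class) + "compression", "snappy");
        jobConf.put("unrelated.key", "ignored");

        System.out.println(effectiveProperties(String.class, defaults, jobConf));
    }
}
```

      Keys that do not carry the prefix for the given class are ignored, which is why two different implementing classes can keep separate settings in the same job configuration.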
    • setCompressionType

      public static void setCompressionType(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, String compressionType)
      Sets the compression type to use for data blocks. Specifying a compression type may require additional libraries to be available to your Job.
      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to configure
      compressionType - one of "none", "gz", "bzip2", "lzo", "lz4", "snappy", or "zstd"
      Since:
      1.6.0
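      Because compressionType is a plain String, a caller may want to check it against the documented values before storing it. The helper below is a hypothetical caller-side sketch; whether setCompressionType performs its own validation is not stated here, and the class and method names are assumptions.

```java
import java.util.Set;

final class CompressionTypeCheck {

    // Values copied from the compressionType parameter description.
    static final Set<String> DOCUMENTED_TYPES =
            Set.of("none", "gz", "bzip2", "lzo", "lz4", "snappy", "zstd");

    // Hypothetical helper: returns the value unchanged, or throws if it is not documented.
    static String requireDocumentedType(String compressionType) {
        if (compressionType == null || !DOCUMENTED_TYPES.contains(compressionType)) {
            throw new IllegalArgumentException(
                    "Unsupported compression type: " + compressionType);
        }
        return compressionType;
    }

    public static void main(String[] args) {
        System.out.println(requireDocumentedType("snappy"));
    }
}
```

      The checked value could then be passed as the third argument to setCompressionType. The check is case-sensitive, matching the lowercase spellings listed above.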
    • setDataBlockSize

      public static void setDataBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long dataBlockSize)
      Sets the size for data blocks within each file.
      Data blocks are a span of key/value pairs stored in the file that are compressed and indexed as a group.

      Making this value smaller may increase seek performance, but at the cost of increasing the size of the indexes (which can also affect seek performance).

      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to configure
      dataBlockSize - the block size, in bytes
      Since:
      1.6.0
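      The trade-off between block size and index size can be estimated with simple arithmetic. The sketch below assumes a simplified model with one index entry per data block, and it ignores compression and the fact that blocks end on key/value boundaries; DataBlockEstimate is a hypothetical name.

```java
final class DataBlockEstimate {

    // Ceiling division: the number of blocks needed to hold totalBytes.
    static long blockCount(long totalBytes, long blockSizeBytes) {
        if (totalBytes < 0 || blockSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and block size positive");
        }
        return (totalBytes + blockSizeBytes - 1) / blockSizeBytes;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        // Under the one-entry-per-block assumption, each count is also the index entry count.
        System.out.println("100 KiB blocks: " + blockCount(oneGiB, 100L * 1024));
        System.out.println(" 10 KiB blocks: " + blockCount(oneGiB, 10L * 1024));
    }
}
```

      In this model, shrinking the block size by a factor of ten produces roughly ten times as many index entries, which is the index growth the description warns about.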
    • setFileBlockSize

      public static void setFileBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long fileBlockSize)
      Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to configure
      fileBlockSize - the block size, in bytes
      Since:
      1.6.0
    • setIndexBlockSize

      public static void setIndexBlockSize(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, long indexBlockSize)
      Sets the size for index blocks within each file. Smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one. This can affect the performance of queries.
      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to configure
      indexBlockSize - the block size, in bytes
      Since:
      1.6.0
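      The relationship between index block size and index depth can be illustrated with a rough model. The sketch below assumes that each index block holds a fixed number of entries (roughly the index block size divided by an average entry size, which is an assumption) and that a new level is added until a single root block remains. IndexDepthEstimate and its parameters are hypothetical and do not describe the actual file format.

```java
final class IndexDepthEstimate {

    static long ceilDiv(long a, long b) {
        return (a + b - 1) / b;
    }

    // Number of index levels needed so that the top level fits in one block.
    static int levels(long leafEntries, long entriesPerIndexBlock) {
        if (leafEntries < 1 || entriesPerIndexBlock < 2) {
            throw new IllegalArgumentException("need at least 1 entry and a fan-out of 2");
        }
        int levels = 1;
        long blocks = ceilDiv(leafEntries, entriesPerIndexBlock);
        while (blocks > 1) {
            // Each block at this level needs one entry in the level above it.
            levels++;
            blocks = ceilDiv(blocks, entriesPerIndexBlock);
        }
        return levels;
    }

    public static void main(String[] args) {
        long entries = 10_486; // illustrative number of data blocks to index
        System.out.println("fan-out 20000: " + levels(entries, 20_000));
        System.out.println("fan-out 1000:  " + levels(entries, 1_000));
        System.out.println("fan-out 100:   " + levels(entries, 100));
    }
}
```

      A smaller fan-out, which corresponds to smaller index blocks, yields more levels in this model, so a lookup touches more index blocks, each of which is smaller.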
    • setReplication

      public static void setReplication(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, int replication)
      Sets the file system replication factor for the resulting file, overriding the file system default.
      Parameters:
      implementingClass - the class whose name will be used as a prefix for the property configuration key
      conf - the Hadoop configuration object to configure
      replication - the number of replicas for produced files
      Since:
      1.6.0
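      The file block size and replication factor together determine how a file is laid out and how much raw storage it consumes in the underlying file system. The following sketch works through that arithmetic; it assumes an explicit replication factor of at least 1, and ReplicationFootprint is a hypothetical helper, not part of Accumulo or Hadoop.

```java
final class ReplicationFootprint {

    // Raw bytes consumed across all replicas; fails loudly on overflow.
    static long rawBytes(long fileBytes, int replication) {
        if (fileBytes < 0 || replication < 1) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and replication >= 1");
        }
        return Math.multiplyExact(fileBytes, (long) replication);
    }

    // Number of file system blocks one copy of the file occupies (ceiling division).
    static long fileSystemBlocks(long fileBytes, long fileBlockSize) {
        if (fileBytes < 0 || fileBlockSize <= 0) {
            throw new IllegalArgumentException("fileBytes must be >= 0 and block size > 0");
        }
        return (fileBytes + fileBlockSize - 1) / fileBlockSize;
    }

    public static void main(String[] args) {
        long oneGiB = 1L << 30;
        long blockSize = 128L * 1024 * 1024; // illustrative 128 MiB file block size
        System.out.println("blocks per copy: " + fileSystemBlocks(oneGiB, blockSize));
        System.out.println("raw bytes at replication 3: " + rawBytes(oneGiB, 3));
    }
}
```

      Each file system block is replicated independently, so the total number of stored block replicas in this model is the block count multiplied by the replication factor.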
    • setSampler

      public static void setSampler(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, SamplerConfiguration samplerConfig)
      Since:
      1.8.0
    • setSummarizers

      public static void setSummarizers(Class<?> implementingClass, org.apache.hadoop.conf.Configuration conf, SummarizerConfiguration[] sumarizerConfigs)