Class FileOutputConfigurator

    • Constructor Detail

      • FileOutputConfigurator

        public FileOutputConfigurator()
    • Method Detail

      • isSupportedAccumuloProperty

        protected static Boolean isSupportedAccumuloProperty(Property property)
        Checks whether the given property is one of the supported Accumulo properties that this OutputFormat sets to change the behavior of the RecordWriter.
        These properties correspond to the supported public static setter methods available to this class.
        Parameters:
        property - the Accumulo property to check
        Since:
        1.6.0
      • getAccumuloConfiguration

        public static AccumuloConfiguration getAccumuloConfiguration(Class<?> implementingClass,
                                                                     org.apache.hadoop.conf.Configuration conf)
        Returns an AccumuloConfiguration object built from the Accumulo defaults and overridden with any Accumulo properties that have been stored in the Job's configuration.
        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object holding the stored properties
        Since:
        1.6.0
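
        The layering described for getAccumuloConfiguration (defaults first, then overrides stored under a class-name prefix) can be pictured with a small stand-alone model. This is only a sketch: it uses plain maps rather than Hadoop's Configuration or an AccumuloConfiguration, and the key format (simple class name plus a dot) and the property names are hypothetical, chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class LayeredConfigSketch {

    // Hypothetical key format "<SimpleClassName>.<property>"; the real key layout is not shown here.
    public static String prefixedKey(Class<?> implementingClass, String property) {
        return implementingClass.getSimpleName() + "." + property;
    }

    // Stores an override in the job-level map under the implementing class's prefix.
    public static void store(Class<?> implementingClass, Map<String, String> conf,
                             String property, String value) {
        conf.put(prefixedKey(implementingClass, property), value);
    }

    // Starts from the defaults, then applies every override found under the prefix.
    public static Map<String, String> effective(Class<?> implementingClass,
                                                Map<String, String> conf,
                                                Map<String, String> defaults) {
        Map<String, String> result = new HashMap<>(defaults);
        String prefix = prefixedKey(implementingClass, "");
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative defaults; these names and values are placeholders.
        Map<String, String> defaults = new HashMap<>();
        defaults.put("compression", "gz");
        defaults.put("replication", "0");

        Map<String, String> jobConf = new HashMap<>();
        store(LayeredConfigSketch.class, jobConf, "compression", "snappy");

        Map<String, String> merged = effective(LayeredConfigSketch.class, jobConf, defaults);
        System.out.println(merged.get("compression")); // prints "snappy" (overridden)
        System.out.println(merged.get("replication")); // prints "0" (default kept)
    }
}
```

        Prefixing by class name keeps settings for different implementing classes from colliding inside one shared configuration object.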
      • setCompressionType

        public static void setCompressionType(Class<?> implementingClass,
                                              org.apache.hadoop.conf.Configuration conf,
                                              String compressionType)
        Sets the compression type to use for data blocks. Specifying a compression type may require additional libraries to be available to your Job.
        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object to configure
        compressionType - one of "none", "gz", "lzo", "snappy", or "zstd"
        Since:
        1.6.0
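
        Because compressionType is passed as a plain string, a caller may want to check it against the documented values before configuring a job. The helper below is a hypothetical, stand-alone sketch and not part of this class; it assumes the match is case-sensitive, which this page does not state.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CompressionTypeCheckSketch {

    // The values listed for the compressionType parameter above.
    private static final Set<String> DOCUMENTED_TYPES =
        new HashSet<>(Arrays.asList("none", "gz", "lzo", "snappy", "zstd"));

    // Returns true only for an exact (case-sensitive, by assumption) match.
    public static boolean isDocumentedType(String compressionType) {
        return compressionType != null && DOCUMENTED_TYPES.contains(compressionType);
    }

    public static void main(String[] args) {
        System.out.println(isDocumentedType("snappy")); // prints "true"
        System.out.println(isDocumentedType("bzip2"));  // prints "false"
    }
}
```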
      • setDataBlockSize

        public static void setDataBlockSize(Class<?> implementingClass,
                                            org.apache.hadoop.conf.Configuration conf,
                                            long dataBlockSize)
        Sets the size for data blocks within each file.
        A data block is a span of key/value pairs stored in the file that is compressed and indexed as a group.

        Making this value smaller may increase seek performance, but at the cost of increasing the size of the indexes (which can also affect seek performance).

        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object to configure
        dataBlockSize - the block size, in bytes
        Since:
        1.6.0
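
        The trade-off between data block size and index size can be made concrete with rough arithmetic. The sketch below assumes one index entry per data block and ignores compression and key sizes; the file size and block sizes are illustrative values, not recommendations.

```java
public class DataBlockCountSketch {

    // Approximate number of data blocks (and, under the stated assumption,
    // index entries) needed to hold totalBytes of key/value data.
    public static long approxBlockCount(long totalBytes, long dataBlockSize) {
        if (totalBytes < 0 || dataBlockSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (totalBytes + dataBlockSize - 1) / dataBlockSize; // ceiling division
    }

    public static void main(String[] args) {
        long oneGiB = 1024L * 1024 * 1024;
        System.out.println(approxBlockCount(oneGiB, 64 * 1024L)); // prints "16384"
        System.out.println(approxBlockCount(oneGiB, 8 * 1024L));  // prints "131072"
    }
}
```

        In this model, shrinking the block size by a factor of eight multiplies the number of index entries by eight, which is the index growth the description warns about.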
      • setFileBlockSize

        public static void setFileBlockSize(Class<?> implementingClass,
                                            org.apache.hadoop.conf.Configuration conf,
                                            long fileBlockSize)
        Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object to configure
        fileBlockSize - the block size, in bytes
        Since:
        1.6.0
      • setIndexBlockSize

        public static void setIndexBlockSize(Class<?> implementingClass,
                                             org.apache.hadoop.conf.Configuration conf,
                                             long indexBlockSize)
        Sets the size for index blocks within each file. Smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower one. This can affect the performance of queries.
        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object to configure
        indexBlockSize - the block size, in bytes
        Since:
        1.6.0
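
        The relationship between index block size and index depth can be illustrated with a generic multi-level index model. This is a sketch under simplifying assumptions, not a description of the actual file layout: every index entry is assumed to take a fixed number of bytes (entryBytes, a hypothetical parameter), and each level is assumed to point at the blocks of the level below it.

```java
public class IndexDepthSketch {

    // Number of index levels needed so that the top level fits in a single block.
    public static int indexLevels(long entries, long indexBlockSize, long entryBytes) {
        if (entries <= 0 || entryBytes <= 0) {
            throw new IllegalArgumentException("entries and entryBytes must be positive");
        }
        long fanout = indexBlockSize / entryBytes; // entries that fit in one index block
        if (fanout < 2) {
            throw new IllegalArgumentException("index block must hold at least two entries");
        }
        int levels = 1;
        long blocks = (entries + fanout - 1) / fanout; // blocks at the lowest level
        while (blocks > 1) {
            levels++;                                   // add a level that indexes those blocks
            blocks = (blocks + fanout - 1) / fanout;
        }
        return levels;
    }

    public static void main(String[] args) {
        long entries = 131072;  // illustrative entry count
        long entryBytes = 64;   // assumed fixed entry size
        System.out.println(indexLevels(entries, 4 * 1024L, entryBytes));   // prints "3"
        System.out.println(indexLevels(entries, 128 * 1024L, entryBytes)); // prints "2"
    }
}
```

        Under these assumptions a larger index block holds more entries per block, so fewer levels are needed to reach a single top-level block.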
      • setReplication

        public static void setReplication(Class<?> implementingClass,
                                          org.apache.hadoop.conf.Configuration conf,
                                          int replication)
        Sets the file system replication factor for the resulting file, overriding the file system default.
        Parameters:
        implementingClass - the class whose name will be used as a prefix for the property configuration key
        conf - the Hadoop configuration object to configure
        replication - the number of replicas for produced files
        Since:
        1.6.0
      • setSampler

        public static void setSampler(Class<?> implementingClass,
                                      org.apache.hadoop.conf.Configuration conf,
                                      SamplerConfiguration samplerConfig)
        Since:
        1.8.0
      • setSummarizers

        public static void setSummarizers(Class<?> implementingClass,
                                          org.apache.hadoop.conf.Configuration conf,
                                          SummarizerConfiguration[] sumarizerConfigs)