Class FileOutputFormatBuilderImpl<T>
java.lang.Object
org.apache.accumulo.hadoopImpl.mapreduce.FileOutputFormatBuilderImpl<T>
- All Implemented Interfaces:
FileOutputFormatBuilder, FileOutputFormatBuilder.OutputOptions<T>, FileOutputFormatBuilder.PathParams<T>
public class FileOutputFormatBuilderImpl<T>
extends Object
implements FileOutputFormatBuilder, FileOutputFormatBuilder.PathParams<T>, FileOutputFormatBuilder.OutputOptions<T>
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.accumulo.hadoop.mapreduce.FileOutputFormatBuilder
FileOutputFormatBuilder.OutputOptions<T>, FileOutputFormatBuilder.PathParams<T>
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type / Method / Description
compression(String compressionType): Sets the compression type to use for data blocks, overriding the default.
dataBlockSize(long dataBlockSize): Sets the size for data blocks within each file. Data blocks are a span of key/value pairs stored in the file that are compressed and indexed as a group.
fileBlockSize(long fileBlockSize): Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
indexBlockSize(long indexBlockSize): Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower index hierarchy within the file.
outputPath(org.apache.hadoop.fs.Path path): Set the Path of the output directory for the map-reduce job.
replication(int replication): Sets the file system replication factor for the resulting file, overriding the file system default.
sampler(SamplerConfiguration samplerConfig): Specify a sampler to be used when writing out data.
void store: Finish configuring, verify and serialize options into the Job or JobConf.
summarizers(SummarizerConfiguration... summarizerConfigs): Specifies a list of summarizer configurations to create summary data in the output file.
-
Constructor Details
-
FileOutputFormatBuilderImpl
-
-
Method Details
-
outputPath
Description copied from interface: FileOutputFormatBuilder.PathParams
Set the Path of the output directory for the map-reduce job.
- Specified by:
outputPath in interface FileOutputFormatBuilder.PathParams<T>
-
compression
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Sets the compression type to use for data blocks, overriding the default. Specifying a compression may require additional libraries to be available to your Job.
- Specified by:
compression in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
compressionType - one of "none", "gz", "bzip2", "lzo", "lz4", "snappy", or "zstd"
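The description above does not say whether the builder itself rejects an unknown name, so a caller may want to check the string before passing it in. The following is a minimal caller-side sketch; `CompressionNames` and `requireDocumented` are hypothetical helpers, not part of Accumulo, and the exact, case-sensitive match is an assumption.

```java
import java.util.Set;

/** Hypothetical caller-side check of a compression name; not an Accumulo class. */
final class CompressionNames {
  // The names listed in the parameter description above.
  static final Set<String> DOCUMENTED =
      Set.of("none", "gz", "bzip2", "lzo", "lz4", "snappy", "zstd");

  private CompressionNames() {}

  /** Returns the name unchanged if it is one of the documented values; otherwise throws. */
  static String requireDocumented(String compressionType) {
    // Null is checked first because Set.of(...).contains(null) throws NullPointerException.
    if (compressionType == null || !DOCUMENTED.contains(compressionType)) {
      throw new IllegalArgumentException("Unsupported compression type: " + compressionType);
    }
    return compressionType;
  }
}
```

A call such as `builder.compression(CompressionNames.requireDocumented(name))` would then fail early, before the Job is submitted.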
-
dataBlockSize
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Sets the size for data blocks within each file. Data blocks are a span of key/value pairs stored in the file that are compressed and indexed as a group.
Making this value smaller may increase seek performance, but at the cost of increasing the size of the indexes (which can also affect seek performance).
- Specified by:
dataBlockSize in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
dataBlockSize - the block size, in bytes
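The trade-off described above can be made concrete with a rough estimate. The sketch below assumes one index entry per data block and ignores compression and per-entry overhead; `DataBlockEstimate` is a hypothetical helper used only for illustration.

```java
/** Hypothetical back-of-the-envelope estimate; not an Accumulo class. */
final class DataBlockEstimate {
  private DataBlockEstimate() {}

  /**
   * Approximate number of data blocks (and, assuming one index entry per block,
   * index entries) needed to hold fileBytes of data.
   */
  static long estimatedBlocks(long fileBytes, long dataBlockSize) {
    if (fileBytes < 0 || dataBlockSize <= 0) {
      throw new IllegalArgumentException("fileBytes must be >= 0 and dataBlockSize must be > 0");
    }
    long fullBlocks = fileBytes / dataBlockSize;
    // A partial trailing block still counts as one block.
    return fileBytes % dataBlockSize == 0 ? fullBlocks : fullBlocks + 1;
  }
}
```

Under these assumptions, 1 GiB of data in 64 KiB blocks gives 16384 blocks, and halving the block size to 32 KiB doubles that to 32768, which is the index growth the description warns about.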
-
fileBlockSize
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Sets the size for file blocks in the file system; file blocks are managed, and replicated, by the underlying file system.
- Specified by:
fileBlockSize in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
fileBlockSize - the block size, in bytes
-
indexBlockSize
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Sets the size for index blocks within each file; smaller blocks mean a deeper index hierarchy within the file, while larger blocks mean a shallower index hierarchy within the file. This can affect the performance of queries.
- Specified by:
indexBlockSize in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
indexBlockSize - the block size, in bytes
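To illustrate why smaller index blocks deepen the hierarchy, the sketch below models a generic multi-level index in which each index block holds a fixed number of entries. This is a simplified model, not the actual file layout; `IndexDepthEstimate` and `entriesPerIndexBlock` (roughly the index block size divided by an average entry size) are hypothetical.

```java
/** Hypothetical model of a multi-level index; not an Accumulo class. */
final class IndexDepthEstimate {
  private IndexDepthEstimate() {}

  /** Ceiling division for positive operands. */
  private static long ceilDiv(long a, long b) {
    return a / b + (a % b == 0 ? 0 : 1);
  }

  /**
   * Number of index levels needed so that a single root block can reach
   * indexEntries leaf entries when each block holds entriesPerIndexBlock entries.
   */
  static int levels(long indexEntries, long entriesPerIndexBlock) {
    if (indexEntries < 1 || entriesPerIndexBlock < 2) {
      throw new IllegalArgumentException("need at least 1 entry and at least 2 entries per block");
    }
    int levels = 1;
    long blocks = ceilDiv(indexEntries, entriesPerIndexBlock);
    // Each pass adds a parent level that indexes the blocks of the level below.
    while (blocks > 1) {
      blocks = ceilDiv(blocks, entriesPerIndexBlock);
      levels++;
    }
    return levels;
  }
}
```

In this model, 16384 index entries need 2 levels at 128 entries per block but 4 levels at 16 entries per block, so a lookup touches more index blocks as the block size shrinks.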
-
replication
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Sets the file system replication factor for the resulting file, overriding the file system default.
- Specified by:
replication in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
replication - the number of replicas for produced files
-
sampler
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Specify a sampler to be used when writing out data. This will result in the output file having sample data.
- Specified by:
sampler in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
samplerConfig - The configuration for creating sample data in the output file.
-
summarizers
public FileOutputFormatBuilder.OutputOptions<T> summarizers(SummarizerConfiguration... summarizerConfigs)
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Specifies a list of summarizer configurations to create summary data in the output file. Each Key/Value written will be passed to the configured Summarizers.
- Specified by:
summarizers in interface FileOutputFormatBuilder.OutputOptions<T>
- Parameters:
summarizerConfigs - summarizer configurations
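The statement that every written entry is passed to each configured summarizer can be pictured with a small stand-in. The sketch below is not Accumulo's Summarizer API: `SummarizingWriterSketch` is hypothetical, keys and values are plain strings, and each "summarizer" is modeled as a `BiConsumer` supplied through a varargs parameter, mirroring the varargs shape of `summarizers(...)`.

```java
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical stand-in that fans each written entry out to every summarizer. */
final class SummarizingWriterSketch {
  private final List<BiConsumer<String, String>> summarizers;

  @SafeVarargs
  SummarizingWriterSketch(BiConsumer<String, String>... summarizers) {
    // List.of copies the array, so later changes to it do not affect the writer.
    this.summarizers = List.of(summarizers);
  }

  /** Passes the entry to every configured summarizer, in the order given. */
  void write(String key, String value) {
    for (BiConsumer<String, String> summarizer : summarizers) {
      summarizer.accept(key, value);
    }
  }
}
```

A writer built with two consumers, one counting entries and one summing value lengths, would update both on every `write` call.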
-
store
Description copied from interface: FileOutputFormatBuilder.OutputOptions
Finish configuring, verify and serialize options into the Job or JobConf.
- Specified by:
store in interface FileOutputFormatBuilder.OutputOptions<T>
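Because this class implements both the path stage and the options stage, one object can be handed back under progressively narrower interface types until `store` is called. The sketch below shows that staged-builder idea with self-contained stand-ins. `PathStage`, `OptionsStage`, `StagedBuilderSketch`, and the `sketch.*` keys are all hypothetical. A `Map` stands in for the Job or JobConf, and it is an assumption that `outputPath` returns the options stage.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical first stage: only the output path can be set. */
interface PathStage {
  OptionsStage outputPath(String path);
}

/** Hypothetical second stage: optional settings, then store. */
interface OptionsStage {
  OptionsStage compression(String compressionType);

  OptionsStage replication(int replication);

  void store(Map<String, String> target);
}

/** One object implements both stages, so each call can simply return this. */
final class StagedBuilderSketch implements PathStage, OptionsStage {
  private String path;
  private String compressionType; // null means "keep the default"
  private Integer replication; // null means "keep the default"

  private StagedBuilderSketch() {}

  /** Entry point exposes only the first stage. */
  static PathStage configure() {
    return new StagedBuilderSketch();
  }

  @Override
  public OptionsStage outputPath(String path) {
    this.path = Objects.requireNonNull(path, "path");
    return this;
  }

  @Override
  public OptionsStage compression(String compressionType) {
    this.compressionType = Objects.requireNonNull(compressionType, "compressionType");
    return this;
  }

  @Override
  public OptionsStage replication(int replication) {
    this.replication = replication;
    return this;
  }

  /** Verifies the collected options, then writes them into the target. */
  @Override
  public void store(Map<String, String> target) {
    if (replication != null && replication < 1) {
      throw new IllegalArgumentException("replication must be at least 1");
    }
    target.put("sketch.outputPath", path);
    if (compressionType != null) {
      target.put("sketch.compression", compressionType);
    }
    if (replication != null) {
      target.put("sketch.replication", replication.toString());
    }
  }
}
```

With this shape, `StagedBuilderSketch.configure().outputPath("/tmp/out").compression("gz").store(conf)` compiles, while calling `compression` before `outputPath` does not, because the first stage does not expose it.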
-