public class GoogleHadoopFileSystemConfiguration extends Object
GoogleHadoopFileSystem implementations.

Modifier and Type | Field and Description |
---|---|
static HadoopConfigurationProperty<Long> |
BLOCK_SIZE
Configuration key for default block size of a file.
|
static List<String> |
CONFIG_KEY_PREFIXES |
static HadoopConfigurationProperty<String> |
DELEGATION_TOKEN_BINDING_CLASS
Configuration key for Delegation Token binding class.
|
static HadoopConfigurationProperty<Boolean> |
GCE_BUCKET_DELETE_ENABLE
If true, recursive delete on a path that refers to a GCS bucket itself ('/' for any
bucket-rooted GoogleHadoopFileSystem) or delete on that path when it's empty will result in
fully deleting the GCS bucket.
|
static HadoopConfigurationProperty<String> |
GCS_APPLICATION_NAME_SUFFIX
Configuration key for adding a suffix to the GHFS application name sent to GCS.
|
static HadoopConfigurationProperty<Integer> |
GCS_BATCH_THREADS
Configuration key for the number of threads used to execute batch requests.
|
static HadoopConfigurationProperty<Boolean> |
GCS_CONCURRENT_GLOB_ENABLE
Configuration key for enabling the use of flat and regular glob search algorithms in two
parallel threads.
|
static HadoopConfigurationProperty<String> |
GCS_CONFIG_OVERRIDE_FILE
Override configuration file path.
|
static String |
GCS_CONFIG_PREFIX |
static HadoopConfigurationProperty<Boolean> |
GCS_COOPERATIVE_LOCKING_ENABLE
Configuration key for using cooperative locking to achieve isolation of directory mutation
operations.
|
static HadoopConfigurationProperty<Long> |
GCS_COOPERATIVE_LOCKING_EXPIRATION_TIMEOUT_MS
Configuration key for lock expiration when using cooperative locking.
|
static HadoopConfigurationProperty<Integer> |
GCS_COOPERATIVE_LOCKING_MAX_CONCURRENT_OPERATIONS
Configuration key for maximum allowed concurrent operations when using cooperative locking.
|
static HadoopConfigurationProperty<Integer> |
GCS_COPY_BATCH_THREADS
Configuration key for the number of threads used to execute batch requests for copy operations.
|
static HadoopConfigurationProperty<Long> |
GCS_COPY_MAX_REQUESTS_PER_BATCH
Configuration key for the maximum number of GCS RPCs in a batch request for copy operations.
|
static HadoopConfigurationProperty<Boolean> |
GCS_COPY_WITH_REWRITE_ENABLE
Configuration key for enabling the use of Rewrite requests for copy operations.
|
static HadoopConfigurationProperty<GoogleHadoopFileSystemBase.GcsFileChecksumType> |
GCS_FILE_CHECKSUM_TYPE
Configuration key for which type of FileChecksum to return; if a particular file doesn't
support the requested type, then getFileChecksum() will return null for that file.
|
static HadoopConfigurationProperty<Boolean> |
GCS_FLAT_GLOB_ENABLE
Configuration key for enabling the use of a large flat listing to pre-populate possible glob
matches in a single API call before running the core globbing logic in-memory rather than
sequentially and recursively performing API calls.
|
static HadoopConfigurationProperty<Integer> |
GCS_HTTP_CONNECT_TIMEOUT
Configuration key for the connect timeout (in milliseconds) for HTTP requests to GCS.
|
static HadoopConfigurationProperty<Map<String,String>> |
GCS_HTTP_HEADERS
Configuration key for the headers added to HTTP requests to GCS.
|
static HadoopConfigurationProperty<Integer> |
GCS_HTTP_MAX_RETRY
Configuration key for the maximum number of retries for failed HTTP requests to GCS.
|
static HadoopConfigurationProperty<Integer> |
GCS_HTTP_READ_TIMEOUT
Configuration key for the read timeout (in milliseconds) for HTTP requests to GCS.
|
static HadoopConfigurationProperty<Boolean> |
GCS_INFER_IMPLICIT_DIRECTORIES_ENABLE
Configuration key for enabling automatic inference of implicit directories.
|
static HadoopConfigurationProperty<Integer> |
GCS_INPUT_STREAM_BUFFER_SIZE
Configuration key for setting read buffer size.
|
static HadoopConfigurationProperty<GoogleCloudStorageReadOptions.Fadvise> |
GCS_INPUT_STREAM_FADVISE
Tunes reading objects behavior to optimize HTTP GET requests for various use cases.
|
static HadoopConfigurationProperty<Boolean> |
GCS_INPUT_STREAM_FAST_FAIL_ON_NOT_FOUND_ENABLE
If true, on opening a file we will proactively perform a metadata GET to check whether the
object exists, even though the underlying channel will not open a data stream until read() is
actually called so that streams can seek to nonzero file positions without incurring an extra
stream creation.
|
static HadoopConfigurationProperty<Long> |
GCS_INPUT_STREAM_INPLACE_SEEK_LIMIT
If forward seeks are within this many bytes of the current position, seeks are performed by
reading and discarding bytes in-place rather than opening a new underlying stream.
|
static HadoopConfigurationProperty<Integer> |
GCS_INPUT_STREAM_MIN_RANGE_REQUEST_SIZE
Minimum size in bytes of the HTTP Range header set in a GCS request when opening a new stream to
read an object.
|
static HadoopConfigurationProperty<Boolean> |
GCS_INPUT_STREAM_SUPPORT_GZIP_ENCODING_ENABLE
If false, reading a file with GZIP content encoding (HTTP header "Content-Encoding: gzip") will
result in failure (IOException is thrown).
|
static HadoopConfigurationProperty<Boolean> |
GCS_LAZY_INITIALIZATION_ENABLE
Configuration key for enabling lazy initialization of GCS FS instance.
|
static HadoopConfigurationProperty<String> |
GCS_MARKER_FILE_PATTERN
Configuration key for marker file pattern.
|
static HadoopConfigurationProperty<Long> |
GCS_MAX_LIST_ITEMS_PER_CALL
Configuration key for number of items to return per call to the list* GCS RPCs.
|
static HadoopConfigurationProperty<Long> |
GCS_MAX_REQUESTS_PER_BATCH
Configuration key for the maximum number of GCS RPCs in a batch request.
|
static HadoopConfigurationProperty<Integer> |
GCS_MAX_WAIT_MILLIS_EMPTY_OBJECT_CREATE
Configuration key for modifying the maximum amount of time to wait for empty object creation.
|
static HadoopConfigurationProperty<Integer> |
GCS_OUTPUT_STREAM_BUFFER_SIZE
Configuration key for setting write buffer size.
|
static HadoopConfigurationProperty<Boolean> |
GCS_OUTPUT_STREAM_DIRECT_UPLOAD_ENABLE
Configuration key for enabling GCS direct upload.
|
static HadoopConfigurationProperty<Integer> |
GCS_OUTPUT_STREAM_PIPE_BUFFER_SIZE
Configuration key for setting pipe buffer size.
|
static HadoopConfigurationProperty<GoogleHadoopFileSystemBase.OutputStreamType> |
GCS_OUTPUT_STREAM_TYPE
Configuration key for which type of output stream to use; different options may have different
degrees of support for advanced features like hsync() and different performance characteristics.
|
static HadoopConfigurationProperty<Integer> |
GCS_OUTPUT_STREAM_UPLOAD_CACHE_SIZE
Configuration key for setting GCS upload cache size.
|
static HadoopConfigurationProperty<Integer> |
GCS_OUTPUT_STREAM_UPLOAD_CHUNK_SIZE
Configuration key for setting GCS upload chunk size.
|
static HadoopConfigurationProperty<Boolean> |
GCS_PERFORMANCE_CACHE_ENABLE
Configuration key for using a local item cache to supplement GCS API "getFile" results.
|
static HadoopConfigurationProperty<Long> |
GCS_PERFORMANCE_CACHE_MAX_ENTRY_AGE_MILLIS
Configuration key for maximum number of milliseconds a GoogleCloudStorageItemInfo will remain
"valid" in the performance cache before it's invalidated.
|
static HadoopConfigurationProperty<String> |
GCS_PROJECT_ID
Configuration key for GCS project ID.
|
static HadoopConfigurationProperty<Boolean> |
GCS_REPAIR_IMPLICIT_DIRECTORIES_ENABLE
Configuration key for enabling automatic repair of implicit directories whenever detected
inside delete and rename calls.
|
static HadoopConfigurationProperty<Collection<String>> |
GCS_REQUESTER_PAYS_BUCKETS
Configuration key for GCS Requester Pays Buckets.
|
static HadoopConfigurationProperty<RequesterPaysOptions.RequesterPaysMode> |
GCS_REQUESTER_PAYS_MODE
Configuration key for GCS Requester Pays mode.
|
static HadoopConfigurationProperty<String> |
GCS_REQUESTER_PAYS_PROJECT_ID
Configuration key for GCS Requester Pays Project ID.
|
static HadoopConfigurationProperty<Long> |
GCS_REWRITE_MAX_BYTES_PER_CALL
Configuration key for specifying max number of bytes rewritten in a single rewrite request when
fs.gs.copy.with.rewrite.enable is set to 'true'.
|
static HadoopConfigurationProperty<String> |
GCS_ROOT_URL
Configuration key for the Cloud Storage API endpoint root URL.
|
static HadoopConfigurationProperty<Boolean> |
GCS_STATUS_PARALLEL_ENABLE
If true, executes GCS requests in listStatus and getFileStatus methods in parallel to reduce
latency.
|
static HadoopConfigurationProperty<String> |
GCS_WORKING_DIRECTORY
Configuration key for initial working directory of a GHFS instance.
|
static HadoopConfigurationProperty<String> |
PERMISSIONS_TO_REPORT
Key for the permissions that we report a file or directory to have.
|
Constructor and Description |
---|
GoogleHadoopFileSystemConfiguration() |
public static final String GCS_CONFIG_PREFIX
public static final HadoopConfigurationProperty<String> GCS_ROOT_URL
public static final HadoopConfigurationProperty<String> PERMISSIONS_TO_REPORT
See also: FsPermission.FsPermission(String)
Default value for the permissions that we report a file or directory to have. Note: We do not really support file/dir permissions but we need to report some permission value when Hadoop calls getFileStatus(). A MapReduce job fails if we report permissions more relaxed than the value below and this is the default File System.
public static final HadoopConfigurationProperty<Long> BLOCK_SIZE
Note that this is the size that is reported to Hadoop FS clients. It does not modify the actual block size of an underlying GCS object, because GCS JSON API does not allow modifying or querying the value. Modifying this value allows one to control how many mappers are used to process a given file.
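To make the effect on mapper count concrete, the sketch below estimates how many input splits a file would produce for a given reported block size. It assumes one split per reported block, which is a simplification: real input formats also apply their own minimum and maximum split sizes. The class and method names are hypothetical and are not part of the connector.

```java
public class SplitEstimate {
    // Hypothetical helper: ceiling division of the file size by the reported block size.
    // Assumption: one input split (and so one mapper) per reported block.
    static long estimateSplits(long fileSizeBytes, long reportedBlockSizeBytes) {
        if (fileSizeBytes < 0 || reportedBlockSizeBytes <= 0) {
            throw new IllegalArgumentException(
                "file size must be >= 0 and block size must be > 0");
        }
        if (fileSizeBytes == 0) {
            return 0;
        }
        return (fileSizeBytes + reportedBlockSizeBytes - 1) / reportedBlockSizeBytes;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024;
        long oneGiB = 1024 * mib;
        // A smaller reported block size yields more splits for the same file.
        System.out.println(estimateSplits(oneGiB, 64 * mib));   // prints 16
        System.out.println(estimateSplits(oneGiB, 256 * mib));  // prints 4
    }
}
```

Under this assumption, raising the reported block size from 64 MiB to 256 MiB cuts the split count for a 1 GiB file from 16 to 4, without changing anything about the stored GCS object.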
public static final HadoopConfigurationProperty<String> DELEGATION_TOKEN_BINDING_CLASS
public static final HadoopConfigurationProperty<String> GCS_PROJECT_ID
public static final HadoopConfigurationProperty<String> GCS_WORKING_DIRECTORY
public static final HadoopConfigurationProperty<Boolean> GCE_BUCKET_DELETE_ENABLE
public static final HadoopConfigurationProperty<RequesterPaysOptions.RequesterPaysMode> GCS_REQUESTER_PAYS_MODE
public static final HadoopConfigurationProperty<String> GCS_REQUESTER_PAYS_PROJECT_ID
public static final HadoopConfigurationProperty<Collection<String>> GCS_REQUESTER_PAYS_BUCKETS
public static final HadoopConfigurationProperty<GoogleHadoopFileSystemBase.GcsFileChecksumType> GCS_FILE_CHECKSUM_TYPE
public static final HadoopConfigurationProperty<Boolean> GCS_PERFORMANCE_CACHE_ENABLE
public static final HadoopConfigurationProperty<Long> GCS_PERFORMANCE_CACHE_MAX_ENTRY_AGE_MILLIS
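The maximum entry age described for this key can be read as a simple age check on each cached item. The sketch below is only an illustration of that idea: the class name, method name, and the inclusive boundary are assumptions, and the connector's actual cache may track and evict entries differently.

```java
public class CacheEntryAge {
    // Hypothetical check: an entry counts as valid while its age does not exceed the
    // configured maximum. Treating the boundary as inclusive is an assumption.
    static boolean isValid(long creationTimeMillis, long nowMillis, long maxEntryAgeMillis) {
        return nowMillis - creationTimeMillis <= maxEntryAgeMillis;
    }

    public static void main(String[] args) {
        long maxAge = 5_000L; // illustrative value, not the connector's default
        System.out.println(isValid(1_000L, 4_000L, maxAge)); // age 3000 ms: prints true
        System.out.println(isValid(1_000L, 7_000L, maxAge)); // age 6000 ms: prints false
    }
}
```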
public static final HadoopConfigurationProperty<Boolean> GCS_STATUS_PARALLEL_ENABLE
If true, executes GCS requests in listStatus and getFileStatus methods in parallel to reduce latency.
public static final HadoopConfigurationProperty<Boolean> GCS_LAZY_INITIALIZATION_ENABLE
public static final HadoopConfigurationProperty<Boolean> GCS_REPAIR_IMPLICIT_DIRECTORIES_ENABLE
public static final HadoopConfigurationProperty<Boolean> GCS_INFER_IMPLICIT_DIRECTORIES_ENABLE
public static final HadoopConfigurationProperty<Boolean> GCS_FLAT_GLOB_ENABLE
public static final HadoopConfigurationProperty<Boolean> GCS_CONCURRENT_GLOB_ENABLE
public static final HadoopConfigurationProperty<String> GCS_MARKER_FILE_PATTERN
public static final HadoopConfigurationProperty<Long> GCS_MAX_REQUESTS_PER_BATCH
public static final HadoopConfigurationProperty<Integer> GCS_BATCH_THREADS
public static final HadoopConfigurationProperty<Long> GCS_COPY_MAX_REQUESTS_PER_BATCH
public static final HadoopConfigurationProperty<Integer> GCS_COPY_BATCH_THREADS
public static final HadoopConfigurationProperty<Boolean> GCS_COPY_WITH_REWRITE_ENABLE
public static final HadoopConfigurationProperty<Long> GCS_REWRITE_MAX_BYTES_PER_CALL
public static final HadoopConfigurationProperty<Long> GCS_MAX_LIST_ITEMS_PER_CALL
public static final HadoopConfigurationProperty<Integer> GCS_HTTP_MAX_RETRY
Also, note that this number will only control the number of retries in the low level HTTP request implementation.
public static final HadoopConfigurationProperty<Integer> GCS_HTTP_CONNECT_TIMEOUT
public static final HadoopConfigurationProperty<Integer> GCS_HTTP_READ_TIMEOUT
public static final HadoopConfigurationProperty<String> GCS_APPLICATION_NAME_SUFFIX
public static final HadoopConfigurationProperty<Integer> GCS_MAX_WAIT_MILLIS_EMPTY_OBJECT_CREATE
public static final HadoopConfigurationProperty<GoogleHadoopFileSystemBase.OutputStreamType> GCS_OUTPUT_STREAM_TYPE
Configuration key for which type of output stream to use; different options may have different degrees of support for advanced features like hsync() and different performance characteristics. Options:
BASIC: Stream is the closest analogue to a direct wrapper around a low-level HTTP stream into GCS.
SYNCABLE_COMPOSITE: Stream behaves similarly to BASIC when used with basic create/write/close patterns, but supports hsync() by creating discrete temporary GCS objects which are composed onto the destination object.
public static final HadoopConfigurationProperty<Integer> GCS_OUTPUT_STREAM_BUFFER_SIZE
public static final HadoopConfigurationProperty<Integer> GCS_OUTPUT_STREAM_PIPE_BUFFER_SIZE
public static final HadoopConfigurationProperty<Integer> GCS_OUTPUT_STREAM_UPLOAD_CHUNK_SIZE
public static final HadoopConfigurationProperty<Integer> GCS_OUTPUT_STREAM_UPLOAD_CACHE_SIZE
public static final HadoopConfigurationProperty<Boolean> GCS_OUTPUT_STREAM_DIRECT_UPLOAD_ENABLE
public static final HadoopConfigurationProperty<Integer> GCS_INPUT_STREAM_BUFFER_SIZE
public static final HadoopConfigurationProperty<Boolean> GCS_INPUT_STREAM_FAST_FAIL_ON_NOT_FOUND_ENABLE
public static final HadoopConfigurationProperty<Boolean> GCS_INPUT_STREAM_SUPPORT_GZIP_ENCODING_ENABLE
public static final HadoopConfigurationProperty<Long> GCS_INPUT_STREAM_INPLACE_SEEK_LIMIT
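The in-place seek limit amounts to a decision rule applied on each seek: short forward seeks are served by reading and discarding bytes, while anything else opens a new underlying stream. A minimal sketch of that rule follows. The class, enum, and method names are hypothetical, and both the inclusive boundary and the handling of backward seeks are assumptions rather than documented connector behavior.

```java
public class SeekDecision {
    enum Action { NO_OP, SKIP_IN_PLACE, REOPEN_STREAM }

    // Hypothetical decision rule for a seek from currentPos to targetPos.
    static Action decide(long currentPos, long targetPos, long inplaceSeekLimit) {
        if (targetPos == currentPos) {
            return Action.NO_OP;
        }
        long distance = targetPos - currentPos;
        // Only forward seeks within the limit are served by discarding bytes in place.
        if (distance > 0 && distance <= inplaceSeekLimit) {
            return Action.SKIP_IN_PLACE;
        }
        // Assumption: backward seeks and long forward seeks open a new stream.
        return Action.REOPEN_STREAM;
    }

    public static void main(String[] args) {
        long limit = 8L * 1024 * 1024; // illustrative 8 MiB limit
        System.out.println(decide(0, 1024, limit));         // prints SKIP_IN_PLACE
        System.out.println(decide(0, 64L << 20, limit));    // prints REOPEN_STREAM
        System.out.println(decide(4096, 0, limit));         // prints REOPEN_STREAM
    }
}
```

A larger limit trades extra bytes read and thrown away for fewer new HTTP requests, which tends to help workloads that make many small forward jumps.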
public static final HadoopConfigurationProperty<GoogleCloudStorageReadOptions.Fadvise> GCS_INPUT_STREAM_FADVISE
public static final HadoopConfigurationProperty<Integer> GCS_INPUT_STREAM_MIN_RANGE_REQUEST_SIZE
public static final HadoopConfigurationProperty<String> GCS_CONFIG_OVERRIDE_FILE
public static final HadoopConfigurationProperty<Boolean> GCS_COOPERATIVE_LOCKING_ENABLE
public static final HadoopConfigurationProperty<Long> GCS_COOPERATIVE_LOCKING_EXPIRATION_TIMEOUT_MS
public static final HadoopConfigurationProperty<Integer> GCS_COOPERATIVE_LOCKING_MAX_CONCURRENT_OPERATIONS
public static final HadoopConfigurationProperty<Map<String,String>> GCS_HTTP_HEADERS
Copyright © 2020. All rights reserved.