public class GoogleHadoopSyncableOutputStream
extends java.io.OutputStream
implements org.apache.hadoop.fs.Syncable
GoogleHadoopSyncableOutputStream implements the Syncable interface by composing objects created in separate underlying streams for each hsync() call.
Prior to the first hsync(), sync() or close() call, this channel will behave the same way as a basic non-syncable channel, writing directly to the destination file.
On the first call to hsync()/sync(), the destination file is committed and a new temporary file is created. The temporary file's name uses a hidden-file prefix (underscore) plus an additional suffix that differs for each temporary file in the series. From then on, readers can read the data committed to the destination file, but not the bytes written to the temporary file since the last hsync() call.
On each subsequent hsync()/sync() call, the temporary file is closed, composed onto the destination file, and then deleted, and a new temporary file is opened under a new filename for further writes.
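The sequence described above can be sketched with a small in-memory model. This is an illustration only, not the connector's implementation: the class ComposeOnSyncModel, its TreeMap "bucket", the String-based write(), and the temporary-file naming scheme are all assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of the compose-on-hsync scheme; not the real connector code. */
class ComposeOnSyncModel {
    // Hypothetical stand-in for a bucket: object name -> committed bytes.
    final Map<String, byte[]> store = new TreeMap<>();
    private final String destination;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private String currentTarget; // object that the open "stream" will be committed to
    private int tempCounter = 0;

    ComposeOnSyncModel(String destination) {
        this.destination = destination;
        this.currentTarget = destination; // before the first hsync(), write to the destination itself
    }

    void write(String data) {
        byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
        buffer.write(bytes, 0, bytes.length);
    }

    /** Commits pending bytes, then switches to a fresh temporary object. */
    void hsync() {
        commitCurrent();
        // Assumed naming: underscore prefix plus a per-file suffix; the real names differ.
        currentTarget = "_" + destination + "." + (tempCounter++);
    }

    void close() {
        commitCurrent();
    }

    private void commitCurrent() {
        byte[] pending = buffer.toByteArray();
        buffer.reset();
        if (currentTarget.equals(destination)) {
            store.put(destination, pending); // first commit goes straight to the destination
            return;
        }
        store.put(currentTarget, pending); // "close" the temporary object
        byte[] head = store.get(destination);
        byte[] composed = new byte[head.length + pending.length];
        System.arraycopy(head, 0, composed, 0, head.length);
        System.arraycopy(pending, 0, composed, head.length, pending.length);
        store.put(destination, composed); // compose onto the destination
        store.remove(currentTarget);      // delete the temporary object
    }

    /** What a reader of the destination object would see right now. */
    String visible() {
        byte[] b = store.get(destination);
        return b == null ? "" : new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ComposeOnSyncModel out = new ComposeOnSyncModel("part-0");
        out.write("first;");
        out.hsync();
        out.write("second;");
        System.out.println(out.visible()); // "second;" is not visible to readers yet
        out.close();
        System.out.println(out.visible());
    }
}
```

In this model, bytes written after an hsync() stay invisible to readers until the next hsync() or close(), which mirrors the visibility rule stated above.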
Caveat: each hsync()/sync() requires many underlying read and mutation requests occurring sequentially, so latency is expected to be fairly high.
If errors occur mid-stream, one or more temporary files may fail to be cleaned up, and manual intervention may be required to discover and delete any such unused files. Data written prior to the most recent successful hsync() is persistent and safe in such a case.
If multiple writers attempt to write to the same destination file, generation ids used with low-level precondition checks will cause all but one writer to fail their precondition checks during writes, and the single remaining writer will safely occupy the stream.
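As a rough illustration of how a generation precondition lets only one of several competing writers proceed, consider the following toy model. The class GenerationGuardedStore and its methods are hypothetical, and generations are modeled as a simple counter, which is an assumption of this sketch rather than a statement about how GCS assigns them.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of generation-based write preconditions; all names are illustrative. */
class GenerationGuardedStore {
    private final Map<String, Long> generations = new HashMap<>();
    private final Map<String, String> contents = new HashMap<>();

    /** Returns the current generation of an object, or 0 if it does not exist yet. */
    synchronized long generationOf(String name) {
        return generations.getOrDefault(name, 0L);
    }

    /**
     * Writes only if the object's generation still equals expectedGeneration.
     * Returns the new generation, or throws if another writer changed the object first.
     */
    synchronized long writeIfGenerationMatches(String name, String data, long expectedGeneration) {
        long current = generationOf(name);
        if (current != expectedGeneration) {
            throw new IllegalStateException(
                "precondition failed: expected generation " + expectedGeneration
                    + " but found " + current);
        }
        long next = current + 1; // assumed: a counter stands in for real generation ids
        generations.put(name, next);
        contents.put(name, data);
        return next;
    }

    synchronized String read(String name) {
        return contents.get(name);
    }

    public static void main(String[] args) {
        GenerationGuardedStore store = new GenerationGuardedStore();
        long seenByA = store.generationOf("part-0");
        long seenByB = store.generationOf("part-0");
        store.writeIfGenerationMatches("part-0", "from A", seenByA);
        try {
            store.writeIfGenerationMatches("part-0", "from B", seenByB);
        } catch (IllegalStateException e) {
            System.out.println("writer B rejected: " + e.getMessage());
        }
    }
}
```

Both writers observe the same starting generation, but only the first write matches it; the second fails its precondition instead of silently overwriting the first writer's data.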
Modifier and Type | Field and Description |
---|---|
static java.lang.String | TEMPFILE_PREFIX |
Constructor and Description |
---|
GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, java.net.URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions) Creates a new GoogleHadoopSyncableOutputStream with the initial stream initialized and expected to begin at file offset 0. |
Modifier and Type | Method and Description |
---|---|
void | close() |
void | hflush() There is no way to flush data to become available for readers without a full-fledged hsync(), so this method is a no-op. |
void | hsync() |
void | sync() |
void | write(byte[] b, int offset, int len) |
void | write(int b) |
public static final java.lang.String TEMPFILE_PREFIX
public GoogleHadoopSyncableOutputStream(GoogleHadoopFileSystemBase ghfs, java.net.URI gcsPath, org.apache.hadoop.fs.FileSystem.Statistics statistics, CreateFileOptions createFileOptions) throws java.io.IOException
Throws: java.io.IOException
public void write(int b) throws java.io.IOException
Overrides: write in class java.io.OutputStream
Throws: java.io.IOException
public void write(byte[] b, int offset, int len) throws java.io.IOException
Overrides: write in class java.io.OutputStream
Throws: java.io.IOException
public void close() throws java.io.IOException
Specified by: close in interface java.io.Closeable
Specified by: close in interface java.lang.AutoCloseable
Overrides: close in class java.io.OutputStream
Throws: java.io.IOException
public void sync() throws java.io.IOException
Specified by: sync in interface org.apache.hadoop.fs.Syncable
Throws: java.io.IOException
public void hflush() throws java.io.IOException
Specified by: hflush in interface org.apache.hadoop.fs.Syncable
Throws: java.io.IOException
public void hsync() throws java.io.IOException
Specified by: hsync in interface org.apache.hadoop.fs.Syncable
Throws:
CompositeLimitExceededException - if this hsync() call would require any future close() call to exceed the component limit. If CompositeLimitExceededException is thrown, no GCS operations are performed and it is safe to subsequently call close() on this stream as normal; it just means that data written since the last successful hsync() has not yet been committed.
java.io.IOException
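The contract around the component limit can be illustrated with a small model. Everything here is an assumption made for the sketch: the class LimitedSyncModel, its LimitExceeded exception (a stand-in for CompositeLimitExceededException), the rule that each hsync() or close() adds exactly one component, and the limit value used in main are not taken from the connector.

```java
import java.io.IOException;

/** Toy model of an hsync() that refuses to use up the room a later close() needs. */
class LimitedSyncModel {
    /** Hypothetical stand-in for CompositeLimitExceededException. */
    static class LimitExceeded extends IOException {
        LimitExceeded(String message) {
            super(message);
        }
    }

    private final int componentLimit;
    private int components = 1; // assumed: the destination object starts as one component
    private final StringBuilder committed = new StringBuilder();
    private final StringBuilder pending = new StringBuilder();

    LimitedSyncModel(int componentLimit) {
        this.componentLimit = componentLimit;
    }

    void write(String data) {
        pending.append(data);
    }

    void hsync() throws LimitExceeded {
        // Check before mutating anything: this hsync() plus a future close() must both fit.
        if (components + 2 > componentLimit) {
            throw new LimitExceeded("hsync() would leave no room for close()");
        }
        commit();
    }

    void close() {
        commit(); // always fits, because hsync() reserved room for it
    }

    private void commit() {
        committed.append(pending);
        pending.setLength(0);
        components++;
    }

    String committed() {
        return committed.toString();
    }

    int components() {
        return components;
    }

    public static void main(String[] args) {
        LimitedSyncModel out = new LimitedSyncModel(3); // tiny limit, chosen only for the demo
        out.write("a");
        try {
            out.hsync();
            out.write("b");
            out.hsync(); // rejected: no state is changed
        } catch (LimitExceeded e) {
            System.out.println("hsync skipped: " + e.getMessage());
        }
        out.close(); // still safe; commits the bytes the rejected hsync() left pending
        System.out.println(out.committed());
    }
}
```

The point of the sketch is the ordering: the limit check happens before any state changes, so a rejected hsync() leaves the stream usable and a later close() still commits the pending bytes.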
Copyright © 2019. All rights reserved.