Class

com.salesforce.op.utils.io

DirectOutputCommitter

Related Doc: package io

class DirectOutputCommitter extends OutputCommitter

OutputCommitter suitable for S3 workloads. Unlike the usual FileOutputCommitter, which writes files to a _temporary/ directory before renaming them to their final location, this simply writes directly to the final location.

FileOutputCommitter is required when speculation is enabled on HDFS, because HDFS allows only one writer per file at a time, so two speculative task attempts racing to write the same file would not work. S3, by contrast, supports multiple writers outputting to the same file, and the result becomes visible atomically. The write is effectively idempotent: all writers should be writing the same data, so which one wins is immaterial.

Code adapted from Ian Hummel's code from this PR: https://github.com/themodernlife/spark/commit/4359664b1d557d55b0579023df809542386d5b8c
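
The difference between the two commit strategies described above can be illustrated on a local filesystem. This is only a sketch using `java.nio.file`, not the Hadoop implementation: the `CommitStrategies` object and its method names are hypothetical. On S3 a rename is generally implemented as a copy followed by a delete, which is the step a direct committer skips.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path, StandardCopyOption}

// Hypothetical helper, for illustration only.
object CommitStrategies {

  // Rename-based commit: stage the file under _temporary/, then move it into place.
  def writeViaTemporary(outputDir: Path, fileName: String, data: String): Path = {
    val staging = Files.createDirectories(outputDir.resolve("_temporary"))
    val staged = Files.write(staging.resolve(fileName), data.getBytes(UTF_8))
    Files.move(staged, outputDir.resolve(fileName), StandardCopyOption.REPLACE_EXISTING)
  }

  // Direct commit: write straight to the final location, with no staging step.
  def writeDirect(outputDir: Path, fileName: String, data: String): Path = {
    Files.createDirectories(outputDir)
    Files.write(outputDir.resolve(fileName), data.getBytes(UTF_8))
  }
}
```

Both methods leave the file at `outputDir/fileName`; only the first one needs a second filesystem operation to get it there.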

Linear Supertypes
OutputCommitter, OutputCommitter, AnyRef, Any

Instance Constructors

  1. new DirectOutputCommitter()
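
Hadoop normally instantiates an output committer reflectively through a no-argument constructor like this one, so jobs usually register the class instead of calling `new` themselves. A minimal sketch of that registration, assuming the class extends the old-API `org.apache.hadoop.mapred.OutputCommitter` (suggested by the inherited `abortJob(JobContext, Int)` overload listed under Value Members) and that Hadoop and this library are on the classpath:

```scala
import com.salesforce.op.utils.io.DirectOutputCommitter
import org.apache.hadoop.mapred.JobConf

val jobConf = new JobConf()

// Typed setter on the old `mapred` API. It stores the class name under the
// "mapred.output.committer.class" property.
jobConf.setOutputCommitter(classOf[DirectOutputCommitter])
```

In Spark, the same property can presumably be passed as `spark.hadoop.mapred.output.committer.class`, since Spark copies `spark.hadoop.*` settings into the Hadoop configuration. Whether a given write path honors it depends on the Spark version and output format.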

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def abortJob(arg0: JobContext, arg1: State): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  5. def abortJob(arg0: JobContext, arg1: Int): Unit

    Definition Classes
    OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  6. def abortTask(taskContext: TaskAttemptContext): Unit

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  7. final def abortTask(arg0: TaskAttemptContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. def commitJob(context: JobContext): Unit

    Creates a _SUCCESS file to indicate the entire job was successful. This mimics the behavior of FileOutputCommitter, reusing the same file name and conf option.

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  11. final def commitJob(arg0: JobContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  12. def commitTask(taskContext: TaskAttemptContext): Unit

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  13. final def commitTask(arg0: TaskAttemptContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  14. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  15. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  16. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  17. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  18. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  19. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  20. final def isRecoverySupported(arg0: JobContext): Boolean

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  21. def isRecoverySupported(arg0: JobContext): Boolean

    Definition Classes
    OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. def needsTaskCommit(taskContext: TaskAttemptContext): Boolean

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  24. final def needsTaskCommit(arg0: TaskAttemptContext): Boolean

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  25. final def notify(): Unit

    Definition Classes
    AnyRef
  26. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  27. final def recoverTask(arg0: TaskAttemptContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  28. def recoverTask(arg0: TaskAttemptContext): Unit

    Definition Classes
    OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  29. def setupJob(jobContext: JobContext): Unit

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  30. final def setupJob(arg0: JobContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  31. def setupTask(taskContext: TaskAttemptContext): Unit

    Definition Classes
    DirectOutputCommitter → OutputCommitter
  32. final def setupTask(arg0: TaskAttemptContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @throws( classOf[java.io.IOException] )
  33. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  34. def toString(): String

    Definition Classes
    AnyRef → Any
  35. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  36. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  37. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
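
The `commitJob` override listed above is described as reusing `FileOutputCommitter`'s marker file name and conf option. A rough sketch of that behavior follows; it is not the actual implementation. The `SuccessMarker` object and its member names are hypothetical, and the key `mapreduce.fileoutputcommitter.marksuccessfuljobs` is assumed to be the conf option the description refers to. Hadoop must be on the classpath.

```scala
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapred.{FileOutputFormat, JobConf}

// Hypothetical helper, for illustration only.
object SuccessMarker {
  // Names shared with FileOutputCommitter (the option key is an assumption).
  val MarkerFileName = "_SUCCESS"
  val MarkerOption = "mapreduce.fileoutputcommitter.marksuccessfuljobs"

  def write(conf: JobConf): Unit = {
    // getOutputPath returns null when the job has no output directory configured.
    val outputPath = FileOutputFormat.getOutputPath(conf)
    if (outputPath != null && conf.getBoolean(MarkerOption, true)) {
      val fs = outputPath.getFileSystem(conf)
      // Create an empty marker file directly in the final output directory.
      fs.create(new Path(outputPath, MarkerFileName)).close()
    }
  }
}
```

Setting the option to `false` in the job configuration would skip the marker, which mirrors how the same switch is expected to behave for `FileOutputCommitter`.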

Deprecated Value Members

  1. final def cleanupJob(arg0: JobContext): Unit

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @Deprecated @deprecated @throws( classOf[java.io.IOException] )
    Deprecated

    See the corresponding Javadoc for more information.

  2. def cleanupJob(arg0: JobContext): Unit

    Definition Classes
    OutputCommitter
    Annotations
    @Deprecated @deprecated @throws( classOf[java.io.IOException] )
    Deprecated

    See the corresponding Javadoc for more information.

  3. def isRecoverySupported(): Boolean

    Definition Classes
    OutputCommitter → OutputCommitter
    Annotations
    @Deprecated @deprecated
    Deprecated

    See the corresponding Javadoc for more information.
