Class

io.eels.component.hive

HiveFilePart

Related Doc: package hive

class HiveFilePart extends Part

Linear Supertypes
Part, AnyRef, Any

Instance Constructors

  1. new HiveFilePart(dialect: HiveDialect, file: LocatedFileStatus, metastoreSchema: Schema, projectionSchema: Schema, predicate: Option[Predicate], partitions: List[PartitionPart])(implicit fs: FileSystem)

    metastoreSchema

    the schema as recorded in the metastore, used to interpret the raw data in dialects where the files themselves carry no schema. For example, with a CSV format in Hive, the metastoreSchema is required in order to know what each column represents. We can't use the projection schema for this because the projection schema might be in a different order.

    projectionSchema

    the schema actually required; it is optional, and when it is not given the metastoreSchema is used. The projectionSchema is pushed down to the dialects, rather than being applied afterwards, because some file formats (e.g. Parquet) can read data more efficiently when they know they can omit some fields.

    predicate

    an optional predicate, pushed down to the Parquet reader for efficiency

    partitions

    a list of partition key-values for this file. We require this to repopulate the partition values when creating the final Row.
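
The roles of the three parameters above can be illustrated with a minimal sketch. It does not use the library's classes: `Schema`, `PartitionPart`, `Row` and `readCsvLine` below are simplified, hypothetical stand-ins, and the sketch assumes the metastore schema lists only the columns stored in the file, with partition columns supplied separately.

```scala
// Hypothetical, simplified stand-ins; these are NOT the io.eels types.
object ProjectionSketch {
  final case class Schema(fieldNames: List[String])
  final case class PartitionPart(key: String, value: String)
  final case class Row(schema: Schema, values: List[String])

  def readCsvLine(
      line: String,
      metastoreSchema: Schema,
      projectionSchema: Schema,
      partitions: List[PartitionPart]
  ): Row = {
    // A CSV line carries no field names, so each position is resolved
    // through the metastore schema (assumed here to hold file columns only).
    val cells = line.split(",", -1).toList
    val fromFile: Map[String, String] = metastoreSchema.fieldNames.zip(cells).toMap

    // Partition values are not stored in the file, so they are added back here.
    val fromPartitions: Map[String, String] = partitions.map(p => p.key -> p.value).toMap

    // The projection selects and orders the fields of the final row.
    val byName = fromFile ++ fromPartitions
    Row(projectionSchema, projectionSchema.fieldNames.map(byName))
  }
}
```

With a metastore schema of `id, name, city`, a projection of `country, name`, and a single partition `country=uk`, the line `1,alice,london` yields the values `uk, alice`. Resolving positions through the projection schema instead would have mislabelled the columns, which is the ordering problem the metastoreSchema description refers to.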

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. def data(): Observable[Row]

    Returns the data contained in this part in the form of an Observable that a subscriber can subscribe to. This function should create a clean Observable on each invocation. By clean, we mean that each separate Observable should provide the full set of data contained in the part, in a thread-safe manner. That is, it should be possible to invoke this method k times, subscribe to those k observables concurrently, and have each Observable emit the same data.

    Definition Classes
    HiveFilePart → Part
  7. val dialect: HiveDialect

  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  10. val file: LocatedFileStatus

  11. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  12. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  13. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  14. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  15. val metastoreSchema: Schema

    the schema as recorded in the metastore, used to interpret the raw data in dialects where the files themselves carry no schema. For example, with a CSV format in Hive, the metastoreSchema is required in order to know what each column represents. We can't use the projection schema for this because the projection schema might be in a different order.

  16. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  17. final def notify(): Unit

    Definition Classes
    AnyRef
  18. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  19. val partitions: List[PartitionPart]

    a list of partition key-values for this file. We require this to repopulate the partition values when creating the final Row.

  20. val predicate: Option[Predicate]

    an optional predicate, pushed down to the Parquet reader for efficiency

  21. val projectionSchema: Schema

    the schema actually required; it is optional, and when it is not given the metastoreSchema is used. The projectionSchema is pushed down to the dialects, rather than being applied afterwards, because some file formats (e.g. Parquet) can read data more efficiently when they know they can omit some fields.

  22. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  23. def toString(): String

    Definition Classes
    AnyRef → Any
  24. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  25. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  26. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
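
The contract stated for data() among the members above (k invocations, k concurrent subscribers, identical output) can be sketched without any reactive library. Everything here is a hypothetical stand-in: `RowSource` is a minimal substitute for an Observable, and `InMemoryPart` replaces a file-backed part with an in-memory one. The sketch only illustrates the "clean on each invocation" property, not how HiveFilePart itself reads files.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CleanDataSketch {
  type Row = Vector[String]

  // Hypothetical minimal stand-in for an Observable:
  // subscribing pushes every row to the callback.
  trait RowSource {
    def subscribe(onNext: Row => Unit): Unit
  }

  // Stand-in for a Part whose data() builds a fresh source on each call.
  final class InMemoryPart(lines: Vector[String]) {
    def data(): RowSource = new RowSource {
      def subscribe(onNext: Row => Unit): Unit = {
        // The iterator is created per subscription, so no read position
        // is shared between subscribers.
        val it = lines.iterator
        while (it.hasNext) onNext(it.next().split(",", -1).toVector)
      }
    }
  }

  // Invokes data() k times and drains the k sources concurrently.
  def collectConcurrently(part: InMemoryPart, k: Int): Seq[Vector[Row]] = {
    val futures = (1 to k).map { _ =>
      val source = part.data()
      Future {
        val buffer = Vector.newBuilder[Row]
        source.subscribe { row => buffer += row; () }
        buffer.result()
      }
    }
    Await.result(Future.sequence(futures), 10.seconds)
  }
}
```

The design point is that all mutable read state lives inside `subscribe`, so concurrent subscribers cannot interfere with one another. A part that opened one shared reader in its constructor and handed it to every subscriber would violate the contract.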
