Package

io.eels.component

hive

package hive

Type Members

  1. trait AlignmentStrategy extends AnyRef

    An alignment strategy accepts an input Row and returns an output Row that is compatible with the target schema. This allows writing to sinks whose output schema differs from the input schema.

    For example, the input may come from a JDBC table, and an output Hive table only defines a subset of the columns. Each row would need to be aligned so that it matches the subset schema.

    Implementations are free to add values, drop values or throw an exception if they wish.
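
    The idea can be sketched as follows. This is a minimal illustration only: `Row` and `Schema` are simplified stand-in types, and `Alignment` and `SubsetAlignment` are hypothetical names, not eel's actual classes. The sketch shows the "drop values or throw" behaviour for the JDBC-to-Hive subset case described above.

```scala
object AlignmentSketch {
  // Simplified stand-ins; these are NOT eel's real Row or schema types.
  type Row = Map[String, Any]
  type Schema = Seq[String]

  // Hypothetical interface: turn an input row into one that fits the target schema.
  trait Alignment {
    def align(row: Row, target: Schema): Row
  }

  // Keeps only the target fields and throws if the input lacks a required field.
  object SubsetAlignment extends Alignment {
    def align(row: Row, target: Schema): Row = {
      val missing = target.filterNot(row.contains)
      if (missing.nonEmpty)
        throw new IllegalArgumentException(
          s"Input row is missing fields: ${missing.mkString(", ")}")
      target.map(name => name -> row(name)).toMap
    }
  }
}
```

    Here a row read with the columns `id`, `name` and `email` would be reduced to `id` and `name` if the target schema only defines those two.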

  2. trait CommitCallback extends AnyRef

  3. trait EvolutionStrategy extends AnyRef

    A strategy that determines how a Hive metastore schema is evolved for a given target schema.

    For example, a strategy may alter the Hive table to add any missing columns, abort the write by throwing an exception, or leave the schema as is and drop the extra columns from the input rows.
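
    The three behaviours mentioned above might look like this in outline. The sketch assumes a schema is just a list of (field name, type name) pairs; `Evolution`, `AddMissing`, `AbortOnMismatch` and `LeaveAsIs` are hypothetical names used for illustration, not eel's API, and no real metastore is touched.

```scala
object EvolutionSketch {
  // Simplified stand-in: a schema is an ordered list of (field name, type name) pairs.
  type Schema = Seq[(String, String)]

  // Hypothetical interface: returns the metastore schema to use, or throws to abort the write.
  trait Evolution {
    def evolve(metastore: Schema, target: Schema): Schema
  }

  // Target fields whose names do not appear in the metastore schema.
  private def missingFrom(metastore: Schema, target: Schema): Schema = {
    val known = metastore.map(_._1).toSet
    target.filterNot { case (name, _) => known.contains(name) }
  }

  // Adds any target fields the metastore lacks; types of existing fields are not compared.
  object AddMissing extends Evolution {
    def evolve(metastore: Schema, target: Schema): Schema =
      metastore ++ missingFrom(metastore, target)
  }

  // Aborts the write if the target has fields the metastore lacks.
  object AbortOnMismatch extends Evolution {
    def evolve(metastore: Schema, target: Schema): Schema = {
      val missing = missingFrom(metastore, target)
      if (missing.nonEmpty)
        throw new IllegalStateException(
          s"Metastore schema lacks fields: ${missing.map(_._1).mkString(", ")}")
      metastore
    }
  }

  // Leaves the metastore schema untouched; extra input fields would be dropped when rows are aligned.
  object LeaveAsIs extends Evolution {
    def evolve(metastore: Schema, target: Schema): Schema = metastore
  }
}
```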

  4. trait FileListener extends AnyRef

  5. trait FilenameStrategy extends AnyRef

    Strategy responsible for the filenames created by eel when writing out data.

  6. class HiveContext extends AnyRef

  7. case class HiveDatabase(dbName: String)(implicit fs: FileSystem, client: IMetaStoreClient) extends Product with Serializable

  8. case class HiveDatasetUri(db: String, table: String) extends Product with Serializable

  9. trait HiveDialect extends Logging

  10. class HiveFilePublisher extends Publisher[Seq[Row]] with Using

  11. class HiveOps extends Logging

  12. trait HiveOutputStream extends AnyRef

  13. class HivePartitionExtractor extends AnyRef

  14. class HivePartitionPublisher extends Publisher[Seq[Row]] with Logging

    A Hive Part that can read values from the metastore, rather than reading values from files. This can be used only when the requested fields are all partition keys.
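
    A rough sketch of the condition described above, using stand-in types rather than eel's classes. The names `canServeFromMetastore` and `rows` are hypothetical. Treating an empty projection as "not servable" is an assumption made here, on the basis that an empty projection may mean "all fields".

```scala
object PartitionOnlyReadSketch {
  // Stand-in for a partition: key -> value, e.g. Map("year" -> "2017", "month" -> "03").
  type Partition = Map[String, String]

  // True when every requested field is a partition key, so no data file needs to be opened.
  // Assumption: an empty projection is rejected because it may stand for "all fields".
  def canServeFromMetastore(projection: Seq[String], partitionKeys: Seq[String]): Boolean =
    projection.nonEmpty && projection.forall(partitionKeys.contains)

  // Builds one row per partition from the metadata values alone, in projection order.
  def rows(projection: Seq[String], partitions: Seq[Partition]): Seq[Seq[String]] =
    partitions.map(p => projection.map(key => p(key)))
}
```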

  15. class HivePartitionScanner extends Logging

  16. case class HiveSink(dbName: String, tableName: String, permission: Option[FsPermission] = None, inheritPermissions: Option[Boolean] = None, principal: Option[String] = None, partitionFields: Seq[String] = Nil, partitionStrategy: PartitionStrategy = new DynamicPartitionStrategy, filenameStrategy: FilenameStrategy = DefaultFilenameStrategy, stagingStrategy: StagingStrategy = DefaultStagingStrategy, evolutionStrategy: EvolutionStrategy = AdditionEvolutionStrategy, alignStrategy: AlignmentStrategy = RowPaddingAlignmentStrategy, outputSchemaStrategy: OutputSchemaStrategy = SkipPartitionsOutputSchemaStrategy, keytabPath: Option[Path] = None, fileListener: FileListener = FileListener.noop, createTable: Boolean = false, dialect: Option[HiveDialect] = None, callbacks: Seq[CommitCallback] = Nil, roundingMode: RoundingMode = RoundingMode.UNNECESSARY, metadata: Map[String, String] = Map.empty)(implicit fs: FileSystem, client: IMetaStoreClient) extends Sink with Logging with Product with Serializable

  17. class HiveSinkWriter extends SinkWriter with Logging

  18. case class HiveSource(dbName: String, tableName: String, projection: List[String] = Nil, predicate: Option[Predicate] = None, partitionConstraints: Seq[PartitionConstraint] = Nil, principal: Option[String] = None, keytabPath: Option[Path] = None)(implicit fs: FileSystem, client: IMetaStoreClient) extends Source with Logging with Using with Product with Serializable

    projection: sets which fields are required by the caller.

    predicate: optional predicate which will filter rows at the read level.

  19. trait HiveStats extends AnyRef

  20. case class HiveTable(dbName: String, tableName: String)(implicit fs: FileSystem, conf: Configuration, client: IMetaStoreClient) extends Logging with Product with Serializable

  21. trait OutputSchemaStrategy extends AnyRef

    Accepts a metastore schema and returns the schema that should actually be persisted to disk. This makes it possible to omit some data from the written files. For example, Parquet files commonly skip partition data, since that data is already present in the metastore.
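
    As a small illustration of the partition-skipping case, assuming a schema is simply a list of field names (a stand-in, not eel's schema type) and that the partition keys are known; `skipPartitions` is a hypothetical helper name:

```scala
object OutputSchemaSketch {
  // Simplified stand-in for a schema: an ordered list of field names.
  type Schema = Seq[String]

  // Derives the on-disk schema by removing partition columns from the metastore schema.
  def skipPartitions(metastoreSchema: Schema, partitionKeys: Set[String]): Schema =
    metastoreSchema.filterNot(partitionKeys.contains)
}
```

    With a metastore schema of `id`, `amount`, `year` and a single partition key `year`, the files would carry only `id` and `amount`.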

  22. class ParquetHiveStats extends HiveStats with Logging

  23. case class PartitionColumn(name: String, dataType: DataType = StringType) extends Product with Serializable

  24. trait RowAligner extends AnyRef

  25. trait StagingStrategy extends AnyRef

  26. trait StagingStrategy2 extends AnyRef

  27. case class TableSpec(tableName: String, tableType: TableType, location: String, cols: Seq[FieldSchema], numBuckets: Int, bucketNames: List[String], params: Map[String, String], inputFormat: String, outputFormat: String, serde: String, retention: Int, createTime: Long, lastAccessTime: Long, owner: String) extends Product with Serializable


Value Members

  1. object AdditionEvolutionStrategy extends EvolutionStrategy with Logging

    The AdditionEvolutionStrategy adds any missing fields to the schema in the Hive metastore. It does not check that existing fields have the same type as in the metastore, and new fields cannot be added as partition fields.

  2. object DefaultFilenameStrategy extends FilenameStrategy

  3. object DefaultStagingStrategy extends StagingStrategy

  4. object FileListener

  5. object HiveDDL

  6. object HiveDatasetUri extends Serializable

  7. object HiveDialect extends Logging

  8. object HiveFileScanner extends Logging

  9. object HiveSchemaFns extends Logging

  10. object HiveSink extends Serializable

  11. object HiveTableFilesFn extends Logging

    Locates files for a given table.

    Connects to the Hive metastore to get the list of partitions (or just the table root if the table has no partitions) and scans those directories.

    Returns a Map of each partition to the files in that partition.

    If partition constraints are specified, the partition list is filtered by those constraints before scanning.
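
    The lookup can be pictured roughly as below. This is a sketch under assumptions, not eel's implementation: `Partition` is a stand-in type, `listFiles` stands in for a filesystem directory listing, and constraints are modelled as predicates that a partition must satisfy to be kept, which is one reading of the description above.

```scala
object TableFilesSketch {
  // Stand-in for a partition: its key/value spec and the directory holding its files.
  final case class Partition(spec: Map[String, String], location: String)

  // Returns each partition mapped to the files found in its directory.
  // `listFiles` is a hypothetical stand-in for a filesystem listing call.
  def filesByPartition(
      tableLocation: String,
      partitions: Seq[Partition],
      constraints: Seq[Partition => Boolean],
      listFiles: String => Seq[String]): Map[Partition, Seq[String]] =
    if (partitions.isEmpty) {
      // No partitions: scan just the table root.
      Map(Partition(Map.empty, tableLocation) -> listFiles(tableLocation))
    } else {
      partitions
        .filter(p => constraints.forall(c => c(p))) // keep partitions meeting every constraint
        .map(p => p -> listFiles(p.location))
        .toMap
    }
}
```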

  12. object RowPaddingAlignmentStrategy extends AlignmentStrategy

    An AlignmentStrategy that uses default values, or nulls, to pad out rows to match the target schema, dropping any fields that exist in the input schema but not in the output schema.
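
    The padding behaviour could be sketched like this, assuming a row is a map from field name to value and each target field carries an optional default. `Field` and `pad` are hypothetical stand-ins for illustration, not the actual eel types.

```scala
object RowPaddingSketch {
  // Simplified stand-in for a row: field name -> value.
  type Row = Map[String, Any]

  // Hypothetical target field with an optional default value.
  final case class Field(name: String, default: Option[Any] = None)

  // Pads missing fields with the default (or null) and drops input fields absent from the target.
  def pad(row: Row, target: Seq[Field]): Row =
    target.map { f =>
      f.name -> row.getOrElse(f.name, f.default.getOrElse(null))
    }.toMap
}
```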

  13. object SkipPartitionsOutputSchemaStrategy extends OutputSchemaStrategy

    This strategy drops partition columns from the schema so that they are not written out to the files.

  14. package dialect

  15. package partition
