A Data Source is something that can be queried to produce a dataset on HDFS, as either a Hive table or a Typed Pipe.
Compiled and parameterised Hive query with defined sources.
A Hive view that has been persisted/created.
A source that can be represented or used as a Hive view (i.e. it does not require materialisation).
A Hive table that has been persisted/already exists on HDFS.
path to dataset on HDFS
fully qualified name (e.g. db.tablename)
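To make the `db.tablename` form concrete, the sketch below splits a fully qualified name into its database and table parts. `TableName` and `parse` are hypothetical names used only for illustration; they are not taken from this library, which may validate names differently.

```scala
// Hypothetical helper, not part of this library's API.
final case class TableName(database: String, table: String) {
  // Rebuilds the fully qualified form, e.g. "db.tablename".
  def qualified: String = s"$database.$table"
}

object TableName {
  // Parses "db.tablename"; returns None unless there are exactly
  // two non-empty parts separated by a single dot.
  def parse(name: String): Option[TableName] =
    name.split("\\.", -1) match {
      case Array(db, tbl) if db.nonEmpty && tbl.nonEmpty => Some(TableName(db, tbl))
      case _                                             => None
    }
}
```

Returning an `Option` rather than throwing keeps an unqualified name (no database part) distinguishable from a valid one at the call site.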
A Data Source that wraps an underlying implementation and applies a strategy to decide whether to execute the underlying data source or to reuse cached (possibly stale) data.
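As a rough illustration of this wrapping idea, the sketch below models a data source as a plain trait and a strategy as a decision based on the age of the cached data. All names here (`SimpleSource`, `CacheStrategy`, `MaxAge`, `StrategySource`) are assumptions made for the example and are not this library's API; the real implementation works with Hive tables and Typed Pipes rather than in-memory sequences.

```scala
// Hypothetical, simplified model; none of these names come from the library.
trait SimpleSource[T] {
  def load(): Seq[T]
}

// Decides whether cached data of a given age (in milliseconds) may be reused.
trait CacheStrategy {
  def useCache(ageMillis: Long): Boolean
}

// Reuse cached data while it is younger than maxAgeMillis.
final case class MaxAge(maxAgeMillis: Long) extends CacheStrategy {
  def useCache(ageMillis: Long): Boolean = ageMillis < maxAgeMillis
}

// Wraps an underlying source. The clock is injected so behaviour is testable.
final class StrategySource[T](
    underlying: SimpleSource[T],
    strategy: CacheStrategy,
    now: () => Long
) extends SimpleSource[T] {
  // Time the data was stored, together with the data itself.
  private var cached: Option[(Long, Seq[T])] = None

  def load(): Seq[T] = cached match {
    case Some((storedAt, data)) if strategy.useCache(now() - storedAt) =>
      data // the strategy accepts the cached (possibly stale) data
    case _ =>
      val data = underlying.load() // execute the underlying source
      cached = Some((now(), data))
      data
  }
}
```

Keeping the strategy separate from the wrapper means the staleness policy can change without touching the underlying source.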
Generic Scalding Typed Pipe source. Will be persisted to disk on first use as a Hive table.
Provides a caching mechanism that persists Hive results to the target folder to improve performance.
Replicates the functionality from cascading-hive.
Provides helpers for parsing and manipulating Hive queries.
Provides functions to support a fake Hive environment.
A Hive view that has been persisted/created.
type of output records.
name in the Hive metastore (including database).