A container for bucketing information.
A database defined in the catalog.
A function defined in the catalog.
name of the function
fully qualified class name, e.g. "org.apache.spark.util.MyFunc"
resource types and URIs used by the function
This class of statistics is used in CatalogTable to interact with the metastore. We define this new class instead of using Statistics directly because the catalog has no concept of attributes or broadcast hints.
Storage format, used to describe how a partition or a table is stored.
A table defined in the catalog.
Note that Hive's metastore also tracks skewed columns. We should consider adding that in the future once we have a better understanding of how we want to handle skewed columns.
the name of the data source provider for this table, e.g. parquet, json, etc. Can be None if this table is a view; should be "hive" for Hive SerDe tables.
a list of string descriptions of features that are used by the underlying table but not yet supported by Spark SQL.
whether this table's partition metadata is stored in the catalog. If false, it is inferred automatically from the file structure.
whether the schema resolved for this table is case-sensitive. When using a Hive Metastore, this flag is set to false if a case-sensitive schema could not be read from the table properties. Used to trigger case-sensitive schema inference at query time, when configured.
A partition (Hive style) defined in the catalog.
partition spec values indexed by column name
storage format of the partition
some parameters for the partition, for example, stats.
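As a rough illustration of a partition spec indexed by column name, the sketch below builds a Hive-style relative path from such a spec. The names `PartitionSketch` and `PartitionPaths` are hypothetical and are not part of the catalog API; the `location` field only stands in for the storage format, and escaping of special characters in values is omitted.

```scala
// Hypothetical stand-in for a catalog partition; not the real catalog class.
final case class PartitionSketch(
    spec: Map[String, String],                    // partition values indexed by column name
    location: Option[String],                     // simplified stand-in for the storage format
    parameters: Map[String, String] = Map.empty)  // extra parameters, e.g. stats

object PartitionPaths {
  // Build a Hive-style relative path such as "year=2024/month=01",
  // following the order of the table's partition columns.
  // Assumption: values need no escaping.
  def relativePath(p: PartitionSketch, partitionColumns: Seq[String]): String =
    partitionColumns.map(c => s"$c=${p.spec(c)}").mkString("/")
}
```

The column order comes from the table definition rather than from the map, since a map carries no ordering guarantee.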
Event fired after a database has been created.
Event fired before a database is created.
Event fired after a function has been created.
Event fired before a function is created.
Event fired after a table has been created.
Event fired before a table is created.
Event fired when a database is created or dropped.
Event fired after a database has been dropped.
Event fired before a database is dropped.
Event fired after a function has been dropped.
Event fired before a function is dropped.
Event fired after a table has been dropped.
Event fired before a table is dropped.
Interface for the system catalog (of functions, partitions, tables, and databases).
This is only used for non-temporary items, and implementations must be thread-safe as they can be accessed in multiple threads. This is an external catalog because it is expected to interact with external systems.
Implementations should throw NoSuchDatabaseException when databases don't exist.
Event emitted by the external catalog when it is modified. Events are fired either before or after the modification (each event should document which).
Listener interface for external catalog modification events.
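The before/after event pairing described above can be sketched as follows. This is a minimal, self-contained illustration: the event, listener, and catalog names are hypothetical and do not match the real classes, and only database creation is modeled.

```scala
// Hypothetical event hierarchy for illustration only.
sealed trait CatalogEventSketch
final case class CreateDatabasePre(database: String) extends CatalogEventSketch
final case class CreateDatabaseDone(database: String) extends CatalogEventSketch

// Hypothetical listener interface for catalog modification events.
trait CatalogListenerSketch {
  def onEvent(event: CatalogEventSketch): Unit
}

final class EventfulCatalogSketch {
  private var listeners = List.empty[CatalogListenerSketch]
  private var databases = Set.empty[String]

  def addListener(l: CatalogListenerSketch): Unit = synchronized {
    listeners = l :: listeners
  }

  private def post(e: CatalogEventSketch): Unit = listeners.foreach(_.onEvent(e))

  def createDatabase(name: String): Unit = synchronized {
    post(CreateDatabasePre(name))   // fired before the modification
    databases += name               // the modification itself
    post(CreateDatabaseDone(name))  // fired after the modification
  }

  def databaseExists(name: String): Boolean = synchronized { databases.contains(name) }
}
```

A listener registered on this sketch sees the "pre" event first, so it can observe a pending change before the catalog state is updated.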
Event fired when a function is created, dropped or renamed.
A simple trait representing a class that can be used to load resources used by a function. Because only a SQLContext can load resources, we create this trait to avoid explicitly passing SQLContext around.
A trait that represents the type of a resource needed by a function.
A thread-safe manager for global temporary views, providing atomic operations to manage them, e.g. create, update, remove, etc.
Note that the view name is always case-sensitive here; callers are responsible for formatting the view name according to the case-sensitivity config.
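A minimal sketch of such a manager, assuming a simple lock around a mutable map, might look like the following. The class name `TempViewManagerSketch`, its method signatures, and the choice of exception are assumptions made for illustration, not the real API.

```scala
import scala.collection.mutable

// Hypothetical thread-safe view manager; V stands in for a view definition.
final class TempViewManagerSketch[V] {
  private val views = mutable.HashMap.empty[String, V]

  // Names are matched exactly: "Sales" and "sales" are different views.
  def get(name: String): Option[V] = synchronized { views.get(name) }

  // Create a view; fails if it already exists unless overrideIfExists is set.
  def create(name: String, view: V, overrideIfExists: Boolean): Unit = synchronized {
    if (!overrideIfExists && views.contains(name)) {
      throw new IllegalStateException(s"Temporary view '$name' already exists")
    }
    views.put(name, view)
    ()
  }

  // Replace an existing view; returns false if the view does not exist.
  def update(name: String, view: V): Boolean = synchronized {
    if (views.contains(name)) { views.put(name, view); true } else false
  }

  // Remove a view; returns false if the view does not exist.
  def remove(name: String): Boolean = synchronized { views.remove(name).isDefined }
}
```

Each method holds the lock for its whole check-then-act sequence, which is what makes the operations atomic in this sketch.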
A LogicalPlan that represents a Hive table.
TODO: remove this once Hive is fully supported as a data source.
An in-memory (ephemeral) implementation of the system catalog.
This is a dummy implementation that does not require setting up external systems. It is intended for testing or exploration purposes only and should not be used in production.
All public methods should be synchronized for thread-safety.
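To illustrate the two contracts mentioned for catalog implementations (synchronized public methods, and an exception when a database does not exist), here is a small self-contained sketch. `InMemoryCatalogSketch` and `NoSuchDatabaseSketchException` are hypothetical names; the real classes have a much larger surface.

```scala
import scala.collection.mutable

// Hypothetical exception type used only by this sketch.
final class NoSuchDatabaseSketchException(db: String)
  extends RuntimeException(s"Database '$db' not found")

final class InMemoryCatalogSketch {
  // database name -> table names
  private val catalog = mutable.HashMap.empty[String, mutable.Set[String]]

  private def requireDbExists(db: String): Unit =
    if (!catalog.contains(db)) throw new NoSuchDatabaseSketchException(db)

  // Every public method synchronizes on the catalog instance.
  def createDatabase(db: String): Unit = synchronized {
    catalog.getOrElseUpdate(db, mutable.Set.empty[String])
    ()
  }

  def createTable(db: String, table: String): Unit = synchronized {
    requireDbExists(db)
    catalog(db) += table
    ()
  }

  def listTables(db: String): Seq[String] = synchronized {
    requireDbExists(db)
    catalog(db).toSeq.sorted
  }
}
```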
Event fired after a function has been renamed.
Event fired before a function is renamed.
Event fired after a table has been renamed.
Event fired before a table is renamed.
An internal catalog that is used by a Spark Session. This internal catalog serves as a proxy to the underlying metastore (e.g. Hive Metastore) and it also manages temporary tables and functions of the Spark Session that it belongs to.
This class must be thread-safe.
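The proxy role can be pictured with the sketch below, which keeps session-local temporary views in memory and falls back to an external catalog for everything else. The class name, the use of plain strings for plans, and the lookup order (temporary views first) are all assumptions made for this illustration.

```scala
import scala.collection.mutable

// Hypothetical session catalog; externalTables stands in for the metastore.
final class SessionCatalogSketch(externalTables: Map[String, String]) {
  private val tempViews = mutable.HashMap.empty[String, String]

  def createTempView(name: String, plan: String): Unit = synchronized {
    tempViews.put(name, plan)
    ()
  }

  // Assumption: session-local temporary views shadow external tables.
  def lookup(name: String): Option[String] = synchronized {
    tempViews.get(name).orElse(externalTables.get(name))
  }
}
```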
Event fired when a table is created, dropped or renamed.
A placeholder for a table relation, which will be replaced by a concrete relation such as LogicalRelation or HiveTableRelation during analysis.
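The replacement step can be sketched as a pattern match over a tiny plan hierarchy. All types below are hypothetical stand-ins with a `Sketch` suffix; the only detail taken from this document is that the provider is "hive" for Hive SerDe tables and may be None for views.

```scala
// Hypothetical plan nodes for illustration; not the real catalyst classes.
sealed trait PlanSketch
final case class UnresolvedCatalogRelationSketch(table: String, provider: Option[String])
  extends PlanSketch
final case class LogicalRelationSketch(table: String, source: String) extends PlanSketch
final case class HiveTableRelationSketch(table: String) extends PlanSketch

object ResolveSketch {
  // Replace the placeholder with a concrete relation based on the provider.
  def resolve(plan: PlanSketch): PlanSketch = plan match {
    case UnresolvedCatalogRelationSketch(t, Some("hive")) => HiveTableRelationSketch(t)
    case UnresolvedCatalogRelationSketch(t, Some(src))    => LogicalRelationSketch(t, src)
    case other                                            => other // no provider, or already resolved
  }
}
```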
A container for bucketing information. Bucketing is a technology for decomposing data sets into more manageable parts, and the number of buckets is fixed so it does not fluctuate with data.
number of buckets.
the names of the columns that are used to generate the bucket id.
the names of the columns that are used to sort data in each bucket.
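As a rough sketch of how a fixed bucket count and a set of bucket columns determine a bucket id, consider the following. `BucketSpecSketch` and `BucketingSketch` are hypothetical names, and the hash used here (the standard `hashCode` of the column values) is a simplifying assumption; an actual engine may use a different hash function.

```scala
// Hypothetical stand-in for the bucketing container; not the real class.
final case class BucketSpecSketch(
    numBuckets: Int,                // number of buckets (fixed)
    bucketColumnNames: Seq[String], // columns used to generate the bucket id
    sortColumnNames: Seq[String]) { // columns used to sort data in each bucket
  require(numBuckets > 0, "numBuckets must be positive")
}

object BucketingSketch {
  // Assumption: the bucket id is the non-negative modulus of a hash of the
  // bucket column values. Only the bucket columns influence the result.
  def bucketId(spec: BucketSpecSketch, row: Map[String, Any]): Int = {
    val key = spec.bucketColumnNames.map(c => row(c))
    java.lang.Math.floorMod(key.hashCode(), spec.numBuckets)
  }
}
```

Because the number of buckets is fixed, rows with equal bucket column values always land in the same bucket, regardless of how much data is written.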