Package

org.apache.spark.sql.catalyst

catalog

package catalog

Type Members

  1. case class CatalogColumn(name: String, dataType: String, nullable: Boolean = true, comment: Option[String] = None) extends Product with Serializable

    A column in a table.

  2. case class CatalogDatabase(name: String, description: String, locationUri: String, properties: Map[String, String]) extends Product with Serializable

    A database defined in the catalog.

  3. case class CatalogFunction(identifier: FunctionIdentifier, className: String, resources: Seq[FunctionResource]) extends Product with Serializable

    A function defined in the catalog.

    identifier: name of the function

    className: fully qualified class name, e.g. "org.apache.spark.util.MyFunc"

    resources: resource types and URIs used by the function

  4. trait CatalogRelation extends AnyRef

    An interface that is implemented by logical plans to return the underlying catalog table. If we can in the future consolidate SimpleCatalogRelation and MetastoreRelation, we should probably remove this interface.

  5. case class CatalogStorageFormat(locationUri: Option[String], inputFormat: Option[String], outputFormat: Option[String], serde: Option[String], compressed: Boolean, serdeProperties: Map[String, String]) extends Product with Serializable

    Storage format, used to describe how a partition or a table is stored.

  6. case class CatalogTable(identifier: TableIdentifier, tableType: CatalogTableType, storage: CatalogStorageFormat, schema: Seq[CatalogColumn], partitionColumnNames: Seq[String] = Seq.empty, sortColumnNames: Seq[String] = Seq.empty, bucketColumnNames: Seq[String] = Seq.empty, numBuckets: Int = 1, owner: String = "", createTime: Long = System.currentTimeMillis, lastAccessTime: Long = 1, properties: Map[String, String] = Map.empty, viewOriginalText: Option[String] = None, viewText: Option[String] = None, comment: Option[String] = None, hasUnsupportedFeatures: Boolean = false) extends Product with Serializable

    A table defined in the catalog.

    Note that Hive's metastore also tracks skewed columns. We should consider adding that in the future once we have a better understanding of how we want to handle skewed columns.

    hasUnsupportedFeatures: indicates whether any table metadata retrieved from the concrete underlying external catalog (e.g. Hive metastore) is not supported by Spark SQL. For example, if the underlying Hive table has skewed columns, that information cannot be mapped to CatalogTable because Spark SQL does not currently handle skewed columns. In this case hasUnsupportedFeatures is set to true. The default is false.
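
    The flag described above could be derived roughly as follows. This is a minimal sketch using locally declared stand-in types, not the Spark classes: RawHiveTable, SimpleTable, and fromHive are hypothetical names introduced only for illustration.

    ```scala
    // Hypothetical stand-ins for illustration; these are NOT the Spark classes.
    final case class RawHiveTable(
        name: String,
        columns: Seq[String],
        skewedColumns: Seq[String])

    final case class SimpleTable(
        name: String,
        columns: Seq[String],
        hasUnsupportedFeatures: Boolean = false)

    object UnsupportedFeaturesSketch {
      // Skewed-column metadata has no field to land in, so it is dropped
      // and the loss is recorded in the flag instead.
      def fromHive(raw: RawHiveTable): SimpleTable =
        SimpleTable(
          raw.name,
          raw.columns,
          hasUnsupportedFeatures = raw.skewedColumns.nonEmpty)
    }
    ```

    A caller can then check the flag before attempting operations that would silently lose the unmapped metadata.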

  7. case class CatalogTablePartition(spec: TablePartitionSpec, storage: CatalogStorageFormat) extends Product with Serializable

    A partition (Hive style) defined in the catalog.

    spec: partition spec values indexed by column name

    storage: storage format of the partition
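
    To illustrate a spec "indexed by column name", the sketch below models it as a map from partition column name to value and renders it as a Hive-style relative path. The PartitionSpec alias and the toPath helper are assumptions made for this example, not part of the documented API, and the helper does no escaping of special characters.

    ```scala
    object PartitionSpecSketch {
      // Assumption: a partition spec is modeled as column name -> value.
      type PartitionSpec = Map[String, String]

      // Renders the spec as "col1=v1/col2=v2", following the order of the
      // table's partition columns rather than the map's iteration order.
      // Throws NoSuchElementException if the spec lacks a partition column.
      def toPath(partitionColumns: Seq[String], spec: PartitionSpec): String =
        partitionColumns.map(c => s"$c=${spec(c)}").mkString("/")
    }
    ```

    For example, PartitionSpecSketch.toPath(Seq("year", "month"), Map("month" -> "07", "year" -> "2016")) yields "year=2016/month=07".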

  8. case class CatalogTableType extends Product with Serializable

  9. abstract class ExternalCatalog extends AnyRef

    Interface for the system catalog (of columns, partitions, tables, and databases).

    This is only used for non-temporary items, and implementations must be thread-safe as they can be accessed in multiple threads. This is an external catalog because it is expected to interact with external systems.

    Implementations should throw NoSuchDatabaseException when a table or database doesn't exist.

  10. case class FunctionResource(resourceType: FunctionResourceType, uri: String) extends Product with Serializable

  11. trait FunctionResourceLoader extends AnyRef

    A simple trait representing a class that can be used to load resources used by a function. Because only a SQLContext can load resources, we create this trait to avoid explicitly passing SQLContext around.

  12. abstract class FunctionResourceType extends AnyRef

    Represents the type of a resource needed by a function.

  13. class InMemoryCatalog extends ExternalCatalog

    An in-memory (ephemeral) implementation of the system catalog.

    This is a dummy implementation that does not require setting up external systems. It is intended for testing or exploration purposes only and should not be used in production.

    All public methods should be synchronized for thread-safety.
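
    The synchronization requirement can be sketched with a toy catalog. ToyCatalog and its methods are hypothetical and far simpler than InMemoryCatalog; the point is only that every public method takes the same lock before touching the shared mutable state.

    ```scala
    import scala.collection.mutable

    // Hypothetical toy catalog for illustration; not Spark's InMemoryCatalog.
    class ToyCatalog {
      // Shared mutable state: database name -> table names.
      private val databases =
        mutable.HashMap.empty[String, mutable.HashSet[String]]

      // Each public method locks on `this`, so calls never interleave.
      def createDatabase(db: String): Unit = synchronized {
        databases.getOrElseUpdate(db, mutable.HashSet.empty[String])
        ()
      }

      def createTable(db: String, table: String): Unit = synchronized {
        val tables = databases.getOrElse(
          db, throw new NoSuchElementException(s"database $db"))
        tables += table
        ()
      }

      def listTables(db: String): Seq[String] = synchronized {
        databases.getOrElse(
          db, throw new NoSuchElementException(s"database $db")).toSeq.sorted
      }
    }
    ```

    The sketch throws a plain NoSuchElementException for a missing database; a real implementation would use the catalog's own exception types.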

  14. class SessionCatalog extends Logging

    An internal catalog that is used by a Spark Session. This internal catalog serves as a proxy to the underlying metastore (e.g. Hive Metastore) and it also manages temporary tables and functions of the Spark Session that it belongs to.

    This class must be thread-safe.
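
    The proxy role described above can be sketched as follows. ToyExternalCatalog and ToySessionCatalog are hypothetical stand-ins, and the resolution rule shown (an unqualified name matches a temporary table first, everything else is delegated) is an assumption made for illustration rather than a statement of SessionCatalog's exact behavior.

    ```scala
    import scala.collection.mutable

    // Hypothetical stand-ins for illustration; not the Spark API.
    trait ToyExternalCatalog {
      def tableExists(db: String, table: String): Boolean
    }

    class ToySessionCatalog(
        external: ToyExternalCatalog,
        currentDb: String = "default") {

      // Session-local state, guarded by the lock on `this`.
      private val tempTables = mutable.HashSet.empty[String]

      def registerTempTable(name: String): Unit = synchronized {
        tempTables += name
        ()
      }

      // Assumed rule: an unqualified name matches a temporary table first;
      // anything else is delegated to the external catalog.
      def tableExists(name: String, db: Option[String] = None): Boolean =
        synchronized {
          if (db.isEmpty && tempTables.contains(name)) true
          else external.tableExists(db.getOrElse(currentDb), name)
        }
    }
    ```

    Temporary tables never reach the external catalog in this sketch, which mirrors the split between session-scoped and persistent items.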

  15. case class SimpleCatalogRelation(databaseName: String, metadata: CatalogTable, alias: Option[String] = None) extends LeafNode with CatalogRelation with Product with Serializable

    A LogicalPlan that wraps CatalogTable.

    Note that in the future we should consolidate this and HiveCatalogRelation.
