Packages

package sql

Allows the execution of relational queries, including those expressed in SQL using Spark.

Definition Classes: spark

package execution

The physical execution component of Spark SQL. Note that this is a private package. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

Definition Classes: sql

package datasources

Definition Classes: execution

case class DataSource(sparkSession: SparkSession, className: String, paths: Seq[String] = Nil, userSpecifiedSchema: Option[StructType] = None, partitionColumns: Seq[String] = Seq.empty, bucketSpec: Option[BucketSpec] = None, options: Map[String, String] = Map.empty, catalogTable: Option[CatalogTable] = None) extends Logging with Product with Serializable

The main class responsible for representing a pluggable Data Source in Spark SQL. In addition to acting as the canonical set of parameters that can describe a Data Source, this class is used to resolve a description to a concrete implementation that can be used in a query plan (either batch or streaming) or to write out data using an external library.

From an end user's perspective a DataSource description can be created explicitly using org.apache.spark.sql.DataFrameReader or CREATE TABLE USING DDL. Additionally, this class is used when resolving a description from a metastore to a concrete implementation.

Many of the arguments to this class are optional, though depending on the specific API being used these optional arguments might be filled in during resolution using either inference or external metadata. For example, when reading a partitioned table from a file system, partition columns will be inferred from the directory layout even if they are not specified.
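The directory-layout inference mentioned above can be illustrated with a small, Spark-free sketch. It assumes the common `column=value` directory naming convention and only shows the idea; it is not Spark's implementation, and the object and method names (`PartitionLayoutSketch`, `partitionValues`, `partitionColumns`) are hypothetical.

```scala
object PartitionLayoutSketch {

  /** Extracts (column, value) pairs from the `column=value` directory
    * segments of a file path. The final segment is treated as the file
    * name and ignored. */
  def partitionValues(filePath: String): Seq[(String, String)] =
    filePath.split('/').toSeq.dropRight(1).flatMap { segment =>
      segment.split("=", 2) match {
        case Array(column, value) if column.nonEmpty => Some(column -> value)
        case _                                       => None
      }
    }

  /** Collects the distinct partition column names seen across all files,
    * in order of first appearance. */
  def partitionColumns(filePaths: Seq[String]): Seq[String] =
    filePaths.flatMap(partitionValues).map(_._1).distinct
}
```

Given files under `/data/events/year=2024/month=01/`, this sketch reports `year` and `month` as partition columns even though no caller named them, which is the behavior the paragraph above describes for partitioned file-system tables.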

paths: A list of file system paths that hold data. These are globbed beforehand if the "globPaths" option is true, and are qualified. This option only works when reading from a FileFormat.
userSpecifiedSchema: An optional specification of the schema of the data. When present we skip attempting to infer the schema.
partitionColumns: A list of column names that the relation is partitioned by. This list is generally empty during the read path, unless this DataSource is managed by Hive. In these cases, during resolveRelation, we will call getOrInferFileFormatSchema for file based DataSources to infer the partitioning. In other cases, if this list is empty, then this table is unpartitioned.
bucketSpec: An optional specification for bucketing (hash-partitioning) of the data.
catalogTable: Optional catalog table reference that can be used to push down operations over the datasource to the catalog service.
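
From the end-user side, these parameters are usually supplied indirectly through org.apache.spark.sql.DataFrameReader. The sketch below assumes spark-sql is on the classpath and that a local SparkSession is acceptable; the object name `DataFrameReaderSketch`, the method `roundTrip`, and the sample data are hypothetical. Roughly, the format name plays the role of className, the option calls populate options, and the path passed to load ends up in paths. The exact internal wiring may differ between Spark versions.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{DataFrame, SparkSession}

object DataFrameReaderSketch {

  /** Writes a tiny partitioned Parquet dataset to a temporary directory
    * and reads it back without naming the partition column. */
  def roundTrip(spark: SparkSession): DataFrame = {
    import spark.implicits._

    // The target directory must not exist yet, so resolve a fresh child path.
    val dir = Files.createTempDirectory("datasource-sketch").resolve("events").toString

    Seq((1L, "a", 2023), (2L, "b", 2024))
      .toDF("id", "tag", "year")
      .write
      .partitionBy("year") // produces year=2023/ and year=2024/ directories
      .parquet(dir)

    spark.read
      .format("parquet")              // the data source name
      .option("mergeSchema", "false") // one entry of the options map
      .load(dir)                      // the path(s) holding the data
  }
}
```

No partition columns are passed on the read side; the `year` column reappears in the result because it is recovered from the directory names.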

Definition Classes: datasources

SourceInfo

org.apache.spark.sql.execution.datasources.DataSource

case class SourceInfo(name: String, schema: StructType, partitionColumns: Seq[String]) extends Product with Serializable

Linear Supertypes

Serializable, Product, Equals, AnyRef, Any

Instance Constructors

new SourceInfo(name: String, schema: StructType, partitionColumns: Seq[String])
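
A minimal sketch of constructing and inspecting a SourceInfo, assuming spark-sql is on the classpath and that the class is imported from the DataSource companion object as the qualified name on this page suggests. The source name `"example-source"`, the schema, and the helper `describe` are hypothetical values chosen for illustration. Because execution is a private package, code like this may break between minor releases.

```scala
import org.apache.spark.sql.execution.datasources.DataSource.SourceInfo
import org.apache.spark.sql.types.{LongType, StringType, StructType}

object SourceInfoSketch {

  // A hypothetical two-column schema; "year" doubles as the partition column.
  val schema: StructType = new StructType()
    .add("id", LongType)
    .add("year", StringType)

  // Positional construction, matching the constructor signature above.
  val info: SourceInfo = SourceInfo("example-source", schema, Seq("year"))

  // Case classes provide copy(); only the named field changes.
  val renamed: SourceInfo = info.copy(name = "other-source")

  /** Pattern matching destructures the three constructor fields. */
  def describe(i: SourceInfo): String = i match {
    case SourceInfo(name, s, parts) =>
      s"$name: ${s.fieldNames.mkString(", ")} partitioned by ${parts.mkString(", ")}"
  }
}
```

Since SourceInfo is a case class, two instances compare equal exactly when their name, schema, and partitionColumns are equal, so `info` and `renamed` above are distinct values.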

Value Members

final def !=(arg0: Any): Boolean
Definition Classes: AnyRef → Any

final def ##: Int
Definition Classes: AnyRef → Any

final def ==(arg0: Any): Boolean
Definition Classes: AnyRef → Any

final def asInstanceOf[T0]: T0
Definition Classes: Any

def clone(): AnyRef
Attributes: protected[lang]
Definition Classes: AnyRef
Annotations: @throws(classOf[java.lang.CloneNotSupportedException]) @native()

final def eq(arg0: AnyRef): Boolean
Definition Classes: AnyRef

def finalize(): Unit
Attributes: protected[lang]
Definition Classes: AnyRef
Annotations: @throws(classOf[java.lang.Throwable])

final def getClass(): Class[_ <: AnyRef]
Definition Classes: AnyRef → Any
Annotations: @native()

final def isInstanceOf[T0]: Boolean
Definition Classes: Any

val name: String

final def ne(arg0: AnyRef): Boolean
Definition Classes: AnyRef

final def notify(): Unit
Definition Classes: AnyRef
Annotations: @native()

final def notifyAll(): Unit
Definition Classes: AnyRef
Annotations: @native()

val partitionColumns: Seq[String]

def productElementNames: Iterator[String]
Definition Classes: Product

val schema: StructType

final def synchronized[T0](arg0: => T0): T0
Definition Classes: AnyRef

final def wait(): Unit
Definition Classes: AnyRef
Annotations: @throws(classOf[java.lang.InterruptedException])

final def wait(arg0: Long, arg1: Int): Unit
Definition Classes: AnyRef
Annotations: @throws(classOf[java.lang.InterruptedException])

final def wait(arg0: Long): Unit
Definition Classes: AnyRef
Annotations: @throws(classOf[java.lang.InterruptedException]) @native()
