Package org.apache.spark.sql.catalyst

package catalyst

Catalyst is a library for manipulating relational query plans. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.

Linear Supertypes: AnyRef, Any

Type Members

  1. trait CatalystConf extends AnyRef

    Interface for configuration options used in the catalyst module.

  2. trait DefinedByConstructorParams extends AnyRef

    A helper trait to create org.apache.spark.sql.catalyst.encoders.ExpressionEncoders for classes whose fields are entirely defined by constructor params but should not be case classes.
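
    A minimal sketch of how the trait is used (the Point class here is hypothetical):

    import org.apache.spark.sql.catalyst.DefinedByConstructorParams
    import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder

    // Not a case class, but all fields come from constructor params,
    // so mixing in the trait lets encoder derivation treat it like one.
    class Point(val x: Double, val y: Double) extends DefinedByConstructorParams

    val pointEncoder = ExpressionEncoder[Point]()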

  3. case class FunctionIdentifier(funcName: String, database: Option[String]) extends IdentifierWithDatabase with Product with Serializable

    Identifies a function in a database. If database is not defined, the current database is used.

  4. sealed trait IdentifierWithDatabase extends AnyRef

    An identifier that optionally specifies a database.

    Format (unquoted): "name" or "db.name"
    Format (quoted): "`name`" or "`db`.`name`"
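
    A brief sketch covering both concrete identifier types, FunctionIdentifier and TableIdentifier (the names used are illustrative):

    import org.apache.spark.sql.catalyst.{FunctionIdentifier, TableIdentifier}

    // No database: the current database is assumed at resolution time.
    TableIdentifier("events", None).unquotedString        // events

    // Explicit database qualifier.
    val t = TableIdentifier("events", Some("logs"))
    t.unquotedString                                      // logs.events
    t.quotedString                                        // `logs`.`events`

    FunctionIdentifier("my_udf", Some("logs")).unquotedString  // logs.my_udf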

  5. abstract class InternalRow extends SpecializedGetters with Serializable

    An abstract class for row used internally in Spark SQL, which only contains the columns as internal types.
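
    For example, a minimal sketch using the companion object's factory:

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.unsafe.types.UTF8String

    // Columns hold internal types: strings are UTF8String, not String.
    val row: InternalRow = InternalRow(UTF8String.fromString("spark"), 42)
    row.getUTF8String(0)  // spark
    row.getInt(1)         // 42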

  6. trait ScalaReflection extends AnyRef

    Support for generating catalyst schemas for Scala objects. Note that unlike its companion object, this trait is able to work in both the runtime and the compile-time (macro) universe.

  7. case class SimpleCatalystConf(caseSensitiveAnalysis: Boolean, orderByOrdinal: Boolean = true, groupByOrdinal: Boolean = true, optimizerMaxIterations: Int = 100, optimizerInSetConversionThreshold: Int = 10, maxCaseBranchesForCodegen: Int = 20, runSQLonFile: Boolean = true, crossJoinEnabled: Boolean = false, warehousePath: String = "/user/hive/warehouse") extends CatalystConf with Product with Serializable

    A CatalystConf that can be used for local testing.
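
    A minimal sketch: only caseSensitiveAnalysis is required, and the remaining parameters fall back to the defaults shown in the signature above.

    import org.apache.spark.sql.catalyst.SimpleCatalystConf

    val conf = SimpleCatalystConf(caseSensitiveAnalysis = true)
    conf.optimizerMaxIterations  // 100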

  8. case class TableIdentifier(table: String, database: Option[String]) extends IdentifierWithDatabase with Product with Serializable

    Identifies a table in a database. If database is not defined, the current database is used. When we register a permanent function in the FunctionRegistry, we use unquotedString as the function name.

Value Members

  1. object CatalystTypeConverters

    Functions to convert Scala types to Catalyst types and vice versa.
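
    For example, a minimal sketch of a round trip:

    import org.apache.spark.sql.catalyst.CatalystTypeConverters
    import org.apache.spark.sql.types.StringType

    // Scala -> Catalyst: a String is converted to a UTF8String.
    val catalystValue = CatalystTypeConverters.convertToCatalyst("spark")

    // Catalyst -> Scala: conversion is guided by the Catalyst DataType.
    val scalaValue = CatalystTypeConverters.convertToScala(catalystValue, StringType)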

  2. object FunctionIdentifier extends Serializable

  3. object InternalRow extends Serializable

  4. object JavaTypeInference

    Type-inference utilities for POJOs and Java collections.
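
    A sketch, assuming the inferDataType overload that takes a Class:

    import org.apache.spark.sql.catalyst.JavaTypeInference

    // Returns the inferred Catalyst DataType and whether it is nullable.
    val (dataType, nullable) = JavaTypeInference.inferDataType(classOf[java.sql.Timestamp])
    // (TimestampType, true)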

  5. object ScalaReflection extends ScalaReflection

    A default version of ScalaReflection that uses the runtime universe.
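
    For example, schemaFor derives a Catalyst schema from a Scala type (the Person case class is illustrative):

    import org.apache.spark.sql.catalyst.ScalaReflection

    case class Person(name: String, age: Int)

    // Returns a Schema(dataType, nullable) pair for the type.
    val schema = ScalaReflection.schemaFor[Person]
    // StructType(StructField(name,StringType,true), StructField(age,IntegerType,false))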

  6. object ScalaReflectionLock

    A JVM-global lock that should be used to prevent thread-safety issues when using things in scala.reflect.*. Note that the Scala reflection API is thread-safe as of 2.11, but not for 2.10.x builds. See SI-6240 for more details.

    Attributes
    protected[org.apache.spark.sql]
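
    A sketch; because the object is protected[sql], this is only usable from code inside org.apache.spark.sql:

    // Guard scala.reflect usage that is not thread-safe on Scala 2.10.
    val tpe = ScalaReflectionLock.synchronized {
      scala.reflect.runtime.universe.typeOf[Int]
    }
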
  7. object TableIdentifier extends Serializable

  8. package analysis

    Provides a logical query plan Analyzer and supporting classes for performing analysis. Analysis consists of translating UnresolvedAttributes and UnresolvedRelations into fully typed objects using information in a schema Catalog.
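
    A minimal sketch, using the dsl package described below:

    import org.apache.spark.sql.catalyst.dsl.expressions._
    import org.apache.spark.sql.catalyst.dsl.plans._
    import org.apache.spark.sql.catalyst.plans.logical.LocalRelation

    val plan = LocalRelation('a.int, 'b.string).where('a > 1).select('b)
    plan.resolved          // false: 'a and 'b are UnresolvedAttributes
    plan.analyze.resolved  // true: analysis bound them to typed attributes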

  9. package catalog

  10. package dsl

    A collection of implicit conversions that create a DSL for constructing catalyst data structures.

    scala> import org.apache.spark.sql.catalyst.dsl.expressions._
    
    // Standard operators are added to expressions.
    scala> import org.apache.spark.sql.catalyst.expressions.Literal
    scala> Literal(1) + Literal(1)
    res0: org.apache.spark.sql.catalyst.expressions.Add = (1 + 1)
    
    // There is a conversion from 'symbols to unresolved attributes.
    scala> 'a.attr
    res1: org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute = 'a
    
    // These unresolved attributes can be used to create more complicated expressions.
    scala> 'a === 'b
    res2: org.apache.spark.sql.catalyst.expressions.EqualTo = ('a = 'b)
    
    // SQL verbs can be used to construct logical query plans.
    scala> import org.apache.spark.sql.catalyst.plans.logical._
    scala> import org.apache.spark.sql.catalyst.dsl.plans._
    scala> LocalRelation('key.int, 'value.string).where('key === 1).select('value).analyze
    res3: org.apache.spark.sql.catalyst.plans.logical.LogicalPlan =
    Project [value#3]
     Filter (key#2 = 1)
      LocalRelation [key#2,value#3], []
  11. package encoders

  12. package errors

    Functions for attaching and retrieving trees that are associated with errors.
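
    A sketch using attachTree from this package (the resolveAll name is hypothetical):

    import org.apache.spark.sql.catalyst.errors.attachTree
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

    // If the body throws, the resulting TreeNodeException carries the
    // tree that was being processed when the error occurred.
    def resolveAll(plan: LogicalPlan): LogicalPlan =
      attachTree(plan, "resolving plan") {
        plan  // ... real work that may throw goes here ...
      }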

  13. package expressions

    A set of classes that can be used to represent trees of relational expressions. A key goal of the expression library is to hide the details of naming and scoping from developers who want to manipulate trees of relational operators. As such, the library defines a special type of expression, a NamedExpression, in addition to the standard collection of expressions.

    Standard Expressions

    A library of standard expressions (e.g., Add, EqualTo), aggregates (e.g., SUM, COUNT), and other computations (e.g., UDFs). Each expression type is capable of determining its output schema as a function of its children's output schema.

    Named Expressions

    Some expressions are named and thus can be referenced by later operators in the dataflow graph. The two types of named expressions are AttributeReferences and Aliases. AttributeReferences refer to attributes of the input tuple for a given operator and form the leaves of some expression trees. Aliases assign a name to intermediate computations. For example, in the SQL statement SELECT a+b AS c FROM ..., the expressions a and b would be represented by AttributeReferences and c would be represented by an Alias.

    During analysis, all named expressions are assigned a globally unique expression id, which can be used for equality comparisons. While the original names are kept around for debugging purposes, they should never be used to check if two attributes refer to the same value, as plan transformations can result in the introduction of naming ambiguity. For example, consider a plan that contains subqueries, both of which are reading from the same table. If an optimization removes the subqueries, scoping information would be destroyed, eliminating the ability to reason about which subquery produced a given attribute.

    Evaluation

    The result of an expression can be evaluated using the Expression.eval(InternalRow) method.
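
    A minimal sketch; the expression is constant, so no meaningful input row is needed:

    import org.apache.spark.sql.catalyst.InternalRow
    import org.apache.spark.sql.catalyst.expressions.{Add, Literal}

    val expr = Add(Literal(1), Literal(2))
    expr.eval(InternalRow.empty)  // 3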

  14. package json

  15. package optimizer

  16. package parser

  17. package planning

    Contains classes for enumerating possible physical plans for a given logical query plan.

  18. package plans

    A collection of common abstractions for query plans as well as a base logical plan representation.

  19. package rules

    A framework for applying batches of rewrite rules to trees, possibly to a fixed point.
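
    For example, a hypothetical rule that drops Filter nodes whose condition is the literal true:

    import org.apache.spark.sql.catalyst.expressions.Literal
    import org.apache.spark.sql.catalyst.plans.logical.{Filter, LogicalPlan}
    import org.apache.spark.sql.catalyst.rules.Rule

    object RemoveTrivialFilters extends Rule[LogicalPlan] {
      def apply(plan: LogicalPlan): LogicalPlan = plan transform {
        case Filter(Literal(true, _), child) => child
      }
    }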

  20. package trees

    A library for easily manipulating trees of operators. Operators that extend TreeNode are granted the following interface:

    • Scala-collection-like methods (foreach, map, flatMap, collect, etc.)

    • transform: accepts a partial function that is used to generate a new tree. When the partial function can be applied to a given tree segment, that segment is replaced with the result. After attempting to apply the partial function to a given node, the transform function recursively attempts to apply the function to that node's children. (A sketch follows this list.)

    • debugging support: pretty printing, easy splicing of trees, etc.
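
    A sketch of transform in action on an expression tree:

    import org.apache.spark.sql.catalyst.expressions.{Add, Expression, Literal}

    val expr: Expression = Add(Literal(1), Add(Literal(2), Literal(3)))

    // The partial function matches Adds of two integer literals and
    // folds them. One pass rewrites the inner Add(2, 3) to Literal(5);
    // rerunning to a fixed point (as the rules package does) would then
    // fold the outer Add as well.
    val once = expr transform {
      case Add(Literal(a: Int, dt), Literal(b: Int, _)) => Literal(a + b, dt)
    }
    // once == Add(Literal(1), Literal(5))
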
  21. package util
