Packages

package sql

Linear Supertypes
AnyRef, Any

Package Members

  1. package artifact
  2. package avro
  3. package catalyst
  4. package classic

    Allows the execution of relational queries, including those expressed in SQL, using Spark.

  5. package columnar
  6. package connector
  7. package execution

    The physical execution component of Spark SQL. Note that this is a private package; all classes in it are considered an internal API to Spark SQL and are subject to change between minor releases.

  8. package internal

    All classes in this package are considered an internal API to Spark and are subject to change between minor releases.

  9. package jdbc
  10. package scripting
  11. package sources

    A set of APIs for adding data sources to Spark SQL.

  12. package streaming
  13. package util

Type Members

  1. type DataFrame = Dataset[Row]
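
    Because DataFrame is only a type alias for Dataset[Row], the two types are interchangeable. A minimal sketch (assuming a local session; the column name "id" is arbitrary):

    import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    // range() yields a Dataset[java.lang.Long]; toDF("id") converts it to a DataFrame
    val df: DataFrame = spark.range(3).toDF("id")
    // Since DataFrame = Dataset[Row], the same value typechecks as Dataset[Row]
    val ds: Dataset[Row] = df
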
  2. class ExperimentalMethods extends AnyRef

    :: Experimental :: Holder for experimental methods for the bravest. We make NO guarantee about the binary or source compatibility of methods here.

    spark.experimental.extraStrategies += ...
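
    Expanded, a registration might look like the sketch below; NothingStrategy is a hypothetical no-op strategy, and Strategy is assumed to be the alias for SparkStrategy exported by this package:

    import org.apache.spark.sql.{SparkSession, Strategy}
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.execution.SparkPlan

    // Hypothetical strategy that plans nothing (returns Nil), deferring to
    // Spark's built-in strategies for every plan it is offered.
    object NothingStrategy extends Strategy {
      override def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
    }

    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    // Strategies are appended by reassigning the mutable extraStrategies sequence
    spark.experimental.extraStrategies = spark.experimental.extraStrategies :+ NothingStrategy
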
    Annotations
    @Experimental() @Unstable()
    Since

    1.3.0

  3. trait ExtendedExplainGenerator extends AnyRef

    A trait for a session extension to implement that provides additional explain-plan information.
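
    A minimal sketch of an implementation, assuming the trait exposes a title and a generateExtendedInfo(plan: SparkPlan) method (verify the member names against the trait itself; the generator class here is hypothetical):

    import org.apache.spark.sql.ExtendedExplainGenerator
    import org.apache.spark.sql.execution.SparkPlan

    // Hypothetical generator reporting the physical operator count of a plan
    class OperatorCountExplainGenerator extends ExtendedExplainGenerator {
      override def title: String = "Operator Count"
      override def generateExtendedInfo(plan: SparkPlan): String =
        s"physical operators: ${plan.collect { case p => p }.size}"
    }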

    Annotations
    @DeveloperApi() @Since("4.0.0")
  4. class SparkSessionExtensions extends AnyRef

    :: Experimental :: Holder for injection points to the SparkSession. We make NO guarantee about the binary or source compatibility of methods here.

    This currently provides the following extension points:

    • Analyzer Rules.
    • Check Analysis Rules.
    • Cache Plan Normalization Rules.
    • Optimizer Rules.
    • Pre CBO Rules.
    • Planning Strategies.
    • Customized Parser.
    • (External) Catalog listeners.
    • Columnar Rules.
    • Adaptive Query Post Planner Strategy Rules.
    • Adaptive Query Stage Preparation Rules.
    • Adaptive Query Execution Runtime Optimizer Rules.
    • Adaptive Query Stage Optimizer Rules.

    The extensions can be used by calling withExtensions on the SparkSession.Builder, for example:

    SparkSession.builder()
      .master("...")
      .config("...", true)
      .withExtensions { extensions =>
        extensions.injectResolutionRule { session =>
          ...
        }
        extensions.injectParser { (session, parser) =>
          ...
        }
      }
      .getOrCreate()

    The extensions can also be used by setting the Spark SQL configuration property spark.sql.extensions. Multiple extensions can be set using a comma-separated list. For example:

    SparkSession.builder()
      .master("...")
      .config("spark.sql.extensions", "org.example.MyExtensions,org.example.YourExtensions")
      .getOrCreate()
    
    class MyExtensions extends Function1[SparkSessionExtensions, Unit] {
      override def apply(extensions: SparkSessionExtensions): Unit = {
        extensions.injectResolutionRule { session =>
          ...
        }
        extensions.injectParser { (session, parser) =>
          ...
        }
      }
    }
    
    class YourExtensions extends SparkSessionExtensionsProvider {
      override def apply(extensions: SparkSessionExtensions): Unit = {
        extensions.injectResolutionRule { session =>
          ...
        }
        extensions.injectFunction(...)
      }
    }

    Note that none of the injected builders should assume that the SparkSession is fully initialized and should not touch the session's internals (e.g. the SessionState).
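
    Filling in the elided bodies above, a complete (if trivial) extension might look like the sketch below; NoopExtensions is a hypothetical name, and the no-op rule merely shows the Rule[LogicalPlan] shape that injectOptimizerRule expects:

    import org.apache.spark.sql.SparkSessionExtensions
    import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
    import org.apache.spark.sql.catalyst.rules.Rule

    // Hypothetical extension that injects an optimizer rule returning every
    // plan unchanged; a real rule would pattern-match on and rewrite nodes.
    class NoopExtensions extends (SparkSessionExtensions => Unit) {
      override def apply(extensions: SparkSessionExtensions): Unit = {
        extensions.injectOptimizerRule { session =>
          new Rule[LogicalPlan] {
            override def apply(plan: LogicalPlan): LogicalPlan = plan
          }
        }
      }
    }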

    Annotations
    @DeveloperApi() @Experimental() @Unstable()
  5. trait SparkSessionExtensionsProvider extends (SparkSessionExtensions) => Unit

    :: Unstable ::

    Base trait for implementations used by SparkSessionExtensions.

    For example, suppose we have an external function named Age that we want to register as an extension for SparkSession:

    package org.apache.spark.examples.extensions
    
    import org.apache.spark.sql.catalyst.expressions.{CurrentDate, Expression, RuntimeReplaceable, SubtractDates}
    
    case class Age(birthday: Expression, child: Expression) extends RuntimeReplaceable {
    
      def this(birthday: Expression) = this(birthday, SubtractDates(CurrentDate(), birthday))
      override def exprsReplaced: Seq[Expression] = Seq(birthday)
      override protected def withNewChildInternal(newChild: Expression): Expression = copy(child = newChild)
    }

    Next, we create our extension, which implements SparkSessionExtensionsProvider. For example:

    package org.apache.spark.examples.extensions
    
    import org.apache.spark.sql.{SparkSessionExtensions, SparkSessionExtensionsProvider}
    import org.apache.spark.sql.catalyst.FunctionIdentifier
    import org.apache.spark.sql.catalyst.expressions.{Expression, ExpressionInfo}
    
    class MyExtensions extends SparkSessionExtensionsProvider {
      override def apply(v1: SparkSessionExtensions): Unit = {
        v1.injectFunction(
          (new FunctionIdentifier("age"),
            new ExpressionInfo(classOf[Age].getName, "age"),
            (children: Seq[Expression]) => new Age(children.head)))
      }
    }

    Then, we can inject MyExtensions in any of three ways:

    • withExtensions of SparkSession.Builder
    • Config - spark.sql.extensions
    • java.util.ServiceLoader - Add to src/main/resources/META-INF/services/org.apache.spark.sql.SparkSessionExtensionsProvider
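
    For the ServiceLoader route, the services file lists fully-qualified provider class names, one per line; for the MyExtensions provider above, src/main/resources/META-INF/services/org.apache.spark.sql.SparkSessionExtensionsProvider would contain the single line:

    org.apache.spark.examples.extensions.MyExtensions

    Spark then discovers and applies the provider automatically when the SparkSession is created.
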
    Annotations
    @DeveloperApi() @Unstable() @Since("3.2.0")
    Since

    3.2.0

    See also

    SparkSessionExtensions

    SparkSession.Builder

    java.util.ServiceLoader
