A column that will be computed based on the data in a DataFrame.
:: Experimental :: A convenient class used for constructing schema.
:: Experimental :: Functionality for working with missing data in DataFrames.
Interface used to load a Dataset from external storage systems (e.g. file systems, key-value stores, etc.).
:: Experimental :: Statistic functions for DataFrames.
Interface used to write a Dataset to external storage systems (e.g. file systems, key-value stores, etc.).
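The reader and writer interfaces above are reached through `spark.read` and `Dataset.write`. A minimal sketch, assuming an existing SparkSession named `spark` and illustrative file paths:

```scala
// DataFrameReader: load a Dataset from external storage (path is illustrative).
val df = spark.read.json("data/people.json")

// DataFrameWriter: save it back out, overwriting any previous output.
df.write.mode("overwrite").parquet("out/people.parquet")
```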
A Dataset is a strongly typed collection of domain-specific objects that can be transformed in parallel using functional or relational operations.
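A short sketch of the two styles of Dataset operations, assuming a SparkSession named `spark` with its implicits in scope (the `Person` case class is illustrative):

```scala
import org.apache.spark.sql.Dataset
import spark.implicits._

case class Person(name: String, age: Int)
val people: Dataset[Person] = Seq(Person("Alice", 29), Person("Bob", 31)).toDS()

// Functional (typed) transformation, executed in parallel across partitions:
val adults = people.filter(_.age >= 30)

// Relational (untyped) operation on the same data:
val names = people.select($"name")
```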
A container for a Dataset, used for implicit conversions in Scala.
:: Experimental :: Holder for experimental methods for the bravest.
:: Experimental :: A class to consume data generated by a StreamingQuery.
:: Experimental :: A Dataset that has been logically grouped by a user-specified grouping key.
A set of methods for aggregations on a DataFrame, created by Dataset.groupBy.
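A sketch of such an aggregation, assuming a DataFrame `df` with illustrative columns "department" and "salary":

```scala
import org.apache.spark.sql.functions.{avg, col, max}

// Dataset.groupBy returns the grouped handle; agg runs the aggregations.
val stats = df.groupBy(col("department"))
  .agg(avg(col("salary")).as("avg_salary"),
       max(col("salary")).as("max_salary"))
```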
Runtime configuration interface for Spark.
The entry point for working with structured data (rows and columns) in Spark 1.x.
A collection of implicit methods for converting common Scala objects into Datasets.
The entry point to programming Spark with the Dataset and DataFrame API.
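Obtaining that entry point is typically the first step of an application. A minimal sketch (the app name and master setting are illustrative):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("example")
  .master("local[*]")        // run locally for this sketch
  .getOrCreate()

// A trivial DataFrame built from a range, just to exercise the session.
val df = spark.range(10).toDF("id")
```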
Converts a logical plan into zero or more SparkPlans.
Converts a logical plan into zero or more SparkPlans. This API is exposed for experimenting with the query planner and is not designed to be stable across Spark releases. Developers writing libraries should instead consider using the stable APIs provided in org.apache.spark.sql.sources.
A Column where an Encoder has been given for the expected input and return type.
Functions for registering user-defined functions.
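A sketch of registering and invoking a user-defined function, assuming a SparkSession named `spark` (the UDF and view names are illustrative):

```scala
// Register a Scala closure as a SQL-callable function.
spark.udf.register("squared", (x: Long) => x * x)

// Expose some data to SQL and call the UDF from a query.
spark.range(5).createOrReplaceTempView("nums")
val squares = spark.sql("SELECT id, squared(id) AS sq FROM nums")
```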
This SQLContext object contains utility functions to create a singleton SQLContext instance, or to get the created SQLContext instance.
Contains API classes that are specific to a single language (i.e. Java).
The physical execution component of Spark SQL.
:: Experimental :: Functions available for DataFrame.
All classes in this package are considered an internal API to Spark and are subject to change between minor releases.
A set of APIs for adding data sources to Spark SQL.
Allows the execution of relational queries, including those expressed in SQL using Spark.