Applies the given function to each input row, appending the encoded result at the end of the row.
An optimized version of AppendColumnsExec that can be executed directly on deserialized objects.
Helper trait which defines methods that are shared by both LocalLimitExec and GlobalLimitExec.
Physical plan node for scanning data from a batched relation.
Provides support in a SQLContext for caching query results and automatically using these cached results when subsequent queries are executed.
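A minimal sketch of the user-facing side of this caching, assuming a local SparkSession; calling `cache()` registers the plan with the cache manager so later queries over the same plan can reuse the cached data:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.master("local[*]").appName("cache-sketch").getOrCreate()
import spark.implicits._

val df = spark.range(0, 1000).toDF("id")
df.cache()                      // register the logical plan for caching
df.count()                      // materialize the cached data
df.filter($"id" > 10).count()   // subsequent queries can reuse the cached results
```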
Holds a cached logical plan and its data.
Co-groups the data from the left and right children, and calls the function with each group and two iterators containing all elements in the group from the left and right sides.
Iterates over GroupedIterators and returns the cogrouped data, i.e. each grouping key paired with the rows for that key from every input iterator.
Physical plan for returning a new RDD that has exactly numPartitions partitions.
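A small illustrative example, assuming a local SparkSession named `spark`; `coalesce` and `repartition` are the Dataset operations that request an exact number of output partitions:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val df = spark.range(0, 1000, 1, 8)     // start with 8 partitions
val narrowed = df.coalesce(2)           // reduce to exactly 2 partitions without a shuffle
println(narrowed.rdd.getNumPartitions)  // 2
```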
A Partitioner that might group together one or more partitions from the parent.
An interface for those physical operators that support codegen.
Finds chained plans that support codegen and collapses them together into a WholeStageCodegen.
Take the first limit elements and collect them to a single partition.
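As a hedged sketch (assuming a local SparkSession `spark`), a `limit` followed by an action such as `collect` is the typical way this operator ends up in a plan:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val firstFive = spark.range(0, 1000).limit(5).collect()
println(firstFive.mkString(", "))
```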
Takes the input row from child and turns it into an object using the given deserializer expression.
Applies all of the GroupExpressions to every input row, so a single input row may produce multiple output rows.
An interface for relations that are backed by files.
Physical plan for Filter.
Groups the input rows together and calls the R function with each group and an iterator containing all elements in the group.
Applies a Generator to a stream of input rows, combining the output of each into a new stream of rows.
Take the first limit elements of the child's single output partition.
Iterates over a presorted set of rows, chunking it up by the grouping expression.
InputAdapter is used to hide a SparkPlan from a subtree that supports codegen.
Take the first limit elements of each child partition, but do not collect or shuffle them.
Physical plan node for scanning data from a local collection.
Logical plan node for scanning data from an RDD.
Applies the given function to each input object.
Groups the input rows together and calls the function with each group and an iterator containing all elements in the group.
Applies the given function to input object iterator.
Physical version of ObjectConsumer.
Physical version of ObjectProducer.
A plan node that does nothing but lie about the output of its child.
Plans scalar subqueries that are present in the given SparkPlan.
Physical plan for Project.
The primary workflow for executing relational queries using Spark.
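A short sketch of how the intermediate phases of this workflow can be inspected from a Dataset, assuming a local SparkSession `spark`:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val df = spark.range(0, 100).filter("id % 2 = 0")
val qe = df.queryExecution
println(qe.logical)        // parsed logical plan
println(qe.analyzed)       // analyzed logical plan
println(qe.optimizedPlan)  // optimized logical plan
println(qe.sparkPlan)      // selected physical plan
println(qe.executedPlan)   // physical plan after preparation rules
```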
Physical plan node for scanning data from an RDD.
Physical plan for range (generating a range of 64 bit numbers).
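A minimal example, assuming a local SparkSession `spark`, of the Dataset API that produces this range-generating plan:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val ids = spark.range(0L, 10L, 2L, 2)  // start, end, step, number of partitions
ids.explain()                           // the physical plan contains a Range node
ids.show()
```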
Physical plan node for scanning data from a relation.
An internal iterator interface which presents a more restrictive API than scala.collection.Iterator.
Physical plan for sampling the dataset.
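A small hedged example of the sampling API that this plan implements, assuming a local SparkSession `spark`:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val df = spark.range(0, 1000)
val sampled = df.sample(withReplacement = false, fraction = 0.1, seed = 42L)
println(sampled.count())  // roughly 100 rows
```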
A subquery that will return only one row and one column.
Takes the input object from child and turns it into an unsafe row using the given serializer expression.
This is a specialized version of org.apache.spark.rdd.ShuffledRDD that is optimized for shuffling rows instead of Java key-value pairs.
Performs (external) sorting.
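A brief sketch of the Dataset operations that plan a sort, assuming a local SparkSession `spark`; `orderBy` produces a global sort while `sortWithinPartitions` sorts each partition independently:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()
import spark.implicits._

val df = spark.range(0, 100).withColumn("rev", -$"id")
df.orderBy($"rev").show(5)               // global sort (may spill to disk for large inputs)
df.sortWithinPartitions($"rev").show(5)  // per-partition sort, no shuffle
```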
The base class for physical operators.
:: DeveloperApi :: Stores information about a SQL SparkPlan.
Builder that converts an ANTLR ParseTree into a LogicalPlan/Expression/TableIdentifier.
Concrete parser for Spark SQL statements.
Converts a logical plan into zero or more SparkPlans.
Physical plan for a subquery.
Take the first limit elements as defined by the sortOrder, and do projection if needed.
Physical plan for unioning two plans, without a distinct.
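A minimal example of unioning two Datasets, assuming a local SparkSession `spark`; note that `union` keeps duplicates, matching the "without a distinct" behavior described above:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val a = spark.range(0, 5)
val b = spark.range(3, 8)
val u = a.union(b)   // duplicates (3 and 4) are kept
println(u.count())   // 10
```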
Serializer for serializing UnsafeRows during shuffle.
WholeStageCodegen compiles a subtree of plans that support codegen together into a single Java function.
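As a hedged sketch (local SparkSession `spark` assumed), `explain` makes the whole-stage-codegen boundaries visible in the printed physical plan:

```scala
val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val df = spark.range(0, 1000).filter("id % 2 = 0").selectExpr("id * 3 as tripled")
df.explain()  // operators fused into a whole-stage-codegen block are marked in the output
```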
This class calculates and outputs (windowed) aggregates over the rows in a single (sorted) partition.
Helper functions for physical operators that work with user defined objects.
Contains methods for debugging query execution.
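A small example of the debugging helpers in this package, assuming a local SparkSession `spark`; the `debug` and `debugCodegen` extension methods are brought in by importing the package object:

```scala
import org.apache.spark.sql.execution.debug._

val spark = org.apache.spark.sql.SparkSession.builder.master("local[*]").getOrCreate()

val df = spark.range(0, 100).filter("id % 2 = 0")
df.debug()        // runs the query and prints per-operator row counts
df.debugCodegen() // prints the generated Java code for codegen'd subtrees
```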
Physical execution operators for join operations.
The physical execution component of Spark SQL. Note that this is a private package. All classes in catalyst are considered an internal API to Spark SQL and are subject to change between minor releases.