Class org.apache.spark.sql.execution.SparkPlanner

class SparkPlanner extends SparkStrategies

Linear Supertypes
SparkStrategies, QueryPlanner[SparkPlan], AnyRef, Any

Instance Constructors

  1. new SparkPlanner(sparkContext: SparkContext, conf: SQLConf, extraStrategies: Seq[Strategy])

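    Instances are normally created by the session rather than by hand. A minimal sketch of where the planner's output surfaces, assuming only a local SparkSession (queryExecution is a developer API):

      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().master("local[*]").appName("planner-demo").getOrCreate()
      val df = spark.range(10).filter("id > 5")

      // The physical plan selected by SparkPlanner for this query...
      println(df.queryExecution.sparkPlan)
      // ...and the final plan after preparation rules (e.g. adding exchanges).
      println(df.queryExecution.executedPlan)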

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. object Aggregation extends Strategy

    Used to plan the aggregate operator for expressions based on the AggregateFunction2 interface.
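    A sketch of a query whose Aggregate operator this strategy plans, assuming a hypothetical DataFrame df with dept and salary columns:

      import org.apache.spark.sql.functions.sum

      // groupBy/agg produces a logical Aggregate; this strategy selects the
      // physical aggregation operator (hash-based where possible).
      df.groupBy("dept").agg(sum("salary")).explain()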

    Definition Classes
    SparkStrategies
  5. object BasicOperators extends Strategy

    Definition Classes
    SparkStrategies
  6. object DDLStrategy extends Strategy

    Definition Classes
    SparkStrategies
  7. object InMemoryScans extends Strategy

    Definition Classes
    SparkStrategies
  8. object JoinSelection extends Strategy with PredicateHelper

    Selects the proper physical plan for a join based on the joining keys and the size of the logical plan.

    First, uses the ExtractEquiJoinKeys pattern to find joins where at least some of the predicates can be evaluated by matching join keys. If found, join implementations are chosen with the following precedence (see the sketch after this list):

    - Broadcast: if one side of the join has an estimated physical size smaller than the user-configurable SQLConf.AUTO_BROADCASTJOIN_THRESHOLD, or if that side has an explicit broadcast hint (e.g. the user applied the org.apache.spark.sql.functions.broadcast() function to a DataFrame), then that side of the join is broadcast and the other side is streamed, with no shuffling performed. If both sides of the join are eligible to be broadcast, one of them is chosen as the broadcast side.
    - Shuffle hash join: if the average size of a single partition is small enough to build a hash table.
    - Sort merge: if the matching join keys are sortable.

    If there are no joining keys, join implementations are chosen with the following precedence:

    - BroadcastNestedLoopJoin: if one side of the join could be broadcast
    - CartesianProduct: for inner joins
    - BroadcastNestedLoopJoin
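    A sketch of both selection paths, assuming hypothetical orders and customers DataFrames sharing a cust_id column and a SparkSession spark:

      import org.apache.spark.sql.functions.broadcast

      // An explicit hint makes the right side eligible for a broadcast join:
      orders.join(broadcast(customers), "cust_id").explain()

      // The size-based path compares estimates against this threshold
      // (spark.sql.autoBroadcastJoinThreshold, in bytes; -1 disables it):
      spark.conf.set("spark.sql.autoBroadcastJoinThreshold", "10485760")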

    Definition Classes
    SparkStrategies
  9. object SpecialLimits extends Strategy

    Plans special cases of limit operators.
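    One such special case, assuming the hypothetical df from above: a limit on top of a global sort is typically planned as a single TakeOrderedAndProject operator:

      df.orderBy(df("salary").desc).limit(10).explain()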

    Definition Classes
    SparkStrategies
  10. object StatefulAggregationStrategy extends Strategy

    Used to plan aggregation queries that are computed incrementally as part of a StreamingQuery. Currently this rule is injected into the planner on demand, only when planning in an org.apache.spark.sql.execution.streaming.StreamExecution.
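    A sketch of such a query, assuming a SparkSession spark (the socket source is for illustration only):

      import spark.implicits._

      val events = spark.readStream
        .format("socket")
        .option("host", "localhost")
        .option("port", 9999)
        .load()

      // Once started, this count is maintained incrementally across batches.
      val counts = events.groupBy($"value").count()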

    Definition Classes
    SparkStrategies
  11. object StreamingRelationStrategy extends Strategy

    This strategy exists only for explaining a Dataset/DataFrame created by spark.readStream. It does not affect execution: StreamingRelation is replaced with StreamingExecutionRelation in StreamingQueryManager, and StreamingExecutionRelation is in turn replaced with the real relation, using the Source, in StreamExecution.
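    Continuing the sketch under StatefulAggregationStrategy above, calling explain() before the query is started is exactly the case this strategy handles:

      // Planned via StreamingRelationStrategy; the StreamingRelation leaf
      // shown in the output is swapped out before execution.
      events.explain()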

    Definition Classes
    SparkStrategies
  12. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  13. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  14. val conf: SQLConf

  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  17. val extraStrategies: Seq[Strategy]

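    A minimal sketch of supplying a custom strategy; the hypothetical MyStrategy below is a no-op, and spark.experimental is the public route into extraStrategies for a session:

      import org.apache.spark.sql.Strategy
      import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
      import org.apache.spark.sql.execution.SparkPlan

      object MyStrategy extends Strategy {
        // Returning Nil signals that this strategy does not apply to `plan`.
        def apply(plan: LogicalPlan): Seq[SparkPlan] = Nil
      }

      spark.experimental.extraStrategies = Seq(MyStrategy)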
  18. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  23. final def notify(): Unit

    Definition Classes
    AnyRef
  24. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  25. def numPartitions: Int

  26. def plan(plan: LogicalPlan): Iterator[SparkPlan]

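    A sketch of the contract, assuming access to the internal sessionState (e.g. code compiled inside the org.apache.spark.sql package, where it is visible):

      val planner = spark.sessionState.planner
      val optimized = df.queryExecution.optimizedPlan

      // Each Strategy in `strategies` is applied lazily; every candidate
      // physical plan is yielded, and QueryExecution takes the first one.
      val candidates: Iterator[SparkPlan] = planner.plan(optimized)
      println(candidates.next().treeString)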
    Definition Classes
    SparkPlanner → QueryPlanner
  27. def pruneFilterProject(projectList: Seq[NamedExpression], filterPredicates: Seq[Expression], prunePushedDownFilters: (Seq[Expression]) ⇒ Seq[Expression], scanBuilder: (Seq[Attribute]) ⇒ SparkPlan): SparkPlan

    Used to build table scan operators where complex projection and filtering are done using separate physical operators. This function returns the given scan operator with Project and Filter nodes added only when needed. For example, a Project operator is only used when the final desired output requires complex expressions to be evaluated, or when columns can be pruned further after filtering has been done.

    The prunePushedDownFilters parameter is used to remove those filters that can be optimized away by the filter pushdown optimization.

    The required attributes for both filtering and expression evaluation are passed to the provided scanBuilder function so that it can avoid unnecessary column materialization.
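    A sketch of the resulting plan shape, assuming a SparkSession spark and a hypothetical Parquet file with name and age columns:

      val people = spark.read.parquet("/tmp/people.parquet")

      // Supported predicates are pruned from the Filter node (they appear as
      // PushedFilters on the scan), and only the attributes needed for the
      // filter and the final output are materialized by the scan.
      people.filter(people("age") > 21).select("name").explain()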

  28. lazy val singleRowRdd: RDD[InternalRow]

    Attributes
    protected
    Definition Classes
    SparkStrategies
  29. val sparkContext: SparkContext

  30. def strategies: Seq[Strategy]

    Definition Classes
    SparkPlanner → QueryPlanner
  31. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  32. def toString(): String

    Definition Classes
    AnyRef → Any
  33. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  34. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
