SparkPlanner

class SparkPlanner extends SparkStrategies with SQLConfHelper

Linear Supertypes

SQLConfHelper, SparkStrategies, QueryPlanner[SparkPlan], AnyRef, Any

Instance Constructors

new SparkPlanner(session: SparkSession, experimentalMethods: ExperimentalMethods)

Type Members

case class StreamingGlobalLimitStrategy(outputMode: OutputMode) extends Strategy with Product with Serializable
Used to plan the streaming global limit operator for streams in append mode. The strategy checks for either a direct Limit or a Limit wrapped in a ReturnAnswer operator, following the example of the SpecialLimits strategy.
Definition Classes
SparkStrategies
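The two plan shapes named in the description can be illustrated with a small pattern match. This is a minimal sketch over hypothetical stand-in case classes (`Relation`, `Limit`, `ReturnAnswer` here are not Spark's Catalyst classes), and it does not model the append-mode or streaming checks.

```scala
object LimitMatchSketch {
  // Stand-in logical plan nodes; hypothetical, not Spark's Catalyst classes.
  sealed trait Logical
  final case class Relation(name: String) extends Logical
  final case class Limit(n: Int, child: Logical) extends Logical
  final case class ReturnAnswer(child: Logical) extends Logical

  // Returns the limit value and its child when the root of the plan is a
  // Limit, either directly or wrapped in ReturnAnswer; None otherwise.
  def rootLimit(plan: Logical): Option[(Int, Logical)] = plan match {
    case ReturnAnswer(Limit(n, child)) => Some((n, child))
    case Limit(n, child)               => Some((n, child))
    case _                             => None
  }

  def main(args: Array[String]): Unit = {
    println(rootLimit(Limit(5, Relation("t"))))
    println(rootLimit(ReturnAnswer(Limit(5, Relation("t")))))
    println(rootLimit(Relation("t")))
  }
}
```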

Value Members

final def !=(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def ##: Int
Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean
Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0
Definition Classes
Any
def clone(): AnyRef
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.CloneNotSupportedException]) @native()
def collectPlaceholders(plan: SparkPlan): Seq[(SparkPlan, LogicalPlan)]
Attributes
protected
Definition Classes
SparkPlanner → QueryPlanner
def conf: SQLConf
Definition Classes
SQLConfHelper
final def eq(arg0: AnyRef): Boolean
Definition Classes
AnyRef
def equals(arg0: AnyRef): Boolean
Definition Classes
AnyRef → Any
val experimentalMethods: ExperimentalMethods
def extraPlanningStrategies: Seq[Strategy]
Override to add extra planning strategies to the planner. These strategies are tried after the strategies defined in ExperimentalMethods, and before the regular strategies.
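The ordering described above (ExperimentalMethods strategies first, then extraPlanningStrategies, then the regular strategies) can be pictured with a toy planner. This is a simplified sketch that uses plain strings in place of logical and physical plans; `MiniPlanner`, `named`, and `regularStrategies` are hypothetical names for illustration and are not part of Spark.

```scala
object StrategyOrderSketch {
  // Stand-in types, not Spark's: a logical plan is a String, and a strategy
  // maps it to zero or more candidate physical plans (also Strings).
  type Strategy = String => Seq[String]

  // Hypothetical helper: a strategy that handles exactly one logical operator.
  def named(label: String, handles: String): Strategy =
    plan => if (plan == handles) Seq(s"$label($plan)") else Nil

  class MiniPlanner(experimentalStrategies: Seq[Strategy]) {
    // Hook for subclasses, mirroring the role of extraPlanningStrategies.
    def extraPlanningStrategies: Seq[Strategy] = Nil

    // Stand-in for the built-in strategies.
    protected def regularStrategies: Seq[Strategy] =
      Seq(named("Regular", "custom"))

    // Experimental strategies first, then the extra ones, then the regular ones.
    def strategies: Seq[Strategy] =
      experimentalStrategies ++ extraPlanningStrategies ++ regularStrategies

    // Candidate plans are produced in strategy order.
    def plan(logical: String): Iterator[String] =
      strategies.iterator.flatMap(strategy => strategy(logical))
  }

  def main(args: Array[String]): Unit = {
    val planner = new MiniPlanner(Seq(named("Experimental", "custom"))) {
      override def extraPlanningStrategies: Seq[Strategy] =
        Seq(named("Extra", "custom"))
    }
    planner.plan("custom").foreach(println)
  }
}
```

Because candidates are produced in strategy order, a strategy supplied through the override is consulted before any regular strategy that handles the same operator.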
def finalize(): Unit
Attributes
protected[lang]
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.Throwable])
final def getClass(): Class[_ <: AnyRef]
Definition Classes
AnyRef → Any
Annotations
@native()
def hashCode(): Int
Definition Classes
AnyRef → Any
Annotations
@native()
final def isInstanceOf[T0]: Boolean
Definition Classes
Any
final def ne(arg0: AnyRef): Boolean
Definition Classes
AnyRef
final def notify(): Unit
Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit
Definition Classes
AnyRef
Annotations
@native()
def numPartitions: Int
def plan(plan: LogicalPlan): Iterator[SparkPlan]
Definition Classes
SparkStrategies → QueryPlanner
def pruneFilterProject(projectList: Seq[NamedExpression], filterPredicates: Seq[Expression], prunePushedDownFilters: (Seq[Expression]) => Seq[Expression], scanBuilder: (Seq[Attribute]) => SparkPlan): SparkPlan
Used to build table scan operators where complex projection and filtering are done using separate physical operators. This function returns the given scan operator with Project and Filter nodes added only when needed. For example, a Project operator is only used when the final desired output requires complex expressions to be evaluated or when columns can be further eliminated after filtering has been done.
The prunePushedDownFilters parameter is used to remove those filters that can be optimized away by the filter pushdown optimization.
The required attributes for both filtering and expression evaluation are passed to the provided scanBuilder function so that it can avoid unnecessary column materialization.
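The decision described above can be sketched with a toy plan model. This is a simplified illustration, not Spark's implementation: columns are plain strings, and `Scan`, `Filter`, `Project`, `Expr`, and `Pred` are hypothetical stand-ins for the physical operators and Catalyst expressions.

```scala
object PruneFilterProjectSketch {
  // Stand-in physical plan nodes; hypothetical, not Spark's operators.
  sealed trait Plan
  final case class Scan(columns: Seq[String]) extends Plan
  final case class Filter(predicates: Seq[Pred], child: Plan) extends Plan
  final case class Project(exprs: Seq[Expr], child: Plan) extends Plan

  // An output expression; a plain column references only itself.
  final case class Expr(name: String, references: Seq[String]) {
    def isPlainColumn: Boolean = references == Seq(name)
  }
  final case class Pred(text: String, references: Seq[String])

  def pruneFilterProject(
      projectList: Seq[Expr],
      filterPredicates: Seq[Pred],
      prunePushedDownFilters: Seq[Pred] => Seq[Pred],
      scanBuilder: Seq[String] => Plan): Plan = {
    val projectRefs = projectList.flatMap(_.references).distinct
    val filterRefs = filterPredicates.flatMap(_.references).distinct
    // Filters that remain after removing the ones handled by pushdown.
    val remaining = prunePushedDownFilters(filterPredicates)

    def withFilter(scan: Plan): Plan =
      if (remaining.isEmpty) scan else Filter(remaining, scan)

    // No Project is needed when the output is plain columns and every
    // column used for filtering is already part of that output.
    val simpleProjection =
      projectList.forall(_.isPlainColumn) && filterRefs.forall(projectRefs.contains)

    if (simpleProjection) {
      withFilter(scanBuilder(projectList.map(_.name)))
    } else {
      // The scan reads columns needed by both projection and filtering;
      // Project then evaluates expressions and drops filter-only columns.
      val required = (projectRefs ++ filterRefs).distinct
      Project(projectList, withFilter(scanBuilder(required)))
    }
  }

  def main(args: Array[String]): Unit = {
    val plan = pruneFilterProject(
      Seq(Expr("a", Seq("a"))),
      Seq(Pred("b > 1", Seq("b"))),
      preds => preds,
      cols => Scan(cols))
    println(plan)
  }
}
```

In this model the scan builder always receives only the columns that the projection and the filters need, which is how unnecessary column materialization is avoided.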
def prunePlans(plans: Iterator[SparkPlan]): Iterator[SparkPlan]
Attributes
protected
Definition Classes
SparkPlanner → QueryPlanner
val session: SparkSession
lazy val singleRowRdd: RDD[InternalRow]
Attributes
protected
Definition Classes
SparkStrategies
def strategies: Seq[Strategy]
Definition Classes
SparkPlanner → QueryPlanner
final def synchronized[T0](arg0: => T0): T0
Definition Classes
AnyRef
def toString(): String
Definition Classes
AnyRef → Any
final def wait(): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long, arg1: Int): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException])
final def wait(arg0: Long): Unit
Definition Classes
AnyRef
Annotations
@throws(classOf[java.lang.InterruptedException]) @native()
object Aggregation extends Strategy
Used to plan the aggregate operator for expressions based on the AggregateFunction2 interface.
Definition Classes
SparkStrategies
object BasicOperators extends Strategy
Definition Classes
SparkStrategies
object FlatMapGroupsWithStateStrategy extends Strategy
Strategy to convert FlatMapGroupsWithState logical operator to physical operator in streaming plans. Conversion for batch plans is handled by BasicOperators.
Definition Classes
SparkStrategies
object InMemoryScans extends Strategy
Definition Classes
SparkStrategies
object JoinSelection extends Strategy with PredicateHelper with JoinSelectionHelper
Select the proper physical plan for join based on join strategy hints, the availability of equi-join keys and the sizes of joining relations. Below are the existing join strategies, their characteristics and their limitations.
- Broadcast hash join (BHJ): Supported only for equi-joins; the join keys do not need to be sortable. Supported for all join types except full outer joins. BHJ usually performs faster than the other join algorithms when the broadcast side is small. However, broadcasting tables is a network-intensive operation, and it can cause OOM errors or perform badly in some cases, especially when the build/broadcast side is big.
- Shuffle hash join: Supported only for equi-joins; the join keys do not need to be sortable. Supported for all join types. Building a hash map from a table is a memory-intensive operation, and it can cause OOM errors when the build side is big.
- Shuffle sort merge join (SMJ): Supported only for equi-joins, and the join keys must be sortable. Supported for all join types.
- Broadcast nested loop join (BNLJ): Supports both equi-joins and non-equi-joins. Supports all the join types, but the implementation is optimized for: 1) broadcasting the left side in a right outer join; 2) broadcasting the right side in a left outer, left semi, left anti or existence join; 3) broadcasting either side in an inner-like join. For other cases, we need to scan the data multiple times, which can be rather slow.
- Shuffle-and-replicate nested loop join (a.k.a. cartesian product join): Supports both equi-joins and non-equi-joins. Supports only inner-like joins.
Definition Classes
SparkStrategies
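The eligibility rules listed for JoinSelection can be encoded as a small lookup. This is a sketch of the stated constraints only: it does not model join hints, relation sizes, or the order in which Spark prefers one strategy over another. The `JoinType` objects and the `applicable` function are hypothetical stand-ins, "inner-like" is assumed to mean inner or cross joins, and the existence join type is omitted.

```scala
object JoinApplicabilitySketch {
  // Stand-in join types; hypothetical, not Spark's Catalyst classes.
  sealed trait JoinType
  case object Inner extends JoinType
  case object Cross extends JoinType
  case object LeftOuter extends JoinType
  case object RightOuter extends JoinType
  case object FullOuter extends JoinType
  case object LeftSemi extends JoinType
  case object LeftAnti extends JoinType

  // Assumption: "inner-like" covers inner and cross joins.
  def isInnerLike(joinType: JoinType): Boolean =
    joinType == Inner || joinType == Cross

  // Lists the strategies whose stated constraints are satisfied.
  def applicable(
      isEquiJoin: Boolean,
      keysSortable: Boolean,
      joinType: JoinType): Seq[String] = {
    val candidates = Seq(
      // Equi-joins only; every join type except full outer.
      "BroadcastHashJoin" -> (isEquiJoin && joinType != FullOuter),
      // Equi-joins only; every join type.
      "ShuffledHashJoin" -> isEquiJoin,
      // Equi-joins only, and the keys must be sortable.
      "SortMergeJoin" -> (isEquiJoin && keysSortable),
      // Equi-joins and non-equi-joins; every join type.
      "BroadcastNestedLoopJoin" -> true,
      // Equi-joins and non-equi-joins; inner-like joins only.
      "CartesianProduct" -> isInnerLike(joinType)
    )
    candidates.collect { case (name, true) => name }
  }

  def main(args: Array[String]): Unit = {
    println(applicable(isEquiJoin = true, keysSortable = false, joinType = FullOuter))
    println(applicable(isEquiJoin = false, keysSortable = false, joinType = LeftOuter))
  }
}
```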
object PythonEvals extends Strategy
Strategy to convert EvalPython logical operator to physical operator.
Definition Classes
SparkStrategies
object SparkScripts extends Strategy
Definition Classes
SparkStrategies
object SpecialLimits extends Strategy
Plans special cases of limit operators.
Definition Classes
SparkStrategies
object StatefulAggregationStrategy extends Strategy
Used to plan streaming aggregation queries that are computed incrementally as part of an org.apache.spark.sql.streaming.StreamingQuery. Currently this rule is injected into the planner on demand, only when planning in an org.apache.spark.sql.execution.streaming.StreamExecution.
Definition Classes
SparkStrategies
object StreamingDeduplicationStrategy extends Strategy
Used to plan the streaming deduplicate operator.
Definition Classes
SparkStrategies
object StreamingJoinStrategy extends Strategy
Definition Classes
SparkStrategies
object StreamingRelationStrategy extends Strategy
This strategy is just for explaining Dataset/DataFrame created by spark.readStream. It won't affect the execution, because StreamingRelation will be replaced with StreamingExecutionRelation in StreamingQueryManager and StreamingExecutionRelation will be replaced with the real relation using the Source in StreamExecution.
Definition Classes
SparkStrategies
object Window extends Strategy
Definition Classes
SparkStrategies
