package adaptive


Type Members

  1. case class AdaptiveSparkPlanExec(initialPlan: SparkPlan, session: SparkSession, preprocessingRules: Seq[Rule[SparkPlan]], subqueryCache: TrieMap[SparkPlan, BaseSubqueryExec], stageCache: TrieMap[SparkPlan, QueryStageExec], isSubquery: Boolean) extends SparkPlan with LeafExecNode with Product with Serializable

    A root node to execute the query plan adaptively. It splits the query plan into independent stages and executes them in order according to their dependencies. The query stage materializes its output at the end. When one stage completes, the data statistics of the materialized output will be used to optimize the remainder of the query.

    To create query stages, we traverse the query tree bottom up. When we hit an exchange node, and if all the child query stages of this exchange node are materialized, we create a new query stage for this exchange node. The new stage is then materialized asynchronously once it is created.

    When one query stage finishes materialization, the rest of the query is re-optimized and planned based on the latest statistics provided by all materialized stages. Then we traverse the query plan again and create more stages if possible. After all stages have been materialized, we execute the rest of the plan.
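
    For illustration, here is a minimal sketch of observing this node in action. It assumes a local SparkSession and a made-up aggregation query; only spark.sql.adaptive.enabled and the Dataset/queryExecution APIs are taken from Spark itself.

      import org.apache.spark.sql.SparkSession

      object AqeDemo extends App {
        val spark = SparkSession.builder()
          .master("local[*]")
          .config("spark.sql.adaptive.enabled", "true") // opt in to adaptive execution
          .getOrCreate()
        import spark.implicits._

        // A shuffle-introducing query: the aggregation adds an exchange, which
        // adaptive execution wraps in a shuffle query stage and materializes
        // before planning the rest of the query with the observed statistics.
        val counts = spark.range(0, 1000000)
          .groupBy(($"id" % 10).as("bucket"))
          .count()
        counts.collect()

        // With adaptive execution on, the executed plan is rooted at an
        // AdaptiveSparkPlanExec node.
        println(counts.queryExecution.executedPlan)
        spark.stop()
      }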

  2. trait AdaptiveSparkPlanHelper extends AnyRef

    This class provides utility methods related to tree traversal of an AdaptiveSparkPlanExec plan. Unlike their counterparts in org.apache.spark.sql.catalyst.trees.TreeNode or org.apache.spark.sql.catalyst.plans.QueryPlan, these methods traverse down into the leaf nodes of adaptive plans, i.e., AdaptiveSparkPlanExec and QueryStageExec, which regular tree traversals would otherwise stop at.
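
    As a sketch, a utility object might mix in this trait to collect nodes across stage boundaries. StageInspector is a hypothetical name, and the example assumes this trait's collect method:

      import org.apache.spark.sql.execution.SparkPlan
      import org.apache.spark.sql.execution.adaptive.{AdaptiveSparkPlanHelper, ShuffleQueryStageExec}

      object StageInspector extends AdaptiveSparkPlanHelper {
        // collect descends into AdaptiveSparkPlanExec and QueryStageExec nodes,
        // which TreeNode.collect would treat as leaves and skip over.
        def shuffleStages(plan: SparkPlan): Seq[ShuffleQueryStageExec] =
          collect(plan) { case s: ShuffleQueryStageExec => s }
      }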

  3. case class BroadcastQueryStageExec(id: Int, plan: BroadcastExchangeExec) extends QueryStageExec with Product with Serializable

    A broadcast query stage whose child is a BroadcastExchangeExec.

  4. case class CoalescedShuffleReaderExec(child: SparkPlan, partitionStartIndices: Array[Int]) extends SparkPlan with UnaryExecNode with Product with Serializable

    A wrapper of shuffle query stage, which submits fewer reduce tasks, as one reduce task may read multiple shuffle partitions. This avoids many small reduce tasks that would hurt performance.

    child

    It's usually ShuffleQueryStageExec or ReusedQueryStageExec, but can be the shuffle exchange node during canonicalization.
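
    A hypothetical illustration of the partitionStartIndices semantics: each consecutive pair of indices (with the total shuffle partition count as the final bound) delimits the range of shuffle partitions that one coalesced reduce task reads.

      // Made-up numbers: 5 shuffle partitions coalesced into 3 reduce tasks.
      val numShufflePartitions = 5
      val partitionStartIndices = Array(0, 2, 3)
      val ends = partitionStartIndices.drop(1) :+ numShufflePartitions
      partitionStartIndices.zip(ends).foreach { case (start, end) =>
        println(s"one reduce task reads shuffle partitions [$start, $end)")
      }
      // one reduce task reads shuffle partitions [0, 2)
      // one reduce task reads shuffle partitions [2, 3)
      // one reduce task reads shuffle partitions [3, 5)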

  5. trait Cost extends Ordered[Cost]

    Represents the cost of a plan.

  6. trait CostEvaluator extends AnyRef

    Evaluates the cost of a physical plan.

  7. case class DemoteBroadcastHashJoin(conf: SQLConf) extends Rule[LogicalPlan] with Product with Serializable

    This optimization rule detects a join child that has a high ratio of empty partitions and adds a no-broadcast-hash-join hint to avoid it being broadcast.

  8. case class InsertAdaptiveSparkPlan(session: SparkSession) extends Rule[SparkPlan] with Product with Serializable

    This rule wraps the query plan with an AdaptiveSparkPlanExec, which executes the query plan and re-optimizes it during execution based on runtime data statistics.

    Note that this rule is stateful and thus should not be reused across query executions.

  9. case class LocalShuffleReaderExec(child: SparkPlan, partitionStartIndicesPerMapper: Array[Array[Int]]) extends SparkPlan with UnaryExecNode with Product with Serializable

    A wrapper of shuffle query stage, which submits one or more reduce tasks per mapper to read the shuffle files written by one mapper. By doing this, the shuffle files are very likely to be read locally, since all the shuffle files that a reduce task needs to read are on one node.

    child

    It's usually ShuffleQueryStageExec or ReusedQueryStageExec, but can be the shuffle exchange node during canonicalization.

    partitionStartIndicesPerMapper

    A mapper usually writes many shuffle blocks, and it's better to launch multiple tasks to read shuffle blocks of one mapper. This array contains the partition start indices for each mapper.

  10. class LocalShuffledRowRDD extends RDD[InternalRow]

    This is a specialized version of org.apache.spark.sql.execution.ShuffledRowRDD. It is used in Spark SQL adaptive execution when a shuffle join is converted to a broadcast join at runtime because the map output of one input table is small enough for broadcast. This RDD represents the data of the other input table of the join, read from shuffle. Each partition of the RDD reads the whole data from just one mapper output, locally, so no data is transferred over the network.

    This RDD takes a ShuffleDependency (dependency).

    The dependency has the parent RDD of this RDD, which represents the dataset before shuffle (i.e., the map output). Elements of this RDD are (partitionId, Row) pairs. Partition ids should be in the range [0, numPartitions - 1]. dependency.partitioner.numPartitions is the number of pre-shuffle partitions (i.e., the number of partitions of the map output). The post-shuffle partition number is the same as the parent RDD's partition number.

    partitionStartIndicesPerMapper specifies how to split the shuffle blocks of each mapper into one or more partitions. For a mapper i, the jth partition includes shuffle blocks from partitionStartIndicesPerMapper[i][j] to partitionStartIndicesPerMapper[i][j+1] (exclusive).
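
    A hypothetical illustration of this splitting scheme with made-up numbers:

      // 2 mappers, 6 shuffle partitions. For mapper i, the j-th split covers
      // [indices(i)(j), indices(i)(j + 1)), and the last split ends at the
      // total partition count.
      val numPartitions = 6
      val partitionStartIndicesPerMapper = Array(
        Array(0, 3),   // mapper 0 is read by 2 tasks: [0, 3) and [3, 6)
        Array(0, 2, 4) // mapper 1 is read by 3 tasks: [0, 2), [2, 4), [4, 6)
      )
      for ((starts, mapper) <- partitionStartIndicesPerMapper.zipWithIndex) {
        val ends = starts.drop(1) :+ numPartitions
        starts.zip(ends).foreach { case (start, end) =>
          println(s"mapper $mapper: one task reads partitions [$start, $end)")
        }
      }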

  11. case class LogicalQueryStage(logicalPlan: LogicalPlan, physicalPlan: SparkPlan) extends LeafNode with Product with Serializable

    The LogicalPlan wrapper for a QueryStageExec, or a snippet of physical plan containing a QueryStageExec, in which all ancestor nodes of the QueryStageExec are linked to the same logical node.

    For example, a logical Aggregate can be transformed into FinalAgg - Shuffle - PartialAgg, in which the Shuffle will be wrapped into a QueryStageExec, thus the LogicalQueryStage will have FinalAgg - QueryStageExec as its physical plan.

  12. case class OptimizeLocalShuffleReader(conf: SQLConf) extends Rule[SparkPlan] with Product with Serializable

    A rule to optimize the shuffle reader to a local reader if and only if no additional shuffles will be introduced:

    1. If the input plan is a shuffle, add the local reader directly, as we can never introduce extra shuffles in this case.
    2. Otherwise, add the local reader to the probe side of a broadcast hash join, and then run EnsureRequirements to check whether any additional shuffle is introduced. If one is, revert all the local readers.

  13. case class PlanAdaptiveSubqueries(subqueryMap: Map[Long, ExecSubqueryExpression]) extends Rule[SparkPlan] with Product with Serializable
  14. abstract class QueryStageExec extends SparkPlan with LeafExecNode

    A query stage is an independent subgraph of the query plan. A query stage materializes its output before proceeding with further operators of the query plan. The data statistics of the materialized output can be used to optimize subsequent query stages.

    There are 2 kinds of query stages:

    1. Shuffle query stage: this stage materializes its output to shuffle files, and Spark launches another job to execute the further operators.
    2. Broadcast query stage: this stage materializes its output to an array in the driver JVM. Spark broadcasts the array before executing the further operators.
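
    As a sketch, the stage kinds (together with the reuse wrapper listed below) can be distinguished by pattern matching on the case classes documented in this package; StageKinds and describe are hypothetical names:

      import org.apache.spark.sql.execution.adaptive._

      object StageKinds {
        // Names the stage kind by matching on the concrete QueryStageExec
        // subclasses listed in this package.
        def describe(stage: QueryStageExec): String = stage match {
          case s: ShuffleQueryStageExec   => s"shuffle stage ${s.id}"
          case b: BroadcastQueryStageExec => s"broadcast stage ${b.id}"
          case r: ReusedQueryStageExec    => s"reused(${describe(r.plan)})"
        }
      }
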
  15. case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] with Product with Serializable

    A rule to adjust the post shuffle partitions based on the map output statistics.

    The strategy used to determine the number of post-shuffle partitions is as follows. We have a target input size for a post-shuffle partition. Once we have the size statistics of all pre-shuffle partitions, we make a pass over those statistics and pack pre-shuffle partitions with continuous indices into a single post-shuffle partition until adding another pre-shuffle partition would cause the size of the post-shuffle partition to be greater than the target size. A sketch of this packing strategy follows the example below.

    For example, suppose two stages have the following pre-shuffle partition size statistics:

    stage 1: [100 MiB, 20 MiB, 100 MiB, 10 MiB, 30 MiB]
    stage 2: [10 MiB, 10 MiB, 70 MiB, 5 MiB, 5 MiB]

    Assuming the target input size is 128 MiB, we will have four post-shuffle partitions:

    • post-shuffle partition 0: pre-shuffle partition 0 (size 110 MiB)
    • post-shuffle partition 1: pre-shuffle partition 1 (size 30 MiB)
    • post-shuffle partition 2: pre-shuffle partition 2 (size 170 MiB)
    • post-shuffle partition 3: pre-shuffle partitions 3 and 4 (size 50 MiB)
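
    Here is the promised sketch of the packing strategy in plain Scala. It is illustrative only, not the rule's actual implementation, and coalescedStartIndices is a hypothetical helper:

      // Packs pre-shuffle partitions with contiguous indices until adding the
      // next one would push the post-shuffle partition past the target size.
      def coalescedStartIndices(sizes: Array[Long], targetSize: Long): Array[Int] = {
        val startIndices = scala.collection.mutable.ArrayBuffer(0)
        var currentSize = 0L
        for (i <- sizes.indices) {
          if (currentSize > 0 && currentSize + sizes(i) > targetSize) {
            startIndices += i // begin a new post-shuffle partition at index i
            currentSize = 0L
          }
          currentSize += sizes(i)
        }
        startIndices.toArray
      }

      // Per-index sizes summed across the two stages above:
      // [110, 30, 170, 15, 35] MiB. With a 128 MiB target, post-shuffle
      // partitions start at indices 0, 1, 2 and 3, matching the example.
      val mib = 1L << 20
      val combined = Array(110L, 30L, 170L, 15L, 35L).map(_ * mib)
      println(coalescedStartIndices(combined, 128 * mib).mkString(", ")) // 0, 1, 2, 3
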
  16. case class ReuseAdaptiveSubquery(conf: SQLConf, reuseMap: TrieMap[SparkPlan, BaseSubqueryExec]) extends Rule[SparkPlan] with Product with Serializable
  17. case class ReusedQueryStageExec(id: Int, plan: QueryStageExec, output: Seq[Attribute]) extends QueryStageExec with Product with Serializable

    A wrapper for a reused query stage, allowing it to have a different output.

  18. case class ShuffleQueryStageExec(id: Int, plan: ShuffleExchangeExec) extends QueryStageExec with Product with Serializable

    A shuffle query stage whose child is a ShuffleExchangeExec.

  19. case class SimpleCost(value: Long) extends Cost with Product with Serializable

    A simple implementation of Cost, which uses a single Long number as the cost value.

  20. case class StageFailure(stage: QueryStageExec, error: Throwable) extends StageMaterializationEvent with Product with Serializable

    The materialization of a query stage hit an error and failed.

  21. sealed trait StageMaterializationEvent extends AnyRef

    The event type for stage materialization.

  22. case class StageSuccess(stage: QueryStageExec, result: Any) extends StageMaterializationEvent with Product with Serializable

    The materialization of a query stage completed with success.
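
    As a sketch, a consumer of these events might pattern match on the sealed trait; onStageEvent is a hypothetical handler, not part of this API:

      import org.apache.spark.sql.execution.adaptive.{StageFailure, StageMaterializationEvent, StageSuccess}

      // AdaptiveSparkPlanExec's execution loop consumes events like these to
      // trigger re-optimization on success or to fail the query on error.
      def onStageEvent(event: StageMaterializationEvent): Unit = event match {
        case StageSuccess(stage, result) =>
          println(s"stage ${stage.id} materialized: $result")
        case StageFailure(stage, error) =>
          println(s"stage ${stage.id} failed: ${error.getMessage}")
      }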

Value Members

  1. object AdaptiveSparkPlanExec extends Serializable
  2. object BroadcastQueryStageExec extends Serializable
  3. object LogicalQueryStageStrategy extends Strategy with PredicateHelper

    Strategy for plans containing LogicalQueryStage nodes:

    1. Transforms LogicalQueryStage to its corresponding physical plan, which is either being executed or has already completed execution.
    2. Transforms a Join that has one child relation already planned and executed as a BroadcastQueryStageExec. This prevents reversing a broadcast stage into a shuffle stage in case the larger join child relation finishes before the smaller one. Note that this rule needs to be applied before the regular join strategies.

  4. object OptimizeLocalShuffleReader extends Serializable
  5. object ShuffleQueryStageExec extends Serializable
  6. object SimpleCostEvaluator extends CostEvaluator

    A simple implementation of CostEvaluator, which counts the number of ShuffleExchangeExec nodes in the plan.
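
    As a sketch of how the documented behavior could be expressed, the snippet below uses hypothetical stand-ins (ShuffleCount, evaluateCost) rather than the package's own Cost and CostEvaluator traits:

      import org.apache.spark.sql.execution.SparkPlan
      import org.apache.spark.sql.execution.exchange.ShuffleExchangeExec

      // The cost is the number of shuffle exchanges in the plan; a re-optimized
      // plan would be adopted only if its cost does not exceed the current one's.
      case class ShuffleCount(value: Long) extends Ordered[ShuffleCount] {
        override def compare(that: ShuffleCount): Int = value.compare(that.value)
      }

      def evaluateCost(plan: SparkPlan): ShuffleCount =
        ShuffleCount(plan.collect { case s: ShuffleExchangeExec => s }.size.toLong)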
