Class

org.apache.spark.sql.catalyst.analysis

Analyzer

class Analyzer extends RuleExecutor[LogicalPlan] with CheckAnalysis

Provides a logical query plan analyzer, which translates UnresolvedAttributes and UnresolvedRelations into fully typed objects using information in a SessionCatalog and a FunctionRegistry.

Linear Supertypes
CheckAnalysis, PredicateHelper, RuleExecutor[LogicalPlan], Logging, AnyRef, Any

Instance Constructors

  1. new Analyzer(catalog: SessionCatalog, conf: CatalystConf)

  2. new Analyzer(catalog: SessionCatalog, conf: CatalystConf, maxIterations: Int)

Type Members

  1. case class Batch(name: String, strategy: Strategy, rules: Rule[TreeType]*) extends Product with Serializable

    A batch of rules.

    Attributes
    protected
    Definition Classes
    RuleExecutor
  2. case class FixedPoint(maxIterations: Int) extends Strategy with Product with Serializable

    A strategy that runs until a fixed point is reached or maxIterations passes have run, whichever comes first.

    Definition Classes
    RuleExecutor
  3. abstract class Strategy extends AnyRef

    An execution strategy for rules that indicates the maximum number of executions. If execution reaches a fixed point (i.e. converges) before maxIterations, it stops.

    Definition Classes
    RuleExecutor
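The way Batch, Once, and FixedPoint fit together can be illustrated with a small self-contained model. This is a sketch only, not Spark's RuleExecutor: the object name MiniRuleExecutor, the generic plan type T, and the equality-based convergence check are assumptions made for illustration.

```scala
// Toy model of a rule executor; NOT Spark's RuleExecutor implementation.
object MiniRuleExecutor {
  // A rule rewrites a plan; here a "plan" is any value of type T (assumption).
  type Rule[T] = T => T

  sealed trait Strategy { def maxIterations: Int }
  case object Once extends Strategy { val maxIterations = 1 }
  final case class FixedPoint(maxIterations: Int) extends Strategy

  final case class Batch[T](name: String, strategy: Strategy, rules: Rule[T]*)

  // Runs batches in order. Within a batch, applies the rules in order and
  // repeats the pass until the plan stops changing or maxIterations is hit.
  def execute[T](batches: Seq[Batch[T]])(plan: T): T =
    batches.foldLeft(plan) { (current, batch) =>
      var result = current
      var iteration = 0
      var changed = true
      while (changed && iteration < batch.strategy.maxIterations) {
        val next = batch.rules.foldLeft(result)((p, rule) => rule(p))
        changed = next != result
        result = next
        iteration += 1
      }
      result
    }
}
```

In this model, a batch with strategy Once makes exactly one pass over its rules, while FixedPoint(n) stops early as soon as a pass leaves the plan unchanged.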

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. object CTESubstitution extends Rule[LogicalPlan]

    Substitutes child plans with CTE definitions.

  5. object ExtractGenerator extends Rule[LogicalPlan]

    Extracts a Generator from the projectList of a Project operator and creates a Generate operator under the Project.

    This rule throws an AnalysisException in the following cases:

    1. The Generator is nested in an expression, e.g. SELECT explode(list) + 1 FROM tbl
    2. More than one Generator is found in the projectList, e.g. SELECT explode(list), explode(list) FROM tbl
    3. A Generator is found in an operator other than Project or Generate, e.g. SELECT * FROM tbl SORT BY explode(list)

  6. object ExtractWindowExpressions extends Rule[LogicalPlan]

    Extracts WindowExpressions from the projectList of a Project operator and the aggregateExpressions of an Aggregate operator, and creates an individual Window operator for every distinct WindowSpecDefinition.

    This rule handles three cases:

    • A Project having WindowExpressions in its projectList.
    • An Aggregate having WindowExpressions in its aggregateExpressions.
    • A Filter->Aggregate pattern representing GROUP BY with a HAVING clause, where the Aggregate has WindowExpressions in its aggregateExpressions.

    Note: If there is a GROUP BY clause in the query, aggregations and corresponding filters (expressions in the HAVING clause) should be evaluated before any WindowExpression. If a query has SELECT DISTINCT, the DISTINCT part should be evaluated after all WindowExpressions.

    For every case, the transformation works as follows:

    1. For a list of Expressions (a projectList or an aggregateExpressions), partitions it into two lists of Expressions, one for all WindowExpressions and another for all regular expressions.
    2. For all WindowExpressions, groups them based on their WindowSpecDefinitions.
    3. For every distinct WindowSpecDefinition, creates a Window operator and inserts it into the plan tree.
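The partition-and-group steps described for ExtractWindowExpressions can be sketched on a simplified expression model. The types Regular and WindowExpr, and the use of a plain String as the window specification, are hypothetical stand-ins and not Catalyst classes.

```scala
// Toy model of the partition/group steps; NOT the Catalyst implementation.
object WindowExtractionSketch {
  sealed trait Expr
  final case class Regular(name: String) extends Expr
  // `spec` stands in for a WindowSpecDefinition (assumption: a plain String key).
  final case class WindowExpr(function: String, spec: String) extends Expr

  // Step 1: split the list into window expressions and regular expressions.
  // Step 2: group the window expressions by their window specification.
  // Step 3 (not modeled): one Window operator would be created per map entry.
  def extract(exprs: Seq[Expr]): (Map[String, Seq[WindowExpr]], Seq[Regular]) = {
    val windowExprs = exprs.collect { case w: WindowExpr => w }
    val regular = exprs.collect { case r: Regular => r }
    (windowExprs.groupBy(_.spec), regular)
  }
}
```

Each key of the returned map corresponds to one distinct window specification, which is where the rule would insert a Window operator.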

  7. object FixNullability extends Rule[LogicalPlan]

    Fixes the nullability of Attributes in a resolved LogicalPlan by using the nullability of the corresponding Attributes in the output of its children. This step is needed because users can use a resolved AttributeReference in the Dataset API, and outer joins can change the nullability of an AttributeReference. Without the fix, a nullable column's nullable field can actually be set to non-nullable, which causes illegal optimizations (e.g., NULL propagation) and wrong answers. See SPARK-13484 and SPARK-13801 for concrete queries that exhibit this case.

  8. object GlobalAggregates extends Rule[LogicalPlan]

    Turns projections that contain aggregate expressions into aggregations.

  9. object HandleNullInputsForUDF extends Rule[LogicalPlan]

    Correctly handles null primitive inputs for a UDF by adding an extra If expression to do the null check. When a user defines a UDF with primitive parameters, there is no way to tell whether a primitive parameter is null, so the rule assumes the primitive input is null-propagatable and returns null if the input is null.
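The effect of the added null check can be illustrated outside Catalyst with boxed Java integers. This is a minimal sketch under the assumption that the input arrives boxed; the helper nullSafe is hypothetical and only mirrors the "if the input is null, return null, otherwise call the function" shape.

```scala
// Toy illustration of a null guard around a primitive function; NOT Catalyst code.
object NullSafeUdfSketch {
  // Wraps a function over primitive Int so that a null boxed input yields
  // null instead of being unboxed (which would otherwise fail or give 0).
  def nullSafe(f: Int => Int): java.lang.Integer => java.lang.Integer =
    (in: java.lang.Integer) =>
      if (in == null) null else Integer.valueOf(f(in.intValue))
}
```

For example, wrapping `_ + 1` with nullSafe gives a function that maps a null input to null and otherwise adds one to the unboxed value.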

  10. object Once extends Strategy with Product with Serializable

    A strategy that only runs once.

    Definition Classes
    RuleExecutor
  11. object PullOutNondeterministic extends Rule[LogicalPlan]

    Pulls out nondeterministic expressions from a LogicalPlan that is not a Project or Filter, puts them into an inner Project, and finally projects them away in an outer Project.

  12. object ResolveAggregateFunctions extends Rule[LogicalPlan]

    This rule finds aggregate expressions that are not in an aggregate operator, for example those in a HAVING clause or ORDER BY clause. These expressions are pushed down to the underlying aggregate operator and then projected away after the original operator.

  13. object ResolveAliases extends Rule[LogicalPlan]

    Replaces UnresolvedAlias expressions with concrete aliases.

  14. object ResolveDeserializer extends Rule[LogicalPlan]

    Replaces UnresolvedDeserializer with the deserialization expression that has been resolved to the given input attributes.

  15. object ResolveFunctions extends Rule[LogicalPlan]

    Replaces UnresolvedFunctions with concrete Expressions.

  16. object ResolveGenerate extends Rule[LogicalPlan]

    Rewrites table-generating expressions that need one or more of the following in order to be resolved:

    • concrete attribute references for their output.
    • to be relocated from a SELECT clause (i.e. from a Project) into a Generate.

    Names for the output Attributes are extracted from Alias or MultiAlias expressions that wrap the Generator.

  17. object ResolveGroupingAnalytics extends Rule[LogicalPlan]

  18. object ResolveMissingReferences extends Rule[LogicalPlan]

    In many dialects of SQL it is valid to sort by attributes that are not present in the SELECT clause. This rule detects such queries and adds the required attributes to the original projection, so that they will be available during sorting. Another projection is added to remove these attributes after sorting.

    The HAVING clause can also use a grouping column that is not present in the SELECT clause.
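The widen, sort, then project-away pattern described for ResolveMissingReferences can be sketched over in-memory rows. The Row alias and the selectOrderBy helper are hypothetical and operate on plain maps, not on logical plans.

```scala
// Toy illustration of sorting by a column that is not selected; NOT Catalyst code.
object MissingReferencesSketch {
  type Row = Map[String, Int]

  // Models: SELECT <selected> FROM rows ORDER BY <sortColumn>,
  // where sortColumn may be absent from the selected columns.
  def selectOrderBy(rows: Seq[Row], selected: Seq[String], sortColumn: String): Seq[Row] = {
    // Inner projection: the selected columns plus the sort column if missing.
    val widened = (selected :+ sortColumn).distinct
    val projected = rows.map(r => r.filter { case (k, _) => widened.contains(k) })
    val sorted = projected.sortBy(r => r(sortColumn))
    // Outer projection: drop the column that was added only for sorting.
    sorted.map(r => r.filter { case (k, _) => selected.contains(k) })
  }
}
```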

  19. object ResolveNaturalAndUsingJoin extends Rule[LogicalPlan]

    Removes natural or using joins by calculating the output columns from the output of the two sides, then applying a Project on top of a normal Join to eliminate the natural or using join.
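As a rough illustration of "calculating output columns", the sketch below computes the column list of an inner natural join from two column lists. It assumes the usual SQL convention that common columns come first, followed by the remaining left and then the remaining right columns. The helper naturalJoinOutput is hypothetical, and outer joins are not modeled.

```scala
// Toy computation of natural-join output columns; NOT Catalyst code.
object NaturalJoinSketch {
  // Assumption: inner join; common columns first, then left-only, then right-only.
  def naturalJoinOutput(left: Seq[String], right: Seq[String]): Seq[String] = {
    val common = left.filter(c => right.contains(c))
    val leftOnly = left.filterNot(c => common.contains(c))
    val rightOnly = right.filterNot(c => common.contains(c))
    common ++ leftOnly ++ rightOnly
  }
}
```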

  20. object ResolveNewInstance extends Rule[LogicalPlan]


    Resolves NewInstance by finding and adding the outer scope to it if the object being constructed is an inner class.

  21. object ResolveOrdinalInOrderByAndGroupBy extends Rule[LogicalPlan]

    In many dialects of SQL it is valid to use ordinal positions in order/sort by and group by clauses. This rule converts ordinal positions to the corresponding expressions in the select list. This support was introduced in Spark 2.0.

    • When the sort references or group by expressions are foldable expressions but not integers, they are ignored.
    • When spark.sql.orderByOrdinal/spark.sql.groupByOrdinal is set to false, the position numbers are ignored too.

    Before the release of Spark 2.0, literals in order/sort by and group by clauses had no effect on the results.
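The ordinal substitution can be sketched with a simplified model of sort keys. The types Ordinal and Named, the ordinalEnabled flag (standing in for the configuration switches), and the range check via require are assumptions for illustration, not the rule's actual code.

```scala
// Toy model of ordinal resolution; NOT the Catalyst rule.
object OrdinalSketch {
  sealed trait SortKey
  final case class Ordinal(position: Int) extends SortKey // e.g. ORDER BY 2
  final case class Named(expr: String) extends SortKey    // e.g. ORDER BY b

  // Replaces 1-based positions with the matching select-list expression.
  // When ordinalEnabled is false, positions are left untouched.
  def resolve(selectList: Seq[String], keys: Seq[SortKey],
              ordinalEnabled: Boolean = true): Seq[SortKey] =
    keys.map {
      case Ordinal(i) if ordinalEnabled =>
        // Assumption: out-of-range positions are rejected.
        require(i >= 1 && i <= selectList.size, s"position $i is not in the select list")
        Named(selectList(i - 1))
      case other => other
    }
}
```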

  22. object ResolvePivot extends Rule[LogicalPlan]

  23. object ResolveReferences extends Rule[LogicalPlan]

    Replaces UnresolvedAttributes with concrete AttributeReferences from a logical plan node's children.

  24. object ResolveRelations extends Rule[LogicalPlan]

    Replaces UnresolvedRelations with concrete relations from the catalog.

  25. object ResolveSubquery extends Rule[LogicalPlan] with PredicateHelper

    This rule resolves and rewrites subqueries inside expressions.

    Note: CTEs are handled in CTESubstitution.

  26. object ResolveUpCast extends Rule[LogicalPlan]

    Replaces the UpCast expression with Cast, and throws an exception if the cast may truncate.

  27. object ResolveWindowFrame extends Rule[LogicalPlan]

    Checks and adds proper window frames for all window functions.

  28. object ResolveWindowOrder extends Rule[LogicalPlan]

    Checks and adds an order to AggregateWindowFunctions.

  29. object WindowsSubstitution extends Rule[LogicalPlan]

    Substitutes child plans with WindowSpecDefinitions.

  30. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  31. lazy val batches: Seq[Batch]

    Defines a sequence of rule batches, to be overridden by the implementation.

    Definition Classes
    Analyzer → RuleExecutor
  32. def canEvaluate(expr: Expression, plan: LogicalPlan): Boolean

    Returns true if expr can be evaluated using only the output of plan. This method can be used to determine when it is acceptable to move expression evaluation within a query plan.

    For example, consider a join between two relations R(a, b) and S(c, d).

    canEvaluate(EqualTo(a,b), R) returns true, whereas canEvaluate(EqualTo(a,c), R) returns false.

    Attributes
    protected
    Definition Classes
    PredicateHelper
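The R(a, b) and S(c, d) example for canEvaluate can be reproduced with a simplified model in which an expression is reduced to the attribute names it references and a plan to the attribute names it outputs. The types Expr and Plan are hypothetical, and the subset test is an assumption about how such a check can be written, not a quotation of the PredicateHelper source.

```scala
// Toy model of the canEvaluate check; NOT the PredicateHelper implementation.
object CanEvaluateSketch {
  // An expression is modeled only by the attribute names it references.
  final case class Expr(references: Set[String])
  // A plan is modeled only by the attribute names it outputs.
  final case class Plan(output: Set[String])

  // True when every referenced attribute is produced by the plan.
  def canEvaluate(expr: Expr, plan: Plan): Boolean =
    expr.references.subsetOf(plan.output)
}
```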
  33. def checkAnalysis(plan: LogicalPlan): Unit

    Definition Classes
    CheckAnalysis
  34. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  35. def containsMultipleGenerators(exprs: Seq[Expression]): Boolean

    Attributes
    protected
    Definition Classes
    CheckAnalysis
  36. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  37. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  38. def execute(plan: LogicalPlan): LogicalPlan

    Executes the batches of rules defined by the subclass. The batches are executed serially using the defined execution strategy. Within each batch, rules are also executed serially.

    Definition Classes
    RuleExecutor
  39. val extendedCheckRules: Seq[(LogicalPlan) ⇒ Unit]

    Override to provide additional checks for correct analysis. These rules will be evaluated after our built-in check rules.

    Definition Classes
    CheckAnalysis
  40. val extendedResolutionRules: Seq[Rule[LogicalPlan]]

    Override to provide additional rules for the "Resolution" batch.

  41. def failAnalysis(msg: String): Nothing

    Attributes
    protected
    Definition Classes
    CheckAnalysis
  42. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  43. val fixedPoint: FixedPoint

    Attributes
    protected
  44. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  45. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  46. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  47. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  48. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  49. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  50. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  51. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  52. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  53. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  54. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  55. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  56. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  57. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  58. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  59. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  60. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  61. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  62. final def notify(): Unit

    Definition Classes
    AnyRef
  63. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  64. def replaceAlias(condition: Expression, aliases: AttributeMap[Expression]): Expression

    Attributes
    protected
    Definition Classes
    PredicateHelper
  65. def resolveExpression(expr: Expression, plan: LogicalPlan, throws: Boolean = false): Expression

    Attributes
    protected[org.apache.spark.sql]
  66. def resolver: Resolver

  67. def splitConjunctivePredicates(condition: Expression): Seq[Expression]

    Attributes
    protected
    Definition Classes
    PredicateHelper
  68. def splitDisjunctivePredicates(condition: Expression): Seq[Expression]

    Attributes
    protected
    Definition Classes
    PredicateHelper
  69. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  70. def toString(): String

    Definition Classes
    AnyRef → Any
  71. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  72. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  73. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
