Class

org.apache.spark.sql.catalyst.plans.logical.statsEstimation

FilterEstimation

case class FilterEstimation(plan: Filter, catalystConf: SQLConf) extends Logging with Product with Serializable

Linear Supertypes
Serializable, Serializable, Product, Equals, Logging, AnyRef, Any

Instance Constructors

  1. new FilterEstimation(plan: Filter, catalystConf: SQLConf)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def calculateFilterSelectivity(condition: Expression, update: Boolean = true): Option[BigDecimal]

    Returns the percentage of rows that satisfy a condition in a Filter node.

    A single condition is estimated directly. A compound condition is decomposed into single conditions linked with AND, OR, and NOT. For logical AND, the stats are updated after each condition is estimated so that subsequent estimates are more accurate; this matters for range conditions such as (c > 40 AND c <= 50). For logical OR and NOT, the stats are not updated after a condition is estimated.

    condition

    the compound logical expression

    update

    a boolean flag to specify if we need to update ColumnStat of a column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition. It returns None if the condition is not supported.
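
    The effect of updating stats between AND-ed range conditions can be sketched in plain Scala. This is a minimal illustration under stated assumptions, not Spark's implementation: `ColRange`, `greaterThan`, `lessOrEqual`, and the uniform-distribution assumption are all hypothetical, and `Double` is used for brevity where the method above returns a BigDecimal.

```scala
object RangeUpdateSketch {
  // Hypothetical stand-in for a column's min/max statistics (not Spark's ColumnStat).
  final case class ColRange(min: Double, max: Double)

  // part / whole, treating an empty range as selecting nothing.
  private def fraction(part: Double, whole: Double): Double =
    if (whole <= 0.0) 0.0 else part / whole

  // Keep a bound inside [min, max].
  private def clamp(r: ColRange, bound: Double): Double =
    math.min(math.max(bound, r.min), r.max)

  // Selectivity of `c > bound` assuming uniformly distributed values,
  // together with the narrowed range that survives the predicate.
  def greaterThan(r: ColRange, bound: Double): (Double, ColRange) = {
    val lo = clamp(r, bound)
    (fraction(r.max - lo, r.max - r.min), ColRange(lo, r.max))
  }

  // Selectivity of `c <= bound` under the same assumption.
  def lessOrEqual(r: ColRange, bound: Double): (Double, ColRange) = {
    val hi = clamp(r, bound)
    (fraction(hi - r.min, r.max - r.min), ColRange(r.min, hi))
  }

  // AND with stats update: the second condition sees the narrowed range.
  def andWithUpdate(r: ColRange, lower: Double, upper: Double): Double = {
    val (p1, narrowed) = greaterThan(r, lower)
    val (p2, _) = lessOrEqual(narrowed, upper)
    p1 * p2
  }

  // AND without update: both conditions see the original range.
  def andWithoutUpdate(r: ColRange, lower: Double, upper: Double): Double = {
    val (p1, _) = greaterThan(r, lower)
    val (p2, _) = lessOrEqual(r, upper)
    p1 * p2
  }
}
```

    For a column ranging over [0, 100], `andWithUpdate(ColRange(0, 100), 40, 50)` comes out near 0.1, which matches the share of the range covered by (40, 50], whereas `andWithoutUpdate` gives about 0.3.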

  6. def calculateSingleCondition(condition: Expression, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy a single condition in a Filter node.

    Currently only binary predicates are supported, where one side is a column and the other is a literal.

    condition

    a single logical expression

    update

    a boolean flag to specify if we need to update ColumnStat of a column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition. It returns None if the condition is not supported.
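
    Because the column may sit on either side of the operator, a "literal op column" predicate has to be read as the mirrored "column op literal" form. The sketch below shows one way to do that with a hypothetical mini expression tree; `Expr`, `Column`, `Lit`, and `Normalised` are illustrative names and are not Catalyst classes.

```scala
object SingleConditionSketch {
  // Hypothetical mini expression tree (not Catalyst's Expression hierarchy).
  sealed trait Expr
  final case class Column(name: String) extends Expr
  final case class Lit(value: Int) extends Expr
  final case class LessThan(left: Expr, right: Expr) extends Expr
  final case class GreaterThan(left: Expr, right: Expr) extends Expr

  // Normalised form: column on the left, literal on the right.
  final case class Normalised(column: String, op: String, literal: Int)

  // Returns None for any shape other than column-vs-literal.
  def normalise(e: Expr): Option[Normalised] = e match {
    case LessThan(Column(c), Lit(v))    => Some(Normalised(c, "<", v))
    case LessThan(Lit(v), Column(c))    => Some(Normalised(c, ">", v)) // v < c is c > v
    case GreaterThan(Column(c), Lit(v)) => Some(Normalised(c, ">", v))
    case GreaterThan(Lit(v), Column(c)) => Some(Normalised(c, "<", v)) // v > c is c < v
    case _                              => None
  }
}
```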

  7. val catalystConf: SQLConf

  8. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  9. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  10. def estimate: Option[Statistics]

    Returns an option of Statistics for a Filter logical plan node.

    For a given compound expression condition, this method computes the filter selectivity (the percentage of rows meeting the filter condition), which is used to compute the row count, the size in bytes, and the updated statistics after the given predicate is applied.

    returns

    An Option[Statistics]. It returns None when no statistics have been collected.
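
    How a selectivity turns into a row count and a size can be sketched as follows. This is a simplified illustration, not the actual method body: `Stats` is a hypothetical stand-in for Statistics, rounding the row count up and scaling the size by the average row size are assumptions, and per-column statistics are left out.

```scala
object EstimateSketch {
  // Hypothetical stand-in for plan-level statistics.
  final case class Stats(rowCount: BigInt, sizeInBytes: BigInt)

  private def ceil(d: BigDecimal): BigInt =
    d.setScale(0, BigDecimal.RoundingMode.CEILING).toBigInt

  // None in, None out: without child statistics nothing can be estimated.
  def applySelectivity(child: Option[Stats], selectivity: BigDecimal): Option[Stats] =
    child.map { s =>
      // Estimated rows surviving the filter, rounded up (an assumption).
      val rows = ceil(BigDecimal(s.rowCount) * selectivity)
      // Scale the size by the average row size of the child (an assumption).
      val avgRowSize =
        if (s.rowCount.signum == 0) BigDecimal(0)
        else BigDecimal(s.sizeInBytes) / BigDecimal(s.rowCount)
      Stats(rows, ceil(avgRowSize * BigDecimal(rows)))
    }
}
```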

  11. def evaluateBinary(op: BinaryComparison, attr: Attribute, literal: Literal, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy a binary comparison expression.

    op

    a binary comparison operator such as =, <, <=, >, >=

    attr

    an Attribute (or a column)

    literal

    a literal value (or constant)

    update

    a boolean flag to specify if we need to update ColumnStat of a given column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition. It returns None if no statistics exist for the given column or if the value is not valid.

  12. def evaluateBinaryForNumeric(op: BinaryComparison, attr: Attribute, literal: Literal, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy a binary comparison expression.

    This method evaluates the expression for Numeric/Date/Timestamp/Boolean columns.

    op

    a binary comparison operator such as =, <, <=, >, >=

    attr

    an Attribute (or a column)

    literal

    a literal value (or constant)

    update

    a boolean flag to specify if we need to update ColumnStat of a given column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition
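
    One way to treat numeric, date, and boolean columns uniformly is to map every value onto a single numeric axis before comparing. The sketch below does this for a "less than" predicate. The encoding in `toNumeric` (epoch days for dates, 0/1 for booleans) and the uniform-distribution formula are assumptions made for illustration, not details confirmed by this page.

```scala
import java.time.LocalDate

object NumericComparisonSketch {
  // Map supported value kinds onto one numeric axis (hypothetical encoding).
  def toNumeric(v: Any): Option[BigDecimal] = v match {
    case i: Int       => Some(BigDecimal(i))
    case l: Long      => Some(BigDecimal(l))
    case d: Double    => Some(BigDecimal(d))
    case b: Boolean   => Some(if (b) BigDecimal(1) else BigDecimal(0))
    case d: LocalDate => Some(BigDecimal(d.toEpochDay))
    case _            => None // unsupported type, e.g. String
  }

  // Fraction of [min, max] lying below `literal`, assuming uniform values.
  def lessThan(min: Any, max: Any, literal: Any): Option[BigDecimal] =
    for {
      lo <- toNumeric(min)
      hi <- toNumeric(max)
      v  <- toNumeric(literal)
    } yield {
      if (v <= lo) BigDecimal(0)     // literal at or below the range: no rows
      else if (v > hi) BigDecimal(1) // literal above the range: all rows
      else (v - lo) / (hi - lo)      // here lo < v <= hi, so hi > lo
    }
}
```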

  13. def evaluateBinaryForTwoColumns(op: BinaryComparison, attrLeft: Attribute, attrRight: Attribute, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy a binary comparison expression containing two columns.

    SQL queries can also contain predicate expressions involving two columns, such as "column-1 (op) column-2", where column-1 and column-2 belong to the same table. If column-1 and column-2 belong to different tables, the predicate is handled by a join operator, not a filter operator.

    op

    a binary comparison operator, including =, <=>, <, <=, >, >=

    attrLeft

    the left Attribute (or a column)

    attrRight

    the right Attribute (or a column)

    update

    a boolean flag to specify if we need to update ColumnStat of the given columns for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition
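
    For a two-column predicate, the min/max ranges of the two columns already settle some cases. The sketch below handles "left < right": if the ranges do not overlap, the answer is all rows or no rows, and otherwise a caller-supplied guess is used. `ColRange` and the `fallback` parameter are hypothetical, nulls are ignored, and this page does not state how the overlapping case is actually estimated.

```scala
object TwoColumnSketch {
  // Hypothetical min/max range of a column.
  final case class ColRange(min: Long, max: Long)

  // Selectivity of `left < right` for two columns of the same table.
  def lessThan(left: ColRange, right: ColRange, fallback: BigDecimal): BigDecimal =
    if (left.max < right.min) BigDecimal(1)       // every left value is below every right value
    else if (left.min >= right.max) BigDecimal(0) // no left value is below any right value
    else fallback                                 // ranges overlap: min/max alone cannot decide
}
```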

  14. def evaluateEquality(attr: Attribute, literal: Literal, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy an equality (=) expression.

    This method evaluates the equality predicate for all data types.

    For EqualNullSafe (<=>), if the literal is not null, the result is the same as for EqualTo; if the literal is null, the condition is changed to IsNull after optimization. No specific logic for EqualNullSafe is therefore needed here.

    attr

    an Attribute (or a column)

    literal

    a literal value (or constant)

    update

    a boolean flag to specify if we need to update ColumnStat of a given column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition
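
    A common way to estimate equality selectivity, assumed here rather than taken from this page, is 1 / (number of distinct values) when the literal lies inside the column's [min, max] range, and 0 otherwise. `ColStats` is a hypothetical stand-in for the column statistics.

```scala
object EqualitySketch {
  // Hypothetical subset of column statistics.
  final case class ColStats(min: BigDecimal, max: BigDecimal, distinctCount: BigInt)

  // None when the column has no statistics.
  def equalTo(stats: Option[ColStats], literal: BigDecimal): Option[BigDecimal] =
    stats.map { s =>
      val outOfRange = literal < s.min || literal > s.max
      if (outOfRange || s.distinctCount.signum <= 0) BigDecimal(0)
      else BigDecimal(1) / BigDecimal(s.distinctCount) // each distinct value equally likely
    }
}
```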

  15. def evaluateInSet(attr: Attribute, hSet: Set[Any], update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy an "IN" operator expression.

    This method evaluates the equality predicate for all data types.

    attr

    an Attribute (or a column)

    hSet

    a set of literal values

    update

    a boolean flag to specify if we need to update ColumnStat of a given column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition. It returns None if no statistics exist for the given column.
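
    An "IN" predicate can be viewed as several equality predicates on the same column. The sketch below counts the set values that fall inside the column's range, divides by the distinct count, and caps the result at 1. The formula and the `Int`-typed parameters are assumptions for illustration only.

```scala
object InSetSketch {
  // Selectivity of `c IN (values)` for an integer column with the given stats.
  def inSet(min: Int, max: Int, distinctCount: Int, values: Set[Int]): BigDecimal = {
    // Values outside [min, max] cannot match any row.
    val inRange = values.count(v => v >= min && v <= max)
    if (distinctCount <= 0 || inRange == 0) BigDecimal(0)
    else (BigDecimal(inRange) / BigDecimal(distinctCount)).min(BigDecimal(1))
  }
}
```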

  16. def evaluateLiteral(literal: Literal): Option[BigDecimal]

    Returns the percentage of rows that satisfy a Literal expression.

    This method evaluates all the possible literal cases in Filter.

    FalseLiteral and TrueLiteral should be eliminated by the optimizer, but a null literal might be added by the optimizer rule NullPropagation. For safety, all the cases are handled here.

    literal

    a literal value (or constant)

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition
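
    The three literal cases can be sketched as a small match. Here a boolean literal is modelled as a hypothetical `Option[Boolean]`, with None standing for SQL NULL. A filter keeps a row only when its predicate evaluates to true, so a NULL predicate keeps no rows. The sketch returns a plain BigDecimal rather than the Option used by the method above.

```scala
object LiteralSketch {
  // None stands for a NULL literal in this sketch.
  def literalSelectivity(literal: Option[Boolean]): BigDecimal = literal match {
    case Some(true)  => BigDecimal(1) // every row passes
    case Some(false) => BigDecimal(0) // no row passes
    case None        => BigDecimal(0) // NULL is not true, so no row passes
  }
}
```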

  17. def evaluateNullCheck(attr: Attribute, isNull: Boolean, update: Boolean): Option[BigDecimal]

    Returns the percentage of rows that satisfy an "IS NULL" or "IS NOT NULL" condition.

    attr

    an Attribute (or a column)

    isNull

    true for an "IS NULL" condition; false for an "IS NOT NULL" condition

    update

    a boolean flag to specify if we need to update ColumnStat of a given column for subsequent conditions

    returns

    an optional BigDecimal value giving the percentage of rows meeting the given condition. It returns None if no statistics have been collected for the given column.
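
    A null check can be estimated from the column's null count alone. The sketch below assumes the selectivity of "IS NULL" is nullCount / rowCount and that "IS NOT NULL" is its complement; the parameter names are hypothetical and a missing null count models a column without statistics.

```scala
object NullCheckSketch {
  // nullCount is None when no statistics were collected for the column.
  def nullCheck(rowCount: BigInt, nullCount: Option[BigInt], isNull: Boolean): Option[BigDecimal] =
    nullCount.map { n =>
      val nullFraction =
        if (rowCount.signum == 0) BigDecimal(0) // empty input: avoid dividing by zero
        else BigDecimal(n) / BigDecimal(rowCount)
      if (isNull) nullFraction else BigDecimal(1) - nullFraction
    }
}
```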

  18. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  21. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  22. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  23. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  24. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  25. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  26. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  27. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  28. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  30. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  31. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  33. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  35. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  36. final def notify(): Unit

    Definition Classes
    AnyRef
  37. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  38. val plan: Filter

  39. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  40. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  41. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  42. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
