Package

org.apache.spark.sql.catalyst

expressions

package expressions

Linear Supertypes
AnyRef, Any

Type Members

  1. final class DirectStringConsumer extends MemoryConsumer

  2. case class DynamicFoldableExpression(expr: Expression) extends UnaryExpression with DynamicReplacableConstant with KryoSerializable with Product with Serializable

    Wrap any TokenizedLiteral expression with this so that we can invoke literal initialization code within the .init() method of the generated class.

    This pushes itself as a reference object and calls eval() on itself for the actual evaluation, which avoids embedding any generated code. The generated code therefore stays identical regardless of the constant expression (in addition, the DynamicReplacableConstant trait casts to itself rather than to the actual object type).

    We try to locate the first foldable expression in a query tree such that all of its children are foldable but its parent is not. That locates the exact point where an expression is safe to evaluate once instead of once per row.

    Queries such as select c from tab where case col2 when 1 then col3 else 'y' end = 22 do not move literal evaluation into the init method.

    expr

    minimal expression tree that can be evaluated only once and turned into a constant.
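
The search for the topmost foldable subtree described above can be sketched on a toy expression tree. This is only an illustration under stated assumptions: Expr, Lit, Col, Add, Eq, EvalOnce and wrapFoldable are hypothetical names invented here, not Catalyst classes, and EvalOnce merely stands in for the role DynamicFoldableExpression plays.

```scala
object FoldableSketch {
  // Toy expression tree (hypothetical, not Catalyst's Expression hierarchy).
  sealed trait Expr {
    def children: Seq[Expr]
    // Inner nodes are foldable when every child is; leaves override this.
    def foldable: Boolean = children.nonEmpty && children.forall(_.foldable)
  }
  final case class Lit(value: Int) extends Expr {
    def children: Seq[Expr] = Nil
    override def foldable: Boolean = true
  }
  final case class Col(name: String) extends Expr {
    def children: Seq[Expr] = Nil
    override def foldable: Boolean = false
  }
  final case class Add(left: Expr, right: Expr) extends Expr {
    def children: Seq[Expr] = Seq(left, right)
  }
  final case class Eq(left: Expr, right: Expr) extends Expr {
    def children: Seq[Expr] = Seq(left, right)
  }
  // Marker for "evaluate this subtree once"; stands in for the wrapper class.
  final case class EvalOnce(expr: Expr) extends Expr {
    def children: Seq[Expr] = Seq(expr)
  }

  // Walk top-down and wrap the first foldable node found on each path,
  // i.e. a node that is foldable while its parent is not.
  def wrapFoldable(e: Expr): Expr = e match {
    case w: EvalOnce     => w
    case _ if e.foldable => EvalOnce(e)
    case Add(l, r)       => Add(wrapFoldable(l), wrapFoldable(r))
    case Eq(l, r)        => Eq(wrapFoldable(l), wrapFoldable(r))
    case other           => other
  }
}
```

In this sketch, Eq(Col("a"), Add(Lit(1), Lit(2))) keeps the column comparison as is and wraps only the Add subtree, because that is the highest node whose children are all foldable.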

  3. case class DynamicInSet(child: Expression, hset: IndexedSeq[Expression]) extends UnaryExpression with Predicate with Product with Serializable

    Unlike Spark's InSet expression, this allows for TokenizedLiterals whose values can change dynamically between executions.
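
The difference between a fixed set and a set of literals that can change may be easier to see in a small model. The sketch below is an assumption-laden illustration, not the actual implementation: Slot, StaticIn and DynamicIn are hypothetical names, and Slot stands in for a literal whose value can be replaced between executions.

```scala
object DynamicInSetSketch {
  // Hypothetical mutable holder standing in for a tokenized literal.
  final class Slot(var value: Any)

  // Static variant: the values are captured once, at construction time.
  final class StaticIn(values: Seq[Slot]) {
    private val set: Set[Any] = values.map(_.value).toSet
    def contains(v: Any): Boolean = set.contains(v)
  }

  // Dynamic variant: the set is rebuilt from the slots for each execution,
  // so updated literal values are picked up without rebuilding the predicate.
  final class DynamicIn(values: IndexedSeq[Slot]) {
    def prepare(): Any => Boolean = {
      val set: Set[Any] = values.map(_.value).toSet
      v => set.contains(v)
    }
  }
}
```

Calling prepare() again after a slot's value has been replaced yields a predicate over the new values, whereas StaticIn keeps answering with the values it saw at construction.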

  4. trait DynamicReplacableConstant extends Expression

  5. case class ParamLiteral(value: Any, dataType: DataType, pos: Int, execId: Int, tokenized: Boolean = false, positionIndependent: Boolean = false, valueEquals: Boolean = false) extends LeafExpression with TokenizedLiteral with KryoSerializable with Product with Serializable

    In addition to what TokenLiteral provides, this class can be used in plan caching: its internal value can be updated in subsequent runs when the plan is reused with different constants. For that reason it does not extend Literal (to avoid, for example, the Analyzer/Optimizer doing constant propagation), and its hash/equals ignore the value and use only the position of the literal in the plan together with the data type.

    Wherever case matching on ParamLiteral is required, match on DynamicReplacableConstant instead and use .eval(..) for code generation; see SNAP-1597 for more details. For cases of common-subexpression elimination that depend on constant values being equal in different parts of the tree, a new RefParamLiteral has been added that points to a ParamLiteral and is always equal to it; see SNAP-2462 for more details.
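
A minimal sketch of equality that ignores the value and uses only position and data type, and of how that lets a cached plan be found again for different constants. Everything here is hypothetical: Param, cache and planFor are invented for illustration, and a String stands in for Spark's DataType.

```scala
import scala.collection.mutable

object ParamLiteralSketch {
  // Hypothetical stand-in: equality and hash use pos and dataType only.
  final class Param(var value: Any, val dataType: String, val pos: Int) {
    override def equals(other: Any): Boolean = other match {
      case p: Param => pos == p.pos && dataType == p.dataType
      case _        => false
    }
    override def hashCode(): Int = 31 * pos + dataType.hashCode
  }

  // Toy plan cache keyed by the tokenized literals of a query.
  private val cache = mutable.HashMap.empty[Seq[Param], String]

  // Returns the cached plan label for this literal shape, creating one on a miss.
  def planFor(params: Seq[Param]): String =
    cache.getOrElseUpdate(params, s"plan#${cache.size}")
}
```

Two queries that differ only in the constant at position 0 map to the same cache entry in this model, while a change of data type at that position produces a new entry.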

  6. trait ParamLiteralHolder extends AnyRef

  7. final class RefParamLiteral extends ParamLiteral

    This class is used as a substitution for ParamLiteral when two ParamLiterals have the same constant value during parsing. It behaves as equal to the ParamLiteral it points to in all respects but is different from other ParamLiterals. Two RefParamLiterals are equal iff their respective ParamLiterals are.

    This policy allows an expression like "a = 4 and b = 4" to be equal to "a = 5 and b = 5" after tokenization while remaining different from "a = 5 and b = 6". The distinction is required because the former can lead to a different execution plan: common-subexpression processing and similar optimizations can apply only because the actual values of the two tokenized literals are equal in this instance. When the actual constants differ the plan can differ, so after tokenization the two cases should act as different expressions. See TPCH Q19 for an example where equal values in two different positions lead to an optimized plan: the common subexpression is pulled out of the OR conditions as a separate AND condition, which enables further filter push down that is not possible if the actual values are different.

    Note: This class maintains its own copy of the value, since the value can change during execution (e.g. ROUND can change the precision of the underlying Decimal value), and such a change should not alter the value of the referenced ParamLiteral or vice versa. However, during planning, code generation and other phases before runJob, the value and dataType should match exactly, which is checked by referenceEquals. After deserialization on a remote executor, the class no longer maintains a reference and falls back to behaving like a regular ParamLiteral, since the required analysis and other phases are already done and final code generation requires a copy of the values.
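
The equality policy can be modelled with a toy tokenizer that compares only the shape of the literals. This sketch rests on assumptions: Token, Param, Ref and tokenize are hypothetical names, and it models only the comparison between whole tokenized expressions, not the rule that a reference also equals the literal it points to.

```scala
import scala.collection.mutable

object RefParamSketch {
  sealed trait Token
  // Only the first parameter list takes part in case class equals/hashCode,
  // so the value in the second list is ignored when comparing tokens.
  final case class Param(pos: Int)(val value: Any) extends Token
  final case class Ref(target: Param)(val value: Any) extends Token

  // The first occurrence of a value becomes a Param at its position; a later
  // occurrence of the same value becomes a Ref to that earlier Param.
  def tokenize(values: Seq[Any]): Seq[Token] = {
    val seen = mutable.HashMap.empty[Any, Param]
    values.zipWithIndex.map { case (v, pos) =>
      val token: Token = seen.get(v) match {
        case Some(first) => Ref(first)(v)
        case None =>
          val p = Param(pos)(v)
          seen(v) = p
          p
      }
      token
    }
  }
}
```

Under this model the constants of "a = 4 and b = 4" and "a = 5 and b = 5" both tokenize to a Param followed by a Ref to it, while "a = 5 and b = 6" tokenizes to two distinct Params, so the two shapes compare as different.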

  8. case class TermValues(literalValueRef: String, isNull: String, valueTerm: String) extends Product with Serializable

  9. final class TokenLiteral extends Literal with TokenizedLiteral with KryoSerializable

    A Literal that passes its value as a reference object in generated code instead of embedding it as a constant, which allows the generated code to be reused.
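
Why passing the value by reference allows reuse can be shown with two toy generators that emit source text. This is a sketch under assumptions: embed and byReference are hypothetical helpers, and the emitted Java-like strings are illustrative text only, not the code this class actually generates.

```scala
object TokenLiteralSketch {
  // Embedding: the constant becomes part of the generated source text,
  // so every distinct constant yields distinct source.
  def embed(ordinal: Int, constant: Int): String =
    s"return row.getInt($ordinal) == $constant;"

  // Referencing: the source names only a slot in a references array and the
  // value travels separately, so the source is the same for any constant.
  def byReference(ordinal: Int, constant: Int): (String, Array[Any]) =
    (s"return row.getInt($ordinal) == (Integer) references[0];", Array[Any](constant))
}
```

With embed, changing the constant changes the text and forces a new compilation in this model; with byReference the text is unchanged and only the array contents differ.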

  10. trait TokenizedLiteral extends LeafExpression with DynamicReplacableConstant

Value Members

  1. val EmptyRow: InternalRow

  2. object TokenLiteral extends Serializable
