Returns the tree node at the specified number. Numbers for each node can be found in the numberedTreeString.
Returns a string representing the arguments to this node, minus any children
Returns a 'scala code' representation of this TreeNode and its children. Intended for use when debugging where the prettier toString function is obfuscating the actual structure. In the case of 'pure' TreeNodes that only contain primitives and other TreeNodes, the result can be pasted in the REPL to build an equivalent Tree.
Returns an expression where a best effort attempt has been made to transform this in a way that preserves the result but removes cosmetic variations (case sensitivity, ordering for commutative operations, etc.). See Canonicalize for more details.
Deterministic expressions where this.canonicalized == other.canonicalized will always evaluate to the same result.
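The idea of canonicalization can be sketched on a toy expression tree. The types below (`Expr`, `Attr`, `Lit`, `Add`) and the helper names are hypothetical stand-ins, not Spark's classes, and ordering operands by their string form is an assumption chosen only to keep the example short.

```scala
// Toy model only: these are NOT Spark's Expression classes.
object CanonicalizeSketch {
  sealed trait Expr
  final case class Attr(name: String) extends Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Remove cosmetic variation: lower-case attribute names and put the
  // operands of the commutative Add into a stable order.
  def canonicalize(e: Expr): Expr = e match {
    case Attr(n) => Attr(n.toLowerCase)
    case l: Lit  => l
    case Add(l, r) =>
      val ordered = Seq(canonicalize(l), canonicalize(r)).sortBy(_.toString)
      Add(ordered(0), ordered(1))
  }

  // Two expressions are treated as equivalent when their canonical forms match.
  def semanticEquals(a: Expr, b: Expr): Boolean =
    canonicalize(a) == canonicalize(b)
}
```

With this sketch, `Add(Attr("A"), Lit(1))` and `Add(Lit(1), Attr("a"))` compare as semantically equal even though plain `==` distinguishes them.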
Checks the input data types, returns TypeCheckResult.success if it's valid, or returns a TypeCheckResult with an error message if invalid.
Note: it's not valid to call this method until childrenResolved == true.
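The success-or-error-message contract described above might look like the following in a simplified setting. The result types and the string-based type names here are assumptions made for illustration; they are not the actual TypeCheckResult API.

```scala
// Hypothetical stand-ins for a type-check result; not Spark's TypeCheckResult.
object TypeCheckSketch {
  sealed trait CheckResult
  case object CheckSuccess extends CheckResult
  final case class CheckFailure(message: String) extends CheckResult

  // Illustrative check for a binary arithmetic node: both inputs must have
  // the same type, and that type must be numeric.
  def checkInputDataTypes(left: String, right: String): CheckResult = {
    val numeric = Set("int", "long", "double")
    if (left != right) CheckFailure(s"differing types: $left and $right")
    else if (!numeric(left)) CheckFailure(s"expected a numeric type, got $left")
    else CheckSuccess
  }
}
```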
Returns a Seq of the children of this node. Children should not change. Immutability is required for the containsChild optimization.
Returns true if all the children of this expression have been resolved to a specific schema and false if any still contains unresolved placeholders.
Returns a Seq containing the result of applying a partial function to all elements in this tree on which the function is defined.
Finds and returns the first TreeNode of the tree for which the given partial function is defined (pre-order), and applies the partial function to it.
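A minimal sketch of how collecting with a partial function over a tree could work, assuming a hypothetical `Node` class rather than the real TreeNode. Both methods visit nodes in pre-order; the second stops at the first node where the partial function is defined.

```scala
// Hypothetical minimal tree; a stand-in for TreeNode, not its real definition.
object CollectSketch {
  final case class Node(name: String, children: Seq[Node] = Nil) {
    // Pre-order traversal: this node first, then each subtree in order.
    def foreach(f: Node => Unit): Unit = {
      f(this)
      children.foreach(_.foreach(f))
    }

    // Apply pf to every node on which it is defined and gather the results.
    def collect[B](pf: PartialFunction[Node, B]): Seq[B] = {
      val out = Seq.newBuilder[B]
      foreach(n => if (pf.isDefinedAt(n)) out += pf(n))
      out.result()
    }

    // Return the result for the first pre-order match only; the iterator
    // keeps the search lazy so later subtrees are not visited needlessly.
    def collectFirst[B](pf: PartialFunction[Node, B]): Option[B] =
      if (pf.isDefinedAt(this)) Some(pf(this))
      else children.iterator.map(_.collectFirst(pf)).collectFirst { case Some(b) => b }
  }
}
```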
Returns true iff we can say that the partitioning scheme of this Partitioning guarantees the same partitioning scheme described by other.
Compatibility of partitionings is only checked for operators that have multiple children and that require a specific child output Distribution, such as joins.
Intuitively, partitionings are compatible if they route the same partitioning key to the same partition. For instance, two hash partitionings are only compatible if they produce the same number of output partitions and hash records according to the same hash function and same partitioning key schema.
Put another way, two partitionings are compatible with each other if they satisfy all of the same distribution guarantees.
Returns the DataType of the result of evaluating this expression. It is invalid to query the dataType of an unresolved expression (i.e., when resolved == false).
Returns true when the current expression always returns the same result for fixed inputs from children.
Note that an expression should be considered non-deterministic if:
- it relies on some mutable internal state, or
- it relies on some implicit input that is not part of the children expression list, or
- it has a non-deterministic child or children.
An example would be SparkPartitionID, which relies on the partition id returned by TaskContext.
By default, leaf expressions are deterministic, as Nil.forall(_.deterministic) returns true.
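The default rule can be illustrated with a toy hierarchy. The trait and case classes below are hypothetical simplifications, not Spark's Expression; `Rand` stands in for any node that depends on implicit input.

```scala
// Toy hierarchy for illustration; not Spark's Expression.
object DeterministicSketch {
  trait Expr {
    def children: Seq[Expr]
    // Default: deterministic iff every child is. For a leaf, children is Nil
    // and Nil.forall(...) is vacuously true.
    def deterministic: Boolean = children.forall(_.deterministic)
  }

  final case class Lit(value: Int) extends Expr {
    val children: Seq[Expr] = Nil
  }

  final case class Add(l: Expr, r: Expr) extends Expr {
    val children: Seq[Expr] = Seq(l, r)
  }

  // Relies on input that is not part of its children, so it opts out.
  final case class Rand() extends Expr {
    val children: Seq[Expr] = Nil
    override def deterministic: Boolean = false
  }
}
```

In this sketch a single non-deterministic leaf makes every ancestor non-deterministic, which matches the third condition in the list above.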
Returns Java source code that can be compiled to evaluate this expression. The default behavior is to call the eval method of the expression. Concrete expression implementations should override this to do actual code generation.
a CodegenContext
an ExprCode with unique terms.
an ExprCode containing the Java source code to generate the given expression
Returns the result of evaluating this expression on a given input Row
Faster version of equality which short-circuits when two treeNodes are the same instance. We don't just override Object.equals, as doing so prevents the Scala compiler from generating case class equals methods.
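A sketch of the short-circuit, assuming a hypothetical `Node` case class: the reference comparison is tried first, and the compiler-generated structural `equals` is only reached when the two values are different instances.

```scala
// Hypothetical node type used only to illustrate the short-circuit.
object FastEqualsSketch {
  final case class Node(name: String, children: Seq[Node] = Nil) {
    // eq is a cheap reference check; == falls back to the case class equals,
    // which may walk the whole subtree.
    def fastEquals(other: Node): Boolean = this.eq(other) || this == other
  }
}
```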
Find the first TreeNode that satisfies the condition specified by f.
Returns a Seq by applying a function to all nodes in this tree and using the elements of the resulting collections.
Returns true when an expression is a candidate for static evaluation before the query is executed.
The following conditions are used to determine suitability for constant folding:
Runs the given function on this node and then recursively on children.
Runs the given function recursively on children then on this node.
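The difference between the two traversal orders can be shown on a small tree. `Node` and the `names` helper are hypothetical and exist only for this illustration.

```scala
// Hypothetical minimal tree showing pre-order vs post-order visiting.
object TraversalSketch {
  final case class Node(name: String, children: Seq[Node] = Nil) {
    // This node first, then the children (pre-order).
    def foreach(f: Node => Unit): Unit = {
      f(this)
      children.foreach(_.foreach(f))
    }

    // Children first, then this node (post-order).
    def foreachUp(f: Node => Unit): Unit = {
      children.foreach(_.foreachUp(f))
      f(this)
    }
  }

  // Records the order in which a traversal visits node names.
  def names(visit: (Node => Unit) => Unit): Seq[String] = {
    val out = Seq.newBuilder[String]
    visit(n => out += n.name)
    out.result()
  }
}
```

For a tree `Add(a, b)`, `names(f => tree.foreach(f))` yields `Add, a, b`, while `names(f => tree.foreachUp(f))` yields `a, b, Add`.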
Returns an ExprCode that contains the Java source code to generate the result of evaluating the expression on an input row.
a CodegenContext
ExprCode
Appends the string representation of this node and its children to the given StringBuilder.
The i-th element in lastChildren indicates whether the ancestor of the current node at depth i + 1 is the last child of its own parent node. The depth of the root node is 0, and lastChildren for the root node should be empty.
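The role of lastChildren is easier to see in a small sketch. The `Node` class, the reduced two-parameter signature, and the exact prefix strings below are assumptions made for illustration; the real method takes additional arguments.

```scala
// Hypothetical simplified version; the real method has more parameters.
object TreeStringSketch {
  final case class Node(name: String, children: Seq[Node] = Nil) {
    def generateTreeString(lastChildren: Seq[Boolean], builder: StringBuilder): StringBuilder = {
      if (lastChildren.nonEmpty) {
        // One column per ancestor: blank if that ancestor was a last child,
        // otherwise a vertical rail that continues down to its next sibling.
        lastChildren.init.foreach(isLast => builder.append(if (isLast) "   " else ":  "))
        // The final flag describes the current node itself.
        builder.append(if (lastChildren.last) "+- " else ":- ")
      }
      builder.append(name).append("\n")
      children.zipWithIndex.foreach { case (child, i) =>
        child.generateTreeString(lastChildren :+ (i == children.length - 1), builder)
      }
      builder
    }

    // The root starts with an empty lastChildren, as described above.
    def treeString: String = generateTreeString(Nil, new StringBuilder).toString
  }
}
```

Each recursive call appends one flag, so a node at depth d always receives a lastChildren of length d.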
Returns true iff we can say that the partitioning scheme of this Partitioning guarantees
the same partitioning scheme described by other. If A.guarantees(B), then repartitioning
the child's output according to B will be unnecessary. guarantees is used as a performance
optimization to allow the exchange planner to avoid redundant repartitionings. By default,
a partitioning only guarantees partitionings that are equal to itself (i.e. the same number
of partitions, same strategy (range or hash), etc.).
In order to enable more aggressive optimization, this strict equality check can be relaxed.
For example, say that the planner needs to repartition all of an operator's children so that
they satisfy the AllTuples distribution. One way to do this is to repartition all children
to have the SinglePartition partitioning. If one of the operator's children already happens
to be hash-partitioned with a single partition then we do not need to re-shuffle this child;
this repartitioning can be avoided if a single-partition HashPartitioning guarantees
SinglePartition.
The SinglePartition example given above is not particularly interesting; guarantees' real
value occurs for more advanced partitioning strategies. SPARK-7871 will introduce a notion
of null-safe partitionings, under which partitionings can specify whether rows whose
partitioning keys contain null values will be grouped into the same partition or whether they
will have an unknown / random distribution. If a partitioning does not require nulls to be
clustered then a partitioning which _does_ cluster nulls will guarantee the null clustered
partitioning. The converse is not true, however: a partitioning which clusters nulls cannot
be guaranteed by one which does not cluster them. Thus, in general, guarantees is not a
symmetric relation.
Another way to think about guarantees: if A.guarantees(B), then any partitioning of rows
produced by A could have also been produced by B.
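The default strict check and the relaxed SinglePartition case described above can be sketched as follows. These classes are hypothetical simplifications that model the example in the text; they are not claimed to match how any particular Spark version implements guarantees.

```scala
// Toy model of the relation described above; not Spark's Partitioning classes.
object GuaranteesSketch {
  sealed trait Partitioning {
    def numPartitions: Int
    // Default: a partitioning only guarantees partitionings equal to itself.
    def guarantees(other: Partitioning): Boolean = this == other
  }

  case object SinglePartition extends Partitioning {
    val numPartitions = 1
    // Any single-partition layout places all rows together.
    override def guarantees(other: Partitioning): Boolean = other.numPartitions == 1
  }

  final case class HashPartitioning(keys: Seq[String], numPartitions: Int) extends Partitioning {
    // Relaxed check from the example: with exactly one partition, hashing
    // cannot separate rows, so SinglePartition is already guaranteed.
    override def guarantees(other: Partitioning): Boolean = other match {
      case SinglePartition => numPartitions == 1
      case _               => super.guarantees(other)
    }
  }
}
```

Under these assumptions a planner could skip the shuffle whenever `child.guarantees(required)` is true, and fall back to repartitioning otherwise.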
All the nodes that are parts of this node.
For example:
  WholeStageCodegen
  +-- SortMergeJoin
      |-- InputAdapter
      |   +-- Sort
      +-- InputAdapter
          +-- Sort
The innerChildren of WholeStageCodegen will be Seq(SortMergeJoin), and it will generate a tree string like this:
  WholeStageCodegen
  :  +- SortMergeJoin
  :     :- INPUT
  :     :- INPUT
  :-  Sort
  :-  Sort
Creates a copy of this type of tree node after a transformation. Must be overridden by child classes that have constructor arguments that are not present in the productIterator.
the new product arguments.
Returns a Seq containing the result of applying the given function to each node in this tree in a preorder traversal.
the function to be applied.
Returns a copy of this node where f has been applied to all of the node's children.
Returns the name of this type of TreeNode. Defaults to the class name. Note that we remove the "Exec" suffix for physical operators here.
Returns the number of partitions that the data is split across
Returns a string representation of the nodes in this tree, where each operator is numbered.
Args to the constructor that should be copied, but not transformed. These are appended to the transformed args automatically by makeCopy
Returns a user-facing string representation of this expression's name. This should usually match the name of the function in SQL.
Returns true if this expression and all its children have been resolved to a specific schema
and input data type checking passed, and false if it still contains any unresolved
placeholders or has a data type mismatch.
Implementations of expressions should override this if the resolution of this type of
expression involves more than just the resolution of its children and type checking.
Returns true iff the guarantees made by this Partitioning are sufficient to satisfy the
partitioning scheme mandated by the required Distribution, i.e. the current dataset does
not need to be re-partitioned for the required Distribution (it is possible that tuples
within a partition need to be reorganized).
Returns true when two expressions will always compute the same result, even if they differ cosmetically (e.g. capitalization of names in attributes may be different).
See Canonicalize for more details.
Returns a hashCode for the calculation performed by this expression. Unlike the standard hashCode, an attempt has been made to eliminate cosmetic differences.
See Canonicalize for more details.
String representation of this node without any children.
Returns the SQL representation of this expression. For expressions extending NonSQLExpression, this method may return an arbitrary user-facing string.
The arguments that should be included in the arg string. Defaults to the productIterator.
Returns a copy of this node where rule has been recursively applied to the tree.
When rule does not apply to a given node it is left unchanged.
Users should not expect a specific directionality. If a specific directionality is needed,
transformDown or transformUp should be used.
the function used to transform this node's children
Returns a copy of this node where rule has been recursively applied to all the children of
this node. When rule does not apply to a given node it is left unchanged.
the function used to transform this node's children
Returns a copy of this node where rule has been recursively applied to it and all of its
children (pre-order). When rule does not apply to a given node it is left unchanged.
the function used to transform this node's children
Returns a copy of this node where rule has been recursively applied first to all of its
children and then itself (post-order). When rule does not apply to a given node, it is left
unchanged.
the function used to transform this node's children
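The pre-order and post-order variants can give different results for the same rule, which a toy constant-folding example makes visible. The `Expr`, `Lit`, and `Add` types and this `mapChildren` are hypothetical simplifications, not the real TreeNode machinery.

```scala
// Toy expression tree; illustrative only.
object TransformSketch {
  sealed trait Expr {
    // Rebuild this node with f applied to each direct child.
    def mapChildren(f: Expr => Expr): Expr = this match {
      case Add(l, r) => Add(f(l), f(r))
      case leaf      => leaf
    }

    // Pre-order: apply the rule here first, then descend into the result.
    def transformDown(rule: PartialFunction[Expr, Expr]): Expr =
      rule.applyOrElse(this, identity[Expr]).mapChildren(_.transformDown(rule))

    // Post-order: rewrite the children first, then apply the rule here.
    def transformUp(rule: PartialFunction[Expr, Expr]): Expr =
      rule.applyOrElse(mapChildren(_.transformUp(rule)), identity[Expr])
  }

  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Folds an addition of two literals; undefined for every other node,
  // so those nodes are left unchanged.
  val foldAdd: PartialFunction[Expr, Expr] = {
    case Add(Lit(a), Lit(b)) => Lit(a + b)
  }
}
```

On `Add(Add(Lit(1), Lit(2)), Lit(3))`, `transformUp(foldAdd)` folds the inner addition before visiting the root and ends at `Lit(6)`. `transformDown(foldAdd)` visits the root before its left child has been folded, so a single pass stops at `Add(Lit(3), Lit(3))`.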
All the nodes that will be used to generate tree string.
For example:
  WholeStageCodegen
  +-- SortMergeJoin
      |-- InputAdapter
      |   +-- Sort
      +-- InputAdapter
          +-- Sort
The treeChildren of WholeStageCodegen will be Seq(Sort, Sort), and it will generate a tree string like this:
  WholeStageCodegen
  :  +- SortMergeJoin
  :     :- INPUT
  :     :- INPUT
  :-  Sort
  :-  Sort
Returns a string representation of the nodes in this tree
Returns a copy of this node with the children replaced. TODO: Validate somewhere (in debug mode?) that children are ordered correctly.
Represents a partitioning where rows are split across partitions based on some total ordering of the expressions specified in ordering. When data is partitioned in this manner, the following two conditions are guaranteed to hold:
- All rows where the expressions in ordering evaluate to the same values will be in the same partition.
- Each partition will have a min and max row, relative to the given ordering. All rows that are in between min and max in this ordering will reside in this partition.
This class extends expression primarily so that transformations over expression will descend into its child.
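The two range-partitioning conditions can be illustrated with a lookup over sorted bounds. The function name, the use of integer keys, and the inclusive-upper-bound convention are assumptions for this sketch and do not describe how RangePartitioning is implemented.

```scala
// Illustrative only: assigns an integer key to a partition by range.
object RangePartitionSketch {
  // upperBounds must be sorted ascending. upperBounds(i) is the inclusive
  // upper bound of partition i; keys above the last bound go to one final
  // partition, so there are upperBounds.length + 1 partitions in total.
  def partitionFor(key: Int, upperBounds: Seq[Int]): Int = {
    val idx = upperBounds.indexWhere(key <= _)
    if (idx >= 0) idx else upperBounds.length
  }
}
```

Because the result depends only on the key, equal keys always land in the same partition, and each partition covers a contiguous interval of the ordering.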