class TungstenAggregationIterator extends AggregationIterator with Logging
An iterator used to evaluate aggregate functions. It operates on UnsafeRows.
This iterator first uses hash-based aggregation to process input rows. It uses a hash map to store groups and their corresponding aggregation buffers. If the map cannot allocate memory from the memory manager, the iterator spills the map to disk and creates a new one. After all input has been processed, it merges the spills using an external sorter and switches to sort-based aggregation.
The process has the following steps:
- Step 0: Do hash-based aggregation.
- Step 1: Sort all entries of the hash map based on values of grouping expressions and spill them to disk.
- Step 2: Create an external sorter based on the spilled sorted map entries and reset the map.
- Step 3: Get a sorted KVIterator from the external sorter.
- Step 4: Repeat step 0 until there is no more input.
- Step 5: Initialize sort-based aggregation on the sorted iterator. Then, this iterator works in the way of sort-based aggregation.
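The steps above can be sketched with plain Scala collections. This is a minimal illustration under stated assumptions, not Spark's implementation: `HashThenSortAggregation`, `aggregate`, and `maxGroups` are hypothetical names, a limit on the number of groups stands in for failing to allocate memory, in-memory vectors stand in for spill files, and re-sorting the concatenated runs stands in for the external sorter's merge.

```scala
import scala.collection.mutable

// Hypothetical sketch: sums Long values per String key.
object HashThenSortAggregation {
  def aggregate(input: Iterator[(String, Long)], maxGroups: Int): Seq[(String, Long)] = {
    require(maxGroups > 0, "maxGroups must be positive")
    val map = mutable.HashMap.empty[String, Long]
    val spills = mutable.ArrayBuffer.empty[Vector[(String, Long)]]

    // Step 0: hash-based aggregation.
    input.foreach { case (key, value) =>
      if (!map.contains(key) && map.size >= maxGroups) {
        // Steps 1-2: sort the map entries by grouping key, "spill" them, reset the map.
        spills += map.toVector.sortBy(_._1)
        map.clear()
      }
      map(key) = map.getOrElse(key, 0L) + value
    } // Step 4: the loop repeats step 0 until the input is exhausted.

    if (spills.isEmpty) {
      // No spill happened: the hash map already holds the final result.
      // Sorting here is only to make this sketch's output deterministic.
      map.toVector.sortBy(_._1)
    } else {
      // Spill whatever is left in the map so every entry is in a sorted run.
      spills += map.toVector.sortBy(_._1)
      // Step 3: obtain one sorted sequence from all runs (stand-in for the external sorter).
      val sorted = spills.flatten.sortBy(_._1)
      // Step 5: sort-based aggregation merges adjacent entries that share a key.
      val out = mutable.ArrayBuffer.empty[(String, Long)]
      sorted.foreach { case (key, value) =>
        if (out.nonEmpty && out.last._1 == key) out(out.length - 1) = (key, out.last._2 + value)
        else out += ((key, value))
      }
      out.toVector
    }
  }
}
```

Calling `HashThenSortAggregation.aggregate(rows.iterator, 2)` on a handful of rows with three distinct keys forces at least one spill and exercises the sort-based path. A large `maxGroups` keeps everything in the hash map.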
The code of this class is organized as follows:
- Part 1: Initializing aggregate functions.
- Part 2: Methods and fields used for setting aggregation buffer values, processing input rows from inputIter, and generating output rows.
- Part 3: Methods and fields used by hash-based aggregation.
- Part 4: Methods and fields used when we switch to sort-based aggregation.
- Part 5: Methods and fields used by sort-based aggregation.
- Part 6: Loads input and processes input rows.
- Part 7: Public methods of this iterator.
- Part 8: A utility function used to generate a result when there is no input and there is no grouping expression.
Instance Constructors
-
new
TungstenAggregationIterator(partIndex: Int, groupingExpressions: Seq[NamedExpression], aggregateExpressions: Seq[AggregateExpression], aggregateAttributes: Seq[Attribute], initialInputBufferOffset: Int, resultExpressions: Seq[NamedExpression], newMutableProjection: (Seq[Expression], Seq[Attribute]) ⇒ MutableProjection, originalInputAttributes: Seq[Attribute], inputIter: Iterator[InternalRow], testFallbackStartsAt: Option[(Int, Int)], numOutputRows: SQLMetric, peakMemory: SQLMetric, spillSize: SQLMetric, avgHashProbe: SQLMetric)
- partIndex
index of the partition
- groupingExpressions
expressions for grouping keys
- aggregateExpressions
AggregateExpression containing AggregateFunctions with mode Partial, PartialMerge, or Final.
- aggregateAttributes
the attributes of the aggregateExpressions' outputs when they are stored in the final aggregation buffer.
- resultExpressions
expressions for generating output rows.
- newMutableProjection
the function used to create mutable projections.
- originalInputAttributes
attributes representing input rows from inputIter.
- inputIter
the iterator containing input UnsafeRows.
Type Members
-
class
GroupedIterator[B >: A] extends AbstractIterator[Seq[B]] with Iterator[Seq[B]]
- Definition Classes
- Iterator
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
def
++[B >: UnsafeRow](that: ⇒ GenTraversableOnce[B]): Iterator[B]
- Definition Classes
- Iterator
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
addString(b: StringBuilder): StringBuilder
- Definition Classes
- TraversableOnce
-
def
addString(b: StringBuilder, sep: String): StringBuilder
- Definition Classes
- TraversableOnce
-
def
addString(b: StringBuilder, start: String, sep: String, end: String): StringBuilder
- Definition Classes
- TraversableOnce
-
def
aggregate[B](z: ⇒ B)(seqop: (B, UnsafeRow) ⇒ B, combop: (B, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
val
aggregateFunctions: Array[AggregateFunction]
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
val
allImperativeAggregateFunctionPositions: Array[Int]
- Attributes
- protected[this]
- Definition Classes
- AggregationIterator
-
val
allImperativeAggregateFunctions: Array[ImperativeAggregate]
- Attributes
- protected[this]
- Definition Classes
- AggregationIterator
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
buffered: BufferedIterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
collect[B](pf: PartialFunction[UnsafeRow, B]): Iterator[B]
- Definition Classes
- Iterator
- Annotations
- @migration
- Migration
(Changed in version 2.8.0) collect has changed. The previous behavior can be reproduced with toSeq.
-
def
collectFirst[B](pf: PartialFunction[UnsafeRow, B]): Option[B]
- Definition Classes
- TraversableOnce
-
def
contains(elem: Any): Boolean
- Definition Classes
- Iterator
-
def
copyToArray[B >: UnsafeRow](xs: Array[B], start: Int, len: Int): Unit
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
copyToArray[B >: UnsafeRow](xs: Array[B]): Unit
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
copyToArray[B >: UnsafeRow](xs: Array[B], start: Int): Unit
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
copyToBuffer[B >: UnsafeRow](dest: Buffer[B]): Unit
- Definition Classes
- TraversableOnce
-
def
corresponds[B](that: GenTraversableOnce[B])(p: (UnsafeRow, B) ⇒ Boolean): Boolean
- Definition Classes
- Iterator
-
def
count(p: (UnsafeRow) ⇒ Boolean): Int
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
drop(n: Int): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
dropWhile(p: (UnsafeRow) ⇒ Boolean): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
duplicate: (Iterator[UnsafeRow], Iterator[UnsafeRow])
- Definition Classes
- Iterator
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
exists(p: (UnsafeRow) ⇒ Boolean): Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
val
expressionAggInitialProjection: MutableProjection
- Attributes
- protected[this]
- Definition Classes
- AggregationIterator
-
def
filter(p: (UnsafeRow) ⇒ Boolean): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
filterNot(p: (UnsafeRow) ⇒ Boolean): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
def
find(p: (UnsafeRow) ⇒ Boolean): Option[UnsafeRow]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
flatMap[B](f: (UnsafeRow) ⇒ GenTraversableOnce[B]): Iterator[B]
- Definition Classes
- Iterator
-
def
fold[A1 >: UnsafeRow](z: A1)(op: (A1, A1) ⇒ A1): A1
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
foldLeft[B](z: B)(op: (B, UnsafeRow) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
foldRight[B](z: B)(op: (UnsafeRow, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
forall(p: (UnsafeRow) ⇒ Boolean): Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
foreach[U](f: (UnsafeRow) ⇒ U): Unit
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
val
generateOutput: (UnsafeRow, InternalRow) ⇒ UnsafeRow
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
generateProcessRow(expressions: Seq[AggregateExpression], functions: Seq[AggregateFunction], inputAttributes: Seq[Attribute]): (InternalRow, InternalRow) ⇒ Unit
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
generateResultProjection(): (UnsafeRow, InternalRow) ⇒ UnsafeRow
- Attributes
- protected
- Definition Classes
- TungstenAggregationIterator → AggregationIterator
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
grouped[B >: UnsafeRow](size: Int): GroupedIterator[B]
- Definition Classes
- Iterator
-
val
groupingAttributes: Seq[Attribute]
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
val
groupingProjection: UnsafeProjection
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
hasDefiniteSize: Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
final
def
hasNext: Boolean
- Definition Classes
- TungstenAggregationIterator → Iterator
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
indexOf[B >: UnsafeRow](elem: B, from: Int): Int
- Definition Classes
- Iterator
-
def
indexOf[B >: UnsafeRow](elem: B): Int
- Definition Classes
- Iterator
-
def
indexWhere(p: (UnsafeRow) ⇒ Boolean, from: Int): Int
- Definition Classes
- Iterator
-
def
indexWhere(p: (UnsafeRow) ⇒ Boolean): Int
- Definition Classes
- Iterator
-
def
initializeAggregateFunctions(expressions: Seq[AggregateExpression], startingInputBufferOffset: Int): Array[AggregateFunction]
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
initializeBuffer(buffer: InternalRow): Unit
Initializes buffer values for all aggregate functions.
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
isEmpty: Boolean
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
isTraversableAgain: Boolean
- Definition Classes
- Iterator → GenTraversableOnce
-
def
length: Int
- Definition Classes
- Iterator
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
map[B](f: (UnsafeRow) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
max[B >: UnsafeRow](implicit cmp: Ordering[B]): UnsafeRow
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
maxBy[B](f: (UnsafeRow) ⇒ B)(implicit cmp: Ordering[B]): UnsafeRow
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
min[B >: UnsafeRow](implicit cmp: Ordering[B]): UnsafeRow
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
minBy[B](f: (UnsafeRow) ⇒ B)(implicit cmp: Ordering[B]): UnsafeRow
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString: String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString(sep: String): String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
mkString(start: String, sep: String, end: String): String
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
next(): UnsafeRow
- Definition Classes
- TungstenAggregationIterator → Iterator
-
def
nonEmpty: Boolean
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
def
outputForEmptyGroupingKeyWithoutInput(): UnsafeRow
Generate an output row when there is no input and there is no grouping expression.
-
def
padTo[A1 >: UnsafeRow](len: Int, elem: A1): Iterator[A1]
- Definition Classes
- Iterator
-
def
partition(p: (UnsafeRow) ⇒ Boolean): (Iterator[UnsafeRow], Iterator[UnsafeRow])
- Definition Classes
- Iterator
-
def
patch[B >: UnsafeRow](from: Int, patchElems: Iterator[B], replaced: Int): Iterator[B]
- Definition Classes
- Iterator
-
val
processRow: (InternalRow, InternalRow) ⇒ Unit
- Attributes
- protected
- Definition Classes
- AggregationIterator
-
def
product[B >: UnsafeRow](implicit num: Numeric[B]): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduce[A1 >: UnsafeRow](op: (A1, A1) ⇒ A1): A1
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceLeft[B >: UnsafeRow](op: (B, UnsafeRow) ⇒ B): B
- Definition Classes
- TraversableOnce
-
def
reduceLeftOption[B >: UnsafeRow](op: (B, UnsafeRow) ⇒ B): Option[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceOption[A1 >: UnsafeRow](op: (A1, A1) ⇒ A1): Option[A1]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceRight[B >: UnsafeRow](op: (UnsafeRow, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reduceRightOption[B >: UnsafeRow](op: (UnsafeRow, B) ⇒ B): Option[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
reversed: List[UnsafeRow]
- Attributes
- protected[this]
- Definition Classes
- TraversableOnce
-
def
sameElements(that: Iterator[_]): Boolean
- Definition Classes
- Iterator
-
def
scanLeft[B](z: B)(op: (B, UnsafeRow) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
scanRight[B](z: B)(op: (UnsafeRow, B) ⇒ B): Iterator[B]
- Definition Classes
- Iterator
-
def
seq: Iterator[UnsafeRow]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
size: Int
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
sizeHintIfCheap: Int
- Attributes
- protected[collection]
- Definition Classes
- GenTraversableOnce
-
def
slice(from: Int, until: Int): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
sliceIterator(from: Int, until: Int): Iterator[UnsafeRow]
- Attributes
- protected
- Definition Classes
- Iterator
-
def
sliding[B >: UnsafeRow](size: Int, step: Int): GroupedIterator[B]
- Definition Classes
- Iterator
-
def
span(p: (UnsafeRow) ⇒ Boolean): (Iterator[UnsafeRow], Iterator[UnsafeRow])
- Definition Classes
- Iterator
-
def
sum[B >: UnsafeRow](implicit num: Numeric[B]): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
take(n: Int): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
takeWhile(p: (UnsafeRow) ⇒ Boolean): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
to[Col[_]](implicit cbf: CanBuildFrom[Nothing, UnsafeRow, Col[UnsafeRow]]): Col[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toArray[B >: UnsafeRow](implicit arg0: ClassTag[B]): Array[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toBuffer[B >: UnsafeRow]: Buffer[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIndexedSeq: IndexedSeq[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIterable: Iterable[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toIterator: Iterator[UnsafeRow]
- Definition Classes
- Iterator → GenTraversableOnce
-
def
toList: List[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toMap[T, U](implicit ev: <:<[UnsafeRow, (T, U)]): Map[T, U]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toSeq: Seq[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toSet[B >: UnsafeRow]: Set[B]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
def
toStream: Stream[UnsafeRow]
- Definition Classes
- Iterator → GenTraversableOnce
-
def
toString(): String
- Definition Classes
- Iterator → AnyRef → Any
-
def
toTraversable: Traversable[UnsafeRow]
- Definition Classes
- Iterator → TraversableOnce → GenTraversableOnce
-
def
toVector: Vector[UnsafeRow]
- Definition Classes
- TraversableOnce → GenTraversableOnce
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
-
def
withFilter(p: (UnsafeRow) ⇒ Boolean): Iterator[UnsafeRow]
- Definition Classes
- Iterator
-
def
zip[B](that: Iterator[B]): Iterator[(UnsafeRow, B)]
- Definition Classes
- Iterator
-
def
zipAll[B, A1 >: UnsafeRow, B1 >: B](that: Iterator[B], thisElem: A1, thatElem: B1): Iterator[(A1, B1)]
- Definition Classes
- Iterator
-
def
zipWithIndex: Iterator[(UnsafeRow, Int)]
- Definition Classes
- Iterator
Deprecated Value Members
-
def
/:[B](z: B)(op: (B, UnsafeRow) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
- Annotations
- @deprecated
- Deprecated
(Since version 2.12.10) Use foldLeft instead of /:
-
def
:\[B](z: B)(op: (UnsafeRow, B) ⇒ B): B
- Definition Classes
- TraversableOnce → GenTraversableOnce
- Annotations
- @deprecated
- Deprecated
(Since version 2.12.10) Use foldRight instead of :\
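The deprecation notes above point to `foldLeft` and `foldRight` as replacements for the symbolic operators. A small sketch of the migration follows. It uses a plain `Iterator[Int]` rather than `UnsafeRow`, and `FoldMigration`, `sumLeft`, and `diffRight` are hypothetical names chosen for illustration.

```scala
object FoldMigration {
  // Replaces the deprecated form (0 /: xs)(_ + _).
  def sumLeft(xs: Iterator[Int]): Int = xs.foldLeft(0)(_ + _)

  // Replaces the deprecated form (xs :\ 0)(_ - _).
  // Evaluates right to left: for 1, 2, 3 this is 1 - (2 - (3 - 0)).
  def diffRight(xs: Iterator[Int]): Int = xs.foldRight(0)(_ - _)
}
```

Both calls consume the iterator, so it should not be reused after folding.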