Holds the result of a scan operation over an RDD.
Analogue of ScanRDD for scans over the values of paired RDDs.
RDD wrapper supporting methods that compute partial sums (from right to left) across the RDD.
Callers should be aware of one implementation detail: by default, a scan-right reverses the RDD, performs a scan-left, and then reverses the result, which involves 3 Spark jobs.
An alternative implementation delegates to scala.collection.Iterator.scanRight, which is likely less expensive but materializes whole partitions into memory, generally a severe anti-pattern in Spark computations.
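The two strategies can be compared on plain Scala collections. The following is a minimal sketch, not the library's implementation: it uses an in-memory Seq in place of an RDD, and the names ScanRightSketch, scanRightViaReverse and scanRightDirect are hypothetical. It keeps the trailing "identity" element that Scala collections emit.

```scala
object ScanRightSketch {
  // Strategy 1 (mirrors the default described above): reverse, scan-left, reverse.
  // The operator's arguments are flipped so that non-commutative ops still
  // combine each element with the "sum" of everything to its right.
  def scanRightViaReverse[T](xs: Seq[T], zero: T)(op: (T, T) => T): Seq[T] =
    xs.reverse.scanLeft(zero)((acc, x) => op(x, acc)).reverse

  // Strategy 2: delegate to the collection's own scanRight, which needs the
  // whole input available at once (here a Seq standing in for a partition).
  def scanRightDirect[T](xs: Seq[T], zero: T)(op: (T, T) => T): Seq[T] =
    xs.scanRight(zero)(op)
}
```

Both functions should agree for any associative operator, including non-commutative ones such as string concatenation.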
Holds the result of a scan operation over an RDD.
post-scan RDD; each element is replaced with the "total" up to *and including* itself. This differs from Scala collections' "scan" behavior, which emits an initial "identity" element.
for each partition, the "sum" of all elements that precede it; the first element is the identity, consistent with Scala collections' behavior, but the final "total" element is moved over to the total field, so that this array has exactly one element per RDD partition.
the "sum" of all elements in the scanned RDD; Scala collections typically leave this appended to the result of a scan, but it is pulled out separately here.
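The layout described by these three fields can be illustrated with a left-to-right scan over plain collections. This is a sketch under stated assumptions rather than the actual implementation: partitions are modelled as nested Seqs instead of an RDD, and the names ScanLayoutSketch, ScanResult, partitions, partitionPrefixes and scanLeftInclusive are hypothetical (only the total field is named in the text above).

```scala
object ScanLayoutSketch {
  // Hypothetical stand-in for the scan result: scanned elements grouped by
  // partition, one prefix per partition, and the overall total kept separately.
  final case class ScanResult[T](
    partitions: Seq[Seq[T]],
    partitionPrefixes: Seq[T],
    total: T
  )

  def scanLeftInclusive[T](partitions: Seq[Seq[T]], zero: T)(op: (T, T) => T): ScanResult[T] = {
    // "Sum" of each partition on its own.
    val partitionSums = partitions.map(_.foldLeft(zero)(op))

    // Scala's scanLeft yields: identity, s0, s0+s1, ..., grand total.
    val boundaries = partitionSums.scanLeft(zero)(op)

    // Everything but the last entry is a per-partition prefix;
    // the last entry is pulled out as the total.
    val prefixes = boundaries.init
    val total = boundaries.last

    // Within each partition, scan starting from its prefix and drop the leading
    // prefix itself, so each element maps to the total up to and including it.
    val scanned = partitions.zip(prefixes).map { case (part, prefix) =>
      part.scanLeft(prefix)(op).tail
    }

    ScanResult(scanned, prefixes, total)
  }
}
```

With this layout the scanned data has the same number of elements as the input, and the prefix sequence has one entry per partition, which is the shape the field descriptions above call for.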