Wrap an RDD and expose a collectPartitions method that is similar to RDD.collect, but returns an Array of per-partition Arrays.
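As a rough illustration of what a per-partition collect returns, the sketch below uses Spark's built-in glom() followed by collect(). The object and method names (CollectPartitionsSketch, collectPartitionsSketch) are hypothetical, and this is not necessarily how the wrapper described above is implemented.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

// Hypothetical names; a sketch built only on standard Spark calls.
object CollectPartitionsSketch {
  // glom() turns every partition into a single Array;
  // collect() then brings those Arrays to the driver.
  def collectPartitionsSketch[T: ClassTag](rdd: RDD[T]): Array[Array[T]] =
    rdd.glom().collect()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("collect-partitions-sketch")
    val sc = new SparkContext(conf)
    try {
      val parts = collectPartitionsSketch(sc.parallelize(1 to 6, numSlices = 3))
      parts.zipWithIndex.foreach { case (elems, idx) =>
        println(s"partition $idx: ${elems.mkString(", ")}")
      }
    } finally sc.stop()
  }
}
```

Unlike RDD.collect, the result keeps partition boundaries visible, which is mostly useful in tests and debugging.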
Some helpers for repartitioning an RDD while retaining the order of its elements.
Add a partitionByKey method to paired RDDs whose key is a tuple of (partition idx, elem idx), which sends elements to the partition indicated by partition idx, and sorts them within each partition according to elem idx.
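The routing-and-sorting behaviour described for partitionByKey can be approximated with a custom Partitioner plus Spark's repartitionAndSortWithinPartitions. This is a minimal sketch under assumptions: KeyPartitioner and partitionByKeySketch are hypothetical names, the keys are assumed to be (Int, Int) tuples, and the choice to drop the keys from the result is illustrative only.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object PartitionByKeySketch {
  // Hypothetical partitioner: sends a (partitionIdx, elemIdx) key to partitionIdx.
  class KeyPartitioner(override val numPartitions: Int) extends Partitioner {
    override def getPartition(key: Any): Int = key match {
      case (partitionIdx: Int, _) => partitionIdx
      case other => throw new IllegalArgumentException(s"Unexpected key: $other")
    }
  }

  // All keys in one partition share a partitionIdx, so the default tuple
  // ordering sorts each partition by elemIdx.
  def partitionByKeySketch[V: ClassTag](rdd: RDD[((Int, Int), V)], numPartitions: Int): RDD[V] =
    rdd.repartitionAndSortWithinPartitions(new KeyPartitioner(numPartitions)).values

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("partition-by-key-sketch")
    val sc = new SparkContext(conf)
    try {
      val keyed = sc.parallelize(
        Seq(((1, 1), "d"), ((0, 1), "b"), ((1, 0), "c"), ((0, 0), "a")), 2)
      val result = partitionByKeySketch(keyed, numPartitions = 2)
      result.glom().collect().foreach(p => println(p.mkString(", ")))
    } finally sc.stop()
  }
}
```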
Helpers for fetching the first element from each partition of an RDD.
Helper for determining the size of each partition of an RDD.
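Both the first-element and partition-size helpers can be pictured as a single mapPartitions pass that emits one value per partition. The sketch below uses hypothetical names (firstPerPartition, partitionSizes) and represents an empty partition as None, which is an assumption rather than the documented behaviour of the helpers above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object PartitionSummarySketch {
  // One Option per partition: Some(first element), or None if the partition is empty.
  def firstPerPartition[T: ClassTag](rdd: RDD[T]): Array[Option[T]] =
    rdd.mapPartitions { it =>
      Iterator[Option[T]](if (it.hasNext) Some(it.next()) else None)
    }.collect()

  // One count per partition, in partition order.
  def partitionSizes[T](rdd: RDD[T]): Array[Int] =
    rdd.mapPartitions(it => Iterator(it.size)).collect()

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("partition-summary-sketch")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 6, numSlices = 3)
      println(firstPerPartition(rdd).mkString(", "))
      println(partitionSizes(rdd).mkString(", "))
    } finally sc.stop()
  }
}
```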
Helper APIs for reducing RDD partitions to single elements, either with a combiner function (reducePartitions) or by directly mapping a partition Iterator to a single element (collapsePartitions).
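The difference between the two styles can be sketched with plain mapPartitions. The names reducePartitionsSketch and collapsePartitionsSketch are hypothetical stand-ins, their signatures are assumptions, and skipping empty partitions in the reducing variant is a choice made here for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object ReducePartitionsSketch {
  // Combine the elements of each partition pairwise; empty partitions emit nothing.
  def reducePartitionsSketch[T: ClassTag](rdd: RDD[T])(combine: (T, T) => T): RDD[T] =
    rdd.mapPartitions { it =>
      if (it.hasNext) Iterator(it.reduce(combine)) else Iterator.empty
    }

  // Hand the whole partition Iterator to a function that returns one element.
  def collapsePartitionsSketch[T, U: ClassTag](rdd: RDD[T])(collapse: Iterator[T] => U): RDD[U] =
    rdd.mapPartitions(it => Iterator(collapse(it)))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("reduce-partitions-sketch")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 6, numSlices = 3)
      println(reducePartitionsSketch(rdd)(_ + _).collect().mkString(", "))
      println(collapsePartitionsSketch(rdd)(_.mkString("+")).collect().mkString(", "))
    } finally sc.stop()
  }
}
```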
Lazily key each element by its partition number and intra-partition idx.
Useful in tandem with PartitionByKeyRDD.
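A minimal sketch of this keying, assuming it can be expressed with Spark's mapPartitionsWithIndex (the name keyByPartitionAndIdx is hypothetical). The transformation stays lazy because mapPartitionsWithIndex only runs when an action is invoked and Iterator.zipWithIndex processes elements on demand.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object KeyByPartitionSketch {
  // Key each element by (partition number, index within that partition).
  def keyByPartitionAndIdx[T](rdd: RDD[T]): RDD[((Int, Int), T)] =
    rdd.mapPartitionsWithIndex { (partitionIdx, it) =>
      it.zipWithIndex.map { case (elem, elemIdx) => ((partitionIdx, elemIdx), elem) }
    }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("key-by-partition-sketch")
    val sc = new SparkContext(conf)
    try {
      val keyed = keyByPartitionAndIdx(sc.parallelize(Seq("a", "b", "c", "d"), 2))
      keyed.collect().foreach(println)
    } finally sc.stop()
  }
}
```

Keys of this (partition idx, elem idx) shape are what a key-based repartitioning step needs in order to restore the original element order afterwards.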