case class SplitByKeyRDD[K, V](rdd: RDD[(K, V)])(implicit evidence$1: ClassTag[K], evidence$2: ClassTag[V]) extends Product with Serializable
Adds a splitByKey method to any RDD of pairs: it returns a Map from each key (K) to an RDD[V] containing
all the values that had that key in the original RDD (with relative order preserved for each key).
A single shuffle stage over all keys and their values yields an RDD whose partitions are arranged in disjoint,
contiguous regions, one region per key, each holding all of that key's values. This is much more efficient than
the naive approach to separating an RDD by key, which performs a separate RDD.filter for each key in the RDD.
However, breaking up an RDD into a collection of RDDs in this way is fairly
unidiomatic; if you find yourself wanting this, it is worth pausing to consider restructuring the computation
upstream.
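For comparison, the naive filter-per-key approach mentioned above might look roughly like the following sketch. It uses only standard Spark RDD calls; the object name NaiveSplitByKey, the helper naiveSplitByKey, and the sample data are hypothetical and are not part of this library. It assumes Spark is on the classpath and runs against a local master.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import scala.reflect.ClassTag

object NaiveSplitByKey {
  // Hypothetical baseline (not this library's implementation):
  // build one filtered RDD per distinct key.
  def naiveSplitByKey[K: ClassTag, V: ClassTag](rdd: RDD[(K, V)]): Map[K, RDD[V]] = {
    // One job just to discover the distinct keys.
    val keys = rdd.keys.distinct().collect()
    // One lazy filtered RDD per key; each one rescans all of `rdd`
    // when it is evaluated, so N keys cost N full passes.
    keys.map(k => k -> rdd.filter(_._1 == k).values).toMap
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("naive-split-by-key")
    val sc = new SparkContext(conf)
    try {
      // Illustrative sample data.
      val pairs = sc.parallelize(Seq("a" -> 1, "b" -> 2, "a" -> 3, "b" -> 4), numSlices = 2)
      val byKey = naiveSplitByKey(pairs)
      byKey.foreach { case (k, vs) => println(s"$k -> ${vs.collect().mkString(",")}") }
    } finally {
      sc.stop()
    }
  }
}
```

Every RDD in the resulting Map re-reads the whole input when it is evaluated, which is the cost that the single-shuffle splitByKey is described as avoiding.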
rdd
Paired RDD to split up by key.
Linear Supertypes
Serializable, Serializable, Product, Equals, AnyRef, Any