Wraps an RDD and exposes a cappedGroupByKey method, which behaves like
org.apache.spark.rdd.PairRDDFunctions.groupByKey but caps the number of values accumulated
for each key.
Keeps the first values encountered for each key, up to the cap, and discards the rest; to obtain a random
sample of the elements for each key instead, see SampleByKeyRDD.
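
The capping behavior described above can be illustrated with a small sketch on plain Scala collections rather than an RDD. The object `CappedGroupSketch`, the helper `cappedGroupLocal`, and the parameter `maxPerKey` are hypothetical names introduced here for illustration; they are not part of the library, and this is not its actual implementation. On a real RDD, which values count as "first" would presumably depend on partitioning and iteration order, which this local sketch does not model.

```scala
// Illustrative only: a plain Seq of pairs stands in for a pair RDD.
// `cappedGroupLocal` and `maxPerKey` are hypothetical names, not library API.
object CappedGroupSketch {

  /** Groups values by key, keeping at most `maxPerKey` values per key
    * in the order they are encountered.
    */
  def cappedGroupLocal[K, V](pairs: Seq[(K, V)], maxPerKey: Int): Map[K, Vector[V]] =
    pairs.foldLeft(Map.empty[K, Vector[V]]) { case (acc, (k, v)) =>
      val seen = acc.getOrElse(k, Vector.empty[V])
      // Keep the value only while the key is under its cap; later values are dropped.
      if (seen.size < maxPerKey) acc.updated(k, seen :+ v) else acc
    }

  def main(args: Array[String]): Unit = {
    val pairs = Seq("a" -> 1, "b" -> 2, "a" -> 3, "a" -> 4, "b" -> 5)
    // With a cap of 2, key "a" keeps 1 and 3; the later value 4 is discarded.
    println(cappedGroupLocal(pairs, maxPerKey = 2))
  }
}
```

The point of the cap is that memory per key stays bounded regardless of how skewed the key distribution is, whereas an uncapped groupByKey accumulates every value for a hot key.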