cogroup

inline fun <K, V, U, R> KeyValueGroupedDataset<K, V>.cogroup(other: KeyValueGroupedDataset<K, U>, noinline func: (key: K, left: Iterator<V>, right: Iterator<U>) -> Iterator<R>): Dataset<R>

(Kotlin-specific) Applies the given function to each cogrouped group of data. For each unique group, the function is passed the grouping key and two iterators: one over all elements of the group from this Dataset and one over all elements of the group from other. The function can return an iterator over elements of an arbitrary type, which are returned as a new Dataset.
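The semantics described above can be illustrated without Spark. The following is a minimal plain-Kotlin sketch, not the library implementation: cogroupSketch is a hypothetical helper, lists of pairs stand in for the two keyed datasets, and it assumes that a key present on only one side is still visited, with an empty iterator for the missing side.

```kotlin
// Plain-Kotlin sketch of cogroup semantics; no Spark involved.
// cogroupSketch and its parameter names are hypothetical, for illustration only.
fun <K, V, U, R> cogroupSketch(
    left: List<Pair<K, V>>,
    right: List<Pair<K, U>>,
    func: (key: K, left: Iterator<V>, right: Iterator<U>) -> Iterator<R>,
): List<R> {
    // Group the values of each side by key.
    val leftGroups: Map<K, List<V>> = left.groupBy({ it.first }, { it.second })
    val rightGroups: Map<K, List<U>> = right.groupBy({ it.first }, { it.second })

    // Visit every key that appears on either side (assumption stated in the lead-in).
    val keys: Set<K> = leftGroups.keys + rightGroups.keys

    val out = mutableListOf<R>()
    for (key in keys) {
        val l = leftGroups[key].orEmpty().iterator()
        val r = rightGroups[key].orEmpty().iterator()
        // The function may emit zero, one, or many results per group.
        func(key, l, r).forEach { out.add(it) }
    }
    return out
}

fun main() {
    val orders = listOf("alice" to 30, "bob" to 12, "alice" to 5)
    val visits = listOf("alice" to "mon", "carol" to "tue")

    val summary = cogroupSketch(orders, visits) { key, l, r ->
        val total = l.asSequence().sum()
        val count = r.asSequence().count()
        listOf("$key: total=$total, visits=$count").iterator()
    }
    summary.forEach { println(it) }
}
```

In this sketch the groups come out in first-seen key order; a distributed Dataset should not be assumed to preserve any particular order.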


fun <K, V, W> JavaDStream<Tuple2<K, V>>.cogroup(other: JavaDStream<Tuple2<K, W>>, numPartitions: Int = dstream().ssc().sc().defaultParallelism()): JavaDStream<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>

Returns a new DStream by applying 'cogroup' between the RDDs of this DStream and those of the other DStream. Hash partitioning is used to generate the RDDs with numPartitions partitions; by default, numPartitions is the Spark context's default parallelism.
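The shape of the result for a single batch can be sketched in plain Kotlin. This is an illustration only: cogroupBatchSketch is a hypothetical helper, a list of pairs stands in for one RDD of key/value tuples, and Pair<List<V>, List<W>> stands in for the pair of Iterables in the return type. Partitioning is not modelled here.

```kotlin
// Plain-Kotlin sketch of what 'cogroup' yields for one batch; no Spark involved.
// cogroupBatchSketch is a hypothetical name used only for illustration.
fun <K, V, W> cogroupBatchSketch(
    thisBatch: List<Pair<K, V>>,
    otherBatch: List<Pair<K, W>>,
): Map<K, Pair<List<V>, List<W>>> {
    val left = thisBatch.groupBy({ it.first }, { it.second })
    val right = otherBatch.groupBy({ it.first }, { it.second })
    // One entry per key seen on either side; a missing side is an empty list.
    return (left.keys + right.keys).associateWith { key ->
        left[key].orEmpty() to right[key].orEmpty()
    }
}

fun main() {
    val clicks = listOf("a" to 1, "b" to 2, "a" to 3)
    val views = listOf("a" to "x", "c" to "y")
    println(cogroupBatchSketch(clicks, views))
    // prints {a=([1, 3], [x]), b=([2], []), c=([], [y])}
}
```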


fun <K, V, W> JavaDStream<Tuple2<K, V>>.cogroup(other: JavaDStream<Tuple2<K, W>>, partitioner: Partitioner): JavaDStream<Tuple2<K, Tuple2<Iterable<V>, Iterable<W>>>>

Returns a new DStream by applying 'cogroup' between the RDDs of this DStream and those of the other DStream. The supplied org.apache.spark.Partitioner is used to partition the generated RDDs.
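A partitioner decides which partition each key's cogrouped record lands in. The sketch below illustrates that contract in plain Kotlin under stated assumptions: SketchPartitioner is a hypothetical interface that only mirrors the two members of org.apache.spark.Partitioner (numPartitions and getPartition) and is not the Spark class, and both implementations are illustrative, with the hash-based one approximating hash partitioning as a non-negative modulo of the key's hash code.

```kotlin
// Hypothetical stand-in for the partitioner contract; not org.apache.spark.Partitioner.
// A partitioner maps every key to an index in 0 until numPartitions.
interface SketchPartitioner {
    val numPartitions: Int
    fun getPartition(key: Any?): Int
}

// Hash-style assignment: non-negative modulo of the key's hash code.
class HashSketchPartitioner(override val numPartitions: Int) : SketchPartitioner {
    override fun getPartition(key: Any?): Int =
        if (key == null) 0 else Math.floorMod(key.hashCode(), numPartitions)
}

// Custom assignment: keys starting with a..m go to partition 0, everything else to 1.
class FirstLetterSketchPartitioner : SketchPartitioner {
    override val numPartitions: Int = 2
    override fun getPartition(key: Any?): Int {
        val first = key?.toString()?.firstOrNull()?.lowercaseChar() ?: return 0
        return if (first in 'a'..'m') 0 else 1
    }
}

fun main() {
    val keys = listOf("alice", "bob", "zoe")
    val custom = FirstLetterSketchPartitioner()
    for (key in keys) {
        println("$key -> partition ${custom.getPartition(key)}")
    }
    val hashed = HashSketchPartitioner(4)
    // Every hash-based index falls inside the valid range.
    println(keys.all { hashed.getPartition(it) in 0 until hashed.numPartitions })
}
```

Supplying a partitioner rather than a partition count is useful when related keys should be placed together, as the first-letter rule does here.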