The left RDD to be used in the join
The right RDD to be used in the join
A function to create the Joiner implementation to use to perform the join
The partitioner to use
The serializer to use, otherwise use the default
(Since version 1.0.0) use mapPartitionsWithIndex and filter
(Since version 1.0.0) use mapPartitionsWithIndex and flatMap
(Since version 1.0.0) use mapPartitionsWithIndex and foreach
(Since version 1.2.0) use TaskContext.get
(Since version 0.7.0) use mapPartitionsWithIndex
(Since version 1.0.0) use mapPartitionsWithIndex
(Since version 1.0.0) use collect
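
The replacements named in the deprecation notes above can be sketched as follows. This is a minimal illustration rather than the library's own migration guide: the object name, helper names, and the filter predicate are hypothetical, and the code assumes a Spark version (1.2.0 or later) in which `TaskContext.get` is available.

```scala
import org.apache.spark.TaskContext
import org.apache.spark.rdd.RDD

// Hypothetical helpers illustrating the suggested replacements.
object DeprecationReplacements {

  // "use mapPartitionsWithIndex and filter":
  // the partition index is passed to the function alongside the iterator.
  def filterByPartitionParity(rdd: RDD[Int]): RDD[Int] =
    rdd.mapPartitionsWithIndex { (idx, iter) =>
      iter.filter(x => x % 2 == idx % 2)
    }

  // "use mapPartitionsWithIndex and flatMap"
  def repeatByPartition(rdd: RDD[Int]): RDD[(Int, Int)] =
    rdd.mapPartitionsWithIndex { (idx, iter) =>
      iter.flatMap(x => Seq((idx, x), (idx, x)))
    }

  // "use TaskContext.get": read the task context inside the closure
  // instead of receiving it as an argument.
  def tagWithPartitionId(rdd: RDD[Int]): RDD[(Int, Int)] =
    rdd.mapPartitions { iter =>
      val partitionId = TaskContext.get().partitionId()
      iter.map(x => (partitionId, x))
    }

  // "use collect": materialize the RDD on the driver as an Array.
  def toLocalArray(rdd: RDD[Int]): Array[Int] = rdd.collect()
}
```

Each helper keeps the per-partition setup inside the closure, which is the main reason to prefer `mapPartitionsWithIndex` over a per-element call.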
:: @DeveloperApi ::
RDD implementation for merge-join that uses a shuffle to partition and sort by keys using an implicit Ordering for K, and then delegates to an instance of MergeJoin to perform the actual merge logic.
There is an optimization in place to avoid a shuffle in some cases where left or right are guaranteed to be partition-sorted already (i.e., via repartitionAndSortWithinPartitions).
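
The merge step that runs once both sides are sorted by key can be sketched in plain Scala. This is an illustrative inner merge-join over two key-sorted sequences, not the MergeJoin implementation referred to above; the names `MergeJoinSketch`, `mergeJoin`, and `takeRun` are hypothetical, and both inputs are assumed to be already sorted under the same implicit Ordering for K.

```scala
object MergeJoinSketch {

  // Consume and return all values at the head of `it` whose key equals `key`.
  private def takeRun[K, A](it: BufferedIterator[(K, A)], key: K)
                           (implicit ord: Ordering[K]): Vector[A] = {
    val run = Vector.newBuilder[A]
    while (it.hasNext && ord.equiv(it.head._1, key)) {
      run += it.next()._2
    }
    run.result()
  }

  // Inner join of two sequences that are ALREADY sorted by key.
  // Each side is scanned exactly once; equal-key runs are cross-joined.
  def mergeJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)])
                        (implicit ord: Ordering[K]): Vector[(K, (V, W))] = {
    val l = left.iterator.buffered
    val r = right.iterator.buffered
    val out = Vector.newBuilder[(K, (V, W))]
    while (l.hasNext && r.hasNext) {
      val cmp = ord.compare(l.head._1, r.head._1)
      if (cmp < 0) {
        l.next() // left key is smaller and has no match: skip it
      } else if (cmp > 0) {
        r.next() // right key is smaller and has no match: skip it
      } else {
        val key = l.head._1
        val leftRun = takeRun(l, key)
        val rightRun = takeRun(r, key)
        for (v <- leftRun; w <- rightRun) out += ((key, (v, w)))
      }
    }
    out.result()
  }
}
```

For example, `MergeJoinSketch.mergeJoin(Seq(1 -> "a", 2 -> "b", 2 -> "c", 4 -> "d"), Seq(2 -> 10, 3 -> 20, 4 -> 30))` yields `Vector((2,(b,10)), (2,(c,10)), (4,(d,30)))`. Because the merge only works on sorted input, an input that is already partition-sorted can skip the sort, which is the situation the shuffle-avoidance optimization targets.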