MapRDD is a map-side RDD that serializes partition data and waits for all parent RDDs to finish.
Partition mirrors the parent partition of the original RDD and keeps track of the partition's task, so that partitions can be reconstructed in the reduce stage.
ReduceRDD reduces each output from MapRDD and returns an RDD with the original number of partitions and a similar data distribution, so it is safe to rely on the same order of data in each partition.
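The round trip described above can be illustrated with a minimal sketch in plain Python. It does not use Spark and is not the actual MapRDD/ReduceRDD implementation: the names SerializedPartition, map_side, and reduce_side are hypothetical, and pickle stands in for whatever serializer the real classes use. The sketch only shows why recording the parent partition index lets the reduce side restore the original partition count and element order.

```python
import pickle
from dataclasses import dataclass
from typing import Any, List


@dataclass(frozen=True)
class SerializedPartition:
    """Hypothetical stand-in for map-side output: one blob per parent partition."""
    index: int      # index of the parent partition this blob mirrors
    payload: bytes  # pickled elements, kept in their original order


def map_side(partitions: List[List[Any]]) -> List[SerializedPartition]:
    # Serialize every parent partition and remember which partition it came from.
    return [SerializedPartition(i, pickle.dumps(part))
            for i, part in enumerate(partitions)]


def reduce_side(blobs: List[SerializedPartition],
                num_partitions: int) -> List[List[Any]]:
    # Rebuild the original layout: one output partition per parent partition index.
    rebuilt: List[List[Any]] = [[] for _ in range(num_partitions)]
    for blob in blobs:
        rebuilt[blob.index] = pickle.loads(blob.payload)
    return rebuilt


original = [[1, 2, 3], [], [4, 5]]
blobs = map_side(original)
# Blobs may reach the reduce side in any order; the stored index restores the layout.
restored = reduce_side(list(reversed(blobs)), num_partitions=len(original))
```

In this sketch, restored has the same number of partitions as original, including the empty one, and each partition keeps its elements in the same order.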
Note that for MapRDD, type 'T' must be serializable.
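To show what the serializability requirement means in practice, here is a small sketch in plain Python. It is an analogy only: pickle is used as an illustrative serializer, and is_serializable is a hypothetical helper, not part of MapRDD or Spark. Values that wrap process-local resources, such as locks, typically cannot be serialized and would not be valid partition elements.

```python
import pickle
import threading
from typing import Any


def is_serializable(value: Any) -> bool:
    # Hypothetical helper: try to serialize the value and report whether it worked.
    try:
        pickle.dumps(value)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


plain_record = {"id": 1, "tags": ["a", "b"]}  # plain data serializes fine
lock = threading.Lock()                        # a process-local resource does not

record_ok = is_serializable(plain_record)
lock_ok = is_serializable(lock)
```

A check like this, run on a sample element before building the map-side stage, would surface the problem early instead of failing when partition data is serialized.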