Class AllToAllBlockingResultInfo

    • Field Detail

      • aggregatedSubpartitionBytes

        @Nullable
        protected List<Long> aggregatedSubpartitionBytes
        Aggregated subpartition bytes, obtained by summing the bytes of subpartitions that share the same subpartition index across different partitions. They can be aggregated because they will be consumed by the same downstream task.
      • numOfPartitions

        protected final int numOfPartitions
      • numOfSubpartitions

        protected final int numOfSubpartitions
      • subpartitionBytesByPartitionIndex

        protected final Map<Integer, long[]> subpartitionBytesByPartitionIndex
        The subpartition bytes map. The key is the partition index, and the value is an array holding the bytes of each subpartition of that partition.
    • Method Detail

      • isBroadcast

        public boolean isBroadcast()
        Description copied from interface: IntermediateResultInfo
        Determines whether the associated intermediate data set uses a broadcast distribution pattern.

        A broadcast distribution pattern indicates that all data produced by this intermediate data set should be broadcast to every downstream consumer.

        Returns:
        true if the intermediate data set is using a broadcast distribution pattern; false otherwise.
      • isSingleSubpartitionContainsAllData

        public boolean isSingleSubpartitionContainsAllData()
        Description copied from interface: IntermediateResultInfo
        Checks whether there is a single subpartition that contains all the produced data.
        Returns:
        true if a single subpartition contains all the data; false otherwise.
      • isPointwise

        public boolean isPointwise()
        Description copied from interface: IntermediateResultInfo
        Whether it is a pointwise result.
        Returns:
        whether it is a pointwise result
      • getNumPartitions

        public int getNumPartitions()
        Description copied from interface: IntermediateResultInfo
        Get number of partitions for this result.
        Returns:
        the number of partitions in this result
      • getNumSubpartitions

        public int getNumSubpartitions(int partitionIndex)
        Description copied from interface: IntermediateResultInfo
        Get number of subpartitions for the given partition.
        Parameters:
        partitionIndex - the partition index
        Returns:
        the number of subpartitions of the partition
      • getNumBytesProduced

        public long getNumBytesProduced()
        Description copied from interface: BlockingResultInfo
        Returns the number of bytes produced (numBytesProduced) by the producer.

        The difference between numBytesProduced and numBytesOut: numBytesProduced represents the number of bytes actually produced, and numBytesOut represents the number of bytes sent to downstream tasks. In unicast scenarios, these two values should be equal. In broadcast scenarios, numBytesOut should be (N * numBytesProduced), where N refers to the number of subpartitions.

        Returns:
        the num of bytes produced by the producer
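
        The relation between numBytesProduced and numBytesOut described above can be sketched as plain arithmetic. This is only an illustration: the class BytesOutSketch and its method numBytesOut are hypothetical names and are not part of the Flink API.

```java
// Hypothetical helper, not part of Flink: illustrates the documented relation
// between numBytesProduced and numBytesOut.
final class BytesOutSketch {

    /**
     * Computes the bytes sent downstream from the bytes actually produced.
     * In the broadcast case the result is N * numBytesProduced, where N is
     * the number of subpartitions; in the unicast case the values are equal.
     */
    static long numBytesOut(long numBytesProduced, int numSubpartitions, boolean broadcast) {
        if (broadcast) {
            // multiplyExact throws ArithmeticException on overflow instead of wrapping.
            return Math.multiplyExact(numBytesProduced, (long) numSubpartitions);
        }
        return numBytesProduced;
    }

    public static void main(String[] args) {
        System.out.println(numBytesOut(1024L, 4, false)); // unicast: prints 1024
        System.out.println(numBytesOut(1024L, 4, true));  // broadcast: prints 4096
    }
}
```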
      • onFineGrainedSubpartitionBytesNotNeeded

        protected void onFineGrainedSubpartitionBytesNotNeeded()
        This method should be called when fine-grained information is no longer needed. It aggregates and clears the fine-grained subpartition bytes to reduce space usage.

        Once all partitions are finished and all consumer job vertices are initialized, the subpartition bytes can be converted to aggregated values to reduce space usage, because the distribution of source splits does not affect the distribution of data consumed by downstream tasks of ALL_TO_ALL edges (hashing or rebalancing; rare cases such as custom partitioning are not considered here).
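
        A minimal sketch of the aggregate-then-clear idea, assuming every partition reports the same number of subpartitions. The class AggregateSketch and its method aggregate are hypothetical and only mirror the field types documented on this page; the actual Flink implementation may differ.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Flink's implementation.
final class AggregateSketch {

    /** Sums the bytes that share the same subpartition index across all partitions. */
    static List<Long> aggregate(Map<Integer, long[]> bytesByPartition, int numOfSubpartitions) {
        long[] sums = new long[numOfSubpartitions];
        for (long[] subpartitionBytes : bytesByPartition.values()) {
            for (int i = 0; i < numOfSubpartitions; i++) {
                sums[i] += subpartitionBytes[i];
            }
        }
        List<Long> aggregated = new ArrayList<>(numOfSubpartitions);
        for (long sum : sums) {
            aggregated.add(sum);
        }
        return aggregated;
    }

    public static void main(String[] args) {
        Map<Integer, long[]> fineGrained = new HashMap<>();
        fineGrained.put(0, new long[] {10L, 20L, 30L});
        fineGrained.put(1, new long[] {1L, 2L, 3L});

        List<Long> aggregated = aggregate(fineGrained, 3);
        // The per-partition arrays are dropped once the aggregate exists.
        fineGrained.clear();

        System.out.println(aggregated); // prints [11, 22, 33]
    }
}
```

        After this step only one value per subpartition index is kept, so the memory footprint no longer grows with the number of partitions.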

      • resetPartitionInfo

        public void resetPartitionInfo(int partitionIndex)
        Description copied from interface: BlockingResultInfo
        Reset the information of the result partition.
        Specified by:
        resetPartitionInfo in interface BlockingResultInfo
        Parameters:
        partitionIndex - the intermediate result partition index
      • getNumBytesProduced

        public long getNumBytesProduced(IndexRange partitionIndexRange,
                                        IndexRange subpartitionIndexRange)
        Description copied from interface: BlockingResultInfo
        Returns the aggregated number of bytes for the given partition and subpartition index ranges.
        Specified by:
        getNumBytesProduced in interface BlockingResultInfo
        Parameters:
        partitionIndexRange - range of the index of the consumed partition.
        subpartitionIndexRange - range of the index of the consumed subpartition.
        Returns:
        aggregated bytes according to the index ranges.
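
        The range-based aggregation can be pictured as a double loop over the fine-grained map. The sketch below uses plain int bounds instead of IndexRange and treats both ends of each range as inclusive; that convention, the class RangeSumSketch, and the method sumBytes are assumptions made for illustration and are not taken from the Flink source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Flink's implementation.
final class RangeSumSketch {

    /** Sums bytes over partitions [partStart, partEnd] and subpartitions [subStart, subEnd]. */
    static long sumBytes(
            Map<Integer, long[]> bytesByPartition,
            int partStart, int partEnd,
            int subStart, int subEnd) {
        long total = 0L;
        for (int p = partStart; p <= partEnd; p++) {
            long[] subpartitionBytes = bytesByPartition.get(p);
            if (subpartitionBytes == null) {
                // Assumption: a missing entry means the partition has not reported its bytes.
                throw new IllegalStateException("No bytes recorded for partition " + p);
            }
            for (int s = subStart; s <= subEnd; s++) {
                total += subpartitionBytes[s];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, long[]> bytes = new HashMap<>();
        bytes.put(0, new long[] {10L, 20L, 30L});
        bytes.put(1, new long[] {1L, 2L, 3L});

        // Partitions 0..1, subpartitions 1..2: 20 + 30 + 2 + 3
        System.out.println(sumBytes(bytes, 0, 1, 1, 2)); // prints 55
    }
}
```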
      • getAggregatedSubpartitionBytes

        public List<Long> getAggregatedSubpartitionBytes()
      • getSubpartitionBytesByPartitionIndex

        public Map<Integer, long[]> getSubpartitionBytesByPartitionIndex()
        Description copied from interface: BlockingResultInfo
        Gets subpartition bytes by partition index.
        Specified by:
        getSubpartitionBytesByPartitionIndex in interface BlockingResultInfo
        Returns:
        a map with integer keys representing partition indices and long array values representing subpartition bytes.