case class ReduceNumShufflePartitions(conf: SQLConf) extends Rule[SparkPlan] with Product with Serializable
A rule to adjust the post shuffle partitions based on the map output statistics.
The strategy used to determine the number of post-shuffle partitions is described as follows. To determine the number of post-shuffle partitions, we have a target input size for a post-shuffle partition. Once we have size statistics of all pre-shuffle partitions, we will do a pass of those statistics and pack pre-shuffle partitions with continuous indices to a single post-shuffle partition until adding another pre-shuffle partition would cause the size of a post-shuffle partition to be greater than the target size.
For example, we have two stages with the following pre-shuffle partition size statistics: stage 1: [100 MiB, 20 MiB, 100 MiB, 10MiB, 30 MiB] stage 2: [10 MiB, 10 MiB, 70 MiB, 5 MiB, 5 MiB] assuming the target input size is 128 MiB, we will have four post-shuffle partitions, which are:
- post-shuffle partition 0: pre-shuffle partition 0 (size 110 MiB)
- post-shuffle partition 1: pre-shuffle partition 1 (size 30 MiB)
- post-shuffle partition 2: pre-shuffle partition 2 (size 170 MiB)
- post-shuffle partition 3: pre-shuffle partition 3 and 4 (size 50 MiB)
- Alphabetic
- By Inheritance
- ReduceNumShufflePartitions
- Serializable
- Serializable
- Product
- Equals
- Rule
- Logging
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Instance Constructors
- new ReduceNumShufflePartitions(conf: SQLConf)
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
apply(plan: SparkPlan): SparkPlan
- Definition Classes
- ReduceNumShufflePartitions → Rule
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()
- val conf: SQLConf
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
-
def
initializeForcefully(isInterpreter: Boolean, silent: Boolean): Unit
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
initializeLogIfNecessary(isInterpreter: Boolean): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
def
isTraceEnabled(): Boolean
- Attributes
- protected
- Definition Classes
- Logging
-
def
log: Logger
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logDebug(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logError(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logInfo(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logName: String
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logTrace(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String, throwable: Throwable): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
def
logWarning(msg: ⇒ String): Unit
- Attributes
- protected
- Definition Classes
- Logging
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
-
val
ruleName: String
- Definition Classes
- Rule
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... ) @native()