Class

org.apache.spark.sql.execution

ObjectHashMapAccessor

case class ObjectHashMapAccessor(session: SnappySession, ctx: CodegenContext, keyExprs: Seq[Expression], valueExprs: Seq[Expression], classPrefix: String, hashMapTerm: String, dataTerm: String, maskTerm: String, multiMap: Boolean, consumer: CodegenSupport, cParent: CodegenSupport, child: SparkPlan) extends SparkPlan with UnaryExecNode with CodegenSupport with Product with Serializable

Provides helper methods that let generated code use ObjectHashSet with a generated class whose fields hold the key and value columns as their corresponding Java types. This implementation avoids the entire overhead of UnsafeRow conversion for both the key type (as incurred by BytesToBytesMap) and the value type (as incurred by BytesToBytesMap and VectorizedHashMapGenerator).

It has been carefully optimized to minimize memory reads and writes, with minimal code so that it fits better in the CPU instruction cache. Unlike the other two maps used by HashAggregateExec, it places no limitations on the key or value column types.

The basic idea is that every key and value column becomes an individual field, of the corresponding Java type, in a generated Java class. Storing a column value in the map is a simple assignment of the incoming variable to the corresponding field of the class object, and access is likewise a read of that field. Nullability information is packed into long bit-mask fields, of which as many are generated as required (avoiding the unnecessary overhead of something like a BitSet).
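To make the field-per-column layout concrete, here is a minimal hand-written sketch of what such a class might look like for two key columns and two aggregate values. The class name, field names, and helper methods are assumptions made for illustration; the real class is generated as Java source and its exact shape is not shown in this documentation.

```scala
// Hand-written stand-in for the kind of class that would be generated for
// GROUP BY (k0 INT, k1 STRING) with aggregates COUNT(*) and SUM(x).
// All names here are illustrative, not taken from the real generator.
final class GroupEntry(val k0: Int, val k1: String) {
  var v0: Long = 0L       // running COUNT(*)
  var v1: Double = 0.0    // running SUM(x)
  var nullMask: Long = 0L // bit i set means column i is NULL

  def isNull(col: Int): Boolean = (nullMask & (1L << col)) != 0L
  def setNull(col: Int): Unit = nullMask |= (1L << col)
  def clearNull(col: Int): Unit = nullMask &= ~(1L << col)
}

object NullMasks {
  // One long covers 64 columns, so wider rows need several mask fields.
  def masksNeeded(numColumns: Int): Int = (numColumns + 63) / 64
}
```

Reading or writing a column is then a plain field access such as `e.v0 += 1`, and a null check is a single bitwise AND against a long that sits in the same object.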

Hashcode and equals methods are generated for the key column fields. Having both key and value fields in the same class object reduces the amount of generated code, improves cache locality, and saves at least one memory access per row. In testing, this alone improved performance by ~25% in simple group-by queries. Furthermore, this class also provides inline hashcode and equals methods so that incoming register variables in generated code can be used directly (instead of being copied into a lookup key whose fields would then be read again). The class hashcode method is meant to be used only internally by rehashing, and even that just returns a field cached in the class object, which is filled in during the initial insert (from the inline hashcode).
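The following sketch illustrates the inline hash and equals idea with a tiny open-addressing set. It is not the ObjectHashSet implementation: the class names, the linear probing scheme, the load factor, and the hash function are all assumptions chosen to keep the example short. What it does show is that the lookup works directly on the incoming values without building a key object, and that rehashing only reads the cached hash.

```scala
// Hypothetical entry class: key fields, one value field, and the cached hash.
final class Entry(val k0: Int, val k1: Long, val hash: Int) {
  var count: Long = 0L
  override def hashCode(): Int = hash // cached at insert; read only by rehash
}

final class TinyObjectHashSet(initialCapacity: Int) {
  require(initialCapacity > 0 && Integer.bitCount(initialCapacity) == 1,
    "capacity must be a power of two")

  private var table = new Array[Entry](initialCapacity)
  private var mask = initialCapacity - 1
  private var count = 0

  def size: Int = count

  // "Inline" hash: computed straight from the incoming variables.
  private def inlineHash(k0: Int, k1: Long): Int =
    31 * k0 + (k1 ^ (k1 >>> 32)).toInt

  def getOrInsert(k0: Int, k1: Long): Entry = {
    val h = inlineHash(k0, k1)
    var pos = h & mask
    var e = table(pos)
    // "Inline" equals: compare entry fields with the incoming variables.
    while (e != null && !(e.hash == h && e.k0 == k0 && e.k1 == k1)) {
      pos = (pos + 1) & mask
      e = table(pos)
    }
    if (e == null) {
      e = new Entry(k0, k1, h)
      table(pos) = e
      count += 1
      if (count * 2 > table.length) rehash()
    }
    e
  }

  private def rehash(): Unit = {
    val old = table
    table = new Array[Entry](old.length * 2)
    mask = table.length - 1
    for (x <- old if x != null) {
      var pos = x.hashCode() & mask // no key fields are re-read here
      while (table(pos) != null) pos = (pos + 1) & mask
      table(pos) = x
    }
  }
}
```

A group-by style update is then `set.getOrInsert(id, ts).count += 1`, which touches a single object for both the key comparison and the value update.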

For memory management this uses a simple approach: it starts with an estimated size and then improves that estimate for future use during a rehash, which also collects the actual size of the current entries. If the rehash reports that no memory is available, it falls back to dumping the current map into MemoryManager and creating a new one, with the merge done by an external sorter in a manner similar to how UnsafeFixedWidthAggregationMap handles the situation. The caller can instead decide to dump the entire map in that scenario, for example when using it for a HashJoin.
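The estimate-then-refine step can be pictured with a small sketch. Everything in it is hypothetical: the `SizeEstimator` name, the fixed byte budget, and the rule of checking whether a doubled table would fit are assumptions for illustration, and the real interaction with Spark's memory manager is not modelled.

```scala
// Hypothetical helper: starts from a guessed per-entry size and replaces the
// guess with a measured average each time the map is rehashed.
final class SizeEstimator(initialBytesPerEntry: Long, budgetBytes: Long) {
  private var bytesPerEntry = initialBytesPerEntry

  def estimate(numEntries: Int): Long = bytesPerEntry * numEntries

  /** Called during a rehash with the measured size of every current entry.
    * Returns false when a table of twice the size would exceed the budget,
    * which is the point where a caller would spill or dump the map. */
  def onRehash(actualSizes: Iterable[Long]): Boolean = {
    val n = actualSizes.size
    if (n > 0) bytesPerEntry = math.max(1L, actualSizes.sum / n)
    estimate(2 * n) <= budgetBytes
  }
}
```

For example, with an initial guess of 16 bytes and a 1000-byte budget, a rehash that measures four 40-byte entries raises the per-entry estimate to 40 and still reports that growth fits; a later rehash over twenty such entries does not.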

Overall this map is 5-10X faster than UnsafeFixedWidthAggregationMap and 2-4X faster than VectorizedHashMapGenerator. It is generic enough to be used for both group-by aggregation and HashJoins.

Linear Supertypes
CodegenSupport, UnaryExecNode, SparkPlan, Serializable, java.io.Serializable, internal.Logging, QueryPlan[SparkPlan], TreeNode[SparkPlan], Product, Equals, AnyRef, Any

Instance Constructors

  1. new ObjectHashMapAccessor(session: SnappySession, ctx: CodegenContext, keyExprs: Seq[Expression], valueExprs: Seq[Expression], classPrefix: String, hashMapTerm: String, dataTerm: String, maskTerm: String, multiMap: Boolean, consumer: CodegenSupport, cParent: CodegenSupport, child: SparkPlan)


Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. lazy val allAttributes: AttributeSeq

    Definition Classes
    QueryPlan
  5. def apply(number: Int): TreeNode[_]

    Definition Classes
    TreeNode
  6. def argString: String

    Definition Classes
    TreeNode
  7. def asCode: String

    Definition Classes
    TreeNode
  8. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  9. val cParent: CodegenSupport

  10. lazy val canonicalized: SparkPlan

    Attributes
    protected
    Definition Classes
    QueryPlan
  11. val child: SparkPlan

    Definition Classes
    ObjectHashMapAccessor → UnaryExecNode
  12. final def children: Seq[SparkPlan]

    Definition Classes
    UnaryExecNode → TreeNode
  13. val classPrefix: String

  14. lazy val cleanArgs: Seq[Any]

    Attributes
    protected
    Definition Classes
    QueryPlan
  15. def cleanExpression(e: Expression): Expression

    Attributes
    protected
    Definition Classes
    QueryPlan
  16. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  17. def collect[B](pf: PartialFunction[SparkPlan, B]): Seq[B]

    Definition Classes
    TreeNode
  18. def collectFirst[B](pf: PartialFunction[SparkPlan, B]): Option[B]

    Definition Classes
    TreeNode
  19. def collectLeaves(): Seq[SparkPlan]

    Definition Classes
    TreeNode
  20. lazy val constraints: ExpressionSet

    Definition Classes
    QueryPlan
  21. final def consume(ctx: CodegenContext, outputVars: Seq[ExprCode], row: String): String

    Definition Classes
    CodegenSupport
  22. val consumer: CodegenSupport

  23. lazy val containsChild: Set[TreeNode[_]]

    Definition Classes
    TreeNode
  24. val ctx: CodegenContext

  25. val dataTerm: String

  26. def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String

    Definition Classes
    ObjectHashMapAccessor → CodegenSupport
  27. def doExecute(): RDD[InternalRow]

    Attributes
    protected
    Definition Classes
    ObjectHashMapAccessor → SparkPlan
  28. def doExecuteBroadcast[T](): Broadcast[T]

    Attributes
    protected[org.apache.spark.sql]
    Definition Classes
    SparkPlan
  29. def doPrepare(): Unit

    Attributes
    protected
    Definition Classes
    SparkPlan
  30. def doProduce(ctx: CodegenContext): String

    Attributes
    protected
    Definition Classes
    ObjectHashMapAccessor → CodegenSupport
  31. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  32. def evaluateRequiredVariables(attributes: Seq[Attribute], variables: Seq[ExprCode], required: AttributeSet): String

    Attributes
    protected
    Definition Classes
    CodegenSupport
  33. def evaluateVariables(variables: Seq[ExprCode]): String

    Attributes
    protected
    Definition Classes
    CodegenSupport
  34. final def execute(): RDD[InternalRow]

    Definition Classes
    SparkPlan
  35. final def executeBroadcast[T](): Broadcast[T]

    Definition Classes
    SparkPlan
  36. def executeCollect(): Array[InternalRow]

    Definition Classes
    SparkPlan
  37. def executeCollectPublic(): Array[Row]

    Definition Classes
    SparkPlan
  38. final def executeQuery[T](query: ⇒ T): T

    Attributes
    protected
    Definition Classes
    SparkPlan
  39. def executeTake(n: Int): Array[InternalRow]

    Definition Classes
    SparkPlan
  40. def executeToIterator(): Iterator[InternalRow]

    Definition Classes
    SparkPlan
  41. final def expressions: Seq[Expression]

    Definition Classes
    QueryPlan
  42. def fastEquals(other: TreeNode[_]): Boolean

    Definition Classes
    TreeNode
  43. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  44. def find(f: (SparkPlan) ⇒ Boolean): Option[SparkPlan]

    Definition Classes
    TreeNode
  45. def flatMap[A](f: (SparkPlan) ⇒ TraversableOnce[A]): Seq[A]

    Definition Classes
    TreeNode
  46. def foreach(f: (SparkPlan) ⇒ Unit): Unit

    Definition Classes
    TreeNode
  47. def foreachUp(f: (SparkPlan) ⇒ Unit): Unit

    Definition Classes
    TreeNode
  48. def generateEquals(objVar: String, keyVars: Seq[ExprCode]): String


    Generate code to compare equality of a given object (objVar) against key column variables.

  49. def generateHashCode(hashVar: Array[String], keyVars: Seq[ExprCode], keyExpressions: Seq[Expression], skipDeclaration: Boolean = false, register: Boolean = true): String


    Generate code to calculate the hash code for given column variables that correspond to the key columns in this class.

  50. def generateMapGetOrInsert(objVar: String, valueInitVars: Seq[ExprCode], valueInitCode: String, input: Seq[ExprCode], dictArrayVar: String, dictArrayInitVar: String): String


    Generate code to look up the map, or to insert a new key and value if the key is not found.

  51. def generateMapLookup(entryVar: String, localValueVar: String, keyIsUnique: String, numRows: String, nullMaskVars: Array[String], initCode: String, checkCond: (Option[ExprCode], String), streamKeys: Seq[Expression], streamKeyVars: Seq[ExprCode], streamOutput: Seq[Attribute], buildKeyVars: Seq[ExprCode], buildVars: Seq[ExprCode], input: Seq[ExprCode], resultVars: Seq[ExprCode], dictArrayVar: String, dictArrayInitVar: String, joinType: JoinType, buildSide: BuildSide): String

  52. def generateTreeString(depth: Int, lastChildren: Seq[Boolean], builder: StringBuilder, verbose: Boolean, prefix: String): StringBuilder

    Definition Classes
    TreeNode
  53. def generateUpdate(objVar: String, columnVars: Seq[ExprCode], resultVars: Seq[ExprCode], forKey: Boolean, doCopy: Boolean = true, forInit: Boolean = true): String


    Generate code to update the fields of a class object with the given resultVars. If accessors for the fields have been generated (using getColumnVars), they can be passed for faster reads where required.

    objVar: the variable holding the reference to the class object
    columnVars: accessors for the object fields, if available
    resultVars: result values to be assigned to the object fields
    forKey: if true, update the key fields; otherwise update the value fields
    doCopy: if true, a copy of each reference value is assigned; otherwise only the reference is copied
    forInit: if true, this is initialization of fields right after object creation, so some checks can be skipped
    returns: code that assigns the given resultVars to the fields of objVar

  54. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  55. def getClassName: String


    Get the generated class name.

  56. def getColumnVars(keyObjVar: String, localValObjVar: String, onlyKeyVars: Boolean, onlyValueVars: Boolean, checkNullObj: Boolean = false): (String, Seq[ExprCode], Array[String])


    Get the ExprCode for the key and/or value columns given a class object variable. This also returns initialization code that should be inserted first in the generated code. The last element of the result tuple holds the names of the null mask variables.

  57. def getRelevantConstraints(constraints: Set[Expression]): Set[Expression]

    Attributes
    protected
    Definition Classes
    QueryPlan
  58. def hashCode(): Int

    Definition Classes
    TreeNode → AnyRef → Any
  59. val hashMapTerm: String

  60. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  61. def innerChildren: Seq[QueryPlan[_]]

    Attributes
    protected
    Definition Classes
    QueryPlan → TreeNode
  62. def inputRDDs(): Seq[RDD[InternalRow]]

    Definition Classes
    ObjectHashMapAccessor → CodegenSupport
  63. def inputSet: AttributeSet

    Definition Classes
    QueryPlan
  64. lazy val integralKeys: Seq[Int]

  65. lazy val integralKeysMaxVars: Seq[String]

  66. lazy val integralKeysMinVars: Seq[String]

  67. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  68. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  69. def jsonFields: List[JField]

    Attributes
    protected
    Definition Classes
    TreeNode
  70. val keyExprs: Seq[Expression]

  71. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  72. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  73. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  74. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  75. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  76. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  77. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  78. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  79. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  80. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  81. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  82. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  83. def longMetric(name: String): SQLMetric

    Definition Classes
    SparkPlan
  84. def makeCopy(newArgs: Array[AnyRef]): SparkPlan

    Definition Classes
    SparkPlan → TreeNode
  85. def map[A](f: (SparkPlan) ⇒ A): Seq[A]

    Definition Classes
    TreeNode
  86. def mapChildren(f: (SparkPlan) ⇒ SparkPlan): SparkPlan

    Definition Classes
    TreeNode
  87. def mapExpressions(f: (Expression) ⇒ Expression): ObjectHashMapAccessor.this.type

    Definition Classes
    QueryPlan
  88. def mapProductIterator[B](f: (Any) ⇒ B)(implicit arg0: ClassTag[B]): Array[B]

    Attributes
    protected
    Definition Classes
    TreeNode
  89. val maskTerm: String

  90. def metadata: Map[String, String]

    Definition Classes
    SparkPlan
  91. def metricTerm(ctx: CodegenContext, name: String): String

    Definition Classes
    CodegenSupport
  92. def metrics: Map[String, SQLMetric]

    Definition Classes
    SparkPlan
  93. def missingInput: AttributeSet

    Definition Classes
    QueryPlan
  94. val multiMap: Boolean

  95. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  96. def newMutableProjection(expressions: Seq[Expression], inputSchema: Seq[Attribute], useSubexprElimination: Boolean): MutableProjection

    Attributes
    protected
    Definition Classes
    SparkPlan
  97. def newNaturalAscendingOrdering(dataTypes: Seq[DataType]): Ordering[InternalRow]

    Attributes
    protected
    Definition Classes
    SparkPlan
  98. def newOrdering(order: Seq[SortOrder], inputSchema: Seq[Attribute]): Ordering[InternalRow]

    Attributes
    protected
    Definition Classes
    SparkPlan
  99. def newPredicate(expression: Expression, inputSchema: Seq[Attribute]): Predicate

    Attributes
    protected
    Definition Classes
    SparkPlan
  100. def nodeName: String

    Definition Classes
    TreeNode
  101. final def notify(): Unit

    Definition Classes
    AnyRef
  102. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  103. def numberedTreeString: String

    Definition Classes
    TreeNode
  104. val origin: Origin

    Definition Classes
    TreeNode
  105. def otherCopyArgs: Seq[AnyRef]

    Attributes
    protected
    Definition Classes
    TreeNode
  106. def output: Seq[Attribute]

    Definition Classes
    ObjectHashMapAccessor → QueryPlan
  107. def outputOrdering: Seq[SortOrder]

    Definition Classes
    SparkPlan
  108. def outputPartitioning: Partitioning

    Definition Classes
    SparkPlan
  109. def outputSet: AttributeSet

    Definition Classes
    QueryPlan
  110. def p(number: Int): SparkPlan

    Definition Classes
    TreeNode
  111. var parent: CodegenSupport

    Attributes
    protected
    Definition Classes
    CodegenSupport
  112. final def prepare(): Unit

    Definition Classes
    SparkPlan
  113. def prepareSubqueries(): Unit

    Attributes
    protected
    Definition Classes
    SparkPlan
  114. def prettyJson: String

    Definition Classes
    TreeNode
  115. def printSchema(): Unit

    Definition Classes
    QueryPlan
  116. final def produce(ctx: CodegenContext, parent: CodegenSupport): String

    Definition Classes
    CodegenSupport
  117. def producedAttributes: AttributeSet

    Definition Classes
    QueryPlan
  118. def references: AttributeSet

    Definition Classes
    QueryPlan
  119. def requiredChildDistribution: Seq[Distribution]

    Definition Classes
    SparkPlan
  120. def requiredChildOrdering: Seq[Seq[SortOrder]]

    Definition Classes
    SparkPlan
  121. def resetMetrics(): Unit

    Definition Classes
    SparkPlan
  122. def sameResult(plan: SparkPlan): Boolean

    Definition Classes
    QueryPlan
  123. lazy val schema: StructType

    Definition Classes
    QueryPlan
  124. def schemaString: String

    Definition Classes
    QueryPlan
  125. val session: SnappySession

  126. def simpleString: String

    Definition Classes
    QueryPlan → TreeNode
  127. def sparkContext: SparkContext

    Attributes
    protected
    Definition Classes
    SparkPlan
  128. final val sqlContext: SQLContext

    Definition Classes
    SparkPlan
  129. def statePrefix: String

    Attributes
    protected
    Definition Classes
    QueryPlan
  130. def stringArgs: Iterator[Any]

    Attributes
    protected
    Definition Classes
    TreeNode
  131. val subexpressionEliminationEnabled: Boolean

    Definition Classes
    SparkPlan
  132. def subqueries: Seq[SparkPlan]

    Definition Classes
    QueryPlan
  133. def supportCodegen: Boolean

    Definition Classes
    CodegenSupport
  134. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  135. def toJSON: String

    Definition Classes
    TreeNode
  136. def toString(): String

    Definition Classes
    TreeNode → AnyRef → Any
  137. def transform(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  138. def transformAllExpressions(rule: PartialFunction[Expression, Expression]): ObjectHashMapAccessor.this.type

    Definition Classes
    QueryPlan
  139. def transformDown(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  140. def transformExpressions(rule: PartialFunction[Expression, Expression]): ObjectHashMapAccessor.this.type

    Definition Classes
    QueryPlan
  141. def transformExpressionsDown(rule: PartialFunction[Expression, Expression]): ObjectHashMapAccessor.this.type

    Definition Classes
    QueryPlan
  142. def transformExpressionsUp(rule: PartialFunction[Expression, Expression]): ObjectHashMapAccessor.this.type

    Definition Classes
    QueryPlan
  143. def transformUp(rule: PartialFunction[SparkPlan, SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
  144. def treeString(verbose: Boolean): String

    Definition Classes
    TreeNode
  145. def treeString: String

    Definition Classes
    TreeNode
  146. def usedInputs: AttributeSet

    Definition Classes
    CodegenSupport
  147. def validConstraints: Set[Expression]

    Attributes
    protected
    Definition Classes
    QueryPlan
  148. val valueExprs: Seq[Expression]

  149. def verboseString: String

    Definition Classes
    QueryPlan → TreeNode
  150. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  151. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  152. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  153. def waitForSubqueries(): Unit

    Attributes
    protected
    Definition Classes
    SparkPlan
  154. def withNewChildren(newChildren: Seq[SparkPlan]): SparkPlan

    Definition Classes
    TreeNode
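
The code-generating members declared by this class (generateEquals, generateHashCode, generateUpdate) return Java source as strings. As a rough illustration of that style, the sketch below builds similar strings from a simplified column description. It is not the actual implementation: `Col` is a hypothetical stand-in for Spark's ExprCode, the signatures are reduced, the `31 * h + x` hash combination is an assumption, and null handling and the doCopy/forInit options are left out.

```scala
// Hypothetical stand-in for Spark's ExprCode: a generated-class field name,
// the local Java variable holding the incoming value, and its Java type.
final case class Col(field: String, variable: String, javaType: String) {
  def isPrimitive: Boolean = javaType == "int" || javaType == "long"
}

object MiniCodegen {
  /** Java boolean expression comparing key fields of objVar with key variables. */
  def generateEquals(objVar: String, keys: Seq[Col]): String =
    keys.map { k =>
      if (k.isPrimitive) s"$objVar.${k.field} == ${k.variable}"
      else s"$objVar.${k.field}.equals(${k.variable})"
    }.mkString(" && ")

  /** Java statements computing a combined hash of the key variables. */
  def generateHashCode(hashVar: String, keys: Seq[Col],
      skipDeclaration: Boolean = false): String = {
    val init = if (skipDeclaration) s"$hashVar = 0;" else s"int $hashVar = 0;"
    val steps = keys.map { k =>
      val h = k.javaType match {
        case "int"  => k.variable
        case "long" => s"(int)(${k.variable} ^ (${k.variable} >>> 32))"
        case _      => s"${k.variable}.hashCode()"
      }
      s"$hashVar = 31 * $hashVar + $h;"
    }
    (init +: steps).mkString("\n")
  }

  /** Java statements assigning each result variable to its field of objVar. */
  def generateUpdate(objVar: String, cols: Seq[Col]): String =
    cols.map(c => s"$objVar.${c.field} = ${c.variable};").mkString("\n")
}
```

With keys `Col("k0", "id", "int")` and `Col("k1", "name", "String")`, `MiniCodegen.generateEquals("e", keys)` yields the Java expression `e.k0 == id && e.k1.equals(name)`, which a caller would splice into the probe loop of the generated lookup code.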
