org.apache.spark.ml.odkl

TopKUDAF

class TopKUDAF[B] extends UserDefinedAggregateFunction with Logging

Created by eugeny.malyutin on 24.06.16.

UDAF designed to extract the top numRows rows ordered by the value of a column. It is used to replace Hive window functions, which are too slow when the whole DataFrame falls into a single aggregation cell. The result of the aggregation is packed into a column named "arrData" and needs to be unpacked with org.apache.spark.sql.functions.explode.

B

- type of the columnToSortBy column; an implicit Ordering[B] must be available
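
The top-k aggregation described above can be illustrated without Spark. The following is a minimal sketch only, not the TopKUDAF implementation: TopKBuffer, Scored and TopKSketch are hypothetical names introduced for illustration, and the assumption that a bounded min-heap is used is ours. It mirrors the update, merge and evaluate steps of a UDAF while keeping at most k items ordered by an implicit Ordering[B].

```scala
import scala.collection.mutable

// Hypothetical helper, NOT part of TopKUDAF: keeps the k largest items by a sort key.
final class TopKBuffer[A, B](k: Int, key: A => B)(implicit cmp: Ordering[B]) {
  require(k > 0, "k must be positive")

  // Min-heap on the sort key: the head is the smallest retained item,
  // so it is the one to evict when a larger item arrives.
  private val heap = mutable.PriorityQueue.empty[A](Ordering.by(key)(cmp).reverse)

  // Analogue of a UDAF update step: offer one item to the buffer.
  def update(item: A): Unit = {
    if (heap.size < k) heap.enqueue(item)
    else if (cmp.gt(key(item), key(heap.head))) {
      heap.dequeue()
      heap.enqueue(item)
    }
  }

  // Analogue of a UDAF merge step: fold another partial buffer into this one.
  def merge(other: TopKBuffer[A, B]): Unit = other.heap.foreach(update)

  // Analogue of a UDAF evaluate step: retained items, largest key first.
  def result: List[A] = heap.toList.sortBy(key)(cmp.reverse)
}

// Hypothetical row type used only for the demonstration.
final case class Scored(group: String, score: Int)

object TopKSketch {
  def main(args: Array[String]): Unit = {
    val left = new TopKBuffer[Scored, Int](2, _.score)
    Seq(Scored("a", 1), Scored("a", 9), Scored("a", 5)).foreach(left.update)

    val right = new TopKBuffer[Scored, Int](2, _.score)
    right.update(Scored("a", 7))

    left.merge(right)
    println(left.result.map(_.score)) // List(9, 7)
  }
}
```

Because each partial buffer holds at most k items, merging two buffers costs O(k log k) regardless of how many rows were aggregated into them, which is the property that makes this kind of aggregation cheaper than ranking every row in a large group.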

Linear Supertypes
Logging, UserDefinedAggregateFunction, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new TopKUDAF(numRows: Int = 20, dfSchema: StructType, columnToSortBy: String)(implicit cmp: Ordering[B])

    numRows

    number of rows to keep per aggregation column

    dfSchema

    dataframe schema with all columns in one struct-column named "data"

    columnToSortBy

    name of the column in dfSchema whose values are used to order the rows

Value Members

  1. final def !=(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  2. final def !=(arg0: Any): Boolean

    Definition Classes
    Any
  3. final def ##(): Int

    Definition Classes
    AnyRef → Any
  4. final def ==(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  5. final def ==(arg0: Any): Boolean

    Definition Classes
    Any
  6. def apply(exprs: Column*): Column

    Definition Classes
    UserDefinedAggregateFunction
    Annotations
    @varargs()
  7. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  8. def bufferSchema: StructType

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  9. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  10. implicit val cmp: Ordering[B]

  11. val columnToSortByIndex: Int

  12. def dataType: DataType

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  13. def deterministic: Boolean

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  14. def distinct(exprs: Column*): Column

    Definition Classes
    UserDefinedAggregateFunction
    Annotations
    @varargs()
  15. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  16. def equals(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  17. def evaluate(buffer: Row): Any

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  18. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  19. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  20. def hashCode(): Int

    Definition Classes
    AnyRef → Any
  21. def initialize(buffer: MutableAggregationBuffer): Unit

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  22. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  23. def inputSchema: StructType

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  24. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  25. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  26. def k: Int

  27. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  28. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  29. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  30. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  31. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  33. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  35. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  36. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  37. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  38. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  39. def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  40. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  41. final def notify(): Unit

    Definition Classes
    AnyRef
  42. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  43. val numRows: Int

    number of rows to keep per aggregation column

  44. lazy val rowComparator: Comparator[AnyRef]

  45. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  46. def toString(): String

    Definition Classes
    AnyRef → Any
  47. def update(buffer: MutableAggregationBuffer, input: Row): Unit

    Definition Classes
    TopKUDAF → UserDefinedAggregateFunction
  48. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  49. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
