class TopKUDAF[B] extends UserDefinedAggregateFunction with Logging
Created by eugeny.malyutin on 24.06.16.
UDAF designed to extract the top numRows rows by column value.
Used to replace Hive window functions, which are too slow when the whole DataFrame falls into a single aggregation cell.
The result of the aggregation function is packed into a column "arrData" and needs to be expanded with org.apache.spark.sql.functions.explode.
B
type of columnToSortBy, with an implicit Ordering for type B
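The top-numRows selection driven by an implicit Ordering[B] can be illustrated without Spark. The following is a minimal sketch only, not the TopKUDAF implementation: the object TopKSketch, the method topK and the key parameter are hypothetical names introduced here, and a bounded priority queue is assumed as the selection technique.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of TopKUDAF: keeps the k rows with the
// largest sort key, using an implicit Ordering[B] for the key type.
object TopKSketch {
  def topK[A, B](rows: Iterable[A], k: Int)(key: A => B)(implicit cmp: Ordering[B]): Seq[A] = {
    // Reverse the ordering so that dequeue() removes the smallest retained key.
    val heap = mutable.PriorityQueue.empty[A](Ordering.by(key)(cmp).reverse)
    rows.foreach { row =>
      heap.enqueue(row)
      if (heap.size > k) heap.dequeue() // evict the current minimum
    }
    // The queue's iteration order is unspecified, so sort the survivors descending.
    heap.toSeq.sortBy(key)(cmp.reverse)
  }
}
```

For example, TopKSketch.topK(Seq(("a", 5), ("b", 1), ("c", 9), ("d", 3)), 2)(_._2) keeps the two rows with the largest second element. The queue never holds more than k + 1 rows, which is the property that makes this approach attractive when one aggregation cell contains very many rows.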
Linear Supertypes
Logging, UserDefinedAggregateFunction, Serializable, Serializable, AnyRef, Any
Instance Constructors
new TopKUDAF(numRows: Int = 20, dfSchema: StructType, columnToSortBy: String)(implicit cmp: Ordering[B])
numRows
number of rows per aggregation column
dfSchema
DataFrame schema with all columns in one struct column named "data"