doric
package doric
- Grouped
- Alphabetic
- By Inheritance
- doric
- All
- SortingOps
- CollectOps
- JoinOps
- TransformOps
- AggregationOps
- RelationalGroupedDatasetDoricInterface
- All
- DStructs3x
- NumericColumns2_31
- StringColumn3x
- MapColumns3x
- CommonColumns3x
- ArrayColumns3x
- BinaryColumns30_31
- Interpolators
- BinaryColumns
- CNameOps
- AggregationColumns
- ControlStructures
- StringColumns
- BooleanColumns
- TimestampColumns
- DateColumns
- NumericColumns
- MapColumns
- LiteralConversions
- DStructs
- CommonColumns
- ColGetters
- TypeMatcher
- ArrayColumns
- AnyRef
- Any
- Hide All
- Show All
- Public
- All
Type Members
- type ArrayColumn[A] = DoricColumn[Array[A]]
- type BinaryColumn = DoricColumn[Array[Byte]]
- type BooleanColumn = DoricColumn[Boolean]
- type ByteColumn = DoricColumn[Byte]
- case class CName(value: String) extends Product with Serializable
- case class CNameOrd(name: CName, order: Order) extends Product with Serializable
-
implicit
class
CollectSyntax[A] extends AnyRef
- Definition Classes
- CollectOps
-
implicit
class
DStructOps3x[T] extends AnyRef
- Definition Classes
- DStructs3x
-
implicit
class
DataframeAggSyntax extends AnyRef
- Definition Classes
- AggregationOps
-
implicit
class
DataframeSortSyntax extends AnyRef
- Definition Classes
- SortingOps
-
implicit
class
DataframeTransformationSyntax[A] extends AnyRef
- Definition Classes
- TransformOps
- type DateColumn = DoricColumn[Date]
- type Doric[T] = Kleisli[DoricValidated, Dataset[_], T]
- sealed trait DoricColumn[T] extends AnyRef
- type DoricJoin[T] = Kleisli[DoricValidated, (Dataset[_], Dataset[_]), T]
- case class DoricJoinColumn(elem: DoricJoin[Column]) extends Product with Serializable
- type DoricValidated[T] = Validated[NonEmptyChain[DoricSingleError], T]
- type DoubleColumn = DoricColumn[Double]
- type FloatColumn = DoricColumn[Float]
- type InstantColumn = DoricColumn[Instant]
- type IntegerColumn = DoricColumn[Int]
- sealed abstract class JoinSideDoricColumn[T] extends AnyRef
- case class LeftDoricColumn[T](elem: Doric[Column]) extends JoinSideDoricColumn[T] with Product with Serializable
- case class LiteralDoricColumn[T] extends DoricColumn[T] with Product with Serializable
- type LocalDateColumn = DoricColumn[LocalDate]
- type LongColumn = DoricColumn[Long]
- type MapColumn[K, V] = DoricColumn[Map[K, V]]
- case class NamedDoricColumn[T] extends DoricColumn[T] with Product with Serializable
- type NullColumn = DoricColumn[Null]
- sealed trait Order extends AnyRef
-
implicit
class
RelationalGroupedDatasetSem extends AnyRef
- Definition Classes
- AggregationOps
- case class RightDoricColumn[T](elem: Doric[Column]) extends JoinSideDoricColumn[T] with Product with Serializable
- type RowColumn = DoricColumn[Row]
- type StringColumn = DoricColumn[String]
- implicit final class StringIntCNameOps extends AnyVal
- type TimestampColumn = DoricColumn[Timestamp]
- case class TransformationDoricColumn[T] extends DoricColumn[T] with Product with Serializable
-
implicit
class
DataframeJoinSyntax[A] extends AnyRef
- Definition Classes
- JoinOps
-
implicit
class
ArrayArrayColumnSyntax[G[_], F[_], T] extends AnyRef
- Definition Classes
- ArrayColumns
-
implicit
class
ArrayColumnSyntax[T, F[_]] extends AnyRef
Extension methods for arrays
Extension methods for arrays
- Definition Classes
- ArrayColumns
-
implicit
class
ArrayColumnTupleSyntax[K, V, F[_]] extends AnyRef
Extension methods for arrays
Extension methods for arrays
- Definition Classes
- ArrayColumns
-
implicit
class
ArrayColumnSyntax3x[T, F[_]] extends AnyRef
- Definition Classes
- ArrayColumns3x
-
implicit
class
ArrayStructColumnSyntax3x[F[_]] extends AnyRef
- Definition Classes
- ArrayColumns3x
-
implicit
class
BinaryOperationsSyntax[T] extends AnyRef
- Definition Classes
- BinaryColumns
-
implicit
class
BinaryOperationsSyntax30_31[T] extends AnyRef
- Definition Classes
- BinaryColumns30_31
-
implicit
class
BooleanOperationsSyntax extends AnyRef
- Definition Classes
- BooleanColumns
-
implicit
class
CNameOps extends AnyRef
- Definition Classes
- CNameOps
-
implicit
class
StringCNameOps extends AnyRef
- Definition Classes
- CNameOps
-
implicit
class
BasicCol[T] extends AnyRef
Extension methods for any kind of column
Extension methods for any kind of column
- Definition Classes
- CommonColumns
-
implicit
class
CastingImpl[T] extends AnyRef
Casting methods
Casting methods
- Definition Classes
- CommonColumns
-
implicit
class
SparkCol extends AnyRef
- Definition Classes
- CommonColumns
-
implicit
class
ControlStructuresImpl[O] extends AnyRef
- Definition Classes
- ControlStructures
-
implicit
class
DStructOps[T] extends AnyRef
- Definition Classes
- DStructs
-
class
DynamicFieldAccessor[T] extends Dynamic
- Definition Classes
- DStructs
-
trait
SelectorLPI extends AnyRef
- Definition Classes
- DStructs
-
trait
SelectorWithSparkType[L <: HList, K <: Symbol] extends AnyRef
- Definition Classes
- DStructs
- Annotations
- @implicitNotFound( "No field ${K} in record ${L}" )
-
implicit
class
StructOps[T, L <: HList] extends AnyRef
- Definition Classes
- DStructs
-
implicit
class
DateColumnLikeSyntax[T] extends AnyRef
- Definition Classes
- DateColumns
-
implicit
class
doricStringInterpolator extends AnyRef
- Definition Classes
- Interpolators
-
implicit
class
DoricColLiteralGetter[T] extends AnyRef
- Definition Classes
- LiteralConversions
-
implicit
class
LiteralOps[L] extends AnyRef
- Definition Classes
- LiteralConversions
-
implicit
class
MapColumnOps[K, V] extends AnyRef
Extension methods for Map Columns
Extension methods for Map Columns
- Definition Classes
- MapColumns
-
implicit
class
MapColumnOps3x[K, V] extends AnyRef
Extension methods for Map Columns
Extension methods for Map Columns
- Definition Classes
- MapColumns3x
-
implicit
class
IntegralOperationsSyntax[T] extends AnyRef
INTEGRAL OPERATIONS
INTEGRAL OPERATIONS
- Definition Classes
- NumericColumns
-
implicit
class
LongOperationsSyntax extends AnyRef
LONG OPERATIONS
LONG OPERATIONS
- Definition Classes
- NumericColumns
-
implicit
class
NumWithDecimalsOperationsSyntax[T] extends AnyRef
NUM WITH DECIMALS OPERATIONS
NUM WITH DECIMALS OPERATIONS
- Definition Classes
- NumericColumns
-
implicit
class
NumericOperationsSyntax[T] extends AnyRef
GENERIC NUMERIC OPERATIONS
GENERIC NUMERIC OPERATIONS
- Definition Classes
- NumericColumns
-
implicit
class
IntegralOperationsSyntax2_31[T] extends AnyRef
INTEGRAL OPERATIONS
INTEGRAL OPERATIONS
- Definition Classes
- NumericColumns2_31
-
implicit
class
StringOperationsSyntax3x extends AnyRef
- Definition Classes
- StringColumn3x
-
implicit
class
StringOperationsSyntax extends AnyRef
Unique column operations
Unique column operations
- Definition Classes
- StringColumns
-
implicit
class
TimestampColumnLikeSyntax[T] extends AnyRef
- Definition Classes
- TimestampColumns
Abstract Value Members
-
abstract
def
constructSide[T](column: Doric[Column], colName: String): NamedDoricColumn[T]
- Attributes
- protected
- Definition Classes
- ColGetters
- Annotations
- @inline()
Concrete Value Members
-
def
andAgg(col: BooleanColumn): BooleanColumn
Aggregate function: returns the AND value for a boolean column
Aggregate function: returns the AND value for a boolean column
- Definition Classes
- AggregationColumns
-
def
aproxCountDistinct(colName: String): LongColumn
Aggregate function: returns the approximate number of distinct items in a group.
Aggregate function: returns the approximate number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
aproxCountDistinct(colName: String, rsd: Double): LongColumn
Aggregate function: returns the approximate number of distinct items in a group.
Aggregate function: returns the approximate number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
aproxCountDistinct(col: DoricColumn[_]): LongColumn
Aggregate function: returns the approximate number of distinct items in a group.
Aggregate function: returns the approximate number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
aproxCountDistinct(col: DoricColumn[_], rsd: Double): LongColumn
Aggregate function: returns the approximate number of distinct items in a group.
Aggregate function: returns the approximate number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
array[T](cols: DoricColumn[T]*)(implicit arg0: SparkType[T], arg1: ClassTag[T], lt: LiteralSparkType[Array[T]]): ArrayColumn[T]
Creates a new array column.
Creates a new array column. The input columns must all have the same data type.
- Definition Classes
- ArrayColumns
- To do
scaladoc link (issue #135)
- See also
org.apache.spark.sql.functions.array
-
def
avg[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the average of the values in a group.
Aggregate function: returns the average of the values in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
coalesce[T](cols: DoricColumn[T]*): DoricColumn[T]
Returns the first column that is not null, or null if all inputs are null.
Returns the first column that is not null, or null if all inputs are null.
For example,
coalesce(a, b, c)
will return a if a is not null, or b if a is null and b is not null, or c if both a and b are null but c is not null.- cols
the DoricColumns to coalesce
- returns
the first column that is not null, or null if all inputs are null.
- Definition Classes
- CommonColumns
- See also
-
def
col[T](colName: String)(implicit arg0: SparkType[T], location: Location): NamedDoricColumn[T]
Retrieves a column with the provided name and the provided type.
Retrieves a column with the provided name and the provided type.
- T
the expected type of the column
- colName
the name of the column to find.
- location
error location.
- returns
the column reference
- Definition Classes
- ColGetters
-
def
colArray[T](colName: String)(implicit arg0: ClassTag[T], location: Location, st: SparkType[Array[T]]): NamedDoricColumn[Array[T]]
Retrieves a column with the provided name expecting it to be of array of T type.
Retrieves a column with the provided name expecting it to be of array of T type.
- T
the type of the elements of the array.
- colName
the name of the column to find.
- location
error location.
- returns
the array of T column reference.
- Definition Classes
- ColGetters
-
def
colArrayInt(colName: String)(implicit location: Location): NamedDoricColumn[Array[Int]]
Retrieves a column with the provided name expecting it to be of array of integers type.
Retrieves a column with the provided name expecting it to be of array of integers type.
- colName
the name of the column to find.
- location
error location.
- returns
the array of integers column reference.
- Definition Classes
- ColGetters
-
def
colArrayString(colName: String)(implicit location: Location): NamedDoricColumn[Array[String]]
Retrieves a column with the provided name expecting it to be of array of string type.
Retrieves a column with the provided name expecting it to be of array of string type.
- colName
the name of the column to find.
- location
error location.
- returns
the array of string column reference.
- Definition Classes
- ColGetters
-
def
colBinary(colName: String)(implicit location: Location): NamedDoricColumn[Array[Byte]]
Retrieves a column with the provided name expecting it to be of array of bytes type.
Retrieves a column with the provided name expecting it to be of array of bytes type.
- colName
the name of the column to find.
- location
error location.
- returns
the binary column reference.
- Definition Classes
- ColGetters
-
def
colBoolean(colName: String)(implicit location: Location): NamedDoricColumn[Boolean]
Retrieves a column with the provided name expecting it to be of double type.
Retrieves a column with the provided name expecting it to be of double type.
- colName
the name of the column to find.
- location
error location.
- returns
the long column reference
- Definition Classes
- ColGetters
-
def
colDate(colName: String)(implicit location: Location): NamedDoricColumn[Date]
Retrieves a column with the provided name expecting it to be of Date type.
Retrieves a column with the provided name expecting it to be of Date type.
- colName
the name of the column to find.
- location
error location.
- returns
the Date column reference
- Definition Classes
- ColGetters
-
def
colDouble(colName: String)(implicit location: Location): NamedDoricColumn[Double]
Retrieves a column with the provided name expecting it to be of double type.
Retrieves a column with the provided name expecting it to be of double type.
- colName
the name of the column to find.
- location
error location.
- returns
the double column reference
- Definition Classes
- ColGetters
-
def
colFloat(colName: String)(implicit location: Location): NamedDoricColumn[Float]
Retrieves a column with the provided name expecting it to be of float type.
Retrieves a column with the provided name expecting it to be of float type.
- colName
the name of the column to find.
- location
error location.
- returns
the float column reference
- Definition Classes
- ColGetters
-
def
colFromDF[T](colName: String, originDF: Dataset[_])(implicit arg0: SparkType[T], location: Location): NamedDoricColumn[T]
Retrieves a column of the provided dataframe.
Retrieves a column of the provided dataframe. Useful to prevent column ambiguity errors.
- T
the type of the doric column.
- colName
the name of the column to find.
- originDF
the dataframe to force the column.
- location
error location.
- returns
the column of type T column reference.
- Definition Classes
- ColGetters
-
def
colInstant(colName: String)(implicit location: Location): NamedDoricColumn[Instant]
Retrieves a column with the provided name expecting it to be of instant type.
Retrieves a column with the provided name expecting it to be of instant type.
- colName
the name of the column to find.
- location
error location.
- returns
the instant column reference
- Definition Classes
- ColGetters
-
def
colInt(colName: String)(implicit location: Location): NamedDoricColumn[Int]
Retrieves a column with the provided name expecting it to be of integer type.
Retrieves a column with the provided name expecting it to be of integer type.
- colName
the name of the column to find.
- location
error location.
- returns
the integer column reference
- Definition Classes
- ColGetters
-
def
colLocalDate(colName: String)(implicit location: Location): NamedDoricColumn[LocalDate]
Retrieves a column with the provided name expecting it to be of LocalDate type.
Retrieves a column with the provided name expecting it to be of LocalDate type.
- colName
the name of the column to find.
- location
error location.
- returns
the LocalDate column reference
- Definition Classes
- ColGetters
-
def
colLong(colName: String)(implicit location: Location): NamedDoricColumn[Long]
Retrieves a column with the provided name expecting it to be of long type.
Retrieves a column with the provided name expecting it to be of long type.
- colName
the name of the column to find.
- location
error location.
- returns
the long column reference
- Definition Classes
- ColGetters
-
def
colMap[K, V](colName: String)(implicit arg0: SparkType[K], arg1: SparkType[V], location: Location): NamedDoricColumn[Map[K, V]]
Retrieves a column with the provided name expecting it to be of map type.
Retrieves a column with the provided name expecting it to be of map type.
- K
the type of the keys of the map.
- V
the type of the values of the map.
- colName
the name of the column to find.
- location
error location.
- returns
the map column reference.
- Definition Classes
- ColGetters
-
def
colMapString[V](colName: String)(implicit arg0: SparkType[V], location: Location): NamedDoricColumn[Map[String, V]]
Retrieves a column with the provided name expecting it to be of map type.
Retrieves a column with the provided name expecting it to be of map type.
- V
the type of the values of the map.
- colName
the name of the column to find.
- location
error location.
- returns
the map column reference.
- Definition Classes
- ColGetters
-
def
colNull(colName: String)(implicit location: Location): NamedDoricColumn[Null]
Retrieves a column with the provided name expecting it to be of null type.
Retrieves a column with the provided name expecting it to be of null type.
- colName
the name of the column to find.
- location
error location.
- returns
the null column reference
- Definition Classes
- ColGetters
-
def
colString(colName: String)(implicit location: Location): NamedDoricColumn[String]
Retrieves a column with the provided name expecting it to be of string type.
Retrieves a column with the provided name expecting it to be of string type.
- colName
the name of the column to find.
- location
error location.
- returns
the string column reference
- Definition Classes
- ColGetters
-
def
colStruct(colName: String)(implicit location: Location): NamedDoricColumn[Row]
Retrieves a column with the provided name expecting it to be of struct type.
Retrieves a column with the provided name expecting it to be of struct type.
- colName
the name of the column to find.
- location
error location.
- returns
the struct column reference.
- Definition Classes
- ColGetters
-
def
colTimestamp(colName: String)(implicit location: Location): NamedDoricColumn[Timestamp]
Retrieves a column with the provided name expecting it to be of Timestamp type.
Retrieves a column with the provided name expecting it to be of Timestamp type.
- colName
the name of the column to find.
- location
error location.
- returns
the Timestamp column reference
- Definition Classes
- ColGetters
-
def
collectList[T](col: DoricColumn[T]): ArrayColumn[T]
Aggregate function: returns a list of objects with duplicates.
Aggregate function: returns a list of objects with duplicates.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
collectSet[T](col: DoricColumn[T]): ArrayColumn[T]
Aggregate function: returns a set of objects with duplicate elements eliminated.
Aggregate function: returns a set of objects with duplicate elements eliminated.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because the order of collected results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
concat(cols: StringColumn*): StringColumn
Concatenate string columns to form a single one
Concatenate string columns to form a single one
- cols
the String DoricColumns to concatenate
- returns
a reference of a single DoricColumn with all strings concatenated. If at least one is null will return null.
- Definition Classes
- StringColumns
- See also
-
def
concatArrays[T, F[_]](cols: DoricColumn[F[T]]*)(implicit arg0: CollectionType[F]): DoricColumn[F[T]]
Concatenates multiple array columns together into a single column.
Concatenates multiple array columns together into a single column.
- T
The type of the elements of the arrays.
- cols
the array columns, must be Arrays of the same type.
- returns
Doric Column with the concatenation.
- Definition Classes
- ArrayColumns
- See also
-
def
concatBinary(col: BinaryColumn, cols: BinaryColumn*): BinaryColumn
Concatenates multiple binary columns together into a single column.
Concatenates multiple binary columns together into a single column.
- col
the first binary column
- cols
the binary columns
- returns
Doric Column with the concatenation.
- Definition Classes
- BinaryColumns
- See also
-
def
concatMaps[K, V](col: MapColumn[K, V], cols: MapColumn[K, V]*): MapColumn[K, V]
Returns the union of all the given maps.
Returns the union of all the given maps.
- Definition Classes
- MapColumns
- See also
-
def
concatWs(sep: StringColumn, cols: StringColumn*): StringColumn
Concatenates multiple input string columns together into a single string column, using the given separator.
Concatenates multiple input string columns together into a single string column, using the given separator.
- Definition Classes
- StringColumns
df.withColumn("res", concatWs("-".lit, col("col1"), col("col2"))) .show(false) +----+----+----+ |col1|col2| res| +----+----+----+ | 1| 1| 1-1| |null| 2| 2| | 3|null| 3| |null|null| | +----+----+----+
- Note
even if
cols
contain null columns, it prints remaining string columns (or empty string).- See also
Example: -
def
correlation(col1: DoubleColumn, col2: DoubleColumn): DoubleColumn
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
Aggregate function: returns the Pearson Correlation Coefficient for two columns.
- Definition Classes
- AggregationColumns
- See also
-
def
count(colName: CName): LongColumn
Aggregate function: returns the number of items in a group.
Aggregate function: returns the number of items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
count(col: DoricColumn[_]): LongColumn
Aggregate function: returns the number of items in a group.
Aggregate function: returns the number of items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
countDistinct(columnName: CName, columnNames: CName*): LongColumn
Aggregate function: returns the number of distinct items in a group.
Aggregate function: returns the number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
countDistinct(expr: DoricColumn[_], exprs: DoricColumn[_]*): LongColumn
Aggregate function: returns the number of distinct items in a group.
Aggregate function: returns the number of distinct items in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
covarPop(col1: DoubleColumn, col2: DoubleColumn): DoubleColumn
Aggregate function: returns the population covariance for two columns.
Aggregate function: returns the population covariance for two columns.
- Definition Classes
- AggregationColumns
- See also
-
def
covarSamp(col1: DoubleColumn, col2: DoubleColumn): DoubleColumn
Aggregate function: returns the sample covariance for two columns.
Aggregate function: returns the sample covariance for two columns.
- Definition Classes
- AggregationColumns
- See also
-
def
currentDate(): DateColumn
Returns the current date at the start of query evaluation as a date column.
Returns the current date at the start of query evaluation as a date column. All calls of current_date within the same query return the same value.
- Definition Classes
- DateColumns
- See also
-
def
currentDateT[T]()(implicit arg0: DateType[T], arg1: SparkType[T]): DoricColumn[T]
Returns the current date at the start of query evaluation as a date column typed with the provided T.
Returns the current date at the start of query evaluation as a date column typed with the provided T. All calls of current_date within the same query return the same value.
- Definition Classes
- DateColumns
- See also
-
def
currentTimestamp(): TimestampColumn
Returns the current timestamp at the start of query evaluation as a timestamp column.
Returns the current timestamp at the start of query evaluation as a timestamp column. All calls of current_timestamp within the same query return the same value.
- Definition Classes
- TimestampColumns
- See also
-
def
currentTimestampT[T]()(implicit arg0: TimestampType[T], arg1: SparkType[T]): DoricColumn[T]
Returns the current timestamp at the start of query evaluation as a timestamp column.
Returns the current timestamp at the start of query evaluation as a timestamp column. All calls of current_timestamp within the same query return the same value.
- Definition Classes
- TimestampColumns
- See also
-
def
first[T](col: DoricColumn[T], ignoreNulls: Boolean): DoricColumn[T]
Aggregate function: returns the first value in a group.
Aggregate function: returns the first value in a group.
The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because its results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
first[T](col: DoricColumn[T]): DoricColumn[T]
Aggregate function: returns the first value in a group.
Aggregate function: returns the first value in a group.
The function by default returns the first values it sees. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because its results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
formatString(format: StringColumn, arguments: DoricColumn[_]*): StringColumn
Formats the arguments in printf-style and returns the result as a string column.
Formats the arguments in printf-style and returns the result as a string column.
- format
Printf format
- arguments
the String DoricColumns to format
- returns
Formats the arguments in printf-style and returns the result as a string column.
- Definition Classes
- StringColumns
- See also
-
def
greatest[T](col: DoricColumn[T], cols: DoricColumn[T]*): DoricColumn[T]
Returns the greatest value of the list of values, skipping null values.
Returns the greatest value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
- Definition Classes
- CommonColumns
- Note
skips null values
- See also
-
def
grouping(columnName: CName): ByteColumn
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
- Definition Classes
- AggregationColumns
- See also
-
def
grouping(col: DoricColumn[_]): ByteColumn
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
Aggregate function: indicates whether a specified column in a GROUP BY list is aggregated or not, returns 1 for aggregated or 0 for not aggregated in the result set.
- Definition Classes
- AggregationColumns
- See also
-
def
groupingId(colName: CName, colNames: CName*): LongColumn
Aggregate function: returns the level of grouping, equals to
Aggregate function: returns the level of grouping, equals to
- Definition Classes
- AggregationColumns
(grouping(c1) <<; (n-1)) + (grouping(c2) <<; (n-2)) + ... + grouping(cn)
- Note
The list of columns should match with grouping columns exactly, or empty (means all the grouping columns).
- See also
Example: -
def
groupingId(col: DoricColumn[_], cols: DoricColumn[_]*): LongColumn
Aggregate function: returns the level of grouping, equals to
Aggregate function: returns the level of grouping, equals to
- Definition Classes
- AggregationColumns
(grouping(c1) <<; (n-1)) + (grouping(c2) <<; (n-2)) + ... + grouping(cn)
- Note
The list of columns should match with grouping columns exactly, or empty (means all the grouping columns).
- See also
Example: -
def
hash(cols: DoricColumn[_]*): IntegerColumn
Calculates the hash code of given columns, and returns the result as an integer column.
Calculates the hash code of given columns, and returns the result as an integer column.
- Definition Classes
- CommonColumns
- See also
-
def
inputFileName(): StringColumn
Creates a string column for the file name of the current Spark task.
Creates a string column for the file name of the current Spark task.
- Definition Classes
- StringColumns
- See also
-
def
kurtosis(col: DoubleColumn): DoubleColumn
Aggregate function: returns the kurtosis of the values in a group.
Aggregate function: returns the kurtosis of the values in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
last[T](col: DoricColumn[T], ignoreNulls: Boolean): DoricColumn[T]
Aggregate function: returns the last value in a group.
Aggregate function: returns the last value in a group.
The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because its results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
last[T](col: DoricColumn[T]): DoricColumn[T]
Aggregate function: returns the last value in a group.
Aggregate function: returns the last value in a group.
The function by default returns the last values it sees. It will return the last non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
- Definition Classes
- AggregationColumns
- Note
The function is non-deterministic because its results depends on the order of the rows which may be non-deterministic after a shuffle.
- See also
-
def
least[T](col: DoricColumn[T], cols: DoricColumn[T]*): DoricColumn[T]
Returns the least value of the list of values, skipping null values.
Returns the least value of the list of values, skipping null values. This function takes at least 2 parameters. It will return null iff all parameters are null.
- Definition Classes
- CommonColumns
- Note
skips null values
- See also
-
def
list[T](cols: DoricColumn[T]*): DoricColumn[List[T]]
Creates a new list column.
Creates a new list column. The input columns must all have the same data type.
- Definition Classes
- ArrayColumns
- To do
scaladoc link (issue #135)
- See also
org.apache.spark.sql.functions.array
-
def
lit[L](litv: L)(implicit arg0: SparkType[L], arg1: LiteralSparkType[L], l: Location): LiteralDoricColumn[L]
Creates a literal with the provided value.
Creates a literal with the provided value.
- L
The type of the literal.
- litv
the element to create as a literal.
- returns
A doric column that represent the literal value and the same type as the value.
- Definition Classes
- LiteralConversions
-
def
map[K, V](first: (DoricColumn[K], DoricColumn[V]), rest: (DoricColumn[K], DoricColumn[V])*): MapColumn[K, V]
Creates a new map column.
Creates a new map column. The input is formed by tuples of key and the corresponding value.
- K
the type of the keys of the Map
- V
the type of the values of the Map
- first
a pair of key value DoricColumns
- rest
the rest of pairs of key and corresponding Values.
- returns
the DoricColumn of the corresponding Map type
- Definition Classes
- MapColumns
- See also
-
def
mapFromArrays[K, V](keys: DoricColumn[Array[K]], values: DoricColumn[Array[V]]): MapColumn[K, V]
Creates a new map column.
Creates a new map column. The array in the first column is used for keys. The array in the second column is used for values. All elements in the array for key should not be null.
- K
the type of the Array elements of the keys.
- V
the type of the Array elements of the value.
- keys
the array to create the keys.
- values
the array to create the values.
- returns
an DoricColumn of type Map of the keys and values.
- Definition Classes
- MapColumns
- See also
-
def
matchToType[T](colName: String)(implicit arg0: SparkType[T]): EmptyTypeMatcher[T]
- Definition Classes
- TypeMatcher
-
def
max[T](col: DoricColumn[T]): DoricColumn[T]
Aggregate function: returns the maximum value of the expression in a group.
Aggregate function: returns the maximum value of the expression in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
mean[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the maximum value of the expression in a group.
Aggregate function: returns the maximum value of the expression in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
min[T](col: DoricColumn[T]): DoricColumn[T]
Aggregate function: returns the maximum value of the expression in a group.
Aggregate function: returns the maximum value of the expression in a group.
- Definition Classes
- AggregationColumns
- See also
- lazy val minorScalaVersion: Int
-
def
monotonicallyIncreasingId(): LongColumn
A column expression that generates monotonically increasing 64-bit integers.
A column expression that generates monotonically increasing 64-bit integers.
The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the record number within each partition in the lower 33 bits. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.
- Definition Classes
- NumericColumns
consider a
DataFrame
with two partitions, each with 3 records. This expression would return the following IDs:0, 1, 2, 8589934592 (1L << 33), 8589934593, 8589934594.
- See also
Example: -
def
not(col: BooleanColumn): BooleanColumn
Inversion of boolean expression, i.e.
Inversion of boolean expression, i.e. NOT.
- Definition Classes
- BooleanColumns
- See also
-
def
orAgg(col: BooleanColumn): BooleanColumn
Aggregate function: returns the OR value for a boolean column
Aggregate function: returns the OR value for a boolean column
- Definition Classes
- AggregationColumns
-
def
random(seed: LongColumn): DoubleColumn
Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
- Definition Classes
- NumericColumns
- Note
The function is non-deterministic in general case.
- See also
-
def
random(): DoubleColumn
Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
Generate a random column with independent and identically distributed (i.i.d.) samples uniformly distributed in [0.0, 1.0).
- Definition Classes
- NumericColumns
- Note
The function is non-deterministic in general case.
- See also
-
def
randomN(seed: LongColumn): DoubleColumn
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
- Definition Classes
- NumericColumns
- Note
The function is non-deterministic in general case.
- See also
-
def
randomN(): DoubleColumn
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
Generate a column with independent and identically distributed (i.i.d.) samples from the standard normal distribution.
- Definition Classes
- NumericColumns
- Note
The function is non-deterministic in general case.
- See also
-
def
skewness[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the skewness of the values in a group.
Aggregate function: returns the skewness of the values in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
sparkAgg(relationalGroupedDataset: RelationalGroupedDataset, expr: DoricColumn[_], exprs: DoricColumn[_]*): DoricValidated[DataFrame]
- Definition Classes
- RelationalGroupedDatasetDoricInterface
-
def
sparkCube(df: DataFrame, cols: DoricColumn[_]*): DoricValidated[RelationalGroupedDataset]
- Attributes
- protected
- Definition Classes
- RelationalGroupedDatasetDoricInterface
-
def
sparkGroupBy(df: DataFrame, cols: DoricColumn[_]*): DoricValidated[RelationalGroupedDataset]
- Attributes
- protected
- Definition Classes
- RelationalGroupedDatasetDoricInterface
-
def
sparkPartitionId(): IntegerColumn
Partition ID.
Partition ID.
- Definition Classes
- NumericColumns
- Note
This is non-deterministic because it depends on data partitioning and task scheduling.
- See also
-
def
sparkPivot[T](relationalGroupedDataset: RelationalGroupedDataset, expr: DoricColumn[T], values: Seq[T]): DoricValidated[RelationalGroupedDataset]
- Definition Classes
- RelationalGroupedDatasetDoricInterface
-
def
sparkRollup(df: DataFrame, cols: DoricColumn[_]*): DoricValidated[RelationalGroupedDataset]
- Attributes
- protected
- Definition Classes
- RelationalGroupedDatasetDoricInterface
-
def
sparkTaskName(): StringColumn
Creates a string column for the file name of the current Spark task.
Creates a string column for the file name of the current Spark task.
- Definition Classes
- StringColumns
- Annotations
- @inline()
- See also
-
def
stdDev[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: alias for
stddev_samp
.Aggregate function: alias for
stddev_samp
.- Definition Classes
- AggregationColumns
- See also
-
def
stdDevPop[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the population standard deviation of the expression in a group.
Aggregate function: returns the population standard deviation of the expression in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
stdDevSamp[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the sample standard deviation of the expression in a group.
Aggregate function: returns the sample standard deviation of the expression in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
struct(cols: DoricColumn[_]*): RowColumn
Creates a struct with the columns
Creates a struct with the columns
- cols
the columns that will form the struct
- returns
A DStruct DoricColumn.
- Definition Classes
- DStructs
-
def
sum[T](col: DoricColumn[T])(implicit nt: NumericType[T]): DoricColumn[Sum]
Aggregate function: returns the sum of all values in the expression.
Aggregate function: returns the sum of all values in the expression.
- Definition Classes
- AggregationColumns
- See also
-
def
sumDistinct[T](col: DoricColumn[T])(implicit nt: NumericType[T]): DoricColumn[Sum]
Aggregate function: returns the sum of distinct values in the expression.
Aggregate function: returns the sum of distinct values in the expression.
- Definition Classes
- AggregationColumns
- See also
-
def
unixTimestamp(): LongColumn
Returns the current Unix timestamp (in seconds) as a long.
Returns the current Unix timestamp (in seconds) as a long.
- Definition Classes
- NumericColumns
- Note
All calls of
unix_timestamp
within the same query return the same value (i.e. the current timestamp is calculated at the start of query evaluation).- See also
-
def
varPop[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the population variance of the values in a group.
Aggregate function: returns the population variance of the values in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
varSamp[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: returns the unbiased variance of the values in a group.
Aggregate function: returns the unbiased variance of the values in a group.
- Definition Classes
- AggregationColumns
- See also
-
def
variance[T](col: DoricColumn[T])(implicit arg0: NumericType[T]): DoubleColumn
Aggregate function: alias for
var_samp
.Aggregate function: alias for
var_samp
.- Definition Classes
- AggregationColumns
- See also
-
def
when[T]: WhenBuilder[T]
Initialize a when builder
Initialize a when builder
- T
the type of the returnign DoricColumn
- returns
WhenBuilder instance to add the required logic.
- Definition Classes
- ControlStructures
-
def
xxhash64(cols: DoricColumn[_]*): LongColumn
Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.
Calculates the hash code of given columns using the 64-bit variant of the xxHash algorithm, and returns the result as a long column.
- Definition Classes
- CommonColumns3x
- See also
- object Asc extends Order
- object AscNullsFirst extends Order
- object AscNullsLast extends Order
- object CName extends Serializable
- object CNameOrd extends Serializable
- object Desc extends Order
- object DescNullsFirst extends Order
- object DescNullsLast extends Order
- object Doric
- object DoricColumn extends ColGetters[NamedDoricColumn]
- object LeftDF extends ColGetters[LeftDoricColumn]
- object LiteralDoricColumn extends Serializable
- object NamedDoricColumn extends Serializable
- object RightDF extends ColGetters[RightDoricColumn]
-
object
row extends Dynamic
The object
row
stands for the top-level row of the DataFrame.The object
row
stands for the top-level row of the DataFrame.- Definition Classes
- ColGetters
-
object
SelectorWithSparkType extends SelectorLPI
- Definition Classes
- DStructs