Package

org.apache.spark.sql

expressions

package expressions

Type Members

  1. abstract class Aggregator[-I, B, O] extends Serializable

    A base class for user-defined aggregations, which can be used in DataFrame and Dataset operations to take all of the elements of a group and reduce them to a single value.

    For example, the following aggregator extracts an int from a specific class and adds them up:

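    import org.apache.spark.sql.Dataset
    import org.apache.spark.sql.expressions.Aggregator
    
    case class Data(i: Int)
    
    // Sums the `i` field of every Data element in a group.
    // toColumn takes its encoders implicitly, so an Encoder[Int] must be in
    // scope (for example via sqlContext.implicits._) and no parentheses are used.
    val customSummer = new Aggregator[Data, Int, Int] {
      def zero: Int = 0                            // initial buffer value
      def reduce(b: Int, a: Data): Int = b + a.i   // fold one input into the buffer
      def merge(b1: Int, b2: Int): Int = b1 + b2   // combine two partial buffers
      def finish(r: Int): Int = r                  // buffer to final output
    }.toColumn
    
    val ds: Dataset[Data] = ...
    val aggregated = ds.select(customSummer)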

    Based loosely on Aggregator from Algebird: https://github.com/twitter/algebird

    I

    The input type for the aggregation.

    B

    The type of the intermediate value of the reduction.

    O

    The type of the final output result.

    Since

    1.6.0
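
The way the four methods relate to the type parameters I, B and O can be sketched without Spark. The trait `SimpleAggregator` and the helper `run` below are hypothetical stand-ins written for illustration, not Spark classes, and the partition-then-merge order is a simplifying assumption about how a distributed reduction is typically evaluated.

```scala
// Hypothetical stand-in for the Aggregator contract; this is NOT Spark's class.
trait SimpleAggregator[I, B, O] {
  def zero: B                  // initial intermediate value
  def reduce(b: B, a: I): B    // fold one input element into a buffer
  def merge(b1: B, b2: B): B   // combine two partial buffers
  def finish(r: B): O          // turn the final buffer into the output
}

object AggregatorSketch {
  case class Data(i: Int)

  val summer = new SimpleAggregator[Data, Int, Int] {
    def zero: Int = 0
    def reduce(b: Int, a: Data): Int = b + a.i
    def merge(b1: Int, b2: Int): Int = b1 + b2
    def finish(r: Int): Int = r
  }

  // Assumed evaluation order: fold each partition starting from zero,
  // merge the partial buffers, then call finish once.
  def run[I, B, O](agg: SimpleAggregator[I, B, O], partitions: Seq[Seq[I]]): O = {
    val partials = partitions.map(_.foldLeft(agg.zero)(agg.reduce))
    agg.finish(partials.foldLeft(agg.zero)(agg.merge))
  }

  // Three simulated partitions, one of them empty.
  val total: Int = run(summer, Seq(Seq(Data(1), Data(2)), Seq(Data(3)), Seq.empty))
}
```

Because empty partitions contribute `zero`, `zero` should be an identity for `merge`; here `AggregatorSketch.total` is 6.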

  2. abstract class MutableAggregationBuffer extends Row

    :: Experimental :: A Row representing a mutable aggregation buffer.

    This is not meant to be extended outside of Spark.

    Annotations
    @Experimental()
  3. abstract class UserDefinedAggregateFunction extends Serializable

    :: Experimental :: The base class for implementing user-defined aggregate functions (UDAF).

    Annotations
    @Experimental()
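
A minimal sketch of a UDAF, assuming the abstract members of `UserDefinedAggregateFunction` as they appear in Spark 1.x (`inputSchema`, `bufferSchema`, `dataType`, `deterministic`, `initialize`, `update`, `merge`, `evaluate`). The class name `LongSum` and the field names are hypothetical. It also shows where a `MutableAggregationBuffer` is handed to user code.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types.{DataType, LongType, StructField, StructType}

// Hypothetical example: sums a single Long column, skipping nulls.
class LongSum extends UserDefinedAggregateFunction {
  // One Long input column and a one-field Long buffer.
  def inputSchema: StructType = StructType(StructField("value", LongType) :: Nil)
  def bufferSchema: StructType = StructType(StructField("sum", LongType) :: Nil)
  def dataType: DataType = LongType
  def deterministic: Boolean = true

  // Start every buffer at zero.
  def initialize(buffer: MutableAggregationBuffer): Unit = {
    buffer(0) = 0L
  }

  // Fold one input row into the buffer.
  def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    if (!input.isNullAt(0)) {
      buffer(0) = buffer.getLong(0) + input.getLong(0)
    }
  }

  // Combine a second partial buffer into the first.
  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit = {
    buffer1(0) = buffer1.getLong(0) + buffer2.getLong(0)
  }

  // Produce the final value from the buffer.
  def evaluate(buffer: Row): Any = buffer.getLong(0)
}
```

An instance can then be applied to columns inside an aggregation, for example `df.groupBy("k").agg(new LongSum()(df("v")))`, where `df`, `k` and `v` are placeholder names.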
  4. class Window extends AnyRef

    :: Experimental :: Utility functions for defining windows in DataFrames.

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)
    
    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
    Annotations
    @Experimental()
    Since

    1.4.0
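
What a ROWS frame selects can be illustrated in plain Scala. The sketch below is not Spark code: `framedSum` is a hypothetical helper that walks one already-ordered partition and sums the rows whose offsets fall between `start` and `end` relative to the current row. It treats `Long.MinValue` as an unbounded start, as in the example above, and assumes `Long.MaxValue` plays the same role for an unbounded end.

```scala
object RowFrameSketch {
  // For each row i of one ordered partition, sum the rows in the frame
  // [i + start, i + end], clamped to the partition boundaries.
  // Long.MinValue / Long.MaxValue stand for an unbounded start / end (assumption).
  def framedSum(values: Vector[Long], start: Long, end: Long): Vector[Long] = {
    val last = (values.length - 1).toLong
    values.indices.toVector.map { i =>
      val lo = if (start == Long.MinValue) 0L else math.max(0L, i + start)
      val hi = if (end == Long.MaxValue) last else math.min(last, i + end)
      // An empty frame (lo > hi) sums to 0 here; SQL would yield NULL instead.
      (lo to hi).map(k => values(k.toInt)).sum
    }
  }

  // ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: a running total.
  val running: Vector[Long] = framedSum(Vector(1L, 2L, 3L, 4L), Long.MinValue, 0)

  // ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING: a centered three-row sum.
  val centered: Vector[Long] = framedSum(Vector(1L, 2L, 3L, 4L), -1, 1)
}
```

With the input 1, 2, 3, 4 the running frame gives 1, 3, 6, 10 and the centered frame gives 3, 6, 9, 7, since frames at the edges of the partition are simply cut short.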

  5. class WindowSpec extends AnyRef

    :: Experimental :: A window specification that defines the partitioning, ordering, and frame boundaries.

    Use the static methods in Window to create a WindowSpec.

    Annotations
    @Experimental()
    Since

    1.4.0
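
A short sketch of building a WindowSpec through Window and attaching it to an aggregate with `Column.over`. The object name, the helper `withRunningTotal`, and the `amount` and `running_total` column names are hypothetical. Spark 1.x releases required a HiveContext for window functions, so `df` is assumed to come from one.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.expressions.{Window, WindowSpec}
import org.apache.spark.sql.functions.sum

object WindowUsageSketch {
  // PARTITION BY country ORDER BY date
  // ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  val runningByCountry: WindowSpec =
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)

  // Assumes df has "country", "date" and a numeric "amount" column
  // ("amount" and "running_total" are hypothetical names).
  def withRunningTotal(df: DataFrame): DataFrame =
    df.withColumn("running_total", sum("amount").over(runningByCountry))
}
```

The WindowSpec itself is only a description; nothing is computed until the resulting DataFrame is evaluated.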

Value Members

  1. object Window

    :: Experimental :: Utility functions for defining windows in DataFrames.

    // PARTITION BY country ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    Window.partitionBy("country").orderBy("date").rowsBetween(Long.MinValue, 0)
    
    // PARTITION BY country ORDER BY date ROWS BETWEEN 3 PRECEDING AND 3 FOLLOWING
    Window.partitionBy("country").orderBy("date").rowsBetween(-3, 3)
    Annotations
    @Experimental()
    Since

    1.4.0
