Class Reducer.Builder

  • Enclosing class:
    Reducer

    public static class Reducer.Builder
    extends Object
    • Constructor Detail

      • Builder

        public Builder​(ReduceOp defaultOp)
        Create a Reducer builder and set the default column reduction operation. Any column without an explicitly specified reduction uses this default. A reduction specified explicitly for a column overrides the default given here.
        Parameters:
        defaultOp - Default reduction operation to perform
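        The default-plus-override rule described above can be sketched in plain Java. This is only an illustration of the lookup logic, not the library's implementation: the class DefaultOpSketch, the enum Op, and the method resolve are hypothetical names introduced for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: none of these names belong to the Reducer API.
class DefaultOpSketch {
    // Stand-in for a few reduction operations.
    enum Op { TAKE_FIRST, SUM, MAX }

    // For each column, an explicitly specified operation wins; otherwise the default applies.
    static Map<String, Op> resolve(List<String> columns, Op defaultOp, Map<String, Op> explicit) {
        Map<String, Op> resolved = new LinkedHashMap<>();
        for (String column : columns) {
            resolved.put(column, explicit.getOrDefault(column, defaultOp));
        }
        return resolved;
    }

    public static void main(String[] args) {
        Map<String, Op> ops = resolve(
                Arrays.asList("city", "amount", "score"),
                Op.TAKE_FIRST,
                Collections.singletonMap("amount", Op.SUM));
        // "amount" keeps its explicit SUM; the other columns fall back to TAKE_FIRST.
        System.out.println(ops);
    }
}
```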
    • Method Detail

      • keyColumns

        public Reducer.Builder keyColumns​(String... keyColumns)
        Specify the key columns. Multiple columns can be combined into a (potentially compound) key, built from the toString representation of the values in those columns.
        Parameters:
        keyColumns - Columns that will make up the key
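        A minimal sketch of how a compound key could be formed from the toString representation of several column values, and how rows might then be grouped by it. The class CompoundKeySketch, its methods, and the "_" separator are assumptions made for illustration; the source does not state how the library joins the values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: not the Reducer implementation.
class CompoundKeySketch {
    // Join the toString() of each key column value. The "_" separator is an assumption.
    static String compoundKey(List<Object> row, int[] keyIndices) {
        StringBuilder key = new StringBuilder();
        for (int i = 0; i < keyIndices.length; i++) {
            if (i > 0) {
                key.append('_');
            }
            key.append(row.get(keyIndices[i]).toString());
        }
        return key.toString();
    }

    // Group rows that share the same compound key, preserving first-seen order.
    static Map<String, List<List<Object>>> groupByKey(List<List<Object>> rows, int[] keyIndices) {
        Map<String, List<List<Object>>> groups = new LinkedHashMap<>();
        for (List<Object> row : rows) {
            groups.computeIfAbsent(compoundKey(row, keyIndices), k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(Arrays.<Object>asList("NY", 2020, 5.0));
        rows.add(Arrays.<Object>asList("NY", 2020, 7.5));
        rows.add(Arrays.<Object>asList("LA", 2021, 1.0));
        // Columns 0 and 1 together form the key.
        System.out.println(groupByKey(rows, new int[]{0, 1}).keySet());
    }
}
```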
      • minColumns

        public Reducer.Builder minColumns​(String... columns)
        Reduce the specified columns by taking the minimum value
      • maxColumn

        public Reducer.Builder maxColumn​(String... columns)
        Reduce the specified columns by taking the maximum value
      • sumColumns

        public Reducer.Builder sumColumns​(String... columns)
        Reduce the specified columns by taking the sum of values
      • prodColumns

        public Reducer.Builder prodColumns​(String... columns)
        Reduce the specified columns by taking the product of values
      • meanColumns

        public Reducer.Builder meanColumns​(String... columns)
        Reduce the specified columns by taking the mean of the values
      • stdevColumns

        public Reducer.Builder stdevColumns​(String... columns)
        Reduce the specified columns by taking the standard deviation of the values
      • uncorrectedStdevColumns

        public Reducer.Builder uncorrectedStdevColumns​(String... columns)
        Reduce the specified columns by taking the uncorrected standard deviation of the values
      • variance

        public Reducer.Builder variance​(String... columns)
        Reduce the specified columns by taking the variance of the values
      • populationVariance

        public Reducer.Builder populationVariance​(String... columns)
        Reduce the specified columns by taking the population variance of the values
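        The four spread statistics above differ only in the divisor. The sketch below assumes the usual statistical convention: the plain variance and standard deviation apply Bessel's correction (divide by n - 1), while the "population" variance and "uncorrected" standard deviation divide by n. The class VarianceSketch and its method names are hypothetical, and the source does not confirm which convention the library follows.

```java
// Hypothetical illustration of corrected vs. uncorrected spread statistics.
class VarianceSketch {
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sum of squared deviations from the mean, shared by all four statistics.
    static double sumSquaredDeviations(double[] values) {
        double m = mean(values);
        double ss = 0.0;
        for (double v : values) {
            ss += (v - m) * (v - m);
        }
        return ss;
    }

    // Corrected (sample) variance: divide by n - 1. Requires at least two values.
    static double sampleVariance(double[] values) {
        return sumSquaredDeviations(values) / (values.length - 1);
    }

    // Population variance: divide by n.
    static double populationVariance(double[] values) {
        return sumSquaredDeviations(values) / values.length;
    }

    static double sampleStdev(double[] values) {
        return Math.sqrt(sampleVariance(values));
    }

    static double uncorrectedStdev(double[] values) {
        return Math.sqrt(populationVariance(values));
    }

    public static void main(String[] args) {
        double[] values = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println("sample variance:     " + sampleVariance(values));
        System.out.println("population variance: " + populationVariance(values));
        System.out.println("sample stdev:        " + sampleStdev(values));
        System.out.println("uncorrected stdev:   " + uncorrectedStdev(values));
    }
}
```

        For small groups the two forms can differ noticeably, so it is worth checking which one a downstream consumer expects.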
      • countColumns

        public Reducer.Builder countColumns​(String... columns)
        Reduce the specified columns by counting the number of values
      • rangeColumns

        public Reducer.Builder rangeColumns​(String... columns)
        Reduce the specified columns by taking the range (max-min) of the values
      • countUniqueColumns

        public Reducer.Builder countUniqueColumns​(String... columns)
        Reduce the specified columns by counting the number of unique values
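        As a small illustration of the range and unique-count reductions described above, the following plain-Java sketch computes max - min over a group and counts distinct values. RangeAndUniqueSketch and its methods are hypothetical names, not part of the library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration: not the Reducer implementation.
class RangeAndUniqueSketch {
    // Range is max - min. Assumes a non-empty array.
    static double range(double[] values) {
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return max - min;
    }

    // Number of distinct values, using equals/hashCode for comparison.
    static int countUnique(List<String> values) {
        return new HashSet<>(values).size();
    }

    public static void main(String[] args) {
        System.out.println(range(new double[]{3, 9, 4}));
        System.out.println(countUnique(Arrays.asList("a", "b", "a")));
    }
}
```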
      • takeFirstColumns

        public Reducer.Builder takeFirstColumns​(String... columns)
        Reduce the specified columns by taking the first value
      • takeLastColumns

        public Reducer.Builder takeLastColumns​(String... columns)
        Reduce the specified columns by taking the last value
      • appendColumns

        public Reducer.Builder appendColumns​(String... columns)
        Reduce the specified columns by taking the concatenation of all content. Beware, the output will be huge!
      • prependColumns

        public Reducer.Builder prependColumns​(String... columns)
        Reduce the specified columns by taking the concatenation of all content in reverse order. Beware, the output will be huge!
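        The difference between appending and prepending can be sketched as follows. The sketch assumes values are joined with no separator, which the source does not specify; ConcatSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: not the Reducer implementation.
class ConcatSketch {
    // Concatenate values in the order they are encountered.
    static String append(List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            sb.append(v);
        }
        return sb.toString();
    }

    // Put each new value in front of what has been accumulated, giving reverse order.
    static String prepend(List<String> values) {
        StringBuilder sb = new StringBuilder();
        for (String v : values) {
            sb.insert(0, v);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("ab", "cd", "ef");
        System.out.println(append(values));
        System.out.println(prepend(values));
    }
}
```

        In either case the result has the combined length of every value in the group, which is why the output can grow very large.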
      • customReduction

        public Reducer.Builder customReduction​(String column,
                                               AggregableColumnReduction columnReduction)
        Reduce the specified column using a custom column reduction.
        Parameters:
        column - Column to execute the custom reduction functionality on
        columnReduction - Column reduction to execute on that column
      • conditionalReduction

        public Reducer.Builder conditionalReduction​(String column,
                                                    List<String> outputNames,
                                                    List<ReduceOp> reductions,
                                                    Condition condition)
        Conditional reduction: apply the reductions to the specified column, where each reduction is computed *only* over those examples for which the condition returns true. Examples for which the condition returns false are excluded.
        Parameters:
        column - Name of the column to execute the conditional reduction on
        outputNames - Names of the columns, after the reductions have been executed
        reductions - Reductions to execute
        condition - Condition to use in the reductions
      • conditionalReduction

        public Reducer.Builder conditionalReduction​(String column,
                                                    String outputName,
                                                    ReduceOp reduction,
                                                    Condition condition)
        Conditional reduction: apply the reduction to the specified column, where the reduction is computed *only* over those examples for which the condition returns true. Examples for which the condition returns false are excluded.
        Parameters:
        column - Name of the column to execute the conditional reduction on
        outputName - Name of the column, after the reduction has been executed
        reduction - Reduction to execute
        condition - Condition to use in the reductions
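        The idea behind a conditional reduction, filtering first and reducing only what passes, can be sketched in plain Java. This is an illustration under stated assumptions, not the library's code: ConditionalReductionSketch and its methods are hypothetical, and a java.util.function.DoublePredicate stands in for the Condition type.

```java
import java.util.function.DoublePredicate;

// Hypothetical illustration: not the Reducer implementation.
class ConditionalReductionSketch {
    // Sum only the values for which the condition returns true.
    static double conditionalSum(double[] values, DoublePredicate condition) {
        double sum = 0.0;
        for (double v : values) {
            if (condition.test(v)) {
                sum += v;
            }
        }
        return sum;
    }

    // Count only the values for which the condition returns true.
    static long conditionalCount(double[] values, DoublePredicate condition) {
        long count = 0;
        for (double v : values) {
            if (condition.test(v)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        double[] values = {1, -2, 3, -4};
        DoublePredicate positive = v -> v > 0;
        // Negative values are excluded from both reductions.
        System.out.println(conditionalSum(values, positive));
        System.out.println(conditionalCount(values, positive));
    }
}
```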
      • setIgnoreInvalid

        public Reducer.Builder setIgnoreInvalid​(String... columns)
        When doing the reduction, ignore any invalid values in the specified columns. A value is invalid if it is not valid according to the column's ColumnMetaData, as determined by ColumnMetaData.isValid(Writable). For numerical columns, this typically means the Writable cannot be parsed, for example Writable.toLong() failing for a Long column. Any restrictions on the column (min/max values, regex for Strings, etc.) are also taken into account.
        Parameters:
        columns - Columns to set 'ignore invalid' for
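        A minimal sketch of what ignoring invalid values can look like for a Long column with min/max restrictions. It uses raw strings and Long.parseLong in place of Writable and ColumnMetaData, so IgnoreInvalidSketch, isValidLong, and sumIgnoringInvalid are hypothetical stand-ins rather than library calls.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: not the Reducer implementation.
class IgnoreInvalidSketch {
    // A value is valid if it parses as a long and lies within [min, max].
    static boolean isValidLong(String raw, long min, long max) {
        try {
            long value = Long.parseLong(raw.trim());
            return value >= min && value <= max;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Sum the column, skipping any value that fails validation.
    static long sumIgnoringInvalid(List<String> rawValues, long min, long max) {
        long sum = 0;
        for (String raw : rawValues) {
            if (isValidLong(raw, min, max)) {
                sum += Long.parseLong(raw.trim());
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // "abc" cannot be parsed and "999" exceeds the maximum, so both are skipped.
        List<String> column = Arrays.asList("10", "abc", "20", "999");
        System.out.println(sumIgnoringInvalid(column, 0, 100));
    }
}
```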