TransformProcess.Builder |
TransformProcess.Builder.addConstantColumn(String newColumnName,
ColumnType newColumnType,
Writable fixedValue) |
Add a new column, where all values in the column are identical and as specified.
|
TransformProcess.Builder |
TransformProcess.Builder.addConstantDoubleColumn(String newColumnName,
double value) |
Add a new double column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
TransformProcess.Builder.addConstantIntegerColumn(String newColumnName,
int value) |
Add a new integer column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
TransformProcess.Builder.addConstantLongColumn(String newColumnName,
long value) |
Add a new long column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
TransformProcess.Builder.appendStringColumnTransform(String column,
String toAppend) |
Append a String to a specified column
|
TransformProcess.Builder |
TransformProcess.Builder.calculateSortedRank(String newColumnName,
String sortOnColumn,
WritableComparator comparator) |
CalculateSortedRank: calculate the rank of each example, after sorting the examples.
|
TransformProcess.Builder |
TransformProcess.Builder.calculateSortedRank(String newColumnName,
String sortOnColumn,
WritableComparator comparator,
boolean ascending) |
CalculateSortedRank: calculate the rank of each example, after sorting the examples.
|
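To make the idea of a sorted rank concrete, the following is a minimal plain-Java sketch rather than the DataVec implementation. The class name `SortedRankSketch`, the use of a `double[]` score column, and the choice of zero-based ranks are assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Plain-Java sketch of a sorted-rank calculation; not the DataVec implementation. */
final class SortedRankSketch {

    /**
     * Returns, for each example, its position after sorting by score.
     * Assumption of this sketch: ranks are zero-based.
     */
    static long[] sortedRank(double[] scores, boolean ascending) {
        // Sort the example indices by score instead of sorting the scores themselves.
        Integer[] order = new Integer[scores.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Comparator<Integer> byScore = Comparator.comparingDouble(i -> scores[i]);
        if (!ascending) {
            byScore = byScore.reversed();
        }
        Arrays.sort(order, byScore);

        // The position of an index in the sorted order is that example's rank.
        long[] rank = new long[scores.length];
        for (int position = 0; position < order.length; position++) {
            rank[order[position]] = position;
        }
        return rank;
    }

    public static void main(String[] args) {
        double[] scores = {0.3, 0.9, 0.1};
        System.out.println(Arrays.toString(sortedRank(scores, true)));
        System.out.println(Arrays.toString(sortedRank(scores, false)));
    }
}
```

In the builder method, the comparator argument plays the role of `byScore` here, and the `ascending` flag selects the sort direction.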
TransformProcess.Builder |
TransformProcess.Builder.categoricalToInteger(String... columnNames) |
Convert the specified column(s) from a categorical representation to an integer representation.
|
TransformProcess.Builder |
TransformProcess.Builder.categoricalToOneHot(String... columnNames) |
Convert the specified column(s) from a categorical representation to a one-hot representation.
|
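The difference between the integer and one-hot representations can be illustrated with a short plain-Java sketch. It does not use DataVec; `CategoricalEncodingSketch` and its methods are hypothetical names, and the assumption is that a category's integer index is its position in the list of state names.

```java
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of categorical encodings; not the DataVec implementation. */
final class CategoricalEncodingSketch {

    /** Integer representation: the index of the value in the list of state names. */
    static int toInteger(List<String> stateNames, String value) {
        int index = stateNames.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown category: " + value);
        }
        return index;
    }

    /** One-hot representation: one slot per state, with a 1 only at the value's index. */
    static int[] toOneHot(List<String> stateNames, String value) {
        int[] oneHot = new int[stateNames.size()];
        oneHot[toInteger(stateNames, value)] = 1;
        return oneHot;
    }

    public static void main(String[] args) {
        List<String> states = Arrays.asList("red", "green", "blue");
        System.out.println(toInteger(states, "green"));
        System.out.println(Arrays.toString(toOneHot(states, "green")));
    }
}
```

A one-hot encoding replaces one categorical column with as many columns as there are states, which is why the two conversions are offered separately.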
TransformProcess.Builder |
TransformProcess.Builder.conditionalCopyValueTransform(String columnToReplace,
String sourceColumn,
Condition condition) |
Replace the value in a specified column with a new value taken from another column, if a condition is satisfied/true.
Note that the condition can be any generic condition, including on other column(s), different to the column
that will be modified if the condition is satisfied/true.
|
TransformProcess.Builder |
TransformProcess.Builder.conditionalReplaceValueTransform(String column,
Writable newValue,
Condition condition) |
Replace the values in a specified column with a specified new value, if some condition holds.
|
TransformProcess.Builder |
TransformProcess.Builder.conditionalReplaceValueTransformWithDefault(String column,
Writable yesVal,
Writable noVal,
Condition condition) |
Replace the values in a specified column with a specified "yes" value, if some condition holds.
|
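The two conditional replace variants above can be sketched in plain Java as follows. This is an illustration of the semantics only, not DataVec code: `ConditionalReplaceSketch` is a hypothetical class, a `Predicate` stands in for `Condition`, and the behavior of the "no" value (used whenever the condition does not hold) is an assumption inferred from the parameter names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Plain-Java sketch of conditional value replacement; not the DataVec implementation. */
final class ConditionalReplaceSketch {

    /** Replaces a value with newValue when the condition holds; otherwise keeps it. */
    static <T> List<T> replaceIf(List<T> column, Predicate<T> condition, T newValue) {
        List<T> out = new ArrayList<>(column.size());
        for (T value : column) {
            out.add(condition.test(value) ? newValue : value);
        }
        return out;
    }

    /** Assumed semantics: yesVal when the condition holds, noVal in every other case. */
    static <T> List<T> replaceWithDefault(List<T> column, Predicate<T> condition, T yesVal, T noVal) {
        List<T> out = new ArrayList<>(column.size());
        for (T value : column) {
            out.add(condition.test(value) ? yesVal : noVal);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> column = Arrays.asList(1, -2, 3);
        System.out.println(replaceIf(column, (Integer x) -> x < 0, 0));
        System.out.println(replaceWithDefault(column, (Integer x) -> x < 0, 0, 1));
    }
}
```

The first variant leaves non-matching values untouched, while the second overwrites every value in the column with one of two constants.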
TransformProcess.Builder |
TransformProcess.Builder.convertFromSequence() |
Convert a sequence to a set of individual values (by treating each value in each sequence as a separate example)
|
TransformProcess.Builder |
TransformProcess.Builder.convertToDouble(String inputColumn) |
Convert the specified column to a double.
|
TransformProcess.Builder |
TransformProcess.Builder.convertToInteger(String inputColumn) |
Convert the specified column to an integer.
|
TransformProcess.Builder |
TransformProcess.Builder.convertToSequence() |
Convert a set of independent records/examples into a sequence; each example is simply treated as a sequence
of length 1, without any join/group operations.
|
TransformProcess.Builder |
TransformProcess.Builder.convertToSequence(String keyColumn,
SequenceComparator comparator) |
Convert a set of independent records/examples into a sequence, according to some key.
|
TransformProcess.Builder |
TransformProcess.Builder.convertToSequence(List<String> keyColumns,
SequenceComparator comparator) |
Convert a set of independent records/examples into a sequence, where each sequence is grouped according to
one or more key values (i.e., the values in one or more columns)
Within each sequence, values are ordered using the provided SequenceComparator
|
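Grouping independent records into sequences by key, then ordering within each sequence, can be sketched in plain Java as below. This is not the DataVec implementation: `GroupToSequenceSketch` and its `Row` type are hypothetical, a single string key stands in for the key column(s), and a `Comparator` stands in for `SequenceComparator`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of key-based sequence construction; not the DataVec implementation. */
final class GroupToSequenceSketch {

    /** A hypothetical record with a key column, a time column and a value column. */
    static final class Row {
        final String key;
        final long time;
        final double value;

        Row(String key, long time, double value) {
            this.key = key;
            this.time = time;
            this.value = value;
        }
    }

    /** Groups rows by key, then orders the rows inside each group with the comparator. */
    static Map<String, List<Row>> toSequences(List<Row> rows, Comparator<Row> withinSequenceOrder) {
        Map<String, List<Row>> sequences = new LinkedHashMap<>();
        for (Row row : rows) {
            sequences.computeIfAbsent(row.key, k -> new ArrayList<>()).add(row);
        }
        for (List<Row> sequence : sequences.values()) {
            sequence.sort(withinSequenceOrder);
        }
        return sequences;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", 2L, 20.0), new Row("b", 1L, 5.0), new Row("a", 1L, 10.0));
        Map<String, List<Row>> sequences = toSequences(rows, Comparator.comparingLong((Row r) -> r.time));
        for (Map.Entry<String, List<Row>> entry : sequences.entrySet()) {
            System.out.println(entry.getKey() + " has " + entry.getValue().size() + " step(s)");
        }
    }
}
```

The no-argument overload corresponds to the degenerate case in which every row becomes its own sequence of length 1, with no grouping step.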
TransformProcess.Builder |
TransformProcess.Builder.convertToString(String inputColumn) |
Convert the specified column to a string.
|
TransformProcess.Builder |
TransformProcess.Builder.doubleColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames) |
Calculate and add a new double column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
TransformProcess.Builder.doubleMathFunction(String columnName,
MathFunction mathFunction) |
Perform a mathematical operation (such as sin(x), ceil(x), exp(x) etc) on a column
|
TransformProcess.Builder |
TransformProcess.Builder.doubleMathOp(String columnName,
MathOp mathOp,
double scalar) |
Perform a mathematical operation (add, subtract, scalar max etc) on the specified double column, with a scalar
|
TransformProcess.Builder |
TransformProcess.Builder.duplicateColumn(String column,
String newName) |
Duplicate a single column
|
TransformProcess.Builder |
TransformProcess.Builder.duplicateColumns(List<String> columnNames,
List<String> newNames) |
Duplicate a set of columns
|
TransformProcess.Builder |
TransformProcess.Builder.filter(Condition condition) |
Add a filter operation, based on the specified condition.
|
TransformProcess.Builder |
TransformProcess.Builder.filter(Filter filter) |
Add a filter operation to be executed after the previously-added operations have been executed
|
TransformProcess.Builder |
TransformProcess.Builder.firstDigitTransform(String inputColumn,
String outputColumn) |
FirstDigitTransform converts a column to a categorical column, with values being the first digit of the number.
For example, "3.1415" becomes "3" and "2.0" becomes "2".
Negative numbers ignore the sign: "-7.123" becomes "7".
Note that two FirstDigitTransform.Mode values are supported, which determine how non-numerical entries are handled:
EXCEPTION_ON_INVALID: output has 10 category values ("0", ..., "9"), and any non-numerical values result in an exception
INCLUDE_OTHER_CATEGORY: output has 11 category values ("0", ..., "9", "Other"), all non-numerical values are mapped to "Other"
FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement
Benford's law.
|
TransformProcess.Builder |
TransformProcess.Builder.firstDigitTransform(String inputColumn,
String outputColumn,
FirstDigitTransform.Mode mode) |
FirstDigitTransform converts a column to a categorical column, with values being the first digit of the number.
For example, "3.1415" becomes "3" and "2.0" becomes "2".
Negative numbers ignore the sign: "-7.123" becomes "7".
Note that two FirstDigitTransform.Mode values are supported, which determine how non-numerical entries are handled:
EXCEPTION_ON_INVALID: output has 10 category values ("0", ..., "9"), and any non-numerical values result in an exception
INCLUDE_OTHER_CATEGORY: output has 11 category values ("0", ..., "9", "Other"), all non-numerical values are mapped to "Other"
FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement
Benford's law.
|
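The first-digit rule and its two modes can be illustrated with a small plain-Java sketch. It is not the DataVec implementation: `FirstDigitSketch` and its local `Mode` enum are hypothetical, and as a simplification the sketch only inspects the first character after an optional minus sign.

```java
/** Plain-Java sketch of the first-digit rule; not the DataVec implementation. */
final class FirstDigitSketch {

    /** Local stand-in for the two documented modes. */
    enum Mode { EXCEPTION_ON_INVALID, INCLUDE_OTHER_CATEGORY }

    /**
     * Returns the first digit of the value as a category label.
     * Simplification: only the first character after an optional '-' is checked.
     */
    static String firstDigit(String value, Mode mode) {
        String s = value == null ? "" : value.trim();
        if (s.startsWith("-")) {
            s = s.substring(1); // negative numbers ignore the sign
        }
        if (!s.isEmpty() && s.charAt(0) >= '0' && s.charAt(0) <= '9') {
            return String.valueOf(s.charAt(0));
        }
        if (mode == Mode.INCLUDE_OTHER_CATEGORY) {
            return "Other"; // the 11th category
        }
        throw new IllegalArgumentException("Non-numerical value: " + value);
    }

    public static void main(String[] args) {
        System.out.println(firstDigit("3.1415", Mode.EXCEPTION_ON_INVALID));
        System.out.println(firstDigit("-7.123", Mode.EXCEPTION_ON_INVALID));
        System.out.println(firstDigit("n/a", Mode.INCLUDE_OTHER_CATEGORY));
    }
}
```

Counting how often each label occurs then gives the first-digit distribution that can be compared with the one predicted by Benford's law.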
TransformProcess.Builder |
TransformProcess.Builder.floatColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames) |
Calculate and add a new float column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
TransformProcess.Builder.floatMathFunction(String columnName,
MathFunction mathFunction) |
Perform a mathematical operation (such as sin(x), ceil(x), exp(x) etc) on a column
|
TransformProcess.Builder |
TransformProcess.Builder.floatMathOp(String columnName,
MathOp mathOp,
float scalar) |
Perform a mathematical operation (add, subtract, scalar max etc) on the specified float column, with a scalar
|
TransformProcess.Builder |
TransformProcess.Builder.integerColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames) |
Calculate and add a new integer column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
TransformProcess.Builder.integerMathOp(String column,
MathOp mathOp,
int scalar) |
Perform a mathematical operation (add, subtract, scalar max etc) on the specified integer column, with a scalar
|
TransformProcess.Builder |
TransformProcess.Builder.integerToCategorical(String columnName,
List<String> categoryStateNames) |
Convert the specified column from an integer representation (assume values 0 to numCategories-1) to
a categorical representation, given the specified state names
|
TransformProcess.Builder |
TransformProcess.Builder.integerToCategorical(String columnName,
Map<Integer,String> categoryIndexNameMap) |
Convert the specified column from an integer representation to a categorical representation, given the specified
mapping between integer indexes and state names
|
TransformProcess.Builder |
TransformProcess.Builder.integerToOneHot(String columnName,
int minValue,
int maxValue) |
Convert an integer column to a set of one-hot columns, based on the value in the integer column
|
TransformProcess.Builder |
TransformProcess.Builder.longColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames) |
Calculate and add a new long column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
TransformProcess.Builder.longMathOp(String columnName,
MathOp mathOp,
long scalar) |
Perform a mathematical operation (add, subtract, scalar max etc) on the specified long column, with a scalar
|
TransformProcess.Builder |
TransformProcess.Builder.ndArrayColumnsMathOpTransform(String newColumnName,
MathOp mathOp,
String... columnNames) |
Perform an element wise mathematical operation (such as add, subtract, multiply) on NDArray columns.
|
TransformProcess.Builder |
TransformProcess.Builder.ndArrayDistanceTransform(String newColumnName,
Distance distance,
String firstCol,
String secondCol) |
Calculate a distance (cosine similarity, Euclidean, Manhattan) on two equal-sized NDArray columns.
|
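For reference, the three measures named above can be written out for two equal-length vectors. The sketch below uses plain `double[]` arrays instead of NDArrays and is not DataVec or ND4J code; `DistanceSketch` and its method names are hypothetical.

```java
/** Plain-Java sketch of three vector measures; not the DataVec/ND4J implementation. */
final class DistanceSketch {

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have equal length");
        }
    }

    /** Square root of the sum of squared element-wise differences. */
    static double euclidean(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Sum of absolute element-wise differences. */
    static double manhattan(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Dot product divided by the product of the norms; NaN if either vector is all zeros. */
    static double cosineSimilarity(double[] a, double[] b) {
        requireSameLength(a, b);
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] a = {3.0, 4.0};
        double[] b = {0.0, 0.0};
        System.out.println(euclidean(a, b));
        System.out.println(manhattan(a, b));
        System.out.println(cosineSimilarity(new double[]{1.0, 0.0}, new double[]{0.0, 1.0}));
    }
}
```

Note that cosine similarity is a similarity rather than a distance: identical directions give 1 and orthogonal vectors give 0.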
TransformProcess.Builder |
TransformProcess.Builder.ndArrayMathFunctionTransform(String columnName,
MathFunction mathFunction) |
Apply an element wise mathematical function (sin, tanh, abs etc) to an NDArray column.
|
TransformProcess.Builder |
TransformProcess.Builder.ndArrayScalarOpTransform(String columnName,
MathOp op,
double value) |
Element-wise NDArray math operation (add, subtract, etc) on an NDArray column
|
TransformProcess.Builder |
TransformProcess.Builder.normalize(String column,
Normalize type,
DataAnalysis da) |
Normalize the specified column with a given type of normalization
|
TransformProcess.Builder |
TransformProcess.Builder.offsetSequence(List<String> columnsToOffset,
int offsetAmount,
SequenceOffsetTransform.OperationType operationType) |
Perform a sequence offset operation on the specified columns.
|
TransformProcess.Builder |
TransformProcess.Builder.reduce(IAssociativeReducer reducer) |
Reduce (i.e., aggregate/combine) a set of examples (typically by key).
|
TransformProcess.Builder |
TransformProcess.Builder.reduceSequence(IAssociativeReducer reducer) |
Reduce (i.e., aggregate/combine) a set of sequence examples - for each sequence individually.
|
TransformProcess.Builder |
TransformProcess.Builder.reduceSequenceByWindow(IAssociativeReducer reducer,
WindowFunction windowFunction) |
Reduce (i.e., aggregate/combine) a set of sequence examples - for each sequence individually - using a window function.
|
TransformProcess.Builder |
TransformProcess.Builder.removeAllColumnsExceptFor(String... columnNames) |
Remove all columns, except for those that are specified here
|
TransformProcess.Builder |
TransformProcess.Builder.removeAllColumnsExceptFor(Collection<String> columnNames) |
Remove all columns, except for those that are specified here
|
TransformProcess.Builder |
TransformProcess.Builder.removeColumns(String... columnNames) |
Remove all of the specified columns, by name
|
TransformProcess.Builder |
TransformProcess.Builder.removeColumns(Collection<String> columnNames) |
Remove all of the specified columns, by name
|
TransformProcess.Builder |
TransformProcess.Builder.renameColumn(String oldName,
String newName) |
Rename a single column
|
TransformProcess.Builder |
TransformProcess.Builder.renameColumns(List<String> oldNames,
List<String> newNames) |
Rename multiple columns
|
TransformProcess.Builder |
TransformProcess.Builder.reorderColumns(String... newOrder) |
Reorder the columns using a partial or complete new ordering.
|
TransformProcess.Builder |
TransformProcess.Builder.replaceStringTransform(String columnName,
Map<String,String> mapping) |
Replace one or more String values in the specified column that match regular expressions.
|
TransformProcess.Builder |
TransformProcess.Builder.sequenceMovingWindowReduce(String columnName,
int lookback,
ReduceOp op) |
SequenceMovingWindowReduceTransform: Adds a new column, where the value is derived by:
(a) using a window of the last N values in a single column,
(b) Apply a reduction op on the window to calculate a new value
for example, this transformer can be used to implement a simple moving average of the last N values,
or determine the minimum or maximum values in the last N time steps.
|
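The simple moving average mentioned above can be sketched in plain Java as follows. This is an illustration rather than the DataVec implementation: `MovingWindowSketch` is a hypothetical class, and the handling of the first few steps (averaging over however many values are available so far) is an assumption of this sketch.

```java
import java.util.Arrays;

/** Plain-Java sketch of a moving-window mean; not the DataVec implementation. */
final class MovingWindowSketch {

    /**
     * For each time step, returns the mean of the last `lookback` values including
     * the current one. Assumption: early steps average over the values seen so far.
     */
    static double[] movingMean(double[] values, int lookback) {
        if (lookback < 1) {
            throw new IllegalArgumentException("lookback must be at least 1");
        }
        double[] out = new double[values.length];
        double windowSum = 0.0;
        for (int i = 0; i < values.length; i++) {
            windowSum += values[i];
            if (i >= lookback) {
                windowSum -= values[i - lookback]; // drop the value that left the window
            }
            int windowSize = Math.min(i + 1, lookback);
            out[i] = windowSum / windowSize;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(movingMean(new double[]{1, 2, 3, 4}, 2)));
    }
}
```

A moving minimum or maximum follows the same pattern, with the running sum replaced by a scan over the current window.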
TransformProcess.Builder |
TransformProcess.Builder.splitSequence(SequenceSplit split) |
Split sequences into 1 or more other sequences.
|
TransformProcess.Builder |
TransformProcess.Builder.stringMapTransform(String columnName,
Map<String,String> mapping) |
Replace one or more String values in the specified column with new values.
|
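The contrast between regular-expression replacement (replaceStringTransform) and value mapping (stringMapTransform) can be sketched in plain Java. This is not DataVec code: `StringReplaceSketch` is a hypothetical class, and two details are assumptions of the sketch, namely that regex replacements are applied in the map's iteration order and that the mapping variant matches whole values exactly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java sketch of two string replacement styles; not the DataVec implementation. */
final class StringReplaceSketch {

    /** Treats each key as a regular expression and replaces every match in the value. */
    static String replaceByRegex(String value, Map<String, String> mapping) {
        String out = value;
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            out = out.replaceAll(entry.getKey(), entry.getValue());
        }
        return out;
    }

    /** Replaces the value only if it equals a key exactly; otherwise leaves it unchanged. */
    static String mapExact(String value, Map<String, String> mapping) {
        return mapping.getOrDefault(value, value);
    }

    public static void main(String[] args) {
        Map<String, String> regexMapping = new LinkedHashMap<>();
        regexMapping.put("\\s+", " ");
        System.out.println(replaceByRegex("a   b", regexMapping));

        Map<String, String> valueMapping = new LinkedHashMap<>();
        valueMapping.put("NY", "New York");
        System.out.println(mapExact("NY", valueMapping));
        System.out.println(mapExact("LA", valueMapping));
    }
}
```

An insertion-ordered map such as `LinkedHashMap` keeps the regex variant deterministic when one replacement could affect the input of the next.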
TransformProcess.Builder |
TransformProcess.Builder.stringRemoveWhitespaceTransform(String columnName) |
Remove all whitespace characters from the values in the specified String column
|
TransformProcess.Builder |
TransformProcess.Builder.stringToCategorical(String columnName,
List<String> stateNames) |
Convert the specified String column to a categorical column.
|
TransformProcess.Builder |
TransformProcess.Builder.stringToTimeTransform(String column,
String format,
org.joda.time.DateTimeZone dateTimeZone) |
Convert a String column (containing a date/time String) to a time column (by parsing the date/time String)
|
TransformProcess.Builder |
TransformProcess.Builder.stringToTimeTransform(String column,
String format,
org.joda.time.DateTimeZone dateTimeZone,
Locale locale) |
Convert a String column (containing a date/time String) to a time column (by parsing the date/time String)
|
TransformProcess.Builder |
TransformProcess.Builder.timeMathOp(String columnName,
MathOp mathOp,
long timeQuantity,
TimeUnit timeUnit) |
Perform a mathematical operation (add, subtract, scalar min/max only) on the specified time column
|
TransformProcess.Builder |
TransformProcess.Builder.transform(Transform transform) |
Add a transformation to be executed after the previously-added operations have been executed
|
TransformProcess.Builder |
TransformProcess.Builder.trimOrPadSequenceToLength(int length,
@NonNull List<Writable> pad) |
Trim or pad the sequence to the specified length (number of sequence steps).
Sequences longer than the specified maximum will be trimmed to exactly the maximum.
|
TransformProcess.Builder |
TransformProcess.Builder.trimSequence(int numStepsToTrim,
boolean trimFromStart) |
SequenceTrimTransform removes the first or last N values in a sequence.
|
TransformProcess.Builder |
TransformProcess.Builder.trimSequenceToLength(int maxLength) |
Trim the sequence to the specified length (number of sequence steps).
Sequences longer than the specified maximum will be trimmed to exactly the maximum.
|
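The three sequence-length operations above (trim or pad to a length, remove N steps from one end, and trim to a maximum length) can be sketched in plain Java. This is not the DataVec implementation: `SequenceLengthSketch` is a hypothetical class, each list element stands for one sequence step, and padding at the end of shorter sequences is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch of sequence trimming and padding; not the DataVec implementation. */
final class SequenceLengthSketch {

    /** Keeps at most maxLength steps, counted from the start of the sequence. */
    static <T> List<T> trimToLength(List<T> sequence, int maxLength) {
        return new ArrayList<>(sequence.subList(0, Math.min(sequence.size(), maxLength)));
    }

    /** Trims longer sequences; assumed behavior: shorter ones are padded at the end. */
    static <T> List<T> trimOrPadToLength(List<T> sequence, int length, T pad) {
        List<T> out = trimToLength(sequence, length);
        while (out.size() < length) {
            out.add(pad);
        }
        return out;
    }

    /** Removes the first or last numStepsToTrim steps. */
    static <T> List<T> trim(List<T> sequence, int numStepsToTrim, boolean trimFromStart) {
        int n = Math.min(numStepsToTrim, sequence.size());
        return trimFromStart
                ? new ArrayList<>(sequence.subList(n, sequence.size()))
                : new ArrayList<>(sequence.subList(0, sequence.size() - n));
    }

    public static void main(String[] args) {
        List<Integer> sequence = Arrays.asList(1, 2, 3, 4);
        System.out.println(trimToLength(sequence, 2));
        System.out.println(trimOrPadToLength(Arrays.asList(1, 2), 4, 0));
        System.out.println(trim(sequence, 1, true));
        System.out.println(trim(sequence, 1, false));
    }
}
```

In the builder method the pad argument is a `List<Writable>`, that is, one full step of column values, whereas this sketch uses a single element per step for brevity.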