public static class TransformProcess.Builder extends Object
Modifier and Type | Method and Description |
---|---|
TransformProcess.Builder |
addConstantColumn(String newColumnName,
ColumnType newColumnType,
Writable fixedValue)
Add a new column, where all values in the column are identical and as specified.
|
TransformProcess.Builder |
addConstantDoubleColumn(String newColumnName,
double value)
Add a new double column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
addConstantIntegerColumn(String newColumnName,
int value)
Add a new integer column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
addConstantLongColumn(String newColumnName,
long value)
Add a new long column, where the value for that column (for all records) is identical
|
TransformProcess.Builder |
appendStringColumnTransform(String column,
String toAppend)
Append a String to a specified column
|
TransformProcess |
build()
Create the TransformProcess object
|
TransformProcess.Builder |
calculateSortedRank(String newColumnName,
String sortOnColumn,
WritableComparator comparator)
CalculateSortedRank: calculate the rank of each example, after sorting the examples.
|
TransformProcess.Builder |
calculateSortedRank(String newColumnName,
String sortOnColumn,
WritableComparator comparator,
boolean ascending)
CalculateSortedRank: calculate the rank of each example, after sorting the examples.
|
TransformProcess.Builder |
categoricalToInteger(String... columnNames)
Convert the specified column(s) from a categorical representation to an integer representation.
|
TransformProcess.Builder |
categoricalToOneHot(String... columnNames)
Convert the specified column(s) from a categorical representation to a one-hot representation.
|
TransformProcess.Builder |
conditionalCopyValueTransform(String columnToReplace,
String sourceColumn,
Condition condition)
Replace the value in a specified column with a new value taken from another column, if a condition is satisfied/true.
Note that the condition can be any generic condition, including on other column(s), different to the column that will be modified if the condition is satisfied/true. |
TransformProcess.Builder |
conditionalReplaceValueTransform(String column,
Writable newValue,
Condition condition)
Replace the values in a specified column with a specified new value, if some condition holds.
|
TransformProcess.Builder |
conditionalReplaceValueTransformWithDefault(String column,
Writable yesVal,
Writable noVal,
Condition condition)
Replace the values in a specified column with a specified "yes" value, if some condition holds.
|
TransformProcess.Builder |
convertFromSequence()
Convert a sequence to a set of individual values (by treating each value in each sequence as a separate example)
|
TransformProcess.Builder |
convertToDouble(String inputColumn)
Convert the specified column to a double.
|
TransformProcess.Builder |
convertToInteger(String inputColumn)
Convert the specified column to an integer.
|
TransformProcess.Builder |
convertToSequence()
Convert a set of independent records/examples into a sequence; each example is simply treated as a sequence
of length 1, without any join/group operations.
|
TransformProcess.Builder |
convertToSequence(List<String> keyColumns,
SequenceComparator comparator)
Convert a set of independent records/examples into a sequence, where each sequence is grouped according to
one or more key values (i.e., the values in one or more columns)
Within each sequence, values are ordered using the provided
SequenceComparator |
TransformProcess.Builder |
convertToSequence(String keyColumn,
SequenceComparator comparator)
Convert a set of independent records/examples into a sequence, according to some key.
|
TransformProcess.Builder |
convertToString(String inputColumn)
Convert the specified column to a string.
|
TransformProcess.Builder |
doubleColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames)
Calculate and add a new double column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
doubleMathFunction(String columnName,
MathFunction mathFunction)
Perform a mathematical operation (such as sin(x), ceil(x), exp(x) etc) on a column
|
TransformProcess.Builder |
doubleMathOp(String columnName,
MathOp mathOp,
double scalar)
Perform a mathematical operation (add, subtract, scalar max etc) on the specified double column, with a scalar
|
TransformProcess.Builder |
duplicateColumn(String column,
String newName)
Duplicate a single column
|
TransformProcess.Builder |
duplicateColumns(List<String> columnNames,
List<String> newNames)
Duplicate a set of columns
|
TransformProcess.Builder |
filter(Condition condition)
Add a filter operation, based on the specified condition.
|
TransformProcess.Builder |
filter(Filter filter)
Add a filter operation to be executed after the previously-added operations have been executed
|
TransformProcess.Builder |
firstDigitTransform(String inputColumn,
String outputColumn)
FirstDigitTransform converts a column to a categorical column, with values being the first digit of the number.
For example, "3.1415" becomes "3" and "2.0" becomes "2". Negative numbers ignore the sign: "-7.123" becomes "7".
Two FirstDigitTransform.Modes are supported, which determine how non-numerical entries are handled:
EXCEPTION_ON_INVALID: output has 10 category values ("0", ..., "9"), and any non-numerical values result in an exception.
INCLUDE_OTHER_CATEGORY: output has 11 category values ("0", ..., "9", "Other"), and all non-numerical values are mapped to "Other".
FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement Benford's law. |
TransformProcess.Builder |
firstDigitTransform(String inputColumn,
String outputColumn,
FirstDigitTransform.Mode mode)
FirstDigitTransform converts a column to a categorical column, with values being the first digit of the number.
For example, "3.1415" becomes "3" and "2.0" becomes "2". Negative numbers ignore the sign: "-7.123" becomes "7".
Two FirstDigitTransform.Modes are supported, which determine how non-numerical entries are handled:
EXCEPTION_ON_INVALID: output has 10 category values ("0", ..., "9"), and any non-numerical values result in an exception.
INCLUDE_OTHER_CATEGORY: output has 11 category values ("0", ..., "9", "Other"), and all non-numerical values are mapped to "Other".
FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement Benford's law. |
TransformProcess.Builder |
floatColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames)
Calculate and add a new float column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
floatMathFunction(String columnName,
MathFunction mathFunction)
Perform a mathematical operation (such as sin(x), ceil(x), exp(x) etc) on a column
|
TransformProcess.Builder |
floatMathOp(String columnName,
MathOp mathOp,
float scalar)
Perform a mathematical operation (add, subtract, scalar max etc) on the specified float column, with a scalar
|
TransformProcess.Builder |
integerColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames)
Calculate and add a new integer column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
integerMathOp(String column,
MathOp mathOp,
int scalar)
Perform a mathematical operation (add, subtract, scalar max etc) on the specified integer column, with a scalar
|
TransformProcess.Builder |
integerToCategorical(String columnName,
List<String> categoryStateNames)
Convert the specified column from an integer representation (assume values 0 to numCategories-1) to
a categorical representation, given the specified state names
|
TransformProcess.Builder |
integerToCategorical(String columnName,
Map<Integer,String> categoryIndexNameMap)
Convert the specified column from an integer representation to a categorical representation, given the specified
mapping between integer indexes and state names
|
TransformProcess.Builder |
integerToOneHot(String columnName,
int minValue,
int maxValue)
Convert an integer column to a set of one-hot columns, based on the value in the integer column
|
TransformProcess.Builder |
longColumnsMathOp(String newColumnName,
MathOp mathOp,
String... columnNames)
Calculate and add a new long column by performing a mathematical operation on a number of existing columns.
|
TransformProcess.Builder |
longMathOp(String columnName,
MathOp mathOp,
long scalar)
Perform a mathematical operation (add, subtract, scalar max etc) on the specified long column, with a scalar
|
TransformProcess.Builder |
ndArrayColumnsMathOpTransform(String newColumnName,
MathOp mathOp,
String... columnNames)
Perform an element wise mathematical operation (such as add, subtract, multiply) on NDArray columns.
|
TransformProcess.Builder |
ndArrayDistanceTransform(String newColumnName,
Distance distance,
String firstCol,
String secondCol)
Calculate a distance (cosine similarity, Euclidean, Manhattan) on two equal-sized NDArray columns.
|
TransformProcess.Builder |
ndArrayMathFunctionTransform(String columnName,
MathFunction mathFunction)
Apply an element wise mathematical function (sin, tanh, abs etc) to an NDArray column.
|
TransformProcess.Builder |
ndArrayScalarOpTransform(String columnName,
MathOp op,
double value)
Element-wise NDArray math operation (add, subtract, etc) on an NDArray column
|
TransformProcess.Builder |
normalize(String column,
Normalize type,
DataAnalysis da)
Normalize the specified column with a given type of normalization
|
TransformProcess.Builder |
offsetSequence(List<String> columnsToOffset,
int offsetAmount,
SequenceOffsetTransform.OperationType operationType)
Perform a sequence offset operation on the specified columns.
|
TransformProcess.Builder |
reduce(IAssociativeReducer reducer)
Reduce (i.e., aggregate/combine) a set of examples (typically by key).
|
TransformProcess.Builder |
reduceSequence(IAssociativeReducer reducer)
Reduce (i.e., aggregate/combine) a set of sequence examples - for each sequence individually.
|
TransformProcess.Builder |
reduceSequenceByWindow(IAssociativeReducer reducer,
WindowFunction windowFunction)
Reduce (i.e., aggregate/combine) a set of sequence examples - for each sequence individually - using a window function.
|
TransformProcess.Builder |
removeAllColumnsExceptFor(Collection<String> columnNames)
Remove all columns, except for those that are specified here
|
TransformProcess.Builder |
removeAllColumnsExceptFor(String... columnNames)
Remove all columns, except for those that are specified here
|
TransformProcess.Builder |
removeColumns(Collection<String> columnNames)
Remove all of the specified columns, by name
|
TransformProcess.Builder |
removeColumns(String... columnNames)
Remove all of the specified columns, by name
|
TransformProcess.Builder |
renameColumn(String oldName,
String newName)
Rename a single column
|
TransformProcess.Builder |
renameColumns(List<String> oldNames,
List<String> newNames)
Rename multiple columns
|
TransformProcess.Builder |
reorderColumns(String... newOrder)
Reorder the columns using a partial or complete new ordering.
|
TransformProcess.Builder |
replaceStringTransform(String columnName,
Map<String,String> mapping)
Replace one or more String values in the specified column that match regular expressions.
|
TransformProcess.Builder |
sequenceMovingWindowReduce(String columnName,
int lookback,
ReduceOp op)
SequenceMovingWindowReduceTransform: adds a new column, where the value is derived by
(a) using a window of the last N values in a single column, and (b) applying a reduction op on the window to calculate a new value.
For example, this transformer can be used to implement a simple moving average of the last N values, or determine the minimum or maximum values in the last N time steps. |
TransformProcess.Builder |
splitSequence(SequenceSplit split)
Split sequences into 1 or more other sequences.
|
TransformProcess.Builder |
stringMapTransform(String columnName,
Map<String,String> mapping)
Replace one or more String values in the specified column with new values.
|
TransformProcess.Builder |
stringRemoveWhitespaceTransform(String columnName)
Remove all whitespace characters from the values in the specified String column
|
TransformProcess.Builder |
stringToCategorical(String columnName,
List<String> stateNames)
Convert the specified String column to a categorical column.
|
TransformProcess.Builder |
stringToTimeTransform(String column,
String format,
org.joda.time.DateTimeZone dateTimeZone)
Convert a String column (containing a date/time String) to a time column (by parsing the date/time String)
|
TransformProcess.Builder |
stringToTimeTransform(String column,
String format,
org.joda.time.DateTimeZone dateTimeZone,
Locale locale)
Convert a String column (containing a date/time String) to a time column (by parsing the date/time String)
|
TransformProcess.Builder |
timeMathOp(String columnName,
MathOp mathOp,
long timeQuantity,
TimeUnit timeUnit)
Perform a mathematical operation (add, subtract, scalar min/max only) on the specified time column
|
TransformProcess.Builder |
transform(Transform transform)
Add a transformation to be executed after the previously-added operations have been executed
|
TransformProcess.Builder |
trimOrPadSequenceToLength(int length,
@NonNull List<Writable> pad)
Trim or pad the sequence to the specified length (number of sequence steps).
Sequences longer than the specified maximum will be trimmed to exactly the maximum. |
TransformProcess.Builder |
trimSequence(int numStepsToTrim,
boolean trimFromStart)
SequenceTrimTransform removes the first or last N values in a sequence.
|
TransformProcess.Builder |
trimSequenceToLength(int maxLength)
Trim the sequence to the specified length (number of sequence steps).
Sequences longer than the specified maximum will be trimmed to exactly the maximum. |
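The moving-window reduction summarized for sequenceMovingWindowReduce above can be illustrated with a small, self-contained sketch. This is not DataVec code: the class MovingWindowSketch and the method movingMean are hypothetical names, and the handling of the first few time steps (averaging over however many values are available so far) is an assumption of this sketch rather than documented library behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper, not part of DataVec: trailing-window mean over one column of a sequence. */
class MovingWindowSketch {

    /**
     * For each time step t, average the values in the window that ends at t.
     * The window holds at most `lookback` values; early steps use the values
     * seen so far (an assumption of this sketch).
     */
    static List<Double> movingMean(List<Double> column, int lookback) {
        List<Double> out = new ArrayList<>(column.size());
        double windowSum = 0.0;
        for (int t = 0; t < column.size(); t++) {
            windowSum += column.get(t);
            if (t >= lookback) {
                // Drop the value that has just left the window
                windowSum -= column.get(t - lookback);
            }
            int windowSize = Math.min(t + 1, lookback);
            out.add(windowSum / windowSize);
        }
        return out;
    }

    public static void main(String[] args) {
        // Window of the last 2 values over a 4-step sequence
        System.out.println(movingMean(Arrays.asList(1.0, 2.0, 3.0, 4.0), 2));
    }
}
```

Swapping the mean for a minimum or maximum over the same window gives the other reductions mentioned in the summary.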
public Builder(Schema initialSchema)
public TransformProcess.Builder transform(Transform transform)
transform - Transform to execute

public TransformProcess.Builder filter(Filter filter)
filter - Filter operation to execute

public TransformProcess.Builder filter(Condition condition)
condition - Condition to filter on

public TransformProcess.Builder removeColumns(String... columnNames)
columnNames - Names of the columns to remove

public TransformProcess.Builder removeColumns(Collection<String> columnNames)
columnNames - Names of the columns to remove

public TransformProcess.Builder removeAllColumnsExceptFor(String... columnNames)
columnNames - Names of the columns to keep

public TransformProcess.Builder removeAllColumnsExceptFor(Collection<String> columnNames)
columnNames - Names of the columns to keep

public TransformProcess.Builder renameColumn(String oldName, String newName)
oldName - Original column name
newName - New column name

public TransformProcess.Builder renameColumns(List<String> oldNames, List<String> newNames)
oldNames - List of original column names
newNames - List of new column names

public TransformProcess.Builder reorderColumns(String... newOrder)
newOrder - Names of the columns, in the order they will appear in the output

public TransformProcess.Builder duplicateColumn(String column, String newName)
column - Name of the column to duplicate
newName - Name of the new (duplicate) column

public TransformProcess.Builder duplicateColumns(List<String> columnNames, List<String> newNames)
columnNames - Names of the columns to duplicate
newNames - Names of the new (duplicated) columns

public TransformProcess.Builder integerMathOp(String column, MathOp mathOp, int scalar)
column - The integer column to perform the operation on
mathOp - The mathematical operation
scalar - The scalar value to use in the mathematical operation

public TransformProcess.Builder integerColumnsMathOp(String newColumnName, MathOp mathOp, String... columnNames)
newColumnName - Name of the new/derived column
mathOp - Mathematical operation to execute on the columns
columnNames - Names of the columns to use in the mathematical operation

public TransformProcess.Builder longMathOp(String columnName, MathOp mathOp, long scalar)
columnName - The long column to perform the operation on
mathOp - The mathematical operation
scalar - The scalar value to use in the mathematical operation

public TransformProcess.Builder longColumnsMathOp(String newColumnName, MathOp mathOp, String... columnNames)
newColumnName - Name of the new/derived column
mathOp - Mathematical operation to execute on the columns
columnNames - Names of the columns to use in the mathematical operation

public TransformProcess.Builder floatMathOp(String columnName, MathOp mathOp, float scalar)
columnName - The float column to perform the operation on
mathOp - The mathematical operation
scalar - The scalar value to use in the mathematical operation

public TransformProcess.Builder floatColumnsMathOp(String newColumnName, MathOp mathOp, String... columnNames)
newColumnName - Name of the new/derived column
mathOp - Mathematical operation to execute on the columns
columnNames - Names of the columns to use in the mathematical operation

public TransformProcess.Builder floatMathFunction(String columnName, MathFunction mathFunction)
columnName - Column name to operate on
mathFunction - MathFunction to apply to the column

public TransformProcess.Builder doubleMathOp(String columnName, MathOp mathOp, double scalar)
columnName - The double column to perform the operation on
mathOp - The mathematical operation
scalar - The scalar value to use in the mathematical operation

public TransformProcess.Builder doubleColumnsMathOp(String newColumnName, MathOp mathOp, String... columnNames)
newColumnName - Name of the new/derived column
mathOp - Mathematical operation to execute on the columns
columnNames - Names of the columns to use in the mathematical operation

public TransformProcess.Builder doubleMathFunction(String columnName, MathFunction mathFunction)
columnName - Column name to operate on
mathFunction - MathFunction to apply to the column

public TransformProcess.Builder timeMathOp(String columnName, MathOp mathOp, long timeQuantity, TimeUnit timeUnit)
columnName - The time column to perform the operation on
mathOp - The mathematical operation
timeQuantity - The quantity used in the mathematical operation
timeUnit - The unit that timeQuantity is specified in

public TransformProcess.Builder categoricalToOneHot(String... columnNames)
columnNames - Names of the categorical column(s) to convert to a one-hot representation

public TransformProcess.Builder categoricalToInteger(String... columnNames)
columnNames - Names of the categorical column(s) to convert to an integer representation

public TransformProcess.Builder integerToCategorical(String columnName, List<String> categoryStateNames)
columnName - Name of the column to convert
categoryStateNames - Names of the states for the categorical column

public TransformProcess.Builder integerToCategorical(String columnName, Map<Integer,String> categoryIndexNameMap)
columnName - Name of the column to convert
categoryIndexNameMap - Mapping from integer indexes to the state names for the categorical column

public TransformProcess.Builder integerToOneHot(String columnName, int minValue, int maxValue)
columnName - Name of the integer column
minValue - Minimum value possible for the integer column (inclusive)
maxValue - Maximum value possible for the integer column (inclusive)

public TransformProcess.Builder addConstantColumn(String newColumnName, ColumnType newColumnType, Writable fixedValue)
newColumnName - Name of the new column
newColumnType - Type of the new column
fixedValue - Value in the new column for all records

public TransformProcess.Builder addConstantDoubleColumn(String newColumnName, double value)
newColumnName - Name of the new column
value - Value in the new column for all records

public TransformProcess.Builder addConstantIntegerColumn(String newColumnName, int value)
newColumnName - Name of the new column
value - Value in the new column for all records

public TransformProcess.Builder addConstantLongColumn(String newColumnName, long value)
newColumnName - Name of the new column
value - Value in the new column for all records

public TransformProcess.Builder convertToString(String inputColumn)
inputColumn - The input column to convert

public TransformProcess.Builder convertToDouble(String inputColumn)
inputColumn - The input column to convert

public TransformProcess.Builder convertToInteger(String inputColumn)
inputColumn - The input column to convert

public TransformProcess.Builder normalize(String column, Normalize type, DataAnalysis da)
column - Column to normalize
type - Type of normalization to apply
da - DataAnalysis object

public TransformProcess.Builder convertToSequence(String keyColumn, SequenceComparator comparator)
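Convert a set of independent records/examples into a sequence, according to some key. Within each sequence, values are ordered using the provided SequenceComparator.
keyColumn - Column to use as a key (values with the same key will be combined into sequences)
comparator - A SequenceComparator to order the values within each sequence (for example, by time or String order)

public TransformProcess.Builder convertToSequence()
Convert a set of independent records/examples into a sequence; each example is simply treated as a sequence of length 1, without any join/group operations. See convertToSequence(List, SequenceComparator) for join/group functionality.

public TransformProcess.Builder convertToSequence(List<String> keyColumns, SequenceComparator comparator)
Convert a set of independent records/examples into a sequence, where each sequence is grouped according to one or more key values (i.e., the values in one or more columns). Within each sequence, values are ordered using the provided SequenceComparator.
keyColumns - Columns to use as keys (values with the same keys will be combined into sequences)
comparator - A SequenceComparator to order the values within each sequence (for example, by time or String order)

public TransformProcess.Builder convertFromSequence()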
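The key-based grouping described for convertToSequence can be pictured with a minimal sketch using only the JDK. It is an illustration of the idea, not the library's implementation: SequenceGroupingSketch and toSequences are hypothetical names, records are modeled as plain lists of strings, the key is given as a column index rather than a column name, and a java.util.Comparator stands in for SequenceComparator.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration, not DataVec code: group records by key, then order each group. */
class SequenceGroupingSketch {

    /**
     * Records that share the same value in `keyColumn` are collected into one sequence.
     * Each sequence is then sorted with `comparator` (for example, by a time column).
     */
    static Map<String, List<List<String>>> toSequences(List<List<String>> records,
                                                       int keyColumn,
                                                       Comparator<List<String>> comparator) {
        // LinkedHashMap keeps keys in first-seen order, which makes the result deterministic
        Map<String, List<List<String>>> sequences = new LinkedHashMap<>();
        for (List<String> record : records) {
            sequences.computeIfAbsent(record.get(keyColumn), k -> new ArrayList<>()).add(record);
        }
        for (List<List<String>> sequence : sequences.values()) {
            sequence.sort(comparator);
        }
        return sequences;
    }
}
```

In this model, convertFromSequence corresponds to the reverse step: flattening every sequence back into independent records.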
public TransformProcess.Builder splitSequence(SequenceSplit split)
split - SequenceSplit that defines how splits will occur

public TransformProcess.Builder trimSequence(int numStepsToTrim, boolean trimFromStart)
numStepsToTrim - Number of time steps to trim from the sequence
trimFromStart - If true: trim values from the start of the sequence. If false: trim values from the end.

public TransformProcess.Builder trimSequenceToLength(int maxLength)
maxLength - Maximum sequence length (number of time steps)

public TransformProcess.Builder trimOrPadSequenceToLength(int length, @NonNull List<Writable> pad)
length - Required length: trim sequences longer than this, pad sequences shorter than this
pad - Values to pad at the end of the sequence

public TransformProcess.Builder offsetSequence(List<String> columnsToOffset, int offsetAmount, SequenceOffsetTransform.OperationType operationType)
Use transform(new SequenceOffsetTransform(...)) to change the default behavior of this operation.
See SequenceOffsetTransform for details on exactly what this operation does and how.
columnsToOffset - Columns to offset
offsetAmount - Amount to offset the specified columns by (positive offset: 'columnsToOffset' are moved to later time steps)
operationType - Whether the offset should be done in-place or by adding a new column

public TransformProcess.Builder reduce(IAssociativeReducer reducer)
reducer - Reducer to use

public TransformProcess.Builder reduceSequence(IAssociativeReducer reducer)
Related: transform(new ReduceSequenceTransform(reducer)).
reducer - Reducer to use to reduce each sequence

public TransformProcess.Builder reduceSequenceByWindow(IAssociativeReducer reducer, WindowFunction windowFunction)
reducer - Reducer to use to reduce each window
windowFunction - Window function to apply to each sequence individually

public TransformProcess.Builder sequenceMovingWindowReduce(String columnName, int lookback, ReduceOp op)
For example, for a simple moving average, length 20: new SequenceMovingWindowReduceTransform("myCol", 20, ReduceOp.Mean)
columnName - Column name to perform windowing on
lookback - Look back period for windowing
op - Reduction operation to perform on each window

public TransformProcess.Builder calculateSortedRank(String newColumnName, String sortOnColumn, WritableComparator comparator)
Currently, CalculateSortedRank can only be applied on standard (i.e., non-sequence) data. Furthermore, the current implementation can only sort on one column.
newColumnName - Name of the new column (will contain the rank for each example)
sortOnColumn - Column to sort on
comparator - Comparator used to sort examples

public TransformProcess.Builder calculateSortedRank(String newColumnName, String sortOnColumn, WritableComparator comparator, boolean ascending)
Currently, CalculateSortedRank can only be applied on standard (i.e., non-sequence) data. Furthermore, the current implementation can only sort on one column.
newColumnName - Name of the new column (will contain the rank for each example)
sortOnColumn - Column to sort on
comparator - Comparator used to sort examples
ascending - If true: sort ascending. If false: sort descending.

public TransformProcess.Builder stringToCategorical(String columnName, List<String> stateNames)
columnName - Name of the String column to convert to categorical
stateNames - State names of the category

public TransformProcess.Builder stringRemoveWhitespaceTransform(String columnName)
columnName - Name of the column to remove whitespace from

public TransformProcess.Builder stringMapTransform(String columnName, Map<String,String> mapping)
Keys in the map are the original values; the values in the map are their replacements. If a String appears in the data but does not appear in the provided map (as a key), that String value will not be modified.
columnName - Name of the column in which to do replacement
mapping - Map of oldValues -> newValues

public TransformProcess.Builder stringToTimeTransform(String column, String format, org.joda.time.DateTimeZone dateTimeZone)
column - String column containing the date/time Strings
format - Format of the strings. Time format is specified as per http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
dateTimeZone - Timezone of the column

public TransformProcess.Builder stringToTimeTransform(String column, String format, org.joda.time.DateTimeZone dateTimeZone, Locale locale)
column - String column containing the date/time Strings
format - Format of the strings. Time format is specified as per http://www.joda.org/joda-time/apidocs/org/joda/time/format/DateTimeFormat.html
dateTimeZone - Timezone of the column
locale - Locale of the column

public TransformProcess.Builder appendStringColumnTransform(String column, String toAppend)
column - Column to append the value to
toAppend - String to append to the end of each writable

public TransformProcess.Builder conditionalReplaceValueTransform(String column, Writable newValue, Condition condition)
column - Column to operate on
newValue - Value to use as replacement, if condition is satisfied
condition - Condition that must be satisfied for replacement

public TransformProcess.Builder conditionalReplaceValueTransformWithDefault(String column, Writable yesVal, Writable noVal, Condition condition)
column - Column to operate on
yesVal - Value to use as replacement, if condition is satisfied
noVal - Value to use as replacement, if condition is not satisfied
condition - Condition that must be satisfied for replacement

public TransformProcess.Builder conditionalCopyValueTransform(String columnToReplace, String sourceColumn, Condition condition)
columnToReplace - Name of the column in which values will be replaced (if condition is satisfied)
sourceColumn - Name of the column from which the new values will be taken
condition - Condition to use

public TransformProcess.Builder replaceStringTransform(String columnName, Map<String,String> mapping)
Keys in the map are the regular expressions; the values in the map are their String replacements. For example:

Original | Regex | Replacement | Result |
---|---|---|---|
Data_Vec | _ | | DataVec |
B1C2T3 | \\d | one | BoneConeTone |
'  4.25 ' | ^\\s+\|\\s+$ | | '4.25' |

columnName - Name of the column in which to do replacement
mapping - Map of old values or regular expressions to new values

public TransformProcess.Builder ndArrayScalarOpTransform(String columnName, MathOp op, double value)
columnName - Name of the NDArray column to perform the operation on
op - Operation to perform
value - Value for the operation

public TransformProcess.Builder ndArrayColumnsMathOpTransform(String newColumnName, MathOp mathOp, String... columnNames)
newColumnName - Name of the new NDArray column
mathOp - Operation to perform
columnNames - Names of the columns used as input to the operation

public TransformProcess.Builder ndArrayMathFunctionTransform(String columnName, MathFunction mathFunction)
columnName - Name of the column to perform the operation on
mathFunction - Mathematical function to apply

public TransformProcess.Builder ndArrayDistanceTransform(String newColumnName, Distance distance, String firstCol, String secondCol)
newColumnName - Name of the new column (result) to add
distance - Distance to apply
firstCol - First column to use in the distance calculation
secondCol - Second column to use in the distance calculation

public TransformProcess.Builder firstDigitTransform(String inputColumn, String outputColumn)
Two FirstDigitTransform.Modes are supported, which determine how non-numerical entries should be handled. FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement Benford's law.
inputColumn - Input column name
outputColumn - Output column name. If same as input, input column is replaced

public TransformProcess.Builder firstDigitTransform(String inputColumn, String outputColumn, FirstDigitTransform.Mode mode)
Two FirstDigitTransform.Modes are supported, which determine how non-numerical entries should be handled. FirstDigitTransform is useful (combined with CategoricalToOneHotTransform and Reductions) to implement Benford's law.
inputColumn - Input column name
outputColumn - Output column name. If same as input, input column is replaced
mode - See FirstDigitTransform.Mode
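As a rough illustration of the first-digit mapping that firstDigitTransform performs, the sketch below reproduces the documented examples ("3.1415" to "3", "-7.123" to "7") with plain JDK code. It is not the library's implementation: FirstDigitSketch and firstDigit are hypothetical names, only the behavior described for INCLUDE_OTHER_CATEGORY is modeled, and using Double.parseDouble to decide what counts as numerical is an assumption of this sketch.

```java
/** Hypothetical illustration, not DataVec code: map a numeric string to its first digit. */
class FirstDigitSketch {

    /**
     * Returns the first digit ("0" to "9") of a numeric string, ignoring any sign.
     * Non-numerical input is mapped to "Other", as described for INCLUDE_OTHER_CATEGORY.
     */
    static String firstDigit(String value) {
        try {
            // Assumption: anything Double.parseDouble rejects is treated as non-numerical
            Double.parseDouble(value.trim());
        } catch (NumberFormatException e) {
            return "Other";
        }
        for (char c : value.toCharArray()) {
            if (c >= '0' && c <= '9') {
                return String.valueOf(c);
            }
        }
        // Parsable but digit-free values such as "NaN" also fall into the extra category
        return "Other";
    }
}
```

A sketch of EXCEPTION_ON_INVALID would throw instead of returning "Other" in both fallback branches.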
public TransformProcess build()
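Every method above returns the builder itself, and build() produces the finished object, so operations are chained and later run in the order they were added. The miniature pipeline below sketches that pattern with JDK types only. All names in it (PipelineSketch, its Builder, removeColumn, appendString, execute) are hypothetical and only loosely modeled on this class; records are plain lists of strings and columns are addressed by index instead of by name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical miniature pipeline, not DataVec code: steps run in the order they were added. */
class PipelineSketch {
    private final List<UnaryOperator<List<String>>> steps;

    private PipelineSketch(List<UnaryOperator<List<String>>> steps) {
        this.steps = steps;
    }

    /** Applies every recorded step to the record, one after another. */
    List<String> execute(List<String> record) {
        List<String> current = record;
        for (UnaryOperator<List<String>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    static class Builder {
        private final List<UnaryOperator<List<String>>> steps = new ArrayList<>();

        /** Records a step to be executed after the previously added steps. */
        Builder transform(UnaryOperator<List<String>> step) {
            steps.add(step);
            return this; // returning the builder is what allows chaining
        }

        Builder removeColumn(int index) {
            return transform(record -> {
                List<String> copy = new ArrayList<>(record);
                copy.remove(index); // int argument: removes by position
                return copy;
            });
        }

        Builder appendString(int index, String toAppend) {
            return transform(record -> {
                List<String> copy = new ArrayList<>(record);
                copy.set(index, copy.get(index) + toAppend);
                return copy;
            });
        }

        /** Freezes the recorded steps into an executable pipeline. */
        PipelineSketch build() {
            return new PipelineSketch(new ArrayList<>(steps));
        }
    }
}
```

Because each step sees the output of the previous one, the index passed to appendString in a chain such as removeColumn(0).appendString(0, "!") refers to the record after the removal.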
Copyright © 2021. All rights reserved.