Class StringListToCountsNDArrayTransform
java.lang.Object
  org.datavec.api.transform.transform.BaseTransform
    org.datavec.api.transform.transform.string.StringListToCountsNDArrayTransform
Direct Known Subclasses:
StringListToIndicesNDArrayTransform
public class StringListToCountsNDArrayTransform extends BaseTransform
See Also:
Serialized Form
-
Field Summary
Modifier and Type                Field
protected boolean                binary
protected int                    columnIdx
protected String                 columnName
protected String                 delimiter
protected boolean                ignoreUnknown
protected Map<String,Integer>    map
protected String                 newColumnName
protected List<String>           vocabulary
-
Fields inherited from class org.datavec.api.transform.transform.BaseTransform
inputSchema
-
-
Constructor Summary
Constructor
StringListToCountsNDArrayTransform(String columnName, String newColumnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
StringListToCountsNDArrayTransform(String columnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
-
Method Summary
Modifier and Type                Method                                         Description
String                           columnName()                                   Returns the single column name this op is meant to run on.
String[]                         columnNames()                                  Returns the column names this op is meant to run on.
protected Collection<Integer>    getIndices(String text)
protected INDArray               makeBOWNDArray(Collection<Integer> indices)
Object                           map(Object input)                              Transform an object into another object.
List<Writable>                   map(List<Writable> writables)                  Transform a writable into another writable.
Object                           mapSequence(Object sequence)                   Transform a sequence.
String                           outputColumnName()                             The output column name after the operation has been applied.
String[]                         outputColumnNames()                            The output column names. This will often be the same as the input.
static List<String>              readVocabFromFile(String path)
void                             setInputSchema(Schema inputSchema)             Set the input schema.
String                           toString()
Schema                           transform(Schema inputSchema)
-
Methods inherited from class org.datavec.api.transform.transform.BaseTransform
getInputSchema, mapSequence
-
-
-
-
Constructor Detail
-
StringListToCountsNDArrayTransform
public StringListToCountsNDArrayTransform(String columnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
Parameters:
columnName - The name of the column to convert
vocabulary - The possible tokens that may be present
delimiter - The delimiter for the Strings to convert
ignoreUnknown - Whether to ignore unknown tokens
-
StringListToCountsNDArrayTransform
public StringListToCountsNDArrayTransform(String columnName, String newColumnName, List<String> vocabulary, String delimiter, boolean binary, boolean ignoreUnknown)
Parameters:
columnName - The name of the column to convert
vocabulary - The possible tokens that may be present
delimiter - The delimiter for the Strings to convert
ignoreUnknown - Whether to ignore unknown tokens
-
-
Method Detail
-
readVocabFromFile
public static List<String> readVocabFromFile(String path) throws IOException
Throws:
IOException
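The page does not document the file layout that readVocabFromFile expects. The following is a minimal plain-Java sketch of loading a vocabulary list, assuming one token per line in a UTF-8 file with blank lines skipped. The class VocabFileSketch and its readVocab method are hypothetical stand-ins for illustration, not the DataVec implementation.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; not the DataVec implementation.
final class VocabFileSketch {

    // Assumed layout: one token per line, UTF-8. Lines are trimmed and blank lines skipped.
    static List<String> readVocab(String path) throws IOException {
        List<String> vocabulary = new ArrayList<>();
        for (String line : Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8)) {
            String token = line.trim();
            if (!token.isEmpty()) {
                vocabulary.add(token);
            }
        }
        return vocabulary;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary vocabulary file, read it back, then clean up.
        Path tmp = Files.createTempFile("vocab", ".txt");
        Files.write(tmp, Arrays.asList("apple", "banana", "", "cherry"), StandardCharsets.UTF_8);
        System.out.println(readVocab(tmp.toString())); // prints [apple, banana, cherry]
        Files.delete(tmp);
    }
}
```

A list produced this way has the same type as the vocabulary argument of the constructors above. Whether the real method trims lines or skips blanks is not stated on this page.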
-
setInputSchema
public void setInputSchema(Schema inputSchema)
Description copied from interface: ColumnOp
Set the input schema.
Specified by:
setInputSchema in interface ColumnOp
Overrides:
setInputSchema in class BaseTransform
-
toString
public String toString()
Specified by:
toString in class BaseTransform
-
getIndices
protected Collection<Integer> getIndices(String text)
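getIndices carries no description here. Judging only from the field names (map, delimiter, ignoreUnknown), a plausible reading is that it splits the text on the delimiter and looks each token up in a token-to-index map. The sketch below illustrates that reading in plain Java. The class TokenIndexSketch is hypothetical, and both treating the delimiter as a literal string and throwing on an unknown token when ignoreUnknown is false are assumptions, not documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the DataVec implementation.
final class TokenIndexSketch {
    private final Map<String, Integer> map = new HashMap<>();
    private final String delimiter;
    private final boolean ignoreUnknown;

    TokenIndexSketch(List<String> vocabulary, String delimiter, boolean ignoreUnknown) {
        // Each token's index is its position in the vocabulary list.
        for (int i = 0; i < vocabulary.size(); i++) {
            map.put(vocabulary.get(i), i);
        }
        this.delimiter = delimiter;
        this.ignoreUnknown = ignoreUnknown;
    }

    List<Integer> getIndices(String text) {
        List<Integer> indices = new ArrayList<>();
        if (text.isEmpty()) {
            return indices;
        }
        // Assumption: the delimiter is a literal string, so it is quoted before splitting.
        for (String token : text.split(Pattern.quote(delimiter))) {
            Integer index = map.get(token);
            if (index != null) {
                indices.add(index);
            } else if (!ignoreUnknown) {
                // Assumption: unknown tokens are an error unless ignoreUnknown is set.
                throw new IllegalArgumentException("Unknown token: " + token);
            }
        }
        return indices;
    }

    public static void main(String[] args) {
        TokenIndexSketch sketch =
                new TokenIndexSketch(Arrays.asList("apple", "banana", "cherry"), ",", true);
        System.out.println(sketch.getIndices("apple,cherry,kiwi,apple")); // prints [0, 2, 0]
    }
}
```

Repeated tokens yield repeated indices, which is what allows a later step to count occurrences.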
-
makeBOWNDArray
protected INDArray makeBOWNDArray(Collection<Integer> indices)
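makeBOWNDArray is likewise undocumented. The name suggests it turns a collection of vocabulary indices into a bag-of-words vector. The sketch below shows one way such a vector could be built, using a plain double[] in place of INDArray so that it runs without ND4J. The class BagOfWordsSketch, the explicit vocabularySize parameter, and the reading of binary as "presence flags instead of counts" are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical stand-in for illustration; uses double[] where the real class returns an INDArray.
final class BagOfWordsSketch {

    // Builds a vector of length vocabularySize from token indices.
    // Assumption: binary == true records presence (1.0), otherwise occurrences are counted.
    static double[] makeCounts(Collection<Integer> indices, int vocabularySize, boolean binary) {
        double[] counts = new double[vocabularySize];
        for (int index : indices) {
            if (binary) {
                counts[index] = 1.0;
            } else {
                counts[index] += 1.0;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Collection<Integer> indices = Arrays.asList(0, 2, 0);
        System.out.println(Arrays.toString(makeCounts(indices, 3, false))); // prints [2.0, 0.0, 1.0]
        System.out.println(Arrays.toString(makeCounts(indices, 3, true)));  // prints [1.0, 0.0, 1.0]
    }
}
```

In this sketch the output length depends only on the vocabulary size, so every record maps to a vector of the same shape regardless of how many tokens it contains.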
-
map
public List<Writable> map(List<Writable> writables)
Description copied from interface: Transform
Transform a writable into another writable.
Parameters:
writables - the record to transform
Returns:
the transformed writable
-
map
public Object map(Object input)
Transform an object into another object.
Parameters:
input - the record to transform
Returns:
the transformed writable
-
outputColumnName
public String outputColumnName()
The output column name after the operation has been applied.
Returns:
the output column name
-
outputColumnNames
public String[] outputColumnNames()
The output column names. This will often be the same as the input.
Returns:
the output column names
-
columnNames
public String[] columnNames()
Returns the column names this op is meant to run on.
-
columnName
public String columnName()
Returns the single column name this op is meant to run on.