Package tech.tablesaw.columns.strings
Class TextualStringData
- java.lang.Object
-
- tech.tablesaw.columns.strings.TextualStringData
-
- All Implemented Interfaces:
Iterable<String>, StringData, StringFilters, StringReduceUtils, FilterSpec<Selection>, StringFilterSpec<Selection>
public class TextualStringData extends Object implements StringData
A column that contains String values. They are assumed to be free-form text. For categorical data, use StringColumn. This is the default column type for SQL longvarchar and longnvarchar types.
Because the MISSING_VALUE for this column type is an empty string, there is little or no need for special handling of missing values in this class's methods.
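The empty-string convention described above can be illustrated with a small standalone sketch. It does not use Tablesaw's classes; the class name EmptyStringMissingSketch and its helper methods are hypothetical, and the only assumption carried over from the text is that a missing cell is stored as "".

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration; not Tablesaw's implementation. */
final class EmptyStringMissingSketch {
    // Assumption taken from the class description: the missing value is the empty string.
    static final String MISSING_VALUE = "";

    /** A value counts as missing only when it is exactly the empty string. */
    static boolean isMissingValue(String s) {
        return MISSING_VALUE.equals(s);
    }

    /** Counts the cells that hold the missing-value marker. */
    static int countMissing(List<String> values) {
        int count = 0;
        for (String v : values) {
            if (isMissingValue(v)) {
                count++;
            }
        }
        return count;
    }

    /** Returns a new list without the missing cells; the input is left untouched. */
    static List<String> removeMissing(List<String> values) {
        List<String> kept = new ArrayList<>();
        for (String v : values) {
            if (!isMissingValue(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> values = List.of("first note", "", "second note");
        System.out.println(countMissing(values));  // prints 1
        System.out.println(removeMissing(values)); // prints [first note, second note]
    }
}
```

Because the marker is an ordinary String, operations such as sorting or comparing need no null checks in this sketch, which is the point the paragraph above makes.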
-
-
Method Summary
Modifier and Type | Method | Description
TextualStringData | addAll(List<String> stringValues) | Add all the strings in the list to this column
TextualStringData | append(String value) | Added for naming consistency with all other columns
void | append(Column<String> column) |
TextualStringData | appendMissing() |
TextualStringData | appendObj(Object obj) |
byte[] | asBytes(int rowNumber) | Returns the contents of the cell at rowNumber as a byte[]
double[] | asDoubleArray() |
List<String> | asList() | Returns a List<String> representation of all the values in this column
String[] | asObjectArray() |
Set<String> | asSet() |
void | clear() |
boolean | contains(String aString) | Returns true if this column contains a cell with the given string, and false otherwise
TextualStringData | copy() |
Table | countByCategory(String columnName) |
int | countMissing() | Returns the count of missing values in this column
int | countOccurrences(String value) |
int | countUnique() |
static TextualStringData | create() |
static TextualStringData | create(int size) |
static TextualStringData | create(String... strings) |
static TextualStringData | create(Collection<String> strings) |
static TextualStringData | create(Stream<String> stream) |
TextualStringData | emptyCopy() |
TextualStringData | emptyCopy(int rowSize) |
boolean | equals(int rowNumber1, int rowNumber2) |
int | firstIndexOf(String value) |
String | get(int rowIndex) | Returns the value at rowIndex in this column.
DictionaryMap | getDictionary() | Returns null, as this Column is not backed by a dictionaryMap
double | getDouble(int i) | Returns a double that can stand in for the string at index i in some ML applications
List<BooleanColumn> | getDummies() | Unsupported operation. This can't be used on a text column, as the number of BooleanColumns would likely be excessive
boolean | isEmpty() |
Selection | isIn(String... strings) |
Selection | isIn(Collection<String> strings) |
boolean | isMissing(int rowNumber) |
Selection | isNotIn(String... strings) |
Selection | isNotIn(Collection<String> strings) |
Iterator<String> | iterator() |
TextualStringData | lag(int n) |
TextualStringData | lead(int n) |
TextualStringData | removeMissing() |
it.unimi.dsi.fastutil.ints.IntComparator | rowComparator() |
TextualStringData | set(int rowIndex, String stringValue) |
TextualStringData | set(Selection rowSelection, String newValue) | Conditionally update this column, replacing current values with newValue for all rows where the current value matches the selection criteria
TextualStringData | setMissing(int i) |
int | size() | Returns the number of elements (a.k.a. rows or cells) in the column
void | sortAscending() |
void | sortDescending() |
Table | summary() |
TextualStringData | unique() | Returns a new Column containing all the unique values in this column
int | valueHash(int rowNumber) |
static boolean | valueIsMissing(String string) |
TextualStringData | where(Selection selection) |
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Methods inherited from interface tech.tablesaw.columns.strings.StringData
subset
-
Methods inherited from interface tech.tablesaw.columns.strings.StringFilters
containsString, endsWith, equalsIgnoreCase, equalsIgnoreCase, eval, eval, eval, eval, isAlpha, isAlphaNumeric, isEmptyString, isEqualTo, isEqualTo, isIn, isLongerThan, isLowerCase, isMissing, isNotEqualTo, isNotEqualTo, isNotIn, isNotMissing, isNumeric, isShorterThan, isUpperCase, lengthEquals, matchesRegex, startsWith, startsWith
-
Methods inherited from interface tech.tablesaw.columns.strings.StringReduceUtils
appendAll, appendAll
-
-
-
-
Method Detail
-
valueHash
public int valueHash(int rowNumber)
-
equals
public boolean equals(int rowNumber1, int rowNumber2)
-
valueIsMissing
public static boolean valueIsMissing(String string)
-
appendMissing
public TextualStringData appendMissing()
- Specified by:
appendMissing in interface StringData
-
create
public static TextualStringData create()
-
create
public static TextualStringData create(String... strings)
-
create
public static TextualStringData create(Collection<String> strings)
-
create
public static TextualStringData create(int size)
-
create
public static TextualStringData create(Stream<String> stream)
-
isMissing
public boolean isMissing(int rowNumber)
- Specified by:
isMissing in interface StringData
-
emptyCopy
public TextualStringData emptyCopy()
- Specified by:
emptyCopy in interface StringData
-
emptyCopy
public TextualStringData emptyCopy(int rowSize)
- Specified by:
emptyCopy in interface StringData
-
sortAscending
public void sortAscending()
- Specified by:
sortAscending in interface StringData
-
sortDescending
public void sortDescending()
- Specified by:
sortDescending in interface StringData
-
size
public int size()
Returns the number of elements (a.k.a. rows or cells) in the column.
- Specified by:
size in interface StringFilters
- Specified by:
size in interface StringReduceUtils
- Returns:
size as int
-
get
public String get(int rowIndex)
Returns the value at rowIndex in this column. The index is zero-based.
- Specified by:
get in interface StringFilters
- Parameters:
rowIndex - index of the row
- Returns:
value as String
- Throws:
IndexOutOfBoundsException - if the given rowIndex is not in the column
-
asList
public List<String> asList()
Returns a List<String> representation of all the values in this column.
NOTE: Unless you really need a List<String>, consider using the column itself for large datasets, as it uses much less memory.
- Specified by:
asList in interface StringData
- Returns:
values as a list of String.
-
countByCategory
public Table countByCategory(String columnName)
- Specified by:
countByCategory in interface StringData
-
summary
public Table summary()
-
clear
public void clear()
- Specified by:
clear in interface StringData
-
lead
public TextualStringData lead(int n)
- Specified by:
lead in interface StringData
-
lag
public TextualStringData lag(int n)
- Specified by:
lag in interface StringData
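This page gives no description for lead and lag. The sketch below shows one common reading of these operations, stated here as an assumption rather than as this class's documented behaviour: lag(n) shifts values n rows later and pads the vacated rows with the missing value, lead(n) shifts them n rows earlier, and the size stays the same. LeadLagSketch and its methods are hypothetical and work on a plain List<String>.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone illustration of shifting a column; not Tablesaw's implementation. */
final class LeadLagSketch {
    // Assumption: vacated rows are filled with the empty-string missing value.
    static final String MISSING_VALUE = "";

    /** Row i of the result holds the input's row i - n, or the missing value when out of range. */
    static List<String> lag(List<String> values, int n) {
        int size = values.size();
        List<String> shifted = new ArrayList<>(size);
        for (int row = 0; row < size; row++) {
            int source = row - n;
            boolean inRange = source >= 0 && source < size;
            shifted.add(inRange ? values.get(source) : MISSING_VALUE);
        }
        return shifted;
    }

    /** A lead by n is modelled as a lag by -n. */
    static List<String> lead(List<String> values, int n) {
        return lag(values, -n);
    }

    public static void main(String[] args) {
        List<String> values = List.of("a", "b", "c");
        System.out.println(lag(values, 1));  // prints [, a, b]
        System.out.println(lead(values, 1)); // prints [b, c, ]
    }
}
```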
-
set
public TextualStringData set(Selection rowSelection, String newValue)
Conditionally update this column, replacing current values with newValue for all rows where the current value matches the selection criteria.
Examples:
myCatColumn.set(myCatColumn.isEqualTo("Cat"), "Dog"); // no more cats
myCatColumn.set(myCatColumn.isMissing(), "Fox"); // no more missing values
- Specified by:
set in interface StringData
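The conditional update performed by set(Selection, String) can be sketched without the library. In this minimal sketch the Selection is modelled as a java.util.BitSet of row indexes and the column as a mutable List<String>; both are stand-ins chosen for illustration, and ConditionalSetSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Objects;

/** Standalone illustration of a selection-driven update; not Tablesaw's implementation. */
final class ConditionalSetSketch {
    /** Builds a row selection: bit i is set when row i equals target. */
    static BitSet isEqualTo(List<String> values, String target) {
        BitSet selected = new BitSet(values.size());
        for (int i = 0; i < values.size(); i++) {
            if (Objects.equals(values.get(i), target)) {
                selected.set(i);
            }
        }
        return selected;
    }

    /** Overwrites every selected row with newValue, in place; other rows are untouched. */
    static void set(List<String> values, BitSet rowSelection, String newValue) {
        for (int i = rowSelection.nextSetBit(0); i >= 0; i = rowSelection.nextSetBit(i + 1)) {
            values.set(i, newValue);
        }
    }

    public static void main(String[] args) {
        List<String> pets = new ArrayList<>(List.of("Cat", "Bird", "Cat", ""));
        set(pets, isEqualTo(pets, "Cat"), "Dog"); // no more cats
        set(pets, isEqualTo(pets, ""), "Fox");    // no more empty (missing) cells
        System.out.println(pets);                 // prints [Dog, Bird, Dog, Fox]
    }
}
```

The selection is computed first and applied second, so the predicate is evaluated against the values as they were before any replacement.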
-
set
public TextualStringData set(int rowIndex, String stringValue)
- Specified by:
set in interface StringData
-
countUnique
public int countUnique()
- Specified by:
countUnique in interface StringData
-
contains
public boolean contains(String aString)
Returns true if this column contains a cell with the given string, and false otherwise.
- Specified by:
contains in interface StringData
- Parameters:
aString - the value to look for
- Returns:
true if contains, false otherwise
-
setMissing
public TextualStringData setMissing(int i)
- Specified by:
setMissing in interface StringData
-
addAll
public TextualStringData addAll(List<String> stringValues)
Add all the strings in the list to this column.
- Parameters:
stringValues - a list of values
-
rowComparator
public it.unimi.dsi.fastutil.ints.IntComparator rowComparator()
- Specified by:
rowComparator in interface StringData
-
isEmpty
public boolean isEmpty()
- Specified by:
isEmpty in interface StringData
-
unique
public TextualStringData unique()
Returns a new Column containing all the unique values in this column.
- Specified by:
unique in interface StringData
- Returns:
a column with unique values.
-
where
public TextualStringData where(Selection selection)
- Specified by:
where in interface StringData
-
copy
public TextualStringData copy()
- Specified by:
copy in interface StringData
-
append
public void append(Column<String> column)
- Specified by:
append in interface StringData
-
countMissing
public int countMissing()
Returns the count of missing values in this column.
- Specified by:
countMissing in interface StringData
-
removeMissing
public TextualStringData removeMissing()
- Specified by:
removeMissing in interface StringData
-
asSet
public Set<String> asSet()
- Specified by:
asSet in interface StringData
-
asBytes
public byte[] asBytes(int rowNumber)
Returns the contents of the cell at rowNumber as a byte[].
- Specified by:
asBytes in interface StringData
-
append
public TextualStringData append(String value)
Added for naming consistency with all other columns.
- Specified by:
append in interface StringData
-
appendObj
public TextualStringData appendObj(Object obj)
- Specified by:
appendObj in interface StringData
-
isIn
public Selection isIn(String... strings)
- Specified by:
isIn in interface StringFilters
- Specified by:
isIn in interface StringFilterSpec<Selection>
-
isIn
public Selection isIn(Collection<String> strings)
- Specified by:
isIn in interface StringFilters
- Specified by:
isIn in interface StringFilterSpec<Selection>
-
isNotIn
public Selection isNotIn(String... strings)
- Specified by:
isNotIn in interface StringFilters
- Specified by:
isNotIn in interface StringFilterSpec<Selection>
-
isNotIn
public Selection isNotIn(Collection<String> strings)
- Specified by:
isNotIn in interface StringFilters
- Specified by:
isNotIn in interface StringFilterSpec<Selection>
-
firstIndexOf
public int firstIndexOf(String value)
- Specified by:
firstIndexOf in interface StringData
-
asObjectArray
public String[] asObjectArray()
- Specified by:
asObjectArray in interface StringData
-
getDouble
public double getDouble(int i)
Returns a double that can stand in for the string at index i in some ML applications.
TODO: Evaluate use of hashCode() here for uniqueness
- Specified by:
getDouble in interface StringData
- Parameters:
i - The index in this column
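The TODO about hashCode() and uniqueness can be made concrete with a short sketch. It assumes, purely for illustration, that the numeric stand-in is derived from String.hashCode(); this page does not confirm how getDouble actually computes its result. HashStandInSketch and standIn are hypothetical names.

```java
/** Standalone illustration of a hash-based numeric stand-in; not Tablesaw's implementation. */
final class HashStandInSketch {
    /** Hypothetical stand-in: widens String.hashCode() to a double. */
    static double standIn(String s) {
        return (double) s.hashCode();
    }

    public static void main(String[] args) {
        // "Aa" and "BB" are distinct strings with the same hash code (2112),
        // so a hash-based stand-in cannot tell them apart.
        System.out.println(standIn("Aa"));                  // prints 2112.0
        System.out.println(standIn("BB"));                  // prints 2112.0
        System.out.println(standIn("Aa") == standIn("BB")); // prints true
    }
}
```

Under that assumption, two different cell values can map to the same double, which is the uniqueness concern the TODO raises.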
-
asDoubleArray
public double[] asDoubleArray()
- Specified by:
asDoubleArray in interface StringData
-
countOccurrences
public int countOccurrences(String value)
- Specified by:
countOccurrences in interface StringData
-
getDummies
public List<BooleanColumn> getDummies()
Unsupported operation. This can't be used on a text column, as the number of BooleanColumns would likely be excessive.
- Specified by:
getDummies in interface StringData
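To see why dummy columns are impractical for free-form text, consider a generic one-hot encoding sketch. It is not the library's getDummies; DummyColumnsSketch is a hypothetical helper that builds one boolean array per distinct value of a plain List<String>.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Standalone illustration of dummy (one-hot) encoding; not Tablesaw's implementation. */
final class DummyColumnsSketch {
    /** One boolean array per distinct value; array[row] is true when that row holds the value. */
    static Map<String, boolean[]> dummies(List<String> values) {
        Map<String, boolean[]> result = new LinkedHashMap<>();
        for (int row = 0; row < values.size(); row++) {
            boolean[] flags = result.computeIfAbsent(values.get(row), k -> new boolean[values.size()]);
            flags[row] = true;
        }
        return result;
    }

    public static void main(String[] args) {
        // Categorical data: few distinct values, so few dummy columns.
        System.out.println(dummies(List.of("red", "blue", "red", "blue")).size()); // prints 2
        // Free-form text: nearly every row is distinct, so one dummy column per row.
        System.out.println(dummies(List.of("great product", "arrived late", "would buy again")).size()); // prints 3
    }
}
```

With n rows of mostly distinct text, this produces roughly n arrays of length n, which is the excessive growth the description warns about.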
-
getDictionary
@Nullable public DictionaryMap getDictionary()
Returns null, as this Column is not backed by a dictionaryMap.
- Specified by:
getDictionary in interface StringData
-
-