public class CategoryColumn extends AbstractColumn implements CategoryFilters, CategoryColumnUtils, IntConvertibleColumn, Iterable<String>
Because the MISSING_VALUE for this column type is an empty string, there is little or no need for special handling of missing values in this class's methods.
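Because a missing cell is simply the empty string, ordinary string comparisons are enough to count or replace missing values. The following is a minimal sketch of that idea using plain Java collections. The class `MissingValueSketch` and its methods are hypothetical stand-ins for illustration and are not part of this library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: models a column as a plain list of strings
// in which the empty string marks a missing cell.
class MissingValueSketch {
    // Assumption taken from the class description: missing == empty string.
    static final String MISSING_VALUE = "";

    // Counts missing cells with an ordinary equality check; no null handling is needed.
    static int countMissing(List<String> cells) {
        int count = 0;
        for (String cell : cells) {
            if (MISSING_VALUE.equals(cell)) {
                count++;
            }
        }
        return count;
    }

    // Returns a new list in which every missing cell is replaced by the given value.
    static List<String> fillMissing(List<String> cells, String replacement) {
        List<String> result = new ArrayList<>(cells.size());
        for (String cell : cells) {
            result.add(MISSING_VALUE.equals(cell) ? replacement : cell);
        }
        return result;
    }
}
```

Since the marker is a regular string, it sorts, compares, and prints like any other value, which is why the methods of this class rarely need a separate code path for it.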
| Modifier and Type | Field and Description |
|---|---|
| static String | MISSING_VALUE |
| it.unimi.dsi.fastutil.ints.IntComparator | rowComparator |

isMissing, isNotMissing

| Constructor and Description |
|---|
| CategoryColumn(ColumnMetadata metadata) |
| CategoryColumn(String name) |
| CategoryColumn(String name, int size) |
| CategoryColumn(String name, List<String> categories) |
| CategoryColumn(String name, String[] categories) |

| Modifier and Type | Method and Description |
|---|---|
| void | add(String stringValue): Deprecated. Use append(String value) instead |
| void | addAll(List<String> stringValues): Add all the strings in the list to this column |
| void | append(Column column) |
| void | append(String value): Added for naming consistency with all other columns |
| void | appendCell(String object) |
| CategoryColumn | appendString(CategoryColumn append): Return a copy of this column with the given string appended |
| CategoryColumn | appendString(String append): Return a copy of this column with the given string appended |
| byte[] | asBytes(int rowNumber): Returns the contents of the cell at rowNumber as a byte[] |
| int[] | asIntArray() |
| IntColumn | asIntColumn() |
| List<String> | asList(): Returns a List<String> representation of all the values in this column. NOTE: Unless you really need a list of strings, consider using the column itself for large datasets, as it uses much less memory |
| Set<String> | asSet() |
| List<String> | bottom(int n): Returns the smallest ("bottom") n values in the column |
| int | byteSize(): Returns the width of a cell in this column, in bytes. |
| void | clear() |
| boolean | contains(String aString): Returns true if this column contains a cell with the given string, and false otherwise |
| static String | convert(String stringValue) |
| CategoryColumn | copy(): Returns a deep copy of the receiver |
| Table | countByCategory() |
| int | countMissing(): Returns the count of missing values in this column |
| int | countUnique(): Returns the count of unique values in this column. |
| it.unimi.dsi.fastutil.ints.IntArrayList | data(): Returns the integers that back this column. |
| DictionaryMap | dictionaryMap() |
| CategoryColumn | emptyCopy(): Returns a copy of the receiver with no data. |
| CategoryColumn | emptyCopy(int rowSize): Returns an empty copy of the receiver, with its internal storage initialized to the given row size. |
| String | get(int rowIndex): Returns the value at rowIndex in this column. |
| List<BooleanColumn> | getDummies(): Returns a list of boolean columns suitable for use as dummy variables in, for example, regression analysis, where a column of categorical data must be encoded as a list of columns, such that each column represents a single category and indicates whether it is present (1) or not present (0) |
| int | getInt(int rowNumber) |
| String | getString(int row): Returns a string representation of the value at the given row. |
| it.unimi.dsi.fastutil.ints.IntArrayList | getValues(it.unimi.dsi.fastutil.ints.IntArrayList indexes): Returns all the values associated with the given indexes. |
| int[] | indexes(): Returns the raw indexes that this column contains. |
| void | initializeWith(it.unimi.dsi.fastutil.ints.IntArrayList list, DictionaryMap map): Initializes this Column with the given values for performance |
| boolean | isEmpty(): Returns true if the column has no data |
| Selection | isEqualTo(CategoryColumn other) |
| Selection | isEqualTo(String string) |
| Selection | isIn(Collection<String> strings) |
| Selection | isIn(String... strings) |
| Selection | isMissing() |
| Selection | isNotEqualTo(String string) |
| Selection | isNotIn(Collection<String> strings) |
| Selection | isNotIn(String... strings) |
| Selection | isNotMissing() |
| Iterator<String> | iterator() |
| String | print() |
| CategoryColumn | replaceAll(String[] regexArray, String replacement): Creates a new column, replacing each string in this column with a new string formed by replacing any substring that matches the regex |
| it.unimi.dsi.fastutil.ints.IntComparator | rowComparator() |
| Selection | select(StringBiPredicate predicate, String value) |
| Selection | select(StringPredicate predicate) |
| CategoryColumn | selectIf(StringPredicate predicate) |
| void | set(int rowIndex, String stringValue) |
| void | set(String newValue, Selection rowSelection): Conditionally update this column, replacing current values with newValue for all rows where the current value matches the selection criteria. Examples: myCatColumn.set("Dog", myCatColumn.isEqualTo("Cat")); // no more cats. myCatColumn.set("Fox", myCatColumn.isMissing()); // no more missing values |
| int | size(): Returns the number of elements in this column |
| void | sortAscending() |
| void | sortDescending() |
| Table | summary() |
| CategoryColumn | tokenizeAndRemoveDuplicates() |
| CategoryColumn | tokenizeAndSort(): Splits on whitespace and returns the lexicographically sorted result. |
| CategoryColumn | tokenizeAndSort(String separator) |
| List<String> | top(int n): Returns the largest ("top") n values in the column |
| String | toString() |
| ColumnType | type(): Returns this column's ColumnType |
| CategoryColumn | unique(): Returns a new Column containing all the unique values in this column |
| it.unimi.dsi.fastutil.ints.IntArrayList | values(): Returns the integer encoded value of each cell in this column. |

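Several methods in the summary above (data(), values(), getInt(int), dictionaryMap()) expose an integer encoding of the strings, and getDummies() describes one boolean column per category. The following minimal sketch illustrates both ideas under stated assumptions: codes are handed out in first-seen order, and a plain map stands in for DictionaryMap. The class `DictionaryEncodingSketch` is hypothetical, and the library's actual storage and code assignment may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the library's implementation.
class DictionaryEncodingSketch {
    // Maps each distinct string to a small integer code (assumed: first-seen order).
    private final Map<String, Integer> codeByCategory = new LinkedHashMap<>();
    // Reverse lookup: the list index is the code.
    private final List<String> categoryByCode = new ArrayList<>();
    // One code per row; this is the compact data that backs the column.
    private final List<Integer> codes = new ArrayList<>();

    // Appends a value, assigning a new code the first time a string is seen.
    void append(String value) {
        Integer code = codeByCategory.get(value);
        if (code == null) {
            code = categoryByCode.size();
            codeByCategory.put(value, code);
            categoryByCode.add(value);
        }
        codes.add(code);
    }

    // Decodes the row's integer back to its string.
    String get(int row) {
        return categoryByCode.get(codes.get(row));
    }

    // Returns the integer code stored for the row.
    int getInt(int row) {
        return codes.get(row);
    }

    // The number of distinct strings equals the dictionary size.
    int countUnique() {
        return categoryByCode.size();
    }

    // Builds one boolean array per category: true where the row holds that category.
    Map<String, boolean[]> dummies() {
        Map<String, boolean[]> result = new LinkedHashMap<>();
        for (String category : categoryByCode) {
            result.put(category, new boolean[codes.size()]);
        }
        for (int row = 0; row < codes.size(); row++) {
            result.get(get(row))[row] = true;
        }
        return result;
    }
}
```

Storing one small integer per row, plus each distinct string once, is the reason the asList() note recommends working with the column itself on large datasets.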
columnMetadata, columnWidth, comment, difference, id, metadata, name, setComment, setName
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
empty, endsWith, equalToIgnoringCase, hasLengthEqualTo, isAlpha, isAlphaNumeric, isLongerThan, isLowerCase, isNumeric, isShorterThan, isUpperCase, matchesRegex, startsWith, stringContains
abbreviate, commonPrefix, commonSuffix, distance, join, lowerCase, padEnd, padStart, replaceAll, replaceFirst, substring, substring, trim, upperCase
appendAll, appendAll
asDoubleArray, columnMetadata, columnWidth, comment, difference, first, first, id, last, last, metadata, name, rolling, setComment, setName, subset, title
forEach, spliterator
asIntegerSet
public static final String MISSING_VALUE
public final it.unimi.dsi.fastutil.ints.IntComparator rowComparator
public CategoryColumn(String name)
public CategoryColumn(ColumnMetadata metadata)
public CategoryColumn(String name, int size)
public ColumnType type()
Specified by: type in interface Column
public String getString(int row)
public CategoryColumn emptyCopy()
public CategoryColumn emptyCopy(int rowSize)
public void sortAscending()
Specified by: sortAscending in interface Column
public void sortDescending()
Specified by: sortDescending in interface Column
public int size()
Specified by: size in interface CategoryReduceUtils
Specified by: size in interface Column
public String get(int rowIndex)
Parameters: rowIndex - index of the row
Throws: IndexOutOfBoundsException - if the given rowIndex is not in the column
public List<String> asList()
public int[] asIntArray()
Specified by: asIntArray in interface IntConvertibleColumn
public Table countByCategory()
public Selection isEqualTo(CategoryColumn other)
public void set(String newValue, Selection rowSelection)
public void set(int rowIndex, String stringValue)
public int countUnique()
Specified by: countUnique in interface Column
public List<String> top(int n)
Parameters: n - The maximum number of records to return. The actual number will be smaller if n is greater than the number of observations in the column
public List<String> bottom(int n)
Parameters: n - The maximum number of records to return. The actual number will be smaller if n is greater than the number of observations in the column
public void add(String stringValue)
public void initializeWith(it.unimi.dsi.fastutil.ints.IntArrayList list, DictionaryMap map)
public boolean contains(String aString)
Parameters: aString - the value to look for
public it.unimi.dsi.fastutil.ints.IntArrayList getValues(it.unimi.dsi.fastutil.ints.IntArrayList indexes)
Parameters: indexes - the indexes
public void addAll(List<String> stringValues)
Parameters: stringValues - a list of values
public void appendCell(String object)
Specified by: appendCell in interface Column
Overrides: appendCell in class AbstractColumn
public it.unimi.dsi.fastutil.ints.IntComparator rowComparator()
Specified by: rowComparator in interface Column
public boolean isEmpty()
public List<BooleanColumn> getDummies()
public int getInt(int rowNumber)
public CategoryColumn unique()
public it.unimi.dsi.fastutil.ints.IntArrayList data()
public IntColumn asIntColumn()
public DictionaryMap dictionaryMap()
Specified by: dictionaryMap in interface CategoryColumnUtils
public int[] indexes()
public CategoryColumn appendString(CategoryColumn append)
Parameters: append - the column to append
public CategoryColumn appendString(String append)
Parameters: append - the string to append
public CategoryColumn replaceAll(String[] regexArray, String replacement)
Parameters: regexArray - the array of regular expressions to match
Parameters: replacement - the replacement string
public CategoryColumn tokenizeAndSort(String separator)
public CategoryColumn tokenizeAndSort()
public CategoryColumn tokenizeAndRemoveDuplicates()
public Selection isNotMissing()
Specified by: isNotMissing in interface Column
public Selection select(StringPredicate predicate)
public Selection select(StringBiPredicate predicate, String value)
public CategoryColumn copy()
public int countMissing()
Specified by: countMissing in interface Column
public CategoryColumn selectIf(StringPredicate predicate)
public it.unimi.dsi.fastutil.ints.IntArrayList values()
Specified by: values in interface CategoryColumnUtils
public int byteSize()
public byte[] asBytes(int rowNumber)
public Selection isIn(Collection<String> strings)
public void append(String value)
public Selection isNotIn(Collection<String> strings)
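
The filters listed above (isEqualTo, isIn, isNotIn, isMissing) and set(String newValue, Selection rowSelection) are designed to work together: a filter yields a set of row positions, and set writes one value into exactly those rows. The following minimal sketch shows that pattern, assuming a java.util.BitSet stands in for Selection and a mutable list stands in for the column. The class `SelectionSketch` and its methods are hypothetical and do not reflect how the library implements Selection.

```java
import java.util.BitSet;
import java.util.Collection;
import java.util.List;

// Hypothetical illustration only; BitSet is an assumed stand-in for Selection.
class SelectionSketch {
    // Marks every row whose value is one of the wanted strings.
    static BitSet isIn(List<String> cells, Collection<String> wanted) {
        BitSet selected = new BitSet(cells.size());
        for (int row = 0; row < cells.size(); row++) {
            if (wanted.contains(cells.get(row))) {
                selected.set(row);
            }
        }
        return selected;
    }

    // Marks every row whose value is none of the given strings (complement of isIn).
    static BitSet isNotIn(List<String> cells, Collection<String> unwanted) {
        BitSet selected = isIn(cells, unwanted);
        selected.flip(0, cells.size());
        return selected;
    }

    // Writes newValue into each selected row, leaving all other rows untouched.
    static void set(List<String> cells, String newValue, BitSet rows) {
        for (int row = rows.nextSetBit(0); row >= 0; row = rows.nextSetBit(row + 1)) {
            cells.set(row, newValue);
        }
    }
}
```

Computing the selection first and applying it second mirrors the documented example myCatColumn.set("Dog", myCatColumn.isEqualTo("Cat")), where the filter is evaluated before the update runs.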
Copyright © 2018. All rights reserved.