public class CategoryColumn extends AbstractColumn implements CategoryFilters, CategoryColumnUtils, IntConvertibleColumn, Iterable<String>
Because the MISSING_VALUE for this column type is an empty string, there is little or no need for special handling of missing values in this class's methods.
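As a rough illustration of why an empty-string sentinel needs little special handling, here is a minimal standalone sketch. It does not use this library: the class `MissingValueSketch` and its method are hypothetical, and the only detail taken from the description above is that the missing value is an empty string.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, standalone illustration; not part of the CategoryColumn API.
final class MissingValueSketch {
    // Assumption taken from the class description: a missing cell holds "".
    static final String MISSING_VALUE = "";

    // Counts cells equal to the sentinel; plain String equality is enough.
    static int countMissing(List<String> cells) {
        int count = 0;
        for (String cell : cells) {
            if (MISSING_VALUE.equals(cell)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> cells = Arrays.asList("red", "", "blue", "");
        System.out.println(countMissing(cells)); // prints 2
    }
}
```

Because the sentinel is an ordinary `String`, it can be stored, compared, and printed like any other value, which is presumably what the description means by "little or no need for special handling".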
Modifier and Type | Field and Description |
---|---|
static String |
MISSING_VALUE |
it.unimi.dsi.fastutil.ints.IntComparator |
rowComparator |
isMissing, isNotMissing
Constructor and Description |
---|
CategoryColumn(ColumnMetadata metadata) |
CategoryColumn(String name) |
CategoryColumn(String name,
int size) |
CategoryColumn(String name,
List<String> categories) |
CategoryColumn(String name,
String[] categories) |
Modifier and Type | Method and Description |
---|---|
void |
add(String stringValue) |
void |
addAll(List<String> stringValues)
Adds all the strings in the list to this column
|
void |
append(Column column) |
void |
appendCell(String object) |
CategoryColumn |
appendString(CategoryColumn append)
Returns a copy of this column with the given column appended
|
CategoryColumn |
appendString(String append)
Returns a copy of this column with the given string appended
|
byte[] |
asBytes(int rowNumber)
Returns the contents of the cell at rowNumber as a byte[]
|
Set<String> |
asSet() |
List<String> |
bottom(int n)
Returns the smallest ("bottom") n values in the column
|
int |
byteSize()
Returns the width of a cell in this column, in bytes
|
void |
clear() |
boolean |
contains(String aString)
Returns true if this column contains a cell with the given string, and false otherwise
|
static String |
convert(String stringValue) |
CategoryColumn |
copy()
Returns a deep copy of the receiver
|
Table |
countByCategory() |
int |
countMissing()
Returns the count of missing values in this column
|
int |
countUnique()
Returns the count of unique values in this column
|
it.unimi.dsi.fastutil.ints.IntArrayList |
data()
Returns the integers that back this column
|
DictionaryMap |
dictionaryMap() |
CategoryColumn |
emptyCopy()
Returns a copy of the receiver with no data.
|
CategoryColumn |
emptyCopy(int rowSize)
Returns an empty copy of the receiver, with its internal storage initialized to the given row size
|
String |
get(int rowIndex)
Returns the value at rowIndex in this column.
|
List<BooleanColumn> |
getDummies()
Returns a list of boolean columns suitable for use as dummy variables in, for example, regression analysis,
where a column of categorical data must be encoded as a list of columns, such that each column represents
a single category and indicates whether it is present (1) or not present (0)
|
int |
getInt(int rowNumber) |
String |
getString(int row)
Returns a string representation of the value at the given row
|
it.unimi.dsi.fastutil.ints.IntArrayList |
getValues(it.unimi.dsi.fastutil.ints.IntArrayList indexes)
Returns all the values associated with the given indexes
|
int[] |
indexes()
Returns the raw indexes that this column contains.
|
void |
initializeWith(it.unimi.dsi.fastutil.ints.IntArrayList list,
DictionaryMap map)
Initializes this Column with the given values for performance
|
boolean |
isEmpty()
Returns true if the column has no data
|
Selection |
isEqualTo(String string) |
Selection |
isIn(Collection<String> strings) |
Selection |
isIn(String... strings) |
Selection |
isMissing() |
Selection |
isNotEqualTo(String string) |
Selection |
isNotIn(Collection<String> strings) |
Selection |
isNotIn(String... strings) |
Selection |
isNotMissing() |
Iterator<String> |
iterator() |
String |
print() |
CategoryColumn |
replaceAll(String[] regexArray,
String replacement)
Creates a new column, replacing each string in this column with a new string formed by
replacing any substring that matches the regex
|
it.unimi.dsi.fastutil.ints.IntComparator |
rowComparator() |
Selection |
select(StringBiPredicate predicate,
String value) |
Selection |
select(StringPredicate predicate) |
CategoryColumn |
selectIf(StringPredicate predicate) |
void |
set(int rowIndex,
String stringValue) |
int |
size()
Returns the number of elements in this column
|
void |
sortAscending() |
void |
sortDescending() |
Table |
summary() |
int[] |
toIntArray() |
IntColumn |
toIntColumn() |
CategoryColumn |
tokenizeAndRemoveDuplicates() |
CategoryColumn |
tokenizeAndSort()
Splits on whitespace and returns the lexicographically sorted result
|
CategoryColumn |
tokenizeAndSort(String separator) |
List<String> |
toList()
Returns a List<String> representation of all the values in this column.
NOTE: Unless you really need strings, consider using the column itself for large datasets, as it uses much less memory.
|
List<String> |
top(int n)
Returns the largest ("top") n values in the column
|
String |
toString() |
ColumnType |
type()
Returns this column's ColumnType
|
CategoryColumn |
unique()
Returns a new Column containing all the unique values in this column
|
it.unimi.dsi.fastutil.ints.IntArrayList |
values()
Returns the integer encoded value of each cell in this column.
|
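
The summary above mentions integers that back the column (`data()`, `values()`) and a `DictionaryMap`. The following sketch shows the general idea of dictionary encoding under stated assumptions. It is not this library's implementation: `DictionaryEncodingSketch` and its members are hypothetical, and the real `DictionaryMap` may assign keys differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a dictionary-encoded string column.
final class DictionaryEncodingSketch {
    private final Map<String, Integer> keyOf = new HashMap<>(); // string -> int key
    private final List<String> valueOf = new ArrayList<>();     // int key -> string
    private final List<Integer> data = new ArrayList<>();       // one key per row

    // Stores the key for the value, assigning a new key on first sight.
    void add(String value) {
        Integer key = keyOf.get(value);
        if (key == null) {
            key = valueOf.size();
            keyOf.put(value, key);
            valueOf.add(value);
        }
        data.add(key);
    }

    // Decodes the key stored at rowIndex back to its string.
    String get(int rowIndex) {
        return valueOf.get(data.get(rowIndex));
    }

    // Returns a copy of the integer keys, one per row.
    List<Integer> data() {
        return new ArrayList<>(data);
    }

    // The number of distinct strings equals the dictionary size.
    int countUnique() {
        return valueOf.size();
    }
}
```

Each distinct string is stored once, so a column with many repeated categories needs only one small integer per row. This is consistent with the note on `toList()` that the column uses much less memory than a list of strings.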
columnMetadata, columnWidth, comment, difference, id, metadata, name, setComment, setName
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
empty, endsWith, equalToIgnoringCase, hasLengthEqualTo, isAlpha, isAlphaNumeric, isLongerThan, isLowerCase, isNumeric, isShorterThan, isUpperCase, matchesRegex, startsWith, stringContains
abbreviate, commonPrefix, commonSuffix, distance, join, lowerCase, padEnd, padStart, replaceAll, replaceFirst, substring, substring, trim, upperCase
appendAll, appendAll
columnMetadata, columnWidth, comment, difference, first, first, id, last, last, metadata, name, setComment, setName, subset, title, toDoubleArray
forEach, spliterator
asIntegerSet
public static final String MISSING_VALUE
public final it.unimi.dsi.fastutil.ints.IntComparator rowComparator
public CategoryColumn(String name)
public CategoryColumn(ColumnMetadata metadata)
public CategoryColumn(String name, int size)
public ColumnType type()
Column
public String getString(int row)
Column
public CategoryColumn emptyCopy()
Column
public CategoryColumn emptyCopy(int rowSize)
Column
public void sortAscending()
sortAscending
in interface Column
public void sortDescending()
sortDescending
in interface Column
public int size()
size
in interface Column
size
in interface CategoryReduceUtils
public String get(int rowIndex)
IndexOutOfBoundsException
- if the given rowIndex is not in the column
public List<String> toList()
public int[] toIntArray()
toIntArray
in interface IntConvertibleColumn
public Table countByCategory()
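The page gives no description for `countByCategory()` beyond its `Table` return type. As a sketch of what counting by category typically involves, the hypothetical helper below tallies occurrences into a map instead of a `Table`; the class name, the map result, and the first-seen ordering are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, standalone illustration; the real method returns a Table.
final class CountByCategorySketch {
    // Tallies how often each distinct string occurs, in first-seen order.
    static Map<String, Integer> countByCategory(List<String> cells) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String cell : cells) {
            counts.merge(cell, 1, Integer::sum); // insert 1 or add 1 to the tally
        }
        return counts;
    }
}
```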
public void set(int rowIndex, String stringValue)
public int countUnique()
Column
countUnique
in interface Column
public List<String> top(int n)
n
- The maximum number of records to return. The actual number will be smaller if n is greater than the
number of observations in the column
public List<String> bottom(int n)
n
- The maximum number of records to return. The actual number will be smaller if n is greater than the
number of observations in the column
public void add(String stringValue)
public void initializeWith(it.unimi.dsi.fastutil.ints.IntArrayList list, DictionaryMap map)
public boolean contains(String aString)
public it.unimi.dsi.fastutil.ints.IntArrayList getValues(it.unimi.dsi.fastutil.ints.IntArrayList indexes)
public void appendCell(String object)
appendCell
in interface Column
appendCell
in class AbstractColumn
public it.unimi.dsi.fastutil.ints.IntComparator rowComparator()
rowComparator
in interface Column
public boolean isEmpty()
Column
public List<BooleanColumn> getDummies()
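The method summary describes `getDummies()` as producing one indicator column per category. The sketch below shows that encoding on plain lists. It is an illustration only: `DummiesSketch` is hypothetical, and it returns a map of boolean lists keyed by category where the real method returns a `List<BooleanColumn>`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, standalone illustration of dummy-variable (one-hot) encoding.
final class DummiesSketch {
    // Returns one boolean list per distinct category, in first-seen order.
    // Entry i of a list is true when row i holds that category.
    static Map<String, List<Boolean>> getDummies(List<String> cells) {
        Map<String, List<Boolean>> dummies = new LinkedHashMap<>();
        for (String cell : cells) {
            dummies.putIfAbsent(cell, new ArrayList<>());
        }
        for (String cell : cells) {
            for (Map.Entry<String, List<Boolean>> entry : dummies.entrySet()) {
                entry.getValue().add(entry.getKey().equals(cell));
            }
        }
        return dummies;
    }
}
```

For the input `["a", "b", "a"]` this yields `a -> [true, false, true]` and `b -> [false, true, false]`, so exactly one indicator is true in every row.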
public int getInt(int rowNumber)
public CategoryColumn unique()
public it.unimi.dsi.fastutil.ints.IntArrayList data()
public IntColumn toIntColumn()
public DictionaryMap dictionaryMap()
dictionaryMap
in interface CategoryColumnUtils
public int[] indexes()
public CategoryColumn appendString(CategoryColumn append)
append
- the column to append
public CategoryColumn appendString(String append)
append
- the string to append
public CategoryColumn replaceAll(String[] regexArray, String replacement)
regexArray
- the regex array to replace
replacement
- the replacement string
public CategoryColumn tokenizeAndSort(String separator)
public CategoryColumn tokenizeAndSort()
public CategoryColumn tokenizeAndRemoveDuplicates()
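The summary says `tokenizeAndSort()` splits on whitespace and returns the lexicographically sorted result. The sketch below applies that idea to a single cell value. Several points are assumptions rather than documented behavior: that tokens are rejoined with a single space, that `tokenizeAndRemoveDuplicates()` keeps the first occurrence of each token in its original order, and the helper class `TokenizeSketch` itself, which is hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, standalone illustration operating on one cell value.
final class TokenizeSketch {
    // Splits on runs of whitespace, sorts the tokens, and rejoins with a space.
    static String tokenizeAndSort(String cell) {
        String[] tokens = cell.trim().split("\\s+");
        Arrays.sort(tokens); // natural (lexicographic) String ordering
        return String.join(" ", tokens);
    }

    // Splits on runs of whitespace and keeps the first occurrence of each token.
    static String tokenizeAndRemoveDuplicates(String cell) {
        Set<String> unique = new LinkedHashSet<>(Arrays.asList(cell.trim().split("\\s+")));
        return String.join(" ", unique);
    }
}
```

Under these assumptions, `tokenizeAndSort("pear apple fig")` gives `"apple fig pear"` and `tokenizeAndRemoveDuplicates("a b a c b")` gives `"a b c"`.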
public Selection isNotMissing()
isNotMissing
in interface Column
public Selection select(StringPredicate predicate)
public Selection select(StringBiPredicate predicate, String value)
public CategoryColumn copy()
Column
public int countMissing()
countMissing
in interface Column
public CategoryColumn selectIf(StringPredicate predicate)
public it.unimi.dsi.fastutil.ints.IntArrayList values()
values
in interface CategoryColumnUtils
public int byteSize()
Column
public byte[] asBytes(int rowNumber)
public Selection isIn(Collection<String> strings)
public Selection isNotIn(Collection<String> strings)
Copyright © 2017. All rights reserved.