Class Table
- java.lang.Object
-
- tech.tablesaw.table.Relation
-
- tech.tablesaw.api.Table
-
public class Table extends Relation implements Iterable<Row>
A table of data, consisting of some number of columns, each of which has the same number of rows. All the data in a column has the same type: integer, float, category, etc., but a table may contain an arbitrary number of columns of any type.
Tables are the main data-type and primary focus of Tablesaw.
-
-
Field Summary
Modifier and Type | Field
static ReaderRegistry | defaultReaderRegistry
static WriterRegistry | defaultWriterRegistry
static String | MELT_VALUE_COLUMN_NAME
static String | MELT_VARIABLE_COLUMN_NAME
-
Constructor Summary
Modifier | Constructor | Description
protected | Table(String name, Collection<Column<?>> columns) | Returns a new Table initialized with the given name and columns
protected | Table(String name, Column<?>... columns) | Returns a new Table initialized with the given name and columns
-
Method Summary
Modifier and Type | Method | Description
Table | addColumns(Column<?>... cols) | Adds the given column to this table.
void | addRow(int rowIndex, Table sourceTable) | Adds a single row to this table from sourceTable, copying every column in sourceTable
Table | append(Row row) | Appends the given row to this table and returns the table.
Table | append(Relation tableToAppend) | Returns this table after adding the data from the argument
Row | appendRow() | Appends an empty row and returns a Row object indexed to the newly added row so values can be set.
Table | cast() | Cast implements the 'tidy' cast operation as described in these papers by Hadley Wickham:
List<CategoricalColumn<?>> | categoricalColumns(String... columnNames) | Returns only the columns whose names are given in the input array
void | clear() | Clears all the data from this table
Column<?> | column(int columnIndex) | Returns the column at the given index in the column list
Column<?>[] | columnArray() | Returns the columns in this table as an array
int | columnCount() | Returns the number of columns in the table
int | columnIndex(String columnName) | Returns the index of the column with the given name
int | columnIndex(Column<?> column) | Returns the index of the given column (its position in the list of columns)
List<String> | columnNames() | Returns a List of the names of all the columns in this table
List<Column<?>> | columns() | Returns the list of columns
static boolean | compareRows(int rowNumber, Table table1, Table table2) | Returns true if the row rowNumber in table1 holds the same values as the row at rowNumber in table2.
Table | concat(Table tableToConcatenate) | Adds all the columns of tableToConcatenate to this table. Note: The columns in the result must have unique names, when compared case-insensitively. Note: Both tables must have the same number of rows
Table | copy() | Returns a table with the same columns and data as this table
void | copyRowsToTable(int[] rows, Table newTable) | Copies the rows indicated by the row index values in the given array from oldTable to newTable
void | copyRowsToTable(Selection rows, Table newTable) | Copies the rows specified by Selection into newTable
Table | countBy(String... categoricalColumnNames) | Returns a table containing a column for each grouping column, and a column named "Count" that contains the counts for each combination of grouping column values
Table | countBy(CategoricalColumn<?>... groupingColumns) | Returns a table containing two columns, the grouping column, and a column named "Count" that contains the counts for each grouping column value
static Table | create() | Returns a new, empty table (without rows or columns)
static Table | create(String tableName) | Returns a new, empty table (without rows or columns) with the given name
static Table | create(String name, Collection<Column<?>> columns) | Returns a new table with the given columns and given name
static Table | create(String name, Stream<Column<?>> columns) | Returns a new table with the given columns and given name
static Table | create(String name, Column<?>... columns) | Returns a new table with the given columns and given name
static Table | create(Collection<Column<?>> columns) | Returns a new table with the given columns
static Table | create(Stream<Column<?>> columns) | Returns a new table with the given columns
static Table | create(Column<?>... columns) | Returns a new table with the given columns
Table | dropDuplicateRows() | Returns the unique records in this table, such that any record that appears more than once in this table appears only once in the returned table.
Table | dropRange(int rowCount) | Returns a new table EXCLUDING the first rowCount rows if rowCount is positive.
Table | dropRange(int rowStart, int rowEnd) | Returns a table EXCLUDING the rows contained in the range from rowStart inclusive to rowEnd exclusive
Table | dropRows(int... rowNumbers) | Returns a table EXCLUDING the rows contained in the given array of row indices
Table | dropRowsWithMissingValues() | Returns only those records in this table that have no columns with missing values
Table | dropWhere(Function<Table,Selection> selection) | Returns a new Table made by EXCLUDING any rows returned when the given function is applied to this table
Table | dropWhere(Selection selection) | Returns a table EXCLUDING the rows contained in the given Selection
Table | emptyCopy() | Returns a table with the same columns as this table, but no data
Table | emptyCopy(int rowSize) | Returns a table with the same columns as this table, but no data, initialized to the given row size
Table | first(int nRows) | Returns a new table containing the first nRows of data in this table
Table | inRange(int rowCount) | Returns a new table containing the first rowCount rows if rowCount is positive.
Table | inRange(int rowStart, int rowEnd) | Returns a new table containing the rows contained in the range from rowStart inclusive to rowEnd exclusive
Table | insertColumn(int index, Column<?> column) | Adds the given column to this table at the given position in the column list.
void | internalAddWithoutValidation(Column<?> c) | For internal Tablesaw use only
protected boolean | isDuplicate(Row row, it.unimi.dsi.fastutil.ints.Int2ObjectMap<it.unimi.dsi.fastutil.ints.IntArrayList> uniqueHashes) | Returns true if all the values in row are identical to those in another row previously seen and recorded in the list.
Iterator<Row> | iterator() |
DataFrameJoiner | joinOn(String... columnNames) | Returns a new DataFrameJoiner initialized with multiple columnNames
Table | last(int nRows) | Returns a new table containing the last nRows of data in this table
Table | melt(List<String> idVariables, List<NumericColumn<?>> measuredVariables, boolean dropMissing) | Melt implements the 'tidy' melt operation as described in these papers by Hadley Wickham.
Table | missingValueCounts() | Returns a table containing the number of missing values in each column in this table
String | name() | Returns the name of the table
Table | pivot(String column1Name, String column2Name, String column3Name, AggregateFunction<?,?> aggregateFunction) | Returns a pivot on this table, where: the first column contains unique values from the index column1; there are n additional columns, one for each unique value in column2; the values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
Table | pivot(CategoricalColumn<?> column1, CategoricalColumn<?> column2, NumericColumn<?> column3, AggregateFunction<?,?> aggregateFunction) | Returns a pivot on this table, where: the first column contains unique values from the index column1; there are n additional columns, one for each unique value in column2; the values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
static DataFrameReader | read() | Returns an object that can be used to read data from a file into a new Table
Table | rejectColumns(int... columnIndexes) | Returns a new table containing copies of all the columns from this table, except those at the given indexes
Table | rejectColumns(String... columnNames) | Returns a new table containing copies of all the columns from this table, except those named in the argument
Table | rejectColumns(Column<?>... columns) | Returns a new table containing copies of all the columns from this table, except those named in the argument
Table | removeColumns(int... columnIndexes) | Removes the columns at the given indices from this table and returns this table
Table | removeColumns(String... columns) | Removes the columns with the given names from this table and returns this table
Table | removeColumns(Column<?>... columns) | Removes the given columns from this table and returns this table
Table | removeColumnsWithMissingValues() | Removes all columns with missing values from this table, and returns this table.
Table | reorderColumns(String... columnNames) | Returns a new table (shallow copy) that contains all the columns in this table, in the order given in the argument.
Table | replaceColumn(int colIndex, Column<?> newColumn) | Replaces an existing column (by index) in this table with the given new column
Table | replaceColumn(String columnName, Column<?> newColumn) | Replaces an existing column (by name) in this table with the given new column
Table | replaceColumn(Column<?> newColumn) | Replaces the existing column that has the same name as the given column with the given column
Table | retainColumns(int... columnIndexes) | Removes all columns except for those given in the argument from this table and returns this table
Table | retainColumns(String... columnNames) | Removes all columns except for those given in the argument from this table and returns this table
Table | retainColumns(Column<?>... columns) | Removes all columns except for those given in the argument from this table and returns this table
Iterator<Row[]> | rollingIterator(int n) | Iterates over rolling sets of rows.
Stream<Row[]> | rollingStream(int n) | Streams over rolling sets of rows.
Row | row(int rowIndex) | Returns a new Row object with its position set to the given zero-based row index.
int | rowCount() | Returns the number of rows in the table
Table | rows(int... rowNumbers) | Returns a table containing the rows contained in the given array of row indices
Table | sampleN(int nRows) | Returns a table consisting of randomly selected records from this table
Table[] | sampleSplit(double table1Proportion) |
Table | sampleX(double proportion) | Returns a table consisting of randomly selected records from this table.
Table | selectColumns(int... columnIndexes) | Returns a new table containing copies of the columns at the given indexes
Table | selectColumns(String... columnNames) | Returns a new table containing copies of the selected columns from this table
Table | selectColumns(Column<?>... columns) | Returns a new table containing copies of the selected columns from this table
Table | setName(String name) | Sets the name of the table
Table | sortAscendingOn(String... columnNames) | Returns a copy of this table sorted in the order of the given column names, in ascending order
Table | sortDescendingOn(String... columnNames) | Returns a copy of this table sorted on the given column names, applied in order, descending. TODO: Provide equivalent methods naming columns by index
Table | sortOn(int... columnIndexes) | Sorts this table into a new table on the columns indexed
Table | sortOn(String... columnNames) | Returns a copy of this table sorted on the given column names, applied in order
Table | sortOn(Comparator<Row> rowComparator) | Returns a copy of this table sorted using the given comparator
Table | sortOn(Sort key) | Returns a copy of this table sorted using the given sort key.
TableSliceGroup | splitOn(String... columns) | Returns a non-overlapping and exhaustive collection of "slices" over this table.
TableSliceGroup | splitOn(CategoricalColumn<?>... columns) | Returns a non-overlapping and exhaustive collection of "slices" over this table.
Iterator<Row[]> | steppingIterator(int n) | Iterates over stepped sets of rows.
Stream<Row[]> | steppingStream(int n) | Streams over stepped sets of rows.
Table[] | stratifiedSampleSplit(CategoricalColumn<?> column, double table1Proportion) | Splits the table into two stratified samples. The specified column is used to divide the table into groups, and records are randomly assigned to each sample according to the proportion given in table1Proportion.
Stream<Row> | stream() | Returns the rows in this table as a Stream
Summarizer | summarize(String col1Name, String col2Name, String col3Name, String col4Name, AggregateFunction<?,?>... functions) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(String col1Name, String col2Name, String col3Name, AggregateFunction<?,?>... functions) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(String numericColumn1Name, String numericColumn2Name, AggregateFunction<?,?>... functions) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(String columName, AggregateFunction<?,?>... functions) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(List<String> columnNames, AggregateFunction<?,?>... functions) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(Column<?> numberColumn, AggregateFunction<?,?>... function) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(Column<?> column1, Column<?> column2, AggregateFunction<?,?>... function) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(Column<?> column1, Column<?> column2, Column<?> column3, AggregateFunction<?,?>... function) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Summarizer | summarize(Column<?> column1, Column<?> column2, Column<?> column3, Column<?> column4, AggregateFunction<?,?>... function) | Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
Table | transpose() | Transposes data in the table, switching rows for columns.
Table | transpose(boolean includeColumnHeadingsAsFirstColumn, boolean useFirstColumnForHeadings) | Transposes data in the table, switching rows for columns.
Table | where(Function<Table,Selection> selection) | Returns a new Table made by applying the given function to this table
Table | where(Selection selection) | Returns a table containing the rows contained in the given Selection
DataFrameWriter | write() | Returns an object that can be used to write data from a Table into a file.
Table | xTabColumnPercents(String column1Name, String column2Name) | Returns a table with n by m + 1 cells.
Table | xTabCounts(String column1Name) | Returns a table with two columns: the first contains each unique value in the argument, and the second contains the number of observations of each value
Table | xTabCounts(String column1Name, String column2Name) | Returns a table with n by m + 1 cells.
Table | xTabPercents(String column1Name) | TODO: Rename the method to xTabProportions, deprecating this version. Returns a table with two columns: the first contains each unique value in the argument, and the second contains the proportion of observations having that value
Table | xTabRowPercents(String column1Name, String column2Name) | Returns a table with n by m + 1 cells.
Table | xTabTablePercents(String column1Name, String column2Name) | Returns a table with n by m + 1 cells.
-
Methods inherited from class tech.tablesaw.table.Relation
as, booleanColumn, booleanColumn, booleanColumns, categoricalColumn, categoricalColumn, column, columns, columns, columnsOfType, colWidths, containsColumn, containsColumn, dateColumn, dateColumn, dateColumns, dateTimeColumn, dateTimeColumn, dateTimeColumns, doubleColumn, doubleColumn, floatColumn, floatColumn, get, getString, getString, getUnformatted, instantColumn, instantColumn, instantColumns, intColumn, intColumn, isEmpty, longColumn, longColumn, nCol, nCol, numberColumn, numberColumn, numberColumns, numericColumns, numericColumns, numericColumns, print, print, printAll, shape, shortColumn, shortColumn, smile, stringColumn, stringColumn, stringColumns, structure, summary, timeColumn, timeColumn, timeColumns, toString, typeArray, types
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
-
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
-
-
-
Field Detail
-
defaultReaderRegistry
public static final ReaderRegistry defaultReaderRegistry
-
defaultWriterRegistry
public static final WriterRegistry defaultWriterRegistry
-
MELT_VARIABLE_COLUMN_NAME
public static final String MELT_VARIABLE_COLUMN_NAME
- See Also:
- Constant Field Values
-
MELT_VALUE_COLUMN_NAME
public static final String MELT_VALUE_COLUMN_NAME
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
Table
protected Table(String name, Column<?>... columns)
Returns a new Table initialized with the given name and columns
- Parameters:
- name - The name of the table
- columns - One or more columns, all of which must have either the same length or size 0
-
Table
protected Table(String name, Collection<Column<?>> columns)
Returns a new Table initialized with the given name and columns
- Parameters:
- name - The name of the table
- columns - One or more columns, all of which must have either the same length or size 0
-
-
Method Detail
-
create
public static Table create()
Returns a new, empty table (without rows or columns)
-
create
public static Table create(String tableName)
Returns a new, empty table (without rows or columns) with the given name
-
create
public static Table create(Column<?>... columns)
Returns a new table with the given columns
- Parameters:
- columns - one or more columns, all of the same column.size()
-
create
public static Table create(Collection<Column<?>> columns)
Returns a new table with the given columns
- Parameters:
- columns - one or more columns, all of the same column.size()
-
create
public static Table create(Stream<Column<?>> columns)
Returns a new table with the given columns
- Parameters:
- columns - one or more columns, all of the same column.size()
-
create
public static Table create(String name, Column<?>... columns)
Returns a new table with the given columns and given name
- Parameters:
- name - the name for this table
- columns - one or more columns, all of the same column.size()
-
create
public static Table create(String name, Collection<Column<?>> columns)
Returns a new table with the given columns and given name
- Parameters:
- name - the name for this table
- columns - one or more columns, all of the same column.size()
-
create
public static Table create(String name, Stream<Column<?>> columns)
Returns a new table with the given columns and given name
- Parameters:
- name - the name for this table
- columns - one or more columns, all of the same column.size()
-
read
public static DataFrameReader read()
Returns an object that can be used to read data from a file into a new Table
-
write
public DataFrameWriter write()
Returns an object that can be used to write data from a Table into a file. If the file exists, it is overwritten.
-
addColumns
public Table addColumns(Column<?>... cols)
Adds the given columns to this table. Each column must either be empty or have size() == the rowCount() of the table it is being added to. Column names in the table must remain unique.
- Specified by:
- addColumns in class Relation
- Returns:
- This Relation
-
internalAddWithoutValidation
public void internalAddWithoutValidation(Column<?> c)
For internal Tablesaw use only.
Adds the given column to this table without performing duplicate-name or column size checks
-
insertColumn
public Table insertColumn(int index, Column<?> column)
Adds the given column to this table at the given position in the column list. The column must either be empty or have size() == the rowCount() of the table it is being added to. Column names in the table must remain unique.
- Parameters:
- index - Zero-based index into the column list
- column - Column to be added
-
reorderColumns
public Table reorderColumns(String... columnNames)
Returns a new table (shallow copy) that contains all the columns in this table, in the order given in the argument. Throws an IllegalArgumentException if the number of names given does not match the number of columns in this table. NOTE: This does not make a copy of the columns, so they are shared between the two tables.
- Parameters:
- columnNames - a column name or array of names
-
replaceColumn
public Table replaceColumn(int colIndex, Column<?> newColumn)
Replaces an existing column (by index) in this table with the given new column
- Parameters:
- colIndex - Zero-based index of the column to be replaced
- newColumn - Column to be added
-
replaceColumn
public Table replaceColumn(String columnName, Column<?> newColumn)
Replaces an existing column (by name) in this table with the given new column
- Parameters:
- columnName - String name of the column to be replaced
- newColumn - Column to be added
-
replaceColumn
public Table replaceColumn(Column<?> newColumn)
Replaces the existing column that has the same name as the given column with the given column
- Parameters:
- newColumn - Column to be added
-
column
public Column<?> column(int columnIndex)
Returns the column at the given index in the column list
-
columnCount
public int columnCount()
Returns the number of columns in the table
- Specified by:
- columnCount in class Relation
-
rowCount
public int rowCount()
Returns the number of rows in the table
-
columnArray
public Column<?>[] columnArray()
Returns the columns in this table as an array
-
categoricalColumns
public List<CategoricalColumn<?>> categoricalColumns(String... columnNames)
Returns only the columns whose names are given in the input array
- Overrides:
- categoricalColumns in class Relation
-
columnIndex
public int columnIndex(String columnName)
Returns the index of the column with the given name
- Overrides:
- columnIndex in class Relation
- Throws:
- IllegalArgumentException - if the input string is not the name of any column in the table
-
columnIndex
public int columnIndex(Column<?> column)
Returns the index of the given column (its position in the list of columns)
- Specified by:
- columnIndex in class Relation
- Throws:
- IllegalArgumentException - if the column is not present in this table
-
columnNames
public List<String> columnNames()
Returns a List of the names of all the columns in this table
- Specified by:
- columnNames in class Relation
-
copy
public Table copy()
Returns a table with the same columns and data as this table
-
emptyCopy
public Table emptyCopy()
Returns a table with the same columns as this table, but no data
-
emptyCopy
public Table emptyCopy(int rowSize)
Returns a table with the same columns as this table, but no data, initialized to the given row size
-
copyRowsToTable
public void copyRowsToTable(Selection rows, Table newTable)
Copies the rows specified by Selection into newTable
- Parameters:
- rows - A Selection defining the rows to copy
- newTable - The table to copy the rows into
-
copyRowsToTable
public void copyRowsToTable(int[] rows, Table newTable)
Copies the rows indicated by the row index values in the given array from this table to newTable
-
compareRows
public static boolean compareRows(int rowNumber, Table table1, Table table2)
Returns true if the row rowNumber in table1 holds the same values as the row at rowNumber in table2. Returns false if the number of columns is different in the two tables.
-
sampleSplit
public Table[] sampleSplit(double table1Proportion)
-
stratifiedSampleSplit
public Table[] stratifiedSampleSplit(CategoricalColumn<?> column, double table1Proportion)
Splits the table into two stratified samples. The specified column is used to divide the table into groups, and records are randomly assigned to each sample according to the proportion given in table1Proportion.
- Parameters:
- column - the column to be used for the stratified sampling
- table1Proportion - The proportion to go in the first table
- Returns:
- An array of two tables, with the first table having the proportion specified in the method parameter, and the second table having the balance of the rows
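The grouping-then-splitting idea behind a stratified split can be sketched in plain Java, without Tablesaw. The class StratifiedSplitSketch, the use of row indexes in place of tables, and the rounding rule (Math.round per group) are all assumptions made for illustration; Tablesaw's own implementation may round or shuffle differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative model only; not Tablesaw code. */
final class StratifiedSplitSketch {

    /**
     * Splits row indexes into two lists so that roughly table1Proportion of
     * the rows in every group (stratum) land in the first list.
     */
    static List<List<Integer>> split(String[] strata, double table1Proportion, Random random) {
        // Group row indexes by the value in the stratification column.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < strata.length; i++) {
            groups.computeIfAbsent(strata[i], k -> new ArrayList<>()).add(i);
        }
        List<Integer> first = new ArrayList<>();
        List<Integer> second = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            // Randomize within the group, then cut it at the requested proportion.
            Collections.shuffle(group, random);
            // Assumption: per-group sizes are rounded to the nearest integer.
            int cut = (int) Math.round(group.size() * table1Proportion);
            first.addAll(group.subList(0, cut));
            second.addAll(group.subList(cut, group.size()));
        }
        return List.of(first, second);
    }

    public static void main(String[] args) {
        String[] strata = {"x", "x", "x", "x", "y", "y"};
        List<List<Integer>> parts = split(strata, 0.5, new Random(42));
        System.out.println("first sample size:  " + parts.get(0).size());
        System.out.println("second sample size: " + parts.get(1).size());
    }
}
```

Because each group is cut separately, both samples keep approximately the same mix of category values as the original data, which is the point of stratifying.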
-
sampleX
public Table sampleX(double proportion)
Returns a table consisting of randomly selected records from this table. The sample size is based on the given proportion
- Parameters:
- proportion - The proportion to go in the sample
-
sampleN
public Table sampleN(int nRows)
Returns a table consisting of randomly selected records from this table
- Parameters:
- nRows - The number of rows to go in the sample
-
first
public Table first(int nRows)
Returns a new table containing the first nRows of data in this table
-
last
public Table last(int nRows)
Returns a new table containing the last nRows of data in this table
-
sortOn
public Table sortOn(int... columnIndexes)
Sorts this table into a new table on the columns indexed. If an index is negative, that column is sorted in descending order; otherwise it is sorted ascending.
-
sortOn
public Table sortOn(String... columnNames)
Returns a copy of this table sorted on the given column names, applied in order. If a column name starts with - then that column is sorted descending; otherwise it is sorted ascending.
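The "-" prefix convention can be illustrated with a small plain-Java model. This is a sketch under stated assumptions, not Tablesaw code: SortKeySketch is a hypothetical class, and each row is modeled as a Map from column name to an Integer value.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Illustrative model only; rows are maps from column name to value. */
final class SortKeySketch {

    /** Builds a row comparator from names such as "dept" or "-salary". */
    static Comparator<Map<String, Integer>> comparatorFor(String... columnNames) {
        Comparator<Map<String, Integer>> result = (a, b) -> 0;
        for (String name : columnNames) {
            // A leading "-" requests descending order for that column.
            boolean descending = name.startsWith("-");
            String column = descending ? name.substring(1) : name;
            Comparator<Map<String, Integer>> byColumn =
                    Comparator.comparing((Map<String, Integer> row) -> row.get(column));
            // Later names only break ties left by earlier ones.
            result = result.thenComparing(descending ? byColumn.reversed() : byColumn);
        }
        return result;
    }

    /** Returns a sorted copy; the input list is left unchanged. */
    static List<Map<String, Integer>> sortOn(List<Map<String, Integer>> rows, String... columnNames) {
        List<Map<String, Integer>> copy = new ArrayList<>(rows);
        copy.sort(comparatorFor(columnNames));
        return copy;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = List.of(
                Map.of("dept", 1, "salary", 50),
                Map.of("dept", 2, "salary", 70),
                Map.of("dept", 1, "salary", 90));
        // Ascending by dept, then descending by salary within each dept.
        for (Map<String, Integer> row : sortOn(rows, "dept", "-salary")) {
            System.out.println(row.get("dept") + " " + row.get("salary"));
        }
    }
}
```

With the real API, the corresponding call would be along the lines of table.sortOn("dept", "-salary"), where the column names are again hypothetical.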
-
sortAscendingOn
public Table sortAscendingOn(String... columnNames)
Returns a copy of this table sorted in the order of the given column names, in ascending order
-
sortDescendingOn
public Table sortDescendingOn(String... columnNames)
Returns a copy of this table sorted on the given column names, applied in order, descending.
TODO: Provide equivalent methods naming columns by index
-
sortOn
public Table sortOn(Sort key)
Returns a copy of this table sorted using the given sort key.
- Parameters:
- key - to sort on.
- Returns:
- a sorted copy of this table.
-
sortOn
public Table sortOn(Comparator<Row> rowComparator)
Returns a copy of this table sorted using the given comparator
-
addRow
public void addRow(int rowIndex, Table sourceTable)
Adds a single row to this table from sourceTable, copying every column in sourceTable
- Parameters:
- rowIndex - The row in sourceTable to add to this table
- sourceTable - A table with the same column structure as this table
-
row
public Row row(int rowIndex)
Returns a new Row object with its position set to the given zero-based row index.
-
rows
public Table rows(int... rowNumbers)
Returns a table containing the rows contained in the given array of row indices
-
dropRows
public Table dropRows(int... rowNumbers)
Returns a table EXCLUDING the rows contained in the given array of row indices
-
inRange
public Table inRange(int rowCount)
Returns a new table containing the first rowCount rows if rowCount is positive. Returns the last rowCount rows if rowCount is negative.
-
inRange
public Table inRange(int rowStart, int rowEnd)
Returns a new table containing the rows contained in the range from rowStart inclusive to rowEnd exclusive
-
dropRange
public Table dropRange(int rowCount)
Returns a new table EXCLUDING the first rowCount rows if rowCount is positive. Drops the last rowCount rows if rowCount is negative.
-
dropRange
public Table dropRange(int rowStart, int rowEnd)
Returns a table EXCLUDING the rows contained in the range from rowStart inclusive to rowEnd exclusive
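The sign and range conventions described for inRange and dropRange can be modeled on row indexes alone. The following is a plain-Java sketch, not Tablesaw's implementation; RowRangeSketch and its methods are hypothetical, and the sketch assumes the requested counts and bounds fit within the table.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model of which row indexes each range method keeps. */
final class RowRangeSketch {

    /** Indexes kept by inRange(rowCount) on a table with totalRows rows. */
    static List<Integer> inRange(int totalRows, int rowCount) {
        // Positive: the first rowCount rows. Negative: the last |rowCount| rows.
        return rowCount >= 0
                ? indices(0, rowCount)
                : indices(totalRows + rowCount, totalRows);
    }

    /** Indexes kept by dropRange(rowCount): the complement of inRange(rowCount). */
    static List<Integer> dropRange(int totalRows, int rowCount) {
        return rowCount >= 0
                ? indices(rowCount, totalRows)
                : indices(0, totalRows + rowCount);
    }

    /** Indexes kept by dropRange(rowStart, rowEnd): everything outside the range. */
    static List<Integer> dropRange(int totalRows, int rowStart, int rowEnd) {
        List<Integer> kept = indices(0, rowStart);
        kept.addAll(indices(rowEnd, totalRows));
        return kept;
    }

    /** Indexes from start (inclusive) to end (exclusive), as in inRange(rowStart, rowEnd). */
    static List<Integer> indices(int start, int end) {
        List<Integer> result = new ArrayList<>();
        for (int i = start; i < end; i++) {
            result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(inRange(5, 2));      // [0, 1]
        System.out.println(inRange(5, -2));     // [3, 4]
        System.out.println(dropRange(5, 2));    // [2, 3, 4]
        System.out.println(dropRange(5, -2));   // [0, 1, 2]
        System.out.println(dropRange(5, 1, 3)); // [0, 3, 4]
    }
}
```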
-
where
public Table where(Selection selection)
Returns a table containing the rows contained in the given Selection
-
where
public Table where(Function<Table,Selection> selection)
Returns a new Table made by applying the given function to this table
-
dropWhere
public Table dropWhere(Function<Table,Selection> selection)
Returns a new Table made by EXCLUDING any rows returned when the given function is applied to this table
-
dropWhere
public Table dropWhere(Selection selection)
Returns a table EXCLUDING the rows contained in the given Selection
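The relationship between where and dropWhere can be shown with a plain-Java model in which a selection is a set of row indexes. This is a sketch, not Tablesaw code: SelectionSketch is hypothetical, and java.util.BitSet stands in for Tablesaw's Selection type purely as an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Illustrative model: a selection is a set of row indexes. */
final class SelectionSketch {

    /** Models where(selection): keep the rows whose index is selected. */
    static <T> List<T> where(List<T> rows, BitSet selection) {
        List<T> kept = new ArrayList<>();
        for (int i = selection.nextSetBit(0);
                i >= 0 && i < rows.size();
                i = selection.nextSetBit(i + 1)) {
            kept.add(rows.get(i));
        }
        return kept;
    }

    /** Models dropWhere(selection): keep the rows whose index is NOT selected. */
    static <T> List<T> dropWhere(List<T> rows, BitSet selection) {
        // Flip a copy so the caller's selection is not modified.
        BitSet complement = (BitSet) selection.clone();
        complement.flip(0, rows.size());
        return where(rows, complement);
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a", "b", "c", "d");
        BitSet selection = new BitSet();
        selection.set(1);
        selection.set(3);
        System.out.println(where(rows, selection));     // [b, d]
        System.out.println(dropWhere(rows, selection)); // [a, c]
    }
}
```

The Function<Table,Selection> overloads follow the same pattern: the function is first applied to the table to obtain the selection, which is then kept or excluded.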
-
pivot
public Table pivot(CategoricalColumn<?> column1, CategoricalColumn<?> column2, NumericColumn<?> column3, AggregateFunction<?,?> aggregateFunction)
Returns a pivot on this table, where:
- The first column contains unique values from the index column1
- There are n additional columns, one for each unique value in column2
- The values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
-
pivot
public Table pivot(String column1Name, String column2Name, String column3Name, AggregateFunction<?,?> aggregateFunction)
Returns a pivot on this table, where:
- The first column contains unique values from the index column1
- There are n additional columns, one for each unique value in column2
- The values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
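The shape of a pivot result can be sketched in plain Java using nested maps. This is an illustration, not Tablesaw code: PivotSketch is hypothetical, the three columns are modeled as parallel arrays, and the AggregateFunction is simplified to a BinaryOperator that combines two values at a time (which covers functions like sum, min, and max, but not mean).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

/** Illustrative model of a pivot: outer key = column1 value, inner key = column2 value. */
final class PivotSketch {

    static Map<String, Map<String, Double>> pivot(
            String[] column1, String[] column2, double[] column3, BinaryOperator<Double> combine) {
        Map<String, Map<String, Double>> result = new LinkedHashMap<>();
        for (int i = 0; i < column1.length; i++) {
            // One outer entry per unique column1 value (the rows of the pivot),
            // one inner entry per unique column2 value (the added columns).
            result.computeIfAbsent(column1[i], k -> new LinkedHashMap<>())
                    .merge(column2[i], column3[i], combine);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] region = {"east", "east", "west", "east"};
        String[] product = {"a", "b", "a", "a"};
        double[] amount = {1, 2, 3, 4};
        // Sum of amount for each (region, product) pair.
        System.out.println(pivot(region, product, amount, Double::sum));
        // {east={a=5.0, b=2.0}, west={a=3.0}}
    }
}
```

In this model a combination that never occurs, such as ("west", "b"), is simply absent from the inner map; in a real table that cell would have to hold some value, typically a missing value.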
-
splitOn
public TableSliceGroup splitOn(String... columns)
Returns a non-overlapping and exhaustive collection of "slices" over this table. Each slice is like a virtual table containing a subset of the records in this table.
This method is intended for advanced or unusual operations on the subtables. If you want to calculate summary statistics for each subtable, the summarize methods, e.g.
table.summarize(myColumn, mean, median).by(columns)
are preferred.
-
splitOn
public TableSliceGroup splitOn(CategoricalColumn<?>... columns)
Returns a non-overlapping and exhaustive collection of "slices" over this table. Each slice is like a virtual table containing a subset of the records in this table.
This method is intended for advanced or unusual operations on the subtables. If you want to calculate summary statistics for each subtable, the summarize methods, e.g.
table.summarize(myColumn, mean, median).by(columns)
are preferred.
-
isDuplicate
protected boolean isDuplicate(Row row, it.unimi.dsi.fastutil.ints.Int2ObjectMap<it.unimi.dsi.fastutil.ints.IntArrayList> uniqueHashes)
Returns true if all the values in row are identical to those in another row previously seen and recorded in the list.
- Parameters:
- row - the row to evaluate
- uniqueHashes - a map of row hashes to the id of an exemplar row that produces that hash. If two different rows produce the same hash, then the row number for each is placed in the list, so that there are exemplars for both
- Returns:
- true if the row's values exactly match a row that was previously seen
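The hash-to-exemplar scheme described for uniqueHashes can be sketched in plain Java. This is a model, not Tablesaw's implementation: DuplicateRowSketch is hypothetical, rows are modeled as Object arrays, and java.util collections stand in for the fastutil Int2ObjectMap and IntArrayList types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model of duplicate detection via row hashes and exemplar rows. */
final class DuplicateRowSketch {

    /** Returns the rows with duplicates removed, keeping first occurrences in order. */
    static List<Object[]> dropDuplicates(List<Object[]> rows) {
        // Row hash -> indexes of exemplar rows that produced that hash.
        Map<Integer, List<Integer>> uniqueHashes = new HashMap<>();
        List<Object[]> unique = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            if (!isDuplicate(rows, i, uniqueHashes)) {
                unique.add(rows.get(i));
            }
        }
        return unique;
    }

    /** True if the row at rowIndex equals an exemplar already recorded; otherwise records it. */
    static boolean isDuplicate(
            List<Object[]> rows, int rowIndex, Map<Integer, List<Integer>> uniqueHashes) {
        Object[] row = rows.get(rowIndex);
        List<Integer> exemplars =
                uniqueHashes.computeIfAbsent(Arrays.hashCode(row), h -> new ArrayList<>());
        // Equal hashes do not prove equal rows, so compare the actual values.
        for (int exemplar : exemplars) {
            if (Arrays.equals(rows.get(exemplar), row)) {
                return true;
            }
        }
        // New row (possibly a hash collision): keep it as another exemplar for this hash.
        exemplars.add(rowIndex);
        return false;
    }

    public static void main(String[] args) {
        List<Object[]> rows = List.of(
                new Object[] {"a", 1}, new Object[] {"b", 2}, new Object[] {"a", 1});
        System.out.println(dropDuplicates(rows).size()); // 2
    }
}
```

Storing a list of exemplars per hash is what keeps the check correct when two different rows collide on the same hash value: each is compared by value and both are retained.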
-
dropDuplicateRows
public Table dropDuplicateRows()
Returns the unique records in this table, such that any record that appears more than once in this table, appears only once in the returned table.
-
dropRowsWithMissingValues
public Table dropRowsWithMissingValues()
Returns only those records in this table that have no columns with missing values
-
selectColumns
public Table selectColumns(Column<?>... columns)
Returns a new table containing copies of the selected columns from this table
- Parameters:
- columns - The columns to copy into the new table
- See Also:
- retainColumns(Column[])
-
selectColumns
public Table selectColumns(String... columnNames)
Returns a new table containing copies of the selected columns from this table
- Parameters:
- columnNames - The names of the columns to include
- See Also:
- retainColumns(String[])
-
rejectColumns
public Table rejectColumns(int... columnIndexes)
Returns a new table containing copies of all the columns from this table, except those at the given indexes
- Parameters:
- columnIndexes - The indexes of the columns to exclude
- See Also:
- removeColumns(int[])
-
rejectColumns
public Table rejectColumns(String... columnNames)
Returns a new table containing copies of all the columns from this table, except those named in the argument
- Parameters:
- columnNames - The names of the columns to exclude
- See Also:
- removeColumns(String[])
-
rejectColumns
public Table rejectColumns(Column<?>... columns)
Returns a new table containing copies of all the columns from this table, except those given in the argument
- Parameters:
- columns - The columns to exclude
- See Also:
- removeColumns(Column[])
-
selectColumns
public Table selectColumns(int... columnIndexes)
Returns a new table containing copies of the columns at the given indexes
- Parameters:
- columnIndexes - The indexes of the columns to include
- See Also:
- retainColumns(int[])
-
removeColumns
public Table removeColumns(Column<?>... columns)
Removes the given columns from this table and returns this table.
Specified by:
removeColumns in class Relation
Returns:
This Relation
-
removeColumnsWithMissingValues
public Table removeColumnsWithMissingValues()
Removes all columns with missing values from this table, and returns this table.
-
retainColumns
public Table retainColumns(Column<?>... columns)
Removes all columns except for those given in the argument from this table and returns this table
-
retainColumns
public Table retainColumns(int... columnIndexes)
Removes all columns except for those given in the argument from this table and returns this table
-
retainColumns
public Table retainColumns(String... columnNames)
Removes all columns except for those given in the argument from this table and returns this table
-
append
public Table append(Relation tableToAppend)
Returns this table after adding the data from the argument
-
append
public Table append(Row row)
Appends the given row to this table and returns the table.
Note: The table is modified in place.
TODO: Performance
-
removeColumns
public Table removeColumns(String... columns)
Removes the columns with the given names from this table and returns this table.
Overrides:
removeColumns in class Relation
Returns:
This Relation
-
removeColumns
public Table removeColumns(int... columnIndexes)
Removes the columns at the given indices from this table and returns this table.
Overrides:
removeColumns in class Relation
Returns:
This Relation
-
appendRow
public Row appendRow()
Appends an empty row and returns a Row object indexed to the newly added row so values can be set.
Intended usage:
for (int i = 0; ...) {
  Row row = table.appendRow();
  row.setString("name", "Bob");
  row.setFloat("IQ", 123.4f);
  ...etc.
}
-
concat
public Table concat(Table tableToConcatenate)
Adds all the columns of tableToConcatenate to this table.
Note: The columns in the result must have unique names when compared case-insensitively.
Note: Both tables must have the same number of rows.
Parameters:
tableToConcatenate - The table containing the columns to be added
Returns:
This table
-
summarize
public Summarizer summarize(String columName, AggregateFunction<?,?>... functions)
Returns a Summarizer that can be used to summarize the column with the given name using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(List<String> columnNames, AggregateFunction<?,?>... functions)
Returns a Summarizer that can be used to summarize the columns with the given names using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String numericColumn1Name, String numericColumn2Name, AggregateFunction<?,?>... functions)
Returns a Summarizer that can be used to summarize the two columns with the given names using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String col1Name, String col2Name, String col3Name, AggregateFunction<?,?>... functions)
Returns a Summarizer that can be used to summarize the three columns with the given names using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String col1Name, String col2Name, String col3Name, String col4Name, AggregateFunction<?,?>... functions)
Returns a Summarizer that can be used to summarize the four columns with the given names using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> numberColumn, AggregateFunction<?,?>... function)
Returns a Summarizer that can be used to summarize the given column using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, AggregateFunction<?,?>... function)
Returns a Summarizer that can be used to summarize the two given columns using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, Column<?> column3, AggregateFunction<?,?>... function)
Returns a Summarizer that can be used to summarize the three given columns using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, Column<?> column3, Column<?> column4, AggregateFunction<?,?>... function)
Returns a Summarizer that can be used to summarize the four given columns using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
xTabCounts
public Table xTabCounts(String column1Name, String column2Name)
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the counts for every unique combination of values from the two specified columns in this table.
-
xTabRowPercents
public Table xTabRowPercents(String column1Name, String column2Name)
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the row percents for every unique combination of values from the two specified columns in this table. Row percents total to 100% in every row.
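The cross-tabulation described above can be illustrated with a small plain-Java sketch. It is not Tablesaw's implementation: the class CrossTabSketch, its methods counts and rowPercents, and the use of string lists in place of columns are all hypothetical, and combinations that never occur are simply absent from the nested map rather than reported as zero.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of cross-tab counts and row percents; not part of Tablesaw.
final class CrossTabSketch {

  /** Counts rows for every observed (value1, value2) pair; outer keys act as row labels. */
  static Map<String, Map<String, Integer>> counts(List<String> column1, List<String> column2) {
    if (column1.size() != column2.size()) {
      throw new IllegalArgumentException("Columns must have the same number of rows");
    }
    Map<String, Map<String, Integer>> table = new TreeMap<>();
    for (int i = 0; i < column1.size(); i++) {
      table.computeIfAbsent(column1.get(i), k -> new TreeMap<>())
          .merge(column2.get(i), 1, Integer::sum);
    }
    return table;
  }

  /** Converts one row of counts into percents that total 100 across that row. */
  static Map<String, Double> rowPercents(Map<String, Integer> rowCounts) {
    int total = 0;
    for (int count : rowCounts.values()) {
      total += count;
    }
    Map<String, Double> result = new TreeMap<>();
    for (Map.Entry<String, Integer> entry : rowCounts.entrySet()) {
      result.put(entry.getKey(), 100.0 * entry.getValue() / total);
    }
    return result;
  }

  public static void main(String[] args) {
    Map<String, Map<String, Integer>> table = counts(
        List.of("x", "x", "y", "x"),
        List.of("a", "b", "a", "a"));
    table.forEach((label, row) -> System.out.println(label + " " + row + " " + rowPercents(row)));
  }
}
```

Column percents follow the same pattern with the division done over each column's total instead of each row's.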
-
xTabColumnPercents
public Table xTabColumnPercents(String column1Name, String column2Name)
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the column percents for every unique combination of values from the two specified columns in this table. Column percents total to 100% in every column.
-
xTabTablePercents
public Table xTabTablePercents(String column1Name, String column2Name)
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the proportion for each unique combination of values from the two specified columns in this table.
-
xTabPercents
public Table xTabPercents(String column1Name)
Returns a table with two columns: the first contains each unique value in the column named by the argument, and the second contains the proportion of observations having that value.
TODO: Rename the method to xTabProportions, deprecating this version.
-
xTabCounts
public Table xTabCounts(String column1Name)
Returns a table with two columns: the first contains each unique value in the column named by the argument, and the second contains the number of observations of each value.
-
countBy
public Table countBy(CategoricalColumn<?>... groupingColumns)
Returns a table containing a column for each grouping column, and a column named "Count" that contains the counts for each combination of grouping column values
-
countBy
public Table countBy(String... categoricalColumnNames)
Returns a table containing a column for each grouping column, and a column named "Count" that contains the counts for each combination of grouping column values.
Parameters:
categoricalColumnNames - The name(s) of one or more CategoricalColumns in this table
Returns:
A table containing counts of rows grouped by the categorical columns
Throws:
ClassCastException - if any entry in categoricalColumnNames is the name of a column that does not implement CategoricalColumn
-
joinOn
public DataFrameJoiner joinOn(String... columnNames)
Returns a new DataFrameJoiner initialized with multiple columnNames.
Parameters:
columnNames - Names of the columns to join on
Returns:
The new DataFrameJoiner
-
missingValueCounts
public Table missingValueCounts()
Returns a table containing the number of missing values in each column in this table
-
rollingIterator
public Iterator<Row[]> rollingIterator(int n)
Iterates over rolling sets of rows, i.e. 0 to n-1, 1 to n, 2 to n+1, etc.
Parameters:
n - the number of rows to return for each iteration
-
steppingIterator
public Iterator<Row[]> steppingIterator(int n)
Iterates over stepped sets of rows, i.e. 0 to n-1, n to 2n-1, 2n to 3n-1, etc. Only returns full sets of rows.
Parameters:
n - the number of rows to return for each iteration
-
steppingStream
public Stream<Row[]> steppingStream(int n)
Streams over stepped sets of rows, i.e. 0 to n-1, n to 2n-1, 2n to 3n-1, etc. Only returns full sets of rows.
Parameters:
n - the number of rows to return for each iteration
-
rollingStream
public Stream<Row[]> rollingStream(int n)
Streams over rolling sets of rows, i.e. 0 to n-1, 1 to n, 2 to n+1, etc.
Parameters:
n - the number of rows to return for each iteration
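The difference between rolling and stepping sets of rows can be made concrete by listing the row indexes each window would cover. The sketch below is plain Java and only models the index ranges given in the descriptions above; the class RowWindowSketch and its methods are hypothetical, and it assumes that rolling windows, like stepping ones, are only produced when a full set of n rows is available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the window index ranges; not part of Tablesaw.
final class RowWindowSketch {

  /** Rolling windows: rows 0..n-1, 1..n, 2..n+1, and so on. Consecutive windows overlap. */
  static List<int[]> rollingWindows(int rowCount, int n) {
    return windows(rowCount, n, 1);
  }

  /** Stepping windows: rows 0..n-1, n..2n-1, and so on. A trailing partial window is dropped. */
  static List<int[]> steppingWindows(int rowCount, int n) {
    return windows(rowCount, n, n);
  }

  private static List<int[]> windows(int rowCount, int n, int step) {
    if (n <= 0) {
      throw new IllegalArgumentException("n must be positive");
    }
    List<int[]> result = new ArrayList<>();
    // Only emit a window when all n rows exist (assumed for both variants).
    for (int start = 0; start + n <= rowCount; start += step) {
      int[] window = new int[n];
      for (int i = 0; i < n; i++) {
        window[i] = start + i;
      }
      result.add(window);
    }
    return result;
  }

  public static void main(String[] args) {
    for (int[] w : rollingWindows(5, 3)) {
      System.out.println("rolling  " + Arrays.toString(w));
    }
    for (int[] w : steppingWindows(5, 2)) {
      System.out.println("stepping " + Arrays.toString(w));
    }
  }
}
```

With five rows, a rolling window of three yields three overlapping sets, while a stepping window of two yields two disjoint sets and leaves the last row unused.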
-
transpose
public Table transpose()
Transposes data in the table, switching rows for columns. For example, a table like this:
 value1 | value2 |
-------------------
      1 |      2 |
    1.1 |    2.1 |
    1.2 |    2.2 |
is transposed into the following:
   0 |   1 |   2 |
-------------------
   1 | 1.1 | 1.2 |
   2 | 2.1 | 2.2 |
Returns:
The transposed table
See Also:
transpose(boolean,boolean)
-
transpose
public Table transpose(boolean includeColumnHeadingsAsFirstColumn, boolean useFirstColumnForHeadings)
Transposes data in the table, switching rows for columns. For example, a table like this:
 label | value1 | value2 |
---------------------------
  row1 |      1 |      2 |
  row2 |    1.1 |    2.1 |
  row3 |    1.2 |    2.2 |
is transposed into the following:
  label | row1 | row2 | row3 |
-------------------------------
 value1 |    1 |  1.1 |  1.2 |
 value2 |    2 |  2.1 |  2.2 |
Parameters:
includeColumnHeadingsAsFirstColumn - Toggle whether to include the column headings as the first column in the result
useFirstColumnForHeadings - Use the first column as the column headings in the result. Useful if the data set already has a first column which contains a set of labels
Returns:
The transposed table
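The labelled example above behaves like an ordinary matrix transpose applied to the whole grid, header row included. The plain-Java sketch below shows that relationship on a grid of strings; it is an illustration only, not Tablesaw's implementation, and the class TransposeSketch and the string-grid representation are assumptions.

```java
// Hypothetical model of the labelled transpose example; not part of Tablesaw.
final class TransposeSketch {

  /** Returns a grid whose cell [c][r] holds the input cell [r][c]. The input must be rectangular. */
  static String[][] transpose(String[][] grid) {
    int rows = grid.length;
    int cols = rows == 0 ? 0 : grid[0].length;
    String[][] result = new String[cols][rows];
    for (int r = 0; r < rows; r++) {
      if (grid[r].length != cols) {
        throw new IllegalArgumentException("Row " + r + " has a different number of cells");
      }
      for (int c = 0; c < cols; c++) {
        result[c][r] = grid[r][c];
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Header row first, then the data rows from the example.
    String[][] grid = {
      {"label", "value1", "value2"},
      {"row1", "1", "2"},
      {"row2", "1.1", "2.1"},
      {"row3", "1.2", "2.2"}
    };
    for (String[] row : transpose(grid)) {
      System.out.println(String.join(" | ", row));
    }
  }
}
```

In this model the first column of the source supplies the new headings and the old headings become the first column, which corresponds to passing true for both flags in the example.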
-
melt
public Table melt(List<String> idVariables, List<NumericColumn<?>> measuredVariables, boolean dropMissing)
Melt implements the 'tidy' melt operation as described in these papers by Hadley Wickham.
Tidy concepts: see https://www.jstatsoft.org/article/view/v059i10
Cast function details: see https://www.jstatsoft.org/article/view/v021i12
In short, melt turns columns into rows, but in a particular way. Used with the cast method, it can help make data tidy. In a tidy dataset, every variable is a column and every observation a row.
This method returns a table that contains all the data in this table, but organized such that there is a set of identifier variables (columns) and a single measured variable (column). For example, given a table with columns:
patient_id, gender, age, weight, temperature,
it returns a table with the columns:
patient_id, variable, value
In the new format, the strings age, weight, and temperature have become cells in the measurement table, such that a single row in the source table might look like this in the result table:
1234, gender, male
1234, age, 42
1234, weight, 186
1234, temperature, 97.4
This kind of structure often makes for a good intermediate format for performing subsequent transformations. It is especially useful when combined with the cast() operation.
Parameters:
idVariables - A list of column names intended to be used as identifiers. In the example, only patient_id would be an identifier
measuredVariables - A list of columns intended to be used as measured variables. All columns must have the same type
dropMissing - drop any row where the value is missing
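The reshaping that melt performs can be sketched in plain Java on a list of maps, one map per source row. This is not Tablesaw's implementation: the class MeltSketch, the map-based row representation, and the use of null to model a missing value are assumptions chosen to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the melt reshaping; not part of Tablesaw.
final class MeltSketch {

  /**
   * Produces one output row per (source row, measured variable) pair. Each output row holds the
   * identifier values, then the variable name, then the value. A null value models "missing".
   */
  static List<List<Object>> melt(
      List<Map<String, Object>> rows,
      List<String> idVariables,
      List<String> measuredVariables,
      boolean dropMissing) {
    List<List<Object>> molten = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      for (String variable : measuredVariables) {
        Object value = row.get(variable);
        if (dropMissing && value == null) {
          continue; // skip missing measurements when requested
        }
        List<Object> out = new ArrayList<>();
        for (String id : idVariables) {
          out.add(row.get(id));
        }
        out.add(variable);
        out.add(value);
        molten.add(out);
      }
    }
    return molten;
  }

  public static void main(String[] args) {
    Map<String, Object> patient = new LinkedHashMap<>();
    patient.put("patient_id", 1234);
    patient.put("age", 42.0);
    patient.put("weight", 186.0);
    patient.put("temperature", 97.4);
    List<List<Object>> molten = melt(
        List.of(patient), List.of("patient_id"), List.of("age", "weight", "temperature"), true);
    for (List<Object> row : molten) {
      System.out.println(row);
    }
  }
}
```

Each source row fans out into as many result rows as there are measured variables, which is why the molten form has only the identifier columns plus variable and value.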
-
cast
public Table cast()
Cast implements the 'tidy' cast operation as described in the papers by Hadley Wickham listed below.
Cast takes a table in 'molten' format, such as is produced by the melt(List, List, boolean) method, and returns a version in standard tidy format.
The molten table should have a StringColumn called "variable" and a column called "value". Every unique variable name will become a column in the output table.
All other columns in this table are considered identifier variables. Each combination of identifier variables specifies an observation, so there will be one row for each, with the other variables added.
Variable columns are returned in an arbitrary order. Use reorderColumns(String...) if column order is important.
Tidy concepts: see https://www.jstatsoft.org/article/view/v059i10
Cast function details: see https://www.jstatsoft.org/article/view/v021i12
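As a rough illustration of the cast operation, the plain-Java sketch below turns molten rows back into one record per combination of identifier values. It is not Tablesaw's implementation: the class CastSketch is hypothetical, and it assumes each molten row is a list whose last two entries are the variable name and the value, with every earlier entry treated as an identifier.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the cast reshaping; not part of Tablesaw.
final class CastSketch {

  /**
   * Groups molten rows by their identifier values. Each group becomes one observation that maps
   * every variable name seen for those identifiers to its value.
   */
  static Map<List<Object>, Map<String, Object>> cast(List<List<Object>> molten) {
    Map<List<Object>, Map<String, Object>> wide = new LinkedHashMap<>();
    for (List<Object> row : molten) {
      int size = row.size();
      if (size < 2) {
        throw new IllegalArgumentException("Each row needs at least a variable and a value");
      }
      // Assumed layout: identifiers first, then variable name, then value.
      List<Object> ids = new ArrayList<>(row.subList(0, size - 2));
      String variable = (String) row.get(size - 2);
      Object value = row.get(size - 1);
      wide.computeIfAbsent(ids, k -> new LinkedHashMap<>()).put(variable, value);
    }
    return wide;
  }

  public static void main(String[] args) {
    List<List<Object>> molten = List.of(
        List.<Object>of(1234, "age", 42.0),
        List.<Object>of(1234, "weight", 186.0),
        List.<Object>of(5678, "age", 30.0));
    cast(molten).forEach((ids, values) -> System.out.println(ids + " -> " + values));
  }
}
```

The sketch keeps variables in first-seen order for readability; the method documented above makes no such promise about column order.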
-
-