Class Table
Tables are the main data-type and primary focus of Tablesaw.
-
Field Summary
Fields
Modifier and Type, Field, Description
static final ReaderRegistry defaultReaderRegistry
static final WriterRegistry defaultWriterRegistry
static final String MELT_VALUE_COLUMN_NAME
static final String MELT_VARIABLE_COLUMN_NAME
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
addColumns(Column<?>... cols)
    Adds the given column to this table.
void addRow
    Adds a single row to this table from sourceTable, copying every column in sourceTable
append
    Appends the given row to this table and returns the table.
append
    Returns this table after adding the data from the argument
appendRow
    Appends an empty row and returns a Row object indexed to the newly added row so values can be set.
cast()
    Cast implements the 'tidy' cast operation as described in papers by Hadley Wickham.
categoricalColumns(String... columnNames)
    Returns only the columns whose names are given in the input array
void clear()
    Clears all the data from this table
Column<?> column(int columnIndex)
    Returns the column at the given index in the column list
Column<?>[] columnArray
    Returns the columns in this table as an array
int columnCount()
    Returns the number of columns in the table
int columnIndex(String columnName)
    Returns the index of the column with the given name
int columnIndex(Column<?> column)
    Returns the index of the given column (its position in the list of columns)
columnNames
    Returns a List of the names of all the columns in this table
columns()
    Returns the list of columns
static boolean compareRows(int rowNumber, Table table1, Table table2)
    Returns true if the row rowNumber in table1 holds the same values as the row at rowNumber in table2.
concat
    Adds all the columns of tableToConcatenate to this table. Note: The columns in the result must have unique names when compared case-insensitively. Note: Both tables must have the same number of rows.
copy()
    Returns a table with the same columns and data as this table
void copyRowsToTable(int[] rows, Table newTable)
    Copies the rows indicated by the row index values in the given array from oldTable to newTable
void copyRowsToTable(Selection rows, Table newTable)
    Copies the rows specified by Selection into newTable
countBy
    Returns a table containing a column for each grouping column, and a column named "Count" that contains the counts for each combination of grouping column values
countBy(CategoricalColumn<?>... groupingColumns)
    Returns a table containing two columns, the grouping column, and a column named "Count" that contains the counts for each grouping column value
static Table create()
    Returns a new, empty table (without rows or columns)
static Table create
    Returns a new, empty table (without rows or columns) with the given name
static Table create(String name, Collection<Column<?>> columns)
    Returns a new table with the given columns and given name
static Table create
    Returns a new table with the given columns and given name
static Table create
    Returns a new table with the given columns and given name
static Table create(Collection<Column<?>> columns)
    Returns a new table with the given columns
static Table create
    Returns a new table with the given columns
static Table create
    Returns a new table with the given columns
dropDuplicateRows
    Returns the unique records in this table, such that any record that appears more than once in this table, appears only once in the returned table.
dropRange(int rowCount)
    Returns a new table EXCLUDING the first rowCount rows if rowCount positive.
dropRange(int rowStart, int rowEnd)
    Returns a table EXCLUDING the rows contained in the range from rowStart inclusive to rowEnd exclusive
dropRows(int... rowNumbers)
    Returns a table EXCLUDING the rows contained in the given array of row indices
dropRowsWithMissingValues
    Returns only those records in this table that have no columns with missing values
dropWhere
    Returns a new Table made by EXCLUDING any rows returned when the given function is applied to this table
dropWhere
    Returns a table EXCLUDING the rows contained in the given Selection
emptyCopy
    Returns a table with the same columns as this table, but no data
emptyCopy(int rowSize)
    Returns a table with the same columns as this table, but no data, initialized to the given row size
first(int nRows)
    Returns a new table containing the first nRows rows of data in this table
inRange(int rowCount)
    Returns a new table containing the first rowCount rows if rowCount positive.
inRange(int rowStart, int rowEnd)
    Returns a new table containing the rows contained in the range from rowStart inclusive to rowEnd exclusive
insertColumn(int index, Column<?> column)
    Adds the given column to this table at the given position in the column list.
void internalAddWithoutValidation
    For internal Tablesaw use only
iterator()
joinOn
    Returns a new DataFrameJoiner initialized with multiple columnNames
last(int nRows)
    Returns a new table containing the last nRows rows of data in this table
melt(List<String> idVariables, List<NumericColumn<?>> measuredVariables, boolean dropMissing)
    Melt implements the 'tidy' melt operation as described in these papers by Hadley Wickham.
missingValueCounts
    Returns a table containing the number of missing values in each column in this table
name()
    Returns the name of the table
pivot(String column1Name, String column2Name, String column3Name, AggregateFunction<?, ?> aggregateFunction)
    Returns a pivot on this table, where: the first column contains unique values from the index column1; there are n additional columns, one for each unique value in column2; and the values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
pivot(CategoricalColumn<?> column1, CategoricalColumn<?> column2, NumericColumn<?> column3, AggregateFunction<?, ?> aggregateFunction)
    Returns a pivot on this table, where: the first column contains unique values from the index column1; there are n additional columns, one for each unique value in column2; and the values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
static DataFrameReader read()
    Returns an object that can be used to read data from a file into a new Table
rejectColumns(int... columnIndexes)
    Returns a new table containing copies of all the columns from this table, except those at the given indexes
rejectColumns(String... columnNames)
    Returns a new table containing copies of all the columns from this table, except those named in the argument
rejectColumns(Column<?>... columns)
    Returns a new table containing copies of all the columns from this table, except those given in the argument
removeColumns(int... columnIndexes)
    Removes the columns at the given indices from this table and returns this table
removeColumns(String... columns)
    Removes the columns with the given names from this table and returns this table
removeColumns(Column<?>... columns)
    Removes the given columns from this table and returns this table
removeColumnsWithMissingValues
    Removes all columns with missing values from this table, and returns this table.
reorderColumns(String... columnNames)
    Returns a new table (shallow copy) that contains all the columns in this table, in the order given in the argument.
replaceColumn(int colIndex, Column<?> newColumn)
    Replaces an existing column (by index) in this table with the given new column
replaceColumn(String columnName, Column<?> newColumn)
    Replaces an existing column (by name) in this table with the given new column
replaceColumn(Column<?> newColumn)
    Replaces an existing column having the same name as the given column with the given column
retainColumns(int... columnIndexes)
    Removes all columns except for those given in the argument from this table and returns this table
retainColumns(String... columnNames)
    Removes all columns except for those given in the argument from this table and returns this table
retainColumns(Column<?>... columns)
    Removes all columns except for those given in the argument from this table and returns this table
rollingIterator(int n)
    Iterates over rolling sets of rows.
rollingStream(int n)
    Streams over rolling sets of rows.
row(int rowIndex)
    Returns a new Row object with its position set to the given zero-based row index.
int rowCount()
    Returns the number of rows in the table
rows(int... rowNumbers)
    Returns a table containing the rows contained in the given array of row indices
sampleN(int nRows)
    Returns a table consisting of randomly selected records from this table
Table[] sampleSplit(double table1Proportion)
sampleX(double proportion)
    Returns a table consisting of randomly selected records from this table.
selectColumns(int... columnIndexes)
    Returns a new table containing copies of the columns at the given indexes
selectColumns(String... columnNames)
    Returns a new table containing copies of the selected columns from this table
selectColumns(Column<?>... columns)
    Returns a new table containing copies of the selected columns from this table
setName
    Sets the name of the table
sortAscendingOn(String... columnNames)
    Returns a copy of this table sorted in the order of the given column names, in ascending order
sortDescendingOn(String... columnNames)
    Returns a copy of this table sorted on the given column names, applied in order, descending. TODO: Provide equivalent methods naming columns by index
sortOn(int... columnIndexes)
    Sorts this table into a new table on the columns indexed
sortOn
    Returns a copy of this table sorted on the given column names, applied in order
sortOn(Comparator<Row> rowComparator)
    Returns a copy of this table sorted using the given comparator
sortOn
    Returns a copy of this table sorted using the given sort key.
splitOn
    Returns a non-overlapping and exhaustive collection of "slices" over this table.
splitOn(CategoricalColumn<?>... columns)
    Returns a non-overlapping and exhaustive collection of "slices" over this table.
steppingIterator(int n)
    Streams over stepped sets of rows.
steppingStream(int n)
    Streams over stepped sets of rows.
Table[] stratifiedSampleSplit(CategoricalColumn<?> column, double table1Proportion)
    Splits the table into two stratified samples. The specified column is used to divide the table into groups, and records are randomly assigned to each sample according to the proportion given in table1Proportion.
stream()
    Returns the rows in this table as a Stream
summarize(String col1Name, String col2Name, String col3Name, String col4Name, AggregateFunction<?, ?>... functions)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(String col1Name, String col2Name, String col3Name, AggregateFunction<?, ?>... functions)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(String numericColumn1Name, String numericColumn2Name, AggregateFunction<?, ?>... functions)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(String columName, AggregateFunction<?, ?>... functions)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(List<String> columnNames, AggregateFunction<?, ?>... functions)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(Column<?> numberColumn, AggregateFunction<?, ?>... function)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(Column<?> column1, Column<?> column2, AggregateFunction<?, ?>... function)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(Column<?> column1, Column<?> column2, Column<?> column3, AggregateFunction<?, ?>... function)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
summarize(Column<?> column1, Column<?> column2, Column<?> column3, Column<?> column4, AggregateFunction<?, ?>... function)
    Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions.
transpose
    Transposes data in the table, switching rows for columns.
transpose(boolean includeColumnHeadingsAsFirstColumn, boolean useFirstColumnForHeadings)
    Transposes data in the table, switching rows for columns.
where
    Returns a new Table made by applying the given function to this table
where
    Returns a table containing the rows contained in the given Selection
write()
    Returns an object that can be used to write data from a Table into a file.
xTabColumnPercents(String column1Name, String column2Name)
    Returns a table with n by m + 1 cells.
xTabCounts(String column1Name)
    Returns a table with two columns: the first contains each unique value in the argument column, and the second contains the number of observations of each value
xTabCounts(String column1Name, String column2Name)
    Returns a table with n by m + 1 cells.
xTabPercents(String column1Name)
    Returns a table with two columns: the first contains each unique value in the argument column, and the second contains the proportion of observations having that value. TODO: Rename the method to xTabProportions, deprecating this version
xTabRowPercents(String column1Name, String column2Name)
    Returns a table with n by m + 1 cells.
xTabTablePercents(String column1Name, String column2Name)
    Returns a table with n by m + 1 cells.
Methods inherited from class tech.tablesaw.table.Relation
as, booleanColumn, booleanColumn, booleanColumns, categoricalColumn, categoricalColumn, column, columns, columns, columnsOfType, colWidths, containsColumn, containsColumn, dateColumn, dateColumn, dateColumns, dateTimeColumn, dateTimeColumn, dateTimeColumns, doubleColumn, doubleColumn, floatColumn, floatColumn, get, getString, getString, getUnformatted, instantColumn, instantColumn, instantColumns, intColumn, intColumn, isEmpty, longColumn, longColumn, nCol, nCol, numberColumn, numberColumn, numberColumns, numericColumns, numericColumns, numericColumns, print, print, printAll, shape, shortColumn, shortColumn, smile, stringColumn, stringColumn, stringColumns, structure, summary, textColumn, textColumn, timeColumn, timeColumn, timeColumns, toString, typeArray, types
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface java.lang.Iterable
forEach, spliterator
-
Field Details
-
defaultReaderRegistry
-
defaultWriterRegistry
-
MELT_VARIABLE_COLUMN_NAME
-
MELT_VALUE_COLUMN_NAME
-
-
Constructor Details
-
Table
Returns a new Table initialized with the given name and columns
- Parameters:
name - The name of the table
columns - One or more columns, all of which must have either the same length or size 0
-
Table
Returns a new Table initialized with the given name and columns
- Parameters:
name - The name of the table
columns - One or more columns, all of which must have either the same length or size 0
-
-
Method Details
-
create
Returns a new, empty table (without rows or columns) -
create
Returns a new, empty table (without rows or columns) with the given name -
create
Returns a new table with the given columns
- Parameters:
columns - one or more columns, all with the same column.size()
-
create
Returns a new table with the given columns
- Parameters:
columns - one or more columns, all with the same column.size()
-
create
Returns a new table with the given columns
- Parameters:
columns - one or more columns, all with the same column.size()
-
create
Returns a new table with the given columns and given name
- Parameters:
name - the name for this table
columns - one or more columns, all with the same column.size()
-
create
Returns a new table with the given columns and given name
- Parameters:
name - the name for this table
columns - one or more columns, all with the same column.size()
-
create
Returns a new table with the given columns and given name
- Parameters:
name - the name for this table
columns - one or more columns, all with the same column.size()
-
read
Returns an object that can be used to read data from a file into a new Table -
write
Returns an object that can be used to write data from a Table into a file. If the file exists, it is overwritten -
addColumns
Adds the given columns to this table. Each column must either be empty or have size() equal to the rowCount() of the table it is being added to. Column names in the table must remain unique.
- Specified by:
addColumns in class Relation
- Returns:
- This Relation
-
internalAddWithoutValidation
For internal Tablesaw use only. Adds the given column to this table without performing duplicate-name or column size checks
-
insertColumn
Adds the given column to this table at the given position in the column list. The column must either be empty or have size() equal to the rowCount() of the table it is being added to. Column names in the table must remain unique.
- Parameters:
index - Zero-based index into the column list
column - Column to be added
-
reorderColumns
Returns a new table (shallow copy) that contains all the columns in this table, in the order given in the argument. Throws an IllegalArgumentException if the number of names given does not match the number of columns in this table. NOTE: This does not make a copy of the columns, so they are shared between the two tables.
- Parameters:
columnNames - a column name or array of names
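The shallow-copy behaviour described above can be illustrated with plain collections. This is only a sketch of the semantics, not Tablesaw code: the class `ReorderSketch` and the map-of-lists table model are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReorderSketch {

    /** Returns a new map with the columns in the requested order; the column lists are shared. */
    static Map<String, List<Integer>> reorderColumns(
            Map<String, List<Integer>> table, String... columnNames) {
        if (columnNames.length != table.size()) {
            throw new IllegalArgumentException(
                "Expected " + table.size() + " column names but got " + columnNames.length);
        }
        Map<String, List<Integer>> reordered = new LinkedHashMap<>();
        for (String name : columnNames) {
            List<Integer> column = table.get(name);
            if (column == null) {
                throw new IllegalArgumentException("No column named " + name);
            }
            reordered.put(name, column); // same list object, not a copy
        }
        return reordered;
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> table = new LinkedHashMap<>();
        table.put("a", new ArrayList<>(Arrays.asList(1, 2)));
        table.put("b", new ArrayList<>(Arrays.asList(3, 4)));

        Map<String, List<Integer>> reordered = reorderColumns(table, "b", "a");
        table.get("a").set(0, 99); // visible through both maps because the list is shared
        System.out.println(reordered); // prints {b=[3, 4], a=[99, 2]}
    }
}
```

Because the columns are shared, a change made through either table shows up in the other; copy the table first if that is not wanted.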
-
replaceColumn
Replaces an existing column (by index) in this table with the given new column
- Parameters:
colIndex - Zero-based index of the column to be replaced
newColumn - Column to be added
-
replaceColumn
Replaces an existing column (by name) in this table with the given new column
- Parameters:
columnName - String name of the column to be replaced
newColumn - Column to be added
-
replaceColumn
Replaces an existing column having the same name as the given column with the given column
- Parameters:
newColumn - Column to be added
-
setName
Sets the name of the table -
column
Returns the column at the given index in the column list -
columnCount
public int columnCount()
Returns the number of columns in the table
- Specified by:
columnCount in class Relation
-
rowCount
public int rowCount()
Returns the number of rows in the table -
columns
Returns the list of columns -
columnArray
Returns the columns in this table as an array -
categoricalColumns
Returns only the columns whose names are given in the input array
- Overrides:
categoricalColumns in class Relation
-
columnIndex
Returns the index of the column with the given name
- Overrides:
columnIndex in class Relation
- Throws:
IllegalArgumentException - if the input string is not the name of any column in the table
-
columnIndex
Returns the index of the given column (its position in the list of columns)
- Specified by:
columnIndex in class Relation
- Throws:
IllegalArgumentException - if the column is not present in this table
-
name
Returns the name of the table -
columnNames
Returns a List of the names of all the columns in this table
- Specified by:
columnNames in class Relation
-
copy
Returns a table with the same columns and data as this table -
emptyCopy
Returns a table with the same columns as this table, but no data -
emptyCopy
Returns a table with the same columns as this table, but no data, initialized to the given row size -
copyRowsToTable
Copies the rows specified by Selection into newTable
- Parameters:
rows - A Selection defining the rows to copy
newTable - The table to copy the rows into
-
copyRowsToTable
Copies the rows indicated by the row index values in the given array from oldTable to newTable -
compareRows
Returns true if the row rowNumber in table1 holds the same values as the row at rowNumber in table2. Returns false if the number of columns is different in the two tables.
- Parameters:
rowNumber - the row to compare
table1 - the first table to compare
table2 - the second table to compare
- Returns:
- false if row rowNumber is different in table1 and table2
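The comparison rule above (same values in every column, and false when the column counts differ) can be sketched as follows. The class `CompareRowsSketch` and the list-of-columns table model are hypothetical; this shows the described behaviour rather than Tablesaw's own implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class CompareRowsSketch {

    /** Each table is modelled as a list of columns; each column is a list of cell values. */
    static boolean compareRows(
            int rowNumber, List<List<Object>> table1, List<List<Object>> table2) {
        if (table1.size() != table2.size()) {
            return false; // different number of columns
        }
        for (int col = 0; col < table1.size(); col++) {
            Object left = table1.get(col).get(rowNumber);
            Object right = table2.get(col).get(rowNumber);
            if (!Objects.equals(left, right)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<Object>> t1 =
            Arrays.asList(Arrays.<Object>asList("a", "b"), Arrays.<Object>asList(1, 2));
        List<List<Object>> t2 =
            Arrays.asList(Arrays.<Object>asList("a", "x"), Arrays.<Object>asList(1, 2));
        System.out.println(compareRows(0, t1, t2)); // prints true
        System.out.println(compareRows(1, t1, t2)); // prints false
    }
}
```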
-
sampleSplit
-
stratifiedSampleSplit
Splits the table into two stratified samples. The specified column is used to divide the table into groups, and records are randomly assigned to each sample according to the proportion given in table1Proportion.
- Parameters:
column - the column to be used for the stratified sampling
table1Proportion - The proportion to go in the first table
- Returns:
- An array of two tables, with the first table having the proportion specified in the method parameter, and the second table having the balance of the rows
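As a rough illustration of stratified splitting, the sketch below groups row indices by the value of a stratum column, shuffles each group, and sends the requested share of every group to the first sample. The class `StratifiedSplitSketch`, the `seed` parameter, and the rounding rule are assumptions made for the example; Tablesaw's actual assignment logic may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class StratifiedSplitSketch {

    /** Returns two lists of row indices: element 0 gets about `proportion` of each group. */
    static List<List<Integer>> split(String[] strata, double proportion, long seed) {
        // Group row indices by the value of the stratum column.
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (int i = 0; i < strata.length; i++) {
            groups.computeIfAbsent(strata[i], k -> new ArrayList<>()).add(i);
        }
        Random random = new Random(seed);
        List<Integer> first = new ArrayList<>();
        List<Integer> second = new ArrayList<>();
        for (List<Integer> group : groups.values()) {
            Collections.shuffle(group, random);
            // Assumption: round the per-group share to the nearest whole row.
            int cut = (int) Math.round(group.size() * proportion);
            first.addAll(group.subList(0, cut));
            second.addAll(group.subList(cut, group.size()));
        }
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        String[] strata = {"a", "a", "a", "a", "a", "a", "b", "b", "b", "b"};
        List<List<Integer>> samples = split(strata, 0.5, 42L);
        System.out.println(samples.get(0).size()); // prints 5
        System.out.println(samples.get(1).size()); // prints 5
    }
}
```

Splitting per group keeps the mix of strata in each sample close to the mix in the full table, which a plain random split does not guarantee.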
-
sampleX
Returns a table consisting of randomly selected records from this table. The sample size is based on the given proportion
- Parameters:
proportion - The proportion to go in the sample
-
sampleN
Returns a table consisting of randomly selected records from this table
- Parameters:
nRows - The number of rows to go in the sample
-
clear
public void clear()
Clears all the data from this table -
first
Returns a new table containing the first nRows rows of data in this table -
last
Returns a new table containing the last nRows rows of data in this table -
sortOn
Sorts this table into a new table on the columns indexed. If an index is negative, that column is sorted in descending order; otherwise it is sorted ascending.
-
sortOn
Returns a copy of this table sorted on the given column names, applied in order. If a column name starts with -, that column is sorted descending; otherwise it is sorted ascending.
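The "-" prefix convention can be sketched with a plain comparator chain. The class `SortKeySketch`, its `row` helper, and the map-based row model are hypothetical; the snippet only shows how keys such as "dept" and "-age" might translate into an ordering.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortKeySketch {

    /** Builds a row comparator from keys such as "age" (ascending) or "-age" (descending). */
    static Comparator<Map<String, Integer>> comparatorFor(String... columnNames) {
        // Start with a comparator that treats all rows as equal, then refine it key by key.
        Comparator<Map<String, Integer>> result = (a, b) -> 0;
        for (String key : columnNames) {
            boolean descending = key.startsWith("-");
            String name = descending ? key.substring(1) : key;
            Comparator<Map<String, Integer>> byColumn =
                (a, b) -> a.get(name).compareTo(b.get(name));
            result = result.thenComparing(descending ? byColumn.reversed() : byColumn);
        }
        return result;
    }

    /** Hypothetical helper that builds one row with a "dept" and an "age" value. */
    static Map<String, Integer> row(int dept, int age) {
        Map<String, Integer> row = new HashMap<>();
        row.put("dept", dept);
        row.put("age", age);
        return row;
    }

    public static void main(String[] args) {
        List<Map<String, Integer>> rows = new ArrayList<>();
        rows.add(row(1, 30));
        rows.add(row(0, 25));
        rows.add(row(1, 45));
        // Ascending by dept, then descending by age within each dept.
        rows.sort(comparatorFor("dept", "-age"));
        for (Map<String, Integer> r : rows) {
            System.out.println(r.get("dept") + " " + r.get("age"));
        }
    }
}
```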
-
sortAscendingOn
Returns a copy of this table sorted in the order of the given column names, in ascending order -
sortDescendingOn
Returns a copy of this table sorted on the given column names, applied in order, descending. TODO: Provide equivalent methods naming columns by index -
sortOn
Returns a copy of this table sorted using the given sort key.
- Parameters:
key - the key to sort on.
- Returns:
- a sorted copy of this table.
-
sortOn
Returns a copy of this table sorted using the given comparator -
addRow
Adds a single row to this table from sourceTable, copying every column in sourceTable
- Parameters:
rowIndex - The row in sourceTable to add to this table
sourceTable - A table with the same column structure as this table
-
row
Returns a new Row object with its position set to the given zero-based row index. -
rows
Returns a table containing the rows contained in the given array of row indices -
dropRows
Returns a table EXCLUDING the rows contained in the given array of row indices -
inRange
Returns a new table containing the first rowCount rows if rowCount positive. Returns the last rowCount rows if rowCount negative. -
inRange
Returns a new table containing the rows contained in the range from rowStart inclusive to rowEnd exclusive -
dropRange
Returns a new table EXCLUDING the first rowCount rows if rowCount positive. Drops the last rowCount rows if rowCount negative. -
dropRange
Returns a table EXCLUDING the rows contained in the range from rowStart inclusive to rowEnd exclusive -
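The range rules for inRange and dropRange (positive counts work from the start, negative counts from the end, and two-argument ranges are start-inclusive and end-exclusive) can be sketched over a plain list of rows. `RangeSketch` is a hypothetical class, and clamping an oversized count to the list size is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RangeSketch {

    /** First rowCount rows if rowCount is positive, last |rowCount| rows if negative. */
    static <T> List<T> inRange(List<T> rows, int rowCount) {
        int n = Math.min(Math.abs(rowCount), rows.size()); // assumption: clamp to size
        if (rowCount >= 0) {
            return new ArrayList<>(rows.subList(0, n));
        }
        return new ArrayList<>(rows.subList(rows.size() - n, rows.size()));
    }

    /** Rows from rowStart (inclusive) to rowEnd (exclusive). */
    static <T> List<T> inRange(List<T> rows, int rowStart, int rowEnd) {
        return new ArrayList<>(rows.subList(rowStart, rowEnd));
    }

    /** Everything except the first rowCount rows (positive) or last |rowCount| rows (negative). */
    static <T> List<T> dropRange(List<T> rows, int rowCount) {
        int n = Math.min(Math.abs(rowCount), rows.size());
        if (rowCount >= 0) {
            return new ArrayList<>(rows.subList(n, rows.size()));
        }
        return new ArrayList<>(rows.subList(0, rows.size() - n));
    }

    /** Everything except the rows from rowStart (inclusive) to rowEnd (exclusive). */
    static <T> List<T> dropRange(List<T> rows, int rowStart, int rowEnd) {
        List<T> kept = new ArrayList<>(rows.subList(0, rowStart));
        kept.addAll(rows.subList(rowEnd, rows.size()));
        return kept;
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(0, 1, 2, 3, 4);
        System.out.println(inRange(rows, 2));      // prints [0, 1]
        System.out.println(inRange(rows, -2));     // prints [3, 4]
        System.out.println(dropRange(rows, 1, 3)); // prints [0, 3, 4]
    }
}
```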
where
Returns a table containing the rows contained in the given Selection -
where
Returns a new Table made by applying the given function to this table -
dropWhere
Returns a new Table made by EXCLUDING any rows returned when the given function is applied to this table -
dropWhere
Returns a table EXCLUDING the rows contained in the given Selection -
pivot
public Table pivot(CategoricalColumn<?> column1, CategoricalColumn<?> column2, NumericColumn<?> column3, AggregateFunction<?, ?> aggregateFunction)
Returns a pivot on this table, where:
  - The first column contains unique values from the index column1
  - There are n additional columns, one for each unique value in column2
  - The values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2
-
pivot
public Table pivot(String column1Name, String column2Name, String column3Name, AggregateFunction<?, ?> aggregateFunction)
Returns a pivot on this table, where:
  - The first column contains unique values from the index column1
  - There are n additional columns, one for each unique value in column2
  - The values in each of the cells in these new columns are the result of applying the given AggregateFunction to the data in column3, grouped by the values of column1 and column2 -
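To make the pivot layout concrete, the sketch below builds the same shape with nested maps: one outer entry per unique column1 value, one inner entry per column2 value, and each cell holding the aggregate of the matching column3 values. `PivotSketch`, its `sum` helper, and the array-based inputs are hypothetical and stand in for Tablesaw's columns and AggregateFunction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.ToDoubleFunction;

class PivotSketch {

    /** Groups c3 by (c1, c2) and applies the aggregate to every group. */
    static Map<String, Map<String, Double>> pivot(
            String[] c1, String[] c2, double[] c3, ToDoubleFunction<List<Double>> aggregate) {
        // Collect the column3 values for every (column1, column2) combination.
        Map<String, Map<String, List<Double>>> grouped = new TreeMap<>();
        for (int i = 0; i < c1.length; i++) {
            grouped.computeIfAbsent(c1[i], k -> new TreeMap<>())
                   .computeIfAbsent(c2[i], k -> new ArrayList<>())
                   .add(c3[i]);
        }
        // Reduce each group to a single cell value.
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, List<Double>>> row : grouped.entrySet()) {
            Map<String, Double> cells = new TreeMap<>();
            for (Map.Entry<String, List<Double>> cell : row.getValue().entrySet()) {
                cells.put(cell.getKey(), aggregate.applyAsDouble(cell.getValue()));
            }
            result.put(row.getKey(), cells);
        }
        return result;
    }

    /** Example aggregate: the sum of the values in a group. */
    static double sum(List<Double> values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        String[] region = {"east", "east", "west", "east"};
        String[] quarter = {"q1", "q2", "q1", "q1"};
        double[] sales = {1, 2, 3, 4};
        System.out.println(pivot(region, quarter, sales, PivotSketch::sum));
        // prints {east={q1=5.0, q2=2.0}, west={q1=3.0}}
    }
}
```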
splitOn
Returns a non-overlapping and exhaustive collection of "slices" over this table. Each slice is like a virtual table containing a subset of the records in this table. This method is intended for advanced or unusual operations on the subtables. If you want to calculate summary statistics for each subtable, the summarize methods, e.g.
table.summarize(myColumn, mean, median).by(columns)
are preferred
-
splitOn
Returns a non-overlapping and exhaustive collection of "slices" over this table. Each slice is like a virtual table containing a subset of the records in this table. This method is intended for advanced or unusual operations on the subtables. If you want to calculate summary statistics for each subtable, the summarize methods, e.g.
table.summarize(myColumn, mean, median).by(columns)
are preferred
-
dropDuplicateRows
Returns the unique records in this table, such that any record that appears more than once in this table, appears only once in the returned table. -
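One simple way to picture duplicate removal is to treat every row as a list of cell values and keep the first occurrence of each distinct row. `DedupSketch` is a hypothetical class, and keeping the original row order is an assumption of this example rather than documented Tablesaw behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

class DedupSketch {

    /** Keeps the first occurrence of each distinct row, preserving encounter order. */
    static List<List<Object>> dropDuplicateRows(List<List<Object>> rows) {
        // List.equals and List.hashCode compare element by element, so equal rows collapse.
        return new ArrayList<>(new LinkedHashSet<>(rows));
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
            Arrays.<Object>asList("a", 1),
            Arrays.<Object>asList("b", 2),
            Arrays.<Object>asList("a", 1));
        System.out.println(dropDuplicateRows(rows)); // prints [[a, 1], [b, 2]]
    }
}
```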
dropRowsWithMissingValues
Returns only those records in this table that have no columns with missing values -
selectColumns
Returns a new table containing copies of the selected columns from this table
- Parameters:
columns - The columns to copy into the new table
-
selectColumns
Returns a new table containing copies of the selected columns from this table
- Parameters:
columnNames - The names of the columns to include
-
rejectColumns
Returns a new table containing copies of all the columns from this table, except those at the given indexes
- Parameters:
columnIndexes - The indexes of the columns to exclude
-
rejectColumns
Returns a new table containing copies of all the columns from this table, except those named in the argument
- Parameters:
columnNames - The names of the columns to exclude
-
rejectColumns
Returns a new table containing copies of all the columns from this table, except those given in the argument
- Parameters:
columns - The columns to exclude
-
selectColumns
Returns a new table containing copies of the columns at the given indexes
- Parameters:
columnIndexes - The indexes of the columns to include
-
removeColumns
Removes the given columns from this table and returns this table
- Specified by:
removeColumns in class Relation
- Returns:
- This Relation
-
removeColumnsWithMissingValues
Removes all columns with missing values from this table, and returns this table. -
retainColumns
Removes all columns except for those given in the argument from this table and returns this table -
retainColumns
Removes all columns except for those given in the argument from this table and returns this table -
retainColumns
Removes all columns except for those given in the argument from this table and returns this table -
append
Returns this table after adding the data from the argument -
append
Appends the given row to this table and returns the table. Note: The table is modified in-place. TODO: Performance
-
removeColumns
Removes the columns with the given names from this table and returns this table
- Overrides:
removeColumns in class Relation
- Returns:
- This Relation
-
removeColumns
Removes the columns at the given indices from this table and returns this table
- Overrides:
removeColumns in class Relation
- Returns:
- This Relation
-
appendRow
Appends an empty row and returns a Row object indexed to the newly added row so values can be set. Intended usage:
for (int i = 0; ...) {
    Row row = table.appendRow();
    row.setString("name", "Bob");
    row.setFloat("IQ", 123.4f);
    ...etc.
}
-
concat
Adds all the columns of tableToConcatenate to this table. Note: The columns in the result must have unique names when compared case-insensitively. Note: Both tables must have the same number of rows.
- Parameters:
tableToConcatenate - The table containing the columns to be added
- Returns:
- This table
-
summarize
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String numericColumn1Name, String numericColumn2Name, AggregateFunction<?, ?>... functions)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String col1Name, String col2Name, String col3Name, AggregateFunction<?, ?>... functions)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(String col1Name, String col2Name, String col3Name, String col4Name, AggregateFunction<?, ?>... functions)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, AggregateFunction<?, ?>... function)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, Column<?> column3, AggregateFunction<?, ?>... function)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
summarize
public Summarizer summarize(Column<?> column1, Column<?> column2, Column<?> column3, Column<?> column4, AggregateFunction<?, ?>... function)
Returns a Summarizer that can be used to summarize the column with the given name(s) using the given functions. This object implements reduce/aggregation operations on a table. Summarizer can return the results as a table using the Summarizer.apply() method. Summarizer can compute sub-totals using the Summarizer.by() method.
-
xTabCounts
Returns a table with n by m + 1 cells. The first column contains labels, the other cells contains the counts for every unique combination of values from the two specified columns in this table. -
xTabRowPercents
Returns a table with n by m + 1 cells. The first column contains labels, the other cells contains the row percents for every unique combination of values from the two specified columns in this table. Row percents total to 100% in every row. -
xTabColumnPercents
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the column percents for every unique combination of values from the two specified columns in this table. Column percents total to 100% in every column.
-
xTabTablePercents
Returns a table with n by m + 1 cells. The first column contains labels; the other cells contain the proportion for each unique combination of values from the two specified columns in this table.
-
xTabPercents
TODO: Rename the method to xTabProportions, deprecating this version.
Returns a table with two columns: the first contains each unique value in the argument, and the second contains the proportion of observations having that value.
-
xTabCounts
Returns a table with two columns: the first contains each unique value in the argument, and the second contains the number of observations of each value.
-
countBy
Returns a table containing two columns: the grouping column and a column named "Count" that contains the counts for each grouping column value.
-
countBy
Returns a table containing a column for each grouping column, and a column named "Count" that contains the counts for each combination of grouping column values
- Parameters:
categoricalColumnNames
- The name(s) of one or more CategoricalColumns in this table
- Returns:
- A table containing counts of rows grouped by the categorical columns
- Throws:
ClassCastException
- if the categoricalColumnName parameter is the name of a column that does not implement categorical
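As a rough illustration of what "counts for each combination of grouping column values" means, here is a plain-Java sketch. It does not use Tablesaw; the class CountBySketch, the method countBy, and the representation of rows as lists of strings addressed by column index are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tablesaw.
final class CountBySketch {

    /**
     * Groups rows by the values found at the given column indexes and
     * counts the rows in each group. Each key is the list of grouping values.
     */
    static Map<List<String>, Integer> countBy(List<List<String>> rows, int... groupingIndexes) {
        Map<List<String>, Integer> counts = new LinkedHashMap<>();
        for (List<String> row : rows) {
            List<String> key = new ArrayList<>();
            for (int index : groupingIndexes) {
                key.add(row.get(index));
            }
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<List<String>> rows = List.of(
            List.of("a", "x", "1"),
            List.of("a", "x", "2"),
            List.of("b", "x", "3"));
        // Group by the first two columns.
        System.out.println(countBy(rows, 0, 1));
    }
}
```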
-
joinOn
Returns a new DataFrameJoiner initialized with multiple columnNames
- Parameters:
columnNames
- Names of the columns to join on.
- Returns:
- The new DataFrameJoiner
-
missingValueCounts
Returns a table containing the number of missing values in each column in this table -
iterator
-
rollingIterator
Iterates over rolling sets of rows, i.e. 0 to n-1, 1 to n, 2 to n+1, etc.
- Parameters:
n
- the number of rows to return for each iteration
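The rolling pattern (0 to n-1, 1 to n, 2 to n+1, ...) can be sketched as index arithmetic in plain Java. The class RollingWindowSketch and the method rollingWindows are hypothetical names, and the sketch assumes that only full windows are produced, which the description above does not state explicitly for the rolling case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tablesaw.
final class RollingWindowSketch {

    /**
     * Returns the row indexes of each rolling window of size n over
     * rowCount rows. Consecutive windows overlap by n - 1 rows.
     * Assumption: windows that would run past the last row are skipped.
     */
    static List<int[]> rollingWindows(int rowCount, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<int[]> windows = new ArrayList<>();
        for (int start = 0; start + n <= rowCount; start++) {
            int[] window = new int[n];
            for (int i = 0; i < n; i++) {
                window[i] = start + i;
            }
            windows.add(window);
        }
        return windows;
    }

    public static void main(String[] args) {
        for (int[] window : rollingWindows(5, 3)) {
            System.out.println(java.util.Arrays.toString(window));
        }
    }
}
```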
-
steppingIterator
Iterates over stepped sets of rows, i.e. 0 to n-1, n to 2n-1, 2n to 3n-1, etc. Only returns full sets of rows.
- Parameters:
n
- the number of rows to return for each iteration
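The stepped pattern differs from the rolling one in that windows do not overlap. A minimal plain-Java sketch of the index arithmetic follows; the class SteppingWindowSketch and the method steppingWindows are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tablesaw.
final class SteppingWindowSketch {

    /**
     * Returns the row indexes of each stepped window of size n over
     * rowCount rows: 0..n-1, n..2n-1, 2n..3n-1, and so on.
     * A trailing partial window is dropped, matching "only full sets of rows".
     */
    static List<int[]> steppingWindows(int rowCount, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<int[]> windows = new ArrayList<>();
        for (int start = 0; start + n <= rowCount; start += n) {
            int[] window = new int[n];
            for (int i = 0; i < n; i++) {
                window[i] = start + i;
            }
            windows.add(window);
        }
        return windows;
    }

    public static void main(String[] args) {
        // Seven rows in steps of three: the seventh row does not fill a window.
        for (int[] window : steppingWindows(7, 3)) {
            System.out.println(java.util.Arrays.toString(window));
        }
    }
}
```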
-
stream
Returns the rows in this table as a Stream -
steppingStream
Streams over stepped sets of rows, i.e. 0 to n-1, n to 2n-1, 2n to 3n-1, etc. Only returns full sets of rows.
- Parameters:
n
- the number of rows to return for each iteration
-
rollingStream
Streams over rolling sets of rows, i.e. 0 to n-1, 1 to n, 2 to n+1, etc.
- Parameters:
n
- the number of rows to return for each iteration
-
transpose
Transposes data in the table, switching rows for columns. For example, a table like this:
 value1 | value2 |
-------------------
 1      | 2      |
 1.1    | 2.1    |
 1.2    | 2.2    |
is transposed into the following:
 0 | 1   | 2   |
-----------------
 1 | 1.1 | 1.2 |
 2 | 2.1 | 2.2 |
- Returns:
- transposed table
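The row/column switch in the example can be reproduced on a plain two-dimensional array. This is only a sketch of the idea, not Tablesaw's code; the class TransposeSketch and its transpose method are hypothetical and ignore column names and types.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Tablesaw.
final class TransposeSketch {

    /**
     * Returns a new array in which element [c][r] holds the value
     * that was at [r][c] in the input. All input rows must have the same length.
     */
    static double[][] transpose(double[][] rows) {
        if (rows.length == 0) {
            return new double[0][0];
        }
        int columnCount = rows[0].length;
        double[][] result = new double[columnCount][rows.length];
        for (int r = 0; r < rows.length; r++) {
            if (rows[r].length != columnCount) {
                throw new IllegalArgumentException("all rows must have the same length");
            }
            for (int c = 0; c < columnCount; c++) {
                result[c][r] = rows[r][c];
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The data from the example: three rows of (value1, value2).
        double[][] data = {{1, 2}, {1.1, 2.1}, {1.2, 2.2}};
        System.out.println(Arrays.deepToString(transpose(data)));
    }
}
```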
-
transpose
public Table transpose(boolean includeColumnHeadingsAsFirstColumn, boolean useFirstColumnForHeadings)
Transposes data in the table, switching rows for columns. For example, a table like this:
 label | value1 | value2 |
---------------------------
 row1  | 1      | 2      |
 row2  | 1.1    | 2.1    |
 row3  | 1.2    | 2.2    |
is transposed into the following:
 label  | row1 | row2 | row3 |
-------------------------------
 value1 | 1    | 1.1  | 1.2  |
 value2 | 2    | 2.1  | 2.2  |
- Parameters:
includeColumnHeadingsAsFirstColumn
- Toggle whether to include the column headings as the first column in the result
useFirstColumnForHeadings
- Use the first column as the column headings in the result. Useful if the data set already has a first column which contains a set of labels
- Returns:
- The transposed table
-
melt
public Table melt(List<String> idVariables, List<NumericColumn<?>> measuredVariables, boolean dropMissing)
Melt implements the 'tidy' melt operation as described in these papers by Hadley Wickham.
Tidy concepts: see https://www.jstatsoft.org/article/view/v059i10
Cast function details: see https://www.jstatsoft.org/article/view/v021i12
In short, melt turns columns into rows, but in a particular way. Used with the cast method, it can help make data tidy. In a tidy dataset, every variable is a column and every observation a row.
This method returns a table that contains all the data in this table, but organized such that there is a set of identifier variables (columns) and a single measured variable (column). For example, given a table with columns:
patient_id, gender, age, weight, temperature,
it returns a table with the columns:
patient_id, variable, value
In the new format, the strings age, weight, and temperature have become cells in the measurement table, such that a single row in the source table might look like this in the result table:
1234, gender, male
1234, age, 42
1234, weight, 186
1234, temperature, 97.4
This kind of structure often makes for a good intermediate format for performing subsequent transformations. It is especially useful when combined with the cast() operation.
- Parameters:
idVariables
- A list of column names intended to be used as identifiers. In the example, only patient_id would be an identifier
measuredVariables
- A list of columns intended to be used as measured variables. All columns must have the same type
dropMissing
- drop any row where the value is missing
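The reshaping that melt describes can be sketched in plain Java on rows held as maps from column name to value. This is an illustration under stated assumptions, not Tablesaw's implementation: the class MeltSketch, its melt method, the string-valued cells, and the convention that a missing value is an absent or null map entry are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tablesaw.
final class MeltSketch {

    /**
     * Turns each input row into one output row per measured variable.
     * Every output row holds the identifier values, then the variable
     * name, then that variable's value.
     * Assumption: a missing value is an absent (or null) map entry.
     */
    static List<List<String>> melt(List<Map<String, String>> rows,
                                   List<String> idVariables,
                                   List<String> measuredVariables,
                                   boolean dropMissing) {
        List<List<String>> molten = new ArrayList<>();
        for (Map<String, String> row : rows) {
            for (String variable : measuredVariables) {
                String value = row.get(variable);
                if (value == null && dropMissing) {
                    continue; // skip rows whose measured value is missing
                }
                List<String> out = new ArrayList<>();
                for (String id : idVariables) {
                    out.add(row.get(id));
                }
                out.add(variable);
                out.add(value);
                molten.add(out);
            }
        }
        return molten;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
            Map.of("patient_id", "1234", "age", "42", "weight", "186", "temperature", "97.4"));
        List<List<String>> molten = melt(
            rows, List.of("patient_id"), List.of("age", "weight", "temperature"), true);
        molten.forEach(System.out::println);
    }
}
```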
-
cast
Cast implements the 'tidy' cast operation as described in these papers by Hadley Wickham:
Tidy concepts: see https://www.jstatsoft.org/article/view/v059i10
Cast function details: see https://www.jstatsoft.org/article/view/v021i12
Cast takes a table in 'molten' format, such as is produced by the melt(List, List, boolean) method, and returns a version in standard tidy format.
The molten table should have a StringColumn called "variable" and a column called "value". Every unique variable name will become a column in the output table.
All other columns in this table are considered identifier variables. Each combination of identifier variables specifies an observation, so there will be one row for each, with the other variables added.
Variable columns are returned in an arbitrary order. Use reorderColumns(String...) if column order is important.
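The inverse reshaping can be sketched the same way: rows that share the same identifier values are merged, and each "variable" name becomes a column holding the matching "value". The class CastSketch and its cast method are hypothetical and work on maps of strings rather than Tablesaw columns. This sketch happens to emit variable columns in first-seen order, whereas the description above makes no ordering guarantee.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tablesaw.
final class CastSketch {

    /**
     * Expects each molten row to contain the keys "variable" and "value";
     * every other key is treated as an identifier. Produces one output row
     * per distinct combination of identifier values.
     */
    static List<Map<String, String>> cast(List<Map<String, String>> molten) {
        Map<Map<String, String>, Map<String, String>> byId = new LinkedHashMap<>();
        for (Map<String, String> row : molten) {
            // The identifier part of the row is everything except variable/value.
            Map<String, String> id = new LinkedHashMap<>(row);
            id.remove("variable");
            id.remove("value");
            // Start each observation from a copy of its identifier columns.
            Map<String, String> observation =
                byId.computeIfAbsent(id, k -> new LinkedHashMap<>(k));
            observation.put(row.get("variable"), row.get("value"));
        }
        return new ArrayList<>(byId.values());
    }

    public static void main(String[] args) {
        List<Map<String, String>> molten = List.of(
            Map.of("patient_id", "1234", "variable", "age", "value", "42"),
            Map.of("patient_id", "1234", "variable", "weight", "value", "186"),
            Map.of("patient_id", "5678", "variable", "age", "value", "35"));
        cast(molten).forEach(System.out::println);
    }
}
```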
-