Package tech.tablesaw.io.csv
Class CsvReadOptions.Builder
java.lang.Object
tech.tablesaw.io.ReadOptions.Builder
tech.tablesaw.io.csv.CsvReadOptions.Builder
- Enclosing class:
- CsvReadOptions
-
Field Summary
Fields inherited from class tech.tablesaw.io.ReadOptions.Builder
columnTypeFunction, columnTypeMap, columnTypes, columnTypesToDetect, completeColumnTypeFunction, dateFormat, dateFormatter, dateTimeFormat, dateTimeFormatter, header, ignoreZeroDecimal, locale, maxCharsPerColumn, minimizeColumnSizes, missingValueIndicators, sample, skipRowsWithInvalidColumnCount, source, tableName, timeFormat, timeFormatter
-
Constructor Summary
Constructors -
Method Summary
Method: Description
allowDuplicateColumnNames(Boolean allow): Enable reading of a table with duplicate column names.
build()
columnTypes(Function<String, ColumnType> columnTypeFunction): Provide a function that determines ColumnType for all column names.
columnTypes(ColumnType[] columnTypes): Provide column types for all columns, skipping the column type autodetection logic.
columnTypesPartial(Function<String, Optional<ColumnType>> columnTypeFunction): Provide a function that determines ColumnType for some column names.
columnTypesPartial(Map<String, ColumnType> columnTypeByName): Provide a map that determines ColumnType for given column names.
columnTypesToDetect(List<ColumnType> columnTypesToDetect)
commentPrefix(Character commentPrefix)
dateFormat(String dateFormat): Deprecated.
dateFormat(DateTimeFormatter dateFormat)
dateTimeFormat(String dateTimeFormat): Deprecated.
dateTimeFormat(DateTimeFormatter dateTimeFormat)
escapeChar(Character escapeChar)
header(boolean header)
ignoreZeroDecimal(boolean ignoreZeroDecimal): Ignore zero value decimals in data values.
lineEnding(String lineEnding)
maxCharsPerColumn(int maxCharsPerColumn)
maxNumberOfColumns(Integer maxNumberOfColumns): Defines the maximum number of columns in the CSV file.
minimizeColumnSizes(): Allow the ColumnTypeDetector to choose shorter column types, such as float instead of double, when the data will fit in a smaller type.
missingValueIndicator(String... missingValueIndicators)
sample(boolean sample)
sampleSize(int numSamples): Defines the maximum number of rows to be read from the file.
skipRowsWithInvalidColumnCount(boolean skipRowsWithInvalidColumnCount): Skip the rows with invalid column count in data values.
timeFormat(String timeFormat): Deprecated.
timeFormat(DateTimeFormatter timeFormat)
-
Constructor Details
-
Builder
-
Builder
- Throws:
IOException
-
Builder
-
Builder
-
Builder
-
Builder
-
-
Method Details
-
columnTypes
Description copied from class: ReadOptions.Builder
Provide column types for all columns, skipping the column type autodetection logic. The array must contain a ColumnType for each column in the table. An error is thrown if the array and the columns do not match up.
- Overrides:
columnTypes in class ReadOptions.Builder
-
columnTypes
Description copied from class: ReadOptions.Builder
Provide a function that determines the ColumnType for all column names. To provide types for only some columns, use ReadOptions.Builder.columnTypesPartial(Function). This method is generally more efficient because it skips column type detection.
- Overrides:
columnTypes in class ReadOptions.Builder
-
columnTypesPartial
public CsvReadOptions.Builder columnTypesPartial(Function<String, Optional<ColumnType>> columnTypeFunction)
Description copied from class: ReadOptions.Builder
Provide a function that determines the ColumnType for some column names. To provide types for all column names, use ReadOptions.Builder.columnTypes(Function), which is generally more efficient because it skips column type detection.
- Overrides:
columnTypesPartial in class ReadOptions.Builder
-
columnTypesPartial
Description copied from class: ReadOptions.Builder
Provide a map that determines the ColumnType for given column names. Types for column names not present in the map are autodetected. To provide types for all column names, use ReadOptions.Builder.columnTypes(Function), which is generally more efficient because it skips column type detection.
- Overrides:
columnTypesPartial in class ReadOptions.Builder
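The two columnTypesPartial overloads describe the same idea: a lookup that yields a type for some column names and nothing for the rest. The following is only a sketch of that relationship, not the library's implementation. The class PartialTypeLookup and the method fromMap are hypothetical, and plain strings stand in for ColumnType values so that the example compiles without the library.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of the library.
class PartialTypeLookup {

    // Turns a name-to-type map into a function that returns Optional.empty()
    // for every column name the map does not mention.
    static <T> Function<String, Optional<T>> fromMap(Map<String, T> typeByName) {
        return name -> Optional.ofNullable(typeByName.get(name));
    }

    public static void main(String[] args) {
        // Strings stand in for ColumnType values in this sketch.
        Function<String, Optional<String>> lookup =
                fromMap(Map.of("id", "STRING", "created", "LOCAL_DATE"));

        System.out.println(lookup.apply("id").isPresent());     // a type was supplied
        System.out.println(lookup.apply("amount").isPresent()); // left to autodetection
    }
}
```

A column for which the lookup returns an empty Optional is the case the description calls "autodetected".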
-
separator
-
quoteChar
-
escapeChar
-
lineEnding
-
maxNumberOfColumns
Defines the maximum number of columns in the CSV file.
- Parameters:
maxNumberOfColumns - must be a positive integer. Default is 10,000.
-
commentPrefix
-
sampleSize
Defines the maximum number of rows to be read from the file. Sampling is performed in a single pass using the reservoir sampling algorithm (https://en.wikipedia.org/wiki/Reservoir_sampling). Given a file with 'n' rows, if 'numSamples' is smaller than 'n', then exactly 'numSamples' random samples are returned; if 'numSamples' is greater than 'n', then only 'n' samples are returned (no oversampling is performed to increase the data to match 'numSamples').
-
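The single-pass behaviour described for sampleSize can be illustrated with a basic reservoir sampler. This is a minimal sketch of the textbook technique (often called Algorithm R) and is not taken from the library source; the class ReservoirSample and its sample method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of the library.
class ReservoirSample {

    // Returns at most k rows chosen uniformly at random in one pass over the input.
    // If the input has fewer than k rows, all of them are returned (no oversampling).
    static <T> List<T> sample(Iterator<T> rows, int k, Random rng) {
        List<T> reservoir = new ArrayList<>(k);
        int seen = 0; // rows read so far; an int is enough for this sketch
        while (rows.hasNext()) {
            T row = rows.next();
            seen++;
            if (reservoir.size() < k) {
                // Fill phase: the first k rows are always kept.
                reservoir.add(row);
            } else {
                // Replacement phase: keep the new row with probability k / seen.
                int j = rng.nextInt(seen);
                if (j < k) {
                    reservoir.set(j, row);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(sample(rows.iterator(), 3, new Random()).size());  // 3
        System.out.println(sample(rows.iterator(), 20, new Random()).size()); // 10
    }
}
```

The sampler never needs to know 'n' in advance, which is what makes a single pass over the file sufficient.
-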
build
- Overrides:
build in class ReadOptions.Builder
-
header
- Overrides:
header in class ReadOptions.Builder
-
allowDuplicateColumnNames
Enable reading of a table with duplicate column names. After the first appearance of a column name, subsequent appearances will have a number appended.
- Overrides:
allowDuplicateColumnNames in class ReadOptions.Builder
- Parameters:
allow - if true, duplicate names will be allowed
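The renaming rule for allowDuplicateColumnNames can be pictured with a small counter-based sketch. The description above only says that "a number" is appended, so the "-2", "-3" suffix format used here is an assumption, as are the class DuplicateColumnNames and the method makeUnique; neither is part of the library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the library.
class DuplicateColumnNames {

    // Keeps the first appearance of each name unchanged and appends
    // "-<count>" to later appearances (the suffix format is assumed).
    static List<String> makeUnique(List<String> names) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>(names.size());
        for (String name : names) {
            int count = seen.merge(name, 1, Integer::sum);
            result.add(count == 1 ? name : name + "-" + count);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [id, name, id-2, id-3]
        System.out.println(makeUnique(List.of("id", "name", "id", "id")));
    }
}
```

This sketch does not guard against a generated name colliding with a header that already exists in the file, such as a literal "id-2" column.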
-
columnTypesToDetect
- Overrides:
columnTypesToDetect in class ReadOptions.Builder
-
tableName
- Overrides:
tableName in class ReadOptions.Builder
-
sample
- Overrides:
sample in class ReadOptions.Builder
-
dateFormat
Deprecated.
Description copied from class: ReadOptions.Builder
Use dateFormat(DateTimeFormatter dateFormat) instead.
- Overrides:
dateFormat in class ReadOptions.Builder
-
timeFormat
Deprecated.
Description copied from class: ReadOptions.Builder
Use timeFormat(DateTimeFormatter timeFormat) instead.
- Overrides:
timeFormat in class ReadOptions.Builder
-
dateTimeFormat
Deprecated.
Description copied from class: ReadOptions.Builder
Use dateTimeFormat(DateTimeFormatter dateTimeFormat) instead.
- Overrides:
dateTimeFormat in class ReadOptions.Builder
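The String overloads of dateFormat, timeFormat and dateTimeFormat are deprecated in favour of the DateTimeFormatter overloads. As a rough guide for migrating, the sketch below builds formatters with the standard java.time API. The patterns and the class name FormatterExamples are illustrative assumptions; only the standard library is used so that the example runs on its own.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

// Hypothetical example class, not part of the library.
class FormatterExamples {

    // Formatter objects of this kind are what dateFormat(DateTimeFormatter),
    // timeFormat(DateTimeFormatter) and dateTimeFormat(DateTimeFormatter) accept.
    static final DateTimeFormatter DATE = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("HH:mm:ss");
    static final DateTimeFormatter DATE_TIME = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public static void main(String[] args) {
        // Check each pattern against a sample value before handing it to the builder.
        System.out.println(LocalDate.parse("2024-03-15", DATE));
        System.out.println(LocalTime.parse("08:30:00", TIME));
        System.out.println(LocalDateTime.parse("2024-03-15 08:30:00", DATE_TIME));
    }
}
```

Parsing a sample value up front surfaces a wrong pattern as a DateTimeParseException here rather than later, while the file is being read.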
-
dateFormat
- Overrides:
dateFormat in class ReadOptions.Builder
-
timeFormat
- Overrides:
timeFormat in class ReadOptions.Builder
-
dateTimeFormat
- Overrides:
dateTimeFormat in class ReadOptions.Builder
-
maxCharsPerColumn
- Overrides:
maxCharsPerColumn in class ReadOptions.Builder
-
locale
- Overrides:
locale in class ReadOptions.Builder
-
missingValueIndicator
- Overrides:
missingValueIndicator in class ReadOptions.Builder
-
minimizeColumnSizes
Description copied from class: ReadOptions.Builder
Allow the ColumnTypeDetector to choose shorter column types, such as float instead of double, when the data will fit in a smaller type.
- Overrides:
minimizeColumnSizes in class ReadOptions.Builder
-
ignoreZeroDecimal
Description copied from class: ReadOptions.Builder
Ignore zero value decimals in data values. Defaults to true.
- Overrides:
ignoreZeroDecimal in class ReadOptions.Builder
-
skipRowsWithInvalidColumnCount
public CsvReadOptions.Builder skipRowsWithInvalidColumnCount(boolean skipRowsWithInvalidColumnCount)
Description copied from class: ReadOptions.Builder
Skip the rows with invalid column count in data values. Defaults to false.
- Overrides:
skipRowsWithInvalidColumnCount in class ReadOptions.Builder
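One way to picture what skipping rows with an invalid column count means is a filter that drops every data row whose field count differs from the header's. This is a sketch under that assumed reading of "invalid", not the library's parser. The class ColumnCountFilter and the method keepValidRows are hypothetical, and the naive split ignores quoting and escaping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the library.
class ColumnCountFilter {

    // Treats the first line as the header and keeps only the data rows
    // that have the same number of fields as the header.
    static List<String[]> keepValidRows(List<String> lines, char separator) {
        List<String[]> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String sep = Pattern.quote(String.valueOf(separator));
        // A limit of -1 keeps trailing empty fields so they count as columns.
        int expected = lines.get(0).split(sep, -1).length;
        for (String line : lines.subList(1, lines.size())) {
            String[] fields = line.split(sep, -1);
            if (fields.length == expected) {
                rows.add(fields);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("a,b,c", "1,2,3", "4,5", "6,7,8,9", "10,11,12");
        System.out.println(keepValidRows(lines, ',').size()); // 2
    }
}
```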
-