CSV configuration used to create CSVParser or CSVRenderer.
This config may be used as a builder to create a parser:
val parser = CSVConfig().fieldSizeLimit(1000).noHeader.parser[IO]
or renderer:
val renderer = CSVConfig().escapeSpaces.noHeader.renderer[IO]
Field delimiter is ',' by default.
Record delimiter is '\n' by default. When the delimiter is set to line feed ('\n', ASCII 10) and it is preceded by carriage return ('\r', ASCII 13), they are treated as a single character.
Quotation mark is '"' by default. Special characters (field and record delimiters) must be wrapped in quotes. A quotation mark in actual content may appear only inside quotation marks, and it has to be doubled to be interpreted as part of the data rather than as a control character.
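The quoting rules above can be illustrated with a minimal, standalone sketch in plain Scala. It is not the library's parser: QuotingSketch and splitRecord are hypothetical names, the input is assumed to be a single, well-formed record, and malformed input (such as a missing closing quote) is not reported.

```scala
object QuotingSketch {
  // Hypothetical helper for illustration only; not part of the documented API.
  // Splits one record into fields, treating delimiters inside quotes as data
  // and a doubled quotation mark inside quotes as a single literal one.
  def splitRecord(line: String, delimiter: Char = ',', quote: Char = '"'): Vector[String] = {
    val fields = Vector.newBuilder[String]
    val current = new StringBuilder
    var inQuotes = false
    var i = 0
    while (i < line.length) {
      val c = line.charAt(i)
      if (inQuotes) {
        if (c == quote) {
          if (i + 1 < line.length && line.charAt(i + 1) == quote) {
            current.append(quote) // doubled quote: keep one as data
            i += 1                // and skip its twin
          } else inQuotes = false // single quote: end of quoted section
        } else current.append(c)  // anything else inside quotes is data
      } else if (c == quote) inQuotes = true
      else if (c == delimiter) {
        fields += current.toString // field boundary
        current.clear()
      } else current.append(c)
      i += 1
    }
    fields += current.toString // last field has no trailing delimiter
    fields.result()
  }
}
```

For example, the record a,"b,c","say ""hi""" would be split into three fields, the second containing a comma and the third containing literal quotation marks.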
While parsing, the header setting defines whether a header is present in the source data, which is true by default. The header is used as a keyset for actual values and is not included in the data. If there is no header, number-based keys in tuple style are created (starting from "_1"). While rendering, the header setting defines whether a header row should be added to the output. If no header is explicitly defined, a number-based one is used, as for parsing.
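As a rough illustration of the tuple-style keys, the sketch below generates "_1", "_2", and so on, and pairs them with the values of a record. ImplicitHeaderSketch, implicitHeader and keyed are hypothetical names used only here; the library's internal representation may differ.

```scala
object ImplicitHeaderSketch {
  // Generates tuple-style keys "_1", "_2", ... for a record with `size` fields.
  def implicitHeader(size: Int): Vector[String] =
    (1 to size).map(i => s"_$i").toVector

  // Pairs each value with a key: the given header if there is one,
  // otherwise the generated number-based keys.
  def keyed(values: Vector[String], header: Option[Vector[String]]): Vector[(String, String)] =
    header.getOrElse(implicitHeader(values.size)).zip(values)
}
```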
If CSV records are converted to case classes, header values are used as class fields and may require remapping. This can be achieved through mapHeader:
config.mapHeader("first name" -> "firstName", "last name" -> "lastName")
or if an implicit header is generated:
config.mapHeader("_1" -> "firstName", "_2" -> "lastName")
A Map instance may be provided instead of a sequence of pairs:
val hm = Map("first name" -> "firstName", "last name" -> "lastName")
config.mapHeader(hm)
Remapping may be provided for any subset of header names. Names that do not match are omitted.
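The effect of name-based remapping can be sketched in plain Scala as follows. This is an illustration under assumptions, not the library's code: NameRemapSketch and remapByName are hypothetical, and the sketch assumes that header names without a mapping keep their original value while mapping entries that match no header name are simply ignored.

```scala
object NameRemapSketch {
  // Replaces every header name that has an entry in `mapping`;
  // names without an entry are assumed to stay unchanged.
  def remapByName(header: Seq[String], mapping: Map[String, String]): Seq[String] =
    header.map(name => mapping.getOrElse(name, name))
}
```

Because the lookup is done per name, every occurrence of a duplicated header name receives the same new name.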
Header mapping may also be position-based, which is especially handy when the header contains duplicates that name-based remapping cannot resolve (because it remaps all occurrences):
config.mapHeader("firstName", "lastName")
New names are set for subsequent fields, starting from the first one (0-index). If the list is shorter than the header, old names are retained. If the list is longer than the header, superfluous names are omitted. Again, a Map instance may be provided in this case, which allows selective remapping:
val hm = Map(0 -> "firstName", 1 -> "lastName", 5 -> "birth date")
config.mapHeader(hm)
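A minimal sketch of the position-based rules, again in plain Scala with hypothetical names (PositionRemapSketch, remapByPosition, remapByIndex) and no claim to mirror the library's implementation:

```scala
object PositionRemapSketch {
  // List variant: the i-th new name replaces the i-th header name.
  // A shorter list leaves the remaining names as they are;
  // surplus new names are dropped.
  def remapByPosition(header: Seq[String], names: Seq[String]): Seq[String] =
    header.zipWithIndex.map { case (old, i) => names.lift(i).getOrElse(old) }

  // Map variant: only the listed 0-based positions are renamed;
  // indices beyond the header length have no effect.
  def remapByIndex(header: Seq[String], mapping: Map[Int, String]): Seq[String] =
    header.zipWithIndex.map { case (old, i) => mapping.getOrElse(i, old) }
}
```

With a header of name, name, age and the list firstName, lastName, the duplicated names are told apart by position and age is retained.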
Remapping may be used for the renderer as well, allowing a customized header when converting data from case classes or tuples.
Unescaped fields with leading or trailing spaces may be automatically trimmed while parsing when trimSpaces is set to true. This setting is false by default and white spaces are preserved, even for unescaped fields.
Field size limit is used to stop processing input when a field is significantly larger than expected, to avoid OutOfMemoryError. This might happen if the source structure is invalid, e.g. the closing quotation mark is missing. There is no limit by default.
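The idea behind the limit can be sketched as a reader that gives up as soon as the accumulated field grows past the limit, instead of buffering unbounded input. FieldSizeLimitSketch and readField are hypothetical, and the sketch assumes the limit is counted in characters and that a field of exactly the limit size is still accepted; the library may define these details differently.

```scala
import scala.annotation.tailrec

object FieldSizeLimitSketch {
  // Accumulates characters into a field and stops with an error as soon as
  // the field exceeds `limit`, so memory use stays bounded even for endless input.
  def readField(chars: Iterator[Char], limit: Option[Int]): Either[String, String] = {
    @tailrec
    def loop(acc: StringBuilder): Either[String, String] =
      if (limit.exists(acc.length > _)) Left(s"field size limit of ${limit.get} exceeded")
      else if (!chars.hasNext) Right(acc.toString)
      else loop(acc.append(chars.next()))
    loop(new StringBuilder)
  }
}
```

With no limit (None) the reader consumes the whole input, which mirrors the default described above.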
While rendering CSV content, different quoting policies may be used, which is controlled by the escapeMode setting. By default, only fields which contain a field delimiter, record delimiter or quotation mark are put into quotes. When set to EscapeSpaces, quotes are additionally put around fields with leading or trailing spaces. EscapeAll results in putting quotes around all fields.
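The three quoting policies can be compared with a small standalone sketch. The Mode type and its Required, Spaces and All cases are hypothetical stand-ins for the escape modes named above, and render is an illustrative helper rather than the library's renderer.

```scala
object EscapeModeSketch {
  // Hypothetical stand-ins for the escape modes described in the text.
  sealed trait Mode
  case object Required extends Mode // quote only when the content demands it
  case object Spaces extends Mode   // additionally quote leading/trailing spaces
  case object All extends Mode      // quote every field

  def render(
    field: String,
    mode: Mode,
    fieldDelimiter: Char = ',',
    recordDelimiter: Char = '\n',
    quote: Char = '"'
  ): String = {
    // Field contains a delimiter or a quotation mark.
    val special = field.exists(c => c == fieldDelimiter || c == recordDelimiter || c == quote)
    // Field starts or ends with a space.
    val padded = field.startsWith(" ") || field.endsWith(" ")
    val mustQuote = mode match {
      case All      => true
      case Spaces   => special || padded
      case Required => special
    }
    if (mustQuote) {
      val q = quote.toString
      q + field.replace(q, q + q) + q // double embedded quotes, then wrap
    } else field
  }
}
```

Under these assumptions, a field with a leading space is left bare in the Required mode and quoted in the Spaces mode, while a field containing the field delimiter is quoted in every mode.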
Value parameters
- escapeMode: method of escaping fields, EscapeRequired by default, valid only for rendering
- fieldDelimiter: field (cell) separator, ',' by default
- fieldSizeLimit: maximal size of a field, None by default, valid only for parsing
- hasHeader: set if data starts with header row, true by default
- headerMap: definition of header remapping, by name or index, empty by default
- quoteMark: character used to wrap (quote) field content, '"' by default
- recordDelimiter: record (row) separator, '\n' by default
- trimSpaces: flag to strip spaces, false by default, valid only for parsing
Attributes
- Companion: object