info.fingo.spata
Type members
Classlikes
CSV configuration used to create CSVParser or CSVRenderer.
This config may be used as a builder to create a parser:
val parser = CSVConfig().fieldSizeLimit(1000).noHeader.parser[IO]
or renderer:
val renderer = CSVConfig().escapeSpaces.noHeader.renderer[IO]
Field delimiter is ',' by default.
Record delimiter is '\n' by default. When the delimiter is set to line feed ('\n', ASCII 10) and it is preceded by carriage return ('\r', ASCII 13), they are treated as a single character.
Quotation mark is '"' by default. Special characters (field and record delimiters) must be wrapped in quotes. A quotation mark in actual content may appear only inside quotation marks, and it has to be doubled to be interpreted as part of the data rather than as a control character.
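The quoting rule can be illustrated with a small, library-independent sketch. The QuoteSketch object and its unquote function are hypothetical names introduced here for illustration; they are not part of spata and only show how a quoted field with doubled quotation marks could map back to data.

```scala
// Hypothetical helper, not part of spata: decodes a single raw CSV field.
object QuoteSketch {
  def unquote(raw: String, quote: Char = '"'): String = {
    val q = quote.toString
    if (raw.length >= 2 && raw.startsWith(q) && raw.endsWith(q))
      // Drop the wrapping quotes, then collapse each doubled quote into one.
      raw.substring(1, raw.length - 1).replace(q + q, q)
    else
      raw // unquoted fields are taken literally
  }
}
```

A field such as "a,b" (including the quotes) decodes to a,b, so the embedded field delimiter survives as data.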
While parsing, the header setting defines whether a header is present in source data, which is true by default. The header is used as a key set for actual values and is not included in data. If there is no header, number-based keys in tuple style are created (starting from "_1"). While rendering, the header setting defines whether a header row should be added to output. If no header is explicitly defined, a number-based one is used, as for parsing.
If CSV records are converted to case classes, header values are used as class fields and may require remapping. This can be achieved through mapHeader:
config.mapHeader("first name" -> "firstName", "last name" -> "lastName")
or if an implicit header is generated:
config.mapHeader("_1" -> "firstName", "_2" -> "lastName")
A Map instance may be provided instead of a sequence of pairs:
val hm = Map("first name" -> "firstName", "last name" -> "lastName")
config.mapHeader(hm)
Remapping may be provided for any subset of header names. Names that do not match are omitted.
Header mapping may also be position-based, which is especially handy when there are duplicates in the header and name-based remapping does not solve the problem (because it remaps all occurrences):
config.mapHeader("firstName", "lastName")
New names are set for subsequent fields, starting from the first one (0-index). If the list is shorter than the header, old names are retained. If the list is longer than the header, superfluous names are omitted. Again, a Map instance may be provided in this case, which allows selective remapping:
val hm = Map(0 -> "firstName", 1 -> "lastName", 5 -> "birth date")
config.mapHeader(hm)
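The position-based rules above can be sketched in plain Scala. RemapSketch, byPosition and byIndex are hypothetical names used only for illustration; this is not spata's implementation, just a model of the described behavior (shorter lists keep old names, surplus names and out-of-range indices are dropped).

```scala
// Hypothetical sketch, not spata's implementation.
object RemapSketch {
  // Sequence form: the i-th new name replaces the i-th header name.
  def byPosition(header: List[String], names: String*): List[String] =
    byIndex(header, names.zipWithIndex.map { case (n, i) => i -> n }.toMap)

  // Map form: only the listed 0-based indices are renamed.
  def byIndex(header: List[String], mapping: Map[Int, String]): List[String] =
    header.zipWithIndex.map { case (old, i) => mapping.getOrElse(i, old) }
}
```

With a header containing duplicates, for example List("name", "name", "age"), the sequence form can give each occurrence a distinct name, which name-based remapping cannot do.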
Remapping may be used for renderer as well, allowing customized header while converting data from case classes or tuples.
Unescaped fields with leading or trailing spaces may be automatically trimmed while parsing when trimSpaces is set to true. This setting is false by default and white spaces are preserved, even for unescaped fields.
Field size limit is used to stop processing input when it is significantly larger than expected, to avoid OutOfMemoryError. This might happen if the source structure is invalid, e.g. the closing quotation mark is missing. There is no limit by default.
While rendering CSV content, different quoting policies may be used, which is controlled by the escapeMode setting. By default only fields which contain a field delimiter, record delimiter or quotation mark are put into quotes. When set to EscapeSpaces, quotes are additionally put around fields with leading or trailing spaces. EscapeAll results in putting quotes around all fields.
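The three quoting policies can be modeled with a short sketch. EscapeSketch and its Required, Spaces and All modes are hypothetical stand-ins for EscapeRequired, EscapeSpaces and EscapeAll; the code does not use spata and assumes that "spaces" means the plain space character only.

```scala
// Hypothetical model of the escape modes, not spata's implementation.
object EscapeSketch {
  sealed trait Mode
  case object Required extends Mode // stand-in for EscapeRequired
  case object Spaces extends Mode   // stand-in for EscapeSpaces
  case object All extends Mode      // stand-in for EscapeAll

  def render(
    field: String,
    mode: Mode,
    fieldDelim: Char = ',',
    recordDelim: Char = '\n',
    quote: Char = '"'
  ): String = {
    // Field contains a delimiter or a quotation mark.
    val special = field.exists(c => c == fieldDelim || c == recordDelim || c == quote)
    // Field starts or ends with a space (assumption: only ' ' is checked).
    val padded = field.nonEmpty && (field.head == ' ' || field.last == ' ')
    val wrap = mode match {
      case Required => special
      case Spaces   => special || padded
      case All      => true
    }
    if (wrap) {
      val q = quote.toString
      // Double embedded quotation marks, then wrap the whole field.
      q + field.replace(q, q + q) + q
    } else field
  }
}
```

Under this model a field with a leading space is left untouched in the default mode and is quoted only when the Spaces or All mode is selected.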
Value parameters
- escapeMode
-
method of escaping fields, EscapeRequired by default, valid only for rendering
- fieldDelimiter
-
field (cell) separator, ',' by default
- fieldSizeLimit
-
maximal size of a field, None by default, valid only for parsing
- hasHeader
-
set if data starts with header row, true by default
- headerMap
-
definition of header remapping, by name or index, empty by default
- quoteMark
-
character used to wrap (quote) field content, '"' by default
- recordDelimiter
-
record (row) separator, '\n' by default
- trimSpaces
-
flag to strip spaces, false by default, valid only for parsing
Attributes
- Companion
- object
- Supertypes
A utility for parsing comma-separated values (CSV) sources. The source is assumed to conform to RFC 4180, although some aspects of its format are configurable.
The parser may be created with default configuration:
val parser = CSVParser[IO]
or through CSVParser.config helper function to set custom properties:
val parser = CSVParser.config.fieldDelimiter(';').parser[IO]
Actual parsing is done through one of the 3 groups of methods:
- parse to transform a stream of characters (or stream of strings in case of parseS) into records and process data in a functional way, which is the recommended approach,
- get to fetch whole source data at once into a list,
- process to deal with individual records through a callback function.
This parser is normally used with a stream fetching data from some external source, so its computations are wrapped for deferred evaluation into an effect F, e.g. cats.effect.IO. Basic parsing does not impose any special requirements on F, except its support for suspended execution, which requires a given instance of cats.effect.Sync.
To trigger evaluation, one of the unsafe operations on F has to be called. Their exact form depends on the actual effect in use (e.g. cats.effect.IO.unsafeRunSync).
No method in this class performs a context (thread) shift, and by default they execute synchronously on the current thread. Concurrency or asynchronous execution may be introduced through various fs2.Stream methods. There is also a supporting class, CSVParser#Async, which provides a method for asynchronous callbacks.
Type parameters
- F
-
the effect type, with a type class providing support for suspended execution (typically cats.effect.IO) and logging (provided internally by spata)
Value parameters
- config
-
the configuration for CSV parsing (delimiters, header presence etc.)
Attributes
- Constructor
-
Creates parser with provided configuration.
- Companion
- object
- Supertypes
-
class Object, trait Matchable, class Any
A utility for rendering data to CSV representation.
The renderer may be created with default configuration:
val renderer = CSVRenderer[IO]
or through CSVRenderer.config helper function to set custom properties:
val renderer = CSVRenderer.config.fieldDelimiter(';').renderer[IO]
Actual rendering is done through one of the 2 groups of methods:
- render to transform a stream of records into stream of characters (or strings in case of renderS), which represent full CSV content.
- rows to convert records to strings representing individual CSV rows.
This renderer is normally used with a stream supplying data to some external destination, so its computations are wrapped for deferred evaluation into an effect F, e.g. cats.effect.IO. Basic rendering does not impose any special requirements on F, except its support for raising and handling errors, which requires a given instance of fs2.RaiseThrowable, which effectively means cats.ApplicativeError.
To trigger evaluation, one of the unsafe operations on F has to be called. Their exact form depends on the actual effect in use (e.g. cats.effect.IO.unsafeRunSync).
No method in this class performs a context (thread) shift, and by default they execute synchronously on the current thread. Concurrency or asynchronous execution may be introduced through various fs2.Stream methods.
Type parameters
- F
-
the effect type, with a type class providing support for raising and handling errors
Value parameters
- config
-
the configuration for CSV rendering (delimiters, header presence etc.)
Attributes
- Companion
- object
- Supertypes
-
class Object, trait Matchable, class Any
CSVRenderer companion object with convenience methods to create renderers.
Attributes
- Companion
- class
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
-
CSVRenderer.type
CSV header with names of each field.
A header created through the parsing process is guaranteed to have no duplicate names. This guarantee does not hold for user-created headers. Providing duplicates does not cause any errors while accessing record data; however, the values associated with duplicates will not be accessible by field name.
Value parameters
- names
-
the sequence of names
Attributes
- Companion
- object
- Supertypes
-
class Object, trait Matchable, class Any
Trait representing header remapping methods. It is not used directly but through conversion of an [S2S] or [I2S] partial function to one of its implementation classes.
Attributes
- See also
-
[CSVConfig] for sample usage.
- Companion
- object
- Supertypes
-
class Object, trait Matchable, class Any
Representation of location of record in source data.
rowNum is the record counter. It starts with 1 for data, with the header row having number 0. It differs from lineNum for sources with a header or with fields containing line breaks.
lineNum is the last line in source data whose content is used to parse a record; in other words, it is the number of lines consumed so far to load a record. It starts with 1, including the header line, so the first data record typically has line number 2. There may be many lines per record when some fields contain line breaks. A new line is interpreted independently of the CSV record separator, as the standard platform EOL character sequence.
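The relation between the two counters can be sketched as follows. PositionSketch, Pos and positions are hypothetical names for illustration only; the sketch assumes a single-line header and takes, for each data record, the number of physical lines it spans.

```scala
// Hypothetical model of row and line numbering, not spata's implementation.
object PositionSketch {
  final case class Pos(row: Int, line: Int)

  // linesPerRecord(i) is the number of physical lines the i-th data record spans.
  def positions(linesPerRecord: List[Int], hasHeader: Boolean = true): List[Pos] = {
    val headerLines = if (hasHeader) 1 else 0 // assumption: header occupies one line
    // The running total of consumed lines gives the last line of each record.
    val lastLines = linesPerRecord.scanLeft(headerLines)(_ + _).tail
    // Data rows are numbered from 1.
    lastLines.zipWithIndex.map { case (line, i) => Pos(i + 1, line) }
  }
}
```

For a source with a header and three records, the second of which spans two lines, this yields rows 1, 2, 3 at lines 2, 4, 5.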
Value parameters
- line
-
the line number
- row
-
the row number
Attributes
- Supertypes
CSV record representation. A record is basically a map from string to string. Values are indexed by a header provided explicitly or read from the header row in source data. If no header is provided and none can be read from the source, a tuple-style header "_1", "_2" etc. is generated.
Position information is always available for parsed data - for records created through CSVParser. It is missing for records created explicitly in application code (in order to be rendered to CSV).
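The "map from string to string" view, including the generated tuple-style header, can be modeled in a few lines. RecordSketch, tupleHeader and toMap are hypothetical names; the sketch does not reflect how spata stores records internally.

```scala
// Hypothetical model of a record as a map, not spata's implementation.
object RecordSketch {
  // Tuple-style names "_1", "_2", ... for a record with `size` fields.
  def tupleHeader(size: Int): List[String] =
    List.tabulate(size)(i => s"_${i + 1}")

  // Index values by an explicit header, or by a generated one when none is given.
  def toMap(values: List[String], header: Option[List[String]] = None): Map[String, String] =
    header.getOrElse(tupleHeader(values.size)).zip(values).toMap
}
```

Because the result is a plain Map, a header with duplicate names would keep only one value per name in this model, which mirrors the caveat about duplicates in user-created headers.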
Value parameters
- hdr
-
indexing header (field names)
- position
-
record position in source data
- values
-
core record data
Attributes
- Companion
- object
- Supertypes
-
class Object, trait Matchable, class Any
- Self type
Types
Convenience type representing result of decoding record data.
Attributes
Convenience type.
Attributes
Convenience type.