info.fingo.spata

Members list

Type members

Classlikes

final case class CSVConfig

CSV configuration used to create CSVParser or CSVRenderer.

This config may be used as a builder to create a parser:

val parser = CSVConfig().fieldSizeLimit(1000).noHeader.parser[IO]

or renderer:

val renderer = CSVConfig().escapeSpaces.noHeader.renderer[IO]

Field delimiter is ',' by default.

Record delimiter is '\n' by default. When the delimiter is set to line feed ('\n', ASCII 10) and it is preceded by carriage return ('\r', ASCII 13), they are treated as a single character.
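The carriage-return rule can be illustrated with a small standard-library sketch. The `splitRecords` helper is hypothetical and not part of spata; it ignores quoting entirely and only shows how a CR directly before an LF delimiter is absorbed into the delimiter.

```scala
object RecordDelimiterSketch {
  // Hypothetical helper, not spata's implementation: splits raw text into records.
  // Quoted fields are not handled here, to keep the sketch short.
  def splitRecords(text: String, recordDelimiter: Char = '\n'): List[String] = {
    val parts = text.split(recordDelimiter).toList
    // When the delimiter is LF, a directly preceding CR is treated as part of it.
    if (recordDelimiter == '\n') parts.map(_.stripSuffix("\r")) else parts
  }
}
```

With this sketch, `"a,b\r\nc,d"` and `"a,b\nc,d"` yield the same two records, while a custom delimiter such as `'|'` leaves any `'\r'` in place.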

Quotation mark is '"' by default. It is required to wrap special characters, field and record delimiters, in quotes. Quotation mark in actual content may appear only inside quotation marks. It has to be doubled to be interpreted as part of actual data, not a control character.

While parsing, the header setting defines whether a header is present in source data, which is true by default. The header is used as a keyset for actual values and is not included in data. If there is no header, number-based keys, in tuple style, are created (starting from "_1"). While rendering, the header setting defines whether a header row should be added to output. If no header is explicitly defined, a number-based one is used, as for parsing.
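The tuple-style keys can be pictured with a short sketch. The helper names `tupleHeader` and `keyed` are assumptions made for illustration and do not come from spata.

```scala
object TupleHeaderSketch {
  // Hypothetical helper: builds tuple-style keys "_1", "_2", ... for a given record width.
  def tupleHeader(size: Int): List[String] =
    (1 to size).map(i => s"_$i").toList

  // Pairs the generated keys with the values of a header-less record.
  def keyed(values: List[String]): Map[String, String] =
    tupleHeader(values.size).zip(values).toMap
}
```

For a record with values `Ann` and `Lee`, the sketch makes the second value reachable under the key `"_2"`.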

If CSV records are converted to case classes, header values are used as class fields and may require remapping. This can be achieved through mapHeader:

config.mapHeader("first name" -> "firstName", "last name" -> "lastName")

or if an implicit header is generated:

config.mapHeader("_1" -> "firstName", "_2" -> "lastName")

A Map instance may be provided instead of sequence of pairs:

val hm = Map("first name" -> "firstName", "last name" -> "lastName")
config.mapHeader(hm)

Remapping may be provided for any subset of header names. Names that do not match are omitted.

Header mapping may also be position-based, which is especially handy when there are duplicates in the header and name-based remapping does not solve it (because it remaps all occurrences):

config.mapHeader("firstName", "lastName")

New names are set for subsequent fields, starting from the first one (index 0). If the list is shorter than the header, old names are retained. If the list is longer than the header, superfluous names are omitted. Again, a Map instance may be provided in this case, which allows selective remapping:

val hm = Map(0 -> "firstName", 1 -> "lastName", 5 -> "birth date")
config.mapHeader(hm)
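The difference between name-based and position-based remapping can be sketched on plain lists. Both functions below are hypothetical stand-ins written against the rules described above, not spata's own code.

```scala
object HeaderRemapSketch {
  // Name-based: every occurrence of a matching name is replaced,
  // names without a mapping are kept, unmatched mapping entries have no effect.
  def remapByName(header: List[String], mapping: Map[String, String]): List[String] =
    header.map(name => mapping.getOrElse(name, name))

  // Position-based: names are replaced by 0-based index,
  // indices beyond the header length have no effect.
  def remapByIndex(header: List[String], mapping: Map[Int, String]): List[String] =
    header.zipWithIndex.map { case (name, i) => mapping.getOrElse(i, name) }
}
```

With a header of `id`, `name`, `name`, the name-based variant cannot tell the two `name` columns apart, whereas the index-based variant can assign them distinct names.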

Remapping may be used for renderer as well, allowing customized header while converting data from case classes or tuples.

Unescaped fields with leading or trailing spaces may be automatically trimmed while parsing when trimSpaces is set to true. This setting is false by default and white spaces are preserved, even for unescaped fields.

Field size limit is used to stop processing input when a field is significantly larger than expected, to avoid OutOfMemoryError. This might happen if the source structure is invalid, e.g. the closing quotation mark is missing. There is no limit by default.

While rendering CSV content, different quoting policies may be used, which is controlled by the escapeMode setting. By default, only fields which contain a field delimiter, record delimiter or quotation mark are put into quotes. When set to EscapeSpaces, quotes are additionally put around fields with leading or trailing spaces. EscapeAll results in putting quotes around all fields.
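The three quoting policies, together with the quote-doubling rule, can be sketched as follows. The `Mode` hierarchy and the `render` function are stand-ins invented for this illustration; spata defines its own escape modes in the CSVConfig companion object.

```scala
object EscapeModeSketch {
  // Stand-ins for the escape modes named above (assumed names, not spata's types).
  sealed trait Mode
  case object Required extends Mode
  case object Spaces extends Mode
  case object All extends Mode

  // Renders a single field according to the chosen quoting policy.
  def render(
    field: String,
    mode: Mode,
    fieldDelimiter: Char = ',',
    recordDelimiter: Char = '\n',
    quoteMark: Char = '"'
  ): String = {
    val special =
      field.exists(c => c == fieldDelimiter || c == recordDelimiter || c == quoteMark)
    val edgeSpaces = field.headOption.contains(' ') || field.lastOption.contains(' ')
    val needsQuotes = mode match {
      case Required => special
      case Spaces   => special || edgeSpaces
      case All      => true
    }
    if (needsQuotes) {
      val q = quoteMark.toString
      // A quotation mark inside the content is doubled to mark it as data.
      q + field.replace(q, q + q) + q
    } else field
  }
}
```

Under these assumptions, a field with a leading space is left bare in `Required` mode and quoted in `Spaces` mode, while `All` quotes every field regardless of content.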

Value parameters

escapeMode

method of escaping fields, EscapeRequired by default, valid only for rendering

fieldDelimiter

field (cell) separator, ',' by default

fieldSizeLimit

maximal size of a field, None by default, valid only for parsing

hasHeader

set if data starts with header row, true by default

headerMap

definition of header remapping, by name or index, empty by default

quoteMark

character used to wrap (quote) field content, '"' by default

recordDelimiter

record (row) separator, '\n' by default

trimSpaces

flag to strip spaces, false by default, valid only for parsing

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
object CSVConfig

CSVConfig companion object with escape mode definitions.

Attributes

Companion
class
Supertypes
trait Product
trait Mirror
class Object
trait Matchable
class Any
Self type
CSVConfig.type
final class CSVParser[F[_]](config: CSVConfig)(implicit evidence$1: Sync[F], evidence$2: Logger[F])

A utility for parsing comma-separated values (CSV) sources. The source is assumed to conform to RFC 4180, although some aspects of its format are configurable.

The parser may be created with default configuration:

val parser = CSVParser[IO]

or through CSVParser.config helper function to set custom properties:

val parser = CSVParser.config.fieldDelimiter(';').parser[IO]

Actual parsing is done through one of the 3 groups of methods:

  • parse to transform a stream of characters (or stream of strings in case of parseS) into records and process data in a functional way, which is the recommended approach,
  • get to fetch whole source data at once into a list,
  • process to deal with individual records through a callback function.

This parser is normally used with stream fetching data from some external source, so its computations are wrapped for deferred evaluation into an effect F, e.g. cats.effect.IO. Basic parsing does not impose any special requirements on F, except its support for suspended execution, which requires given instance of cats.effect.Sync.

To trigger evaluation, one of the unsafe operations on F has to be called. Their exact form depends on actual effect in use (e.g. cats.effect.IO.unsafeRunSync).

No method in this class performs a context (thread) shift, and by default they execute synchronously on the current thread. Concurrency or asynchronous execution may be introduced through various fs2.Stream methods. There is also a supporting class, CSVParser#Async, which provides a method for asynchronous callbacks.

Type parameters

F

the effect type, with a type class providing support for suspended execution (typically cats.effect.IO) and logging (provided internally by spata)

Value parameters

config

the configuration for CSV parsing (delimiters, header presence etc.)

Attributes

Constructor

Creates parser with provided configuration.

Companion
object
Supertypes
class Object
trait Matchable
class Any
object CSVParser

CSVParser companion object with types definitions and convenience methods to create parsers.

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
CSVParser.type
final class CSVRenderer[F[_]](config: CSVConfig)(implicit evidence$1: RaiseThrowable[F])

A utility for rendering data to CSV representation.

The renderer may be created with default configuration:

val renderer = CSVRenderer[IO]

or through CSVRenderer.config helper function to set custom properties:

val renderer = CSVRenderer.config.fieldDelimiter(';').renderer[IO]

Actual rendering is done through one of the 2 groups of methods:

  • render to transform a stream of records into a stream of characters (or strings in case of renderS), which represents full CSV content,
  • rows to convert records to strings representing individual CSV rows.

This renderer is normally used with a stream supplying data to some external destination, so its computations are wrapped for deferred evaluation into an effect F, e.g. cats.effect.IO. Basic rendering does not impose any special requirements on F, except its support for raising and handling errors, which requires a given instance of fs2.RaiseThrowable, which effectively means cats.ApplicativeError.

To trigger evaluation, one of the unsafe operations on F has to be called. Their exact form depends on actual effect in use (e.g. cats.effect.IO.unsafeRunSync).

No method in this class performs a context (thread) shift, and by default they execute synchronously on the current thread. Concurrency or asynchronous execution may be introduced through various fs2.Stream methods.

Type parameters

F

the effect type, with a type class providing support for raising and handling errors

Value parameters

config

the configuration for CSV rendering (delimiters, header presence etc.)

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
object CSVRenderer

CSVRenderer companion object with convenience methods to create renderers.

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
final class Header

CSV header with names of each field.

A header created through the parsing process is ensured to have no duplicate names. This guarantee does not hold for user-created headers. Providing duplicates does not cause any erroneous conditions while accessing record data; however, the values associated with duplicates will not be accessible by field name.
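Why duplicated names hide values can be shown with a sketch of name-based lookup. It assumes that a name resolves to the first matching position, which this page does not state, and `valueByName` is a hypothetical helper rather than a spata method.

```scala
object DuplicateHeaderSketch {
  // Hypothetical lookup (assumption: the first matching name wins).
  def valueByName(names: List[String], values: List[String], key: String): Option[String] = {
    val idx = names.indexOf(key)
    if (idx >= 0) values.lift(idx) else None
  }
}
```

Under that assumption, a header of `name`, `name` paired with values `Ann`, `Lee` only ever returns `Ann` for the key `name`; the second value stays reachable by position only.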

Value parameters

names

the sequence of names

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
object Header

Header companion

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
Header.type
sealed trait HeaderMap

Trait representing header remapping methods. It is not used directly but through conversion of an [S2S] or [I2S] partial function to one of its implementation classes.

Attributes

See also

[CSVConfig] for sample usage.

Companion
object
Supertypes
class Object
trait Matchable
class Any
object HeaderMap

Implicit conversions for [HeaderMap] trait.

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type
HeaderMap.type
final case class Position(row: Int, line: Int)

Representation of location of record in source data.

rowNum is the record counter. It starts with 1 for data, with the header row having number 0. It differs from lineNum for sources with a header or with fields containing line breaks.

lineNum is the last line in source data whose content is used to parse a record. In other words, it is the number of lines consumed so far to load a record. It starts with 1, including the header line, so the first data record typically has line number 2. There may be many lines per record when some fields contain line breaks. A new line is interpreted independently of the CSV record separator, as the standard platform EOL character sequence.
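The relation between row and line numbers can be worked through with a small sketch. The local `Position` case class mirrors the signature shown above, while `positions` is a hypothetical helper that assumes the header, when present, occupies exactly one line.

```scala
object PositionSketch {
  // Local mirror of the case class signature shown above.
  final case class Position(row: Int, line: Int)

  // Computes positions from the number of source lines each record spans.
  // Assumption: a header, if present, takes exactly one line.
  def positions(linesPerRecord: List[Int], hasHeader: Boolean = true): List[Position] = {
    val headerLines = if (hasHeader) 1 else 0
    // Running total of lines consumed after each record.
    val lastLines = linesPerRecord.scanLeft(headerLines)(_ + _).tail
    lastLines.zipWithIndex.map { case (line, i) => Position(i + 1, line) }
  }
}
```

For three records where the second one contains a line break inside a field, `positions(List(1, 2, 1))` gives rows 1, 2, 3 at lines 2, 4, 5.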

Value parameters

line

the line number

row

the row number

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
final class Record

CSV record representation. A record is basically a map from string to string. Values are indexed by header provided explicitly or read from header row in source data. If no header is provided nor can be scanned from source, a tuple-style header "_1", "_2" etc. is generated.

Position information is always available for parsed data - for records created through CSVParser. It is missing for records created explicitly in application code (in order to be rendered to CSV).

Value parameters

hdr

indexing header (field names)

position

record position in source data

values

core record data

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Self type
object Record

Record helper object. Used to create and convert records.

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
Record.type

Types

type Decoded[A] = Either[ContentError, A]

Convenience type representing result of decoding record data.
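How such an Either-based result is typically consumed can be sketched without the library. The `ContentError` case class below is only a stand-in for spata's error type, whose definition is not shown in this section, and `decodeInt` is a hypothetical decoder.

```scala
object DecodedSketch {
  // Stand-in for spata's ContentError (assumed shape, for illustration only).
  final case class ContentError(message: String)

  // Same shape as the alias documented above, built on the stand-in error.
  type Decoded[A] = Either[ContentError, A]

  // Hypothetical decoder: converts a raw field to Int, reporting failure as Left.
  def decodeInt(raw: String): Decoded[Int] =
    raw.trim.toIntOption.toRight(ContentError(s"'$raw' is not an integer"))
}
```

Because the result is a plain Either, callers can chain decoding steps with `map` and `flatMap` and handle the error branch once at the end.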

Attributes

Convenience type.

Attributes

Convenience type.

Attributes