CSVParser

info.fingo.spata.CSVParser
See the CSVParser companion object
final class CSVParser[F[_]](config: CSVConfig)(implicit evidence$1: Sync[F], evidence$2: Logger[F])

A utility for parsing comma-separated values (CSV) sources. The source is assumed to conform to RFC 4180, although some aspects of its format are configurable.

The parser may be created with default configuration:

val parser = CSVParser[IO]

or through the CSVParser.config helper function, to set custom properties:

val parser = CSVParser.config.fieldDelimiter(';').parser[IO]
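
As a minimal sketch of the custom configuration in use, assuming cats-effect 3 and fs2 are on the classpath alongside spata, a semicolon-delimited source could be read as follows. The in-memory sample data and the names csv and records are illustrative only.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.{CSVParser, Record}

// Illustrative in-memory data using ';' as the field delimiter (first line is the header)
val csv = "id;name\n1;Ann\n2;Bob"

// Parser configured for the non-default delimiter
val parser = CSVParser.config.fieldDelimiter(';').parser[IO]

// Turn the string into a stream of characters and fetch all records at once
val records: List[Record] =
  parser.get(Stream.emits(csv.toList).covary[IO]).unsafeRunSync()
```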

Actual parsing is done through one of three groups of methods:

  • parse to transform a stream of characters (or stream of strings in case of parseS) into records and process data in a functional way, which is the recommended approach,
  • get to fetch whole source data at once into a list,
  • process to deal with individual records through a callback function.

This parser is normally used with a stream fetching data from some external source, so its computations are wrapped for deferred evaluation into an effect F, e.g. cats.effect.IO. Basic parsing does not impose any special requirements on F beyond support for suspended execution, which requires a given instance of cats.effect.Sync.

To trigger evaluation, one of the unsafe operations on F has to be called. Their exact form depends on the actual effect in use (e.g. cats.effect.IO.unsafeRunSync).
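
The separation between describing and running the computation might look like the sketch below, assuming cats.effect.IO from cats-effect 3 as the effect. The sample data and the names source, program and records are illustrative.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.{CSVParser, Record}

// Illustrative in-memory source: a header line followed by two data lines
val source: Stream[IO, Char] = Stream.emits("id,name\n1,Ann\n2,Bob".toList).covary[IO]

// Only a description of the parsing; nothing has been read or parsed yet
val program: IO[List[Record]] = CSVParser[IO].get(source)

// The unsafe call at the "end of the world" actually runs the parser
val records: List[Record] = program.unsafeRunSync()
```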

No method in this class performs a context (thread) shift; by default they all execute synchronously on the current thread. Concurrency or asynchronous execution may be introduced through various fs2.Stream methods. There is also a supporting class, CSVParser#Async, which provides a method for asynchronous callbacks.
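
One way to introduce concurrency through fs2.Stream, sketched here under the assumption of cats-effect 3 and fs2 3, is to run an effectful step per record with parEvalMap. The function handle is a hypothetical stand-in for real work such as a database write; it is not part of spata.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.{CSVParser, Record}

// Hypothetical per-record effect; here it simply returns the record unchanged
def handle(record: Record): IO[Record] = IO(record)

val csv = "id,name\n1,Ann\n2,Bob\n3,Cid"

val handled: List[Record] = Stream
  .emits(csv.toList)
  .covary[IO]
  .through(CSVParser[IO].parse)
  .parEvalMap(2)(handle) // at most 2 records handled concurrently, output order preserved
  .compile
  .toList
  .unsafeRunSync()
```

Parsing itself stays sequential in this sketch; only the downstream handling of already parsed records runs concurrently.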

Type parameters

F

the effect type, with a type class providing support for suspended execution (typically cats.effect.IO) and logging (provided internally by spata)

Value parameters

config

the configuration for CSV parsing (delimiters, header presence etc.)

Attributes

Constructor

Creates parser with provided configuration.

Companion
object
Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def async(using F: Async[F]): Async[F]

Provides access to asynchronous parsing method.

Value parameters

F

type class (monad) providing support for concurrency

Attributes

Returns

helper class with asynchronous method

def get(stream: Stream[F, Char]): F[List[Record]]

Fetches whole source content into list of records.

This function should be used only for small source data sets to avoid memory overflow.

Value parameters

stream

the source stream containing CSV content

Attributes

Returns

the list of records

Throws
error.StructureException

in case of flawed CSV structure

def get(stream: Stream[F, Char], limit: Long): F[List[Record]]

Fetches requested number of CSV records into a list.

This function stops processing source data as soon as the limit is reached. It must not be called twice for the same data source, however: the first call may consume more elements than requested, leaving the source pointer at an unpredictable position.

Value parameters

limit

the number of records to get

stream

the source stream containing CSV content

Attributes

Returns

the list of records

Throws
error.StructureException

in case of flawed CSV structure
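
A short sketch of the limited variant, assuming cats-effect 3 and an illustrative in-memory source with three data lines:

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.{CSVParser, Record}

// Illustrative source: header plus three data lines
val source: Stream[IO, Char] =
  Stream.emits("id,name\n1,Ann\n2,Bob\n3,Cid".toList).covary[IO]

// Fetch only the first two records; the rest of the source is not turned into records
val firstTwo: List[Record] = CSVParser[IO].get(source, 2L).unsafeRunSync()
```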

def parse: Pipe[F, Char, Record]

Transforms stream of characters representing CSV data into records. This function is intended to be used with fs2.Stream.through. The transformed fs2.Stream allows further input processing in a very flexible, purely functional manner.

Processing of data sources may be achieved by combining this function with io.Reader, e.g.:

val stream: Stream[IO, Record] = Stream
 .bracket(IO(Source.fromFile("input.csv")))(source => IO(source.close()))
 .flatMap(Reader[IO].read)
 .through(CSVParser[IO].parse)

Transformation may result in error.StructureException, to be handled with fs2.Stream.handleErrorWith. If not handled, the exception will be thrown.
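
Error handling on the stream could be sketched as follows, assuming cats-effect 3 and fs2 3. The sample input with an unclosed quotation is chosen here as a likely structural flaw, and the recovery strategy (log and end the stream) is just one option.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.{CSVParser, Record}
import info.fingo.spata.error.StructureException

// Illustrative input with a quotation that is never closed
val broken = "id,name\n1,\"Ann"

val recovered: List[Record] = Stream
  .emits(broken.toList)
  .covary[IO]
  .through(CSVParser[IO].parse)
  .handleErrorWith {
    // Report structural problems and end the stream without failing
    case e: StructureException => Stream.exec(IO.println(s"Malformed CSV: ${e.getMessage}"))
    // Let any other error propagate unchanged
    case other => Stream.raiseError[IO](other)
  }
  .compile
  .toList
  .unsafeRunSync()
```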

Attributes

Returns

a pipe to convert scala.Chars into Records

See also

FS2 documentation for further guidance.

def parseS: Pipe[F, String, Record]

Transforms stream of strings representing CSV data into records. This function is intended to be used with fs2.Stream.through. The transformed fs2.Stream allows further input processing in a very flexible, purely functional manner.

If you would like to work with spata's io.Reader, use the parse method, which consumes a stream of characters rather than a stream of strings. This method is better suited to working with methods from fs2.io.file.Files, which operate on strings (or on bytes to be converted by fs2.text.decodeWithCharset).

Attributes

Returns

a pipe to convert Strings into Records

See also

parse for more details

Example
val stream: Stream[IO, Record] = Files[IO]
 .readUtf8(Path("input.csv"))
 .through(CSVParser[IO].parseS)

def process(stream: Stream[F, Char])(cb: Callback): F[Unit]

Processes each CSV record with the provided callback function to execute some side effects. Stops processing input as soon as the callback function returns false or the stream is exhausted.

Value parameters

cb

the callback function to operate on each CSV record and produce some side effect. It should return true to continue the process with next record or false to stop processing the input.

stream

the source stream containing CSV content

Attributes

Returns

unit effect, used as a handle to trigger evaluation

Throws
error.CSVException

in case of flawed CSV structure or field parsing errors
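
A callback that stops early might be sketched like this, assuming cats-effect 3. The mutable counter seen is an illustrative side effect and the sample data is made up.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import info.fingo.spata.CSVParser

// Illustrative source: header plus three data lines
val source: Stream[IO, Char] =
  Stream.emits("id,name\n1,Ann\n2,Bob\n3,Cid".toList).covary[IO]

// Side effect performed by the callback: count the records it has been given
var seen = 0

val effect: IO[Unit] = CSVParser[IO].process(source) { _ =>
  seen += 1
  seen < 2 // true after the first record, false after the second to request a stop
}

// Nothing happens until the returned effect is run
effect.unsafeRunSync()
```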