Creates an akka.stream.scaladsl.Source that reads Parquet data from the specified path.
If there are multiple files at the path, the order in which they are loaded is determined by the underlying
filesystem.
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc.
Refer to the Hadoop client documentation or your storage provider to learn how to configure the connection.
type of data that represents the schema of the Parquet data, e.g.:
case class MyData(id: Long, name: String, created: java.sql.Timestamp)
URI to Parquet files, e.g.:
"file:///data/users"
The source of Parquet data
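A minimal usage sketch, assuming the factory is exposed as `ParquetStreams.fromParquet` (as in parquet4s) and that the stream is materialized with an implicit `ActorSystem`; depending on the Akka version, an implicit materializer may also be required:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Sink
import com.github.mjakubowski84.parquet4s.ParquetStreams

case class MyData(id: Long, name: String, created: java.sql.Timestamp)

object ReadExample extends App {
  implicit val system: ActorSystem = ActorSystem()
  import system.dispatcher

  // Reads every Parquet file found under the URI and prints each record.
  ParquetStreams
    .fromParquet[MyData](path = "file:///data/users")
    .runWith(Sink.foreach(println))
    .andThen { case _ => system.terminate() }
}
```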
Creates an akka.stream.scaladsl.Sink that writes Parquet data to files at the specified path. The sink splits the data into a number of files equal to parallelism. Files are written in parallel, and the data is written in an unordered way.
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc.
Refer to the Hadoop client documentation or your storage provider to learn how to configure the connection.
type of data that represents the schema of the Parquet data, e.g.:
case class MyData(id: Long, name: String, created: java.sql.Timestamp)
URI to Parquet files, e.g.:
"file:///data/users"
defines how many files are created and how many parallel threads are responsible for writing them
set of options that define how Parquet files will be created
The sink that writes Parquet files
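A usage sketch for the parallel, unordered sink, assuming the factory is exposed as `ParquetStreams.toParquetParallelUnordered` with the parameter names described above (path, parallelism, options):

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.{ParquetStreams, ParquetWriter}

case class MyData(id: Long, name: String, created: java.sql.Timestamp)

object WriteUnorderedExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  val now = new java.sql.Timestamp(System.currentTimeMillis())

  // Splits the stream across 4 files written in parallel; record order
  // across the resulting files is not preserved.
  Source(List(MyData(1L, "alice", now), MyData(2L, "bob", now)))
    .runWith(ParquetStreams.toParquetParallelUnordered(
      path = "file:///data/users",
      parallelism = 4,
      options = ParquetWriter.Options()
    ))
}
```

The unordered variant trades ordering for throughput: each parallel writer consumes records as they arrive, so this sink suits workloads where file contents need not follow the upstream order.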
Creates an akka.stream.scaladsl.Sink that writes Parquet data to files at the specified path. The sink splits the data sequentially into files, each containing at most maxRecordsPerFile records. It is recommended to set maxRecordsPerFile to a multiple of com.github.mjakubowski84.parquet4s.ParquetWriter.Options.rowGroupSize.
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc.
Refer to the Hadoop client documentation or your storage provider to learn how to configure the connection.
type of data that represents the schema of the Parquet data, e.g.:
case class MyData(id: Long, name: String, created: java.sql.Timestamp)
URI to Parquet files, e.g.:
"file:///data/users"
the maximum number of records per file
set of options that define how Parquet files will be created
The sink that writes Parquet files
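A usage sketch for the sequential file-splitting sink, assuming the factory is exposed as `ParquetStreams.toParquetSequentialWithFileSplit` with the parameters described above (path, maxRecordsPerFile, options):

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.{ParquetStreams, ParquetWriter}

case class MyData(id: Long, name: String, created: java.sql.Timestamp)

object WriteSplitExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  val now = new java.sql.Timestamp(System.currentTimeMillis())

  // Rolls to a new file after every maxRecordsPerFile records.
  // Choosing a multiple of the configured row group size avoids
  // partially filled row groups at file boundaries.
  Source(List(MyData(1L, "alice", now), MyData(2L, "bob", now)))
    .runWith(ParquetStreams.toParquetSequentialWithFileSplit(
      path = "file:///data/users",
      maxRecordsPerFile = 1048576L,
      options = ParquetWriter.Options()
    ))
}
```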
Creates an akka.stream.scaladsl.Sink that writes Parquet data to a single file at the specified path (including
the file name).
The path can refer to a local file, HDFS, AWS S3, Google Storage, Azure, etc.
Refer to the Hadoop client documentation or your storage provider to learn how to configure the connection.
type of data that represents the schema of the Parquet data, e.g.:
case class MyData(id: Long, name: String, created: java.sql.Timestamp)
URI to Parquet files, e.g.:
"file:///data/users/users-2019-01-01.parquet"
set of options that define how Parquet files will be created
The sink that writes Parquet file
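A usage sketch for the single-file sink, assuming the factory is exposed as `ParquetStreams.toParquetSingleFile`; note that here the path includes the target file name rather than a directory:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import com.github.mjakubowski84.parquet4s.{ParquetStreams, ParquetWriter}

case class MyData(id: Long, name: String, created: java.sql.Timestamp)

object WriteSingleFileExample extends App {
  implicit val system: ActorSystem = ActorSystem()

  val now = new java.sql.Timestamp(System.currentTimeMillis())

  // Writes the whole stream into one Parquet file at the exact URI given.
  Source(List(MyData(1L, "alice", now), MyData(2L, "bob", now)))
    .runWith(ParquetStreams.toParquetSingleFile(
      path = "file:///data/users/users-2019-01-01.parquet",
      options = ParquetWriter.Options()
    ))
}
```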
Holds factories of Akka Streams sources and sinks that allow reading from and writing to Parquet files.