fetchlib
package fetchlib
Value Members
- object BatchingSection extends AnyFlatSpec with Matchers with Section
Batching
As we have learned, Fetch performs batched requests whenever it can. It also exposes a couple of knobs for tweaking the maximum batch size and whether multiple batches are run in parallel or sequentially.
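Conceptually, capping the batch size just means chunking the requested identities before they are handed to the data source's batch method. A minimal, self-contained sketch of that idea (an illustration only, not Fetch's internal code):

```scala
// Illustration, not Fetch's internals: a maximum batch size amounts to
// splitting the requested identities into capped chunks, each of which
// would then be sent to the data source as one batch.
def chunkIdentities[A](ids: List[A], maxBatchSize: Option[Int]): List[List[A]] =
  maxBatchSize match {
    case None       => List(ids)                // one unbounded batch
    case Some(size) => ids.grouped(size).toList // several capped batches
  }
```

For example, five identities with a maximum batch size of 2 become three batches: two of size 2 and one of size 1. Whether those chunks then run in parallel or one after another is the second knob.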
- object CachingSection extends AnyFlatSpec with Matchers with Section
Caching
As we have learned, Fetch caches intermediate results implicitly. You can provide a prepopulated cache for running a fetch, replay a fetch with the cache of a previous one, and even implement a custom cache.
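A custom cache essentially needs two operations: look an identity up, and insert a result. As a rough, self-contained illustration of that shape (a simplified stand-in, not Fetch's actual DataCache, which is effect-polymorphic and keyed per data source):

```scala
// Simplified sketch of an immutable in-memory cache. Inserting returns a new
// cache value rather than mutating, mirroring the functional style Fetch uses.
final case class InMemoryCache[I, A](state: Map[I, A]) {
  def lookup(i: I): Option[A]                 = state.get(i)
  def insert(i: I, v: A): InMemoryCache[I, A] = InMemoryCache(state + (i -> v))
}
```

A prepopulated cache is then just a value you construct up front and hand to the fetch run.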
- object DebuggingSection extends AnyFlatSpec with Matchers with Section
Debugging
We have introduced the handy fetch.debug.describe function for debugging errors, but it can do more than that. It can also give you a detailed description of a fetch execution given an execution log.
Add the following line to your dependencies to include Fetch's debugging facilities:
"com.47deg" %% "fetch-debug" % "1.2.2"
- object ErrorHandlingSection extends AnyFlatSpec with Matchers with Section
Error handling
Fetch is used for reading data from remote sources, and the queries we perform can and will fail at some point. There are many things that can go wrong:
- an exception can be thrown by client code of certain data sources
- an identity may be missing
- the data source may be temporarily unavailable
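To make these failure cases concrete, here is a small, self-contained sketch (hypothetical types, not Fetch's own error definitions) of how such errors can be modelled as a single Throwable-rooted hierarchy:

```scala
// Hypothetical sketch: one sealed trait rooting all query failures, so callers
// can handle every case exhaustively while still throwing them as exceptions.
sealed trait QueryError extends Throwable
final case class MissingIdentity(id: Int)                  extends QueryError
final case class SourceUnavailable(source: String)         extends QueryError
final case class UnhandledException(underlying: Throwable) extends QueryError

// Pattern matching covers each failure case from the list above.
def describe(e: QueryError): String = e match {
  case MissingIdentity(id)     => s"identity $id not found"
  case SourceUnavailable(s)    => s"source $s is temporarily unavailable"
  case UnhandledException(t)   => s"client code threw: ${t.getMessage}"
}
```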
Since the error cases are plenty and can't be anticipated, Fetch errors are represented by the FetchException trait, which extends Throwable. Currently, Fetch defines FetchException cases for missing identities and arbitrary exceptions, but you can extend FetchException with any error you want.
- object FetchLibrary extends Library
Fetch is a library for making access to data both simple & efficient.
- object FetchTutorialHelper
- object SyntaxSection extends AnyFlatSpec with Matchers with Section
Syntax
- object UsageSection extends AnyFlatSpec with Matchers with Section
Introduction
Fetch is a library that allows your data fetches to be written in a concise, composable way while executing efficiently. You don't need any explicit concurrency constructs, only the existing idioms: applicative for concurrency and monad for sequencing.
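The same idea can be seen with plain standard-library Futures (an analogy only, not Fetch code): computations started before the comprehension run independently and concurrently, while the for-comprehension itself expresses sequencing.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Both futures start eagerly and run independently; the for-comprehension
// only sequences *reading* their results, so the work itself is concurrent.
def concurrent2: Future[(Int, Int)] = {
  val fa = Future(1 + 1) // starts immediately
  val fb = Future(2 + 2) // starts immediately, independent of fa
  for {
    a <- fa
    b <- fb
  } yield (a, b)
}
```

Had `fb` been defined inside the comprehension after `a <- fa`, it would instead run only after `fa` finished, which is the monadic, sequential case.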
Oftentimes, our applications read and manipulate data from a variety of different sources such as databases, web services or file systems. These data sources are subject to latency, and we'd prefer to query them efficiently.
If we are just reading data, we can make a series of optimizations such as:
- batching requests to the same data source
- requesting independent data from different sources in parallel
- caching previously seen results
However, if we mix these optimizations with the code that fetches the data we may end up trading clarity for performance. Furthermore, we are mixing low-level (optimization) and high-level (business logic with the data we read) concerns.
Installation
To begin, add the following dependency to your SBT build file:
"com.47deg" %% "fetch" % "1.2.2"
Or, if using Scala.js:
"com.47deg" %%% "fetch" % "1.2.2"
Now you’ll have Fetch available in both Scala and Scala.js.
Usage
In order to tell Fetch how to retrieve data, we must implement the DataSource typeclass.
import cats.effect.Concurrent
import cats.data.NonEmptyList

trait DataSource[F[_], Identity, Result] {
  def data: Data[Identity, Result]

  def CF: Concurrent[F]

  def fetch(id: Identity): F[Option[Result]]

  /* `batch` is implemented in terms of `fetch` by default */
  def batch(ids: NonEmptyList[Identity]): F[Map[Identity, Result]]
}
Apart from the effect type F, it takes two type parameters:
- Identity: the identity we want to fetch (a UserId if we were fetching users)
- Result: the type of the data we retrieve (a User if we were fetching users)
There are two methods: fetch and batch. fetch receives one identity and must return a Concurrent containing an optional result. By returning an Option, Fetch can detect whether an identity couldn't be fetched or no longer exists. The batch method takes a non-empty list of identities and must return a Concurrent containing a map from identities to results. Accepting a list of identities gives Fetch the ability to batch requests to the same data source, and by returning a mapping from identities to results, Fetch can detect whenever an identity couldn't be fetched or no longer exists.
The data method returns a Data[Identity, Result] instance that Fetch uses to optimize requests to the same data source; it is expected to return a singleton object that extends Data[Identity, Result].
Writing your first data source
Now that we know about the DataSource typeclass, let's write our first data source! We'll start by implementing a data source for fetching users given their id. The first thing we'll do is define the types for user ids and users.
type UserId = Int

case class User(id: UserId, username: String)
We’ll simulate unpredictable latency with this function.
import cats.effect._
import cats.syntax.all._

def latency[F[_] : Concurrent](msg: String): F[Unit] = for {
  _ <- Sync[F].delay(println(s"--> [${Thread.currentThread.getId}] $msg"))
  _ <- Sync[F].delay(Thread.sleep(100))
  _ <- Sync[F].delay(println(s"<-- [${Thread.currentThread.getId}] $msg"))
} yield ()
And now we're ready to write our user data source; we'll emulate a database with an in-memory map.
import cats.data.NonEmptyList
import cats.instances.list._

import fetch._

val userDatabase: Map[UserId, User] = Map(
  1 -> User(1, "@one"),
  2 -> User(2, "@two"),
  3 -> User(3, "@three"),
  4 -> User(4, "@four")
)

object Users extends Data[UserId, User] {
  def name = "Users"

  def source[F[_] : Concurrent]: DataSource[F, UserId, User] = new DataSource[F, UserId, User] {
    override def data = Users

    override def CF = Concurrent[F]

    override def fetch(id: UserId): F[Option[User]] =
      latency[F](s"One User $id") >> CF.pure(userDatabase.get(id))

    override def batch(ids: NonEmptyList[UserId]): F[Map[UserId, User]] =
      latency[F](s"Batch Users $ids") >> CF.pure(userDatabase.filterKeys(ids.toList.toSet).toMap)
  }
}
Now that we have a data source, we can write a function for fetching users given an id; we just have to pass a UserId as an argument to Fetch.
def getUser[F[_] : Concurrent](id: UserId): Fetch[F, User] =
  Fetch(id, Users.source)
Optional identities
If you want to create a Fetch that doesn't fail if the identity is not found, you can use Fetch#optional instead of Fetch#apply. Note that instead of a Fetch[F, A] you will get a Fetch[F, Option[A]].
def maybeGetUser[F[_] : Concurrent](id: UserId): Fetch[F, Option[User]] =
  Fetch.optional(id, Users.source)
Data sources that don’t support batching
If your data source doesn't support batching, you can simply leave the batch method unimplemented. Note that it will use the fetch implementation for requesting identities in parallel.
object Unbatched extends Data[Int, Int] {
  def name = "Unbatched"

  def source[F[_] : Concurrent]: DataSource[F, Int, Int] = new DataSource[F, Int, Int] {
    override def data = Unbatched

    override def CF = Concurrent[F]

    override def fetch(id: Int): F[Option[Int]] =
      CF.pure(Option(id))
  }
}
Batching individual requests sequentially
The default batch implementation runs requests to the data source in parallel, but you can easily override it. We can make batch sequential using NonEmptyList.traverse for fetching individual identities.
object UnbatchedSeq extends Data[Int, Int] {
  def name = "UnbatchedSeq"

  def source[F[_] : Concurrent]: DataSource[F, Int, Int] = new DataSource[F, Int, Int] {
    override def data = UnbatchedSeq

    override def CF = Concurrent[F]

    override def fetch(id: Int): F[Option[Int]] =
      CF.pure(Option(id))

    override def batch(ids: NonEmptyList[Int]): F[Map[Int, Int]] =
      ids.traverse(id => fetch(id).map(v => (id, v)))
        .map(_.collect { case (i, Some(x)) => (i, x) }.toMap)
  }
}
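The collect step in that sequential batch is what drops identities whose individual fetch returned None. As a plain-Scala illustration of that combination step (a hypothetical helper, no Fetch types involved):

```scala
// Illustration: combine per-identity optional results into a map containing
// only the identities that were actually found, as the collect above does.
def collectFound[I, A](results: List[(I, Option[A])]): Map[I, A] =
  results.collect { case (i, Some(v)) => (i, v) }.toMap
```

Identities whose lookup produced None simply disappear from the resulting map, which is how Fetch later tells found identities apart from missing ones.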
Data sources that only support batching
If your data source only supports querying in batches, you can implement fetch in terms of batch.
object OnlyBatched extends Data[Int, Int] {
  def name = "OnlyBatched"

  def source[F[_] : Concurrent]: DataSource[F, Int, Int] = new DataSource[F, Int, Int] {
    override def data = OnlyBatched

    override def CF = Concurrent[F]

    override def fetch(id: Int): F[Option[Int]] =
      batch(NonEmptyList.one(id)).map(_.get(id))

    override def batch(ids: NonEmptyList[Int]): F[Map[Int, Int]] =
      CF.pure(ids.map(x => (x, x)).toList.toMap)
  }
}