Balance

object Balance

Provides mechanisms for balancing the distribution of chunks across multiple streams.

class Object
trait Matchable
class Any

Value members

Concrete methods

def apply[F[_]: Concurrent, O](chunkSize: Int): Pipe[F, O, Stream[F, O]]

Allows balanced processing of this stream via multiple concurrent streams.

This can be viewed as a stream "fan-out" operation that allows elements to be processed concurrently. As elements arrive, they are distributed evenly among the inner streams that have already started evaluating. The chunkSize parameter controls the fairness of the balance: it sets the maximum number of elements a single inner stream pulls at a time.

Note that this pulls only enough elements to satisfy the needs of the inner streams currently being evaluated. When no stream is awaiting elements, it stops pulling from the source.
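As a minimal sketch of how chunkSize bounds what each worker pulls, the hypothetical helper below applies Balance to a source, takes a fixed number of workers, and merges their output. The helper name fanOut, its parameters, the doubling step, and the fs2.concurrent package location are assumptions made for illustration and do not come from this page.

```scala
import cats.effect.Concurrent
import fs2.Stream
import fs2.concurrent.Balance // assumed package location of the Balance object

// Hypothetical helper: fan `source` out to `workers` inner streams, each
// receiving at most `chunkSize` elements at a time, then merge the results.
def fanOut[F[_]: Concurrent](
    source: Stream[F, Int],
    workers: Int,
    chunkSize: Int
): Stream[F, Int] =
  source
    .through(Balance[F, Int](chunkSize)) // Stream[F, Stream[F, Int]]
    .take(workers.toLong)                // keep a fixed number of workers
    .map(worker => worker.map(_ * 2))    // illustrative per-element work
    .parJoinUnbounded                    // run the workers concurrently
```

Because the workers run concurrently, the order of the merged output is not guaranteed. A smaller chunkSize tends to spread elements more evenly across workers, while a larger one tends to reduce coordination overhead.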

If high throughput is required, balance may be combined with prefetch so that large chunks are fetched ahead of time and are immediately available for distribution to the streams. For example:

 source.prefetch(100).balance(chunkSize=10).take(10)

This constructs a stream of 10 subscribers that always takes 100 elements from the source and gives 10 elements to each subscriber. While the subscribers process their elements, another 100 elements are pulled and are available for distribution as soon as the subscribers are ready.
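A compilable variant of this pattern might look like the sketch below. It assumes an fs2 version in which the sized prefetch operator is named prefetchN; note that prefetchN counts chunks rather than individual elements, so the exact number of buffered elements depends on how the source is chunked. The helper name balancedWorkers is hypothetical.

```scala
import cats.effect.Concurrent
import fs2.Stream

// Hypothetical helper: prefetch ahead of the balancer, then expose 10 workers
// that each receive at most 10 elements at a time.
def balancedWorkers[F[_]: Concurrent, O](source: Stream[F, O]): Stream[F, Stream[F, O]] =
  source
    .prefetchN(100)          // assumption: buffers up to 100 chunks, not 100 elements
    .balance(chunkSize = 10) // each worker pulls at most 10 elements per hand-off
    .take(10)                // 10 subscribers
```

The resulting stream of workers still has to be run, for example by mapping over each worker and joining them with parJoinUnbounded.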

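This combinator is often used together with parJoin, for example:

 // assumes cats.effect.IO and its implicit Concurrent instance are in scope
 Stream(1,2,3,4).covary[IO].balanceAvailable.map { worker =>
   worker.map(_.toString)
 }.take(3).parJoinUnbounded.compile.to(Set).unsafeRunSync()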

When the source terminates, the resulting streams (workers) terminate once all elements pulled from the source so far have been processed.

The resulting stream terminates after the source stream terminates and all workers terminate. Conversely, if the resulting stream is terminated early, the source stream will be terminated.

def through[F[_]: Concurrent, O, O2](chunkSize: Int)(pipes: Pipe[F, O, O2]*): Pipe[F, O, O2]

Like apply but instead of providing a stream of worker streams, the supplied pipes are used to transform each worker.

Each supplied pipe runs concurrently with the others, so the number of pipes determines the level of concurrency.

Each pipe may have a different implementation, if required; for example one pipe may process elements while another may send elements for processing to another machine.

Results from pipes are collected and emitted as the resulting stream.
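The behaviour described above can be sketched with two pipes that have different implementations. The pipe names local and remote, the tagging they perform, the helper name process, and the fs2.concurrent package location are all assumptions for illustration; a real "remote" pipe would send elements to another machine instead of tagging them.

```scala
import cats.effect.Concurrent
import fs2.{Pipe, Stream}
import fs2.concurrent.Balance // assumed package location of the Balance object

// Hypothetical helper: balance `source` across two differently implemented pipes.
def process[F[_]: Concurrent](source: Stream[F, Int]): Stream[F, String] = {
  // Illustrative pipes; both only tag the element with the pipe that handled it.
  val local: Pipe[F, Int, String]  = _.map(n => s"local:$n")
  val remote: Pipe[F, Int, String] = _.map(n => s"remote:$n")

  // Two pipes are supplied, so two workers run concurrently.
  source.through(Balance.through[F, Int, String](chunkSize = 1)(local, remote))
}
```

Which pipe handles a given element depends on scheduling, so only the set of processed elements, not their tags or order, should be relied upon.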

This will terminate when:

  • the source stream terminates
  • any pipe fails
  • all pipes terminate

Value Params

chunkSize

maximum chunk size to present to each pipe, allowing fair distribution between pipes

pipes

pipes used to process the work