A Fiber that can be used to cancel the underlying consumer, or wait for it to complete. If you're using stream, or any other provided stream in KafkaConsumer, these will be automatically interrupted when the underlying consumer has been cancelled or when it finishes with an exception.
Whenever cancel is invoked, an attempt will be made to stop the underlying consumer. The cancel operation will not wait for the consumer to shut down. If you also want to wait for the shutdown to complete, you can use join. Note that join is guaranteed to complete after consumer shutdown, even when the consumer is cancelled with cancel.
This Fiber instance is usually only required if the consumer needs to be cancelled due to some external condition, or when an external process needs to be cancelled whenever the consumer has shut down. Most of the time, when you're only using the streams provided by KafkaConsumer, there is no need to use this.
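As an illustration, the fiber might be used as in the following sketch to trigger a shutdown from some external condition and then wait for it to complete. The delay stands in for that condition, the key and value types are placeholders, and join is assumed to behave as on a cats-effect 2-style Fiber where it yields F[Unit]; adjust to your library version.

  import cats.effect.IO
  import scala.concurrent.duration._

  // Sketch: cancel the consumer after some external condition (here simply
  // a delay) and then wait for the shutdown to actually complete.
  def stopConsumer(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
    for {
      _ <- IO.sleep(30.seconds)   // placeholder for an external condition
      _ <- consumer.fiber.cancel  // requests shutdown; does not wait for it
      _ <- consumer.fiber.join    // completes only after the consumer has shut down
    } yield ()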
Stream where the elements themselves are Streams which continually request records for a single partition. These Streams will have to be processed in parallel, using parJoin or parJoinUnbounded. Note that when using parJoin(n) with n smaller than the number of currently assigned partitions, some assigned partitions will not be processed. For that reason, prefer parJoinUnbounded, where the actual limit will be the number of assigned partitions.
If you do not want to process all partitions in parallel, then you can use stream instead, where records for all partitions are in a single Stream.
Note that you have to first use subscribe to subscribe the consumer before using this Stream. If you have not subscribed, a NotSubscribedException will be raised in the Stream.
Stream where the elements are Kafka messages and where ordering is guaranteed per topic-partition. Parallelism can be achieved at the record level, using for example parEvalMap. For partition-level parallelism, use partitionedStream, where all partitions need to be processed in parallel.

The Stream works by continually making requests for records on every assigned partition, waiting for records to come back on all partitions, or up to ConsumerSettings#fetchTimeout. Records can be processed as soon as they are received, without waiting on other partition requests, but a second request for the same partition will wait for outstanding fetches to complete or time out before being sent.
Note that you have to first use subscribe to subscribe the consumer before using this Stream. If you have not subscribed, a NotSubscribedException will be raised in the Stream.
Subscribes the consumer to the topics matching the specified Regex. Note that you have to use one of the subscribe functions before you can use any of the provided Streams, or a NotSubscribedException will be raised in the Streams.
the regex to which matching topics should be subscribed
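For example, subscribing with a Regex before consuming might look like this sketch; the topic pattern and types are illustrative.

  import cats.effect.IO
  import fs2.Stream

  // Sketch: subscribe to all topics matching the pattern, then consume.
  def matchingTopicsStream(
    consumer: KafkaConsumer[IO, String, String]
  ): Stream[IO, CommittableMessage[IO, String, String]] =
    Stream.eval(consumer.subscribe("events-.*".r)) >> consumer.stream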
Subscribes the consumer to the specified topics. Note that you have to use one of the subscribe functions to subscribe to one or more topics before using any of the provided Streams, or a NotSubscribedException will be raised in the Streams.
the topics to which the consumer should subscribe
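A sketch of subscribing to a fixed set of topics; the NonEmptyList overload and the topic names are assumptions, so check your version for the exact signature.

  import cats.data.NonEmptyList
  import cats.effect.IO

  // Sketch: subscribe to two topics before using any of the provided streams.
  def subscribeToTopics(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
    consumer.subscribe(NonEmptyList.of("orders", "payments"))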
KafkaConsumer represents a consumer of Kafka messages, with the ability to subscribe to topics, start a single top-level stream, and optionally control it via the provided fiber instance. The following top-level streams are provided.

- stream provides a single stream of messages, where the order of records is guaranteed per topic-partition.
- partitionedStream provides a stream with elements as streams that continually request records for a single partition. Order is guaranteed per topic-partition, but all assigned partitions will have to be processed in parallel.

For the streams, records are wrapped in CommittableMessages which provide CommittableOffsets with the ability to commit record offsets to Kafka. For performance reasons, offsets are usually committed in batches using CommittableOffsetBatch. Provided Sinks, like commitBatch or commitBatchWithin, are available for batch committing offsets. If you are not committing offsets to Kafka, you can simply discard the CommittableOffset and only make use of the record.

While it's technically possible to start more than one stream from a single KafkaConsumer, it is generally not recommended, as there is no guarantee which stream will receive which records, and there might be an overlap, in terms of duplicate messages, between the two streams. If a first stream completes, possibly with an error, there is no guarantee it has processed all of the messages it received, and a second stream from the same KafkaConsumer might not be able to pick up where the first one left off. Therefore, only create a single top-level stream per KafkaConsumer, and if you want to start a new stream when the first one finishes, let the KafkaConsumer shut down and create a new one.
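Putting this together, a complete single-stream consumer with batched commits might look like the following sketch. It is written against a recent fs2-kafka-style API; names such as KafkaConsumer.stream, subscribeTo, offset, and commitBatchWithin differ between library versions (older versions use consumerStream, CommittableMessage, and committableOffset instead), so treat this as an outline rather than exact code.

  import cats.effect.{ExitCode, IO, IOApp}
  import fs2.kafka._
  import scala.concurrent.duration._

  // Sketch: a single top-level stream per KafkaConsumer, with offsets
  // committed in batches of at most 500 or at least every 15 seconds.
  object ConsumerExample extends IOApp {
    def run(args: List[String]): IO[ExitCode] = {
      val settings =
        ConsumerSettings[IO, String, String]
          .withBootstrapServers("localhost:9092")
          .withGroupId("example-group")
          .withAutoOffsetReset(AutoOffsetReset.Earliest)

      KafkaConsumer
        .stream(settings)
        .evalTap(_.subscribeTo("example-topic"))
        .flatMap(_.stream)
        .evalMap { committable =>
          // process the record, then pass its offset along for committing
          IO(println(committable.record.value)).as(committable.offset)
        }
        .through(commitBatchWithin(500, 15.seconds))
        .compile
        .drain
        .as(ExitCode.Success)
    }
  }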