KafkaConsumer

sealed abstract class KafkaConsumer[F[_], K, V]
  extends KafkaConsume[F, K, V]
    with KafkaAssignment[F]
    with KafkaOffsets[F]
    with KafkaSubscription[F]
    with KafkaTopics[F]
    with KafkaCommit[F]
    with KafkaMetrics[F]
    with KafkaConsumerLifecycle[F]

KafkaConsumer represents a consumer of Kafka records, with the ability to subscribe to topics, start a single top-level stream, and optionally control it via the provided fiber instance.

The following top-level streams are provided.

  • stream provides a single stream of records, where the order of records is guaranteed per topic-partition.
  • partitionedStream provides a stream with elements as streams that continually request records for a single partition. Order is guaranteed per topic-partition, but all assigned partitions will have to be processed in parallel.
  • partitionsMapStream provides a stream where each element contains the current assignment. The current assignment is a Map where keys are TopicPartitions and values are streams with records for that particular TopicPartition.

For the streams, records are wrapped in [[CommittableConsumerRecord]]s which provide [[CommittableOffset]]s with the ability to commit record offsets to Kafka. For performance reasons, offsets are usually committed in batches using [[CommittableOffsetBatch]]. Provided `Pipe`s, like [[commitBatchWithin]], are available for batch committing offsets. If you are not committing offsets to Kafka, you can simply discard the [[CommittableOffset]] and only make use of the record.

While it's technically possible to start more than one stream from a single [[KafkaConsumer]], it is generally not recommended: there is no guarantee which stream will receive which records, and the streams might overlap, receiving duplicate records. If the first stream completes, possibly with an error, there's no guarantee it has processed all of the records it received, and a second stream from the same [[KafkaConsumer]] might not be able to pick up where the first one left off. Therefore, only create a single top-level stream per [[KafkaConsumer]]; if you want to start a new stream after the first one finishes, let the [[KafkaConsumer]] shut down and create a new one.
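
As an illustrative sketch, here is the recommended single-stream usage with batched offset commits (assuming cats-effect IO; the bootstrap servers, group id, topic name, and processing logic are placeholders):

```scala
import cats.effect.{IO, IOApp}
import fs2.kafka._

import scala.concurrent.duration._

object SingleStreamExample extends IOApp.Simple {

  val settings: ConsumerSettings[IO, String, String] =
    ConsumerSettings[IO, String, String]
      .withBootstrapServers("localhost:9092")
      .withGroupId("my-group")
      .withAutoOffsetReset(AutoOffsetReset.Earliest)

  // placeholder processing logic
  def process(record: ConsumerRecord[String, String]): IO[Unit] =
    IO.println(s"${record.key} -> ${record.value}")

  val run: IO[Unit] =
    KafkaConsumer.resource(settings).use { consumer =>
      consumer.subscribeTo("my-topic") >>
        consumer.stream
          // process each record, then emit its offset for committing
          .evalMap(c => process(c.record).as(c.offset))
          // commit offsets in batches of 500, or at least every 15 seconds
          .through(commitBatchWithin(500, 15.seconds))
          .compile
          .drain
    }
}
```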

Value members

Inherited methods

def assign(topic: String): F[Unit]

Manually assigns all partitions for the specified topic to the consumer.

Inherited from
KafkaAssignment
def assign(topic: String, partitions: NonEmptySet[Int]): F[Unit]

Manually assigns the specified list of partitions for the specified topic to the consumer. This function does not allow for incremental assignment and will replace the previous assignment (if there is one).

Manual topic assignment through this method does not use the consumer's group management functionality. As such, there will be no rebalance operation triggered when group membership or cluster and topic metadata change. Note that it is not possible to use both manual partition assignment with assign and group assignment with subscribe.

If auto-commit is enabled, an async commit (based on the old assignment) will be triggered before the new assignment replaces the old one.

To unassign all partitions, use KafkaConsumer#unsubscribe.

See also

org.apache.kafka.clients.consumer.KafkaConsumer#assign

Inherited from
KafkaAssignment
def assign(partitions: NonEmptySet[TopicPartition]): F[Unit]

Manually assigns the specified list of topic partitions to the consumer. This function does not allow for incremental assignment and will replace the previous assignment (if there is one).

Manual topic assignment through this method does not use the consumer's group management functionality. As such, there will be no rebalance operation triggered when group membership or cluster and topic metadata change. Note that it is not possible to use both manual partition assignment with assign and group assignment with subscribe.

If auto-commit is enabled, an async commit (based on the old assignment) will be triggered before the new assignment replaces the old one.

To unassign all partitions, use KafkaConsumer#unsubscribe.

See also

org.apache.kafka.clients.consumer.KafkaConsumer#assign

Inherited from
KafkaAssignment
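
As a brief sketch, manually assigning two partitions of a topic (the consumer value, topic name, and partition numbers are placeholder assumptions):

```scala
import cats.data.NonEmptySet
import cats.effect.IO
import fs2.kafka.KafkaConsumer

// assumes `consumer: KafkaConsumer[IO, String, String]` obtained from KafkaConsumer.resource
def assignTwoPartitions(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
  // replaces any previous assignment; no group management is involved
  consumer.assign("my-topic", NonEmptySet.of(0, 1))
```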
def assignment: F[SortedSet[TopicPartition]]

Returns the set of partitions currently assigned to this consumer.

Inherited from
KafkaAssignment
def assignmentStream: Stream[F, SortedSet[TopicPartition]]

Stream where the elements are the set of TopicPartitions currently assigned to this consumer. The stream emits whenever a rebalance changes partition assignments.

Inherited from
KafkaAssignment
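
For example, a small sketch that logs every new assignment as rebalances happen (the consumer value is assumed to exist and to have subscribed already):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer

def logAssignments(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
  consumer.assignmentStream
    .evalMap(assigned => IO.println(s"Now assigned: ${assigned.mkString(", ")}"))
    .compile
    .drain
```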
@nowarn("cat=deprecation")
def awaitTermination: F[Unit]

Wait for consumer to shut down. Note that awaitTermination is guaranteed to complete after consumer shutdown, even when the consumer is cancelled with terminate.

This method will not initiate shutdown. To initiate shutdown and wait for it to complete, you can use terminate >> awaitTermination.

Inherited from
KafkaConsumerLifecycle
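
A minimal sketch of the combination mentioned above (assuming an existing consumer value):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer

// initiate shutdown, then wait until the consumer has actually shut down
def shutdown(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
  consumer.terminate >> consumer.awaitTermination
```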
def beginningOffsets(partitions: Set[TopicPartition], timeout: FiniteDuration): F[Map[TopicPartition, Long]]

Returns the first offset for the specified partitions.

Inherited from
KafkaTopics
def beginningOffsets(partitions: Set[TopicPartition]): F[Map[TopicPartition, Long]]

Returns the first offset for the specified partitions.

Timeout is determined by default.api.timeout.ms, which is set using ConsumerSettings#withDefaultApiTimeout.

Inherited from
KafkaTopics
def commitAsync(offsets: Map[TopicPartition, OffsetAndMetadata]): F[Unit]

Commit the specified offsets for the specified list of topics and partitions to Kafka.

This commits offsets to Kafka. The offsets committed using this API will be used on the first fetch after every rebalance and also on startup. As such, if you need to store offsets in anything other than Kafka, this API should not be used. The committed offset should be the next message your application will consume, i.e. lastProcessedMessageOffset + 1. If automatic group management with subscribe is used, then the committed offsets must belong to the currently auto-assigned partitions.

Offsets committed through multiple calls to this API are guaranteed to be sent in the same order as the invocations. Additionally note that offsets committed through this API are guaranteed to complete before a subsequent call to commitSync (and variants) returns.

Note that the recommended way to commit offsets in fs2-kafka is to use commit on CommittableConsumerRecord, CommittableOffset or CommittableOffsetBatch. commitAsync and commitSync are usually needed only for custom scenarios.

Value Params
offsets

A map of offsets by partition with associated metadata.

See also

org.apache.kafka.clients.consumer.KafkaConsumer#commitAsync

Inherited from
KafkaCommit
def commitSync(offsets: Map[TopicPartition, OffsetAndMetadata]): F[Unit]

Commit the specified offsets for the specified list of topics and partitions.

This commits offsets to Kafka. The offsets committed using this API will be used on the first fetch after every rebalance and also on startup. As such, if you need to store offsets in anything other than Kafka, this API should not be used. The committed offset should be the next message your application will consume, i.e. lastProcessedMessageOffset + 1. If automatic group management with subscribe is used, then the committed offsets must belong to the currently auto-assigned partitions.

Despite its name, this method is not blocking; it is, however, backed by the blocking org.apache.kafka.clients.consumer.KafkaConsumer#commitSync method.

Note that the recommended way to commit offsets in fs2-kafka is to use commit on CommittableConsumerRecord, CommittableOffset or CommittableOffsetBatch. commitAsync and commitSync are usually needed only for custom scenarios.

Value Params
offsets

A map of offsets by partition with associated metadata

See also

org.apache.kafka.clients.consumer.KafkaConsumer#commitSync

Inherited from
KafkaCommit
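
As a sketch of such a custom scenario, committing an explicit offset map synchronously (topic, partition, and offset are placeholders):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer
import org.apache.kafka.clients.consumer.OffsetAndMetadata
import org.apache.kafka.common.TopicPartition

def commitExplicit(consumer: KafkaConsumer[IO, String, String]): IO[Unit] = {
  val lastProcessed = 42L
  consumer.commitSync(
    // commit the *next* offset to consume, i.e. lastProcessed + 1
    Map(new TopicPartition("my-topic", 0) -> new OffsetAndMetadata(lastProcessed + 1))
  )
}
```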
def endOffsets(partitions: Set[TopicPartition], timeout: FiniteDuration): F[Map[TopicPartition, Long]]

Returns the last offset for the specified partitions.

Inherited from
KafkaTopics
def endOffsets(partitions: Set[TopicPartition]): F[Map[TopicPartition, Long]]

Returns the last offset for the specified partitions.

Timeout is determined by request.timeout.ms, which is set using ConsumerSettings#withRequestTimeout.

Inherited from
KafkaTopics
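
As an illustrative sketch, beginningOffsets and endOffsets can be combined to estimate how many records each assigned partition currently holds (assuming an existing consumer value):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer
import org.apache.kafka.common.TopicPartition

def partitionSizes(consumer: KafkaConsumer[IO, String, String]): IO[Map[TopicPartition, Long]] =
  for {
    assigned <- consumer.assignment
    first    <- consumer.beginningOffsets(assigned)
    last     <- consumer.endOffsets(assigned)
    // both maps are keyed by the same assigned partitions
  } yield first.map { case (tp, begin) => tp -> (last(tp) - begin) }
```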
def metrics: F[Map[MetricName, Metric]]

Returns consumer metrics.

See also

org.apache.kafka.clients.consumer.KafkaConsumer#metrics

Inherited from
KafkaMetrics
def partitionedStream: Stream[F, Stream[F, CommittableConsumerRecord[F, K, V]]]

Stream where the elements themselves are Streams which continually request records for a single partition. These Streams will have to be processed in parallel, using parJoin or parJoinUnbounded. Note that when using parJoin(n) where n is smaller than the number of currently assigned partitions, there will be assigned partitions which won't be processed. For that reason, prefer parJoinUnbounded, where the effective parallelism limit will be the number of assigned partitions.

If you do not want to process all partitions in parallel, then you can use stream instead, where records for all partitions are in a single Stream.

Note

you have to first use subscribe to subscribe the consumer before using this Stream. If you forgot to subscribe, there will be a NotSubscribedException raised in the Stream.

Inherited from
KafkaConsume
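
A hedged sketch of per-partition processing with parJoinUnbounded (the consumer and handler are assumptions):

```scala
import cats.effect.IO
import fs2.kafka.{CommittableConsumerRecord, KafkaConsumer}

def processPerPartition(
  consumer: KafkaConsumer[IO, String, String],
  handle: CommittableConsumerRecord[IO, String, String] => IO[Unit] // placeholder handler
): IO[Unit] =
  consumer.partitionedStream
    .map(_.evalMap(handle)) // one inner stream per assigned partition
    .parJoinUnbounded       // effective parallelism = number of assigned partitions
    .compile
    .drain
```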
def partitionsFor(topic: String, timeout: FiniteDuration): F[List[PartitionInfo]]

Returns the partitions for the specified topic.

Inherited from
KafkaTopics
def partitionsFor(topic: String): F[List[PartitionInfo]]

Returns the partitions for the specified topic.

Timeout is determined by default.api.timeout.ms, which is set using ConsumerSettings#withDefaultApiTimeout.

Inherited from
KafkaTopics
def partitionsMapStream: Stream[F, Map[TopicPartition, Stream[F, CommittableConsumerRecord[F, K, V]]]]

Stream where each element contains the current assignment. The current assignment is a Map where keys are TopicPartitions and values are streams with records for that particular TopicPartition.

New assignments will be received on each rebalance. On rebalance, Kafka revokes all previously assigned partitions, and after that assigns new partitions all at once. partitionsMapStream reflects this process in a streaming manner.

Note that partition streams for revoked partitions will be closed after the new assignment arrives.

This is the most generic Stream method. If you don't need this level of control, consider using the partitionedStream or stream methods instead; both are based on partitionsMapStream.

Note

you have to first use subscribe to subscribe the consumer before using this Stream. If you forgot to subscribe, there will be a NotSubscribedException raised in the Stream.

Inherited from
KafkaConsume
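
For illustration, a sketch that reacts to each new assignment (the consumer is assumed to exist and to have subscribed already):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer

def logEachAssignment(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
  consumer.partitionsMapStream
    .evalMap { assignment =>
      // the per-partition streams in `assignment.values` could be started here;
      // streams for revoked partitions close once the next map is emitted
      IO.println(s"New assignment: ${assignment.keySet.mkString(", ")}")
    }
    .compile
    .drain
```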
def position(partition: TopicPartition, timeout: FiniteDuration): F[Long]

Returns the offset of the next record that will be fetched.

Inherited from
KafkaOffsets
def position(partition: TopicPartition): F[Long]

Returns the offset of the next record that will be fetched.

Timeout is determined by default.api.timeout.ms, which is set using ConsumerSettings#withDefaultApiTimeout.

Inherited from
KafkaOffsets
def seek(partition: TopicPartition, offset: Long): F[Unit]

Overrides the fetch offsets that the consumer will use when reading the next record. If this API is invoked for the same partition more than once, the latest offset will be used. Note that you may lose data if this API is arbitrarily used in the middle of consumption to reset the fetch offsets.

Inherited from
KafkaOffsets
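
For example, a sketch that repositions a single partition to a known offset (topic, partition, and offset are placeholders):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer
import org.apache.kafka.common.TopicPartition

// assumes the partition is currently assigned to this consumer
def rewindTo(consumer: KafkaConsumer[IO, String, String], offset: Long): IO[Unit] =
  consumer.seek(new TopicPartition("my-topic", 0), offset)
```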
def seekToBeginning[G[_]](partitions: G[TopicPartition])(`evidence$1`: Foldable[G]): F[Unit]

Seeks to the first offset for each of the specified partitions. If no partitions are provided, seeks to the first offset for all currently assigned partitions.

Note that this seek evaluates lazily, and only on the next call to poll or position.

Inherited from
KafkaOffsets
def seekToBeginning: F[Unit]

Seeks to the first offset for each currently assigned partition. This is equivalent to using seekToBeginning with an empty set of partitions.

Note that this seek evaluates lazily, and only on the next call to poll or position.

Inherited from
KafkaOffsets
def seekToEnd[G[_]](partitions: G[TopicPartition])(`evidence$2`: Foldable[G]): F[Unit]

Seeks to the last offset for each of the specified partitions. If no partitions are provided, seeks to the last offset for all currently assigned partitions.

Note that this seek evaluates lazily, and only on the next call to poll or position.

Inherited from
KafkaOffsets
def seekToEnd: F[Unit]

Seeks to the last offset for each currently assigned partition. This is equivalent to using seekToEnd with an empty set of partitions.

Note that this seek evaluates lazily, and only on the next call to poll or position.

Inherited from
KafkaOffsets
def stopConsuming: F[Unit]

Stops consuming new messages from Kafka. This method could be used to implement a graceful shutdown.

This method has a few effects:

  1. After this call, no more data will be fetched from Kafka through the poll method.
  2. All currently running streams will continue to run until all in-flight messages have been processed; that is, streams complete once all already-fetched messages have been processed.

If any of the [[stream]] methods are called after [[stopConsuming]], they will return empty streams.

Calling [[stopConsuming]] more than once has no further effect.

Inherited from
KafkaConsume
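
A hedged sketch of a graceful shutdown built on stopConsuming (the shutdown signal and processing logic are assumptions):

```scala
import cats.effect.{Deferred, IO}
import fs2.kafka.KafkaConsumer

// assumes `consumer` has already subscribed and `shutdownSignal`
// is completed by some external trigger (e.g. a SIGTERM handler)
def runUntilShutdown(
  consumer: KafkaConsumer[IO, String, String],
  shutdownSignal: Deferred[IO, Unit]
): IO[Unit] = {
  val consume =
    consumer.stream
      .evalMap(c => IO.println(c.record.value)) // placeholder processing
      .compile
      .drain

  // once the signal fires, stop fetching; `consume` then completes
  // after all already-fetched messages have been processed
  (shutdownSignal.get >> consumer.stopConsuming).background.surround(consume)
}
```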
def stream: Stream[F, CommittableConsumerRecord[F, K, V]]

Alias for partitionedStream.parJoinUnbounded. See partitionedStream for more information.

Note

you have to first use subscribe to subscribe the consumer before using this Stream. If you forgot to subscribe, there will be a NotSubscribedException raised in the Stream.

Inherited from
KafkaConsume
def subscribe(regex: Regex): F[Unit]

Subscribes the consumer to the topics matching the specified Regex. Note that you have to use one of the subscribe functions before you can use any of the provided Streams, or a NotSubscribedException will be raised in the Streams.

Value Params
regex

the regex to which matching topics should be subscribed

Inherited from
KafkaSubscription
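
For instance, a one-line sketch subscribing to every topic matching a prefix (the pattern is a placeholder):

```scala
import cats.effect.IO
import fs2.kafka.KafkaConsumer

def subscribeToEvents(consumer: KafkaConsumer[IO, String, String]): IO[Unit] =
  consumer.subscribe("events-.*".r) // all topics starting with "events-"
```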
def subscribe[G[_]](topics: G[String])(`evidence$1`: Reducible[G]): F[Unit]

Subscribes the consumer to the specified topics. Note that you have to use one of the subscribe functions to subscribe to one or more topics before using any of the provided Streams, or a NotSubscribedException will be raised in the Streams.

Value Params
topics

the topics to which the consumer should subscribe

Inherited from
KafkaSubscription
def subscribeTo(firstTopic: String, remainingTopics: String*): F[Unit]

Subscribes the consumer to the specified topics. Note that you have to use one of the subscribe functions to subscribe to one or more topics before using any of the provided Streams, or a NotSubscribedException will be raised in the Streams.

Inherited from
KafkaSubscription
@nowarn("cat=deprecation")
def terminate: F[Unit]

Whenever terminate is invoked, an attempt will be made to stop the underlying consumer. The terminate operation will not wait for the consumer to shutdown. If you also want to wait for the shutdown to complete, you can use terminate >> awaitTermination.

Inherited from
KafkaConsumerLifecycle
def unsubscribe: F[Unit]

Unsubscribes the consumer from all topics and partitions assigned by subscribe or assign.

Inherited from
KafkaSubscription

Deprecated and Inherited methods

@deprecated("Use terminate/awaitTermination instead", since = "1.4.0")
def fiber: Fiber[F, Unit]

A Fiber that can be used to cancel the underlying consumer, or wait for it to complete. If you're using KafkaConsumer.stream, or any other provided stream in KafkaConsumer, these will be automatically interrupted when the underlying consumer has been cancelled or when it finishes with an exception.

Whenever cancel is invoked, an attempt will be made to stop the underlying consumer. The cancel operation will not wait for the consumer to shutdown. If you also want to wait for the shutdown to complete, you can use join. Note that join is guaranteed to complete after consumer shutdown, even when the consumer is cancelled with cancel.

This Fiber instance is usually only required if the consumer needs to be cancelled due to some external condition, or when an external process needs to be cancelled whenever the consumer has shut down. Most of the time, when you're only using the streams provided by KafkaConsumer, there is no need to use this.

Deprecated
[Since version 1.4.0]
Inherited from
KafkaConsumerLifecycle