Deep Dive Into Concept of Kafka Consumer — Part III

How Kafka Consumer Works

Dileep
Mar 7, 2021

This is the third part of our series on Kafka.

If you haven’t read the second article, I recommend reading it before this one: Part II

Consumer

A consumer reads messages from a topic. Each message it receives is represented as a ConsumerRecord.

Consumer Groups

A consumer group contains multiple consumers that are subscribed to the same topic. Let’s say our application produces messages faster than a single consumer can consume them. If we continue with a single consumer, messages will pile up in Kafka for that topic and the application will fall farther and farther behind, unable to keep up with the rate of incoming messages. In that case we need to scale consumption from the topic by allowing multiple consumers to read from it, splitting the data between them.
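To make the "falling behind" argument concrete, here is a small back-of-the-envelope sketch. It is not Kafka code: the function name `backlog_after` and all the rates are made-up assumptions, and it models production and consumption as perfectly steady.

```python
def backlog_after(seconds, produce_rate, consume_rate_per_consumer,
                  consumers, partitions):
    """Messages left unread after `seconds` in a simple steady-rate model."""
    # Only as many consumers as there are partitions can do useful work.
    active = min(consumers, partitions)
    net_rate = produce_rate - active * consume_rate_per_consumer
    # The backlog cannot go negative: caught-up consumers simply wait.
    return max(0, net_rate * seconds)


# Hypothetical numbers: 1000 msg/s produced, each consumer handles 400 msg/s.
for n in (1, 2, 3):
    behind = backlog_after(60, 1000, 400, n, partitions=4)
    print(n, "consumer(s):", behind, "messages behind after 60 s")
```

With these assumed rates, one or two consumers keep falling behind, while three are enough to keep up.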

**Within a consumer group, the messages of a single partition can be consumed by only one consumer at a time.**


A consumer can consume messages from several partitions of a topic, but within a group the messages of one partition can’t be consumed by multiple consumers.

The main way we scale data consumption from a Kafka topic is by adding more consumers to a consumer group. There is no point in adding more consumers to a group than the topic has partitions: the extra consumers will just be idle.
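The partition-to-consumer rules above can be sketched with a toy assignment function. This is only an illustration under simplifying assumptions: `assign_round_robin` and the consumer names are hypothetical, and Kafka’s real assignment strategies are configurable and more involved.

```python
def assign_round_robin(partitions, consumers):
    """Give each partition to exactly one consumer, dealing them out in turn."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(p)
    return assignment


partitions = [0, 1, 2, 3]  # a topic with four partitions

# Fewer consumers than partitions: each consumer owns several partitions.
two = assign_round_robin(partitions, ["c1", "c2"])
print(two)

# More consumers than partitions: the extra consumer gets nothing.
five = assign_round_robin(partitions, ["c1", "c2", "c3", "c4", "c5"])
idle = [c for c, owned in five.items() if not owned]
print(five, "idle:", idle)
```

Every partition appears under exactly one consumer, and with five consumers on four partitions one of them stays idle.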


When we add a new consumer to the group, it starts consuming messages from partitions that were previously consumed by another consumer. The same thing happens when a consumer shuts down or crashes: it leaves the group, and the partitions it used to consume are taken over by one of the remaining consumers.

Rebalancing: Moving partition ownership from one consumer to another when a consumer joins or leaves the group. Consumers can’t consume messages during a rebalance, so a rebalance is a short window in which the whole group is unavailable.
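As a rough sketch of what "ownership movement" means, the toy code below recomputes an assignment when group membership changes and reports which partitions changed owner. The helper names `assign` and `rebalance` and the simple modulo scheme are assumptions for illustration, not how Kafka actually decides assignments.

```python
def assign(partitions, members):
    """Map each partition to one member (toy round-robin scheme)."""
    return {p: members[i % len(members)] for i, p in enumerate(partitions)}


def rebalance(partitions, old_members, new_members):
    """Return the new assignment and the partitions whose owner changed."""
    before = assign(partitions, old_members)
    after = assign(partitions, new_members)
    moved = {p: (before[p], after[p])
             for p in partitions if before[p] != after[p]}
    return after, moved


partitions = [0, 1, 2, 3]

# A new consumer c3 joins a group of two.
after_join, moved_on_join = rebalance(partitions, ["c1", "c2"], ["c1", "c2", "c3"])
print("after join:", after_join, "moved:", moved_on_join)

# Consumer c2 crashes and leaves the group.
after_leave, moved_on_leave = rebalance(partitions, ["c1", "c2", "c3"], ["c1", "c3"])
print("after leave:", after_leave, "moved:", moved_on_leave)
```

In both cases every partition still has exactly one owner afterwards; only the owner changes.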

Group Coordinator (broker): Every consumer sends heartbeats to the Group Coordinator (GC). As long as a consumer keeps sending heartbeats to the GC, it is assumed to be alive and can process messages from its partitions. When a consumer stops sending heartbeats for long enough, the GC considers it dead and triggers a rebalance.
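The heartbeat logic can be modelled in a few lines. This is a simplified simulation, not the broker’s implementation: the `GroupCoordinator` class, its methods, and the 10-second timeout are hypothetical, and time is passed in explicitly instead of being read from a clock. In a real consumer, the related settings are `session.timeout.ms` and `heartbeat.interval.ms`.

```python
class GroupCoordinator:
    """Toy coordinator that tracks the last heartbeat time of each consumer."""

    def __init__(self, session_timeout):
        self.session_timeout = session_timeout
        self.last_heartbeat = {}

    def heartbeat(self, consumer, now):
        # Record that this consumer was alive at time `now`.
        self.last_heartbeat[consumer] = now

    def expire_dead_members(self, now):
        """Remove consumers that have been silent longer than the timeout.

        A non-empty result means the group would need a rebalance.
        """
        dead = [c for c, seen in self.last_heartbeat.items()
                if now - seen > self.session_timeout]
        for c in dead:
            del self.last_heartbeat[c]
        return dead


gc = GroupCoordinator(session_timeout=10)
gc.heartbeat("c1", now=0)
gc.heartbeat("c2", now=0)

# c1 keeps sending heartbeats every 3 seconds; c2 goes silent.
for t in (3, 6, 9, 12):
    gc.heartbeat("c1", now=t)

dead = gc.expire_dead_members(now=12)
print("dead consumers:", dead, "-> rebalance needed:", bool(dead))
```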

Commits and Offsets: An offset is a sequential number assigned to every message in a partition. A commit records how far the consumers in a group have read in a partition of a topic, that is, the offset up to which messages have been consumed so far.

We poll a batch of messages from the topic and process them. Committing records how far we have got, so that consumption can later resume from the right place. Kafka stores these commit details for each partition in the __consumer_offsets topic. So if one consumer goes down and another consumer starts consuming messages from its partition, the new consumer reads the last commit from the __consumer_offsets topic and starts consuming the messages after it.
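The takeover scenario can be sketched with an in-memory model. Everything here is an assumption made for illustration: the list `log` stands in for one partition, the dict `committed` stands in for the __consumer_offsets topic, and the `Consumer` class is a toy, not the Kafka client.

```python
log = ["m0", "m1", "m2", "m3", "m4", "m5"]  # one partition; list index = offset
committed = {}  # stand-in for __consumer_offsets: (group, partition) -> next offset


class Consumer:
    """Toy consumer that reads one partition on behalf of a group."""

    def __init__(self, group, partition):
        self.group, self.partition = group, partition
        # On taking over a partition, start from the group's last commit.
        self.position = committed.get((group, partition), 0)

    def poll(self, max_records):
        batch = log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self):
        # Store the offset of the next message the group should read.
        committed[(self.group, self.partition)] = self.position


# Consumer A reads a batch, processes it, and commits.
a = Consumer("g1", partition=0)
batch_a = a.poll(3)
a.commit()

# A goes down; consumer B in the same group takes over the partition.
b = Consumer("g1", partition=0)
batch_b = b.poll(3)
print("A read:", batch_a, "| B resumes with:", batch_b)
```

B never sees the messages A already committed, because it starts from the offset stored for the group rather than from the beginning of the partition.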

Thanks for reading.

Deep Dive Into Concept of kafka Internals — PART IV


Dileep

Passionate about coding, cyber security | Software Engineer | IIT Roorkee.