Scaling Out Consumers
How do we handle a large number of partitions from a consumer's perspective? Having a bunch of consumers independently consuming messages from topics and partitions won’t solve this scalability challenge.
Consumers have to work together! But how?
Consumer groups are the solution. A consumer group is a collection of independent consumer processes working together as a team.
A consumer joins a group through the “group.id” setting: consumers configured with the same value belong to the same group. The consumer group shares message consumption and the processing load among its members, which brings higher throughput and parallelism, better performance, and more redundancy (the group keeps working after the failure of a single consumer).
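To make the idea of sharing the load concrete, here is a minimal sketch in plain Python of how the partitions of a topic could be spread across the members of one group. It is a simplified round-robin illustration, not Kafka's actual assignor code; the topic name "orders", the member names, and the function assign_round_robin are all made up for the example.

```python
def assign_round_robin(partitions, members):
    """Spread partitions over group members, one partition at a time.

    Illustrative only: a simplified stand-in for what a partition
    assignor does inside a consumer group.
    """
    ordered = sorted(members)
    # Every member gets an entry, so idle members show an empty list.
    assignment = {member: [] for member in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


# Hypothetical topic "orders" with 6 partitions, shared by 3 consumers
# that all use the same group.id.
partitions = [("orders", p) for p in range(6)]
members = ["consumer-a", "consumer-b", "consumer-c"]

assignment = assign_round_robin(partitions, members)
for member, owned in assignment.items():
    print(member, owned)
```

Each partition ends up with exactly one owner inside the group, which is what lets the members process in parallel without reading the same messages twice. If there are more members than partitions, the extra members simply stay idle in this sketch.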
How is a consumer group rebalanced when needed?
When a new consumer is assigned to a partition, it needs to know which offset to start from, because it does not yet have a current position for this particular partition. Fortunately, the last offset committed by the previous consumer has been cached by the consumer's subscription state object. From there, the subscription state object can tell the new consumer to start at offset 4.
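The handover described above can be sketched in a few lines of plain Python. This is only an in-memory illustration: the dictionary committed_offsets, the function starting_offset, and the sample log are hypothetical stand-ins, not Kafka client APIs. The sketch assumes, as in Kafka, that a committed offset designates the next message to read.

```python
# Hypothetical stand-in for the offsets committed by the group.
# The previous owner of partition 0 of "orders" committed offset 4.
committed_offsets = {("orders", 0): 4}


def starting_offset(partition, committed, default=0):
    """Return the offset a newly assigned consumer should read first.

    The fallback value is an assumption of this sketch; in a real
    consumer the behaviour without a committed offset is governed by
    configuration.
    """
    return committed.get(partition, default)


# A toy partition log: the list index plays the role of the offset.
log = ["m0", "m1", "m2", "m3", "m4", "m5"]

start = starting_offset(("orders", 0), committed_offsets)
consumed = log[start:]  # the new consumer resumes here
print(start, consumed)
```

The new consumer picks up at offset 4 and reads only the messages the previous consumer had not yet committed, so nothing before that position is processed again.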
The rebalancing protocol is also initiated when a consumer fails or when a topic changes (for example, when a partition is added).
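As a rough illustration of these triggers, the following plain-Python sketch recomputes the assignment every time the membership or the partition count changes. The class GroupSketch and its methods are invented for this example and do not mirror the real coordinator; the generation counter is just a way to count how many rebalances happened.

```python
class GroupSketch:
    """Toy model of a consumer group that rebalances on every change."""

    def __init__(self, partition_count):
        self.partition_count = partition_count
        self.members = set()
        self.generation = 0   # incremented on each rebalance
        self.assignment = {}

    def _rebalance(self):
        self.generation += 1
        ordered = sorted(self.members)
        self.assignment = {member: [] for member in ordered}
        if not ordered:
            return  # nobody left to own partitions
        for p in range(self.partition_count):
            self.assignment[ordered[p % len(ordered)]].append(p)

    def join(self, member):
        self.members.add(member)
        self._rebalance()

    def leave(self, member):
        # Also models a consumer failure detected by the group.
        self.members.discard(member)
        self._rebalance()

    def add_partitions(self, extra):
        self.partition_count += extra
        self._rebalance()


group = GroupSketch(partition_count=4)
group.join("consumer-a")
group.join("consumer-b")
two_members = dict(group.assignment)  # snapshot with both consumers

group.leave("consumer-b")   # failure: consumer-a takes over everything
group.add_partitions(2)     # topic change: two more partitions
print(group.generation, group.assignment)
```

With both consumers present, each owns two of the four partitions. After consumer-b goes away and two partitions are added, consumer-a owns all six, and four rebalances have taken place in total.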