Interview Question
Your Kafka consumer groups are showing high lag and messages are processing slowly. How do you investigate and remediate this?
Key Points to Cover
- Check consumer offsets and group rebalances
- Validate consumer parallelism and partition assignment
- Inspect slow processing logic and batch sizes
- Scale consumers horizontally (up to the topic's partition count) or tune fetch configs
- Add monitoring/alerting on lag growth rate
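The offset and lag checks above can be sketched in a few lines. This is a minimal illustration, not a monitoring implementation: the function names and the sample offsets are hypothetical, and the inputs stand in for the log-end and committed offsets you would read from a tool such as `kafka-consumer-groups.sh --describe`.

```python
from typing import Dict


def partition_lag(end_offsets: Dict[int, int], committed: Dict[int, int]) -> Dict[int, int]:
    """Lag per partition = log-end offset minus committed offset, floored at 0.

    Assumption: a partition with no committed offset is treated as starting
    from 0; real behavior depends on the consumer's offset reset policy.
    """
    return {p: max(end - committed.get(p, 0), 0) for p, end in end_offsets.items()}


def lag_growth_rate(lag_then: int, lag_now: int, interval_s: float) -> float:
    """Messages per second by which total lag grew (negative = catching up)."""
    return (lag_now - lag_then) / interval_s


# Hypothetical samples taken 60 seconds apart for a 3-partition topic.
sample_1 = partition_lag({0: 1000, 1: 1000, 2: 1000}, {0: 990, 1: 400, 2: 985})
sample_2 = partition_lag({0: 1600, 1: 1600, 2: 1600}, {0: 1590, 1: 700, 2: 1580})

total_1 = sum(sample_1.values())
total_2 = sum(sample_2.values())
growth = lag_growth_rate(total_1, total_2, interval_s=60)

# The partition with the largest lag is the first place to look for a
# slow or stuck consumer, or for skewed message keys.
hottest = max(sample_2, key=sample_2.get)

print(f"total lag {total_1} -> {total_2}, growth {growth:.2f} msg/s, hottest partition {hottest}")
```

Alerting on the growth rate rather than on absolute lag separates a consumer that is falling behind from one that is slowly draining a backlog. Lag concentrated in a single partition, as in this sample, points at one consumer or one hot key rather than at the group as a whole.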
Evaluation Rubric
Analyzes offsets/lag correctly: 30% weight
Checks partition assignment and parallelism: 30% weight
Identifies slow processing causes: 20% weight
Proposes scaling/tuning fixes: 20% weight
Hints
- 💡Look at rebalances, GC pauses, and slow consumers.
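One way slow consumers and rebalances connect: a Java-client consumer that does not call `poll()` again within `max.poll.interval.ms` is considered failed and removed from the group, which triggers a rebalance. The sketch below is a rough check under simplifying assumptions (records processed sequentially, a fixed per-record cost). The function name and sample timings are hypothetical; the defaults of 500 records and 300000 ms are the Kafka consumer defaults for `max.poll.records` and `max.poll.interval.ms`.

```python
def poll_loop_at_risk(
    per_record_ms: float,
    max_poll_records: int = 500,          # Kafka consumer default
    max_poll_interval_ms: int = 300_000,  # Kafka consumer default (5 minutes)
) -> bool:
    """Return True if processing one full batch could exceed the poll interval.

    Assumption: records in a batch are handled one at a time and each takes
    roughly per_record_ms, so worst-case batch time is a simple product.
    """
    worst_case_ms = max_poll_records * per_record_ms
    return worst_case_ms >= max_poll_interval_ms


# Hypothetical timings: 50 ms per record is safe, 700 ms per record is not.
fast_handler = poll_loop_at_risk(per_record_ms=50)    # 25,000 ms per batch
slow_handler = poll_loop_at_risk(per_record_ms=700)   # 350,000 ms per batch

# Shrinking the batch is one way to bring the slow handler back under the limit.
slow_small_batch = poll_loop_at_risk(per_record_ms=700, max_poll_records=100)

print(fast_handler, slow_handler, slow_small_batch)
```

A long GC pause has the same effect as a slow handler here, since it also delays the next `poll()` call.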
Common Pitfalls to Avoid
- ⚠️Assuming the problem is solely with Kafka when the bottleneck is actually within the consumer application's processing logic.
- ⚠️Not considering the impact of consumer group rebalances as a primary cause of temporary or persistent lag.
- ⚠️Failing to correlate consumer lag with producer throughput or the number of partitions in the topic.
- ⚠️Overlooking resource constraints (CPU, memory, network) on the consumer instances themselves.
- ⚠️Not inspecting the consumer's fetching and batching configurations (e.g., `fetch.min.bytes`, `fetch.max.wait.ms`) which can significantly impact performance.
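Correlating lag with producer throughput and partition count can start as back-of-the-envelope arithmetic. The sketch below assumes a steady produce rate and a measured per-consumer processing rate; the function name and all figures are hypothetical. It relies on one fact about consumer groups: each partition is assigned to at most one consumer in a group, so consumers beyond the partition count sit idle.

```python
import math
from typing import Dict, Union


def consumers_needed(
    produce_rate: float, per_consumer_rate: float, partitions: int
) -> Dict[str, Union[int, bool]]:
    """Estimate how many consumers are needed to keep up with producers.

    produce_rate and per_consumer_rate are in messages per second.
    """
    if per_consumer_rate <= 0:
        raise ValueError("per_consumer_rate must be positive")
    wanted = math.ceil(produce_rate / per_consumer_rate)
    # Only one consumer per partition can do work within a single group.
    usable = min(wanted, partitions)
    return {
        "wanted": wanted,
        "usable": usable,
        "partition_bound": wanted > partitions,
    }


# Hypothetical numbers: 5,000 msg/s produced, 800 msg/s per consumer, 6 partitions.
plan = consumers_needed(produce_rate=5000, per_consumer_rate=800, partitions=6)
print(plan)
```

When `partition_bound` is true, adding consumers alone will not clear the lag: either the topic needs more partitions or each consumer has to process faster.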
Potential Follow-up Questions
- ❓What about exactly-once semantics?
- ❓How do you size partitions properly?