Intermediate · Scenario
10 min
Container OOMKilled Repeatedly
Containers · Kubernetes · Memory
Interview Question
A container is consistently OOMKilled under normal workload. How do you debug and fix it?
Key Points to Cover
- Check Kubernetes events and node dmesg output for OOMKilled evidence (see the sketch after this list)
- Inspect resource requests/limits vs usage
- Profile memory usage inside container
- Fix leaks or adjust JVM/GC/heap configs
- Raise limits or split workload across pods
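A minimal debugging sketch for the first points above, assuming a hypothetical Deployment `myapp` labeled `app=myapp` in namespace `prod`, and that metrics-server is installed; the 512Mi/1Gi values are placeholders, not recommendations:

```bash
# Confirm the kill: the container status should report Reason: OOMKilled.
kubectl -n prod get pod -l app=myapp \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[0].lastState.terminated.reason}{"\n"}{end}'

# Restart counts and recent events for the same pods.
kubectl -n prod describe pod -l app=myapp | grep -E "OOMKilled|Restart Count"
kubectl -n prod get events --sort-by=.lastTimestamp | tail -n 20

# Kernel-level confirmation, run on the node that hosted the pod.
dmesg -T | grep -i "killed process"

# Compare live usage against the configured requests/limits.
kubectl -n prod top pod -l app=myapp --containers
kubectl -n prod get deploy myapp \
  -o jsonpath='{.spec.template.spec.containers[0].resources}'

# Stopgap: raise the limit while the leak or sizing issue is investigated.
kubectl -n prod set resources deploy/myapp \
  --requests=memory=512Mi --limits=memory=1Gi
```

Raising the limit is only a stopgap: if `kubectl top` shows usage climbing steadily toward the limit under a flat workload, the application is likely leaking and needs profiling rather than more memory.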
Evaluation Rubric
- Detects OOMKilled via logs/events: 30% weight
- Analyzes memory patterns/configs: 30% weight
- Proposes app/code fixes or limit tuning: 20% weight
- Mentions scaling/splitting workload: 20% weight
Hints
- 💡Heap configs often mismatch the pod's memory limit (a JVM sketch follows below).
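For example, a JVM whose maximum heap is set near or above the pod's memory limit will be killed by the kernel before it ever throws OutOfMemoryError, because heap is only part of the process footprint (metaspace, thread stacks, and native buffers come on top). A hedged sketch for a containerized Java service; the 75% ratio and the `myapp` / `prod` names are illustrative assumptions:

```bash
# Size the heap relative to the cgroup memory limit instead of a fixed -Xmx,
# leaving roughly 25% headroom for non-heap memory (illustrative ratio).
java -XX:+UseContainerSupport -XX:MaxRAMPercentage=75.0 -jar app.jar

# Verify the heap size the JVM actually resolved inside the running pod.
kubectl -n prod exec deploy/myapp -- \
  java -XX:MaxRAMPercentage=75.0 -XX:+PrintFlagsFinal -version | grep -i MaxHeapSize
```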
Common Pitfalls to Avoid
- ⚠️Focusing solely on Kubernetes limits without investigating application-level memory usage.
- ⚠️Not checking kernel-level logs (dmesg) for the OOM killer's direct indication.
- ⚠️Assuming the issue is external to the application without proper profiling (an in-container profiling sketch follows after this list).
- ⚠️Forgetting to correlate observed memory usage with defined resource requests/limits.
- ⚠️Not considering the possibility of memory fragmentation or other node-level memory issues.
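A sketch of in-container profiling to back the pitfalls above, assuming the same hypothetical `myapp` deployment, an image that ships procps, a cgroup v2 node, and (for the last command) a JVM running as PID 1; swap in your runtime's equivalent tooling otherwise:

```bash
# Resident set size per process inside the container, largest first.
kubectl -n prod exec deploy/myapp -- ps -o pid,rss,comm --sort=-rss | head

# The cgroup accounting the OOM killer acts on (cgroup v1 exposes
# memory/memory.usage_in_bytes and memory.limit_in_bytes instead).
kubectl -n prod exec deploy/myapp -- \
  cat /sys/fs/cgroup/memory.current /sys/fs/cgroup/memory.max

# JVM example: a class histogram often points straight at a leaking type.
kubectl -n prod exec deploy/myapp -- jcmd 1 GC.class_histogram | head -n 25
```

Correlate the cgroup figures with the limit in the Deployment spec; steady growth between restarts under a flat workload is the classic leak signature.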
Potential Follow-up Questions
- ❓How would you prevent noisy-neighbor OOMs on a shared node?
- ❓How does memory QoS work in Kubernetes? (a sketch follows below)
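One common angle on the QoS follow-up is QoS classes: when memory (and CPU) requests equal limits, the pod lands in the Guaranteed class, which gives it the lowest oom_score_adj among workload pods and the most protection during node memory pressure; conversely, putting limits on noisy neighbors keeps them from squeezing other pods on the node. A minimal sketch using the same hypothetical names and placeholder values:

```bash
# Requests == limits places the pod in the Guaranteed QoS class.
kubectl -n prod set resources deploy/myapp \
  --requests=cpu=500m,memory=1Gi --limits=cpu=500m,memory=1Gi

# Confirm the class assigned to the rolled-out pods.
kubectl -n prod get pod -l app=myapp -o jsonpath='{.items[*].status.qosClass}'
```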