Intermediate · Scenario
10 min
Service in CrashLoopBackOff
Kubernetes · Containers · Reliability
Interview Question
The pods behind a Kubernetes service keep restarting and show a CrashLoopBackOff status. How do you debug and resolve this?
Key Points to Cover
- Inspect pod logs for root cause of crashes
- Check readiness/liveness probe configs
- Validate environment variables and secrets
- Review resource limits (OOMKilled/CPU throttling)
- Roll back recent config changes if needed
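The first few checks above can be sketched as a small triage helper. This is a minimal illustration, not a definitive tool: it assumes the pod object has already been fetched as JSON (for example with `kubectl get pod <name> -o json`) and loaded into a dictionary. The function name `diagnose_pod` and the wording of the hints are hypothetical; the status fields it reads (`containerStatuses`, `initContainerStatuses`, `state.waiting.reason`, `lastState.terminated`) are part of the standard pod status.

```python
from typing import Any


def diagnose_pod(pod: dict[str, Any]) -> list[str]:
    """Return one finding per container that is currently in CrashLoopBackOff.

    Hypothetical helper: `pod` is the parsed JSON of a single pod object.
    """
    findings = []
    status = pod.get("status", {})
    # Init containers run first; a crashing one blocks the main containers.
    statuses = status.get("initContainerStatuses", []) + status.get("containerStatuses", [])

    for cs in statuses:
        waiting = cs.get("state", {}).get("waiting", {})
        if waiting.get("reason") != "CrashLoopBackOff":
            continue

        # lastState.terminated describes the previous (crashed) run.
        last = cs.get("lastState", {}).get("terminated", {})
        reason = last.get("reason", "Unknown")
        exit_code = last.get("exitCode")

        if reason == "OOMKilled":
            hint = "memory limit exceeded; review resources.limits.memory"
        elif exit_code not in (None, 0):
            hint = "non-zero exit; read the previous run's logs (kubectl logs --previous)"
        else:
            hint = "no clear cause in status; check events, probes, env vars and secrets"

        findings.append(
            f"{cs.get('name', '?')}: restarts={cs.get('restartCount', 0)}, "
            f"last exit={exit_code} ({reason}); {hint}"
        )
    return findings
```

A candidate could describe the same flow verbally: find the container in CrashLoopBackOff, read how its previous run ended, then branch to logs, resource limits, or configuration depending on the termination reason.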
Evaluation Rubric
- Checks logs for errors: 30% weight
- Validates health probe settings: 30% weight
- Considers resource-related crashes: 20% weight
- Mentions rollback/safe fixes: 20% weight
Hints
- 💡Check init containers and secrets.
Common Pitfalls to Avoid
- ⚠️Not checking pod logs as the first step, leading to unnecessary investigation.
- ⚠️Overlooking the importance of `kubectl describe pod` and its events section for exit codes and other status indicators.
- ⚠️Assuming the application code is always the problem, neglecting configuration, environment variables, and secrets.
- ⚠️Not thoroughly testing readiness/liveness probes independently or ensuring their configurations match the application's behavior.
- ⚠️Failing to consider resource constraints (CPU/memory) as a potential cause for crashes, especially OOMKilled events.
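The probe pitfall above can be made concrete with a rough calculation. The sketch below estimates how long a container has to become healthy before the kubelet restarts it, using the liveness probe fields `initialDelaySeconds`, `periodSeconds`, and `failureThreshold` with their Kubernetes defaults (0, 10, and 3). It is an approximation under stated assumptions: it ignores `timeoutSeconds` and scheduling jitter, and the function names and the measured startup time are hypothetical.

```python
def liveness_budget_seconds(probe: dict) -> int:
    """Approximate seconds after container start by which the app must pass
    a liveness check, assuming it fails every check until then.

    Checks run at initialDelay, initialDelay + period, and so on; the
    container is restarted once `failureThreshold` consecutive checks fail,
    so the last chance to pass is the check numbered `failureThreshold`.
    """
    initial_delay = probe.get("initialDelaySeconds", 0)
    period = probe.get("periodSeconds", 10)
    failures = probe.get("failureThreshold", 3)
    return initial_delay + period * (failures - 1)


def probe_too_aggressive(probe: dict, startup_seconds: float) -> bool:
    """True if the app's (assumed, measured) startup time exceeds the budget."""
    return startup_seconds > liveness_budget_seconds(probe)
```

For example, with `initialDelaySeconds: 5`, `periodSeconds: 10`, and `failureThreshold: 3`, the budget is about 25 seconds, so an application that needs 40 seconds to start would be restarted repeatedly even though its code is fine. Raising the initial delay or using a separate `startupProbe` are the usual remedies to discuss.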
Potential Follow-up Questions
- ❓How do you prevent repeated restarts?
- ❓What alerts would you add?