
Intermediate Scenario

10 min

Service in CrashLoopBackOff

Kubernetes Containers Reliability


Interview Question

A service running on Kubernetes keeps restarting, and its pods show the CrashLoopBackOff status. How do you debug and resolve this?
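For context on what the status itself means: CrashLoopBackOff is the waiting state a container enters while the kubelet delays its next restart. The sketch below models that delay under the defaults described in the Kubernetes pod lifecycle documentation as understood here (10 s, doubling, capped at 300 s, reset after 10 minutes of clean running). The function name `backoff_delays` is hypothetical, and the numbers may differ by cluster version or configuration.

```python
from itertools import islice


def backoff_delays(initial: int = 10, cap: int = 300):
    """Yield successive restart delays in seconds.

    Assumed defaults: start at 10 s, double each time, never exceed 300 s.
    """
    delay = initial
    while True:
        yield delay
        delay = min(delay * 2, cap)


# Delays before the first eight restarts of a container that keeps crashing.
first_eight = list(islice(backoff_delays(), 8))
print(first_eight)  # [10, 20, 40, 80, 160, 300, 300, 300]
```

The practical consequence is that a fix may take up to five minutes to show up as a fresh restart, so a quiet pod right after a change is not yet proof that the crash is gone.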

Key Points to Cover

Inspect pod logs for root cause of crashes
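Check liveness, startup, and readiness probe configs (failed liveness or startup probes restart the container; readiness failures only remove the pod from Service endpoints)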
Validate environment variables and secrets
Review resource limits (OOMKilled/CPU throttling)
Roll back recent config changes if needed
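The key points above map onto a short, ordered set of `kubectl` commands. The following is a minimal sketch that only builds the command strings and does not run them; the helper name `triage_commands` and the example pod, namespace, and deployment names are hypothetical. The `kubectl` subcommands and flags used (`logs --previous`, `describe pod`, `get events --field-selector`, `rollout history`, `rollout undo`) are standard ones.

```python
import shlex
from typing import List, Optional


def triage_commands(pod: str, namespace: str = "default",
                    deployment: Optional[str] = None) -> List[str]:
    """Return kubectl commands in a reasonable triage order (not executed)."""
    cmds = [
        # 1. Logs of the previous (crashed) container instance.
        ["kubectl", "logs", pod, "-n", namespace, "--previous"],
        # 2. Last state, exit code, probes, env/secret refs, resource limits.
        ["kubectl", "describe", "pod", pod, "-n", namespace],
        # 3. Events for this pod, oldest first.
        ["kubectl", "get", "events", "-n", namespace,
         "--field-selector", f"involvedObject.name={pod}",
         "--sort-by=.metadata.creationTimestamp"],
    ]
    if deployment:
        target = f"deployment/{deployment}"
        # 4. Inspect revisions before rolling back; undo changes cluster state.
        cmds.append(["kubectl", "rollout", "history", target, "-n", namespace])
        cmds.append(["kubectl", "rollout", "undo", target, "-n", namespace])
    return [shlex.join(c) for c in cmds]


for line in triage_commands("web-abc", "prod", deployment="web"):
    print(line)
```

Keeping the rollback last is deliberate: the earlier commands are read-only, so they can be run freely before deciding whether reverting the last change is the safest fix.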

Evaluation Rubric

Checks logs for errors (30% weight)

Validates health probe settings (30% weight)

Considers resource-related crashes (20% weight)

Mentions rollback/safe fixes (20% weight)
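As a worked example of how these weights combine, the sketch below scores an answer by summing the weights of the criteria it covers. It assumes each criterion is all-or-nothing, which the rubric does not state; the key names and the `rubric_score` function are hypothetical labels for the four rubric lines.

```python
from typing import Iterable

# Weights in percent, taken from the rubric; the key names are illustrative.
RUBRIC = {
    "checks_logs": 30,
    "validates_probes": 30,
    "considers_resources": 20,
    "mentions_rollback": 20,
}


def rubric_score(covered: Iterable[str]) -> int:
    """Sum the weights of covered criteria (assumes no partial credit)."""
    covered_set = set(covered)
    return sum(weight for name, weight in RUBRIC.items() if name in covered_set)


# An answer that covers logs and probes but skips resources and rollback.
print(rubric_score(["checks_logs", "validates_probes"]))  # 60
```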

Hints

💡Check init containers and secrets.

Common Pitfalls to Avoid

⚠️Not checking pod logs as the first step, leading to unnecessary investigation.
⚠️Overlooking `kubectl describe pod`, whose Last State and Events sections show exit codes, termination reasons, and probe failures.
⚠️Assuming the application code is always the problem, neglecting configuration, environment variables, and secrets.
⚠️Not testing readiness/liveness probes independently, or not confirming that their paths, ports, and timings match the application's actual startup and response behavior.
⚠️Failing to consider resource constraints (CPU/memory) as a potential cause for crashes, especially OOMKilled events.
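Several of these pitfalls come down to reading the last termination record correctly. The sketch below turns a pod object, shaped like the output of `kubectl get pod <name> -o json`, into a first hint per container. The function names and hint wording are hypothetical; the field names (`containerStatuses`, `initContainerStatuses`, `lastState.terminated.exitCode`, `reason`) follow the Kubernetes pod status schema, and the exit-code readings (137 = 128 + SIGKILL, 143 = 128 + SIGTERM) are general conventions rather than a diagnosis.

```python
from typing import List, Optional, Tuple


def crash_hint(exit_code: Optional[int], reason: Optional[str] = None) -> str:
    """Map a termination record to a starting point for investigation."""
    if reason == "OOMKilled":
        return "OOM: container exceeded its memory limit; review limits and usage"
    if exit_code == 137:
        return "SIGKILL: killed forcibly; check events for probe kills or node pressure"
    if exit_code == 143:
        return "SIGTERM: asked to stop; check liveness probe failures and evictions"
    if exit_code == 0:
        return "clean exit: process finished; confirm it is meant to be long-running"
    return "app error: read the previous container's logs, then config and secrets"


def last_terminations(pod: dict) -> List[Tuple[str, Optional[int], str]]:
    """Return (container name, exit code, hint) for each recorded termination."""
    status = pod.get("status", {})
    results = []
    # Init containers are included because a failing one also blocks the pod.
    for key in ("initContainerStatuses", "containerStatuses"):
        for cs in status.get(key, []):
            term = cs.get("lastState", {}).get("terminated")
            if term:
                code = term.get("exitCode")
                results.append((cs["name"], code, crash_hint(code, term.get("reason"))))
    return results


# Hypothetical sample data for illustration only.
sample_pod = {"status": {"containerStatuses": [
    {"name": "api", "restartCount": 7,
     "lastState": {"terminated": {"exitCode": 137, "reason": "OOMKilled"}}},
    {"name": "sidecar", "restartCount": 0, "lastState": {}},
]}}
for name, code, hint in last_terminations(sample_pod):
    print(name, code, hint)
```

Checking the `reason` before the exit code matters here: an OOM kill also reports exit code 137, and the reason field is what distinguishes it from other forced kills.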

Potential Follow-up Questions

❓How do you prevent repeated restarts?
❓What alerts would you add?
