Intermediate · Technical
5 min
Kubernetes Pod Restart Loop Troubleshooting
Kubernetes · Troubleshooting · Container Orchestration
Interview Question
A Kubernetes pod is stuck in a restart loop. Walk me through your systematic approach to diagnose and fix this issue.
Key Points to Cover
- Check pod status and events: kubectl describe pod
- Examine logs: kubectl logs pod-name --previous
- Review resource limits and requests
- Check liveness/readiness probes configuration
- Verify image availability and configuration
- Check node resource availability
- Examine security contexts and permissions
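Taken together, the checklist above maps to a short command sequence. The sketch below is a minimal example, not a prescribed runbook; the pod name, namespace, and grep patterns are hypothetical placeholders.

```bash
# Hypothetical pod and namespace for illustration.
POD=my-app-7d4b9c-xk2p1
NS=production

# 1. Pod status, restart count, last container state, and recent events
kubectl describe pod "$POD" -n "$NS"

# 2. Namespace events in chronological order (probe failures, OOMKills, image-pull errors)
kubectl get events -n "$NS" --sort-by=.lastTimestamp

# 3. Logs from the container instance that crashed, not the freshly restarted one
kubectl logs "$POD" -n "$NS" --previous

# 4. Resource limits/requests and probe configuration as actually deployed
kubectl get pod "$POD" -n "$NS" -o yaml \
  | grep -A5 -E 'resources:|livenessProbe:|readinessProbe:'

# 5. Node-level pressure and allocatable capacity on the node hosting the pod
NODE=$(kubectl get pod "$POD" -n "$NS" -o jsonpath='{.spec.nodeName}')
kubectl describe node "$NODE"
```

The ordering mirrors the checklist: cluster-reported state first, then the crash itself, then configuration, then the node.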
Evaluation Rubric
- Follows systematic troubleshooting approach (20% weight)
- Uses appropriate kubectl commands (30% weight)
- Identifies potential root causes (30% weight)
- Suggests concrete resolution steps (20% weight)
Hints
- 💡Start with kubectl describe and logs
- 💡Consider both application and infrastructure issues
Common Pitfalls to Avoid
- ⚠️Jumping to application code issues without first checking kubectl describe and events, missing obvious infrastructure problems like resource limits or failed health checks
- ⚠️Forgetting the --previous flag with kubectl logs; without it, you see output from the freshly restarted container rather than from the crashed instance that caused the loop
- ⚠️Not checking liveness/readiness probe configurations early - aggressive probes are a very common cause of restart loops that look like application failures
- ⚠️Overlooking node-level resource constraints like disk pressure, memory pressure, or insufficient allocatable resources that prevent pods from running
- ⚠️Failing to verify the complete dependency chain including init containers, config maps, secrets, and persistent volume mounts that the pod requires
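A few targeted commands guard against these pitfalls; as before, the pod name is a hypothetical placeholder.

```bash
POD=my-app-7d4b9c-xk2p1   # hypothetical pod name

# Crashed-instance logs vs. current-instance logs: only --previous shows the crash.
kubectl logs "$POD" --previous
kubectl logs "$POD"

# Probe timing: a too-small initialDelaySeconds or failureThreshold can kill a
# slow-starting app before it ever becomes healthy.
kubectl get pod "$POD" -o jsonpath='{.spec.containers[0].livenessProbe}'

# Node conditions: MemoryPressure, DiskPressure, and PIDPressure should be False.
NODE=$(kubectl get pod "$POD" -o jsonpath='{.spec.nodeName}')
kubectl get node "$NODE" \
  -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'

# Dependency chain: init containers and mounted ConfigMaps/Secrets/volumes.
kubectl get pod "$POD" -o jsonpath='{.spec.initContainers[*].name}'
kubectl get pod "$POD" -o jsonpath='{.spec.volumes}'
```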
Potential Follow-up Questions
- ❓How would you prevent this in the future?
- ❓What monitoring would you implement?
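For the monitoring follow-up, one starting point is to alert on container restart counters (for example, kube-state-metrics exposes kube_pod_container_status_restarts_total to Prometheus); ad hoc, the same signal can be pulled with kubectl. The threshold below is an arbitrary example.

```bash
# Surface pods whose first container has restarted more than 3 times.
# ($3+0 coerces non-numeric values like <none> to 0 so they are skipped.)
kubectl get pods --all-namespaces --no-headers \
  -o custom-columns='NS:.metadata.namespace,POD:.metadata.name,RESTARTS:.status.containerStatuses[0].restartCount' \
  | awk '$3+0 > 3'
```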