Container Image Pulls Throttled by Registry

Kubernetes, Containers, Supply Chain, Reliability

Interview Question

New pods are failing with ImagePullBackOff and registry logs show rate limiting/throttling. How do you restore service quickly and prevent recurrence?

Key Points to Cover

Confirm registry throttling from pod events (for example HTTP 429 responses) and registry metrics/status
Warm nodes with pre-pulled images, or set imagePullPolicy: IfNotPresent so nodes reuse cached images
Configure a private mirror or pull-through registry cache, and use authenticated pulls
Reduce image churn: smaller layers, less frequent tag changes, images pinned by digest (SHA)
Add backoff, retries, and alerts on pull failures
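
One way to approach the first point, confirming the cause before acting, is to sort the pull-failure messages from pod events into buckets. The sketch below is a minimal illustration, not a definitive classifier: the function names (`classify_pull_failure`, `summarize`) and the match patterns are assumptions, and a given registry may word its errors differently.

```python
import re
from collections import Counter

# Assumption: the registry signals throttling with HTTP 429 or a
# "toomanyrequests" / "rate limit" message. Adjust for your registry.
THROTTLE_PATTERNS = [
    re.compile(r"\b429\b"),
    re.compile(r"too\s?many\s?requests", re.IGNORECASE),
    re.compile(r"rate limit", re.IGNORECASE),
]
# Assumption: auth problems surface as 401/403 or "unauthorized"/"denied".
AUTH_PATTERNS = [
    re.compile(r"\b401\b|\b403\b"),
    re.compile(r"unauthorized|denied|authentication required", re.IGNORECASE),
]
# Assumption: a bad tag or repository surfaces as "manifest unknown"/"not found".
NOT_FOUND_PATTERNS = [
    re.compile(r"manifest unknown|not found", re.IGNORECASE),
]

# Checked in order; the first matching bucket wins.
BUCKETS = (
    ("throttled", THROTTLE_PATTERNS),
    ("auth", AUTH_PATTERNS),
    ("not_found", NOT_FOUND_PATTERNS),
)


def classify_pull_failure(message: str) -> str:
    """Return 'throttled', 'auth', 'not_found', or 'other' for one event message."""
    for label, patterns in BUCKETS:
        if any(p.search(message) for p in patterns):
            return label
    return "other"


def summarize(messages) -> Counter:
    """Count how many event messages fall into each bucket."""
    return Counter(classify_pull_failure(m) for m in messages)


if __name__ == "__main__":
    # Hypothetical messages, shaped like those shown by `kubectl describe pod`.
    sample = [
        "Failed to pull image: toomanyrequests: You have reached your pull rate limit.",
        "Failed to pull image: unexpected status code 429 Too Many Requests",
        "Failed to pull image: pull access denied, repository may require authorization",
    ]
    print(summarize(sample))
```

If most failures land in the `throttled` bucket, mirrors and pre-pulling are the right levers; a large `auth` or `not_found` count points to pull secrets or image references instead.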

Evaluation Rubric

Confirms throttling root cause (30% weight)

Restores service via mirrors/caching/prepull (30% weight)

Reduces image churn/size/policy issues (20% weight)

Adds monitoring and quotas/auth (20% weight)

Hints

💡Consider node DaemonSets to pre-pull images.
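
A pre-pull DaemonSet can be sketched as follows. This is a minimal illustration under stated assumptions, not a production manifest: the helper name `prepull_daemonset`, the DaemonSet name, and the example image reference are hypothetical, each pre-pulled image is assumed to ship a shell so the no-op command can run, and `registry.k8s.io/pause:3.9` is used as an assumed idle container to keep the pod alive. The output is JSON, which `kubectl apply -f -` accepts.

```python
import json


def prepull_daemonset(name, images, pause_image="registry.k8s.io/pause:3.9"):
    """Build a DaemonSet manifest (as a dict) that pulls `images` on every node.

    Each image becomes an init container that exits immediately; the kubelet
    has to pull the image to start it, which leaves it in the node's cache.
    """
    init_containers = [
        {
            "name": f"prepull-{i}",
            "image": image,
            # Skip the pull when the node already has the image.
            "imagePullPolicy": "IfNotPresent",
            # Assumption: the image contains `sh`; images without a shell
            # need a different no-op command.
            "command": ["sh", "-c", "true"],
        }
        for i, image in enumerate(images)
    ]
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "initContainers": init_containers,
                    # Idle container so the pod stays Running after the pulls.
                    "containers": [{"name": "pause", "image": pause_image}],
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference; replace with the images your workloads use.
    manifest = prepull_daemonset(
        "image-prepull", ["registry.example.com/team/api:1.4.2"]
    )
    print(json.dumps(manifest, indent=2))
```

Pre-pulling through a mirror or cache rather than directly from the throttled registry avoids adding one more pull per node to the load that caused the problem.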

Common Pitfalls to Avoid

⚠️Failing to confirm the specific cause of `ImagePullBackOff` and assuming it's always rate limiting.
⚠️Only addressing the immediate symptom without implementing a long-term caching or mirroring solution.
⚠️Forgetting to configure proper authentication for private registries/mirrors.
⚠️Not considering the scale of image pulls and potential for optimization.
⚠️Neglecting to set up proactive monitoring and alerting for future occurrences.

Potential Follow-up Questions

❓How do you mirror public images securely?
❓What KPIs would you track for pull health?
