Intermediate · Scenario · 10 min

Cloud API Throttling (429) Causing Failures

Cloud · Rate Limiting · Reliability
Interview Question

Background jobs calling a cloud provider API start failing with 429 Too Many Requests during peak hours. How do you stabilize the system now and prevent the throttling from recurring?

Key Points to Cover
  • Measure request rate vs provider quotas; identify burst patterns
  • Implement client-side backoff/jitter and respect Retry-After (see the sketch after this list)
  • Shard credentials/projects or request quotas where appropriate
  • Batch requests and add concurrency controls
  • Add dashboards/alerts on 429 rates and latency
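
A minimal sketch of the backoff/jitter point above, assuming Python with the `requests` library; the function name and URL are illustrative, not a specific provider's SDK. It retries only on 429, prefers the provider's Retry-After header when one is sent (this sketch assumes the delay-seconds form, not the HTTP-date form), and otherwise sleeps a random "full jitter" delay up to an exponential cap:

```python
import random
import time

import requests

def call_with_backoff(url, max_attempts=6, base_delay=1.0, max_delay=60.0):
    """GET `url`, retrying 429s with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        resp = requests.get(url, timeout=10)
        if resp.status_code != 429:
            resp.raise_for_status()  # surface non-throttling errors immediately
            return resp
        retry_after = resp.headers.get("Retry-After")
        if retry_after is not None:
            # Assumption: Retry-After is given in seconds (it can also be an HTTP date).
            delay = float(retry_after)
        else:
            # Full jitter: a random delay up to the exponential cap,
            # which desynchronizes retries across workers.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        time.sleep(delay)
    raise RuntimeError(f"still throttled after {max_attempts} attempts: {url}")
```

Full jitter (rather than a fixed exponential schedule) is what prevents a fleet of background jobs from retrying in lockstep and re-creating the original burst.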
Evaluation Rubric
  • Quantifies usage vs quotas (30% weight)
  • Implements robust retry/backoff (30% weight)
  • Batches/shards to reduce bursts (20% weight)
  • Monitors 429s and adjusts proactively (20% weight)
Hints
  • 💡 Prefer token-bucket-style client throttling (a minimal sketch follows).
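
One way a client-side token bucket might look; the class and parameters below are illustrative, not a particular library's API. Workers call `acquire()` before each request, so sustained throughput is capped at `rate` requests per second while short bursts up to `capacity` are still allowed:

```python
import threading
import time

class TokenBucket:
    """Minimal token-bucket throttle: refills `rate` tokens/sec, bursts to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, tokens: float = 1.0) -> None:
        """Block until `tokens` are available, then consume them."""
        while True:
            with self.lock:
                now = time.monotonic()
                # Refill proportionally to elapsed time, capped at capacity.
                self.tokens = min(self.capacity,
                                  self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= tokens:
                    self.tokens -= tokens
                    return
                wait = (tokens - self.tokens) / self.rate
            time.sleep(wait)

# Usage (illustrative numbers): cap jobs at ~10 req/s with bursts of 20.
bucket = TokenBucket(rate=10, capacity=20)
# bucket.acquire()  # call before each API request
```

Throttling proactively on the client keeps you under quota in the first place, so backoff becomes the exception path rather than the steady state.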
Common Pitfalls to Avoid
  • ⚠️ Not implementing exponential backoff, leading to immediate retries that exacerbate the problem.
  • ⚠️ Forgetting to add jitter to backoff strategies, resulting in synchronized retries (thundering herd).
  • ⚠️ Ignoring or not correctly implementing the 'Retry-After' header provided by the API.
  • ⚠️ Assuming a simple retry is sufficient without understanding the underlying burst patterns or quota limits.
  • ⚠️ Focusing solely on client-side fixes without exploring upstream solutions like quota increases or workload optimization.
Potential Follow-up Questions
  • When should you request a quota increase rather than optimize the client?
  • How would you test client behavior under quota constraints? (See the sketch below.)
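
For the testing follow-up, one hedged approach is to stub the HTTP layer so the provider "throttles" deterministically. The sketch below assumes the illustrative `call_with_backoff` helper from earlier and uses the standard library's `unittest.mock` to simulate two 429 responses followed by a success:

```python
from unittest import mock

# call_with_backoff is the illustrative helper sketched earlier in this page.

def make_response(status, headers=None):
    resp = mock.Mock()
    resp.status_code = status
    resp.headers = headers or {}
    return resp

def test_retries_through_transient_throttling():
    # Two throttled responses, then success; Retry-After of 0 keeps the test fast.
    responses = [
        make_response(429, {"Retry-After": "0"}),
        make_response(429, {"Retry-After": "0"}),
        make_response(200),
    ]
    with mock.patch("requests.get", side_effect=responses):
        resp = call_with_backoff("https://api.example.com/v1/jobs")
    assert resp.status_code == 200
```

The same idea scales up to a local fake server that enforces a real token-bucket quota, which lets you rehearse peak-hour bursts without spending production quota.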