Advanced · System-Design
45 min
Design a Global Distributed Rate Limiter
System Design, Caching, Consistency, Networking
Interview Question
Design a globally distributed rate limiter for multi-region APIs supporting per-user, per-IP, and per-endpoint quotas.
Key Points to Cover
- Algorithms: token bucket, leaky bucket, sliding window
- Data plane: co-located sidecars/proxies; control plane for policy
- Storage: Redis/Memcached with sharding and replication; hot key mitigation
- Correctness: eventual vs strong consistency; clock-skew tolerance
- Failure modes: partial outages, clock drift, retries/backoff
- Observability: per-key metrics, audit trails, dashboards
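As an illustration of the first key point, here is a minimal single-process token bucket sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example; a distributed version would keep the same state (token count and last refill time) in the shared store rather than in memory.

```python
import time


class TokenBucket:
    """In-memory token bucket: holds up to `capacity` tokens, refilled at
    `refill_rate` tokens per second. Illustrative sketch only."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_rate = float(refill_rate)
        self.clock = clock  # injectable so tests can control time
        self.tokens = self.capacity  # start with a full bucket
        self.last_refill = self.clock()

    def allow(self, cost=1.0):
        """Return True and spend `cost` tokens if enough are available."""
        now = self.clock()
        # Guard against a clock that appears to move backwards.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The capacity sets the allowed burst size and the refill rate sets the sustained rate, which is why token bucket is often preferred when short bursts are acceptable.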
Evaluation Rubric
- Chooses and applies rate-limit algorithms (25% weight)
- Designs scalable/consistent state storage (25% weight)
- Handles failures and multi-region issues (25% weight)
- Monitoring, audit, and safety levers (25% weight)
Hints
- 💡Consider sticky routing vs global coordination.
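One way to explore the sticky-routing side of this hint is a consistent hash ring that maps each rate-limit key to a single owning region or node, so its counter is updated in one place without global coordination. The sketch below is an assumption-laden illustration: `HashRing`, the region names, and the virtual-node count are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes. Illustrative sketch only."""

    def __init__(self, nodes, vnodes=64):
        ring = []
        for node in nodes:
            # Each node gets several points on the ring to even out load.
            for i in range(vnodes):
                ring.append((self._hash(f"{node}#{i}"), node))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def owner(self, key):
        """Return the node owning `key`: the first ring point clockwise."""
        idx = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["us-east", "eu-west", "ap-south"])  # hypothetical regions
home = ring.owner("user:42")  # every caller computes the same owner
```

The trade-off against global coordination is that requests for a key must be forwarded to its owner, which adds cross-region latency for users far from that owner; removing a node only remaps the keys that node owned.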
Common Pitfalls to Avoid
- ⚠️Attempting to achieve strong global consistency for every rate limit check, which introduces unacceptable latency and complexity.
- ⚠️Ignoring the critical need for hot key mitigation, leading to single points of contention and performance bottlenecks in the distributed state store.
- ⚠️Failing to design a dedicated control plane for policy management, making rate limit rule updates and deployments cumbersome and error-prone.
- ⚠️Not adequately addressing cross-region latency for synchronous operations, which can severely degrade API performance for users far from the primary region.
- ⚠️Overlooking graceful degradation strategies, so that a rate limiter outage either blocks all traffic (fail-closed) or permits uncontrolled abuse (fail-open with no fallback limits).
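The graceful-degradation pitfall can be made concrete with a small sketch: consult the shared store first, and if it is unreachable, fall back to a conservative in-process limit instead of failing fully open or fully closed. Everything named here (`StoreUnavailable`, `LocalFixedWindow`, `check_with_fallback`, and the `remote_check` callable) is hypothetical and stands in for whatever client the real system uses.

```python
import time


class StoreUnavailable(Exception):
    """Raised by the (hypothetical) remote check when the store is down."""


class LocalFixedWindow:
    """Per-process fixed-window counter used only as a fallback."""

    def __init__(self, limit, window_s, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock
        self._window = None
        self._counts = {}

    def allow(self, key):
        window = int(self.clock() // self.window_s)
        if window != self._window:
            # A new window started: drop all counts from the previous one.
            self._window = window
            self._counts = {}
        count = self._counts.get(key, 0)
        if count >= self.limit:
            return False
        self._counts[key] = count + 1
        return True


def check_with_fallback(key, remote_check, local):
    """Prefer the shared decision; degrade to the local limit on outage."""
    try:
        return remote_check(key)
    except StoreUnavailable:
        return local.allow(key)
```

Because each process enforces the fallback independently, the effective global limit during an outage is roughly the local limit multiplied by the number of processes, so the local limit is usually set well below the normal quota.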
Potential Follow-up Questions
- ❓How do you prevent hot keys?
- ❓Where do you enforce per-tenant quotas?
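For the hot-key follow-up, one possible answer is to split a hot counter into several sub-keys so that writes spread across shards of the store. The sketch below uses a plain dict as a stand-in for the shared store; `ShardedCounter`, the shard count, and the `key:index` naming scheme are assumptions for illustration, and windowing/expiry is omitted for brevity.

```python
import random


class ShardedCounter:
    """Splits one logical limit across `shards` sub-keys.
    Illustrative sketch; a dict stands in for Redis or a similar store."""

    def __init__(self, limit, shards=8, rng=None):
        self.shards = shards
        # Each sub-key gets an equal slice of the overall limit.
        self.per_shard_limit = limit // shards
        self.store = {}
        self.rng = rng or random.Random()

    def allow(self, key):
        # Pick a random sub-key so no single store key absorbs all writes.
        shard_key = f"{key}:{self.rng.randrange(self.shards)}"
        count = self.store.get(shard_key, 0)
        if count >= self.per_shard_limit:
            return False
        self.store[shard_key] = count + 1
        return True


limiter = ShardedCounter(limit=80, shards=8, rng=random.Random(7))
admitted = sum(limiter.allow("endpoint:/search") for _ in range(1000))
```

This never admits more than the overall limit, but it can reject a request early when the randomly chosen sub-key is full while others still have room. That loss of precision is the price of removing the single point of contention.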