Intermediate Scenario
10 min
Clients Using Stale DNS Cache
DNS, Networking, Release Engineering
Interview Question
Canary rollout passed, but a portion of clients hit decommissioned IPs due to stale DNS caches. What do you do?
Key Points to Cover
- Confirm stale cache via TTLs and client resolver behavior
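- Lower TTLs ahead of migrations, at least one full old-TTL period before cutover; keep old and new endpoints serving in an overlapping window
- Keep a listener on the old IPs that drains gracefully and redirects clients to the new endpoints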
- Coordinate with ISPs/CDNs; push cache purges where possible
- Add synthetic checks from diverse networks to detect staleness
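The TTL and overlap points above can be turned into a rough planning calculation. The sketch below is illustrative only: the function name `earliest_safe_decommission`, the `resolver_ttl_floor_seconds` parameter (a guess at how long a misbehaving resolver might hold a record), and the `safety_factor` default are assumptions, not values from any standard. The TTL passed in should be the one that was being served before the change, since that is what caches may still hold.

```python
from datetime import datetime, timedelta, timezone


def earliest_safe_decommission(
    cutover_at: datetime,
    old_ttl_seconds: int,
    resolver_ttl_floor_seconds: int = 0,  # assumed worst-case resolver minimum TTL
    safety_factor: float = 2.0,           # assumed margin, tune to your risk appetite
) -> datetime:
    """Estimate the earliest time the old IPs could stop serving traffic."""
    if old_ttl_seconds < 0 or resolver_ttl_floor_seconds < 0:
        raise ValueError("TTL values must be non-negative")
    if safety_factor < 1.0:
        raise ValueError("safety_factor must be at least 1.0")
    # A cache may hold the old answer for whichever is longer: the published
    # TTL or the floor a resolver imposes on its own.
    worst_case_cache = max(old_ttl_seconds, resolver_ttl_floor_seconds)
    return cutover_at + timedelta(seconds=worst_case_cache * safety_factor)


if __name__ == "__main__":
    cutover = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    print(earliest_safe_decommission(cutover, old_ttl_seconds=300,
                                     resolver_ttl_floor_seconds=3600))
```

A calculation like this only gives a lower bound; traffic on the old IPs should still be observed reaching near zero before they are removed.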
Evaluation Rubric
Confirms staleness and scope: 30% weight
Uses TTL planning/overlap strategies: 30% weight
Keeps old endpoints usable safely: 20% weight
Monitors/alerts for stale DNS hits: 20% weight
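As one way to picture the monitoring criterion, the sketch below takes resolution results that synthetic probes have already collected from different networks and reports which vantage points still return decommissioned addresses. The `ProbeResult` type, the function names, and the example addresses (from the documentation ranges 192.0.2.0/24 and 198.51.100.0/24) are hypothetical; how the probes actually gather answers is left out.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeResult:
    vantage: str       # label for the probing network, e.g. "mobile-carrier-a"
    addresses: tuple   # IP strings the resolver at that vantage returned


def stale_vantages(results, decommissioned):
    """Map each vantage point to the decommissioned IPs it still resolves to."""
    retired = {ipaddress.ip_address(ip) for ip in decommissioned}
    stale = {}
    for result in results:
        hits = [a for a in result.addresses if ipaddress.ip_address(a) in retired]
        if hits:
            stale[result.vantage] = hits
    return stale


def stale_ratio(results, decommissioned):
    """Fraction of vantage points that returned at least one retired IP."""
    if not results:
        return 0.0
    return len(stale_vantages(results, decommissioned)) / len(results)


if __name__ == "__main__":
    probes = [
        ProbeResult("office", ("198.51.100.10",)),
        ProbeResult("mobile-carrier-a", ("192.0.2.10",)),
        ProbeResult("cloud-region-b", ("198.51.100.10", "198.51.100.11")),
    ]
    print(stale_vantages(probes, ["192.0.2.10", "192.0.2.11"]))
```

An alert could fire when `stale_ratio` stays above a chosen threshold after the expected cache expiry; the threshold itself is a judgment call.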
Hints
- 💡 Mobile carrier resolvers often cache aggressively and may hold records longer than the published TTL.
Common Pitfalls to Avoid
- ⚠️ Not verifying the root cause and jumping straight to fixes.
- ⚠️ Focusing only on the technical DNS fix without considering client-side caching.
- ⚠️ Reverting DNS without ensuring the old IPs are still available for a grace period.
- ⚠️ Not considering the impact of different network architectures and DNS resolvers.
- ⚠️ Failing to implement long-term preventative measures and relying on reactive solutions.
Potential Follow-up Questions
- ❓ How do you validate TTL compliance?
- ❓ When would you use SRV or ALIAS records?
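For the TTL compliance question, one possible starting point is to sample the remaining TTL a resolver reports for a record over time and compare it with the authoritative TTL. The sketch below assumes those samples were already collected (for example with a DNS lookup tool) and only checks the simplest signal: a reported TTL higher than the authoritative one. The function name and sample format are hypothetical, and resolvers that extend cache lifetime without showing it in the reported TTL would not be caught this way.

```python
def ttl_violations(authoritative_ttl, samples):
    """Return samples whose reported TTL exceeds the authoritative TTL.

    samples: iterable of (elapsed_seconds, observed_ttl) pairs taken from
    one resolver for one record.
    """
    if authoritative_ttl < 0:
        raise ValueError("authoritative_ttl must be non-negative")
    violations = []
    for elapsed, observed in samples:
        # A compliant cache should never report more remaining lifetime
        # than the authoritative server handed out.
        if observed > authoritative_ttl:
            violations.append((elapsed, observed))
    return violations


if __name__ == "__main__":
    # Third sample suggests the resolver applies its own, larger minimum TTL.
    print(ttl_violations(60, [(0, 60), (10, 50), (20, 300)]))
```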