Intermediate · Scenario
10 min

Service Timeout Errors

Interview Question

Multiple services are failing with timeout errors when calling an internal API. How do you approach debugging?

Key Points to Cover
  • Check API latency, error rates, and load
  • Inspect service-to-service networking and DNS
  • Validate connection pools and retries
  • Check upstream health and rate limits
  • Isolate failing region or data center if needed
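One way to make the networking and DNS check concrete is to time name resolution and the TCP connect separately, so a slow resolver is not mistaken for a slow API. The following is a minimal sketch using only the Python standard library; the function name `probe` and the result keys are illustrative assumptions, not part of any particular tool.

```python
import socket
import time


def probe(host, port, timeout=2.0):
    """Time DNS resolution and TCP connect separately for one endpoint."""
    result = {"dns_ms": None, "connect_ms": None, "error": None}

    # Phase 1: name resolution. A failure or long delay here points at DNS.
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"dns: {exc}"
        return result
    result["dns_ms"] = (time.monotonic() - start) * 1000

    # Phase 2: TCP connect to the first resolved address. A failure here
    # points at routing, firewalls, or a listener that is down or saturated.
    family, socktype, proto, _, sockaddr = infos[0]
    start = time.monotonic()
    try:
        with socket.socket(family, socktype, proto) as sock:
            sock.settimeout(timeout)
            sock.connect(sockaddr)
    except OSError as exc:  # includes timeouts and connection refused
        result["error"] = f"connect: {exc}"
        return result
    result["connect_ms"] = (time.monotonic() - start) * 1000
    return result
```

Running `probe("internal-api.example", 443)` (a hypothetical hostname) from several failing clients and comparing `dns_ms` with `connect_ms` can show quickly whether the delay comes before or after name resolution.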
Evaluation Rubric
Uses metrics to confirm timeouts (30% weight)
Checks network/DNS connectivity (30% weight)
Considers connection pool exhaustion (20% weight)
Localizes to region/DC issues (20% weight)
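To illustrate confirming timeouts with metrics and localizing them to a region, the sketch below groups call samples by region and reports the timeout rate and p95 latency for each. The input shape (pairs of region name and latency in milliseconds) and the function name `summarize_by_region` are assumptions for illustration; in practice these numbers would normally come from a metrics system rather than raw samples.

```python
from collections import defaultdict


def summarize_by_region(samples, timeout_ms):
    """Group (region, latency_ms) samples; report calls, timeout rate, p95."""
    by_region = defaultdict(list)
    for region, latency_ms in samples:
        by_region[region].append(latency_ms)

    summary = {}
    for region, latencies in by_region.items():
        latencies.sort()
        n = len(latencies)
        # Nearest-rank p95: ceil(0.95 * n), computed with integer arithmetic.
        rank = (95 * n + 99) // 100
        # Assumption: a call that reached the timeout is recorded with a
        # latency greater than or equal to the timeout value.
        timed_out = sum(1 for ms in latencies if ms >= timeout_ms)
        summary[region] = {
            "calls": n,
            "timeout_rate": timed_out / n,
            "p95_ms": latencies[rank - 1],
        }
    return summary
```

A region whose timeout rate stands far above the others is a candidate for isolation, while a uniform rise across all regions suggests a shared dependency instead.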
Hints
  • 💡Timeouts can hide dependency failures: the API may be slow only because something it depends on is slow or down.
Common Pitfalls to Avoid
  • ⚠️Focusing solely on the API without considering client-side retry logic.
  • ⚠️Ignoring network connectivity and DNS as potential culprits.
  • ⚠️Jumping to code changes without sufficient data gathering and analysis.
  • ⚠️Not correlating timeout events with infrastructure resource utilization.
  • ⚠️Failing to consider the impact of large or malformed request/response payloads.
Potential Follow-up Questions
  • How do you set timeout budgets?
  • How do you prevent retry storms?
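One common way to approach both follow-ups together is to give each request a single overall deadline, pass the remaining time down to every attempt, and space retries with jittered exponential backoff so that many clients do not retry in lockstep. The sketch below is one possible shape under those assumptions; `call_with_budget`, its parameters, and the convention that `op` accepts the remaining time in seconds are all hypothetical.

```python
import random
import time


def call_with_budget(op, budget_s, max_attempts=3, base_delay_s=0.1,
                     clock=time.monotonic, sleep=time.sleep,
                     rng=random.random):
    """Retry op within one overall deadline, using full-jitter backoff."""
    deadline = clock() + budget_s
    last_error = None
    for attempt in range(max_attempts):
        remaining = deadline - clock()
        if remaining <= 0:
            break  # budget spent: stop instead of adding more load
        try:
            # Pass the remaining budget down so the callee can use it
            # as its own timeout.
            return op(remaining)
        except TimeoutError as exc:
            last_error = exc
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        # Full jitter: wait a random time up to an exponentially growing
        # cap, and never sleep past the deadline.
        cap = base_delay_s * (2 ** attempt)
        delay = min(rng() * cap, max(0.0, deadline - clock()))
        sleep(delay)
    raise TimeoutError("timeout budget exhausted") from last_error
```

The bounded attempt count and the shared deadline limit how much extra traffic a struggling API receives, and the `clock`, `sleep`, and `rng` parameters exist only so the behavior can be tested deterministically.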