Observability
All interview questions related to Observability
What is distributed tracing, and why is it important in microservices architectures?
Discuss the advantages and disadvantages of adopting a service mesh (e.g., Istio, Linkerd) in production.
Explain the roles of metrics, logs, and traces in observability, and how they complement each other.
Describe how you would leverage eBPF for deep observability and runtime security in production Linux systems.
Explain how you would design and implement distributed tracing in a microservices environment. How do you ensure minimal performance overhead?
Design a monitoring and alerting system for a microservices architecture running on Kubernetes. Consider metrics, logs, traces, and alerting.
Design a judge/sandbox to safely compile and run untrusted code in multiple languages with resource limits and scaling.
Design a horizontally scalable time-series database for metrics with high-cardinality support, rollups, and retention policies.
Production dashboards show a sharp increase in HTTP 5xx responses from the web tier over the last 10 minutes, but traffic volume is normal. Describe your step-by-step triage and remediation.
Tell me about a time you had to decide quickly whether to roll back a deployment. What signals guided your decision?