Networking
All interview questions related to Networking
In CIDR notation, what does a /24 signify and roughly how many usable IPv4 addresses are in such a subnet?
Explain the meaning of HTTP status codes 200, 301, 403, 404, and 500.
What are the key differences between TCP and UDP, and when would you use one over the other?
Explain the difference between Layer 4 and Layer 7 load balancing. Provide examples of when each would be used.
Walk me through the steps when you type a URL into your browser and press enter.
What is the difference between SSL and TLS, and why is TLS preferred today?
Why are SSH key pairs generally preferred over passwords for server access?
What is the difference between ping and traceroute, and when would you use each?
What is HTTP keep-alive and why does connection pooling improve performance?
Briefly explain the ClusterIP, NodePort, and LoadBalancer Service types in Kubernetes and when to use each.
What is the difference between a subnet and a VLAN, and how do they relate in typical architectures?
What is the difference between NAT and PAT, and where would you use each?
Briefly describe the TLS handshake steps and how the client verifies the server.
What problem does a CDN solve and how does it improve user-perceived performance?
What is the difference between an A record and a CNAME record in DNS, and when would you use each?
What is the difference between Kubernetes Ingress and Service, and when would you use each?
What is DNS propagation, and why can DNS record changes take time to be visible globally?
Your workloads face intermittent connectivity failures across regions. Walk through your diagnostic and remediation approach.
Discuss the advantages and disadvantages of adopting a service mesh (e.g., Istio, Linkerd) in production.
Explain different approaches to service discovery in microservices and their trade-offs.
What considerations would you make when scaling an API gateway for millions of requests per second?
What are the main challenges of hybrid cloud networking, and how would you address them?
Contrast WebSockets and gRPC streaming for real-time communication. How do you scale and secure each?
A critical API has intermittent p99 latency spikes without increased error rates. How would you isolate the cause and stabilize tail latency?
Design a globally available URL shortener like TinyURL/Bitly. Cover API design, key generation, storage, redirects, analytics, abuse prevention, and scalability.
Design a globally distributed rate limiter for multi-region APIs supporting per-user, per-IP, and per-endpoint quotas.
Design a web/mobile chat system with 1:1 and group chats, typing indicators, presence, read receipts, and offline support.
Design a Google Docs–style collaborative editor with real-time edits, offline support, and conflict resolution.
Design a multi-tenant API gateway that handles routing, auth, rate limiting, request/response transformations, canarying, and observability across regions.
Design a distributed cache that supports eviction policies, consistency across nodes, replication, and client-side failover.
Design a global CDN for static and dynamic content delivery. Cover cache invalidation, SSL termination, and DDoS protection.
Design a scalable video conferencing service like Zoom with low latency, adaptive quality, and security.
Design a multiplayer online gaming platform with matchmaking, anti-cheat, and real-time state sync.
Your production website is suddenly showing SSL errors for users. How do you troubleshoot and fix this?
Your services suddenly cannot resolve domain names, breaking connectivity to dependencies. Walk me through your triage.
Users in one region report very slow page loads, but the rest of the world is fine. How do you troubleshoot this CDN performance issue?
Half your nodes cannot communicate with the other half due to a suspected network partition. How do you investigate and respond?
Multiple services are failing with timeout errors when calling an internal API. How do you approach debugging?
After a certificate rotation, services in the mesh begin failing with 503s. How do you diagnose and restore traffic?
A canary rollout passed, but a portion of clients are hitting decommissioned IPs because of stale DNS caches. What do you do?
After tightening TLS settings, some clients fail during handshake. How do you triage and restore compatibility without weakening security?
Internal and external clients see different DNS answers, causing failures. How do you debug and fix split-horizon issues?
Clients receive 431 Request Header Fields Too Large errors. Walk me through how you identify the cause and remediate it.