Advanced | System-Design
45 min

Design a URL Shortener (TinyURL)

System Design, Databases, Caching, Networking, Security
Interview Question

Design a globally available URL shortener like TinyURL/Bitly. Cover API design, key generation, storage, redirects, analytics, abuse prevention, and scalability.

Key Points to Cover
  • API: create/resolve endpoints, rate limits, auth for premium features
  • Key generation: base62 IDs, collision avoidance, snowflake/KSUID, custom aliases
  • Storage: KV store (hot) + relational/log for analytics; TTL/archival
  • Performance: CDN edge redirects, caching, read-heavy optimization
  • Reliability: multi-region replication, eventual consistency trade-offs
  • Abuse: spam/phishing detection, domain allow/deny lists, quotas
  • Analytics: click tracking, unique visitors, geo/UA aggregation
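
The base62 ID approach from the key-generation point can be sketched as follows. This is a minimal illustration, assuming a unique integer ID is already available (for example from a database sequence or a snowflake-style generator); the function names and the alphabet ordering are hypothetical choices, not part of the question.

```python
import string

# Assumed alphabet ordering: digits, then lowercase, then uppercase (62 symbols).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode_base62(n: int) -> str:
    """Turn a non-negative integer ID into a base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_base62(code: str) -> int:
    """Turn a base62 short code back into its integer ID."""
    n = 0
    for ch in code:
        # str.index raises ValueError for characters outside the alphabet.
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because each unique ID maps to exactly one code, collisions are avoided as long as the ID source itself never repeats. A 7-character code covers 62^7 (about 3.5 trillion) IDs.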
Evaluation Rubric
  • Clear API and key/ID strategy (25% weight)
  • Efficient storage and caching plan (25% weight)
  • Global scale, replication, and latency (25% weight)
  • Abuse prevention and analytics (25% weight)
Hints
  • 💡 Consider serving the read path (redirects) at the CDN edge and handling the write path (link creation) in regional data centers.
Common Pitfalls to Avoid
  • ⚠️**Single Point of Failure in Key Generation/Database:** Relying on a single, non-distributed mechanism for generating unique short codes or storing the core mapping without adequate replication/sharding.
  • ⚠️**Synchronous Analytics Processing:** Performing click logging and analytics processing in the critical redirect path, leading to high latency and scalability issues under heavy load.
  • ⚠️**Insufficient Abuse Prevention:** Neglecting robust measures such as URL blacklisting, content scanning, and rate limiting, which opens the service to spam, phishing, and malware and degrades trust and reputation.
  • ⚠️**Inadequate Storage for High-Volume Reads:** Using a traditional relational database for the `shortCode` -> `longUrl` mapping without proper caching or a dedicated high-throughput KV store, leading to performance bottlenecks during high redirect volumes.
  • ⚠️**Neglecting Global Distribution & Latency:** Failing to design for multi-region deployment, global load balancing, and data replication, resulting in high latency for users geographically distant from the primary data center.
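
One way to avoid the synchronous-analytics and uncached-read pitfalls above is to serve redirects from a cache and hand click events to a background worker. The sketch below is illustrative only: the in-memory dictionaries stand in for a real KV store and cache, the in-process queue stands in for a message log, and all names (`STORE`, `CACHE`, `handle_redirect`, `analytics_worker`) are hypothetical.

```python
import queue
import threading
import time

# Stand-ins for real infrastructure (assumptions for this sketch).
STORE = {"abc123": "https://example.com/some/long/path"}  # durable KV store
CACHE = {}                                                 # hot cache
CLICK_EVENTS = queue.Queue(maxsize=10_000)                 # event buffer
CLICK_COUNTS = {}                                          # aggregated analytics


def resolve(short_code):
    """Cache-aside lookup: try the cache, fall back to the store, then fill the cache."""
    long_url = CACHE.get(short_code)
    if long_url is None:
        long_url = STORE.get(short_code)
        if long_url is not None:
            CACHE[short_code] = long_url
    return long_url


def handle_redirect(short_code, user_agent=""):
    """Return (status, location); analytics work stays off the critical path."""
    long_url = resolve(short_code)
    if long_url is None:
        return 404, None
    event = {"code": short_code, "ua": user_agent, "ts": time.time()}
    try:
        # Non-blocking enqueue: the redirect never waits on analytics.
        CLICK_EVENTS.put_nowait(event)
    except queue.Full:
        pass  # Assumed policy: drop the event rather than slow the redirect.
    # 302 keeps later clicks flowing through the service so they can be counted.
    return 302, long_url


def analytics_worker():
    """Drain click events in the background; a None event stops the worker."""
    while True:
        event = CLICK_EVENTS.get()
        if event is None:
            break
        CLICK_COUNTS[event["code"]] = CLICK_COUNTS.get(event["code"], 0) + 1
```

A caller would start the worker once, for example with `threading.Thread(target=analytics_worker, daemon=True).start()`, and then call `handle_redirect` per request. Dropping events when the buffer is full trades analytics completeness for redirect latency; a durable log would be the usual choice when every click must be counted.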
Potential Follow-up Questions
  • How do you handle custom domains?
  • How would you prevent short-code (hash) enumeration?
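
As a starting point for the enumeration follow-up, one commonly discussed option is to issue random rather than sequential codes, so that neighboring codes cannot be guessed from a known one. The sketch below assumes an in-memory set as the uniqueness check, which a real system would replace with a conditional write to the store; the names `random_code` and `create_unique_code` are hypothetical.

```python
import secrets
import string

CODE_ALPHABET = string.ascii_letters + string.digits  # 62 symbols


def random_code(length: int = 8) -> str:
    """Draw each character from a cryptographically secure random source."""
    return "".join(secrets.choice(CODE_ALPHABET) for _ in range(length))


def create_unique_code(existing: set, length: int = 8, max_attempts: int = 5) -> str:
    """Retry on the rare collision; `existing` stands in for the real store."""
    for _ in range(max_attempts):
        code = random_code(length)
        if code not in existing:
            existing.add(code)
            return code
    raise RuntimeError("could not find a free code; consider a longer length")
```

Random codes make guessing harder but do not remove the need for rate limiting on the resolve endpoint, since an attacker can still probe the code space.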