Advanced, System-Design
45 min

Design an ML Feature Store

ML, Data, Consistency, Streaming
Interview Question

Design an ML feature store that supports offline feature engineering and online low-latency serving with consistency guarantees.

Key Points to Cover
  • Feature registry & schema/versioning; lineage and governance
  • Offline pipeline (batch) and online pipeline (stream) materialization
  • Consistency: point-in-time correctness, training/serving skew reduction
  • Storage: offline (data lake) vs online (KV/Redis) with TTL/backfills
  • Serving APIs, caching, and multi-tenant quotas/SLA
  • Monitoring: feature drift, nulls, and freshness
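
As one way to make the registry and versioning point concrete, the sketch below models a feature definition as an immutable record keyed by name and version. All names here (`FeatureDefinition`, `FeatureRegistry`, the field names, and the example feature) are hypothetical and chosen for illustration; a production registry would also persist definitions and enforce ownership and access rules.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FeatureDefinition:
    """Hypothetical registry entry: schema plus minimal lineage metadata."""
    name: str
    version: int
    entity: str       # join key, e.g. "user_id"
    dtype: str        # declared value type
    ttl_seconds: int  # how long an online value is considered fresh
    source: str       # upstream dataset or stream (lineage)


class FeatureRegistry:
    """In-memory registry; a published (name, version) pair is immutable."""

    def __init__(self) -> None:
        self._defs: Dict[Tuple[str, int], FeatureDefinition] = {}

    def register(self, definition: FeatureDefinition) -> None:
        key = (definition.name, definition.version)
        existing = self._defs.get(key)
        # Changing a published version would silently break consumers,
        # so any schema change must be registered as a new version.
        if existing is not None and existing != definition:
            raise ValueError(f"{key} already registered with a different schema")
        self._defs[key] = definition

    def get(self, name: str, version: Optional[int] = None) -> FeatureDefinition:
        if version is not None:
            return self._defs[(name, version)]
        versions = [v for (n, v) in self._defs if n == name]
        if not versions:
            raise KeyError(name)
        # Default to the latest version when the caller does not pin one.
        return self._defs[(name, max(versions))]
```

Models would normally pin an explicit version at training time and request the same version at serving time, so that a new version never changes the inputs of an already deployed model.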
Evaluation Rubric
Clear registry & versioning strategy (25% weight)
Correctness and skew mitigation (25% weight)
Hot/cold storage & materialization (25% weight)
Serving SLAs and monitoring (25% weight)
Hints
  • 💡Point-in-time joins are essential for correctness.
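
A point-in-time join attaches to each training label the latest feature value whose timestamp is at or before the label's timestamp, never a later one. The sketch below shows the idea with only the standard library. The function names and the in-memory `history` layout are assumptions for illustration; a real offline store would run this as an as-of join in a batch engine.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: entity_id -> [(event_ts, value), ...] sorted by event_ts.
FeatureHistory = Dict[str, List[Tuple[int, float]]]


def point_in_time_lookup(
    history: FeatureHistory, entity_id: str, as_of_ts: int
) -> Optional[float]:
    """Return the latest value with event_ts <= as_of_ts, or None if none exists."""
    rows = history.get(entity_id, [])
    timestamps = [ts for ts, _ in rows]
    # bisect_right finds the first index whose timestamp is > as_of_ts,
    # so the row just before it is the newest one visible at as_of_ts.
    idx = bisect_right(timestamps, as_of_ts)
    return rows[idx - 1][1] if idx > 0 else None


def build_training_rows(
    labels: List[Tuple[str, int, int]], history: FeatureHistory
) -> List[Tuple[str, int, Optional[float], int]]:
    """Join each (entity_id, label_ts, label) to its point-in-time feature value."""
    return [
        (entity_id, ts, point_in_time_lookup(history, entity_id, ts), label)
        for entity_id, ts, label in labels
    ]
```

This sketch treats a feature written at exactly the label's timestamp as visible. Some designs use a strict "before" comparison instead, or subtract an expected ingestion delay, to be conservative about leakage.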
Common Pitfalls to Avoid
  • ⚠️**Data Latency Mismatches:** Insufficient buffering or delayed event processing in the streaming pipeline can leave the online store serving stale features while the offline store is already up to date.
  • ⚠️**Inconsistent Transformation Logic:** Using different codebases or versions for feature transformations in the offline and online pipelines leads to training-serving skew.
  • ⚠️**Lack of Point-in-Time Correctness in Offline Data:** Failing to reconstruct historical feature states accurately lets models train on values that were not yet available at the label's timestamp, which leaks future information into training.
  • ⚠️**Schema Drift without Versioning:** Unmanaged changes to feature schemas, made without tracking and notification, can break downstream models and pipelines.
  • ⚠️**Performance Bottlenecks in Online Serving:** Overly complex request-time transformations or inefficient retrieval from the online store can push inference latency past the serving SLA.
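
One common way to avoid the transformation-logic pitfall is to define each transformation once and call the same function from both pipelines. The sketch below assumes a hypothetical 7-day purchase-count feature. The function names, the window constant, and the list-of-timestamps input are illustrative and not taken from any particular feature store.

```python
from typing import List, Sequence, Tuple

WINDOW_SECONDS = 7 * 24 * 3600  # hypothetical 7-day window


def purchase_count_7d(event_timestamps: Sequence[int], as_of_ts: int) -> int:
    """Count events in the half-open window (as_of_ts - WINDOW_SECONDS, as_of_ts]."""
    start = as_of_ts - WINDOW_SECONDS
    return sum(1 for ts in event_timestamps if start < ts <= as_of_ts)


def offline_backfill(
    event_timestamps: Sequence[int], as_of_timestamps: Sequence[int]
) -> List[Tuple[int, int]]:
    """Batch path: recompute the feature at each historical timestamp."""
    return [(t, purchase_count_7d(event_timestamps, t)) for t in as_of_timestamps]


def online_compute(recent_event_timestamps: Sequence[int], now_ts: int) -> int:
    """Streaming/serving path: same function, evaluated at the current time."""
    return purchase_count_7d(recent_event_timestamps, now_ts)
```

Because both paths delegate to `purchase_count_7d`, a change to the window boundaries or filtering rules cannot land in one pipeline without the other. A parity test that compares offline and online outputs on the same events is a cheap additional guard.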
Potential Follow-up Questions
  • How do you deprecate a feature safely?
  • How do you backfill online features?
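
For the online backfill question, one possible answer is a timestamp-guarded upsert: a backfill job writes a value only if the store holds nothing newer, so replays are idempotent and cannot overwrite fresher values from the stream. The sketch below uses a plain dict as a stand-in for the online KV store. `upsert_if_newer` and the key layout are hypothetical, and a real store would need an atomic compare-and-set (or equivalent) to make the check and write safe under concurrency.

```python
from typing import Dict, Tuple

# Hypothetical online store: (entity_id, feature_name) -> (event_ts, value)
OnlineStore = Dict[Tuple[str, str], Tuple[int, float]]


def upsert_if_newer(
    store: OnlineStore, entity_id: str, feature: str, event_ts: int, value: float
) -> bool:
    """Write the value only if it is strictly newer than what is stored.

    Returns True if the store was updated, False if the write was skipped.
    """
    key = (entity_id, feature)
    current = store.get(key)
    # Skip stale or duplicate writes so a backfill never clobbers a fresher
    # value that the streaming pipeline has already materialized.
    if current is not None and current[0] >= event_ts:
        return False
    store[key] = (event_ts, value)
    return True
```

With this guard, the backfill and the live stream can run at the same time, and re-running a failed backfill from the start is safe.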