r/MachineLearning

How are production ML systems typically handling distribution shift over time? [D]

Mirrored from r/MachineLearning for archival readability. Support the source by reading on the original site.

In deployed ML systems, data distribution drift seems unavoidable over longer time horizons.

I’m trying to understand what approaches are commonly used in practice:

  • Continuous retraining pipelines (fixed intervals vs trigger-based)
  • Online monitoring for feature or prediction drift
  • Use of shadow models or fallback models in production
  • Human-in-the-loop review for edge cases
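To make the trigger-based retraining and feature-drift monitoring items concrete, here is a minimal sketch of one common pattern: compute the Population Stability Index (PSI) between a training-time reference sample and a recent live window, and flag a retrain when it crosses a threshold. The function names, the 10 quantile bins, and the 0.2 threshold are all assumptions for illustration (0.2 is a frequently quoted rule of thumb, not a universal constant), and a real pipeline would track many features and likely prediction drift as well.

```python
import math
import random
from bisect import bisect_right


def quantile_edges(reference, n_bins=10):
    """Interior bin edges at evenly spaced quantiles of the reference sample."""
    ordered = sorted(reference)
    return [ordered[(len(ordered) * i) // n_bins] for i in range(1, n_bins)]


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin, floored at eps so the log stays finite."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    total = len(values)
    return [max(c / total, eps) for c in counts]


def psi(reference, live, n_bins=10):
    """Population Stability Index of a live window against the reference."""
    edges = quantile_edges(reference, n_bins)
    ref = bin_fractions(reference, edges)
    cur = bin_fractions(live, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def should_retrain(reference, live, threshold=0.2):
    """Hypothetical trigger: True when drift exceeds the assumed threshold."""
    return psi(reference, live) > threshold


# Synthetic data standing in for one numeric feature.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # training-time sample
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]       # live window, no drift
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]    # live window, mean shift

print("no drift:", round(psi(reference, same), 4), should_retrain(reference, same))
print("drifted: ", round(psi(reference, shifted), 4), should_retrain(reference, shifted))
```

A fixed-interval pipeline would skip the check and retrain on a schedule; the trigger-based variant only pays the retraining cost when a check like `should_retrain` fires, at the price of having to choose bins, window sizes, and thresholds per feature.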

In most real deployments I’ve seen discussed, the retraining strategy seems to be constrained more by operations (pipelines, labeling, validation, rollout) than by the model itself.

I’m curious which approaches actually work reliably in production environments, and which tend to fail first.

submitted by /u/Electrical_Mine1912
