[Avg. reading time: 8 minutes]
Drift
Monitoring and observability in ML is about continuously checking:
- What data is coming in
- How that data is changing
- Whether the model’s predictions are still reliable
- Whether the business metrics are degrading
Three key issues:
Data Drift: Incoming feature distributions shift from what the model was trained on.
Concept Drift: The relationship between features and target changes.
Model Performance Decay: Accuracy, precision, recall, RMSE, etc. degrade over time.
Use cases
- Fraud models stop detecting new fraud patterns.
- Demand forecasting fails when consumer behavior changes.
- Recommendation systems decay as user preferences evolve.
- Healthcare/diagnosis models degrade with new demographics.
- NLP sentiment models break due to new slang or cultural shifts.
Example
Phase 1: Training distribution
- sqft mean ~1500
- bedrooms mostly 2 or 3
- house_age mostly 5–15 years
Model learns reasonable patterns.
Phase 2: Production, one year later
The neighborhood changes and new houses get built.
1. Data Drift
Example:
- sqft mean shifts from 1500 to 2300
- more 4-bedroom homes appear
- house_age shifts from ~10 years to ~2 years (new construction)
This is feature distribution drift: the model still produces predictions, but the inputs it sees now look very different from the training data.
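A common way to detect this kind of shift is a two-sample statistical test comparing the training distribution of a feature against a recent production window. Below is a minimal sketch using the two-sample Kolmogorov–Smirnov test from SciPy; the data is synthetic and the means (1500 vs. 2300 sqft) mirror the example above. The 0.05 significance cutoff is a conventional choice, not one prescribed by the text.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(42)

# Synthetic stand-ins for the example: sqft at training time vs. one year
# into production (mean shifts from ~1500 to ~2300)
train_sqft = rng.normal(loc=1500, scale=300, size=5000)
prod_sqft = rng.normal(loc=2300, scale=400, size=5000)

# Two-sample KS test: a small p-value means the two samples are unlikely
# to come from the same distribution
stat, p_value = ks_2samp(train_sqft, prod_sqft)
drifted = p_value < 0.05
print(f"KS statistic={stat:.3f}, drifted={drifted}")
```

In practice you would run a test like this per feature on a schedule (e.g. daily batches) and alert when several features drift at once.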
2. Concept Drift
Originally:
- Price increases by roughly $150 per extra sqft
After the market shift:
- Price increases by roughly $250 per extra sqft
Meaning: the mapping from features to target changed, even though features look similar.
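One hedged way to see this change in the feature-to-target mapping: fit a simple line to (sqft, price) data from each period and compare the learned slopes. The data below is synthetic, generated with the slopes from the example (~150 vs. ~250 per sqft) plus noise, so the fitted values are illustrative rather than real market numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
sqft = rng.uniform(1000, 3000, size=1000)

# Synthetic prices: ~$150 per extra sqft before the shift, ~$250 after,
# with noise added on top of a fixed base price
price_before = 50_000 + 150 * sqft + rng.normal(0, 10_000, size=1000)
price_after = 50_000 + 250 * sqft + rng.normal(0, 10_000, size=1000)

# Fit a straight line to each period; the slope is the learned
# price-per-sqft mapping, and its change signals concept drift
slope_before = np.polyfit(sqft, price_before, deg=1)[0]
slope_after = np.polyfit(sqft, price_after, deg=1)[0]
print(f"slope before ~{slope_before:.0f}, slope after ~{slope_after:.0f}")
```

The features (sqft) look the same in both periods; only the relationship to the target has moved, which is exactly what distinguishes concept drift from data drift.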
3. Model Performance Decay
You track weekly RMSE:
- Week 1: RMSE 19k
- Week 15: RMSE 25k
- Week 32: RMSE 42k
Why does it decay?
- The market changed
- New developers are building larger homes
- Inflation conditions changed
- Seasonal patterns changed
In short, the model is outdated.
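Tracking decay can be as simple as logging an error metric per period and alerting when it breaches a tolerance relative to a baseline. The sketch below uses the weekly RMSE numbers from the example; the 20% tolerance is a hypothetical choice you would tune per use case.

```python
# Weekly RMSE log (the values from the example above) and a simple
# alert rule: flag any week whose RMSE exceeds baseline by more than 20%
weekly_rmse = {1: 19_000, 15: 25_000, 32: 42_000}

baseline = weekly_rmse[1]
threshold = 1.2 * baseline  # hypothetical tolerance; tune per use case

alerts = [week for week, err in weekly_rmse.items() if err > threshold]
print(f"weeks breaching threshold: {alerts}")
```

Note that this requires ground-truth labels arriving with some delay (here, actual sale prices), which is why performance decay is often detected later than data drift.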
Data Quality Drift
Quality of incoming data begins to degrade:
- more missing values
- more zeros
- more invalid/out-of-range values
- more outliers
- schema changes
- feature suddenly becomes constant
- new categories never seen before
In practice, this is one of the most common and most important forms of drift to monitor.
Example:
“furnished”, “semi-furnished” → suddenly “fully-furnished” appears (NEW category)
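These checks are mostly straightforward set and rate comparisons against what the model saw at training time. Here is a minimal sketch with pandas; the production batch, column names, and values are invented to mirror the furnishing example above.

```python
import pandas as pd

# Categories observed at training time
train_categories = {"furnished", "semi-furnished"}

# Hypothetical production batch containing several quality problems:
# an unseen category, a missing value, and a zero sqft
prod = pd.DataFrame({
    "furnishing": ["furnished", "fully-furnished", None, "semi-furnished"],
    "sqft": [1400, 2300, 0, 1600],
})

# New, never-seen categories
new_categories = set(prod["furnishing"].dropna()) - train_categories
# Fraction of missing values per column
missing_rate = prod["furnishing"].isna().mean()
# Fraction of suspicious zeros in a column that should never be zero
zero_rate = (prod["sqft"] == 0).mean()

print(new_categories, missing_rate, zero_rate)
```

Schema checks (expected columns and dtypes) and constant-column checks fit the same pattern: compare each batch against a snapshot of the training data's profile.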
Data Freshness Drift (Latency Drift)
Data arrives:
- late
- too early
- stale
- out-of-order
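Freshness checks usually reduce to comparing a record's event timestamp against its arrival time. The function below is a hypothetical sketch (the name `freshness_status` and the 30-minute window are invented for illustration) classifying a record as fresh, stale, or too early.

```python
from datetime import datetime, timedelta, timezone

def freshness_status(event_time, received_time, max_lag=timedelta(minutes=30)):
    """Classify a record by comparing its event timestamp to arrival time."""
    lag = received_time - event_time
    if lag < timedelta(0):
        return "too-early"  # event timestamp is in the receiver's future
    if lag > max_lag:
        return "stale"      # arrived later than the allowed window
    return "fresh"

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(freshness_status(now - timedelta(hours=2), now))    # stale
print(freshness_status(now + timedelta(minutes=5), now))  # too-early
print(freshness_status(now - timedelta(minutes=5), now))  # fresh
```

Out-of-order arrival is typically detected one level up, by checking that event timestamps within a batch are monotonically non-decreasing.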
Feature Importance Drift
The ranking of feature importances changes over time:
Example:
- bedrooms used to be the strongest feature
- now an open backyard becomes dominant
- previously irrelevant features become important, and vice versa
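One simple way to quantify this is to compute importances on a recent window (e.g. by retraining or using permutation importance) and compare their rank order against the training-time ranking with a rank correlation. The importance values and the 0.8 cutoff below are hypothetical, chosen to mirror the bedrooms-vs-backyard example.

```python
from scipy.stats import spearmanr

# Hypothetical normalized feature importances: training time vs. today
train_imp = {"bedrooms": 0.40, "sqft": 0.30, "house_age": 0.20, "backyard": 0.10}
prod_imp = {"backyard": 0.45, "sqft": 0.25, "bedrooms": 0.20, "house_age": 0.10}

# Spearman rank correlation over a shared feature order:
# near 1.0 means the ranking is stable; low or negative means it flipped
features = sorted(train_imp)
corr, _ = spearmanr([train_imp[f] for f in features],
                    [prod_imp[f] for f in features])

drifted = corr < 0.8  # hypothetical cutoff for "ranking changed materially"
print(f"rank correlation={corr:.2f}, drifted={drifted}")
```

Here the previously dominant bedrooms feature and the previously weak backyard feature have swapped places, so the rank correlation comes out strongly negative.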
Input Volume Drift
Sudden spikes or drops in data volume.
Example:
A steady 500 requests per day suddenly becomes 10,000.
This affects latency, performance, and reliability.
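Volume drift is usually the easiest to monitor: count requests per period and flag sharp deviations from a baseline. The function name, baseline, and spike/drop factors below are hypothetical choices for illustration.

```python
def volume_alerts(daily_counts, baseline=500, spike_factor=5, drop_factor=0.2):
    """Flag days whose request count deviates sharply from the baseline."""
    alerts = []
    for day, count in daily_counts.items():
        if count > baseline * spike_factor:
            alerts.append((day, "spike"))
        elif count < baseline * drop_factor:
            alerts.append((day, "drop"))
    return alerts

# Mirrors the example: a normal day, a 20x spike, and a near-total drop
alerts = volume_alerts({"mon": 480, "tue": 10_000, "wed": 50})
print(alerts)
```

Spikes stress latency and serving capacity; drops often indicate an upstream pipeline failure rather than a real change in traffic, so both directions are worth alerting on.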
Demo
https://colab.research.google.com/drive/1gf2Qs3avNej6JP-LmKHe022HUiSqbCmy?usp=sharing
git clone https://github.com/gchandra10/python_model_drift
Open Source Tools
https://github.com/evidentlyai/evidently