perashanid/driftguard

DriftGuard — ML Model Monitoring & Data Drift Detection System

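Models degrade in production: one widely cited study (Vela et al., Scientific Reports, 2022) observed temporal degradation in 91% of the ML models it tested. DriftGuard catches it before your revenue does.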

A production-grade monitoring system that watches deployed ML models and raises alerts when data drift, concept drift, or performance degradation occur. Includes automatic root cause analysis, drift severity scoring, and retraining triggers.


Architecture

┌─────────────────────────────────────────────────────────┐
│                    DriftGuard System                     │
├─────────────┬─────────────┬─────────────┬───────────────┤
│  Data Drift │Concept Drift│ Performance │  Root Cause   │
│  Detection  │  Detection  │  Monitoring │  Analysis     │
├─────────────┼─────────────┼─────────────┼───────────────┤
│ • KS Test   │ • ADWIN     │ • Accuracy  │ • Feature     │
│ • PSI       │ • Page-     │ • Latency   │   Ranking     │
│ • MMD       │   Hinkley   │ • Error     │ • Severity    │
│ • Adversar- │ • DDM       │   Rate      │   Scoring     │
│   ial       │             │ • Custom    │ • Co-drift    │
│             │             │   Metrics   │   Groups      │
├─────────────┴─────────────┴─────────────┴───────────────┤
│              Retraining Trigger Engine                    │
│  Multi-signal composite scoring + cooldown + urgency     │
├─────────────────────────────────────────────────────────┤
│              Drift Scenario Simulator                    │
│  Sudden | Gradual | Recurring | Incremental              │
└─────────────────────────────────────────────────────────┘

Quick Start

Installation

git clone https://github.com/perashanid/driftguard.git
cd driftguard
pip install -r requirements.txt

Run the Dashboard

streamlit run app.py

Run Tests

cd driftguard
python -m pytest tests/ -v

Features

1. Data Drift Detection

Detects when input feature distributions shift from training data:

| Detector    | Method                                | Best For                            |
|-------------|---------------------------------------|-------------------------------------|
| KS Test     | Kolmogorov-Smirnov two-sample test    | Per-feature distribution comparison |
| PSI         | Population Stability Index            | Binned distribution shift magnitude |
| MMD         | Maximum Mean Discrepancy (RBF kernel) | Multivariate distribution comparison |
| Adversarial | Train classifier to distinguish datasets | Complex non-linear drift patterns |

from src.detectors.data_drift import KSTestDetector, PSIDetector

ks = KSTestDetector(significance=0.05)
result = ks.detect(reference_data, current_data, feature_names)

if result.is_drift:
    print(f"Drift detected! Score: {result.score:.4f}")
    print(f"Drifted features: {result.details['drifted_features']}")
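
To make the PSI row of the table concrete, here is a minimal standard-library sketch of a Population Stability Index calculation. It is not DriftGuard's `PSIDetector`; the function name `psi`, the equal-width binning on the reference range, and the `eps` floor for empty bins are all assumptions made for illustration.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two 1-D samples (illustrative sketch).

    Bin edges are equal-width over the reference sample's range; eps keeps
    empty bins from producing log(0) or a division by zero.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # PSI = sum over bins of (current - reference) * ln(current / reference)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))
```

A common rule of thumb reads PSI below 0.1 as stable, 0.1 to 0.25 as a moderate shift, and above 0.25 as a significant shift; whether DriftGuard uses these cut-offs is not stated here.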

2. Concept Drift Detection

Detects when the relationship between features and target changes:

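| Detector     | Method                 | Characteristics                  |
|--------------|------------------------|----------------------------------|
| ADWIN        | Adaptive Windowing     | Variable window, Hoeffding bound |
| Page-Hinkley | Cumulative sum test    | Detects mean shifts in streams   |
| DDM          | Drift Detection Method | Error rate + std dev monitoring  |
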
from src.detectors.concept_drift import ADWINDetector

adwin = ADWINDetector(delta=0.002)
for error in prediction_errors:
    result = adwin.add_element(error)
    if result.is_drift:
        print("Concept drift detected!")
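
As an illustration of the cumulative-sum idea behind the Page-Hinkley row, the sketch below implements the textbook one-sided test for an upward mean shift. The class name `PageHinkley`, its parameters, and their defaults are hypothetical and do not describe the detector shipped in `src/detectors/concept_drift.py`.

```python
class PageHinkley:
    """Minimal Page-Hinkley test for an upward shift in a stream's mean."""

    def __init__(self, delta=0.005, threshold=50.0, min_samples=30):
        self.delta = delta              # tolerated magnitude of change
        self.threshold = threshold      # alarm level (often written lambda)
        self.min_samples = min_samples  # warm-up before alarms are allowed
        self.n = 0
        self.mean = 0.0     # running mean of the stream
        self.cum = 0.0      # cumulative deviation from the running mean
        self.cum_min = 0.0  # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one observation; return True when drift is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return self.n >= self.min_samples and (self.cum - self.cum_min) > self.threshold
```

Fed a stream of per-prediction errors (0 for correct, 1 for wrong), the detector stays quiet while the error rate is stable and fires once the accumulated excess over the running mean passes `threshold`.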

3. Performance Monitoring

Tracks model performance metrics over sliding windows:

from src.detectors.performance_drift import PerformanceDriftDetector

monitor = PerformanceDriftDetector(window_size=200)
monitor.set_baselines({"accuracy": 0.95, "latency_ms": 10})

snapshot = monitor.record(accuracy=0.89, latency_ms=15)
if snapshot.is_degraded:
    print("Performance degradation detected!")
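
The sliding-window check can be sketched for a single metric as follows. This is an assumption about how such a monitor might work, not the `PerformanceDriftDetector` implementation: the class name `SlidingWindowMetric`, the relative `tolerance`, and the `higher_is_better` flag are invented for the example.

```python
from collections import deque


class SlidingWindowMetric:
    """Tracks one metric over the last `window_size` observations."""

    def __init__(self, baseline, window_size=200, tolerance=0.05, higher_is_better=True):
        self.baseline = baseline
        self.tolerance = tolerance  # allowed relative deviation from the baseline
        self.higher_is_better = higher_is_better
        self.window = deque(maxlen=window_size)  # old values fall off automatically

    def record(self, value):
        """Add one observation; return True if the window mean is degraded."""
        self.window.append(value)
        mean = sum(self.window) / len(self.window)
        if self.higher_is_better:
            return mean < self.baseline * (1 - self.tolerance)
        return mean > self.baseline * (1 + self.tolerance)
```

Accuracy would use `higher_is_better=True`, while latency or error rate would use `higher_is_better=False` so that a rise above the baseline counts as degradation.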

4. Root Cause Analysis

Automatically identifies which features drove the drift:

from src.analysis.root_cause import RootCauseAnalyzer

analyzer = RootCauseAnalyzer()
report = analyzer.analyze(reference_data, current_data, feature_names)

print(report.summary())
# Shows: severity ranking, co-drifted groups, actionable recommendations

5. Retraining Trigger

Multi-signal decision engine with cooldown and urgency scoring:

from src.monitors.retraining_trigger import RetrainingTrigger

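# ks_result, adwin_results, and perf_snapshot are assumed to be the outputs
# of the detectors and monitor shown in sections 1-3.
trigger = RetrainingTrigger(retrain_threshold=0.40)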
decision = trigger.evaluate(
    data_drift_result=ks_result,
    concept_drift_results=adwin_results,
    performance_snapshot=perf_snapshot,
)

if decision.should_retrain:
    print(f"Urgency: {decision.urgency.value}")
    print(f"Reasons: {decision.reasons}")

6. Drift Simulator

Generate realistic drift scenarios for testing:

from src.simulator.drift_simulator import DriftSimulator, get_preset_scenarios

simulator = DriftSimulator()
scenarios = get_preset_scenarios()

# Simulate a sudden drift (new product category)
stream = simulator.simulate(scenarios["sudden_category"])

# Train a model and monitor it
model = simulator.create_model(stream.reference_X, stream.reference_y)

Built-in scenarios:

  • Sudden: New Product Category — Abrupt distribution shift
  • Gradual: Seasonal Change — Slow linear shift over time
  • Recurring: Weekday/Weekend — Periodic oscillating patterns
  • Incremental: Slow Creep — Constant-rate accumulating drift
  • Sudden: Major Pipeline Break — Severe multi-feature drift
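
The four drift patterns named above can be illustrated as schedules for a mean offset applied to a single Gaussian feature. This is a standard-library sketch under stated assumptions, not the `DriftSimulator` code: the function names, the `change_point` and `period` parameters, and the exact shape chosen for each pattern (in particular how "gradual" differs from "incremental") are hypothetical.

```python
import math
import random


def mean_shift(kind, t, n, magnitude=2.0, change_point=0.5, period=50):
    """Offset added to the feature mean at step t of an n-step stream."""
    start = int(n * change_point)
    if kind == "sudden":
        # Abrupt jump at the change point
        return magnitude if t >= start else 0.0
    if kind == "gradual":
        # Flat, then a linear ramp that reaches `magnitude` at the last step
        if t < start:
            return 0.0
        return magnitude * (t - start) / max(n - 1 - start, 1)
    if kind == "recurring":
        # Smooth oscillation between 0 and `magnitude`
        return magnitude * 0.5 * (1 - math.cos(2 * math.pi * t / period))
    if kind == "incremental":
        # Constant-rate creep from the very first step
        return magnitude * t / max(n - 1, 1)
    raise ValueError(f"unknown drift kind: {kind}")


def simulate(kind, n=400, seed=0, **kwargs):
    """Draw n unit-variance Gaussian samples whose mean follows the schedule."""
    rng = random.Random(seed)
    return [rng.gauss(mean_shift(kind, t, n, **kwargs), 1.0) for t in range(n)]
```

Streams produced this way are convenient fixtures for checking that a detector fires near the change point for sudden drift and later, or not at all, for slower patterns.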

Project Structure

driftguard/
├── app.py                          # Streamlit dashboard
├── requirements.txt
├── README.md
├── src/
│   ├── __init__.py
│   ├── detectors/
│   │   ├── __init__.py
│   │   ├── data_drift.py           # KS, PSI, MMD, Adversarial
│   │   ├── concept_drift.py        # ADWIN, Page-Hinkley, DDM
│   │   └── performance_drift.py    # Sliding window metrics
│   ├── analysis/
│   │   ├── __init__.py
│   │   └── root_cause.py           # Feature ranking & recommendations
│   ├── monitors/
│   │   ├── __init__.py
│   │   └── retraining_trigger.py   # Auto-retrain decision engine
│   └── simulator/
│       ├── __init__.py
│       └── drift_simulator.py      # Scenario generation
├── tests/
│   ├── __init__.py
│   ├── test_data_drift.py
│   ├── test_concept_drift.py
│   └── test_pipeline.py            # Integration + E2E tests
├── models/                          # Saved model artifacts
└── data/                            # Reference datasets

Technical Details

Drift Severity Scoring

Composite severity score (0–1) per feature:

  • KS statistic (30%): Non-parametric distribution distance
  • PSI, normalized (30%): Binned distribution divergence
  • Mean shift (25%): Standardized location change
  • Std shift (15%): Scale change magnitude
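
The weighting above can be written out as a short function. The weights come from the list; everything else is an assumption made for illustration, including the function name `severity`, the way each raw component is normalized to the 0 to 1 range, and the hypothetical caps `psi_cap` and `sigma_cap`.

```python
# Weights taken from the list above
WEIGHTS = {"ks": 0.30, "psi": 0.30, "mean_shift": 0.25, "std_shift": 0.15}


def clip01(x):
    """Clamp a value to the closed interval [0, 1]."""
    return max(0.0, min(1.0, x))


def severity(ks_stat, psi, mean_shift_sigmas, std_ratio, psi_cap=0.5, sigma_cap=3.0):
    """Composite per-feature severity in [0, 1] (normalization is assumed)."""
    components = {
        "ks": clip01(ks_stat),                                     # KS statistic already lies in [0, 1]
        "psi": clip01(psi / psi_cap),                              # assumed cap on PSI
        "mean_shift": clip01(abs(mean_shift_sigmas) / sigma_cap),  # shift in reference std devs
        "std_shift": clip01(abs(std_ratio - 1.0)),                 # current std / reference std
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())
```

For example, a feature with a KS statistic of 0.4, a PSI of 0.25, a mean shift of 1.5 standard deviations, and a standard deviation ratio of 1.2 scores 0.12 + 0.15 + 0.125 + 0.03 = 0.425 under these assumed caps.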

Retraining Decision Logic

Weighted composite of three signals:

composite = 0.35 × data_drift + 0.35 × concept_drift + 0.30 × performance

Urgency levels:

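| Composite Score | Urgency  | Action               |
|-----------------|----------|----------------------|
| < 0.28          | None     | Continue monitoring  |
| 0.28 – 0.40     | Low      | Schedule review      |
| 0.40 – 0.55     | Medium   | Plan retraining      |
| 0.55 – 0.70     | High     | Retrain soon         |
| > 0.70          | Critical | Immediate retraining |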
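
The composite formula and the urgency table translate into a few lines of code. The sketch below assumes each signal is already scaled to the 0 to 1 range and that each band includes its lower bound, which the table leaves ambiguous. The cooldown check, measured in hours, is a hypothetical stand-in for whatever cooldown logic `RetrainingTrigger` applies.

```python
def composite_score(data_drift, concept_drift, performance):
    """Weighted composite of the three signals, each assumed to lie in [0, 1]."""
    return 0.35 * data_drift + 0.35 * concept_drift + 0.30 * performance


def urgency(score):
    """Map a composite score to an urgency level (lower bounds inclusive)."""
    if score < 0.28:
        return "none"
    if score < 0.40:
        return "low"
    if score < 0.55:
        return "medium"
    if score <= 0.70:
        return "high"
    return "critical"


def should_retrain(score, hours_since_last_retrain,
                   retrain_threshold=0.40, cooldown_hours=24.0):
    """Retrain only when the score is high enough and the cooldown has elapsed."""
    return score >= retrain_threshold and hours_since_last_retrain >= cooldown_hours
```

With signals of 0.8, 0.5, and 0.3 the composite is 0.545, which falls in the medium band; a retrain would be suppressed if the last one happened two hours ago and allowed after the assumed 24-hour cooldown.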

Why This Matters

This project addresses the gap between training a model and maintaining one in production:

  • Data pipelines break silently — DriftGuard catches distribution shifts before they impact users
  • Models degrade gradually — Concept drift detection finds when the world changes under your model
  • Root cause analysis saves hours — Instead of "something is wrong," you get "feature X shifted by 2.3σ because of Y"
  • Automated retraining — No more manual checks; the system decides when to retrain

Companies like Netflix, Uber, and Stripe maintain dedicated teams for exactly this capability.
