Direct Epistemic Uncertainty Prediction (DEUP) for any scikit-learn model — with first-class, leakage-correct time-series support.
DEUP estimates epistemic uncertainty by training a secondary error predictor on your model's out-of-sample errors — no ensembles, no Bayesian retraining, works with the model you already use.
Method credit: DEUP is due to Lahlou et al., 2023 (TMLR). This package is a maintained, benchmarked, scikit-learn-compatible implementation with time-series / cross-sectional finance support and aggregation-reliability diagnostics.
Repository: https://github.com/ursinasanderink/deup · Docs: https://ursinasanderink.github.io/deup/
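The idea of a secondary error predictor can be sketched in plain scikit-learn. This is a minimal illustration, not this package's internals: the synthetic data, the squared-error target, and the random-forest error model are all assumptions made for the example, and it omits any aleatoric-noise correction.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_predict, train_test_split

# Synthetic data stands in for your own X / y (assumption for illustration).
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

base = RandomForestRegressor(n_estimators=50, random_state=0)

# Out-of-fold predictions: every training row is predicted by a model
# that never saw that row, so the errors below are out-of-sample.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
oof_pred = cross_val_predict(base, X_train, y_train, cv=cv)
oof_err = (y_train - oof_pred) ** 2  # squared error as the target (assumption)

# Secondary model: learns to predict the base model's out-of-sample error.
error_model = RandomForestRegressor(n_estimators=50, random_state=1)
error_model.fit(X_train, oof_err)

# The base model is refit on all training data for the final predictions.
base.fit(X_train, y_train)
pred = base.predict(X_test)
unc = error_model.predict(X_test)  # predicted error, used as an uncertainty score
```

Fitting the error model on in-sample residuals instead would understate the error, since the base model has already seen those rows.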
```python
from sklearn.ensemble import RandomForestRegressor
from deup import DEUPRegressor

model = DEUPRegressor(base_model=RandomForestRegressor())
model.fit(X_train, y_train)
pred, unc = model.predict(X_test, return_uncertainty=True)
```

Tabular gradient boosting (LightGBM / XGBoost / CatBoost):
```python
from deup.domains.tabular import TabularDEUP

model = TabularDEUP(backend="lgbm", cv=5).fit(X_train, y_train)
unc = model.predict_epistemic(X_test)
```

Time-series / cross-sectional finance (flagship preset):
```python
from deup.domains.finance import CrossSectionalDEUP

model = CrossSectionalDEUP(horizon=20).fit(panel_df)
pred, unc = model.predict(test_df, return_uncertainty=True)
health = model.health_report(test_df)
```

```bash
pip install deup               # core (numpy + scikit-learn)
pip install "deup[gbm]"        # + LightGBM (TabularDEUP backend)
pip install "deup[xgb]"        # + XGBoost
pip install "deup[catboost]"   # + CatBoost
pip install "deup[gbm-all]"    # all gradient-boosting backends
pip install "deup[finance]"    # + pandas (CrossSectionalDEUP)
pip install "deup[docs]"       # + MkDocs, to build the docs site locally
```

The only public DEUP code was a stale research repo of notebooks, with no maintained
pip-installable package. Major UQ libraries (torch-uncertainty, uncertainty-toolbox,
MAPIE) don't implement DEUP. deup fills that gap with leakage-correct OOF error
construction and walk-forward / purged CV for time-series and finance.
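For ordered data, the out-of-fold errors have to come from models trained only on the past. The sketch below shows one way to do that with scikit-learn's `TimeSeriesSplit`, using its `gap` parameter as a simple stand-in for purging around the label horizon. It is an illustration under assumptions (synthetic data, a `horizon` of 5 rows, Ridge as the base model), not the splitter this package uses.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

# Synthetic, time-ordered data (assumption for illustration).
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 4))
y = X @ np.array([1.0, -0.5, 0.0, 0.2]) + rng.normal(scale=0.5, size=n)

horizon = 5  # hypothetical label horizon, in rows
splitter = TimeSeriesSplit(n_splits=5, gap=horizon)

# Walk forward: each fold trains on the past, skips `horizon` rows,
# then records absolute errors on the next block.
oof_err = np.full(n, np.nan)
for train_idx, test_idx in splitter.split(X):
    # The gap keeps training rows at least `horizon` rows before the test block.
    assert train_idx.max() + horizon < test_idx.min()
    fold_model = Ridge(alpha=1.0).fit(X[train_idx], y[train_idx])
    oof_err[test_idx] = np.abs(y[test_idx] - fold_model.predict(X[test_idx]))

# The earliest rows are never in a test block, so they have no OOF error.
has_err = ~np.isnan(oof_err)
error_model = RandomForestRegressor(n_estimators=50, random_state=0)
error_model.fit(X[has_err], oof_err[has_err])

unc_latest = error_model.predict(X[-20:])  # predicted error for the latest rows
```

A shuffled K-fold here would let each fold train on rows that come after its test rows, which is the leakage the walk-forward scheme avoids.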
| Method | ρ | Notes |
|---|---|---|
| DEUP | 0.509 | OOF error predictor (RF base) |
| DEUP + LightGBM | 0.444 | TabularDEUP(backend="lgbm") |
| DEUP + XGBoost | 0.400 | TabularDEUP(backend="xgb") |
| DEUP + CatBoost | 0.407 | TabularDEUP(backend="catboost") |
| Ensemble disagreement | 0.460 | Bootstrap variance |
| Conformal residual | 0.447 | \|y − ŷ\| on calibration split |
| Laplace (last-layer) | 0.015 | Not applicable to trees |
Full results: Benchmarks.
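The table does not define ρ. Assuming it is a Spearman rank correlation between the predicted uncertainty and the realized absolute error (an assumption; the benchmark page is the authority), a score of that kind could be computed as below. The data here is synthetic and the resulting value has no relation to the numbers in the table.

```python
import numpy as np
from scipy.stats import spearmanr

# Synthetic example: errors are drawn with a spread that grows with the
# uncertainty score, so the two should be positively rank-correlated.
rng = np.random.default_rng(0)
unc = rng.uniform(0.1, 2.0, size=500)     # stand-in uncertainty scores
abs_err = np.abs(rng.normal(scale=unc))   # stand-in realized |y - y_hat|

# Spearman's rho compares ranks only, so it is unaffected by any
# monotone rescaling of the uncertainty scores.
rho, p_value = spearmanr(unc, abs_err)
print(f"rho = {rho:.3f}")
```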
| Topic | Link |
|---|---|
| Getting started | docs/getting-started |
| Five-axis guide | docs/concepts |
| Domain presets | docs/domains |
| Tutorials | tabular · finance · conformal · active learning |
| When is agg-g reliable? | reliability |
| PyTorch / TorchUncertainty | pytorch-integration |
| Contributing | CONTRIBUTING.md |
v0.4.0 — complete library: core DEUP, conformal calibration, reliability diagnostics, domain presets (tabular GBM backends, finance, vision), benchmarks, tutorials, TorchUncertainty integration.
To cite this work, use CITATION.cff for the software, Lahlou et al. (2023) for the DEUP method, and Sanderink (2026) for the cross-sectional ranking and aggregation-reliability extensions:
Sanderink, U. (2026) 'When Alpha Breaks: Two-Level Uncertainty for Safe Deployment of Cross-Sectional Stock Rankers', arXiv preprint arXiv:2603.13252. Available at: https://arxiv.org/pdf/2603.13252
Apache-2.0. See LICENSE.
