In this repository we provide baselines to evaluate uncertainty in regression problems.
We provide 11 datasets, 3 uncertainty metrics, 3 deep models, 7 shallow models, and an uncertainty calibration routine.
The models are:

- Multilayer Perceptron
  - Regular (homoskedastic)
  - Bayesian dropout
  - Two outputs: mean and stddev, trained with a log-likelihood loss function (see the sketch after this list)
- Extreme gradient boosting
  - Regular (homoskedastic)
  - Tree variance as stddev
  - Two outputs: mean and stddev, trained with a log-likelihood loss function
- Random forest
  - Regular (homoskedastic)
  - Tree variance as stddev
- Linear (both homoskedastic)
  - Linear regression
  - Bayesian linear regression
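As an illustration of the two-output variant referenced above, here is a minimal sketch of an MLP that predicts a mean and a standard deviation and is trained with a Gaussian log-likelihood loss. It assumes PyTorch; the class name, layer sizes, and training snippet are made up for this example and are not the code in this repository.

```python
# Minimal sketch (not this repository's implementation) of a two-output MLP
# trained by minimizing the Gaussian negative log likelihood. Assumes PyTorch.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoOutputMLP(nn.Module):
    def __init__(self, in_dim, hidden_dim=50):
        super().__init__()
        self.hidden = nn.Linear(in_dim, hidden_dim)
        self.mean_head = nn.Linear(hidden_dim, 1)
        self.std_head = nn.Linear(hidden_dim, 1)

    def forward(self, x):
        h = F.relu(self.hidden(x))
        mean = self.mean_head(h)
        std = F.softplus(self.std_head(h)) + 1e-6  # keep the stddev positive
        return mean, std

def gaussian_nll(y, mean, std):
    # Negative log likelihood of y under N(mean, std**2), averaged over the batch.
    var = std ** 2
    return (0.5 * torch.log(2 * math.pi * var) + (y - mean) ** 2 / (2 * var)).mean()

# Toy training step with random data, just to show the loss in use.
model = TwoOutputMLP(in_dim=8)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 8), torch.randn(32, 1)
optimizer.zero_grad()
mean, std = model(x)
loss = gaussian_nll(y, mean, std)
loss.backward()
optimizer.step()
```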
Uncertainty calibration is a procedure where we calibrate the uncertainty on a validation set in order to maximize the predictive log likelihood (normal distribution):

$$\log p(y \mid \mu, \sigma) = \sum_{i=1}^{N} \log \mathcal{N}\left(y_i \mid \mu_i, \sigma_i^2\right) = -\frac{1}{2} \sum_{i=1}^{N} \left[ \log\left(2 \pi \sigma_i^2\right) + \frac{(y_i - \mu_i)^2}{\sigma_i^2} \right]$$

Where $y_i$ is the target of the $i$-th validation sample, $\mu_i$ is the predicted mean, $\sigma_i$ is the calibrated predicted standard deviation, and $N$ is the number of validation samples.
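For concreteness, below is a minimal sketch of one possible calibration routine: rescaling the predicted standard deviations by a single factor fitted on the validation set to maximize the log likelihood above. The scale-only parameterization and the function names are assumptions for this example, not necessarily what the repository implements.

```python
# Illustrative sketch only, not the repository's calibration routine.
import numpy as np
from scipy.optimize import minimize_scalar

def negative_log_likelihood(y, mean, std):
    """Negative log likelihood of y under independent N(mean, std**2)."""
    var = std ** 2
    return np.sum(0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var))

def fit_calibration_scale(y_val, mean_val, std_val):
    """Find s > 0 that minimizes the validation NLL of N(mean, (s * std)**2)."""
    objective = lambda log_s: negative_log_likelihood(
        y_val, mean_val, np.exp(log_s) * std_val
    )
    result = minimize_scalar(objective)  # optimize in log space so s stays positive
    return np.exp(result.x)

# Usage: fit on the validation split, then apply to the test predictions.
# s = fit_calibration_scale(y_val, mean_val, std_val)
# std_test_calibrated = s * std_test
```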
We provide 10 UCI regression datasets typically used in the Bayesian deep learning literature, plus one additional large-scale dataset (flight delays).
The metrics are (a sketch of how they can be computed follows the list):

- NLPD (negative log predictive density) of a normal distribution (sometimes known as negative log likelihood)
- RMSE (root mean squared error)
- Area under the curve of the RMSE (each point of the curve is the RMSE with the top x% most uncertain samples removed from the test set)
- Normalized area under the curve of the RMSE (the area divided by the RMSE on the full test set)
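The sketch below shows one way these metrics could be computed with NumPy. It is an illustration under my own assumptions about array shapes and the rejection-curve grid, not the repository's implementation.

```python
# Minimal sketch (not the repository's code) of the metrics above, given 1-D
# arrays of test targets y, predicted means and predicted standard deviations.
import numpy as np

def nlpd(y, mean, std):
    """Average negative log predictive density under N(mean, std**2)."""
    var = std ** 2
    return np.mean(0.5 * np.log(2 * np.pi * var) + (y - mean) ** 2 / (2 * var))

def rmse(y, mean):
    return np.sqrt(np.mean((y - mean) ** 2))

def rmse_auc(y, mean, std, steps=100):
    """Area under the RMSE curve: each point is the RMSE after removing the
    top x% most uncertain samples from the test set."""
    y, mean, std = np.asarray(y), np.asarray(mean), np.asarray(std)
    order = np.argsort(std)                     # least uncertain first
    y_sorted, mean_sorted = y[order], mean[order]
    fractions = np.linspace(0.0, 0.99, steps)   # fraction of samples removed
    curve = [rmse(y_sorted[: int(np.ceil((1 - f) * len(y)))],
                  mean_sorted[: int(np.ceil((1 - f) * len(y)))])
             for f in fractions]
    return np.trapz(curve, fractions)

# The normalized version divides the area by the RMSE on the full test set:
# rmse_auc(y, mean, std) / rmse(y, mean)
```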
Some key findings:

- Shallow models make very strong baselines in both RMSE and NLPD, even compared with the state-of-the-art literature
- Heteroskedastic variance is almost always more useful than homoskedastic variance, regardless of the method or model
To install and run the experiments:

- Clone the repo locally
- Go to its directory
- Install with `pip install -e .`
- To run a script: `python scripts/shallow_experiments.py`