Lecture 12 — Uncertainty Quantification · AI4Chemical Sciences Bootcamp

Recording

Recording will be available after the bootcamp.

August 2026

Learning Objectives

Distinguish aleatoric (data) uncertainty from epistemic (model) uncertainty and design experiments to measure each
Build calibration plots and compute Expected Calibration Error (ECE) for a trained model
Apply conformal prediction to produce prediction intervals with finite-sample coverage guarantees that assume only exchangeable data, not a particular distribution
Implement a deep ensemble and compare its uncertainty estimates to MC-Dropout and a single Gaussian process (GP)
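
As a concrete reference for the calibration objective above, here is a minimal sketch of binned ECE for a classifier. It assumes equal-width confidence bins and that you already have each prediction's confidence and whether it was correct. The function name `expected_calibration_error` and the toy arrays are illustrative, not part of the lecture materials.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a bin index in [0, n_bins - 1];
    # a confidence of exactly 1.0 falls in the last bin.
    bin_ids = np.clip(np.digitize(confidences, edges[1:-1]), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the fraction of samples in this bin
    return ece

# Toy example (made-up values): an overconfident model.
toy_conf = np.array([0.95, 0.95, 0.95, 0.95, 0.55, 0.55])
toy_correct = np.array([1, 0, 1, 0, 1, 0])
print(expected_calibration_error(toy_conf, toy_correct))
```

The per-bin pairs of mean confidence and accuracy are also the points you would plot in a calibration (reliability) diagram. For regression models, the analogous check compares nominal interval coverage against observed coverage.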

Key Takeaways

Takeaway 1. A well-calibrated model is exactly as confident as it should be: a 90% prediction interval should contain the true value about 90% of the time. Miscalibration is the norm, not the exception, for molecular ML models.
Takeaway 2. Deep ensembles (5–10 models trained independently from different random initializations) are often the most reliable uncertainty estimator in practice. They are frequently more accurate than approximate Bayesian methods such as MC-Dropout while being simpler to implement.
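Takeaway 3. Conformal prediction provides finite-sample coverage guarantees that hold for any underlying model and any data distribution, provided the calibration and test data are exchangeable. Distribution shift, which is common in chemistry, violates that assumption, so coverage should be re-checked when a model is applied to new chemical space.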
Takeaway 4. Uncertainty estimates are only meaningful within the applicability domain. A model that is confidently wrong in a new chemical space is more dangerous than one that correctly flags its uncertainty.
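
The coverage guarantee in Takeaway 3 can be illustrated with split conformal prediction for regression. This is a minimal sketch, assuming absolute residuals as the nonconformity score and a held-out calibration set that is exchangeable with the test data. The function name `split_conformal_interval` and the example numbers are hypothetical.

```python
import numpy as np

def split_conformal_interval(cal_residuals, test_pred, alpha=0.1):
    """Return (lower, upper) bounds targeting 1 - alpha marginal coverage.

    cal_residuals: y_true - y_pred on a held-out calibration set.
    test_pred: point predictions for new samples.
    """
    scores = np.abs(np.asarray(cal_residuals, dtype=float))
    n = scores.size
    # Finite-sample correction: take the ceil((n + 1)(1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:
        # Too few calibration points for the requested level: interval is unbounded.
        q = np.inf
    else:
        q = np.sort(scores)[k - 1]
    test_pred = np.asarray(test_pred, dtype=float)
    return test_pred - q, test_pred + q

# Made-up calibration residuals and one test prediction.
residuals = np.array([0.3, -1.2, 0.8, -0.1, 2.0, -0.6, 0.4, 1.1, -0.9])
lower, upper = split_conformal_interval(residuals, [5.0], alpha=0.5)
print(lower, upper)
```

The interval width here is constant across test points. It widens only if the calibration residuals are large, so it does not by itself flag individual molecules outside the applicability domain described in Takeaway 4.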