Lecture 8 — Diffusion Models · AI4Chemical Sciences Bootcamp

Recording

Recording will be available after the bootcamp.

August 2026

Learning Objectives

Derive the forward (noising) and reverse (denoising) processes of a DDPM at a high level
Explain how score matching connects diffusion models to energy-based models
Describe how equivariant diffusion (e.g., EDM, DiffSBDD) handles the rotational and translational (SE(3)) symmetry of 3-D molecules
Identify the strengths and limitations of diffusion models compared to VAEs and normalising flows for molecular generation
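
As orientation for the first two objectives, the standard DDPM quantities can be written compactly. This is a sketch in conventional notation that does not appear in the text above and is assumed here: a noise schedule \beta_t, its cumulative product \bar\alpha_t, Gaussian noise \epsilon, and a learned mean \mu_\theta with fixed variance \sigma_t^2.

```latex
\begin{aligned}
q(x_t \mid x_{t-1}) &= \mathcal{N}\!\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right) \\
q(x_t \mid x_0) &= \mathcal{N}\!\left(x_t;\ \sqrt{\bar\alpha_t}\,x_0,\ (1-\bar\alpha_t)\,I\right),
  \qquad \bar\alpha_t = \prod_{s=1}^{t}(1-\beta_s) \\
p_\theta(x_{t-1} \mid x_t) &= \mathcal{N}\!\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\right) \\
\nabla_{x_t} \log q(x_t \mid x_0) &= -\frac{x_t - \sqrt{\bar\alpha_t}\,x_0}{1-\bar\alpha_t}
  = -\frac{\epsilon}{\sqrt{1-\bar\alpha_t}},
  \qquad x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon
\end{aligned}
```

The last line is where score matching enters: a network trained to predict \epsilon is, up to the factor -1/\sqrt{1-\bar\alpha_t}, an estimate of the score. For an energy-based density p(x) \propto \exp(-E(x)) the score equals -\nabla_x E(x), which gives the link to energy-based models.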

Key Takeaways

Takeaway 1. Diffusion models generate samples by learning to reverse a gradual noising process — the key insight is that reversing small Gaussian perturbations is easier to learn than mapping noise directly to data.
Takeaway 2. For 3-D molecular generation, the learned distribution should be invariant to rotations and translations, which requires an equivariant denoising network; operating in the centre-of-mass frame removes the translational degree of freedom, and an equivariant network (e.g., EGNN) handles rotations.
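Takeaway 3. Diffusion models often produce higher-quality and more diverse samples than VAEs, but inference is slower because sampling requires many sequential denoising steps (often hundreds to a thousand network evaluations); DDIM-style samplers and consistency models cut this to far fewer steps, usually with some trade-off in sample quality.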
Takeaway 4. Validity and novelty, together with computed metrics such as QED (drug-likeness), SA score (synthetic accessibility), and RMSD to crystal structures, are necessary but not sufficient: always benchmark generated molecules against experimental activity data when possible.
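
Takeaways 1 and 2 can be illustrated together with a minimal sketch of the forward (noising) process applied to 3-D coordinates in the centre-of-mass frame. Everything named here is an assumption made for illustration rather than the lecture's own code: the linear schedule and its endpoints, the helper names, the unweighted centroid used as the "centre of mass", the 0-indexed timestep, and the made-up three-atom coordinates. Only the Python standard library is used.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (an assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def remove_centre_of_mass(points):
    """Subtract the unweighted centroid so each coordinate column sums to zero."""
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    return [[p[k] - centroid[k] for k in range(3)] for p in points]


def noise_coordinates(coords, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form, in the zero-centroid subspace.

    t is a 0-indexed position in alpha_bars. Returns the noised coordinates
    and the noise that a denoising network would be trained to predict.
    """
    x0 = remove_centre_of_mass(coords)
    eps = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in x0]
    eps = remove_centre_of_mass(eps)  # keep the noise translation-free as well
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    xt = [[signal * a + sigma * e for a, e in zip(p, n)]
          for p, n in zip(x0, eps)]
    return xt, eps


# Toy example: three "atoms" with made-up coordinates (not a real molecule).
betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
toy_coords = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
shifted = [[x + 5.0, y - 2.0, z + 1.0] for x, y, z in toy_coords]

# Same seed, translated input: the noised samples should coincide.
xt_a, _ = noise_coordinates(toy_coords, 500, alpha_bars, random.Random(0))
xt_b, _ = noise_coordinates(shifted, 500, alpha_bars, random.Random(0))
```

Because both the clean coordinates and the noise are projected onto the zero-centroid subspace, translating the input molecule leaves the noised sample unchanged up to floating-point error. Rotational symmetry is not handled by this step; in the equivariant models named above it comes from the denoising network.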