Caltech · Intensive Training Program

AI4Chemical Sciences

Applying machine learning to the chemical sciences

August 10 – 22, 2026
RSC 275 · Caltech
2 Weeks · 12 Lectures
Lecture
Tutorial (hands-on)
Hackathon
Break / Free

Foundations of ML & Representations

Aug 10 – 14
Time
Monday, Aug 10
Tuesday, Aug 11
Wednesday, Aug 12
Thursday, Aug 13
Friday, Aug 14
9:00
Mon · Lecture · Introduction · Bootcamp objectives, logistics · Jules Schleinitz
Tue · Lecture · Supervised Learning: Regression & Classification · Bias-variance, hyperparameter optimization, feature selection, small models · Anjali Gurajapu x Jules Schleinitz
Wed · Lecture · Graph Neural Networks · Message passing, GCN, GAT, MPNN, PyTorch Geometric · UMA | Amin Tavakoli [tbd]
Thu · Lecture · Learned Representations · Pre-training, transfer learning, fine-tuning · Chenghao Liu
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Morning session: apply the week's concepts to a chemistry challenge
☕ 10:30 – 10:45  Coffee break
10:45
Mon · Student Introductions · Attendees Introduction & Areas of Research · Round-table introductions, research backgrounds, interest in the bootcamp
Tue · Tutorial · Model Training and Performance Evaluation · Train/val/test splitting (feature selection, hyperparameter opt.), data leakage, chemistry-aware splits, baseline definition
Wed · Tutorial · Molecular Property Prediction with GNNs · PyG on QM9, MPNN for HOMO-LUMO, atom embedding visualization
Thu · Tutorial · Few-Shot Learning · Transfer learning, application, failure modes...
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Late morning: team work continues
🍽️ 12:15 – 13:30  Lunch
13:30
Mon · Lecture · Evolution of Molecular Representations · SMILES, InChI, SELFIES, fingerprints, expert-crafted, molecular graphs · Jules Schleinitz
Tue · Lecture · Neural Networks & Deep Learning · MLPs, backprop, PyTorch overview · UMA | Amin Tavakoli [tbd]
Wed · Lecture · Model Interpretability · SHAP, LIME, attention maps, concept-based explanations, interpretability in chemistry · Michael Yusov
Thu · Lecture · Diffusion Generative Model · Chenghao Liu
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Afternoon: final sprint & preparation of results
☕ 15:00 – 15:15  Coffee break
15:15
Mon · Tutorial · Molecular Property Prediction · Building a pipeline using RDKit, feature engineering, training and performance evaluation
Tue · Tutorial · Feed-Forward Net: Formation Energy · PyTorch from scratch, training loop, early stopping
Wed · Tutorial · Explaining ML Predictions · SHAP values on molecular fingerprints, attention visualization on GNNs
Thu · Tutorial · Molecular Generation with VAEs · Encode/decode molecules, latent space exploration, validity checks
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Presentations & wrap-up of hackathon results

Dataset Design & LLMs

Aug 17 – 21
Time
Monday, Aug 17
Tuesday, Aug 18
Wednesday, Aug 19
Thursday, Aug 20
Friday, Aug 21
9:00
Mon · Lecture · Transformers · Text-based representations for chemistry, tokenization · Amin Tavakoli
Tue · Lecture · Design of Experiments (DOE) · Factorial design, screening, response surfaces, and experimental planning strategies · Michael Yusov
Wed · Lecture · Active Learning · Query strategies: uncertainty, BALD, core-set, pool-based vs. stream-based · Jules Schleinitz
Thu · Lecture · Agentic AI for Chemistry · ReAct, Coscientist, multi-agent, self-driving labs · James Wade x Jules Schleinitz
Fri · 🧑‍💻 Hackathon · Hackathon · Morning session: apply the week's concepts to a chemistry challenge
☕ 10:30 – 10:45  Coffee break
10:45
Mon · Tutorial · Fine-tuning ChemBERTa · HuggingFace, BACE IC50, frozen vs. full fine-tuning
Tue · Tutorial · Design of Experiments in Practice · Factorial screening, response surface setup, and experiment planning workflows
Wed · Tutorial · Active Learning for DFT Triage · QM9/ANI-1 pool, GP uncertainty sampling, learning curves
Thu · Tutorial · Building a Chemistry ReAct Agent · PubChem lookup, RDKit tools, BO surrogate, guardrails
Fri · 🧑‍💻 Hackathon · Hackathon · Late morning: team work continues
🍽️ 12:15 – 13:30  Lunch
13:30
Lecture · Data Stewardship · Data management in experimental projects for efficient use in data-driven projects · Alix Schmidt
Lecture · Bayesian Optimization · Surrogates, EI/UCB/PI, BO loop, molecular & materials discovery, multi-objective BO · Michael Yusov
Lecture · LLMs as Chemistry Assistants · RAG, tool use, literature extraction, hallucination limits · TBD [guest?]
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Afternoon: final sprint & preparation of results
Lecture · AI Readiness · Guest lecturer [TBD]
☕ 15:00 – 15:15  Coffee break
15:15
Tutorial · Data Stewardship · Organizing and managing research data, metadata standards, data sharing practices
Tutorial · Reaction Yield Optimization with BO · Suzuki coupling, BoTorch, EI vs. UCB convergence
Tutorial · Chemistry RAG Assistant · Index papers, Q&A on reactions, RDKit tool calls
Fri · 🧑‍💻 Hackathon · Full-Day Hackathon · Presentations & wrap-up of hackathon results
Closing · Wrap-up & Perspectives