Lecture 13 — LLMs as Chemistry Assistants · AI4Chemical Sciences Bootcamp

Recording

Recording will be available after the bootcamp.

August 2026

Learning Objectives

Explain retrieval-augmented generation (RAG) and implement a simple chemistry paper Q&A system
Describe how tool use (function calling) enables LLMs to interact with RDKit, databases, and simulation codes
Identify failure modes of LLMs on chemistry tasks: hallucinated reactions, incorrect SMILES, wrong stereochemistry
Evaluate LLM outputs critically using literature cross-referencing and automated validity checks
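
The retrieval step in the first objective can be sketched in a few lines. This is a minimal illustration using only the standard library, not the lecture's implementation: the three passages are a made-up corpus, the names PASSAGES, retrieve and build_prompt are hypothetical, and bag-of-words cosine similarity stands in for the learned embeddings a real system would more likely use. The call to the language model itself is left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; in practice these would be chunks of real papers.
PASSAGES = [
    "Suzuki coupling joins an aryl halide and a boronic acid using a palladium catalyst.",
    "The Diels-Alder reaction is a cycloaddition between a diene and a dienophile.",
    "Grignard reagents are organomagnesium halides that add to carbonyl compounds.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counter objects)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=1):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("Which catalyst is used in Suzuki coupling?", PASSAGES))
```

The prompt returned by build_prompt is what would be sent to the model. If the retriever picks the wrong passage, the model is grounded in the wrong text, so the retriever is worth testing on its own.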

Key Takeaways

Takeaway 1. RAG grounds LLM answers in retrieved documents, which reduces hallucination on factual chemistry questions. Within a RAG pipeline, answer quality is limited mainly by retrieval quality rather than by the language model.
Takeaway 2. LLMs are poor chemistry calculators: they hallucinate reaction mechanisms, confidently produce invalid SMILES, and make errors in stoichiometry. Always validate programmatically with RDKit or similar tools.
Takeaway 3. Tool use transforms LLMs from text generators into orchestrators of computation: once tools are exposed as callable functions, the model can call a yield predictor, retrieve a crystal structure, or run a retrosynthesis engine without the user writing integration code for each request.
Takeaway 4. The bottleneck in LLM-assisted chemistry is not language understanding but knowledge currency: models have training cutoffs, cannot access paywalled journals, and lack lab-specific institutional knowledge unless explicitly provided.
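
Takeaways 2 and 3 combine naturally: the model proposes a tool call, and ordinary code executes it and checks the result instead of trusting the model's arithmetic. The sketch below assumes the model emits a JSON object with "name" and "arguments" fields, which is a common convention but not any specific vendor's API. The names TOOLS, dispatch and molar_mass are hypothetical, and a small pure-Python molar mass calculator stands in for a real chemistry toolkit such as RDKit so that the example runs without extra dependencies.

```python
import json
import re

# Standard atomic masses in g/mol for a handful of elements (illustrative subset).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}


def molar_mass(formula):
    """Molar mass in g/mol for a simple formula such as 'C2H6O' (no parentheses)."""
    parts = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject anything the pattern could not fully account for, e.g. 'h2o' or 'C2(OH)'.
    if not parts or "".join(sym + num for sym, num in parts) != formula:
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for symbol, count in parts:
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"unsupported element: {symbol}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return round(total, 3)


# Registry mapping tool names (as the model would emit them) to Python callables.
TOOLS = {"molar_mass": molar_mass}


def dispatch(tool_call_json):
    """Execute one model-proposed tool call and report success or a readable error."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    try:
        return {"ok": True, "result": TOOLS[name](**call["arguments"])}
    except (ValueError, TypeError) as exc:
        # Bad arguments are returned to the caller instead of crashing the loop.
        return {"ok": False, "error": str(exc)}


if __name__ == "__main__":
    print(dispatch('{"name": "molar_mass", "arguments": {"formula": "C2H6O"}}'))
```

Returning errors as data rather than raising lets the calling loop feed the message back to the model so it can correct the call. The same pattern applies when the registered tool is a SMILES parser or a retrosynthesis engine.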