Self-attention, tokenisation of SMILES, and chemical language models — from BERT to domain-adapted chemistry LLMs.