A fragment based approach towards curating, comparing and developing machine learning models applied in photochemistry

Abstract

The development of graph neural networks for predicting molecular properties has garnered significant attention, as it enables the correlation of quickly computable atomic and bond descriptors with overall molecular properties. With the rising interest in photochemistry and photocatalysis as sustainable alternatives to thermal reactions, curation of virtual databases of computed photophysical properties for training of machine learning models has become popular. Unfortunately, current efforts fail to consider the exciton localization onto different chromophores of the same molecule, leading to potentially large prediction errors. Here we describe a molecular fragmentation strategy that can be used to overcome this limitation, while also providing a way to compare structural diversity between different libraries. Using a newly generated database of 46 432 adiabatic S0–T1 energy gaps (ALFAST-DB), we compare its diversity with two datasets from the literature and demonstrate that a fragment-based delta learning approach improves model generalizability while achieving accuracies comparable to those of traditional message passing graph neural network architectures (MPGNN).
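The fragment-based delta learning idea mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the fragmentation scheme, the descriptors, or the model. The sketch assumes that each molecule has already been split into fragments with known S0–T1 gaps, that the baseline is the lowest fragment gap (the exciton localizes on that chromophore), and that a one-feature linear model learns the remaining correction. The function names, the fragment-count descriptor, and all numbers (in eV) are hypothetical.

```python
from statistics import mean


def baseline_gap(fragment_gaps):
    """Baseline estimate: assume the exciton localizes on the fragment
    with the lowest S0-T1 gap (an assumption made for this sketch)."""
    return min(fragment_gaps)


def fit_delta(features, residuals):
    """Ordinary least squares for residual = a * feature + b (one feature)."""
    fx, ry = mean(features), mean(residuals)
    sxx = sum((x - fx) ** 2 for x in features)
    sxy = sum((x - fx) * (y - ry) for x, y in zip(features, residuals))
    a = sxy / sxx
    b = ry - a * fx
    return a, b


def predict(fragment_gaps, feature, a, b):
    """Final prediction = fragment baseline + learned correction."""
    return baseline_gap(fragment_gaps) + a * feature + b


# Made-up training data: (fragment gaps, number of fragments, reference gap).
toy = [
    ([3.0, 3.6], 2, 2.85),
    ([2.5, 3.1, 3.4], 3, 2.25),
    ([3.2], 1, 3.15),
]

# The model is trained on the difference between reference and baseline,
# not on the reference gap itself; that difference is the "delta".
features = [n for _, n, _ in toy]
residuals = [ref - baseline_gap(gaps) for gaps, _, ref in toy]
a, b = fit_delta(features, residuals)

new_prediction = predict([2.8, 3.0], 2, a, b)
```

Because the learned model only has to capture a correction to a physically motivated baseline, it can be much simpler than a model trained on the full gap. In practice the linear fit would be replaced by a more expressive regressor.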

Article information

Article type: Edge Article
Submitted: 26 Jul 2025
Accepted: 11 Oct 2025
First published: 15 Oct 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry

Chem. Sci., 2025, Advance Article

R. Pérez-Soto, M. V. Popescu, S. Kumar, L. A. Gomes, C. Lee, E. Shore, S. A. Lopez, R. S. Paton and S. Kim, Chem. Sci., 2025, Advance Article, DOI: 10.1039/D5SC05615B

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
