Open Access Article
Thanapat
Worakul
a,
Rubén
Laplaza
ab,
J. Terence
Blaskovits
ac and
Clémence
Corminboeuf
*ab
aLaboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland. E-mail: clemence.corminboeuf@epfl.ch
bNational Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
cMax-Planck Institute for Polymer Research, Ackermannweg 10, 55128 Mainz, Germany
First published on 25th August 2025
Recently, we leveraged the FORMED repository made up of 116
687 synthesizeable molecules to deploy fragment-based high-throughput virtual screening (HTVS) and genetic algorithm (GA) searches of singlet fission (SF) molecular candidates. With these approaches, both prototypical (e.g., acenes, boron-dipyrromethane (BODIPY)) and unreported (e.g., heteroatom-rich mesoionic) classes of chromophore candidates fulfilling specific SF energetic requirements were identified. Yet, the reliance on predefined fragments limits chemical space exploration and, thus, the discovery of truly unforeseen molecular cores. Here, we exploit FORMED to train a generative learning framework driven by reinforcement learning and property predictions. The generative model rediscovers a diverse range of previously reported SF chromophore classes, including polyenes, benzofurans, fulvenoids and quinoidal systems, but also suggests an unexpected scaffold absent from the training data, neocoumarin (2-benzopyran-3-one), characterized by two endocyclic double bonds in an ortho arrangement and capped by a lactone group. An in-depth investigation reveals a diradicaloid behavior over the conjugated core comparable to 2-benzofuran, a widely known SF compound. This work highlights the potential of using both generative and property prediction models to discover candidates beyond derivatives of known chemistry for tailored material applications.
Extensive experimental and computational work has focused on designing SF materials, primarily through the screening and modification of known compounds, leading to tailored design rules.10–21 In this spirit, some of us previously built the FORMED dataset by mining the Cambridge Structural Database (Fig. 1a) and characterizing 116
687 experimentally accessible organic molecules with time-dependent density functional theory (TD-DFT).22 FORMED enabled the construction of over a million donor–acceptor copolymers by cross-coupling the C(sp2) sites of selected fragments in silico, which were subsequently screened using statistical models to identify systems with suited SF thermodynamics. Although this approach successfully identified several potential donor–acceptor systems, it relied on previously defined heuristic rules23 to limit the combinatorial space.
To navigate the chemical space more efficiently, this high-throughput effort was followed by the development of an uncertainty-controlled genetic algorithm (GA),24 based on NavicatGA,25,26 (Fig. 1b). Upon GA optimization, molecules were assembled from a FORMED-derived pool of fragments, called reFORMED, and ensemble machine learning predictive models trained on FORMED data served to score candidates. This approach led to the rediscovery of known SF compounds and to the identification of acceptors, such as heteroatom-rich mesoionic compounds, not previously investigated for SF. However, genetic optimization requires a predefined fragment database and fixed rules for their recombination, which inherently limits the potential of genetic approaches to identify structural motifs completely outside the box.27–29
This limitation is potentially overcome by deep learning-based generative models,30–37 which implicitly learn the rules for generating chemical structures. These models, trained on molecules, uncover the underlying structural patterns and relationships among them, encoding this information into a continuous latent space, a compressed representation of molecules. By sampling from this latent space, generative models create molecules that reside within the learned chemical domain. Free from the constraints of manually predefined recombination rules and fragment libraries, generative models enable a broader, unbiased chemical space exploration, with the potential of discovering compounds that traditional approaches would have missed.
Once the generative models have learned to create molecules, multiple conditioning strategies are devised to steer the generative process toward desirable properties, thereby enabling the inverse design process.38–44 There are several approaches to direct the generation to the target molecules, such as gradient optimization in the latent space,45 gradient-based guidance diffusion,38 and classifier-free guidance diffusion.46,47 Among these, reinforcement learning (RL)39–44 is the optimization method that iteratively refines the model to meet target objectives. In each RL iteration, the model generates a batch of candidate molecules, which are then evaluated using a scoring function that quantifies how closely the generated molecules align with the desired properties. Based on the evaluation, the model parameters are updated through a feedback loop that improves its ability to generate higher-scoring molecules in subsequent iterations. Over multiple iterations, the generative model's outputs are refined toward compounds that satisfy the expected properties. Li and Tabor48 demonstrated the potential of this approach by integrating a generative model with RL to identify SF candidates. Because their methodology relied on semi-empirical computations to evaluate excited-state energies during each RL iteration and because their model was trained on a ChEMBL database49 containing only small drug-like molecules, the exploration of the chemical space remained limited, leading to the identification of thiophene and acene derivatives primarily. Replacing these expensive computations with machine learning models as scoring functions and leveraging a chemically diverse database to train the models offers an efficient alternative, enabling scalable and broader chemical space exploration.
Here, we develop a data-driven design platform in which both structure generation and structure–property estimation are accomplished via machine learning. We leverage the FORMED database to train both the generative and property prediction models (Fig. 1c). By combining a generative model capable of creating diverse molecules with a robust predictive model that estimates excited-state properties with minimal expense, our approach offers an efficient means for discovering unexpected molecular scaffolds fulfilling the energetic criteria of SF materials. Our approach specifically discovers a hitherto unknown cyclic ester core, neocoumarin not present in FORMED, that has been previously synthesized. This class of molecules follows the well-known diradical design principle elaborated by Michl and coworkers,17 while overcoming some of the limitations of the 2-benzofuran core. Importantly, the approach also rediscovered most other known or predicted classes of SF molecules, including acenes, polyenes, and benzofurans.
687 molecules, along with their excited-state properties, using Chemprop v1.5.2.50 This model predicts four key electronic excited-state properties—singlet and triplet vertical excitation energies (ES1,ve, ET1,ve, and ET2,ve) as well as exciton size
—from SMILES as a molecular representation. The Chemprop model architecture consists of a 3-layer GNN with a hidden size of 300 and a dropout probability of 0.2. The dataset was randomly split into training, validation, and test sets with an 80/10/10 ratio. On the test set, the Chemprop model achieved mean absolute error (MAE) losses of 0.13 eV for ES1,ve, 0.12 eV for, 0.13 eV for ET2,ve, and 0.17 Å for the exciton size (Fig. 3).
In line with previous work,24 we also evaluated the predictive performance of the Chemprop model on an external test set derived from the reFORMED database. The trained model demonstrated acceptable predictive accuracy across all excited-state energies, with MAE values of 0.22 eV for ES1,ve, 0.20 eV for ET1,ve, and 0.31 Å for the exciton size (Fig. S1). As such, we concluded that the multi-target Chemprop model is able to predict the excited-state properties of unseen molecules accurately enough for the RL process (vide infra).
with tokens
. The joint probability is expressed as:![]() | (1) |
The training involves minimizing the negative log-likelihood (NLL), which quantifies how well the model predicts the sequences in the training data. The NLL is defined as:
![]() | (2) |
Once trained, the model generates SMILES strings by sampling tokens sequentially from the learned probability distribution P(T). Starting with an initial token (^ in our case), the model predicts the probability distribution for the next token based on the conditional probabilities it has learned. The process is repeated iteratively, with each generated token influencing the prediction of the subsequent token, until sequence generation is complete. The sequence is terminated when a predefined stop token is added ($ in our case).
log Paug(T) = log Pprior(T) + σS(T), | (3) |
To optimize the agent model, the loss function is defined as
![]() | (4) |
During the optimization loop, the RL tends to overexploit specific SMILES sequence patterns, corresponding to high reward value, which leads to the generation of structures with similar scaffolds within an iteration loop. To promote structural diversity among the molecules generated from the agent model, we employed the diversity filter implemented in REINVENT v4.65 This filter penalizes SMILES strings that are too similar to those already stored in a memory bucket, which keeps track of previously generated molecules. By discouraging the agent model from repeatedly generating structurally similar compounds, the filter ensures greater diversity. In addition, we used experience replay68–70 to improve the convergence of the RL process by storing high-scoring molecules generated during previous iterations and periodically reintroducing them into the training process.
The energy score function taken from previous work,24 where a higher score corresponds to a higher reward, was used to optimize the S1/T1 energy levels to satisfy three SF requirements:
(1) Thermodynamic constraint: ES1,ve − 2ET1,ve > −1 eV.
(2) Solar cell semiconductor compatibility: ET1,ve > 1.5 eV.
(3) Matching with the solar emission spectrum: ES1,ve < 3.8 eV.
A detailed mathematical definition of the energy score is given in the SI. In addition to optimizing the energy levels of S1 and T1, our goal was to maximize the exciton size (i.e., the root-mean square electron–hole separation) to promote delocalized singlet exciton formation. A larger exciton size is indicative of charge-transfer or delocalized excited-state character, which are both beneficial for the triplet-pair formation from the singlet state. We also consider the energy gap between the vertical second and first excited triplet states, which is to be maximized to reduce the likelihood of competing T1 to T2 upconversion processes.71 Furthermore, we bias the generative model against the generation of charged structures by penalizing the score of charged molecules.
Given the complexity of the score function, which involves multiple objectives to be optimized simultaneously, we adopted a two-stage curriculum for the RL optimization process (Fig. 2).72 In the first stage, we focused on structural constraints (more details are given in the SI). To avoid overfitting, this first stage of RL was limited to 20 iterations. In the second stage, we focus on optimizing the SF-related properties, namely the energy score, exciton size, and T1/T2 gap (ET2,ve − ET1,ve). Splitting the RL into two stages focused on different aspects significantly simplifies the learning. During each stage, the individual components of the score function were aggregated using a weighted geometric mean:
![]() | (5) |
At the end of the second stage of each trial, 1280 molecules were generated using the post-RL generative model, and their excited-state properties were predicted using the Chemprop model. Subsequently, the 10 best molecules, ranked according to their energy scores, were selected for further validation and optimized with DFT to have their properties evaluated using TD-DFT computations.
of the generated molecules reach convergence (Fig. 4a, b and S5). The ET2,ve − ET1,ve is also sufficiently high (∼1.2 eV) to suppress undesired triplet–triplet upconversion (Fig. 4c). For simplicity, we narrow the discussion to the energy score and the singlet and triplet excitation energies, which are the most critical factors for SF propensity. To evaluate the impact of reinforcement learning, we compare the average excited-state properties of molecules generated before and after the initial optimization trial. Specifically, we sample 1280 molecules from both the pre-RL and post-RL generative models (model 1) and analyze their chemical structures and predicted excited-state properties.
The pre-RL generative model creates molecules that broadly span the chemical space of the FORMED database (Fig. S8). The resulting molecules are chemically diverse, exhibiting an average Tanimoto similarity score of 0.10 and yielding ∼367 unique scaffolds87 from a sample of 1280 generated molecules (Table S3). Unique scaffolds are defined as a Murcko scaffold with a Tanimoto similarity below 0.70 relative to other Murcko scaffolds in the dataset. In terms of their excited-state properties, the distributions of vertical excitation energies for the singlet (ET1,ve) and triplet (ET1,ve) states are largely overlapping (Fig. 4d), suggesting that most generated molecules do not meet the thermodynamic requirements for SF. Of the 1280 molecules generated by the pre-RL model, only two are predicted to satisfy the energetic criteria, a low hit rate at this stage. In contrast, molecules generated by the post-RL generative model 1 are confined to a narrow region of chemical space (Fig. 5) and are less diverse, with an average Tanimoto similarity score of 0.35. These molecules mainly share the same Murcko scaffolds (1–5 unique scaffolds among 1280 generated molecules, Table S3). However, the distribution of their excited-state energy levels is well separated (Fig. 4e), with the ET1,ve centered at approximately half ES1,ve. Consequently, more than 750 structures of the 1280 molecules created by the post-RL generative model 1 are within the energetic target region for the desired SF property (Fig. 4f).
![]() | ||
| Fig. 5 t-SNE plot generated from the Morgan fingerprint representation of the generated structures from the pre-RL and post-RL generative models. | ||
Despite improving the excited-state properties, the post-RL generative model 1 predominantly suggests polyenes (e.g., 1 in Fig. 5 and 6) and similar molecules with extended π-conjugated systems that are known SF chromophores. This apparent preference for polyenes is sound considering the simplicity of their SMILES pattern, which consists of frequently occurring tokens in the FORMED database (e.g., C and =). Furthermore, these molecules are abundant in FORMED (used to train both the generative and property prediction models).
To steer the generative exploration toward other chemical space regions, we introduce custom alerts,39 SMARTS-based filters for unwanted substructures into the scoring function at both stages of the RL curriculum. If a generated molecule contains any substructure from the predefined list, its score is set to zero, effectively penalizing the generation of such molecules. Molecules containing ring systems and extensive π-conjugated frameworks (entries 1–15 in the complete list of unwanted substructures provided in the SI) were initially penalized.
With these structural constraints, the second and third optimization trials guide the generative models toward more compact and tunable molecules featuring rigid π-conjugated cores. However, each post-RL generative model continues to create molecules within the same structural class. Specifically, the second and third post-RL models predominantly generate substituted acenes88–93 (e.g., 2) and derivatives of boron-dipyrromethane (BODIPY)94–97 (e.g., 3), respectively. Notably, despite these structural constraints, the diversity of molecules generated in these trials remains comparable to that of the first RL trial in terms of the number of unique scaffolds and similarity scores (Table S3), where no custom alert filter was used. Similar to polyenes, these molecule classes have been investigated for SF and are prominently represented in FORMED, with BODIPY and anthracene appearing 621 and 928 times, respectively.
We thus include BODIPY and acene in the unwanted structure list (entries 16–17 in the complete list of unwanted substructures provided in the SI) for further exploration. Incorporating the full list of unwanted substructures in the structural constraints leads to a more challenging optimization, requiring a larger number of iterations to improve the excited-state properties of the generated molecules (Fig. S4 and S5). Seven additional optimization trials were carried out with the full list of unwanted structures. Each trial directs the generative model toward a distinct family of SF chromophores, corresponding to a unique region of chemical space, as illustrated by the dimensionality reduction plot in Fig. 5. The RL optimization curves and kernel density histogram plots of the excited-state properties of all trials are presented in Fig. S3–S5, respectively. A summary of the performance of all generative models is provided in Tables S3 and S4.
At the end of each of the 10 optimizations, we collect the top 10 candidates from 1280 molecules created by the post-RL generative model according to their predicted energy scores. These top performers are then optimized with DFT, and their vertical energies computed with TD-DFT. Among the 100 evaluated molecules, 73 meet the SF criterion, demonstrating that Chemprop reliably guides the generative model toward high-reward regions of chemical space via RL. From this pool of computationally validated SF molecules, we (re)discover molecular classes with energy splittings that satisfy the SF thermodynamic criteria. One to three representative molecules from each post-RL generative model are presented in Fig. 6. Their adiabatic excited-state energies, along with the Chemprop-predicted vertical excitation energies, are provided in Fig. S10 and S11, respectively. The computed ES1,ve and ET1,ve of our candidate molecules fall within the thermodynamically favorable region for S1 → 2T1 conversion. Furthermore, these candidates' ET1,ve value is adequate for their potential integration into solar cell applications (Fig. 6) and the T1–T2 energy gaps are large enough to suppress unwanted triplet upconversion. As stated above, the three initial trials coincide with the rediscovery of polyenes (1), acenes (2), and BODIPY (3). In the following seven runs, which exclude the full list of unwanted substructures, the generative model uncovered derivatives such as substituted 2-benzofurans (4), which were initially screened and identified by Michl and coworkers17 as potential SF chromophores. Similarly to their findings, our candidates feature substitutions at the C1 and C3 positions,98–100 lowering the triplet energies to better align with the energy requirements for solar cell applications than the unsubstituted counterparts (Fig. S14).
We also encountered a variety of fulvenoid and quinoidal compounds (5–8), recently identified as promising SF scaffolds.12,101 The fulvenoid derivatives feature diverse heterocyclic rings such as furan, thiophene, imidazole, and hydantoin, linked to aromatic rings via an exocyclic bridge containing one or more methine moieties, often decorated with a cyano group. Similar to the substituted fulvenes previously identified,12 these fulvenoid structures maintained a favorable ES1,ve
:
ET1,ve ratio. While these structures merit further investigation, they will not be the focus of the remainder of this work.
In line with a previous work by some of us, the post-RL generative model 9 yielded molecules containing mesoionic N-oxide motifs (9).102 Mesoionic heterocycles have been identified as good acceptor units for charge transfer in donor–acceptor systems.14,21,24 Interestingly, structures embedding the N-oxide in an anthracene core exhibit lower ET1,ve and similar ES1,ve compared to the substituted acenes, which increases the splitting (ET1,ve = 1.4 eV, and ET1,ve = 3.4 eV).
Overall, the diversity of the (re)discovered singlet fission chromophore candidates demonstrates the capability of the optimization pipeline, powered by a tailored scoring function and the FORMED database, to identify SF molecules across different structural classes. Furthermore, the successive inclusion of unwanted substructures as a penalty term in the RL step underscores the flexibility of the workflow.
The derivatives selected by the model include amine and aryl substituents on the pyrone ring. Substituted neocoumarin structures (10 and 11) generated by our optimization trials exhibit proper energetics for SF, with ET1,ve ≈ 1.4 eV and ES1,ve ≈ 3.0 eV. The distribution of the excited-state character, as visualized through the hole and electron densities derived from the natural transition orbitals (Fig. S15), shows significant overlap between the singlet and triplet states, both being localized near the aromatic subsystem of the fused ring core.
To identify the molecular core responsible for the SF-relevant properties and extract concrete design principles, we performed adiabatic TD-DFT computations on the bare neocoumarin structure (13, Fig. 7), which confirmed good S1/T1 splitting (ET1,ad = 1.49 eV and ES1,ad = 3.34 eV). We also noted that the synthesis of neocoumarin and derivatives thereof has been accomplished in a number of previous works.75,103–105 Interestingly, while coumarin (1-benzopyran-2-one) and isocoumarin (2-benzopyran-1-one) motifs appear 889 and 121 times, respectively, in the FORMED database, neocoumarin (2-benzopyran-3-one) is not present. Neither coumarin (ET1,ad = 2.75 eV, ES1,ad = 4.43 eV) nor isocoumarin (ET1,ad = 2.46 eV, ES1,ad = 4.43 eV) exhibit energies which are conducive to SF (green crosses in Fig. 8; see also Fig. S14). This implies that the optimization strategy succeeded at finding an out-of-sample, synthesizable chemical motif with good SF properties in spite of its absence from the training data and the inadequacy of its closest constitutional isomers.
The synthetic viability of neocoumarin opens the door to a whole class of compounds with shared structural and electronic characteristics. In order to exhaustively identify beneficial modifications and rule out modifications that disturb the SF energetics, we systematically explored the chemical space around the neocoumarin core by manually constructing diverse derivatives and computing their excited-state properties using adiabatic TD-DFT (Fig. 7). Vertical T1 and T2 excited state energy levels of these molecules are reported in Fig. S13.
We first varied the number of fused rings in the system. Removing the fused benzene ring increases ES1,ad to 4.38 eV (14), while extending the number of fused rings reduces ET1,ad to 0.86 eV for two fused benzene rings (15) and 0.47 eV for three fused benzene rings (16), leading to poor SF thermodynamics in both cases, thus establishing that an energetic sweet spot is achieved with two rings. The role of the lactone moiety was then explored by swapping the ester functional group in the lactone ring with an amide (17, forming a lactam), which lead to an increase of ET1,ad by around 0.2 eV, and thus a poorer splitting. Replacing the lactam with a thioester instead (19) lowers ET1,ad by 0.1 eV while ES1,ad decreases by about 0.2 eV, which keeps the splitting ratio approximately constant. These results corroborate that the ortho-quinoidal double bonds are key to the photophysical properties of the system, whereas the nature of the lactone-type endcapping functional group and substituents on the pyranone unit are synthetic handles of interest for potential kinetic stability or synthesizability reasons.
The choice of ring system was also explored by manually constructing and testing different 5- and 6-membered aromatic heterocycles as a replacement for benzene to understand the limitations of this strategy. With 5-membered heterocyclic rings such as pyrrole (21, and 22), thiophene (23), the ES1,ad
:
ET1,ad ratio remains close to ideal while ET1,ad increases to around 2.0 eV. However, replacing benzene with pyridine (24) has a negligible effect on SF energetics.
Since replacing benzene with other aromatic rings proved to be a viable strategy to fine-tune the neocoumarin core, we tested the role of aromaticity by using non-aromatic or anti-aromatic fused rings instead. Replacing benzene by cyclobutene or cyclooctatriene (25–26) leads to a dramatic lowering in ET1,ad, while breaking the delocalized π system through saturation (27–28) leads to a consistent increase in ET1,ad, in both cases hampering SF energetics.
We note the similarity between the neocoumarin core and the previously identified 2-benzofuran derivatives pioneered by Michl and coworkers (Fig. 9a).17,106–109 In both cases, the presence of o-xylylene motif characterized by two ortho quinoid endocyclic double bonds appears to be conducive to good SF energies. As expected from this analysis, removing this motif by altering aromaticity (as in 27 and 28) or by changing the position of the ester functional group in the 2-pyrone ring (Fig. 8), disturbs the energetics of the system.
In the case of 2-benzofuran, the presence of the o-xylylene motif (dark green in Fig. 9b), has been linked with diradicaloid character, an electronic property associated with SF propensity.100,103,104,110,111 To assess whether the same applies to the identified class of molecules, we evaluated the di- and tetraradical character (y0 and y1, respectively) of neocoumarin and selected derivatives via natural orbital analysis of their unrestricted Hartree–Fock (UHF) wavefunctions (see SI for additional analysis).110,112,113 The results are compiled in Fig. 9. We find that the diradical character in neocoumarin (y0 = 0.16) is slightly higher than in benzofuran (y0 = 0.09, hence the lower T1 energy of the former, cf.Fig. 8) while both retain very small tetraradical character (y1 = 0.01). By comparison, coumarin has little of both diradical and tetraradical character (y0 = 0.02, y1 = 0.00). These findings, along with the features of the spin density plots (Fig. 9) highlight the critical role of the ortho quinoid arrangement and the necessity of preserving aromaticity in maintaining singlet fission (SF)-relevant electronic properties, offering valuable insights for the design of next-generation SF molecules.
Our results thus highlight the potential of generative design not only to rediscover known SF candidates but also to explore uncharted regions of chemical space, enabling the identification of out-of-the-box chromophores with tailored properties. This offers a promising pathway for advancing the discovery of functional materials. Although we successfully identified a variety of target molecules, our current approach relies on manually guiding the generation process by excluding previously discovered scaffolds. This process could be streamlined by automatically detecting key scaffolds identified at the end of each trial and dynamically updating the list of excluded motifs for subsequent trials. Furthermore, direct comparisons with other approaches (e.g., genetic optimization) across a broader range of practical chemical applications are still awaited.
The SI contains the Chemprop model performance metrics, the definitions of the scoring functions used for the reinforcement learning, the implementation details related to the training of the generative model and to the curriculum-based optimization, the complete results for all optimization trials, the additional reinforcement learning runs targeting unexplored scaffolds, the computed excited-state properties and the Chemprop prediction errors for the candidate molecules as well as further excited-state analyses of coumarin, isocoumarin, and neocoumarin derivatives. Supplementary information is available. See DOI: https://doi.org/10.1039/d5sc03184b.
| This journal is © The Royal Society of Chemistry 2025 |