Robert Strothmann (a), Mehran Amanpur (b), Tomáš Neveselý (b), Stefan Hecht* (b), Karsten Reuter* (a) and Johannes T. Margraf* (a, c)
(a) Fritz-Haber-Institute of the Max-Planck-Society, Faradayweg 4-6, 14195 Berlin, Germany. E-mail: reuter@fhi.mpg.de
(b) Humboldt-Universität zu Berlin, Brook-Taylor-Straße 2, 12489 Berlin, Germany. E-mail: sh@chemie.hu-berlin.de
(c) University of Bayreuth, Bavarian Center for Battery Technology (BayBatt), Weiherstraße 26, 95448 Bayreuth, Germany. E-mail: johannes.margraf@uni-bayreuth.de
First published on 16th September 2025
This study presents the development and application of a generative machine learning model for the design of novel spiropyran photoswitches with enhanced switching speed and absorption bands with small spectral overlap between the open and closed form (i.e. high addressability). Leveraging a scaffold decoration approach, we fine-tuned a general chemical recurrent neural network (RNN) model on a curated dataset of photoswitches. The fine-tuned model was evaluated against both the pretrained baseline and literature-reported spiropyran compounds, demonstrating superior performance in generating diverse and novel candidates. Notably, the fine-tuned model effectively mitigates common biases in decoration patterns and functional group selection observed in the literature. The study also outlines the synthesis and experimental characterization of several newly designed spiropyran photoswitches, validating the design principles derived from the generative model. These findings highlight the potential of generative models in accelerating the discovery of advanced molecular photoswitches with tailored properties.
This wide range of applications leads to different demands for the switching behavior. Thermal backreactions are for instance not desired in data storage applications, since the molecules need to be switched back and forth in a controlled manner. This class of switches is referred to as P-type photoswitches in the literature (P indicating a photochemical backreaction). The ideal timescale for thermal backreactions in other applications (using T-type thermal switches) can range from minutes in drug delivery to less than a second in 3D printing applications.18,19 Furthermore, the wavelengths used for switching must often be in some desired range due to technical as well as material/biological boundary conditions (such as tissue penetration depth).
All of this implies that there is no single optimal photoswitch, but rather a range of optimal switches for different applications. Even for a given application, the existence of multiple competing design targets will usually lead to a Pareto front of candidates that represent different trade-offs between the target properties. Overall, photoswitch design is thus a highly challenging task. Unfortunately, this challenge is still most commonly addressed via trial and error. Here, a common strategy is to modify known photoswitches, for example by adding or exchanging side groups. This approach is prone to human bias towards specific chemistries, e.g. based on previous knowledge or synthetic accessibility. It is also inefficient, requiring a large number of time-consuming syntheses to cover a small volume of chemical space. Indeed, this is a general problem of molecular design and by no means limited to photoswitches.
More unbiased and efficient approaches for molecular design based on machine learning (ML) have therefore received much attention in recent years, for example in drug discovery and materials design.20–26 Here, generative models, which implicitly learn molecular construction rules, are a particularly elegant approach. Unlike virtual screening or combinatorial approaches, generative models can transcend currently known chemical libraries, allowing for true de novo design. Different ML architectures have been used for this purpose, including Recurrent Neural Networks (RNN),27 variational auto-encoders,28 autoregressive deep neural networks,29 and generative adversarial networks.30 These approaches have also been used specifically in the context of the popular class of azobenzene photoswitches, including ML-enhanced discovery,31 excitonic property learning,32 and tuning of thermal barriers with ML interatomic potentials.33
In this paper, we explore the use of generative ML for the photoswitch design challenge. As a use case, we focus on the application of xolography, a volumetric 3D printing method,19 based on SP photoswitches serving as spatially addressable dual-color photoinitiators. This enables the fast layer-wise printing of millimeter-sized objects in a box of viscous resin, using an orthogonal laser-projector setup. Specifically, closed SP molecules are first opened to their merocyanine (MC) form by irradiation with wavelength λA (see Fig. 1(a)). In a second step, the MC form is photochemically excited with another light source (projector) to initiate a radical polymerization cascade. Because the MC form is only present in places that were previously irradiated with λA, the polymerization can be initiated in a highly localized fashion by the projector.
From a material standpoint, the speed of the printing process is primarily governed by the thermal lifetime of the open MC form of the switch. A perfect photoswitch for this application should display a fast thermal backreaction to the SP form, with a lifetime on the order of the illumination time of the second light source (projector). In other words, the thermostability of the open form should be low (see Fig. 1(b)). A second requirement for this application is that the closed form can be excited independently from the open form, in order to avoid unwanted polymerization initiated by λA via residual open MC molecules in the resin. It is thus desirable that there is a suitable wavelength λA with small spectral overlap between the SP and MC forms. This property is referred to as addressability in the following (Fig. 1(c)).
Following Fig. 1(a), the most intuitive descriptor for thermostability would be the energetic barrier for the transition from the MC form back to the SP form. Unfortunately, this schematic depiction is overly simplistic since multiple conformers exist for both the SP and MC forms, so that in general several transition states lie between the SP and MC forms. This makes a reliable high-throughput prediction of reaction barriers extremely challenging. Furthermore, computationally efficient semiempirical methods like GFN2-xTB that work well for predicting molecular geometries are not sufficiently accurate for barriers.34 As a consequence, the computational effort of predicting kinetic barriers accurately would be very high (Fig. 3).
We therefore define a heuristic descriptor for thermostability, which is physically motivated and cheap to compute. In a previous investigation of a prototypical SP photoswitch, it was found that the backreaction follows a rotation along the highlighted bond in Fig. 1(b).35 The MC structure has two resonance structures (a fully conjugated and a zwitterionic one), depending on which the rotating bond is formally either a double or single bond. We can thus infer that the thermostability is lower if the zwitterionic character of the structure is higher. We therefore use the length of the rotating bond in the MC form (predicted with the GFN2-xTB method, see Computational details) as a proxy for inverse thermostability of a given structure. While this approach is certainly not perfect, it yields a descriptor that is robust and fast to compute. For convenience, bond lengths are reported as relative deviations from the prototypical SP 6-nitro-1′,3′,3′-trimethylspiro[2H-1-benzopyran-2,2′-indoline] (6-nitro-BIPS, see Fig. 3)36 so that values above 0% indicate longer bonds (and therefore lower predicted thermostabilities).
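The relative bond length descriptor can be illustrated with a minimal sketch. The function name and the two bond lengths below are hypothetical and chosen purely for illustration; in the workflow described here, the lengths would come from GFN2-xTB geometries of the MC form and of 6-nitro-BIPS.

```python
def relative_bond_length(bond_length: float, reference_length: float) -> float:
    """Return the deviation of a bond length from a reference, in percent."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return 100.0 * (bond_length - reference_length) / reference_length


# Hypothetical bond lengths in angstrom (illustrative values, not from the paper).
reference = 1.400  # rotating bond in the reference molecule
candidate = 1.414  # rotating bond in a candidate molecule

deviation = relative_bond_length(candidate, reference)
print(f"relative bond length: {deviation:+.2f} %")
```

A positive value marks a bond that is longer than in the reference, which under the heuristic above would point to a lower predicted thermostability.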
As described above, the main ingredients needed to compute the addressability are the ground and excited state energies of both forms, as well as the oscillator strengths σ for each transition. As for the barriers, semiempirical calculations proved to be of too low accuracy for this purpose in preliminary tests (see Fig. S1). We therefore used Time-Dependent Density Functional Theory (TD-DFT) calculations with the hybrid PBE0 functional and def2-TZVP basis set, from which Gaussian broadened UV/vis spectra are obtained (see Computational details). The quotient of SP and MC spectra, which is used to define the addressability, is shown in green in Fig. 1(c).
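As a rough illustration of how Gaussian broadened spectra and their quotient might be computed, consider the sketch below. It is not the procedure from the Computational details: the broadening width, the energy grid, the small floor in the denominator and all transition data are assumptions made for this example, and the way the quotient is reduced to a single addressability value is deliberately left open.

```python
import math


def broadened_spectrum(energies_ev, strengths, grid_ev, width_ev=0.3):
    """Sum one normalized Gaussian per transition, weighted by its oscillator strength."""
    norm = 1.0 / (width_ev * math.sqrt(2.0 * math.pi))
    spectrum = []
    for e in grid_ev:
        total = 0.0
        for e0, f in zip(energies_ev, strengths):
            total += f * norm * math.exp(-0.5 * ((e - e0) / width_ev) ** 2)
        spectrum.append(total)
    return spectrum


def spectral_quotient(sp, mc, floor=1e-6):
    """Pointwise SP/MC ratio; the floor (an assumption) avoids division by zero."""
    return [a / max(b, floor) for a, b in zip(sp, mc)]


# Hypothetical transitions (energies in eV, oscillator strengths), for illustration only.
grid = [1.5 + 0.01 * i for i in range(351)]  # 1.5 eV to 5.0 eV
sp = broadened_spectrum([3.8, 4.5], [0.20, 0.35], grid)
mc = broadened_spectrum([2.2, 3.4], [0.60, 0.10], grid)

quotient = spectral_quotient(sp, mc)
best = max(range(len(grid)), key=quotient.__getitem__)
print(f"largest SP/MC quotient on this grid at {grid[best]:.2f} eV")
```

A large quotient at some excitation energy means the closed form absorbs there while the open form barely does, which is the situation the addressability criterion is meant to reward.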
With this, computationally affordable descriptors for thermostability and addressability are defined, which can be used to train and validate the generative model.
The REINVENT scaffold-decoration model is trained in a self-supervised manner, based on scaffold-decoration pairs that are created with a slicing scheme applied to all training molecules (see Fig. 2). Each molecule in general yields several scaffold-decoration pairs depending on its size and the sliced bonds. This procedure allows the model to learn the grammar of chemical modifications from the examples in the training set. The trained model can subsequently be used to decorate any target scaffold on arbitrary predefined decoration positions. Importantly, it can also be applied to scaffolds that were not present in the training set and can generate completely novel decoration groups.
Fig. 2 Slicing scheme for a training molecule. Slicing along aliphatic bonds can lead to different definitions of the scaffold and the decoration. Stars indicate open decoration positions.
The full generative TL workflow (see Fig. 4) starts with pretraining a REINVENT baseline model on a subset of ChEMBL, which is a manually curated database of over 2M bioactive molecules. Herein, training molecules have been filtered as in the work of Arús-Pous et al.37 for size, number of rings and other aspects. The dataset was further tailored for the SP application by only selecting systems which feature a benzene or pyrrolidine moiety (the two substructures which occur in the SP photochromic core). In total 783 k molecules resulted from this filtering, yielding a training set of 36.8 M scaffold-decoration pairs after slicing. The resulting model is referred to as the baseline model in the following. Further details on the training data and training process can be found in the Computational details.
In the next step, the possible SP decoration patterns are provided to the model. For two side-groups and 10 decoration sites, a total of 45 unique decoration patterns can be identified. Here, the two open positions at the double bond in the central pyran ring are not considered because they are synthetically inaccessible. Since the generation with the RNN model is probabilistic, each decoration pattern is queried multiple times to sample extensively from the chemical space of possible SP molecules. It is pertinent to emphasize that the REINVENT model does not merely memorize the decorations in the training set, but generates new side-groups that are not present in the input data. Indeed, 52.7% of the unique decorations predicted by the final model were not part of the training data.
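The count of 45 patterns follows from choosing 2 of the 10 decoration sites without regard to order, i.e. 10 × 9 / 2 = 45. A short sketch that enumerates them is given below; the site labels 1 to 10 are a hypothetical numbering used only for this example.

```python
from itertools import combinations
from math import comb

n_sites = 10        # decoration sites on the SP scaffold
n_side_groups = 2   # side-groups attached per molecule

# Each pattern is an unordered pair of distinct site labels.
patterns = list(combinations(range(1, n_sites + 1), n_side_groups))

print(len(patterns))                  # number of unique decoration patterns
print(comb(n_sites, n_side_groups))   # same number from the binomial coefficient
print(patterns[:3])                   # first few patterns
```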
We next draw a sample from these new photoswitch candidates (144 molecules for each of the 45 decoration patterns) and compute the addressability and thermostability descriptors for each. From this, a set of performant SP molecules is selected. Since addressability and thermostability can be conflicting criteria, we cannot define a simple ranking of all candidates. Instead, we selected molecules according to the Pareto front of addressability and thermostability via a non-dominated sorting algorithm. To this end, we define the first three layers of the Pareto front as the set of performant SP candidates, in order to strike a balance between performance and chemical diversity.
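The layer-wise selection can be sketched with a simple non-dominated sorting routine. This is a minimal illustration, assuming both descriptors are to be maximized; the function names and the descriptor values are hypothetical, and the quadratic-time peeling shown here is not necessarily the algorithm used in the study.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_layers(points, n_layers=3):
    """Peel off successive non-dominated fronts; returns lists of indices into points."""
    remaining = set(range(len(points)))
    layers = []
    while remaining and len(layers) < n_layers:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        layers.append(front)
        remaining -= set(front)
    return layers


# Hypothetical (addressability, relative bond length in %) pairs, for illustration only.
candidates = [(6.0, 0.5), (2.0, 2.0), (5.0, 1.5), (1.0, 0.2), (4.0, 1.0), (3.0, 0.4)]

layers = pareto_layers(candidates, n_layers=3)
performant = [i for layer in layers for i in layer]
print(layers)
print(performant)
```

Keeping more than the first front trades a little performance for a larger and chemically more diverse retraining set, which is the balance described above.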
In the last step of the TL workflow, the REINVENT model is fine-tuned using these performant SP candidates as a new training set. The fine-tuned model was then used to generate additional, improved photoswitch candidates. Note that in addition to providing performant SP candidates for retraining, the first set of generated systems also provides important insights into which decoration patterns are favorable for our design criteria, by considering how frequently each pattern occurs in the performant set. Indeed, this pattern effect is substantial, so that we also drew a correspondingly weighted sample of molecules generated by the fine-tuned model for evaluation. Nonetheless, at least five candidates were included for each decoration pattern, to ensure a diverse sample of candidates. Additional ablation studies on different steps of the workflow can be found in the SI. These confirm that both fine-tuning and reweighting promising decoration patterns are important to achieve optimal results.
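One way such a weighted sample with a per-pattern minimum could be set up is sketched below. The proportional allocation rule, the function name, the pattern labels, the counts and the budget are all assumptions for illustration; the text only specifies that sampling is weighted by pattern frequency and that each pattern receives at least five candidates.

```python
def allocate_samples(pattern_counts, budget, minimum=5):
    """Split a sampling budget across patterns in proportion to their counts, with a floor."""
    n_patterns = len(pattern_counts)
    if budget < minimum * n_patterns:
        raise ValueError("budget too small for the requested minimum per pattern")
    total = sum(pattern_counts.values())
    spare = budget - minimum * n_patterns  # budget left after the guaranteed minimum
    allocation = {}
    for pattern, count in pattern_counts.items():
        # Integer arithmetic rounds down, so the allocation never exceeds the budget.
        extra = spare * count // total if total else 0
        allocation[pattern] = minimum + extra
    return allocation


# Hypothetical counts of each decoration pattern in the performant set.
counts = {(1, 8): 30, (2, 8): 15, (3, 5): 0, (4, 10): 5}

allocation = allocate_samples(counts, budget=120)
print(allocation)
```

Patterns that never appear in the performant set still receive the minimum number of candidates, so the evaluation sample does not collapse onto a few favored patterns.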
In addition, a set of literature-known doubly substituted SP molecules was compiled via a SP substructure search on SciFinder and PubChem, resulting in 976 doubly decorated SP molecules (see Computational Details). Fig. 5 provides an overview of the addressabilities and relative bond lengths in both sets. Focusing on the extreme values, the central panel shows the convex hulls of the different distributions, including the respective Pareto fronts towards the top right. Additionally, marginal distribution plots for the addressability and bond lengths are included as separate panels on the top and right.
Overall, the molecules of the literature set show a bimodal distribution for the relative bond length criterion with maxima around 0% and 1%, with a somewhat higher density at the shorter bond length peak. In contrast, the generative datasets display unimodal distributions peaking around 1%. Interestingly, the fine-tuned model has an additional shoulder and tail towards even longer bond lengths, indicating that the fine-tuning indeed works as expected. For the addressability, all datasets are unimodal and peaked at the lower end of the range. Again, the fine-tuned model displays a pronounced tail towards higher addressability and significantly larger mean values (6.5 vs. 2.4 for the baseline). In terms of the Pareto fronts, it can be seen that the fine-tuned model dominates the baseline, which in turn dominates the literature data. Since the Pareto fronts correspond to the most extreme values in the sample (i.e. they are outliers by definition), the precise shape of the front somewhat varies between different runs of the workflow. Nonetheless, the trends described above are robust across multiple independent runs.
The bimodal distribution of the literature data in terms of bond lengths to some extent reflects the biases in traditional “trial and error” molecular design. Many of the reference molecules are decorated in para-position to the oxygen, and often feature a nitro group. This is due to the large number of 6-nitro-BIPS derivatives that have been reported in the literature. Unsurprisingly, this leads to a bond length distribution peaked around the 6-nitro-BIPS reference. Another factor that likely influences the trend towards shorter bond lengths in the literature distribution is the fact that long thermal backreaction times are actually desired in some applications, such as energy storage or drug delivery.
In contrast, the distribution generated by the baseline model has no bias towards 6-nitro-BIPS or other previously known photoswitches. This shows that SP molecules naturally tend towards longer bond lengths across chemical space. Assuming that the inverse relation between bond length and thermostability is robust, this indicates that a large range of fast-switching SP molecules is unexplored. On the other hand, the addressability distributions of the literature and baseline datasets are virtually identical. Literature bias thus does not seem to affect this property significantly.
Comparing the fine-tuned and baseline distributions, we find that both properties are significantly influenced by the retraining step, vindicating our TL approach. Here it appears that the addressability is a more elastic property, though this is likely due to its definition as a quotient of oscillator strengths, which can in principle be arbitrarily large. In contrast, bond lengths are naturally constrained by chemistry, and variations beyond a few percentage points would be highly unusual.
In general, the fine-tuned model thus samples a much wider variety of decoration sites, providing the possibility for novel discoveries. It is also interesting to note that in some cases the patterns are opposed to what is found in the literature: for example, decorations meta to the oxygen are not found in the literature at all but are among the most common for the fine-tuned model. These trends can again be related to biases in the literature and opposing design targets (e.g. long vs. short backreaction times). While a decoration of an electron withdrawing group ortho or para to the oxygen can help to stabilize the open form, an electron withdrawing decoration meta to the oxygen (and therefore para to the rotatable double bond) can weaken the double bond character via mesomeric effects.
In terms of biases, the influence of 6-nitro-BIPS derivatives on the literature dataset is clearly evident in Fig. 6. However, this bias should not simply be interpreted as a lack of creativity on the side of synthetic chemists. Instead, the synthetic accessibility of the site para to the oxygen is much higher when SPs are synthesized by condensation of a substituted salicylaldehyde with a substituted indoline. The generative models used herein do not explicitly account for synthesizability, though concepts of chemical stability should implicitly be learned from the ChEMBL training data. Indeed, a recent review by Bilodeau et al. highlights the role of synthesizability for molecular generative models.47 Westermayr et al. showed that the structure-based synthetic complexity score (SCS) of Coley et al. could be used to bias generative ML models towards synthetically accessible systems.45,48 However, we found that this approach cannot easily be transferred to the current scaffold decoration setting, because the SCS distribution for decorated SP scaffolds is highly static (see Fig. S6). Furthermore, overemphasizing synthesizability can also unduly constrain the novelty of the generated molecules, in particular when synthesizability (an inherently subjective concept) is quantified based on literature-known molecules. At this point, a critical assessment by a trained synthetic chemist is therefore still the best option.
In Fig. 7, the molecules in each box are arranged from highest to lowest addressability from top left to bottom right (and reverse for the relative bond length). Some general structural trends are similar between both datasets. For structures with high addressability (first row), conjugated chains attached meta to the SP oxygen (index 8 in Fig. 6) are prominent. For the fine-tuned model these decorations additionally feature heterocycles. These modifications are physically plausible, as they would be expected to lead to a red-shift of the absorption bands, in particular for the MC form. A second common decoration for highly addressable systems is a methoxy group ortho to the oxygen, which may have additional mesomeric effects.
To achieve long relative bond lengths, the fine-tuned model consistently decorates the position next to the double bond of the central pyran ring (index 10 in Fig. 6) with bulky substituents. This likely has steric effects on the conformational ensemble of the MC form, favoring longer bonds at this position.
Whether this has the desired effect of lower thermostability of the MC form is unclear, however, since such substituents could potentially also lead to larger barriers for the backreaction or prevent switching altogether. This highlights a general risk in proxy-based optimization, namely that the descriptor can be optimized in a way that is detrimental to the intended goal. On the other hand, these molecules also consistently feature nitrogen-based substituents, which could have beneficial electronic effects by extending the conjugated π-system, in addition to the electron-withdrawing effects of the NO2 moiety. Additionally, some functional groups predicted by the model can be highly sensitive to their chemical environment, so that solvent polarity or pH can have a drastic influence on their photophysical properties. Thus, additional precautions must be taken when evaluating the predicted molecules. Comparing the baseline and fine-tuned datasets, it is also interesting to note that the former more frequently contains functional groups that are typical in medicinal chemistry, such as cyclopropane or trifluoromethyl groups, which are much less prevalent in the fine-tuned set. This indicates that the fine-tuning moves the model away from the medicinal chemistry focused ChEMBL training set.
To assess the biases inherent in the ChEMBL training set, we explored the PubChem dataset as an alternative (see Fig. S3). This reveals that the training set of the generative model influences both the nature and the complexity of the decorations. For example, we observed a trend towards more hydrocarbon-based substituents in PubChem, while the ChEMBL decorations tend to contain more halogens and aryl substituents. Furthermore, the PubChem model yields overall more complex decorations. The underlying training data thus has significant implications for the chemical space accessible to the generative model. The increased complexity of the PubChem decorations, while potentially enhancing the performance of the resulting photoswitches, also presents challenges related to reduced synthetic accessibility.
Fig. 8 Scatterplot of descriptors of all photoswitches generated by the fine-tuned model. Experimentally synthesized molecules are shown in red and highlighted in the margins.
Clearly, this experimental dataset is inherently limited by the challenge of synthesizing and isolating certain model-predicted molecules. A significant portion of the models' predictions involves molecules that, while chemically intriguing, are challenging to synthesize and isolate due to underlying thermal reactivity that is not directly related to their photoswitching properties. To address this, candidate molecules were subjected to retrosynthetic analysis to identify synthetically approachable structures. Note that even the elusive molecules remain valuable for advancing theoretical studies, as they provide critical insights into the photophysical, structural, and thermochemical factors governing spiropyran switching behavior.
Among the synthesized compounds were the following derivatives. Spiropyran 1, featuring an electron-withdrawing nitro group on the indoline part of the molecule and a methoxy group, was found to have red-shifted absorption maxima of the open merocyanine form, tailing to 700 nm, and a significantly lower activation barrier (9.85 ± 2.99 kJ mol−1) for thermal ring-closure than the model derivative. Compound 2, with two nitro groups, was measured to have an activation barrier of 63.84 ± 1.22 kJ mol−1. Compound 3 showed a significantly different absorption of the closed form compared to the model compound and the previously measured compounds, with an absorption maximum at only 310 nm. The activation barrier for ring-closure of this compound was measured as 39.94 ± 3.44 kJ mol−1. Spiropyran 4 was observed to undergo slight opening upon irradiation at 313 nm at −35 °C; however, due to the freezing point of acetonitrile, the exact activation barrier could not be determined. For compound 5, an activation barrier of 26.52 ± 1.27 kJ mol−1 was measured. This derivative shows how the influence of the alkoxy group differs from that in its regioisomer 1. Compound 6 was observed to have a blue-shifted lowest absorption maximum of the merocyanine form, and its activation barrier was found to be 42.62 ± 1.82 kJ mol−1.
For reference, the activation barrier of 6-nitro-BIPS was measured as 101.77 ± 0.45 kJ mol−1. A full comparison of all measured candidates can be found in the SI (Fig. S8). This also allows a post-hoc validation of the bond-length criterion for thermostability. We find a moderate negative correlation (with R² = 0.67) between the experimental activation barriers and the relative bond length. Though not perfect, this is certainly a useful accuracy for guiding molecular design, especially given the heuristic nature of the descriptor.
Beyond statistical analyses of the generated chemical spaces, we have also critically examined the individual structures designed by the generative models. In agreement with previous reports, we find that the synthesizability of some of the candidates is likely low. Nonetheless, consistent design rules regarding the nature of the functional groups and the decoration patterns can be derived from the proposed molecules. These have been validated by synthesizing six molecules. All of the newly synthesized compounds are predicted to display shorter thermal half-lives of their merocyanine isomers when compared to the baseline 6-nitro-BIPS molecule. This was confirmed by spectroscopic measurements, which show that the observed activation barriers are significantly lower than for 6-nitro-BIPS in all cases. However, even though the qualitative prediction is correct, the criterion of relative bond length was found to not correlate perfectly with the measured barriers.
In future work, we aim to improve the generative design of photoswitches by developing more reliable performance descriptors and exploring ‘human-in-the-loop’ approaches. ML-accelerated quantum chemical calculations represent an intriguing route for predicting more accurate thermal backreaction barriers. Indeed, Axelrod et al. have shown that equivariant neural networks can be trained with active learning to predict diabatic ML potential energy surfaces for azobenzene photoswitches.33 However, even for a somewhat smaller space of 3100 molecules (compared to the space considered herein), this required 570 000 training geometries, which would need to be added to the current cost of transfer learning and evaluation (13 000 DFT calculations in one run of the current workflow). Moreover, it should be noted that the switching of azobenzenes is less complex than that of spiropyrans, due to the presence of multiple conformers in the open merocyanine form of the latter. Therefore, to obtain an accurate diabatic ML potential for a broad chemical space of spiropyrans, an even larger computational effort would be required. This is out of the scope of the current study, yet certainly worth pursuing in future work. Beyond this, the establishment of dedicated experimental libraries for further benchmarking of computational models can further increase the quality of the descriptors, and synthesizability should be considered in a more rigorous manner.
Supplementary information is available. See DOI: https://doi.org/10.1039/d5dd00327j.
This journal is © The Royal Society of Chemistry 2025