Clelia
Middleton
a,
Conor D.
Rankine
ab and
Thomas J.
Penfold
*a
aChemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK. E-mail: tom.penfold@ncl.ac.uk
bDepartment of Chemistry, University of York, York, YO10 5DD, UK
First published on 24th April 2023
Revolutionary developments in ultrafast light source technology are enabling experimental spectroscopists to probe the structural dynamics of molecules and materials on the femtosecond timescale. The capacity to investigate ultrafast processes afforded by these resources accordingly inspires theoreticians to carry out high-level simulations which facilitate the interpretation of the underlying dynamics probed during these ultrafast experiments. In this Article, we implement a deep neural network (DNN) to convert excited-state molecular dynamics simulations into time-resolved spectroscopic signals. Our DNN is trained on-the-fly from first-principles theoretical data obtained from a set of time-evolving molecular dynamics. The train-test process iterates for each time-step of the dynamics data until the network can predict spectra with sufficient accuracy to replace the computationally intensive quantum chemistry calculations required to produce them, at which point it simulates the time-resolved spectra for longer timescales. The potential of this approach is demonstrated by probing dynamics of the ring opening of 1,2-dithiane using sulphur K-edge X-ray absorption spectroscopy. The benefits of this strategy will be more markedly apparent for simulations of larger systems which will exhibit a more notable computational burden, making this approach applicable to the study of a diverse range of complex chemical dynamics.
This support often focuses on the simulation of quantities such as electronic state population kinetics; which can be compared directly to time constants extracted from fitting kinetic models to the experimentally-obtained observables. While useful, these theoretical quantities are not guaranteed to be directly related to any of the experimental observables, and previous works6 have highlighted the importance of simulating the experimental observables directly to avoid the misinterpretation of the data. Ultimately, the direct evaluation of experimental observables using simulations provides the critical connection between experiment and theory that is required to maximise the reliable information extracted from the experiment. It also supplies a common vocabulary for theoreticians and experimentalists working on light-triggered processes to exchange and develop meaningful interpretations of experimental data.
A time-dependent theoretical framework which describes excited-state non-adiabatic nuclear dynamics is often essential to reproduce accurately the experimental observables. When it comes to carrying out simulations under such a framework, trajectory-based approaches [e.g. trajectory surface hopping (TSHD),7ab initio multiple spawning (AIMS),8 or variational multi-configurational Gaussian (VMCG) dynamics9] are appealing since they circumvent the challenge presented by the exponential scaling of quantum dynamics.10–12 Indeed, as both the potential and experimental observables can be calculated on-the-fly with trivial parallelisation as and when (and only where) they are required, trajectory-based approaches make it easier to translate the quantum dynamics performed in higher dimensional space into time-resolved spectra. Specifically, in the context of X-ray spectroscopy13,14 – the focus of the present work – on-the-fly trajectory-based approaches have been used to great effect to model the ultrafast photochemical ring-opening reaction of 1,3-cylohexadiene,15 the excited state relaxation of pyrazine,16,17 and the non-radiative relaxation of ethene via a conical intersection (CI).18 However, while trajectory-based approaches can reduce the computational cost compared to traditional grid-based quantum dynamics approaches,19,20 there remains still a significant computational cost associated with carrying out an (X-ray) spectral simulation at each time step; this cost is preclusive for larger systems, especially where high-accuracy quantum-chemical calculations are required to describe satisfactorily complex effects.
We have previously developed21,22 and deployed25–27 a deep neural network (DNN) – XANESNET28 – for predicting the lineshape of X-ray absorption (XAS)21,22,29 and emission (XES)23 spectra. XANESNET predicts spectral lineshapes using only local information about the coordination geometry of the absorbing atom. It can be optimised in as little as a minute and predicts instantaneously, making it a powerful tool for reducing the computational cost associated with simulating time-resolved spectra. Towards developing DNNs for this kind of application, there are two distinct approaches: one can either develop a ‘Type I’ DNN, trained for generality in the sense that it is able to simulate an XAS/XES spectrum for an arbitrary absorbing atom in any coordination environment at a given absorption edge (our previous work has, to date, focused on this approach21,22), or a ‘Type II’ DNN, trained for a specific (time-dependent) problem. A general ‘Type I’ model might be a sub-optimal solution for the prediction of the fine structure that can typically be acquired in modern ultrafast experiments. Consequently, in this Article, we work with a ‘Type II’ DNN and apply our model to investigate theoretically the ultrafast excited-state ring-opening dynamics of 1,2-dithiane (for structure see Fig. 1) as studied with sulphur K edge XAS. Our results show that our ‘Type II’ DNN, trained on-the-fly using first-principles geometric and XAS spectral data obtained from excited-state TSHD, offers an accurate, affordable, and computationally efficient approach for translating nuclear dynamics into time-resolved experimental observables.
![]() | ||
Fig. 1 (a) Electronic state population kinetics obtained from the first 900 fs of TSHD simulations. (b) Average (line) and standard deviation (shaded region) of the S–S internuclear distance obtained from the first 900 fs of TSHD simulations. TSHD simulations were propagated following the transformation of 51 initial conditions sampled from a ground-state (S0) Wigner-distributed ensemble into the first electronically-excited (S1; ![]() |
Fig. 1(a) shows the population kinetics of the S0 (ground state) and S1 and S2 (together, as the excited states) over the first 900 fs post-photoexcitation to the S1 state. The excited state population in Fig. 1(a) couples together both the S1 and S2 states since the latter has only a minor contribution in the early part of the dynamics and the two form a degenerate pair of states at later times, so we do not represent the states separately. In the XAS spectral simulations, the treatment of the S1 and S2 states is discussed in more detail below.
Fig. 1(b) shows the average (with one standard deviation covered by the shaded area) S–S internuclear distance as a function of time obtained from the TSHD simulations. An initial coherent oscillation with a temporal period of ≈350 fs is observed, followed by a significant increase in the standard deviation and a damping of the oscillation which indicates vibrationally-hot incoherent motion occurring across the S0- and S1/S2-state potential energy surfaces at later times. This behaviour, discussed in detail in ref. 30, agrees well with previous experimental observations obtained via time-resolved ion spectroscopy.32
To calculate the sulphur K-edge XAS spectra of the electronically-excited states (S1/S2), the reference wavefunction for the valence-excited states was approximated using the lowest-lying triplet electronically-excited state (T1) of 1,2-dithiane, which has very similar electronic structure to the lowest-lying electronically-excited singlet states (S1/S2) and represents a tractable approximation due to the lack of spin sensitivity in XAS.40 As detailed in ref. 30, the S1 and S2 states correspond to transitions which are close in energy (near degenerate at later times in the TSHD simulations) and, given the similarity of their character, should be expected to exhibit similar sulphur K-edge XAS spectra. The T1 state, which we use as a proxy, also corresponds to a
transition and approximates the character of the S1 and S2 states well – certainly sufficiently well for the scope of the present work which aims primarily to establish the effectiveness of training our ‘Type II’ DNN on the fly.
Gradients of the empirical loss with respect to the internal weights, δJ(W)/δW, were estimated over minibatches of 32 samples and updated iteratively according to the Adaptive Moment Estimation (ADAM)42 algorithm. The learning rate for the ADAM algorithm was set to 1 × 10−4. The internal weights were initially set according to the He43 uniform distribution. Unless explicitly stated in this Article, optimisation was carried over 2000 epochs. Regularization was implemented to minimise the propensity of overfitting; batch standardisation and dropout were applied at each hidden layer. The probability of dropout at each epoch was set to 0.30.
The XANESNET DNN is programmed in Python 3 with the TensorFlow44/Keras45 API and integrated into a Scikit-Learn46 (sklearn) data pre- and post-processing pipeline via the KerasRegressor wrapper for Scikit-Learn. The Atomic Simulation Environment47 (ase) API is used to handle and manipulate molecular structures. The code is publicly available under the GNU Public License (GPLv3) on GitLab.28
In this Article, the objective is to develop a ‘Type II’ DNN to translate the TSHD simulations described above into time-resolved XAS spectroscopic signals. The DNN is trained on the fly from first-principles data (geometries and sulphur K-edge XAS spectra) obtained/calculated at each time step, and is then used to predict all future timesteps. After each iteration of this process, the DNN predictions are assessed for accuracy (Fig. 2). We aim to find the timestep at which the DNN can provide predictions that substitute for the time-consuming quantum-chemical calculations which would otherwise be required to obtain the time-resolved sulphur K-edge XAS spectra at later times. At each timestep, two DNN are trained using data from the preceding timesteps: one for the electronic ground state (S0) and one for the electronically-excited states (S1/S2, using the T1 as a proxy). The two DNNs are then used to predict the sulphur K-edge XAS spectra for all future timesteps, the choice of DNN depending on whether the trajectory is in the S0 or S1/S2 state at that timestep. We assess the point at which the DNN can substitute for the quantum-chemical calculations in two ways. Firstly, we use a cosine similarity metric to quantify the difference between the predictions and quantum-chemical calculations at each time step after the point up to which the DNNs were trained; while this provides an appropriately accurate assessment for this proof-of-concept work, the obvious limitation is that this approach requires the quantum-chemical calculations to already be available for all future timesteps. Secondly, and, principally, to circumvent this requirement, we also assess the cutoff using the model uncertainty evaluated using the ensembling methodology.23 Principally, ensembling exploits the fact that there is no guarantee that there exists a single unique set of internal weights, W*, for a DNN which satisfy . Practically, ensembling involves training multiple (N) instances of each DNN using the same reference dataset but different statistical initialisations for W. The ensemble of N DNNs is then used to produce N independent predictions from which a mean prediction and standard deviation for each input can be derived. The latter is used to quantify the ensemble uncertainty for the predictions. In this present Article, ensembling is performed at each timestep and the point at which the DNN is sufficiently capable of substituting for the quantum-chemical calculations is taken to be the point at which the standard deviation converges to its minimum value.
![]() | ||
Fig. 2 A schematic of the workflow used in this Article. The DNN takes as input geometries obtained from the TSHD simulations of 1,2-dithiane, carried out in ref. 30, and the calculated sulphur K-edge XAS spectra for these geometries up until some time, T, after excitation. These pairs of geometries and sulphur K-edge XAS spectra constitute the reference dataset on which the DNN carries out supervised learning. The optimised DNN is used to predict the sulphur K-edge XAS spectra for all future timesteps up until the final timestep, and the error – or loss – associated with the predictions is quantified by the cosine similarity between the theoretical target and DNN-predicted sulphur K-edge XAS spectra. This process is repeated at sequential timesteps, T + ΔT, until the cosine similarity between the theoretical target and DNN-predicted sulphur K-edge XAS spectra converges. |
In the S0 state (Fig. 3(a)) a shift of the first feature towards lower energy occurs as ring opening proceeds. This reflects the elongation of the S–S bond. In the S1/S2 states (Fig. 3(b)), the low-energy feature at 2401 eV is already present at the equilibrium/Franck–Condon geometry and so little change in this peak with S–S bond length is observed. A similar observation is made with respect to the main band shifting towards lower energy. However, in contrast to what we observe for the S0 state, the shift is weaker. It is noted that both the S0- and S1/S2-state sulphur K-edge XAS spectra are very similar at longer S–S internuclear distances as a consequence of the near-degeneracy of the three states at the ring-opened geometries.
Fig. 4 shows the temporal evolution of the sulphur K-edge XAS spectra from the ensemble average of the 51 TSHD trajectories over 900 fs post-photoexcitation to the S1 state. Fig. S1–S51 – found in the ESI† – show the temporal evolution of the sulphur K-edge XAS spectra for each of the 51 TSHD trajectories individually. Fig. 4 shows two distinct and temporally-evolving bands centred around 2401 and 2405 eV, consistent with the static simulations along the ring-opening coordinate (Fig. 3).
![]() | ||
Fig. 4 Time-resolved sulphur K-edge XAS spectrum associated with the ultrafast excited-state ring-opening dynamics of 1,2-dithiane. Calculated using REW-TDDFT at the DKH-BP86/def2-TZVP level. |
At early times, the band at 2401 eV shows oscillatory behaviour with a period of ≈150 fs. This mirrors the population kinetics (Fig. 3(a)). As shown in Fig. 3, this first transition exhibits quite different behaviour in the ground and excited state and therefore it is unsurprising that the oscillations in this band reflects the population kinetics. In contrast, the peak at 2405 eV shows oscillations with two distinct time periods, one of ∼50 fs and another ∼340 fs. The latter is consistent with changes in the S–S bond length, which as expected from Fig. 1(b) shows one distinct oscillation before being damped from the loss of coherence. The former 50 fs component corresponds to oscillations in C–S bond lengths, while coherence is reduced during the dynamics, these oscillations remain more visible throughout the first 900 fs. This peak does not exhibit any kinetics associated with the population kinetics as the spectral shape is less sensitive to the fine details of the electronic state populated than geometric changes occurring during the dynamics. Both of the bands show little clear variation after ≈400 fs, consistent with the dephasing of the coherent nuclear motion (Fig. 1(b)) after the first ring-opening/ring-closing cycle and the progression of the system towards vibrationally-hot incoherent motion occurring across the S0- and S1/S2-state potential energy surfaces at later times.
![]() | ||
Fig. 5 Evolution of the cosine similarity loss metric as a function of the training set size; the training set comprises all timesteps up to the training time (T; on the x axis). |
This is supported in Fig. 6 which shows a plot of the principal t-distributed stochastic neighbour embedding (t-SNE) components of the wACSF descriptor encoding each local geometry at every timestep (colour bar) of the dynamics. t-SNE is a statistical approach for reducing the dimensionality of datasets.24 In contrast to the more commonly-used linear dimensionality reduction approach of principal component analysis (PCA), t-SNE is a non-linear dimensionality reduction approach which, unlike PCA, seeks to preserve the local structure of data by minimizing the Kullback–Leibler (KL) divergence between distributions with respect to the locations of the points in the map. In contrast to PCA, t-SNE is not a black box, but instead requires user-defined hyperparameters: the perplexity, learning rate, and the number of iterations (which, to produce Fig. 6, were set to 50, 60, and 1000, respectively). The t-SNE visualisation presented in Fig. 6 shows that the wACSF embeddings describing the equilibrium/Franck–Condon structures are centred around 20 (t-SNE1), 0 (t-SNE2). During the first 200 fs post-photoexciation, t-SNE1 gradually transforms to −20, while t-SNE2 initially decreases, suddenly increases at ≈100 fs, and then subsequently decreases again to 0 at ≈200 fs. Post-350 fs t-SNE1 and t-SNE2 disperse over the entire t-SNE component space, reflecting the vibrationally-hot incoherent dynamics that take over the TSHD simulations at later times. This illustrates, consistent with the analysis performed in the previous paragraph, that a significant amount of the t-SNE1 and t-SNE2 space has been covered within the first 100 fs, which explains the convergence observed in Fig. 5.
Fig. 7(a) and (b) show theoretical (target) and DNN-predicted time-resolved sulphur K-edge XAS spectra over the first 900 fs post-photoexcitation, respectively. Percentage errors between the theoretical (target) and DNN-predicted time-resolved sulphur K-edge XAS spectra are tabulated in Table 1. In contrast to Fig. 4, the first 120 fs are left blank in Fig. 7(a) and (b) as this timeframe contains the first-principles data from which the DNN learns and on which no predictions are made (see Fig. 5). Overall, we observe good agreement (particularly at early times) between the theoretical and DNN-predicted sulphur K-edge XAS spectra: the oscillatory behaviour of both bands – associated with the population kinetics and changes in S–S internuclear distance – is successfully reproduced by the DNN. However, any comparison between the theoretical and DNN-predicted sulphur K-edge XAS spectra for the trajectory ensemble at later times (e.g. t > 500 fs) is complicated by the vibrationally-hot nuclear wavepacket and the progression of incoherent dynamics. Consequently, we focus our analysis of the performance of the DNN on individual trajectories from the trajectory ensemble which, we assert, are a more stringent test of the quality of the DNN predictions.
Trajectory | Overall | I | II | III | IV |
---|---|---|---|---|---|
1 | 8.4 | 3.7 | 8.7 | 10.1 | 11.1 |
2 | 8.7 | 5.1 | 7.6 | 11.7 | 10.2 |
3 | 11.5 | 6.0 | 14.1 | 14.8 | 10.9 |
Overall | 10.2 | 5.2 | 11.3 | 11.7 | 12.7 |
Fig. 8 shows a comparison between the theoretical and DNN-predicted time-resolved sulphur K-edge XAS spectra for three individual trajectories. The corresponding plots for the remaining trajectories are shown in Fig. S55–S102 (ESI†). These show much more fine detail in the dynamics than is visible at the trajectory ensemble level and – importantly – in each case show good agreement between the theoretical DNN-predicted sulphur K-edge XAS spectra. Fig. 8(a) and (b) show the theoretical and DNN-prredicted sulphur K-edge XAS spectra for Trajectory 1 and the percentage error between the two simulations is tabulated in Table 1. The oscillations present in the theoretical simulations are also observed in the DNN simulations with an overall percentage error under 10%. The merging of the two peaks observed at ≈300 and ≈700 fs are associated with the reformation of the S–S bond at these timescales. Similar behaviour is observed for Trajectory 2 in Fig. 8(c) and (d); in this case, the S–S bond reforms at ≈300 and ≈600 fs. Fig. 8(e) and (f), showing the time-resolved sulphur K-edge XAS spectra for Trajectory 3, illustrate different behaviour: once the S–S bond reforms after ≈300 fs, non-radiative internal conversion returns the molecule to the S0 state in a vibrationally hot condition and rapid oscillations are observed, corresponding to vibrational activity of the S–S stretching mode.
![]() | ||
Fig. 8 Time-resolved sulphur K-edge XAS spectra associated with the ultrafast excited-state ring-opening dynamics of 1,2-dithiane for three individual trajectories (1, 2, and 3) taken from the trajectory ensemble average shown in Fig. 7. (a), (c), (e) Calculated using REW-TDDFT at the DKH-BP86/def2-TZVP level. (b), (d), (f) Predicted using the DNN detailed in this Article. The remaining individual trajectories are shown in the ESI.† Plots are shown from 120 fs onwards, as the DNN has been trained on the data from all timesteps up until 120 fs. |
To quantify the error between the calculated and DNN predicted time-resolved spectra, Table 1 shows the overall percentage error and the error in 4 time windows for the overall ensemble and the trajectories shown in Fig. 8. The corresponding table for the remaining trajectories is shown in the Table S1 (ESI†). For the individual trajectories the overall percentage error is ∼10%, with the error being smaller in the first two time-windows and increasing slightly in the final two. The overall percentage error of the ensemble (shown in Table 1) is larger than the error of most of the individual trajectories, indicating (i) the influence of the worst trajectories and (ii) that combining each of the individual trajectories into the ensemble can compound the overall error. This influence of individual trajectories makes it interesting to consider the weighting given to each trajectory in the ensemble. For TSH dynamics used here each trajectory should have equal weights for all time-step, alternatively for Gaussian based methods, the weighting of the trajectory basis functions is dynamic and calculated during the dynamics.16,18,48 An alternative approach adopted to analyse time-resolved scattering experiments used the weighting as a free parameter to fit the experimental spectra.15 This would provide one criteria to eliminate trajectories exhibiting large errors and the weighings determined to achieve a good agreement with experimental data could be used to establish the dominant photochemical pathways from an ensemble of trial trajectories.
One approach could be to execute quantum-chemical calculations to assess the prediction error for a limited time window after the DNN model has been developed up until the prediction error is minimised. However, such an approach would likely fail to account for scenarios where the system explores a region of the potential energy surface which is significantly different to region(s) explored in the early timesteps with which the DNN was trained. Another approach could be to execute a small number of quantum-chemical calculations distributed throughout the future time window for the purposes of assessing the prediction error at specific future times, and to assume that these samples cover sufficiently the space of the potential energy surface that is explored. However, this is certainly not straightforward to assume.
Alternatively, we can implement a broader uncertainty awareness into the DNN itself using an ensembling approach.23,49 Practically, ensembling involves training multiple (N) instances of each DNN using the same reference dataset but different statistical initialisations for W. The ensemble of N DNNs is then used to produce N independent predictions from which a mean prediction and standard deviation for each input can be derived. The latter is used to quantify the ensemble uncertainty for the predictions. The concept behind this interpretation is that different W* obtained from the N models in the ensemble will tend towards similar predictions when the inputs are similar to those found in the reference data. This is because W* for each DNN instance, even if/when different, have been optimised for comparable data. In contrast, if the inputs are dissimilar to the inputs found in the reference dataset, each of the N independent predictions will be more greatly affected by W* and a higher standard deviation will consequently be observed. The deviation of ensembled predictions hence provides, to the end user, a barometer of the ‘dependability’ of the DNN for an application with their own dataset.
Fig. 9(a) shows the median relative standard deviation for the ensembled predictions as a function of the training time. This shows a rapid decrease with convergence observed around 100 fs, in close agreement with that shown in Fig. 5. This indicates that for general application where all of the first principles calculations do not exist, this metric could be used to assess the point at which convergence is reached. For this case study, the convergence around 100 fs is consistent with previous analysis of the convergence this time window sufficiently covers the full range of S–S bond distances covered throughout the rest of the dithiane ring-opening dynamics.
![]() | ||
Fig. 9 (a) Median relative standard deviations of the DNN-predicted time-resolved sulphur K-edge XAS spectra at timesteps greater than the model training time. (b) Median relative standard deviations as a function of time for the spectra trained up to 110 fs, as shown in Fig. 7. (c) Sulphur K-edge XAS spectra at 225 and 800 fs, including 2σ uncertainty (shaded region) calculated via the ensembling approach. |
For more detail, Fig. 9(b) shows the relative standard deviation as a function of time for the spectra trained up to 110 fs. This shows a small increase between 120–400 fs followed by a plateau for the remaining time of the simulations. A comparison for models with shorter training times can be found in Fig. S54 (ESI†), which show a much greater increase in the relative standard deviation. To exemplify the uncertainty obtained from the ensembling models, Fig. 9(c) shows two spectra with ±2σ, calculated from the ensembling technique.
For the present example presented in this article – the ultrafast excited-state ring-opening dynamics of 1,2-dithiane – we find that ≈100 fs of TSHD simulation provides sufficient first-principles data to train a DNN which is then able to predict accurately the sulphur K-edge XAS spectra at future (unseen) timesteps in the TSHD simulation. In this particular case – the periodic ring-opening/ring-closing dynamics of 1,2-dithiane – this training time window can be rationalised as it covers the time required for the S–S bond to break post-photoexcitation to an state and survey subsequently a significant amount of coordinate space. However, for future applications, where convergence may be less straightforward, we have also demonstrated that the ensembling approach can be used to assess the uncertainty of the DNN, providing an indication of whether or not the DNN is sufficiently trained to reproduce reliably and accurately the pertinent signals of the XAS spectra. In the present formulation, the DNN maps (local) structure onto the XAS spectral lineshape and, consequently, any interpretation drawn from using the DNN will be in terms of how the nuclear wavepacket or ensemble dynamics map onto the (experimental) spectroscopic observables. The DNN is therefore unable to provide – directly, at least – information on the electronic wavepacket dynamics and, under the present framework, such information would still require first-principle quantum-chemical calculations; however, these could be carried out at reduced computational cost in a targeted manner, e.g. at critical geometries/times identified from the experimental and/or DNN-predicted dynamics.
Overall, this Article demonstrates a particularly promising new approach for the simulation and interpretation of time-resolved X-ray spectroscopic signals. Extension to systems of higher dimensionality is straightforward under the present framework, and our ‘on-the-fly’ train-test procedure should have broad applicability to trajectory-based approaches for simulating time-resolved X-ray spectroscopic signals for a diverse range of chemical dynamics – particularly dynamics on longer timescales and for larger systems that the computational cost of these simulations presently precludes.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp00510k |
This journal is © the Owner Societies 2023 |