Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

SPAINN: equivariant message passing for excited-state nonadiabatic molecular dynamics

Sascha Mausenberger ab, Carolin Müller *cd, Alexandre Tkatchenko d, Philipp Marquetand a, Leticia González a and Julia Westermayr *ef
aFaculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
bVienna Doctoral School in Chemistry (DosChem), University of Vienna, Währinger Straße 42, 1090 Vienna, Austria
cDepartment Chemistry and Pharmacy, Computer-Chemistry-Center, Friedrich-Alexander-Universität Erlangen-Nürnberg, Nägelsbachstraße 25, 91052 Erlangen, Germany. E-mail: carolin.cpc.mueller@fau.de
dDepartment of Physics and Materials Science, University of Luxembourg, 162 A, Avenue de la Faïencerie, L-1511 Luxembourg, Luxembourg
eWilhelm Ostwald Institute for Physical and Theoretical Chemistry, Leipzig University, Linnéstraße 2, 04103 Leipzig, Germany. E-mail: Julia.westermayr@uni-leipzig.de
fCenter for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI), Dresden/Leipzig, Germany

Received 24th June 2024 , Accepted 1st September 2024

First published on 2nd September 2024


Abstract

Excited-state molecular dynamics simulations are crucial for understanding processes like photosynthesis, vision, and radiation damage. However, the computational complexity of quantum chemical calculations restricts their scope. Machine learning offers a solution by delivering high-accuracy properties at lower computational costs. We present SPAINN, an open-source Python software for ML-driven surface hopping nonadiabatic molecular dynamics simulations. SPAINN combines the invariant and equivariant neural network architectures of SCHNETPACK with SHARC for surface hopping dynamics. Its modular design allows users to implement and adapt modules easily. We compare rotationally-invariant and equivariant representations in fitting potential energy surfaces of multiple electronic states and properties arising from the interaction of two electronic states. Simulations of the methyleneimmonium cation and various alkenes demonstrate the superior performance of equivariant SPAINN models, improving accuracy, generalization, and efficiency in both training and inference.


1 Introduction

Accurate non-adiabatic molecular dynamics (NAMD) simulations play a vital role across various research domains in photophysics and photochemistry. These simulations delve into the exploration of light-induced reactions, considering light as one of the most prevalent energy sources on Earth. They facilitate the exploration of diverse phenomena, including but not limited to,1–3 vision,4,5 or radiation damage.6–8 All these processes involve molecular reactions occurring after excitation to higher electronic states. By studying the subsequent nuclear dynamics, it is possible to uncover the intricate relationships between molecular structure and photochemical properties, paving the way towards the design of new photoactive molecules,9 including photocatalysts,10 photosensitive drugs,11 or photovoltaic materials.12–14 Despite the importance of these NAMD simulations, the prohibitive costs associated with accurate quantum chemistry calculations for photodynamics have limited the scope of excited-state simulations. These restrictions apply specifically to full quantum dynamics or semi-classical dynamics simulations, confining them to short time scales (usually several picoseconds) and small systems (typically a few dozen atoms).9,15 Typical semi-classical photodynamics simulations usually cover only a range of several picoseconds for a few dozen of atoms.9,15

With the advent of machine learning (ML) in theoretical chemistry, analytic representations of potential energy surfaces (PESs) and related properties have been developed.16,17 These approaches accelerate the simulations by decoupling the computational costs of electronic structure calculations from the dynamics simulations.16 While ground-state PES fitting has been pursued for over two decades and has enabled the simulation of molecular dynamics of systems with many thousands to millions of atoms with ab initio accuracy,18–20 the application of ML approaches to excited-state processes has appeared much later, with most developments occurring mainly in this century.9,21–23 As a consequence, there is significant methodological work to be done in accurately and meaningfully fitting the numerous excited-state PESs and associated properties.

Equivariant ML models have recently emerged for ground-state properties and have achieved notable success in fitting thereof. These models have demonstrated improved accuracy and efficiency compared to traditional ML models by successfully capturing the underlying symmetries of molecules and exploiting them to achieve accurate predictions even with limited training data.24–28 However, to the best of our knowledge, only one study has applied equivariant ML to describe excited state properties. Gómez-Bombarelli and co-workers developed a diabatic artificial neural network using the equivariant PAINN29 model, achieving a six-fold speedup in photodynamics simulations for azobenzene derivatives.30 Although their model demonstrated robustness and transferability within the chemical space of azobenzenes, it was not compared to the performance of invariant approaches, such as PyRAI2MD31,32 and SCHNARC.33 The latter is based on SCHNET, the invariant precursor to PAINN used in the aforementioned study.30 This omission makes it unclear whether equivariant models have an advantage over invariant models for excited states.

Nonetheless, It is expected that equivariant models will play a crucial role in enabling robust training and accurate prediction of vectorial properties,29 such as nonadiabatic couplings (NACs) and transition dipoles in NAMD simulations. Unlike PAINN, SCHNET is rotationally invariant and cannot directly predict vectorial properties.29,34 Instead, SCHNET predicts a virtual property, from which the equivariant property is derived by taking its derivative with respect to nuclear coordinates. Although SCHNARC can predict NACs using this approach,33,35PAINN's equivariant design would allow for direct prediction of vectorial properties.

To address this gap, we developed SPAINN, an open-source Python tool for ML-accelerated photodynamics simulations that facilitates the use of both invariant and equivariant architectures, allowing for a direct comparison and evaluation of the benefits of equivariance. SPAINN integrates the NAMD program SHARC 3.0 (Surface Hopping Including ARbitrary Couplings)36,37 with the neural network potentials of SchNetPack 2.0.38 (including SCHNET34 and PAINN modules). This interface allows SPAINN to facilitate ML-based surface hopping NAMD, leveraging SchNetPack 2.0's inherent strengths, such as an improved data pipeline and support for both invariant (SCHNET) and equivariant (PAINN) neural networks.38

We demonstrate the enhanced accuracy of SPAINN models based on the PAINN compared to SCHNET architecture across various aspects, including PESs, which encompass energies, forces and NACs. Additionally, we highlight the improved computational efficiency of SPAINN(PAINN) with respect to SPAINN(SCHNET) models for both training and predictions. We showcase the performance of SPAINN for a selection of alkenes and the methyleneimmonium cation by accurately predicting energies, forces, and NACs. These molecules are known to undergo rotation around the double bond upon photoexcitation. Furthermore, we illustrate the acceleration of NAMD simulations for the methyleneimmonium cation using SPAINN models.

2 SPAINN architecture

The general architecture of SPAINN is illustrated in Fig. 1. SPAINN offers several improvements over its predecessor, SCHNARC,33 most notably by supporting a rotationally equivariant representation of molecules and materials using the PAINN representation in addition to the translationally and rotationally invariant SCHNET representation. Furthermore, the code is entirely modular, similar to SchNetPack2.0,38 allowing for straightforward addition and modification of modules for predicting excited-state properties. A detailed comparison between SPAINN and SCHNARC is presented in Table S1.
image file: d4sc04164j-f1.tif
Fig. 1 Overview of the various modules implemented in SPAINN: the asetools module creates and manages datasets of Cartesian coordinates (R) and atomic charges (Z) as well as energies (Ej), and forces (Fj) across different electronic states, along with their coupling properties (Cji), by parsing SHARC output files (GenerateDB) or converting existing ASE databases (ConvertDB) into the required format (cf. Table S2). Data handling is performed by an adapted AtomsDataModule class from SchNetPack 2.0 (ref. 38) (SPAINN), which facilitates data splitting, scaling, and transformation for multiple electronic states. The neural network potentials for training electronic properties of various electronic states utilize standard SchNetPack 2.0 modules (e.g., representation and neural network potential modules), with SPAINN providing specialized prediction modules (model) for atomistic invariant properties (Atomwise), forces as energy derivatives (Forces), non-adiabatic couplings (Nacs), and dipole moments (Dipoles). SPAINN also includes loss functions, output modules (loss), for phase-free training of properties arising from interactions between electronic states (Cji), using phase vectors (PhysPhaseLoss, PhysPhaseLossAtomistic) or ±1 multiplication (PhaseLoss, PhaseLossAtomistic) for on-the-fly phase correction during training. Additionally, SPAINN offers a calculator module (calculator.SPaiNNulator) for interfacing neural network predictions (via SchNetPack) with NAMD simulations using SHARC.

SPAINN integrates with SHARC36 for ML-driven NAMD simulations. As an interface to SchNetPack 2.0, SPAINN provides modules for:

• Conversion and generation (from SHARC output files) of databases useable with SPAINN (cf. left column in Fig. 1 and ESI section S1.1 and S1.2),

• Data pre-processing (e.g., splitting, scaling and transformation; cf. ESI section S1.3),

• Prediction modules for electronic properties in multiple electronic states (cf. middle column in Fig. 1), and

• Output modules enabling phase-free training of properties arising from the coupling of two electronic states (cf. right column in Fig. 1 and ESI section S1.4).

Additionally, SPAINN facilitates communication between SchNetPack 2.0 (ref. 38) and SHARC 3.0 (ref. 37) through the spainn.calculator.SPaiNNulator class. This functionality allows trained ML models to furnish quantum chemical properties (calculate) or SHARC predictions (get_qm) for molecular structures. A comprehensive description of the data pipeline can be found in the ESI (section S1) and a code documentation is available as “Read the Docs” page.39 In the following, we focus on certain properties that are essential for ML but for a comprehensive overview of the other parts of NAMD simulations with SHARC see ref. 37 and 40.

To account for nonadiabatic transitions, SHARC uses the Tully's fewest switches surface hopping method,41,42 which is a mixed quantum-classical method that propagates nuclei classically according to Newton's equation of motion on different excited-state potentials, which are treated by means of electronic structure theory. Nonadiabatic transitions between different PESs are accounted for by so-called hops across different PESs. These are determined stochastically based on the magnitude of the NACs, which are referred to as Cij in the following and are defined as

 
image file: d4sc04164j-t1.tif(1)
with Ψi and Ψj representing the electronic wave functions of states i and j and their respective potential energies Ei and Ej, whereas Ej > Ei. Fitting these couplings presents significant challenges arising from the intrinsic properties of NACs, characterized by their lack of smoothness and arbitrary sign variations.43–45 Specifically, complications arise at conical intersections, critical points in NAMD simulations where the energy levels of electronic states i and j coincide.36,46 However, as electronic states become degenerate, NACs tend toward infinity (cf.eqn (1)), giving rise to singularities in their values. To address this issue, SPAINN incorporates a smoothing process into its pipeline.45 This smoothing procedure involves scaling the original NACs by the energy difference between the two states involved in the coupling, denoted as EjEi (j > i), expressed as
 
[C with combining tilde]ij = Cij·(EjEi).(2)

These modified NACs, referred to as smoothed NACs ([C with combining tilde]ij, ‘smooth_nacs’) are pre-computed and stored in the database. This can be achieved by enabling the smooth_nacs=True option when using the GenerateDB or ConvertDB classes of spainn.asetools to create a SPAINN database.

Moreover, training of NACs and other properties arising from the interaction of two electronic states is further complicated by the arbitrary phase inherent in the wavefunction, resulting in these properties bearing an arbitrary sign. Numerous strategies have been proposed to confront this challenge, including phase correction of the data prior to fitting,47,48 and a phase-free training algorithm enabling the training of raw data.33 While the former approach is often applicable only within limited conformational regions of a molecule, the latter is constrained by the escalating training complexity as the number of electronically excited states increases. SPAINN incorporates two classes of phase-free loss functions, namely PhaseLossAtomistic and PhysPhaseLossAtomistic, enhancing computational efficiency and simplicity compared to the original approach of SCHNARC. The different loss functions implemented are explained in detail in the ESI in section S1.4.

Besides NACs, energies, forces, and permanent and transition dipole moment vectors can be learned. Forces are treated as derivatives of the potential energy surface with respect to atomic positions and dipole moment vectors are learned using the charge model according to ref. 49 and 50. All properties can be trained simultaneously using a combined loss function:

 
image file: d4sc04164j-t2.tif(3)
where tP is a pre-defined parameter to weigh the different properties during training with P referring to the set of trained properties, such as energies (E), forces (F), dipole moments (μ), or NACs (C). image file: d4sc04164j-t3.tif is the individual loss of each property.

Ultimately, SPAINN models can be utilized for predicting quantum chemical properties, either independently or within a NAMD simulation using SHARC (spainn.calculator.SPaiNNulator). It is important to note that while our models are trained with smoothed NACs ([C with combining tilde]ij), these are converted to classical NACs (Cij) for predictions using the NacCalculator class from spainn.interface.aseinterface. Specifically, predictions of smoothed NACs and energies for a given geometry are obtained, from which the classical NACs are derived according to eqn (2). Since the NacCalculator is employed in our ML-driven surface hopping simulations (within the spainn.calculator.SPaiNNulator), classical NACs are used during these dynamic runs. Consequently, the prediction of photochemical properties, such as rates and quantum yields, is expected to be as reliable as surface hopping generally is for predicting these properties.37,46,51

3 Datasets

Herein, we assess the quality of SPAINN predictions using two classes of model systems: Firstly, we delve into the light-induced rotation around a double bond across a series of alkenes: ethene (C2H4), propene (C3H6), and butene (C4H8). To achieve this, we created the XALKENEDB52 database at the complete active space self-consistent field (CASSCF) level of theory,53,54 specifically averaging the three lowest singlet states targeting two active electrons distributed over two active orbitals, denoted as SA(3)-CASSCF(2,2). A total number of circa 6 k (for C2H4 and C3H6 each) and 13 k (C4H8) reference calculations were performed for the respective alkenes, from which the following five quantum chemical features are stored in the XALKENEDB database: energies, forces, NACs, and dipole moment vectors (comprising both permanent and transition dipole moment vectors). Detailed information regarding the construction of the XALKENEDB52 database is provided in the ESI (cf. section S2).

As second showcase example, we studied the methyleneimmonium cation (CH2NH2+), which is the smallest member of the protonated Schiff bases that has been used as model systems to study the fundamental processes involved in vision.55,56 To this end, we relied on the literature-known database of CH2NH2+ comprising 4 k data points at the multi-reference configuration interaction singles and doubles (MR-CISD) level of theory with an active space of four active electrons in three active orbitals, namely MR-CISD(4,3).47 This data has served as a good test bed for several developments33,45,47,57 as it is characterized by ultrafast internal conversion after excitation to the second excited singlet state.

4 Results and discussion

To comprehensively assess the performance of SPAINN, we investigated two distinct categories of organic model compounds. These molecules share a common characteristic: They show a torsional movement of groups around a C–C bond, specifically a double bond in their ground state, upon absorption of light. This phenomenon arises from the ππ* nature of the reached excited state: When a ππ* state is populated by light, the bond order of the molecules changes to one, converting the double bond to a single σ-bond. Consequently, atoms or groups of atoms can freely rotate around this bond in the electronic excited state, a freedom not available in the ground state. The quintessential example of such a photochemical reaction is the cis/trans-isomerization.

Understanding these photochemical reactions computationally requires the accurate description of internal conversion between electronic states, particularly accessing NACs at regions of the conical intersections. These intersections are characterized by energetic degeneracy between electronic states, enabling ultrafast, non-radiative relaxation pathways. Consequently, conical intersections serve as critical junctions for light-induced reactions and processes that decide which state is populated. Herein, we evaluate the predictive capability of SPAINN in determining equivariant, vectorial properties, namely dipole moments (μij) and NACs (Cij). The discussion follows a conventional approach in understanding photophysical processes and reactions through computational chemistry. Initially, we focus on static considerations, including the prediction of absorption spectra (Franck–Condon region) and properties of crossing geometries (region of conical intersections). Subsequently, we showcase the application of SPAINN predictions in conducting photodynamics simulations.

4.1 Training and computational efficiency of SPAINN

To demonstrate the effectiveness of SPAINN relative to SCHNARC in accurately fitting excited-state potentials and properties, we trained SPAINN models utilizing both SCHNET and PAINN representations. Notably, while SPAINN models utilizing SCHNET representation demonstrate comparable performance to SCHNARC models, it is important to highlight that the re-implementation of the original SCHNARC code within SPAINN confers increased versatility and computational efficiency.

SPAINN models are obtained for three alkene molecules, namely ethene (C2H4), propene (C3H6), and butene (C4H8) as compiled in the XALKENEDB52 and methyleneimmonium cation (CH2NH2+) as reported earlier by some of us.47 For the three lowest singlet states of C2H4 and C3H6, we trained models on energies (Ej) and forces (Fj, j = {0, 1, 2}) as well as the NACs (C01, C02 and C12). Furthermore, we trained models on these nine features and additional dipole moment vectors, namely transition dipoles (μ01, μ02, μ12) and permanent dipoles (μj, j = {0, 1, 2}) for C4H8 and CH2NH2+. Details on hyperparameters of the models are compiled in the ESI (cf. section S3.1).

Table 1 provides an overview of the mean average errors (MAEs) and root mean squared errors (RMSEs) of SPAINN models utilizing SCHNET (white cells) and PAINN (grey cells) representations for fitting these properties for the corresponding four molecules. The errors indicate that the equivariant PAINN representation leads to notable improvements, particularly in predicting vectorial properties compared to SCHNET models. Specifically, while the MAEs of energies and forces show only a modest average improvement of about 15% when comparing SCHNET and PAINN models, the error in PAINN predicted NACs is on average 1.5 times smaller than that of SCHNET models for the four model molecules. This observation is reasonable considering that forces are not learned as vectorial properties, but rather as the first derivative of the energy.38 Furthermore, the fact that the equivariant model SPAINN outperforms SCHNARC for all properties is reflected in the MAE error distributions as compiled in Fig. S4, S5, and S10 for each model and property, respectively.

Table 1 Mean absolute errors (MAEs) and root mean squared errors (RMSEs) of SPAINN models for the alkenes ethene, propene, and butene, as well as the methyleneimmonium cation, employing either PAINN (highlighted in gray) or SCHNET representations, respectively. The models were trained on energies (Ej in eV), forces (Fj in eV Å−1), and non-adiabatic coupling vectors (Cij) involving the three lowest singlet states (S0, S1, and S2). A weight factor of 1.0 was used for energies, forces, and non-adiabatic couplings to ensure a balanced consideration of the target attributes and their trade-offs
MAE (RMSE)
C2H4 C3H6 C4H8 CH2NH2+
E j 0.004 (0.01) 0.054 (0.08) 0.002 (0.01) 0.055 (0.11)
0.003 (0.01) 0.065 (0.11) 0.011 (0.02) 0.091 (0.16)
F j 0.020 (0.03) 0.291 (0.96) 0.014 (0.02) 0.280 (0.57)
0.022 (0.05) 0.409 (0.79) 0.062 (0.09) 0.407 (0.74)
C ij 0.024 (0.08) 0.021 (0.04) 0.004 (0.01) 0.179 (3.10)
0.034 (0.07) 0.031 (0.05) 0.024 (0.06) 0.233 (3.15)
μ ij 0.005 (0.02) 0.087 (0.19)
0.023 (0.07) 0.125 (0.28)


The computational efficiency of training SPAINN models and predicting properties using these models is finally evaluated and compared to SCHNARC models. Table 2 provides a comparison of training times between SPAINN and SCHNARC models of comparable size (cf. hyperparameters in section S3.1), as well as prediction times for both models on a central processing unit (CPU).

Table 2 Training and prediction times for SPAINN models employing SCHNET or PAINN representation. The table reports the times for the whole training and per epoch indicated by the number of epochs in parenthesis as well as the prediction times obtained by averaging the prediction of 1000 structures 100 times. In none of the training procedures the number of maximum epochs (100 k) was reached and the training was stopped earlier when the validation loss was not changing over 300 epochs. Training was performed on a Tesla-P100 GPU (16 GB RAM) using an Intel Xeon Gold 6134 CPU @ 3.20 GHz. Predictions were performed on an Intel(R) Core(TM) i5-12500
Molecules Training times/s (# epochs) Prediction times/μs
SCHNET PAINN SCHNET PAINN
C2H4 53[thin space (1/6-em)]529 (19[thin space (1/6-em)]881) 2.7 (1) 14[thin space (1/6-em)]341 (6791) 2.1 (1) 84.8 ± 1.9 84.2 ± 2.2
C3H6 36[thin space (1/6-em)]277 (13[thin space (1/6-em)]821) 2.6 (1) 19[thin space (1/6-em)]121 (9061) 2.1 (1) 85.1 ± 1.8 85.6 ± 2.1
C4H8 32[thin space (1/6-em)]755 (7858) 4.2 (1) 30[thin space (1/6-em)]709 (7039) 4.4 (1) 84.7 ± 2.0 83.6 ± 2.2
CH2NH2+ 6574 (2398) 2.7 (1) 5930 (2770) 2.1 (1) 85.8 ± 7.6 85.5 ± 4.5


The training times displayed in Table 2 demonstrate that PAINN-based SPAINN models exhibit slightly greater efficiency during training compared to SPAINN models employing SCHNET representation (SCHNARC models). This is reflected by training times of 2.1 s per epoch for C2H4, C3H6, and CH2NH2+ with PAINN representation, compared to an average of 2.7 s when training SCHNET-based SPAINN models. Conversely, for C4H8, SCHNET-based model training is 0.2 s per epoch faster than PAINN-based model training. However, despite this, PAINN models converge in fewer steps, resulting in overall greater efficiency (8.5 h) compared to training the corresponding SCHNET-based model (9.1 h). This efficiency trend could potentially be attributed to the increased data complexity in C4H8 relative to the other three molecules, given that the database comprises 13 k molecules, while the other databases contain half or less than half of this size (4–6 k data points).

As shown in Table 2, despite the more complex representation employed by SPAINN, prediction times are nearly comparable when predicting 100 structures. Timings are computed by averaging predictions for 1000 structures over 100 iterations.

4.2 Franck–Condon region – prediction of electronic absorption spectra

In computational investigations of photoinduced reactions and processes, similar to experimental approaches, initial steps often involve characterizing the Franck–Condon region. This entails understanding the energetic and electronic nature of excited states within the framework of the Born–Oppenheimer approximation. Typically, this is achieved by computing vertical transition energies (Eabs) and oscillator strengths (f). These parameters quantify the electronic absorption characteristics, whereas the latter is obtained by calculating the energy difference between two electronic states (i and j) and the 2-norm of the corresponding transition dipole moments (μij), as given in eqn (4).
 
image file: d4sc04164j-t4.tif(4)
Herein, we selected butene (C4H8) in both cis- and trans-configurations, along with the methyleneimmonium cation (CH2NH2+) as our model systems. For predicting absorption spectra we trained a SPAINN model on energies (Ej), forces (Fj), NACs (Cij), and dipole moment vectors (μij), utilizing both SCHNET and PAINN representations. The absorption spectra obtained from quantum chemical reference calculations49,52 and SPAINN predictions are displayed in Fig. 2, S9, and S12. Notably, including Fj and Cij in the training is crucial for subsequent simulations of geometries at conical intersections and photodynamics (see section 4.3 and 4.4).

image file: d4sc04164j-f2.tif
Fig. 2 Simulated absorption spectra depicting S0 to S1 excitation in both cis- (blue, a) and trans-butene (purple, a), alongside S0 to S1 (light green, b) or S2 excitation (dark green, b) of CH2NH2+. Reference spectra, acquired for 451 geometries of cis- and trans-butene at SA(3)-CASSCF(2,2)/cc-pVDZ52 (a) and for 20 k geometries of CH2NH2+ at MR-CISD(6,4)/aug-cc-pVDZ49 (b) level of theory, are represented as solid, unfilled lines. The filled curves display the absorption spectra derived from SPAINN models (PAINN representation) for the respective structures. The residuals between the SPAINN- and reference calculated absorption spectra is shown in the bottom panels. Simulated/predicted vertical excitation energies and corresponding oscillator strengths were spectrally broadened using Gaussian functions with a full width at half maximum of 0.1 eV.

For butene, we conducted reference calculations using SA(3)-CASSCF(2,2)/cc-pVDZ for 451 cis- and trans-configurations.52 These configurations were generated by sampling geometries with varied HC[double bond, length as m-dash]CH dihedral angles and C[double bond, length as m-dash]C bond lengths. Specifically, for the cis-configuration, we scanned 11 equidistant dihedral angles between 0 to 20° and 41 equidistant alkene bond lengths between 2.0 and 3.0 Bohr. Similarly, for the trans-configuration, the same C[double bond, length as m-dash]C alkene bond lengths were scanned with dihedrals ranging from 160 to 180°. The corresponding absorption spectra are depicted in Fig. 2a as solid, unfilled curves. Additionally, we utilized a SPAINN model trained on 13 k Wigner sampled structures of butene (cf. ESI section S2.1) for predicting Ej and μij of these 902 geometries. Fig. 2a illustrates that the ML predicted spectra closely resemble the reference calculations (see filled vs. unfilled curves), capturing key spectral features and trends between the cis- and trans-configurations of butene. Specifically, the maxima of the cis- and trans-absorption bands align, while the absorption band of the cis-configurations appears spectrally broader (by 0.14 eV full-width-half-height) compared to the trans-configurations (ref. 0.08 eV difference in full-width-half-height).

Fig. 2b depicts the SPAINN predicted absorption bands (filled curves) of CH2NH2+ for nπ* (S0-to-S1, dark green) and ππ* photoexcitation (S0-to-S2, bright green). The corresponding absorption spectrum obtained for the same 20[thin space (1/6-em)]000 structures via MR-CISD(6,4)/aug-cc-pVDZ is presented as a solid green line in Fig. 2b, as previously reported by some of us.49 It is evident that the SPAINN model of CH2NH2+, accurately predicts absorption bands that closely resemble the reference absorption spectrum.

4.3 Static approach – predicting properties across the configurational space

Understanding and predicting excited-state chemistry hinges on thorough knowledge of PESs. However, in polyatomic molecules, PESs are complex high-dimensional functions, posing challenges for characterization. One computational approach – the static ansatz – involves identifying key structures on PESs and establishing reaction paths between them. Typically, this involves sampling various properties across a configurational space to pinpoint important geometries, such as minima and transition states based on potential energies and frequencies, or crossing geometries based on NACs.

This process is computationally intensive, even for molecular systems with few degrees of freedom and electronic states. To streamline this search, ML predictions of properties at ab initio quality can be utilized. ML models enable sampling of diverse structures and properties within the boundaries of the configurational space spanned by the training data. In our study, we assess the performance of SPAINN in predicting properties for the homologous series of alkenes in XALKENEDB,52 namely ethene (C2H4), propene (C3H6), and butene (C4H8), as well as CH2NH2+ with a particular focus on NACs. This emphasis is due to the scarcity of electronic structure methods providing couplings or Hessians and the inherent challenges in fitting NACs in ML approaches due to the phase problem and singularities at conical intersections (cf. section 2).

Given the challenge of fitting NACs, we further evaluate the predictive capability of SPAINN models trained on XALKENEDB and reference data for CH2NH2+ (ref. 47) by conducting inference on unseen data covering two degrees of freedom. In the first dimension, we varied dihedral angles around the sp2 double bond, specifically δ(HC[double bond, length as m-dash]CH) for C2H4, C3H6, and C4H8, or δ(HC[double bond, length as m-dash]NH) for CH2NH2+, within the range of 0 to 180°. In the second dimension, we scanned the length of the respective sp2 double bond, ranging between 2.0 and 3.0 Bohr for the alkenes, and between 2.4 and 4.4 Bohr for CH2NH2+.

To assess the accuracy of the NAC predictions across the geometries on the two-dimensional grids, we compare the prediction errors obtained for SPAINN models employing SCHNET or PAINN representations. First, we compute the logarithmic, absolute error between each of the models and the quantum chemical reference (ΔCij, cf.Fig. 3a–f and 4a–d). Subsequently, we subtract the SPAINN(SCHNET) from the SPAINN(PAINN) prediction errors, yielding ΔΔCij = ΔCij (PAINN) − ΔCij (SCHNET) values for every coupling (between states i and j) at every point on the 2D grid. The respective distribution of these error differences on the 2D grid is shown for ΔΔC01 of C2H4, C3H6, and C4H8 in Fig. 3g–i, and for ΔΔC12 and ΔΔC01 of CH2NH2+ in Fig. 4a and b. The respective error-difference plots for all NACs, potential energies, forces, or dipole moment vectors on the same grid are compiled in Fig. S6–S8 and S11.


image file: d4sc04164j-f3.tif
Fig. 3 Prediction errors ΔC01 (a–f): logarithm of the absolute difference between PAINN (a–c) or SCHNET-based SPAINN model predictions and the quantum chemical reference values for nonadiabatic coupling vectors between S0 and S1 (C01) for ethylene (a), propylene (b) and butene (c) on a 2D grid along the C[double bond, length as m-dash]C bond length and (HC[double bond, length as m-dash]CH) dihedral angles. The prediction errors represent the average values obtained from two PAINN and two SCHNET models, which were trained on two different data splits. Colored regions (a–c: green, d–f: pink) indicate prediction errors lower than or equal to 1 atomic unit. Gray regions indicate prediction errors above 1 atomic unit. Divergences in prediction errors ΔΔC01 (g–i): difference between the PAINN- and SCHNET-based prediction errors, ΔΔC01 = ΔC01 (PAINN) − ΔC01 (SCHNET). Green regions indicate better performance of the PAINN models, while pink regions indicate better performance of the SCHNET models. Unfilled symbols (a–i) highlight crossing geometries from SHARC simulations (500 trajectories), where the average CH distance aligns with the 2D grid average CH distance with a certainty of 5%. Selected geometries are shown in the bottom panel, indicated by black filled symbols in the contour plots.

In the difference plots (Fig. 3a–f) the colored areas indicate prediction errors ranging from 1 (dark color) to 0.01 atomic units (bright color) – an error magnitude acceptable for NAMD simulations. Vice versa, gray color indicates prediction errors above 1 atomic unit reaching up to 100 atomic units (black). In the corresponding error difference plots, green highlights regions where PAINN-based SPAINN models yield predictions closer to the reference values compared to those based on SCHNET, while pink indicates the opposite. We discuss the respective results for the selected NACs of the alkenes and CH2NH2+ individually in the subsequent paragraphs.

4.3.1 Alkenes. The sampled 2D-grid for the molecules in XALKENEDB covers the configurational space of interest, i.e., the region around the S1/S0-conical intersection. Typically, conical intersection-geometries with δ(HC[double bond, length as m-dash]CH) between 80 and 100° and r(C[double bond, length as m-dash]C) ranging from 2.5 to 2.8 Bohr are observed for alkenes. This is supported by the hopping geometries found from SHARC trajectories (#500) for the three alkenes. These are indicated by black symbols in Fig. 3 and refer to geometries, where the average C–H distance aligns with the one of the 2D grid (≈2.05 Bohr) with a certainty of 5%.

The error-difference plots for the alkenes show that PAINN-based SPAINN models outperform their SCHNET-based counterparts in the relevant regions of chemical configuration space for all three alkenes (cf. green vs. pink color in Fig. 3g–i). While the training data encompasses δ(HC[double bond, length as m-dash]CH) values ranging from 0 to 180°, mirroring the 2D-grid reference points, the C[double bond, length as m-dash]C double bonds cover distances approximately from 2.25 to 2.80 Bohr in the training set. Consequently, the models interpolate with respect to angles but extrapolate along the distance axis. Nevertheless, it is apparent that SPAINN models utilizing PAINN representation exhibit superior generalization in regions typically associated with S1/S0 conical intersections. These intersections, marked by black symbols in Fig. 3, are obtained from SHARC simulations (note: while geometries may not align precisely on the grid, corresponding C[double bond, length as m-dash]C bond distances and dihedrals are utilized for plotting).

Furthermore, the accuracy of PAINN predictions improves with the increasing size of the alkenes (see white regions in Fig. 3a–c). In contrast, SCHNET predictions show a growing number of points with mean errors between 1 and 10 units as the molecule size increases from C2H4 to C4H8 (see gray regions in Fig. 3d–f). This trend can be explained by the increasing degrees of freedom in larger molecules, which are more effectively captured by the equivariant models.

In the error difference plots (cf.Fig. 3g–i) are certain regions, where the SPAINN(SCHNET) models show a superior performance compared to SPAINN(PAINN) models. For instance, in case of C2H4, the SCHNET-based predictions of the nonadiabatic couplings are more accurate with respect to the PAINN-based predictions in a region between 2.2 and 2.7 Bohr at dihedral angles between 80 and 100°. Nevertheless, both SPAINN(SCHNET) and SPAINN(PAINN) exhibit in the regions, which are pink-colored in Fig. 3g–i, mean average errors in the order of 0.01 to 0.1 units, which indicates that both models describe the couplings in this region sufficiently accurately in terms of NAMD simulations.

4.3.2 Methyleneimmonium cation. For CH2NH2+, we consider the NACs of S2 and S1 (C12) as well as S1 and S0 (C01), since these couplings drive the ultrafast dynamics of the molecule upon photoexcitation into S2. To better comprehend the accuracy of the NACs in various regions of the PESs pertinent to NAMD, we plot the differences in the errors of these NACs individually on the 2D-grid along the length of C[double bond, length as m-dash]N bond and rotation of the H-atom groups around it (cf.Fig. 4). These coordinates were found to be pivotal for the conical intersections between the different states (cf. structures of optimized S2/S1 and S1/S0 conical intersections in Fig. 4a and b).47
image file: d4sc04164j-f4.tif
Fig. 4 Prediction errors ΔC12 (a, c) and ΔC01 (b, d): logarithm of the absolute difference between PAINN (a, b) or SCHNET-based SPAINN model predictions (c, d) and the quantum chemical reference values for nonadiabatic coupling vectors between S2 and S1 (C01) and S1 and S0 (C01) for the methyleneimmonium cation (CH2NH2+) on a 2D grid along the C[double bond, length as m-dash]N bond length and (HC[double bond, length as m-dash]NH) dihedral angles. The prediction errors represent the average values obtained from two PAINN and two SCHNET models, which were trained on two different data splits. Colored regions (a, b: green, c, d: pink) indicate prediction errors lower than or equal to 1 atomic unit. Gray regions indicate prediction errors above 1 atomic unit. Divergences in prediction errors ΔΔC12 (e) and ΔΔC01: difference between the PAINN- and SCHNET-based prediction errors, e.g., ΔΔC12 = ΔC12 (PAINN) − ΔC12 (SCHNET). Green regions indicate better performance of the PAINN models, while pink regions indicate better performance of the SCHNET models. Unfilled symbols (a–f) highlight crossing geometries from SHARC simulations, where the average CH distance aligns with the 2D grid average CH distance with a certainty of 5%. The white cross-symbols, show the geometry features of the optimized conical intersection geometry in S2 (a, c, e) and S1 (c, d, f). The respective geometries are shown on top of subsets (e) and (f).

The first conical intersection between the S2 and S1 state is characterized by a pyramidalization and a bond distances between 2.65 and 3.40 Bohr mainly, with most hopping geometries located around 2.83 Bohr (see geometries and unfilled symbols in Fig. 4a). Hardly any rotation around the C[double bond, length as m-dash]N bond is visible in the conical intersection of states S2 and S1. SPAINN outperforms SCHNARC in NAC accuracy in this region, which is relevant for the S2/S1 conical intersection, which can be seen by the region of green color Fig. 4a (bottom, left). For completeness, we note that SPAINN is less accurate at about 90° dihedral angle and large interatomic distances, a region hardly relevant for the dynamics under investigation.47

The second conical intersection between the S1 and S0 state is characterized by a rotation of the C[double bond, length as m-dash]N bond and a dihedral angle of up to 90°, see also Fig. S4 for a scan along the C–N rotation and corresponding NAC values. Bond distances are mainly at about 2.65 Bohr.47 As can be seen in Fig. 4b, the accuracy between SPAINN and SCHNARC models for C01, i.e., the NAC between the S0 and S1 states, seem to be balanced. However, the region of interest with small bond distances is mainly characterized by larger errors of SCHNET-based SPAINN models. As the interatomic distance increases, SCHNET-based SPAINN models exhibit greater accuracy compared to PAINN-based counterparts. However, the relevance of NACs diminishes for the investigated dynamics. For completion, the error difference in NACs between the S0 and S2 state as well as the energies and dipoles is plotted in Fig. S11. As evident from this figure, the majority of regions are shaded green, indicating that SPAINN models utilizing PAINN representation consistently outperform their counterparts with SCHNET representation across all examined NACs, energies, and dipoles and nearly all areas of the investigated 2D PES.

4.4 Dynamic approach – predicting ultrafast internal conversion

Given the limitations of static photochemistry approaches in potentially overlooking crucial geometries and pathways as well as relaxation mechanisms, time scales and quantum yields, we integrated SPAINN with SHARC 3.0.37 The latter forms the methodological foundation for dynamic approaches employing surface hopping NAMD. To showcase the performance of SPAINN-assisted SHARC simulations, we employed the SPAINN models of C4H8 and CH2NH2+ to conduct photodynamics simulations of the respective molecules. For the latter molecule, this has been used to assess the accuracy of NAMD methods earlier as the accurate description of its photodynamics that are governed by ultrafast internal conversions between different electronically excited states require accurate NACs.45,47,57

To execute ML-photodynamics we employed the SCHNET- and PAINN-based SPAINN models of C4H8 and CH2NH2+ trained on energies (Ej), forces (Fj), NACs (Cij), and dipole moment vectors (μij) as discussed in section 4.1 (cf.Table 1). Note that μij were included and trained for completeness, although they remain inactive in the photodynamics simulations under discussion. Nevertheless, their potential application lies in the prediction of electronic absorption spectra,49 thereby facilitating the selection of initial conditions within the conventional framework of NAMD simulations as discussed above (cf. section 4.2).

The results of the PAINN-based SPAINN predictions of the photodynamics of C4H8 and CH2NH2+ are summarized in form of kinetic plots showing the population of the two (C4H8) or three lowest singlet excited states (CH2NH2+) in Fig. 5a and b, respectively. The corresponding results obtained with SCHNET-based SPAINN models are shown in Fig. S14.


image file: d4sc04164j-f5.tif
Fig. 5 Temporal evolution of population in the two lowest singlet excited states upon photoexcitation into S1 of cis-butene (a) and the three lowest singlet excited states of CH2NH2+ upon excitation into S2 (b) as derived from SHARC simulations spanning 100 fs (with a time-step of 0.5 fs). The solid lines represent outcomes from 100 SHARC trajectories on SA(3)-CASSCF(2,2)/cc-pVDZ (a) or MR-CISD/aug-cc-pVDZ level of theory, respectively.47,52 Dashed lines reflect results from 600 (a) or 3846 SHARC trajectories (b), incorporating non-adiabatic couplings, energies, and forces predicted by a SPAINN model leveraging the equivariant PAINN representation. The background color is white in regions where SPAINN models yield reasonable results, as determined by the total energy of the molecules at each time-step, while other regions are shaded gray. Kinetic rate constants for a sequential kinetics scheme were derived using KiMoPack58 and are presented alongside the population plots. Bold numbers denote the excited-state lifetimes as obtained from reference calculations, the ML lifetimes are given in parenthesis.

For both molecules, it is evident from the population curves in Fig. 5 that the SPAINN-driven SHARC simulations are able to reproduce the ultrafast transitions between the different electronic states as obtained from SHARC reference calculations (cf. computational details in section S2.4). This is reflected in the match between the dashed (SPAINN/SHARC simulations) and the solid lines (electronic structure theory/SHARC simulations) in the white shaded regions in Fig. 5.

For CH2NH2+ (Fig. 5b), the transition from the first excited state (S1) to the ground state (S0) is marginally better captured when comparing PAINN- and SCHNET-based SPAINN models evaluated against quantum chemistry reference simulations.47 This is evidenced by the characteristic lifetime of S1 obtained from a sequential kinetic model, which is 30 fs and 50 fs for SCHNET- and PAINN-based SPAINN predictions, respectively, whereas the reference value is 59 fs. Nevertheless, employing both SCHNET and PAINN-based SPAINN models in SHARC simulations yields reasonable photodynamics results, aligning with previous analyses and findings from related studies.45

In contrast, photodynamics simulations for C4H8 employing PAINN-based SPAINN models outperform those using SCHNET-based counterparts. However, both models exhibit reliability limitations, constrained to a simulation time of approximately 60 fs, as indicated by the gray region in Fig. 5 and S14, determined by the total energy of the molecules at each time-step of the dynamics simulation. This is due to the fact that the C4H8 database was not optimized through active or online learning, hence the simulations fail for both SCHNET- and PAINN-based SPAINN models after reaching a certain simulation time, corresponding to long-tail events not well represented in the training data. Only the PAINN-based SPAINN model accurately reproduces the population kinetics of butene with respect to the reference, yielding a lifetime of the initially excited S1 state of 49 fs, consistent with reference calculations (48 fs), while SCHNET-based models yield a lifetime of about 15 fs (Fig. S14a). This underscores the influence of the more accurate excited-state properties of PAINN compared to SCHNET-based SPAINN models.

5 Conclusions

This work introduces SPAINN, an open-source Python software for performing surface hopping molecular dynamics simulations utilizing machine learning predicted energies, forces, nonadiabatic couplings or dipole moment vectors. By the integration of the nonadiabatic molecular dynamics program SHARC 3.0 with the neural network potentials provided by SchNetPack 2.0, SPAINN provides a flexible and modular tool for training and predicting excited state properties involving multiple electronic (singlet) states using rotationally-invariant (SCHNET) or -equivariant (PAINN) representation of molecules. Moreover, its modular architecture allows for straightforward adaption of modules and the development of new ones. The method is accessible on GitHub59 and is accompanied by a comprehensive documentation39 including tutorials, which facilitate its integration into research. SPAINN incorporates three core modules to (i) generate databases or convert existing databases into the SPAINN database format, and to interface (ii) SchNetPack and (iii) SHARC.

The performance and capability of SPAINN is demonstrated using simulations of the methyleneimmonium cation and a series of alkenes. The results show improved accuracy and comparable computational efficiency to previous methods (SCHNARC), particularly in fitting non-adiabatic couplings, energies, and forces of multiple electronic excited singlet states. Overall, we show that the usage of the equivariant representation, improves the data efficiency in training and inference as well as the generalization of the respective models. We believe that SPAINN is promising for the study of various photophysical and photochemical phenomena and the comprehensive documentation and tutorials available with this method facilitate the development of new interatomic potentials for molecules in their excited states.

Data availability

The reference and training data of the alkenes studied in this article, including the energies, forces, and permanent dipole moments of S0, S1, and S2 and the transition dipole moments and non-adiabatic couplings between these states on SA(3)-CASSCF(2,2)/cc-pVDZ level of theory are available at Zenodo at https://doi.org/10.5281/zenodo.10736050. The data of the methylene immonium cation was previously published by some of us and is available on Github at https://github.com/schnarc/SchNarc and RSC at https://doi.org/10.1039/C9SC01742A. SPaiNN, including the source code, documentation and tutorials is available at Github at https://github.com/CompPhotoChem/SpaiNN.

Author contributions

S. M. and C. M. conducted the main programming work, co-wrote the code documentation and co-developed tutorials. C. M. performed the quantum chemical calculations to create the XALKENEDB database. J. W., C. M. and P. M. conceived the study. All authors contributed to the writing of the manuscript, provided feedback, and edited it.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

C. M. gratefully acknowledges support from the Alexander von Humboldt Foundation with a Feodor Lynen Fellowship. S. M. thanks the University of Vienna (DosChem program) for funding. High-performance calculations were performed at the HPC facilities of the University of Luxembourg, the Computer-Chemistry-Center of the Friedrich-Alexander-Universität Erlangen-Nürnberg, and local facilities of the González group at the University of Vienna. The authors further thank Maximilian Tiefenbacher for discussions on model systems.

Notes and references

  1. G. Cerullo, Science, 2002, 298, 2395–2398 CrossRef CAS PubMed .
  2. E. Romero, V. I. Novoderezhkin and R. van Grondelle, Nature, 2017, 543, 355–365 CrossRef CAS PubMed .
  3. Y. Song, A. Schubert, E. Maret, R. C. Burdick, B. D. Dunietz, E. Geva and J. P. Ogilvie, Chem. Sci., 2019, 10, 8143–8153 RSC .
  4. C. Schnedermann, X.-Q. Yang, M. Liebel, K. M. Spillane, J. Lugtenburg, I. J. Fernandez, A. Valentini, I. Schapiro, M. Olivucci, P. Kukura and R. A. Mathies, Nat. Chem., 2018, 10, 449–455 CrossRef CAS PubMed .
  5. B. G. Levine and T. J. Martínez, Annu. Rev. Phys. Chem., 2007, 58, 613–634 CrossRef CAS PubMed .
  6. J. Westermayr, M. Gastegger, D. Vörös, L. Panzenboeck, F. Joerg, L. González and P. Marquetand, Nat. Chem., 2022, 14, 914–919 CrossRef CAS PubMed .
  7. M. N. R. Ashfold, B. Cronin, A. L. Devine, R. A. Dixon and M. G. D. Nix, Science, 2006, 312, 1637–1640 CrossRef CAS PubMed .
  8. C. Rauer, J. J. Nogueira, P. Marquetand and L. González, J. Am. Chem. Soc., 2016, 138, 15911–15916 CrossRef CAS PubMed .
  9. J. Westermayr and P. Marquetand, Chem. Rev., 2021, 121, 9873–9926 CrossRef CAS PubMed .
  10. B. Sanchez-Lengeling and A. Aspuru-Guzik, Science, 2018, 361, 360–365 CrossRef CAS PubMed .
  11. I. Ahmad, S. Ahmed, Z. Anwar, M. A. Sheraz and M. Sikorski, Int. J. Photoenergy, 2016, 2016, 1–19 CrossRef .
  12. S. Mathew, A. Yella, P. Gao, R. Humphry-Baker, B. F. E. Curchod, N. Ashari-Astani, I. Tavernelli, U. Rothlisberger, M. K. Nazeeruddin and M. Grätzel, Nat. Chem., 2014, 6, 242–247 CrossRef CAS PubMed .
  13. A. P. Bartók, S. De, C. Poelking, N. Bernstein, J. R. Kermode, G. Csányi and M. Ceriotti, Sci. Adv., 2017, 3, 12 Search PubMed .
  14. J. Westermayr, J. Gilkes, R. Barrett and R. J. Maurer, Nat. Comput. Sci., 2023, 3, 139–148 CrossRef CAS PubMed .
  15. S. Mai and L. González, Angew. Chem., Int. Ed., 2020, 59, 16832–16846 CrossRef CAS PubMed .
  16. J. Behler, Chem. Rev., 2021, 121, 10037–10072 CrossRef CAS PubMed .
  17. O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko and K.-R. Müller, Chem. Rev., 2021, 121, 10142–10186 CrossRef CAS .
  18. H. E. Sauceda, L. E. Gálvez-González, S. Chmiela, L. O. Paz-Borbón, K.-R. Müller and A. Tkatchenko, Nat. Commun., 2022, 13, 3733 CrossRef CAS PubMed .
  19. V. L. Deringer, N. Bernstein, G. Csányi, C. Ben Mahmoud, M. Ceriotti, M. Wilson, D. A. Drabold and S. R. Elliott, Nature, 2021, 589, 59–64 CrossRef CAS .
  20. R. Schade, T. Kenter, H. Elgabarty, M. Lass, O. Schütt, A. Lazzaro, H. Pabst, S. Mohr, J. Hutter, T. D. Kühne and C. Plessl, Parallel Comput., 2022, 111, 102920 CrossRef .
  21. P. O. Dral, J. Phys. Chem. Lett., 2020, 11, 2336–2347 CrossRef CAS PubMed .
  22. J. Westermayr and P. Marquetand, Mach. Learn.: Sci. Technol., 2020, 1, 043001 Search PubMed .
  23. J. Li and S. A. Lopez, Acc. Chem. Res., 2022, 55, 1972–1984 CrossRef CAS PubMed .
  24. A. Musaelian, S. Batzner, A. Johansson, L. Sun, C. J. Owen, M. Kornbluth and B. Kozinsky, Nat. Commun., 2023, 14, 579 CrossRef CAS PubMed .
  25. O. Unke, M. Bogojeski, M. Gastegger, M. Geiger, T. Smidt and K.-R. Müller, Advances in Neural Information Processing Systems, 2021, pp. 14434–14447 Search PubMed .
  26. S. Batzner, A. Musaelian, L. Sun, M. Geiger, J. P. Mailoa, M. Kornbluth, N. Molinari, T. E. Smidt and B. Kozinsky, Nat. Commun., 2022, 13, 2453 CrossRef CAS PubMed .
  27. I. Batatia, S. Batzner, D. P. Kovács, A. Musaelian, G. N. Simm, R. Drautz, C. Ortner, B. Kozinsky and G. Csányi, arXiv, 2022, preprint, arXiv:2205.06643,  DOI:10.48550/arXiv.2205.06643.
  28. W. G. Stark, J. Westermayr, O. A. Douglas-Gallardo, J. Gardner, S. Habershon and R. J. Maurer, J. Phys. Chem. C, 2023, 127, 24168–24182 CrossRef CAS PubMed .
  29. K. Schütt, O. Unke and M. Gastegger, Equivariant message passing for the prediction of tensorial properties and molecular spectra, 2021, https://proceedings.mlr.press/v139/schutt21a.html Search PubMed.
  30. S. Axelrod, E. Shakhnovich and R. Gómez-Bombarelli, Nat. Commun., 2022, 13, 3440 CrossRef CAS PubMed .
  31. J. Li, P. Reiser, B. R. Boswell, A. Eberhard, N. Z. Burns, P. Friederich and S. A. Lopez, Chem. Sci., 2021, 12, 5302–5314 RSC .
  32. J. Li, R. Stein, D. M. Adrion and S. A. Lopez, J. Am. Chem. Soc., 2021, 143, 20166–20175 CrossRef CAS PubMed .
  33. J. Westermayr, M. Gastegger and P. Marquetand, J. Phys. Chem. Lett., 2020, 11, 3828–3834 CrossRef CAS PubMed .
  34. K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko and K.-R. Müller, J. Chem. Phys., 2018, 148, 241722 CrossRef PubMed .
  35. SchNarc, https://github.com/schnarc/SchNarc.git Search PubMed.
  36. S. Mai, P. Marquetand and L. González, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2018, 8, e1370 Search PubMed .
  37. S. Mai, D. Avagliano, M. Heindl, P. Marquetand, M. F. S. J. Menger, M. Oppel, F. Plasser, S. Polonius, M. Ruckenbauer, Y. Shu, D. G. Truhlar, L. Zhang, P. Zobel and L. González, SHARC3.0: Surface Hopping Including Arbitrary Couplings – Program Package for Non-Adiabatic Dynamics, 2023, https://zenodo.org/records/7828641 Search PubMed.
  38. K. T. Schütt, S. S. P. Hessmann, N. W. A. Gebauer, J. Lederer and M. Gastegger, J. Chem. Phys., 2023, 158, 144801 CrossRef PubMed .
  39. https://spainn.readthedocs.io/ .
  40. S. Mai, P. Marquetand and L. González, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2018, 8, e1370 Search PubMed .
  41. J. C. Tully, Int. J. Quantum Chem., 1991, 40, 299–309 CrossRef .
  42. D. S. Sholl and J. C. Tully, J. Chem. Phys., 1998, 109, 7702–7710 CrossRef CAS .
  43. W. B. How, B. Wang, W. Chu, A. Tkatchenko and O. V. Prezhdo, J. Phys. Chem. Lett., 2021, 12, 12026–12032 CrossRef CAS PubMed .
  44. B. Wang, W. Chu, A. Tkatchenko and O. V. Prezhdo, J. Phys. Chem. Lett., 2021, 12, 6070–6077 CrossRef CAS PubMed .
  45. J. Westermayr, F. A. Faber, A. S. Christensen, O. A. von Lilienfeld and P. Marquetand, Mach. Learn.: Sci. Technol., 2020, 1, 025009 Search PubMed .
  46. S. Mai, F. Plasser, P. Marquetand and L. González, Attosecond Molecular Dynamics, The Royal Society of Chemistry, 2018, pp. 348–385 Search PubMed .
  47. J. Westermayr, M. Gastegger, M. F. S. J. Menger, S. Mai, L. González and P. Marquetand, Chem. Sci., 2019, 10, 8100–8107 RSC .
  48. A. V. Akimov, J. Phys. Chem. Lett., 2018, 9, 6096–6102 CrossRef CAS PubMed .
  49. J. Westermayr and P. Marquetand, J. Chem. Phys., 2020, 153, 154112 CrossRef CAS PubMed .
  50. M. Gastegger, J. Behler and P. Marquetand, Chem. Sci., 2017, 8, 6924–6935 RSC .
  51. S. Mai, P. Marquetand and L. González, J. Phys. Chem. Lett., 2016, 7, 1978–1983 CrossRef CAS PubMed .
  52. C. Müller and A. Tkatchenko, XAlkeneDB: A database illuminating the electronic ground and excited state quantum chemical features of ethene, propene and butene, 2024,  DOI:10.5281/zenodo.10736050 .
  53. B. O. Roos, P. R. Taylor and P. E. Siegbahn, Chem. Phys., 1980, 48, 157–173 CrossRef CAS .
  54. B. O. Roos and P. E. M. Siegbahn, Int. J. Quantum Chem., 1980, 17, 485–500 CrossRef CAS .
  55. M. Barbatti, A. J. A. Aquino and H. Lischka, Mol. Phys., 2006, 104, 1053–1060 CrossRef CAS .
  56. J. Herbst, K. Heyne and R. Diller, Science, 2002, 297, 822–825 CrossRef CAS PubMed .
  57. D. Hollas, L. Şiştík, E. G. Hohenstein, T. J. Martínez and P. Slavíçek, J. Chem. Theory Comput., 2018, 14, 339–350 CrossRef CAS PubMed .
  58. C. Müller, T. Pascher, A. Eriksson, P. Chabera and J. Uhlig, J. Phys. Chem. A, 2022, 126(25), 4087–4099 CrossRef PubMed .
  59. SPaiNN, https://github.com/CompPhotoChem/SPaiNN Search PubMed.

Footnotes

Electronic supplementary information (ESI) available: Architecture and usage of SPAINN, details on databases used and generated in this study, hyperparameters of training, mean absolute error plots of SPAINN models trained of energies, forces, and NACs using SCHNET and PAINN representation. See DOI: https://doi.org/10.1039/d4sc04164j
These authors contributed equally to this work.

This journal is © The Royal Society of Chemistry 2024
Click here to see how this site uses Cookies. View our privacy policy here.