Kemal Atalar*a,
Yannic Rathab,
Rachel Crespo-Oteroc and
George H. Booth*a
aDepartment of Physics and Thomas Young Centre, King’s College London, Strand, London, WC2R 2LS, UK. E-mail: kemal.atalar@kcl.ac.uk; george.booth@kcl.ac.uk
bNational Physical Laboratory, Teddington, TW11 0LW, UK
cDepartment of Chemistry University College London, 2020 Gordon St., London, WC1H 0AJ, UK
First published on 2nd May 2024
We build on the concept of eigenvector continuation to develop an efficient multi-state method for the rigorous and smooth interpolation of a small training set of many-body wavefunctions through chemical space at mean-field cost. The inferred states are represented as variationally optimal linear combinations of the training states transferred between the many-body bases of different nuclear geometries. We show that analytic multi-state forces and nonadiabatic couplings from the model enable application to nonadiabatic molecular dynamics, developing an active learning scheme to ensure a compact and systematically improvable training set. This culminates in application to the nonadiabatic molecular dynamics of a photoexcited 28-atom hydrogen chain, with surprising complexity in the resulting nuclear motion. With just 22 DMRG calculations of training states from the low-energy correlated electronic structure at different geometries, we infer the multi-state energies, forces and nonadiabatic coupling vectors at 12000 geometries with provable convergence to high accuracy along an ensemble of molecular trajectories, which would not be feasible with a brute force approach. This opens up a route to bridge the timescales between accurate single-point correlated electronic structure methods and timescales of relevance for photo-induced molecular dynamics.
While there are a number of alternative formulations for efficient propagation of coupled electron and nuclear motion, for many purposes mixed quantum-classical approaches can be devised whereby the nuclear propagation is still treated classically, relying on information from all relevant electronic states in the dynamics. These electronic states can therefore be obtained under the Born–Oppenheimer (BO) approximation6 at each nuclear geometry without explicit coupling to quantum nuclear degrees of freedom. In this work, we consider the popular ‘fewest-switches surface hopping’ (FSSH) approach,7–9 although many other approaches also exist.1,2 In FSSH, the full electronic state is evolved at each timestep as a superposition over multiple Born–Oppenheimer adiabatic electronic states, with the nuclei propagated on a single adiabatic electronic surface. When the simulations are done using adiabatic states, nonadiabatic stochastic jumps between the electronic states occur with a probability depending on the nonadiabatic couplings, whilst ensuring overall energy conservation. While this approach can be motivated from higher levels of theory,10 it lacks rigorous interference and decoherence effects of quantum nuclei (noting recent work incorporating nuclear decoherence and tunneling processes into the formulation8,9,11–13). Provided ad hoc corrections are incorporated to address some of these issues, FSSH has proven to be very effective in the modelling and prediction of photo-chemical processes over relevant atomic timescales.2
However, the outstanding limitation in the scope of application of NAMD is the electronic structure problem at its heart,4,14 which has been found to be generally more important to the quality of results than the specifics of different NAMD approximations.15 This is a demanding electronic structure challenge, requiring many consecutive calculations as the nuclei are propagated in time, each with a finely balanced description of multiple electronic states. Furthermore, nuclear forces are required for each state to determine the propagation of the nuclei, as well as nonadiabatic couplings (NACs) between the states for the propagation of the electronic part and stochastic hopping. Although it is possible in some cases to approximate these gradients and NACs if they are not available analytically from the electronic structure solver used, this introduces additional approximations, uncertainty and potentially overheads for the calculation. While density functional theory (DFT) is almost ubiquitous for ab initio molecular dynamics on ground states, the requirement of multiple electronic states limits its applicability to NAMD (noting recent approximations for DFT-based NAMD, especially in simulations of transport16,17). Furthermore, single-reference electronic structure theory based on linear response (such as TD-DFT,18 CC219 or EOM-CCSD20) are also generally not suitable where transitions to the ground state are required, due to their explicit formulation as excitations from a single electronic state, which precludes the critical region around conical intersections where this gap vanishes and ground and excited states must be treated on the same footing.14 These descriptions may not give the required balance in the accuracy of the ground and excited states, especially where excited states have significant charge rearrangement compared to the ground state.
This places a significant burden on multireference electronic structure methods for the description of these adiabatic states, in particular state-averaged complete active space self-consistent field (CASSCF) which can be considered the workhorse of NAMD, and which has analytic gradients and NACs21,22 available in many packages. Unfortunately, there are several drawbacks in these traditional multireference approaches, in particular the lack of dynamical correlations, which can unbalance the description of different states and manifest in e.g. dramatic overestimation of the relative energy of ionic states.23,24 Extensions to internally-contracted multireference perturbation theory (MRPT) and configuration interaction (MRCI) methods are more recently available, but they can still underestimate vertical excitations25 and are substantially more expensive with more challenging gradient theory and NACs.26,27 More fundamentally, these approaches depend significantly on the choice of active space28 and require the definition of a small active space to enable them to be tractable, which must remain consistent across the changes in geometry over the trajectory. This can often be hard or impossible, as relevant orbitals at one geometry can adiabatically change into an irrelevant subspace at another geometry, while orbitals crossing and entering or leaving the subspace can result in a discontinuous surface, difficulties with intruder states and issues with energy conservation during the NAMD simulations.29–31
This context of NAMD for photochemistry, catalysis and beyond, is a prime motivation for developments in electronic structure methodology. However, new wavefunction-based approaches emerging in the last couple of decades have not yet impacted upon this field, underlining the challenges faced in this area. Modern accurate and systematically improvable approaches such as DMRG (density matrix renormalization group),32–34 selected configuration interaction (sCI),35,36 and a number of stochastic methods (including full configuration interaction quantum Monte Carlo,37–39 auxillary field QMC40,41 and advances in variational Monte Carlo models42,43) can in principle be used as multiconfigurational solvers within NAMD, at least to extend the size of active spaces and mitigate the difficulty in their appropriate selection. However, despite much progress in accurately describing excited states within these frameworks,33,44–47 as well as their analytic nuclear gradients,30,48–51 the community generally still lack nonadiabatic couplings in these methods, while stochastic noise in the electronic structure can often be challenging for precise molecular dynamics that conserves total energy (noting significant work in ref. 52). More generally, while these methods can be powerful in obtaining a small number of single-point energies (often both for ground and excited states), they are still expensive for the number of calculations required in molecular dynamics. Furthermore, they often lack a robust black-box simulation protocol that hinders the reliable execution of these approaches for several thousand consecutive calculations, each relying on consistent convergence for all prior points for reliable trajectories, without manual checking of convergence and simulation parameters. This puts the timescales necessary to describe realistic photochemical processes out of their reach (a fact often even true with a CASSCF description).
As an alternative paradigm, machine learning (ML) potentials have been tremendously successful in breaking this computational barrier for ground state potential energy surfaces (PES). However, the application of ML methodologies to NAMD has various limitations and challenges (beyond those well-understood in ground-state ML force fields such as the local energy decomposition) despite showing great promise by accessing nanosecond NAMD trajectories for small molecules both with MR-CISD53 and CASSCF54 training data. Although progress has been made in predicting excited state energies and forces in these frameworks,53–59 direct learning of NACs tend to lead to their underestimation,59,60 while approximations to them based purely on energies and gradients can also lead to a substantial misrepresentation of true nonadiabatic processes.58,61,62 Capturing conical intersections (CoIns) is also a challenge, as small and sharp energy gaps near CoIns tend to be overestimated and smoothened since they are necessarily poorly represented in the training set.60,63 Furthermore, all of the same electronic structure challenges still exist in obtaining appropriate training data required to build these models.63
The approach outlined in this work takes a step to address these shortcomings, in a largely method-agnostic framework for the robust and compact interpolation of accurate wave functions through chemical space. Initially described in ref. 64 for the interpolation of ground states, we show how this approach can be extended for the balanced interpolation of both ground and excited states for ab initio systems over changing atomic geometries with mean-field cost. Importantly, we also show how both excited state gradients and nonadiabatic couplings between the states are straightforwardly extracted from the interpolation framework, allowing application to NAMD. This allows for the first (to the best of our knowledge) NAMD simulation using modern DMRG derived electronic states,65,66 obtaining high-accuracy molecular dynamics of excited hydrogen chains – a system for which we believe no other solver would be effective. We discuss both the convergence of the dynamics with respect to increasing numbers of training DMRG calculations supporting the interpolated model, and an approach for efficient selection of these training geometries. Finally, we end with a perspective of the use of this framework more generally for practical NAMD applications.
We assume that we have a ‘training’ set of M non-orthogonal many-body states, as vectors in the electronic Hilbert space, {|Ca〉} = C(a)n, where a, b, … labels these distinct training states and n denote the occupation number vectors of the orthonormal many-electron configurations at the nuclear geometry of interest, R. We project the ab initio Hamiltonian at this nuclear configuration into the space spanned by these states. We can then consider a valid wave function at this geometry as resulting from a variational optimization within the span of this many-body subspace. This can be found in closed form as a generalized eigendecomposition of the M × M Hamiltonian in this basis,
〈Ca|Ĥ(R)|Cb〉x(A)b(R) = EA(R)〈Ca|Cb〉x(A)b(R). | (1) |
The training basis (comprising M elements) is small enough that this generalized eigenvalue problem can be completely solved at all geometries of interest, giving a variational approximation to both the ground state and excitation spectrum at each arbitrary ‘test’ geometry, EA(R). The eigenvectors, x(A)b(R), denote the component of the many-body training vector in each interpolated state (these interpolated states are denoted by upper-case indices A, B, …). Through this scheme, wavefunctions at arbitrary test geometries and their observables can be inferred by sampling few wavefunctions at training geometries. As this training subspace is enlarged, the energies of all states from the subspace must necessarily variationally lower towards their exact eigenvalues, as ensured by the eigenvalue interlacing theorem.75 Due to the linearity of the model, the number of electronic states described remains constant as geometries change, and their energies must vary smoothly with changes in the nuclear potential (away from state crossings). The key questions now are how these training states are chosen such that they faithfully span the low-energy eigenstates as geometries change, as well as how the subspace Hamiltonian can be efficiently constructed without requiring the training data expressed in the exponentially large many-electron Hilbert space.
As indicated via the explicit dependence on R for certain quantities in eqn (1), the training vectors are defined such that their numerical values remain fixed in an abstract orthonormal Hilbert space, regardless of the test geometry R at which we are evaluating the subspace model. This critically ensures that the overlap metric between many-body training states, 〈Ca|Cb〉, is independent of R. However, if this training basis is to be interpreted as a set of physical wave functions at each geometry rather than abstract vectors, then it is worth stressing that their character does indeed change with R, since the underlying Hilbert space of electronic configurations will change. Therefore, it is essential that we have a consistent and orthonormal representation of the Hilbert space at each geometry, such that the numerically fixed training vectors are transferrable between geometries and still span a space of relevance for the low-energy states of interest, rather than them spanning increasingly irrelevant parts of the many-body Hilbert space.
To motivate a judicious choice for these training vectors, we assume that they come from the exact, full configuration interaction (FCI), solution for a small number of low-energy eigenstates of the Born–Oppenheimer electronic Hamiltonian at select ‘training’ geometries of the system. However, we also need to fix a consistent representation of the orbitals, χi(r; R), which define the many-body Hilbert space for these vectors, such that the probability amplitudes of states {|Ca〉} can be effectively transferred between geometries. To this end, we choose the orbital basis of the training states, χi(r; R), to be the symmetrically (Löwdin) orthonormalized atomic-orbital (SAO) basis.76,77 These are simply and uniquely derived from an underlying atom-centered atomic orbital basis, {ϕα(r; R)}, as
(2) |
(3) |
These SAOs are defined to remain (in a least-squares sense) as close as possible to the underlying local atomic orbital basis, while ensuring the required orthonormality of this abstract basis at each atomic geometry.‡ This choice of representation for the training states {|Ca〉} (obtained over a range of geometries) is motivated by the fact that much of the local character of the correlated many-electron state in this basis will remain similar as atoms move by small amounts. In addition, nearby atoms with similar chemical bonding will also have common features in their many-electron quantum fluctuations characterizing e.g. covalent bonding character. Therefore, the numerical values of the probability amplitudes of the states in this representation will plausibly remain ‘close’ to the states of interest at modified geometries, ensuring that the numerical values of the FCI many-electron states change least between the different electronic Hilbert spaces as the atomic positions change, and that the states can act as a general projector into the low-energy space as atoms move.
While this is a heuristic choice, we have the rigorous conditions that the inferred wave functions are strictly variational (for all geometries) with respect to increasing training data, as well as the exactness of all inferred states at test geometries which coincide with training geometries. From an ML perspective in its application to MD, the variationality of the model therefore ensures an inductive bias away from regions of the phase space which are poorly represented by the training data. The inferred states are at all points represented as a variationally optimal linear combination of the training states in their SAO representations, as
(4) |
No special structure is relied upon for the description of any of these states (other than the fact that their SAO-represented FCI vectors over the geometries of interest remain sufficiently close to a linear combination of the training states). Indeed, no mean-field information is used at any point in the framework. This provides confidence that strong electronically correlated states can be described, and that the procedure should be relatively unbiased for a description of multiple electronic states simultaneously, providing those states are equally represented in the training states. Furthermore, in contrast to the widespread machine-learning derived force fields now prevalent in molecular dynamics, no local energy decomposition is employed, and the fact that a valid many-electron state is propagated to all geometries means that all properties of interest can be predicted from the same model. This includes analytic nuclear forces, which are particularly straightforward due to the geometry-independence of the subspace definition and variational formulation.64 Furthermore, as we shall show in Sec. 2.3, nonadiabatic coupling vectors between inferred states are also straightforwardly obtainable, which places the approach as a suitable candidate to be an electronic structure solver for NAMD. Testing these assertions is at the heart of this work.
Finally, we note that FCI probability amplitudes in a local SAO representation are invariant to translation and rigid body rotations of the molecule (provided a consistent ordering of the underlying AOs). Furthermore, at dissociation the SAO probability amplitudes are unchanging for all states regardless of the extent of the dissociation, ensuring that all inferred wave functions in this important strongly-correlated limit should be consistently described. The appropriate choice of training geometries in which to support the model across e.g. a molecular dynamics trajectory is however critical to the success of the method. In Sec. 3.3 we develop an active-learning protocol to greedily update the training states in a self-consistent procedure across an MD trajectory, demonstrating convergence to near-exactness of NAMD over the trajectory.
(5) |
enabling the projection to be found as
(6) |
Crucially, once these tRDMs and overlaps between the training states are known, the simulation proceeds with the subspace Hamiltonian constructed via the contraction of eqn (6) in cost, where M is the number of training states and L is the number of basis functions in the system. This relatively low polynomial scaling contrasts with the generally exponential costs of accurate wave function solutions to the electronic structure problem to compute the training states, highlighting the significant speed up in this interpolation when many calculations are required across chemical space (and especially where multiple electronic states are required). Properties can then be extracted from the inferred state via its one- and two-body reduced density matrices (RDMs) represented in the SAO basis of each geometry, without any explicit recourse to the training states, as
(7) |
(8) |
This framework is demonstrated in Fig. 1, where we show a simple proof-of-principle for the one-dimensional phase space corresponding to the symmetric stretch of four Hydrogen atoms. The six lowest-energy FCI solutions are shown, including a state-crossing, and the training states supporting the interpolation in each panel are indicated by orange crosses. For the interpolated state, the span of the training space can be enlarged systematically by including either a larger number of higher-lying eigenstates at each geometry (which will necessarily be orthogonal at the same geometry), or a larger number of geometries (which will be non-orthogonal between different geometries). We show that the three lowest lying states can be smoothly interpolated to near-exactness (including a state crossing) with just three geometries, each contributing three states to the training space. This system is further benchmarked in Fig. 2. In all results of this work, we select the same number of training states at each geometry as the number of low-energy states of interest for the interpolation unless stated otherwise.
Finally, we note that while the framework was described with the use of exact (FCI) training states, other approximate electronic structure methods could also be used to define these training states. As long as N-representable tRDMs corresponding to physical wave functions, as well as overlaps, are defined in the method, then any approach could be used within the scheme and a variational state would be inferred at all points, motivating the framework as a tool to accelerate many different electronic structure methods. For the larger systems in Sec. 4, we therefore use the density matrix renormalization group (DMRG),78 which defines training states as matrix product states (MPS).
(9) |
(10) |
(11) |
The first order nonadiabatic coupling vectors between inferred states A and B are given, in its usual form, as
(12) |
The derivative of both the subspace expansion coefficients, x(A)b and the orbital basis need to be accounted for with two separate terms,
dAB = dcoefAB + dorbAB. | (13) |
The coefficient dependent term can be found using the subspace expansion in eqn (4), the orthonormality condition between the inferred states, xT(A)Qx(B) = δA,B (where Q = Qab = 〈Ca|Cb〉 is the overlap matrix between the training states), and the generalized Hellman–Feynman theorem, as
(14) |
(15) |
(16) |
(17) |
(18) |
Following a similar procedure to analytic CASSCF nonadiabatic coupling vectors,21,22 the orbital contribution, dorbAB is formulated as:
(19) |
(20) |
The electronic dynamics are propagated by the time-dependent Schrödinger equation as a superposition over the adiabatic surfaces at a given nuclear geometry as
(21) |
(22) |
(23) |
In this work we used the Newton-X85,86 package to perform the FSSH simulations, interfaced to our eigenvector continuation code which was called to obtain all interpolated electronic structure properties, and NACs between all pairs of interpolated states were included for the propagation of the electronic wave function and stochastic hopping. This also relied on PySCF for the computation of the required Hamiltonian, derivative integrals and FCI training data where used.81,82 This workflow was also initially benchmarked against the FCI Newton-X interface to OpenMolcas.62,87 A 5th order multistep integrator of Butcher88 was used to integrate eqn (21) with 20 multisteps. Decoherence corrections were applied through the simplified decay of mixing approach11 with a decay parameter of 0.1 Ha. All trajectories were checked for energy conservation.
Fig. 3 Comparison of the interpolated trajectory (dotted lines) against the exact results (solid lines) of a four-atom hydrogen chain starting in the equilibrium equidistant S1 state of a STO-3G basis. (a) Adiabatic energies along the trajectory (shown with black stars) with a nonadiabatic hop to the ground state. Insets showing the nuclear geometries. (b) Populations of propagated electronic wavefunction over the low-energy adiabatic states considered. (c) The average interatomic distance of the full chain (R1,Natm/(Natm − 1)) and the distance between the middle hydrogens (R2,3). (d) The transition probability from the current state of the trajectory to the other two states. The training wavefunctions used in the interpolation are taken from the three equidistant geometries shown in Fig. 2. |
We now consider the performance of the eigenvector continuation compared to the exact propagation in this toy system, where we select FCI training states only at the same three equidistant and evenly-spaced geometries as shown in Fig. 2. The key question is whether these non-equidistant, semi-dissociated nuclear geometries visited in the trajectory are well described in the scheme compared to FCI with these somewhat unrepresentative training points. On top of that, we want to see if it can capture the correct internal conversion time to the ground state as small changes in the stochastic hopping transition can result in significant differences in the dynamics. Fig. 3 demonstrates the success of the continuation even without any particular care in the training geometry selection. Despite the simplicity of this four-electron system, nine FCI training wavefunctions computed at three equidistant geometries seem to represent the low-energy Hilbert space (with a total of 256 Slater determinants) at all the relevant geometries. In addition to the energies, populations and nuclear geometries, the transition probabilities between different adiabatic states in Fig. 3(d) are also captured to a very high quantitative accuracy. This indicates the methods capabilities to extract accurate statistical averages over multiple surface hopping trajectories. It could perhaps be argued that the success here is only reflective of the simple ratio of the Hilbert space size to number of training states. We therefore address this question for larger systems in Sec. 4.
The crucial component of any active learning strategy is the selection criteria for the new data for the training set at each iteration. It is common to use a distance metric to choose the data that is most different from the ones in the training set. Since the main inputs for our inference are the 1- and 2-electron integrals at the geometries of interest, which determine all electronic states, we employed this ‘Hamiltonian distance’:
(24) |
as our distance metric where Rt are the geometries in the existing training set and, h(1)ij and h(2)ijkl are the electron integrals. This can be simply evaluated along with the inference over the MD trajectory, and satisfies the important property that the metric is zero for the interpolation at existing training geometries.
This metric was applied in a previous publication to converge the training states for ground state dynamics with eigenvector continuation, where the geometry with largest Dmin over the trajectory was solved with the electronic structure solver and added to the training data for the next MD run, iteratively improving the trajectory until convergence.64 This ensures that geometries from the trajectory are added to the training set that are ‘furthest’ (in this Hamiltonian distance sense) from the nearest training state. However, simply taking the maximum Dmin over the entire trajectory doesn’t yield the fastest convergence, as these points tend to correspond to the final geometries visited in the trajectory, where the furthest parts of the phase space are being explored. A better measure is to consider the turning points, i.e. peaks in the Hamiltonian distance, over the trajectory, as they signal two important scenarios. Firstly, they can point to true dynamical extrema such as the minimum and maximum separation of a nuclear oscillation. For an optimal interpolation (rather than extrapolation) of that motion, inclusion of these extrema is beneficial. The second scenario is when the continuation trajectory is in an explorative phase (i.e. we are searching for additional training geometries required, rather than running the final converged trajectory). As the scheme introduced in this paper necessarily overestimates the energies of atomic geometries in regions of phase space far from the training data due its variational nature, the MD has an inductive bias away from these spuriously high-energy regions. Following this, the trajectory hits an overestimated energy barrier as it moves towards these poorly described geometries and bounces back to a known part of the phase space. Thus, peaks in Hamiltonian distance also hint at these regions of the nuclear phase space where the addition of training data would be particularly beneficial to optimally improve the overall accuracy of the interpolation.
As the nuclear phase space explored during the dynamics changes as more training data is added to the model (since the inferred electronic potential changes), consideration needs to be made as to the trade-off between exploration and exploitation in the learning. Therefore some bias needs to be included in the heuristic metric for the choice of the training geometry to include, to preferentially select geometries from earlier times in the trajectory, before it has diverged too much from the exact converged path. This avoids unnecessary additional electronic structure calculations to include training geometries which may be unrepresentative of the phase space explored in the final converged trajectory. To account for this, the peaks in Dmin are weighted according to eqn (25), to bias the selection of training geometries that occur earlier in the trajectory,
(25) |
Here, the exponent x is a hyperparameter that determines the degree of weighting towards earlier geometries. At its limiting cases, it can reduce to either selecting the highest peak (x = 0) or the first peak (x → ∞) in the Hamiltonian distance, Dmin. We found x = 3 to result in a reasonably fast convergence, for the hydrogen chains studied in this work, providing a good balance between exploring the most unfamiliar geometries and including earlier peaks to ensure systematic convergence of the true trajectory from earlier to later times.
This learning strategy is demonstrated in Fig. 4 for converging the NAMD trajectory over five inferred states with respect to number of geometries used in the training (N) of an eight-atom hydrogen chain, using the five lowest-energy FCI states from each training geometry. The interpolated surfaces with the shown number of training geometries (N) are shown with orange dashed lines in panels (a)–(e), reflecting both the error in the interpolation and the divergence of the trajectories as a result, compared to the exact adiabatic surfaces from FCI (solid blue lines). All adiabatic surfaces were restricted to be from the same spatial symmetry as the ground state. The trajectory was started from the third excited state, S3, at the equilibrium equidistant linear chain geometry with zero nuclear kinetic energy and propagated with a timestep of 0.1 fs in the minimal STO-3G basis. The interpolated trajectory agrees almost exactly with the FCI trajectory by the time N = 14, as shown in 4(e). In this, the propagation follows S3 for around 20 fs which pushes the system to form two separate H4 clusters, before the trajectory hops to the S1 state pushing the middle hydrogens of each of these H4 clusters to form energetic dimers while ejecting the end hydrogens in opposite directions. Both of these can be seen in 4(o) where the dimerization of the second and third hydrogen atoms and the rapid increase in the end-to-end distance is observed just after 20 fs. After this, the system stays in S1 for 15 fs before jumping to the ground state around 35 fs. This, in turn, stabilises the dimers and allows the formation of another dimer in the middle between the fourth and fifth hydrogens that were ejected from their respective clusters. This leads to a final configuration of three vibrating H2 dimers in the middle with two atomic hydrogen atoms dissociating from the chain. This complex NAMD trajectory explores a lot of different regions of the nuclear phase space that would be difficult to select as training states a priori.
Fig. 4 Convergence of the active learning scheme for the nonadiabatic molecular dynamics trajectory of an eight-atom hydrogen chain with respect to number of geometries selected for training the model (N). Top panels (a–e) show the adiabatic energies along this trajectory (exact as blue solid lines, interpolated from N training FCI calculations as orange dashed lines). Bottom panels (k–o) display the scaled end-to-end distance of the chain, R1,Natm/(Natm − 1) and the distance between the 2nd and 3th hydrogen atoms along the chain, R2,3. Middle panels (f–j) show the minimum Hamiltonian distance (eqn (24)) over the geometries along the continuation trajectory with respect to the training geometries available at that iteration. The vertical lines in this panel represent the next geometry along the trajectory selected for inclusion in the model, based on the metric discussed in eqn (25). |
We can also use this to analyze the convergence of the active learning scheme for selected numbers of geometries up to the converged N = 14 trajectory, with the Hamiltonian-distance metric of eqn (24) shown in Fig. 4 panels (f)–(j). The geometries selected for subsequent inclusion in the training set from the path of the inferred trajectory are indicated by vertical lines in these panels, according to eqn (25). It can be seen that the peaks in the Hamiltonian distance serve as qualitative indicators for points of divergence between the trajectory on the inferred potential and the FCI trajectory. This is especially evident for N = 5 and N = 8 trajectories where the data selection in Fig. 4(g) and (h) corresponds to the geometries where the divergence starts in Fig. 4(b) and (c), respectively. Notably, the N = 5 trajectory underscores the significance of weighting the selection to earlier times since selecting the maximum Dmin would have lead to configurations unexplored within the true trajectory. This allows for a systematic convergence of data selection towards the true trajectory from earlier to later times. Moreover, prioritizing geometries in this way also facilitates the exploration of relevant regions in the phase space at later times in the MD, as illustrated for N = 11 in Fig. 4(d).
The focus on improving the accuracy of the model initially for earlier times ensures that the transition region between S1 → S0 can be predicted correctly before trying to iteratively improve the later time sampling of the relevant post-transition ground state dynamics. These are targeted from N = 11 onwards (see the data selection at 35 fs in Fig. 4(i)) when relevant parts of the phase space in these trajectories are being sampled.
Overall, the proposed heuristics appear to address the challenges of selecting a compact set of training geometries in a black-box and rapidly convergent fashion, minimizing the number of explicit electronic structure calculations required. This balances the challenges of considering both the time and the magnitude of the peak in the heuristic error estimate in the interpolated energy surfaces over the trajectories sampled. Overall, in this system qualitative convergence is achieved by the 11th training geometry, with quantitative accuracy attained after a total of 14 training geometries. Considering both the nuclear phase space explored and electronic complexity of this trajectory, the proposed learning scheme can train and replicate the true trajectory with remarkably few FCI training points. We anticipate however that further improvements could be made by including a description of the inferred states themselves in these heuristics since only the Hamiltonian distance to a single training point is considered. This will be investigated in future work.
In recent years, there has been progress in obtaining approximate DMRG-SCF excited state gradients and NACs33,91 as well as quantum dynamics simulations of realistic vibronic Hamiltonians with time-dependent DMRG.92–94 However, running full NAMD simulations with ab initio DMRG has practically been out of reach due to difficulties with exact analytical NACs, high computational cost and difficulty in ensuring a fully black-box and robust workflow over the many calculations. Here, we present the first NAMD simulation with fully ab initio MPS wavefunctions accelerated through the eigenvector continuation. We obtain DMRG training data via the spin-adapted and multi-state implementation in the block2 library, which can also straightforwardly obtain the required tRDMs and overlaps between the MPS training states.66,95,96
We consider the 28-atom one-dimensional hydrogen chain in a STO-6G basis with FSSH, where we initialize the chain with an equilibrium equidistant nuclear configuration with zero nuclear kinetic energy, photo-excited to the first excited electronic S1 state, and we consider the electron and nuclear dynamics over these two interpolated states. The training MPS wavefunctions were optimized to near-exact accuracy by exponentially increasing the bond dimension at each training geometry while decreasing the noise during the DMRG sweeps, simultaneously optimizing the ground and S1 state. A timestep of 0.5 fs was used for the FSSH trajectories.
The active learning scheme described in Section 3.3 was applied to converge a single nonadiabatic molecular dynamics trajectory of the H28 system that explores the relevant phase space and samples the dynamics on both S0 and S1 potential energy surfaces. It should be stressed that the training points selected by the scheme will not (in general) be featured in the final trajectory, since they are taken from different potential energy surfaces as the inferred model is iteratively improved. This suggests that while it converges the trajectory from a single seed, it should exhibit little bias towards this single trajectory in the final data selection. Similar accuracy should be found over a stochastic ensemble of trajectories, with the selection relatively insensitive to the precise details of the trajectory. The convergence was achieved with N = 22 geometries as shown in Fig. 5(a). In this, we show the convergence of the average variational energy of the S0 and S1 states with respect to N over all geometries of the same final converged trajectory. We find that the addition of the final two training points changes this metric for both states by less than chemical accuracy (1 kcal mol−1) across the entire trajectory, indicating confidence in convergence with respect to the training data. It is also worth noting that only two training geometries were sufficient to surpass the variational energy of Hartree–Fock over the trajectory, despite no mean-field information being considered in the interpolated states. The all-important S0 → S1 average excitation energy over the trajectory also converges faster than the absolute energies of the states, as shown in the inset.
For some context as to the accuracy of the interpolated surfaces, Fig. 5(b) illustrates the error in the interpolated S0 and S1 energies at the final N = 22 training geometries, compared to the explicit DMRG used in its training, along with additional S0 energy comparison to Hartree–Fock and Coupled Cluster Singles and Doubles (CCSD) theory. We find that the interpolated energies are exact (to numerical precision) compared to the DMRG calculations, as expected. However, at two geometries, the continuation scheme yields a slightly lower variational energy than the DMRG training energies for the S1 state, indicating that a small further optimization of this state with DMRG would be possible by increasing the bond dimension. The H28 system is close to an ideal system for DMRG due to its one-dimensional nature, so suboptimal convergence behavior was rare in this case. However, this showcases that convergence difficulties in the training for a particular geometry is not as problematic for the performance of the continuation model, as it would likely be during a traditional MD trajectory. This is because the model will ‘borrow’ electronic character from other training states to ensure that the surfaces remain smooth and therefore variationally improve upon unoptimized training states even at the training geometries themselves. This behavior was more evident when applied to the ground state dynamics of the Zundel cation reaction in the previous work.64
This unreliability in convergence of multiple single-point calculations can even be evidenced in comparison CCSD energies at the training geometries – generally quite a robust electronic structure method, shown in Fig. 5(b). Although it is well known that CCSD will struggle to capture the multi-reference nature of the more dissociative geometries due to its intrinsic single-reference formulation (geometry 2), it can also converge to the wrong state (geometry 17) or fail to converge at all (geometry 22) with simple default simulation parameters in widely used software packages. Isolated convergence issues are common when running lots of single-point calculations with different electronic character as are necessary in MD with many-body methodologies, and the proposed scheme ensures the smoothness of the potential energy surface and thus the physicality of dynamics. Even without these points however, CCSD would not reach chemical accuracy (even for energy differences) across the trajectory.
Using these DMRG training wavefunctions for H28 we find the dynamics over 60 fs for an ensemble of 100 trajectories with different random seeds, all starting at the equilibrium geometry with equally separated hydrogen atoms, in the S1 electronic state. This is a total of 12000 single-point calculations, all inferred from the same 22 DMRG training calculations. The averages over this ensemble of trajectories for the electronic population of each state and atomic distances are shown in Fig. 6 to better understand the nature of the nonadiabatic molecular dynamics in this H28 system.
The average electronic state populations are shown in Fig. 6(a), with the fastest rate of internal conversion between 30–50 fs after release, indicated in particular by two sharp jumps, with almost all trajectories in the ground S0 state after this time. In addition, there exists a handful of trajectories that hop back to the higher-energy S1 state after hopping to S0, especially among the trajectories that transition to S0 early (before 10 fs), which can be seen by the dips in S0 population around 5–15 fs. Considering the nuclear geometries along the trajectories, it is clear that the hydrogen atoms at the ends of the chains rapidly dimerize for all trajectories, which is shown in Fig. 6(b). This contrasts with the H4 and H8 systems, where their terminal atoms were ejected before being able to dimerize. In the longer chain dynamics, the frequency and amplitude of these terminal dimer vibrations also seems to change with time, which differ from the ground state behaviour where a constant frequency and amplitude is maintained throughout the dynamics. This might imply that the first excited state of H28 is not as rapidly dissociative (or dissociative at all) unlike the smaller chains, or that the smaller excitation energy in this longer chain leads to faster decay to the ground state which favors dimerization.
The end-to-end distance of the 28-atom hydrogen chain increases linearly with time as seen in Fig. 6(c) with relatively little scatter in this rate over the different trajectories. The rate of this increase is very similar to the one observed in ground state hydrogen chain dynamics in Rath et al.64 Note that the dynamics in that work started from a geometry that was 10% stretched from the equilibrium, while this work starts from the first excited state of the equilibrium equidistant geometry. This suggests that the electronic energy gained by that initial stretch results in a equivalent chain length behaviour as a photo-excitation to S1. Fig. 6(e) shows the maximum distance between any two neighbouring hydrogen atoms, which essentially describes the distance between the terminal dimers in the chain which move apart from each other at the fastest rate (i.e. the H2–H3 distance).
Unlike the consistent dimerization behavior at the ends of the chain, the behaviour in the middle of the chain seems to vary significantly depending on the trajectory, as seen in Fig. 6(d), exploring a large phase space. Continuous oscillations of some trajectories about 1.5 a0 are visible, indicating that a portion of trajectories form dimers between the 14th and 15th hydrogen atoms. This would result in a frustrated dimerization of the overall chain, as full dimerization would require the 14th and 15th hydrogen atoms to dimerize with their other neighbours, not with each other. This therefore requires other non-dimer configurations in the chain for those trajectories, either forming trimers, unbound hydrogen atoms, or other larger hydrogen complexes.
We analyze the geometries that emerge in the last 5 fs of the 100 trajectories computed in this work, to identify the different nuclear configurations that result. Of these trajectories, 27% form perfect dimerization throughout the chain, where the distance within dimers is always smaller than the distance to their next neighbour, leaving the majority without this expected order. 39% of all trajectories have hydrogen atoms being shared with two separate dimers where the bond separation within both of these dimers is less than the distance to their other neighbours. Only 9% of trajectories can be characterized as having a ‘free’ hydrogen where the distance to its neighbours is always more than the distance between the next consecutive hydrogens. Finally, the rest of the trajectories (25%) don’t fit any of these criteria and can be thought of as transition states between these configurations. Videos showcasing representative trajectories are available in the ESI.†
The variation in the trajectories all appears simply due to the stochastic nature of the hopping to the ground state in the FSSH approach, which can result in significant deviations in this system and the exploration of different local minima in the nuclear phase space. A consideration of nonadiabatic models beyond FSSH would be interesting, to see whether these differences persist, as well as to gain further insights into the electronic behaviour over time (e.g. (transition) dipole moments and absorption spectra), which are all accessible from the interpolated states. In addition, there are a number of other physical mechanisms to consider, in particular the effect of proton tunnelling, as well as the incomplete nature of the basis set for the electronic states, which will be considered in future work. While the electronic phase of the clamped equidistant hydrogen chain was found to exhibit surprisingly rich physics,90 it appears that the nuclear dynamics similarly exhibits significant intriguing and subtle physics. We suggest that this system could also be used as an effective benchmarking test bed for nuclear dynamics developments in correlated electron systems, as well as for the electronic structure solvers on which they so heavily depend.
We demonstrate the approach with fewest-switches surface hopping on one-dimensional hydrogen chains, which exhibits a surprisingly rich dynamical behaviour and wide range of trajectories. For the largest H28 system, we use 22 training states as matrix product states from single-point DMRG calculations, and infer 12000 points on a smooth multi-state electronic surface converged to chemical accuracy from this DMRG training (including their analytic atomic forces and NACs) to sample the nuclear trajectories at (hybrid) mean-field scaling. The electronic wavefunctions at each point are represented as variationally optimized linear combinations of the training states, which are themselves optimized in the training phase at selected nuclear geometries. This relies on a consistent and transferable representation of the many-body probability amplitudes, described in an atomic-local representation to facilitate this transferability of the low-energy probability amplitudes to different nuclear configurations. We discover a surprisingly broad spectrum of surface-hopping trajectories for this hydrogen chain, where the simple dimerization of all atomic pairs represent only a minority of the observed trajectories. While nuclear quantum effects are likely to contribute in this system and be an interesting direction for future work, the transitions are below the total initial energy and proceed energetically downhill and are therefore unlikely to be a leading order effect. The nonadiabatic dynamics of this simple system could therefore represent an effective sandpit in the further development and comparison of hybrid classical–quantum molecular dynamics schemes, as well as their dependence on underlying correlated electronic structure methodologies.
From a machine-learning perspective, this approach circumvents many of the traditional approximations and assumptions of local energy based decompositions for machine-learned force field parameterizations. The presence of an explicit many-body wavefunction and variational energy at each nuclear configuration ensures an inductive bias away from poorly represented regions of phase space. All desired observables are accessible within the same model, with a smoothly varying and physical electronic state at all times. The scope for accelerating the numerous situations where multiple sequential electronic structure calculations are required is clear – from molecular dynamics to geometry or transition path optimization, vibrational spectroscopy and beyond.
Of course, many questions still remain. Chief amongst them is the question of how the number of training points required for a given accuracy scales with the nuclear phase space sampled. While this is likely to be system-dependent, it builds on the fundamental question of how transferable the low-energy physics is between these locally represented many-body states. While the nuclear phase space of the H28 was relatively small given the one-dimensional and inversion-preserving configurations sampled due to the initial conditions, it was still large enough to make this a highly non-trivial interpolation from just 22 training points. This question of training set size must also necessarily consider the effect of more realistic basis set sizes. Currently, limitations still exist in obtaining the training data, and more approximate methods will need to be investigated, as well as circumventing the quartic scaling memory costs for the 2-tRDMs which are required. These are all critical questions for the long-term viability of this approach, which holds the promise for efficient acceleration of a wide number of correlated electronic structure methods.
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4fd00062e |
‡ It is possible to work directly with a non-orthogonal AO representation of the abstract many-electron representation of the training states, but this then requires a rotation of the many-body states at each geometry. This was explored in ref. 74, but entails exponential complexity for each inference and therefore it primarily motivated application for quantum computers. |
This journal is © The Royal Society of Chemistry 2024 |