Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

High-dimensional neural network potentials for accurate vibrational frequencies: the formic acid dimer benchmark

Dilshana Shanavas Rasheeda *a, Alberto Martín Santa Daría b, Benjamin Schröder c, Edit Mátyus b and Jörg Behler a
aUniversität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraβe 6, 37077 Göttingen, Germany. E-mail: dilshana.rasheeda@chemie.uni-goettingen.de
bELTE, Eötvös Loránd University, Institute of Chemistry, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
cUniversität Göttingen, Institut für Physikalische Chemie, Tammannstraβe 6, 37077 Göttingen, Germany

Received 23rd August 2022 , Accepted 18th November 2022

First published on 24th November 2022


Abstract

In recent years, machine learning potentials (MLP) for atomistic simulations have attracted a lot of attention in chemistry and materials science. Many new approaches have been developed with the primary aim to transfer the accuracy of electronic structure calculations to large condensed systems containing thousands of atoms. In spite of these advances, the reliability of modern MLPs in reproducing the subtle details of the multi-dimensional potential-energy surface is still difficult to assess for such systems. On the other hand, moderately sized systems enabling the application of tools for thorough and systematic quality-control are nowadays rarely investigated. In this work we use benchmark-quality harmonic and anharmonic vibrational frequencies as a sensitive probe for the validation of high-dimensional neural network potentials. For the case of the formic acid dimer, a frequently studied model system for which stringent spectroscopic data became recently available, we show that high-quality frequencies can be obtained from state-of-the-art calculations in excellent agreement with coupled cluster theory and experimental data.


1 Introduction

Due to its unique capabilities to process and analyze large amounts of data, machine learning has nowadays found numerous applications in chemistry and related fields.1–5 Starting with the work of Doren and coworkers in 1995,6 a prominent example is the representation of the atomic interactions with quantum mechanical accuracy by learning the potential energy surface (PES) from a set of known training points computed using accurate electronic structure methods. The resulting machine learning potentials (MLP) have rapidly evolved in the past two decades.7–12 While first-generation MLPs have been restricted to small molecular systems,13,14 second-generation MLPs like high-dimensional neural network potentials (HDNNPs),15 Gaussian approximation potentials,16 spectral neighbor analysis potentials,17 moment tensor potentials,18 atomic cluster expansion19 and many others have paved the way to large-scale simulations of condensed systems, from water and aqueous solutions to bulk materials and interfaces. Long-range electrostatic interactions,20–22 non-local charge transfer,23–27 and magnetism28,29 can also be explicitly considered in modern MLPs. Neural networks have also been used to study reactive molecular systems in the gas phase.30–32

Due to their very flexible but simple functional form MLPs offer advantages like a very high accuracy of about 1 meV per atom in reproducing known reference total energies, an efficiency close to empirical force fields, and an unbiased description of many types of atomic interactions – from covalent bonds via dispersion interactions to metallic and ionic bonding. Still, although some recent MLPs include physical terms like electrostatics, in general the high flexibility of MLPs is curse and bless at the same time, since the functional forms employed in MLPs, like neural networks or kernel methods, are not guaranteed to yield the correct physical shape of the PES. Therefore, large training sets are needed to ensure that a reliable representation of the PES is obtained in the training process. Moreover, a careful validation is required, since energy and force predictions can be inaccurate when extrapolating to atomic configurations that are too different from those in the training sets.

These limited extrapolation capabilities of MLPs and the need for a dense enough sampling of the reference data set raise the question how modern MLPs can be validated. While for early MLPs constructed for small molecular systems a systematic, grid-based mapping of the underlying PES has been possible allowing for rigorous quality control, the situation is different for second-generation MLPs designed for very large systems. Here, the total energy is typically constructed as a sum of environment-dependent atomic energies, resulting in a linear scaling of the computational effort with system size enabling simulations of thousands of atoms. However, since the atomic energies, which are not quantum mechanical observables, can be considered as mathematical auxiliary quantities, error compensation might occur reducing the transferability of the potentials. Further, typically investigated quantities like the root mean squared errors (RMSE) of energies and forces are of limited use, since they can only be computed for the available reference structures, and as averaged properties they may not allow to assess the quality of all the fine details of the PES.

Here, we propose to employ the harmonic and anharmonic vibrational frequencies as a sensitive probe to validate the quality of MLPs pinpointing on close-to equilibrium structures, which are particularly important for spectroscopic applications. While far-from equilibrium behaviour is in principle equally interesting also for spectroscopy, any PES not being able to describe the equilibrium properties correctly could not be trusted for spectroscopic applications. This tool can thus be viewed as complementary to other procedures like active learning that are covering a wide range of non-equilibrium structures.33–35

Vibrational frequencies have been studied since the advent of MLPs, mainly for small molecular systems, with great success,36,37 and very accurate approaches for constructing MLPs suitable for this purpose have been derived.38–40 Still, most of these methods exhibit an unfavorable scaling with system size. Second-generation potentials allow to address larger systems, and some vibrational studies for larger molecules,41 clusters,42 and condensed systems43–46 have been reported. However, MLPs suitable for condensed systems have been typically constructed relying on density functional theory (DFT), which, although offering a good compromise between accuracy and efficiency for many systems, does not provide spectroscopic-quality vibrational frequencies. Moreover, studying complex condensed systems does not allow to disentangle all the subtle atomic interactions, which would be required for a thorough validation of the PES.

In the present work, we fill this gap by investigating in detail the accuracy that can be achieved by HDNNPs as a typical example for second-generation MLPs. We explore the limits of this method, which has originally been designed for dealing with very large numbers of atoms, by carefully training a coupled-cluster-quality47 PES for a moderately sized system. This does not only allow us to compute harmonic frequencies but also to assess the quality of highly accurate anharmonic frequencies using state-of-the-art methods under well-controlled conditions.

As a model system we have chosen the formic acid dimer (FAD), a doubly hydrogen-bonded complex, which in recent years has become a benchmark system for the development of molecular PESs and the calculation of accurate vibrational spectra thanks to the increasing body of gas-phase spectroscopic data.48–54 It is the smallest hydrogen bonded complex with double proton transfer and as such it is a system of high interest for spectroscopy and theoretical dynamic studies due to the possible delocalization of the nuclei over the two wells. This double proton transfer is challenging since it cannot be described by standard near-equilibrium tools based on normal coordinates and perturbation theory. Even for this seemingly small ten-atom system and its 24-dimensional PES the determination of accurate anharmonic frequencies is computationally demanding and thus hardly accessible by a direct application of wave function electronic structure methods. For this reason the standard approach is the intermediate construction of an analytic PES for the determination of vibrational frequencies.

Hence, in general, theoretical studies on vibrational frequencies have three limiting factors: the electronic structure method, its representation by a multi-dimensional PES function, and the vibrational treatment. Regarding the first two aspects, two ab initio-based analytic PESs have been proposed for FAD in the literature. In 2016, Qu and Bowman developed the first full-dimensional potential energy surface55 (henceforth labelled as QB16) for FAD by fitting permutationally invariant polynomials, a very accurate method providing PESs of a quality similar to modern MLPs,56 to 13[thin space (1/6-em)]475 ab initio energy points computed at the CCSD(T)-F12/haTZ level of electronic structure theory. Later, they carried out vibrational configuration interaction computations in normal coordinates57–59 on this surface. Recently, two of us used the QB16 PES and tested, using a reduced dimensionality model, the utility of normal coordinates or a possible efficiency gain of using curvilinear (normal) coordinates60 in the GENIUSH program.61 During this work, several fundamental, combination, and overtone frequencies in the fingerprint range were obtained in an excellent agreement with experiment.53 However, two (totally symmetric) fundamental frequencies were obtained strongly blueshifted in comparison with the harmonic frequencies of the QB16 PES55 and in comparison with the experimental value.53 Based on these observations and due to some ‘artificial’ features of the QB16 PES that made it necessary to restrict the quadrature grid used for the vibrational computations, it was concluded that further work on improving the FAD PES is required.

In 2022, Käser and Meuwly reported another full-dimensional PES (PhysTL) for FAD62 generated by the message passing neural network PhysNet.22 It is based on 26[thin space (1/6-em)]000 MP2/aug-cc-pVTZ single point energies, which have then been transfer-learned employing 866 CCSD(T)/aug-cc-pVTZ energies to obtain an approximately coupled cluster-quality PES that has also been used in the computation of harmonic and anharmonic vibrations.

Regarding experiment, FAD is a very well-studied system (see, e.g. ref. 53 and 54 and references therein), and a wealth of data is available for the validation of theoretical frequencies. In early work, thermal gas phase spectroscopy of FAD has been employed,63,64 while more recently jet-cooled infrared and Raman spectra of FAD in the monomer finger print region up to 1500 cm−1 have been studied in ref. 54. Over the past one and a half decades, all intermolecular vibrational fundamentals and several combination and overtone bands of FAD have been determined in the gas phase with an experimental uncertainty of 1 cm−1.48–52,65–69 The experimental results including a critical evaluation of theoretical work, which appear to be still lagging behind experiment, have been recently reviewed in ref. 53.

The aim of the present work is to develop a robust and full-dimensional high-quality HDNNP for FAD and to benchmark the obtained frequencies using the best available theoretical and experimental data. By robustness, we mean that the 24-dimensional (24D) hypersurface provides a faithful representation (possibly without ‘holes’ or other unphysical features) of the ab initio electronic energy over the relevant quantum dynamical range, and this robustness persists irrespective of the actual choice of (normal or curvilinear) internal coordinates. By accuracy, in this context, we mean that the fitted hypersurface reproduces ab initio points sufficiently closely. ‘Sufficient’ is determined in relation with prospective (ro)vibrational computations which are to be compared with (gas-phase) experimental infrared and Raman spectroscopy data.53

Next to the RMSE, which is the most common quality measure of PESs, we make use of additional quantities in this work that allow further refinement of the potential energy surface for spectroscopic applications. For this purpose, first, we define an accuracy goal for the harmonic frequencies that are expected to be reproduced by the PES to within 10 cm−1 with respect to the ab initio harmonic frequency values computed at the same level of electronic structure theory. Furthermore, to have a relatively compact assessment of the mode coupling representation in the PES, we test the second-order vibrational perturbation theory (VPT2) frequencies of the PES again compared to the direct ab initio values as well as experimental results. Finally, although a semi-rigid description involving relatively small amplitude vibrations about an equilibrium structure of the PES appears to be a good starting point for FAD, the concerted proton tunneling of the double hydrogen bond qualifies this complex for the family of systems with multiple (two) large-amplitude motions. ‘High’-dimensional systems with multiple large-amplitude motions, i.e., motions in which nuclei are delocalized over multiple PES wells, are common in molecular systems and cannot be efficiently described by using perturbative methods developed about equilibrium structure properties (underlying the normal coordinate concept). An efficient quantum dynamics description of these types of systems is currently an active and challenging field for methodological developments.70–76 These developments can be tested and validated with respect to precise spectroscopy data, assuming that a faithful and accurate PES representation for the molecular system is available.

After giving a brief summary of the employed methods in Section 2 and the computational details in Section 3, the results are presented in Section 4. First, we assess the quality of the HDNNP, which is obtained by iteratively increasing the amount of reference data, until a converged potential is obtained. This PES is then characterized by its harmonic frequencies, which are compared to coupled cluster data. Finally, we report anharmonic frequencies obtained from VPT2 and reduced-dimensionality variational calculations, which allow direct validation of the MLP using reference ab initio calculations and accurate experimental data.

2 Methods

2.1 High-dimensional neural network potentials

High-dimensional neural network potentials have been introduced in 2007 by Behler and Parrinello15 as the first type of MLP applicable to large condensed systems containing thousands of atoms. This is achieved by representing the potential energy E of the system as a sum of atomic energies Ei, which depend on the local chemical environment up to a cutoff radius,
 
image file: d2cp03893e-t1.tif(1)
Each of these atomic energies is the output of an individual atomic feed-forward neural network describing the functional relation between the respective energy contribution and the local atomic structure. The weight parameters of these neural networks are determined in an iterative training process using known energies (and often also forces) obtained from reference electronic structure calculations. The weight parameters and architectures of all atomic neural networks of a given chemical element are constrained to be the same making the potential transferable to different system sizes.

To ensure the mandatory translational, rotational and permutational invariances of a HDNNP based PES, the local atomic environments are typically characterized by vectors of atom-centered symmetry functions (ACSF)77 as geometric descriptors, which meet these requirements by construction and provide local structural fingerprints. The inclusion of a cutoff function ensures that the ACSFs smoothly decay to zero in value and slope at the cutoff radius, which is commonly chosen between 5 and 10 Å.

Several extensions to second-generation HDNNPs have been introduced in recent years, like the consideration of long-range electrostatic interactions based on environment-dependent charges,20,42 non-local charge transfer,24 and magnetic interactions.28 More details about HDNNPs and their properties can be found in several recent reviews-78–80 In the present work a second-generation HDNNP relying on “short-range” atomic energies only is employed, since for a comparably small benchmark system like the FAD all interactions can be fully described as a function of the local chemical environments provided that a sufficiently large cutoff radius is chosen.

2.2 Frequency calculations

2.2.1 Harmonic frequencies. The determination of harmonic frequencies ω is a routine task in theoretical spectroscopic investigations.81 Here, we construct the Cartesian second derivative matrix (Hessian) by means of finite differences based on the HDNNP. Mass weighting and diagonalization then yields the desired harmonic vibrational frequencies. These are then compared directly to the corresponding results of reference ab initio calculations.
2.2.2 Second-order vibrational perturbation theory. The anharmonic fundamental transition frequency νi within VPT2 is given by82
 
image file: d2cp03893e-t2.tif(2)
where ωi is the harmonic vibrational wavenumber and the xij are the anharmonicity constants that account for anharmonicity in the vibrational mode i as well as the coupling to other modes j. Eqn (2) is based on a quartic force field (QFF), i.e., a potential energy expression in terms of dimensionless normal coordinates q = {qi} given by a Taylor expansion up to fourth order,82
 
image file: d2cp03893e-t3.tif(3)
In eqn (3) the so-called cubic and quartic force constants are denoted by ϕijk and ϕijkl, respectively. A contact transformation of the (ro)vibrational Hamiltonian83–85 up to second order then yields formulae for the xij in terms of molecular parameters,
 
image file: d2cp03893e-t4.tif(4)
and
 
image file: d2cp03893e-t5.tif(5)
where the resonance denominator Δijk is given by
 
Δijk = (ωi + ωj + ωk)(ωi + ωjωk)(ωiωj + ωk)(ωiωjωk).(6)
The last term in eqn (5) depends on the three equilibrium rotational constants Bαe and the Coriolis coupling constants ζαij.

A closer look at eqn (4) and (5) shows that they contain differences of harmonic frequencies in denominators which may lead to so-called Fermi-Resonances. Two cases need to be accounted for within VPT2: ωi ≈ 2ωj and ωiωj + ωk. In fact, such vibrational resonances have been well documented experimentally for FAD (see ref. 54 and references therein) and therefore can be expected to interfere in the present VPT2 treatment. In order to allow a comparison between VPT2 results and experiment a special treatment is required which is sometimes referred to as GVPT2.86 First, a change in the contact transformation removes the resonance denominators from the xij to yield so-called deperturbed anharmonicity constants image file: d2cp03893e-t6.tif.87 In a second step the resonating vibrational states are treated using perturbation theory for (near) degenerate states88 where the coupling matrix element depends on a cubic force constant, i.e. ϕijj or ϕijk.

2.2.3 Reduced-dimensionality variational vibrational computations. A complete quantum dynamical characterization of a molecular system can be obtained by the variational solution of the (ro)vibrational Schrödinger equation including the multi-dimensional PES as an ‘effective’ interaction acting among the nuclei. The vibrational Hamiltonian, as a sum of the PES, E, and the kinetic energy operator written in general coordinates is
 
image file: d2cp03893e-t7.tif(7)
where [p with combining circumflex]k = −i∂/∂qk (k = 1,…,D ≤ 3Natom − 6), and the qk are vibrational coordinates whose definitions (and choice of the body-fixed frame) are encoded in the mass-weighted metric tensor, g[Doublestruck R]D×D, with [g with combining tilde] = det[thin space (1/6-em)]g and G = g−1. Rigorous geometrical constraints (D < 3Natom − 6) of selected internal coordinates can be introduced in the quantum mechanical vibrational model by ‘deleting’ the relevant rows and columns of the g matrix. A numerical kinetic energy operator (KEO) approach based on this formalism has been implemented in theGENIUSH computer program,61i.e., g, [g with combining tilde], and G are evaluated at grid points. The original implementation relied on the discrete variable representation (DVR)89 of the vibrational Hamiltonian, but more recently, this numerical KEO approach has been used with finite basis representation (FBR) and the Smolyak scheme,70,71,90 which opens the route towards higher-dimensional computations. For FAD, the lowest-energy vibrational frequencies from the fingerprint range have been converged for a series of reduced-dimensionality vibrational models defined in ref. 60.

The first applications with the new HDNNP based PES developed in the present work use an 8-dimensional vibrational model termed 8D([scr J, script letter J]t). This approach includes the six intermolecular modes ([scr J, script letter J]) and the cistrans torsional degree of freedom (t) of both monomers (see Fig. 1), which was found to perform reasonably well in ref. 60.


image file: d2cp03893e-f1.tif
Fig. 1 Formic acid dimer in its equilibrium structure. The intermolecular (R,θ,ϕ,α,β,γ) and the two cistrans torsional (τ(1) and τ(2)) coordinates used in the 8-dimensional variational vibrational computations are also shown.

3 Computational details

The reference coupled cluster calculations for generating the training data set have been performed using MOLPRO 2019.91 The explicitly correlated frozen-core (fc-)CCSD(T)-F12a92–94 method has been used in conjunction with an aug-cc-pVTZ basis95 for carbon and oxygen atoms and a cc-pVTZ basis96 for hydrogen atoms. In the following this atomic orbital basis will be abbreviated haTZ. A VTZ/JKFIT basis97 has been employed for the resolution of identity approximation. Reference ab initio equilibrium geometries and harmonic frequencies were obtained using either default convergence criteria or a tight definition which corresponds to: a lower threshold for screening of two-electron integrals (twoint = 10−16) and the energy (energy = 10−10) for all CCSD(T)-F12a energy evaluations, improved convergence of the geometry optimization (gradient = 10−6 and step = 10−6) employing a fourpoint numerical gradient and finally reduced step sizes of 0.005 a0 for both optimization and numerical Hessian. Furthermore, reference QFF calculations were performed using MOLPRO 2022, which provides the necessary advanced features of the XSURF program98,99 for the development of multi-dimensional PESs. These latter calculations also employ the tight settings.

The HDNNPs have been constructed using the program RuNNer.78,79 Several architectures of the atomic neural networks have been tested containing two hidden layers with up to 14 neurons each. A large cutoff radius of 15.0 bohr has been used to ensure that in case of the FAD all atoms are included in all atomic environments. The parameters of the atom-centered symmetry functions describing the atomic environments are given in the ESI. For the training process, the available data has been randomly split into a training set (90%) and an independent test set (10%) not used in the iterative weight optimization, which employs a global adaptive extended Kalman filter.100

The reference data set has been generated in several steps. An initial set of HDNNPs has been constructed using the energies of 13[thin space (1/6-em)]475 structures from the work of Qu and Bowman.58 This data has then been further extended by including additional structures obtained from active learning, i.e., by comparing the energy predictions of different HDNNPs. If for a given structure the deviation between these predictions was above a specified error threshold, a CCSD(T) calculation has been carried out for the respective structure, which has then been added to the data set to further refine the potential.

Several strategies have been employed to search for structures not well-represented, which are geometries displaced along the 24 normal modes, geometries used in the numerical calculation of the Hessian, and structures obtained from molecular dynamics (MD) simulations at 100 and 300 K driven by preliminary intermediate HDNNPs using the n2p2101 and LAMMPS102 codes. Moreover, about 500[thin space (1/6-em)]000 structures corresponding to the direct product grid used in the variational vibrational computations60 were screened systematically and added if needed. A threshold for the predicted energy deviation of 1 meV per atom has been applied in the active learning for the MD simulations, while 0.02 eV per atom have been used for selecting geometries from the pool of grid structures. In total, this extended second data set contains the energies of 27[thin space (1/6-em)]372 FAD structures. Finally, a third data set has been constructed by adding another 1800 structures extracted from two-dimensional cuts of the PES corresponding to pairwise coupled harmonic modes, for which a threshold of 2 meV per atom has been used.

4 Results

4.1 High-dimensional neural network potentials

The HDNNPs have been trained using the three different data sets containing increasing numbers of structures as described in the previous section. The RMSEs of the energies of the resulting HDNNPs showing the best performance called HDNNP1 (11 neurons per hidden layer), HDNNP2 (10 neurons per hidden layer), and HDNNP3 (14 neurons per hidden layer) are compiled in Table 1. Since most of the energies of the Qu and Bowman data set are within 0.1 Eh with respect to the energy of the global minimum geometry, the high-energy region is only sparsely sampled. This is the reason for the large test set errors of all HDNNPs compared to the respective errors of the training set, indicating overfitting in the high-energy region beyond 0.1 Eh, which is particularly pronounced for HDNNP1 relying on the data of Qu and Bowman only. This overfitting is not present in the very well sampled low-energy region below 0.1 Eh, as can be seen in the bottom half of Table 1.
Table 1 Energy root mean squared errors (RMSE) of the training and test sets for the three HDNNPs trained using data sets containing increasing numbers of structures. The RMSEs of these potentials are given for the complete energy range covered in reference data and for the structures in the most relevant energy range below 0.1 Eh with respect to the global minimum. The numbers of structures included in the respective energy range for calculating the RMSEs are given in the second column while in both cases the HDNNPs have been trained to the full data range
PES Structures RMSE [meV per atom] RMSE [cm−1]
Training Testing Training Testing
Full energy range
HDNNP1 13[thin space (1/6-em)]475 2.43 13.99 196 1129
HDNNP2 27[thin space (1/6-em)]372 0.92 1.88 74 158
HDNNP3 29[thin space (1/6-em)]162 0.37 2.04 30 165
Energy range below 0.1 Eh
HDNNP1 12[thin space (1/6-em)]725 2.15 2.42 174 195
HDNNP2 26[thin space (1/6-em)]531 0.85 1.04 68 83
HDNNP3 28[thin space (1/6-em)]286 0.35 0.34 28 27


Fig. 2 shows the energy error of all training and test data points for the QB16 potential and the three HDNNPs. The QB16 potential performs better than HDNNP1 if only the original data of Qu and Bowman is used. The (unweighted) RMSE of the QB16 on this initial data set corresponds to 0.91 meV per atom (74 cm−1) to be compared with 2.43 meV per atom (196 cm−1) for HDNNP1. If, however, the data set is increased, the error of the HDNNPs is strongly reduced finally resulting in a very small RMSE of only about 0.35 meV per atom (28 cm−1) for HDNNP3 in the relevant energy range up to 0.1 Eh (ca. 21[thin space (1/6-em)]950 cm−1).


image file: d2cp03893e-f2.tif
Fig. 2 Energy difference ΔE = ECCEPES as a function of the reference energy ECC relative to the energy of the global minimum for the QB16 PES,55 HDNNP1, HDNNP2, and HDNNP3 (=FAD-HDNNP). The root-mean-squared errors (RMSE) for the HDNNPs are provided in Table 1.

4.2 Vibrational frequencies

4.2.1 Harmonic frequencies. As an important test for spectroscopic applications, next we compare the harmonic frequencies of the PESs with the respective ab initio values corresponding to the same level of electronic structure theory. As described in the introduction we consider an agreement within ±10 cm−1 as a requirement for this purpose.

Fig. 3 shows the harmonic frequency errors obtained for the three main HDNNPs of the PES refinement procedure. Similar to the energy RMSEs we find a continuous improvement starting from HDNNP1 showing some rather large deviations up to 52 cm−1, which are decreasing to at most 27 cm−1 for HDNNP2, finally reaching a very high quality in case of HDNNP3 with the largest deviation in the PES vs. ab initio harmonic frequencies being less than 7 cm−1. Overall, the frequency RMSEs of HDNNP1, HDNNP2, and HDNNP3 are about 27 cm−1, 9 cm−1, and 4 cm−1, respectively. Consequently, we choose HDNNP3 as the production-quality PES for vibrational calculations, which will be called simply ‘FAD-HDNNP’ in the following.


image file: d2cp03893e-f3.tif
Fig. 3 Deviations of the harmonic vibrational frequencies ωi with respect to the reference ab initio results (tight settings). Top panel depicts the deviations Δωi = ωiωref,i for different HDNNP versions during the potential refinement with HDNNP3 corresponding to the final FAD-HDNNP. The bottom panel compares these differences for harmonic frequencies calculated from the QB16 PES by Qu and Bowman,55 the harmonic frequencies published by Käser and Meuwly62 based on the transfer-learned PhysTL, and the final FAD-HDNNP (= HDNNP3) results. The reference ab initio results have been obtained at the level of electronic structure theory that was used for the development of the respective PES, i.e. fc-CCSD(T)/aug-cc-pVTZ for PhysTL (see ref. 62 for details) and fc-CCSD(T)-F12a/haTZ for QB16 as well as FAD-HDNNP. Root mean squared errors (RMSE) are provided in the legends.

The lower part of Fig. 3 provides a comparison of FAD-HDNNP harmonic frequencies with the other two published FAD PESs.55,62 Concerning the accuracy of the PES at the harmonic level, FAD-HDNNP seems to outperform the earlier QB1655 and PhysTL62 potentials, which exhibit RMSEs of 8 cm−1 and 14 cm−1 with maximum deviations of 28 and 30 cm−1, respectively, from the ab initio harmonic frequency values of the corresponding level of electronic structure theory.

The numerical values of the harmonic frequencies for the newly developed FAD-HDNNP (and also for the HDNNP1 and HDNNP2 development stages) are collected in Table 2 together with the ab initio frequencies. Interestingly, the numerical values of the ab initio frequencies at the chosen level of theory depend notably on the precise details of the MOLPRO computation setup as shown in the last two columns of the table. The determined 4 cm−1 RMSE with a maximum deviation of 7 cm−1 of FAD-HDNNP appears to have reached the current accuracy limit of the available electronic structure methodology employing commonly used default settings, as even the MOLPRO frequencies exhibit changes up to 13 cm−1 in exceptional cases like ω15 when using tight settings. Still, even in this case the RMSE of FAD-HDNNP with respect to the tight MOLPRO data is very similar to the RMSE with respect to the default settings with only a marginally increased maximum deviation of 8 cm−1 in case of ω12 and ω13.

Table 2 Comparison of the harmonic frequencies ωi (in cm−1) for the three HDNNPs and the respective reference ab initio results obtained at the fc-CCSD(T)-F12a/haTZ level of theory. The latter are computed using either the standard convergence criteria (termed default) or stricter convergence criteria (termed tight) with respect to the electronic energy and geometry optimization as well as smaller step sizes in the numerical gradient and Hessian (cf. Section 3)
Mode Sym. HDNNP1 HDNNP2 FAD-HDNNP Ab initio
HDNNP3 Default Tight
ω 1 A g 3218 3212 3209 3203 3207
ω 2 A g 3060 3077 3103 3105 3103
ω 3 A g 1708 1723 1722 1717 1718
ω 4 A g 1443 1485 1486 1484 1482
ω 5 A g 1376 1417 1410 1413 1411
ω 6 A g 1254 1249 1257 1257 1255
ω 7 A g 689 681 688 688 686
ω 8 A g 216 215 214 211 210
ω 9 A g 164 173 167 171 170
ω 10 B g 1073 1083 1083 1085 1083
ω 11 B g 980 960 957 960 955
ω 12 B g 257 241 257 258 249
ω 13 A u 1128 1106 1109 1102 1101
ω 14 A u 937 983 979 986 985
ω 15 A u 213 161 180 186 173
ω 16 A u 68 71 71 76 68
ω 17 B u 3324 3323 3312 3305 3309
ω 18 B u 3108 3085 3100 3101 3099
ω 19 B u 1773 1783 1784 1782 1782
ω 20 B u 1411 1459 1459 1456 1453
ω 21 B u 1358 1414 1410 1405 1407
ω 22 B u 1208 1247 1260 1260 1260
ω 23 B u 704 718 712 716 715
ω 24 B u 281 275 275 278 276


The equilibrium structure of FAD obtained from the reference ab initio calculations can be compared to the minima of the HDNNPs. This is done in Table 3 and additional results obtained with the QB16 PES55 are provided for comparison. Due to symmetry only 9 internal coordinates are required to characterize FAD. Similar to the harmonic frequencies, the agreement of the HDNNP geometrical parameters with the reference ab initio results improves with the refinement procedure. The largest difference is observed for the r(O⋯O) distance which at the same time appears to be the most sensitive coordinate exemplified by a comparably large variation of 0.001 Å between the default and tight reference ab initio geometries. Nevertheless, the agreement of HDNNP 3 (= FAD-HDNNP) with the (tight) reference ab initio geometry is good with an RMSE of 0.002 Å for the bond distances and 0.04° for the angular coordinates.

Table 3 Geometrical parameters of the formic acid dimer minimum structure optimized at the reference ab initio level of theory and determined for different PESs. Bond lengths are provided in Ångströms and angles in degrees. HDNNP3 corresponds to the final FAD-HDNNP for spectroscopic use
Parameter Ab initio QB16a HDNNP1 HDNNP2 HDNNP3
Default Tight
a Qu and Bowman.55
r(O–H) 0.9934 0.9932 0.9927 0.9945 0.9925 0.9936
r(C–H) 1.0929 1.0930 1.0929 1.0927 1.0937 1.0927
r(C–O) 1.3113 1.3114 1.3116 1.3104 1.3121 1.3112
r(O⋯O) 2.6748 2.6758 2.6778 2.6729 2.6791 2.6709
r(C[double bond, length as m-dash]O) 1.2177 1.2176 1.2174 1.2192 1.2172 1.2178
∠O[double bond, length as m-dash]C–O 126.14 126.14 126.15 126.26 126.13 126.13
∠O[double bond, length as m-dash]C–H 122.02 122.03 122.05 122.04 122.15 121.99
∠C–O–H 109.77 109.76 109.73 109.42 109.93 109.79
∠O–H⋯O 178.93 178.93 178.95 179.51 179.01 178.86


Finally, we investigated the performance of the HDNNPs in describing the geometry and barrier height of the double proton transfer transition state. The details are provided in the ESI and results are compared to new ab initio calculations. Although not the main focus of the present investigation, we find good agreement between FAD-HDNNP and the reference results with an RMSE of 0.002 Å for Cartesian coordinates and a deviation of only 12 cm−1 in the barrier height.

4.2.2 VPT2 frequencies. Since the harmonic frequencies are well described by the FAD-HDNNP, it is now appropriate to investigate the representation of mode couplings as a next step. A compact measure for the mode couplings near the equilibrium structure can be obtained from a comparison of the VPT2 frequencies computed using the PES with gas-phase experimental results49,51,54,64,103 as well as direct ab initio results which are still feasible for FAD, although the computations are very demanding taking several weeks. The corresponding anharmonic fundamental transition frequencies are provided in Table 4. The normal coordinate QFF parameters obtained from FAD-HDNNP and those calculated ab initio are compiled in the ESI, and can be used for further assessment of the representation of the mode couplings. This allows a comparison which is completely unaffected by details of the VPT2 resonance treatment.
Table 4 Comparison of the VPT2 fundamental frequencies (in cm−1) obtained from the reference ab initio calculations and the FAD-HDNNP PES with experimental data49,51,54,64,103
Mode Sym. Ab initio FAD-HDNNP Exp.
a CCSD(T)-F12a/hATZ using tight settings. b Fermi-resonance coupled with ν4 + ν8 at 1619 cm−1. c Fermi-resonance coupled with ν4 + ν8 at 1625 cm−1. d Experimental bands at 1664 and 1668 cm−1. e Fermi-resonance coupled with ν12 + ν23 at 938 cm−1. f Fermi-resonance coupled with ν12 + ν23 at 941 cm−1. g Experimental bands at 939 and 953 cm−1. h Fermi-resonance coupled with ν10 + ν15 at 1218 cm−1. i Fermi-resonance coupled with ν10 + ν15 at 1219 cm−1. j Experimental bands at 1220, 1225, and 1234 cm−1.
ν 1 A g 2909 2920
ν 2 A g 2942 2948
ν 3 A g 1672b 1677c 1664d
ν 4 A g 1431 1433 1430
ν 5 A g 1375 1375 1375
ν 6 A g 1225 1229 1224
ν 7 A g 679 682 681
ν 8 A g 194 197 194
ν 9 A g 157 164 161
ν 10 B g 1061 1058 1058
ν 11 B g 923 934 911
ν 12 B g 241 247 242
ν 13 A u 1072 1074 1069
ν 14 A u 959e 964f 939g
ν 15 A u 162 166 168
ν 16 A u 67 68 69
ν 17 B u 3044 3041
ν 18 B u 2935 2941
ν 19 B u 1741 1745 1741
ν 20 B u 1406 1416 1407
ν 21 B u 1372 1375 1372
ν 22 B u 1234h 1233i 1234j
ν 23 B u 704 706 708
ν 24 B u 262 264 264


A direct comparison between VPT2-based results and experiment for the high-frequency hydrogen-stretching vibrations is problematic. This is on the one hand due to the fact that these fundamentals are in an energy range of already high state density which leads to substantial anharmonic couplings/resonances beyond what can be reliably treated using VPT2.57 On the other hand experimental results for these fundamentals also carry a significant uncertainty which may exceed 10 cm−1. Therefore these frequencies are not well suited for benchmarking purposes and we will below focus on the fundamentals <1800 cm−1. Nevertheless, the hydrogen stretching fundamental frequencies obtained with FAD-HDNNP agree well with the reference ab initio VPT2 results. A maximum deviation of 11 cm−1 is found for ν1 which probably is due to the resonance effects mentioned above that render these frequencies highly sensitive to small details of the PES representation.

For the lower frequency modes FAD-HDNNP reproduces the reference ab initio results with an RMSE of 5 cm−1 where the maximum deviation (for resonance free fundamentals) of 23 cm−1 is observed for ν11. Considering the deviation of the corresponding harmonic frequencies is only 2 cm−1 (cf.Table 2) this indicates a problem in correctly describing the anharmonicity in the symmetric OH out-of-plane bend. Overall the agreement between FAD-HDNNP and the experimental results is good with an RMSE of 9 cm−1. The RMSE of the reference results is slightly better with 6 cm−1 but still the ν11 mode shows a large deviation of 12 cm−1. Nevertheless, these results are well within the typical range of errors that are to be expected for the underlying level of ab initio theory. To improve on this would require the inclusion of high-level corrections such as core-valence correlation and higher-order correlation beyond CCSD(T). While such composite schemes have been shown to provide high-quality potentials for small molecules,104–107 their computational cost is prohibitive for FAD.

For ν14 we observe a large deviation of 25 cm−1 when comparing the FAD-HDNNP and the experimental frequency of 964 and 939 cm−1, respectively. In agreement with experimental results54,108 we find this mode to be in Fermi-resonance with ν12 + ν23 at 941 cm−1 to be compared to the experimental value of 961 cm−1. A close look at these numbers may indicate a possible misassignment. However, upon inspection of the VPT2 results we find that these vibrational states are in an almost perfect resonance (56[thin space (1/6-em)]:[thin space (1/6-em)]44 mixing) with deperturbed energies of 952 and 954 cm−1 for (ν12 + ν23)* and image file: d2cp03893e-t8.tif, respectively. As such this resonance is very sensitive to the level of theory and small changes in the PES can easily reverse the state ordering. Note that this situation is also present in the reference ab initio results and therefore appears to be due to the convergence of the PES with respect to the employed electronic structure level.

Finally, our calculations indicate a Fermi-resonance between ν22 and ν10 + ν15 with anharmonic transition frequencies of 1234 and 1219 cm−1, respectively. Nejad et al. have discussed this resonance in detail,54 proposing a new assignment for the transitions in the triplet of bands centered at 1220, 1225, and 1234 cm−1, i.e., a resonance triad assigned to ν10 + ν15, ν9 + ν11 + ν15 and ν22. In contrast, we find no involvement of ν9 + ν11 + ν15 in the VPT2 results for ν22 based on FAD-HDNNP. This (preliminary) result is also obtained from the ab initio reference calculations based on the resonance detection criterium for the ν10 fundamental. However, the situation is complicated by the fact that the ν11 shows the largest error (for fundamentals not affected by resonances) with respect to experiment of more than 20 cm−1 in case of FAD-HDNNP which in consequence shifts the ν9 + ν11 + ν15 band by a similar amount and thus it can not interact with ν10 + ν15 through the rather small coupling matrix element of only a few cm−1. In contrast, for the reference ab initio calculations the VPT2 frequencies 1219, 1226, and 1233 cm−1 for ν10 + ν15, ν9 + ν11 + ν15 and ν22, respectively, agree nicely with the experimental results. Upon enforcing the coupling between ν10 + ν15 and ν9 + ν11 + ν15 the previous frequencies change only slightly to 1218, 1227, and 1234 cm−1 but with a complicated mixing of the harmonic basis functions in agreement with the results of Nejad et al.54 Clearly, a correct description of this intricate resonance triad both for the frequencies of the involved bands as well as the intensity pattern governed by the mixing ratios will be an excellent benchmark test for a theoretical spectroscopic description of FAD.

4.2.3 Reduced-dimensionality variational vibrational frequencies. Finally, to perform the 8D intermolecular-plus-torsion vibrational computations, we first determined a potential-optimized DVR for every vibrational degree of freedom following the procedure described in ref. 60. All remaining, i.e., constrained, degrees of freedom have been fixed at the values of the FAD-HDNNP equilibrium geometry (cf.Table 3).

The one-dimensional cuts of the three investigated HDNNPs and the QB16 potential along the intermolecular coordinates plotted in Fig. 4 show that the newly developed FAD-HDNNP behaves well along all the intramolecular degrees of freedom consisting of the distance between the monomers and relative orientation (cf.Fig. 1). In contrast to the QB16 potential, which shows a low-energy oscillation for ϕ ≈ 150°, no artificial cutoff of the primitive grid intervals has been found to be necessary for running the calculations.


image file: d2cp03893e-f4.tif
Fig. 4 1-dimensional cuts of the three investigated HDNNPs and the QB16 potential along the intermolecular coordinates (cf.Fig. 1) of the formic acid dimer used in the reduced-dimensionality variational computations.

The obtained vibrational frequencies converged for the 8D([scr J, script letter J]t) intermolecular-plus-torsional representation are collected in Table 5. Overall the agreement for both QB16 and FAD-HDNNP with experiment is reasonably good with deviations of only a few cm−1 for most frequencies. However, similarly to previous results discussed in ref. 60, we can still observe a non-negligible blue shift in the problematic fundamental frequencies ν8 and ν9. With the present, extensively tested PESs, we can identify two possible origins and corresponding solutions for this shift. First, either it is a consequence of the constrained coordinates resulting in a too steep potential energy well at the equilibrium cut, which may be overcome by relaxing the constraint coordinates. Or, second, the number of the active vibrational degrees of freedom should be increased in the GENIUSH program. However, in the latter case the computational cost would increase rapidly in DVR, and for this reason, it will be necessary to use the more efficient FBR-Smolyak representation.70,71,90

Table 5 Vibrational energies referenced to the zero-point vibrational energy (in cm−1) obtained with the 8D([scr J, script letter J]t) intermolecular-torsional model in the GENIUSH program using the QB16 PES and FAD-HDNNP. The inactive degrees of freedom are fixed at their respective equilibrium values
Assignment60 [small nu, Greek, tilde] QB16 60 [small nu, Greek, tilde] FAD-HDNNP [small nu, Greek, tilde] expt 53
ν 16 70 70 69.2
2ν16 141 140 139
ν 15 162 171 168.5
ν 9/ν8 191 190 161
ν 8/ν9 208 210 194
3ν16 211 210
ν 15 + ν16 232 240
ν 12 239 243 242
ν 24 253 253 264
ν 9 + ν16 262 260
ν 8 + ν16 277 279
4ν16 281 280
ν 15+ 2ν16 303 309
ν 12 + ν16 310 311 311
ν 24 + ν16 323 322
2ν15 324 330 336
ν 9+ 2ν16 332 340
ν 8+ 2ν16 347 348


In comparison with VPT2 (cf. Section 4.2.2), it is interesting to note that VPT2 gets all fundamental frequencies (including ν8 and ν9) mostly correct (cf.Table 4). In Fig. 5 of ref. 53, Nejad and Suhm contrasted the good performance of VPT2 (±5 cm−1 with respect to experiment) against the very large deviations (15–40 cm−1) of the fundamentally more complete VCI computations.109 Their observations motivated further theoretical and computational work, including an inquiry about the efficiency of normal coordinates for the intermolecular (low-frequency) vibrations for this system.60

5 Summary & conclusions

In this work we have examined the accuracy of machine-learned potential energy surfaces using the prototypical case of high-dimensional neural network potentials. Like many other modern machine learning potentials, HDNNPs have been primarily developed to transfer the accuracy of electronic structure methods to very large systems containing thousands of atoms, with the aim to perform large-scale molecular dynamics simulations. Consequently, the validation of the resulting multidimensional PESs is a difficult task. In this benchmark study, we have selected a system of moderate size, the formic acid dimer, for which vibrational frequencies at the CCSD(T) level of theory are accessible as a probe for assessing the quality of the PES along with a wealth of experimental data.

For the development of the full-dimensional HDNNP we have pursued a three-step procedure based on increasing reference data sets in the training process. First, a generally faithful representation of the PES without artifacts like artificial “holes” of overall good quality is generated. Second, this surface is then iteratively refined by adding further points to ensure that the harmonic frequencies of the PES reproduce within ±10 cm−1 the harmonic frequencies of the CCSD(T) level of theory, which serves as our reference. Third, the representation of (lower-order) couplings is investigated by computing the VPT2 frequencies and the corresponding normal-coordinate quartic force constants in comparison with experimental data and reference ab initio results.

This carefully validated full-dimensional FAD-HDNNP surface has been used in (pilot) 8-dimensional, curvilinear, variational computations focusing on the low-energy intermolecular range. Further progress in the variational vibrational methodology is required for reaching a higher-energy spectral range including a higher number of active vibrational degrees of freedom.

The finally obtained FAD-HDNNP potential energy surface shows a very high quality, which, in combination with the wealth of available, high-quality experimental data,49,53,63–67,110 we expect to be very useful for future developments in quantum dynamics and spectroscopic applications, which rely on robust and accurate PESs.

Data availability

Data for this paper, including the newly calculated ab initio data set used in the fitting of fit FAD-HDNNP, the necessary RuNNer input files for FAD-HDNNP, the ab initio points employed in the reference QFF generation via numerical differentiation, and the obtained QFF parameters (equilibrium geometry, normal coordinates and force constants) for both the ab initio reference as well as FAD-HDNNP are available at GRO.data (10.25625/ZDGKYA).111 Additionally, the ESI provides details on the RuNNer settings and the ACSFs used during the HDNNP development, the transition state structure determined from different PESs and ab initio calculations, and details on the QFF parameters.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

This project has been funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – 389479699/GRK2455. EM is grateful for a DFG Mercator Fellowship granted in the framework of this project. AMSD and EM are grateful for the financial support of the Swiss National Science Foundation (PROMYS Grant, No. IZ11Z0_166525). We thank Chen Qu and Joel Bowman for sharing with us their formic acid dimer PES and for providing their ab initio data set. Computing time has been available through the DFG project INST186/1294-1 FUGG (Project No. 405832858). Discussions with Ricardo Mata, Rainer Ostwald, Arman Nejad, Martin Suhm and Guntram Rauhut are gratefully acknowledged.

Notes and references

  1. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev and A. Walsh, Nature, 2018, 559, 547–555 CrossRef CAS PubMed.
  2. A. C. Mater and M. L. Coote, J. Chem. Inf. Model., 2019, 59, 2545–2559 CrossRef CAS PubMed.
  3. J. Gasteiger and J. Zupan, Angew. Chem., Int. Ed. Engl., 1993, 32, 503–527 CrossRef.
  4. B. Sanchez-Lengeling and A. Aspuru-Guzik, Science, 2018, 361, 360–365 CrossRef CAS PubMed.
  5. M. Meuwly, Chem. Rev., 2021, 121, 10218–10239 CrossRef CAS PubMed.
  6. T. B. Blank, S. D. Brown, A. W. Calhoun and D. J. Doren, J. Chem. Phys., 1995, 103, 4129–4137 CrossRef CAS.
  7. J. Behler, J. Chem. Phys., 2016, 145, 170901 CrossRef PubMed.
  8. V. L. Deringer, M. A. Caro and G. Csányi, Adv. Mater., 2019, 31, 1902765 CrossRef CAS PubMed.
  9. F. Noé, A. Tkatchenko, K.-R. Müller and C. Clementi, Ann. Rev. Phys. Chem., 2020, 71, 361–390 CrossRef PubMed.
  10. O. T. Unke, S. Chmiela, H. E. Sauceda, M. Gastegger, I. Poltavsky, K. T. Schütt, A. Tkatchenko and K.-R. Müller, Chem. Rev., 2021, 121, 10142–10186 CrossRef CAS PubMed.
  11. P. Friederich, F. Häse, J. Proppe and A. Aspuru-Guzik, Nat. Mater., 2021, 20, 750–761 CrossRef CAS PubMed.
  12. J. Behler and G. Csányi, Eur. Phys. J. B, 2021, 94, 142 CrossRef CAS.
  13. C. M. Handley and P. L. A. Popelier, J. Phys. Chem. A, 2010, 114, 3371–3383 CrossRef CAS PubMed.
  14. J. Behler, Phys. Chem. Chem. Phys., 2011, 13, 17930–17955 RSC.
  15. J. Behler and M. Parrinello, Phys. Rev. Lett., 2007, 98, 146401 CrossRef PubMed.
  16. A. P. Bartók, M. C. Payne, R. Kondor and G. Csányi, Phys. Rev. Lett., 2010, 104, 136403 CrossRef PubMed.
  17. A. P. Thompson, L. P. Swiler, C. R. Trott, S. M. Foiles and G. J. Tucker, J. Comput. Phys., 2015, 285, 316–330 CrossRef CAS.
  18. A. V. Shapeev, Multiscale Model. Simul., 2016, 14, 1153–1173 CrossRef.
  19. R. Drautz, Phys. Rev. B, 2019, 99, 014104 CrossRef CAS.
  20. N. Artrith, T. Morawietz and J. Behler, Phys. Rev. B: Condens. Matter Mater. Phys., 2011, 83, 153101 CrossRef.
  21. K. Yao, J. E. Herr, D. W. Toth, R. Mckintyre and J. Parkhill, Chem. Sci., 2018, 9, 2261–2269 RSC.
  22. O. T. Unke and M. Meuwly, J. Chem. Theory Comput., 2019, 15, 3678–3693 CrossRef CAS PubMed.
  23. S. A. Ghasemi, A. Hofstetter, S. Saha and S. Goedecker, Phys. Rev. B: Condens. Matter Mater. Phys., 2015, 92, 045131 CrossRef.
  24. T. W. Ko, J. A. Finkler, S. Goedecker and J. Behler, Nat. Commun., 2021, 12, 398 CrossRef CAS PubMed.
  25. X. Xie, K. A. Persson and D. W. Small, J. Chem. Theory Comput., 2020, 16, 4256–4270 CrossRef CAS PubMed.
  26. L. Jacobson, J. Stevenson, F. Ramezanghorbani, D. Ghoreishi, K. Leswing, E. Harder and R. Abel, J. Chem. Theory Comput., 2022, 18, 2354–2366 CrossRef CAS PubMed.
  27. T. W. Ko, J. A. Finkler, S. Goedecker and J. Behler, Acc. Chem. Res., 2021, 54, 808–817 CrossRef CAS PubMed.
  28. M. Eckhoff and J. Behler, npj Comput. Mater., 2021, 7, 170 CrossRef.
  29. I. Novikov, B. Grabowski, F. Körmann and A. Shapeev, npj Comput. Mater., 2022, 8, 13 CrossRef.
  30. B. Jiang, J. Li and H. Guo, Int. Rev. Phys. Chem., 2016, 35, 479–506 Search PubMed.
  31. J. Li, K. Song and J. Behler, Phys. Chem. Chem. Phys., 2019, 21, 9672–9682 RSC.
  32. D. Lu, J. Behler and J. Li, J. Phys. Chem. A, 2020, 124, 5737–5745 CrossRef CAS PubMed.
  33. N. Artrith and J. Behler, Phys. Rev. B: Condens. Matter Mater. Phys., 2012, 85, 045439 CrossRef.
  34. E. V. Podryabinkin and A. V. Shapeev, Comput. Mater. Sci., 2017, 140, 171–180 CrossRef CAS.
  35. J. S. Smith, B. Nebgen, N. Lubbers, O. Isayev and A. E. Roitberg, J. Chem. Phys., 2018, 148, 241733 CrossRef PubMed.
  36. F. V. Prudente, P. H. Acioli and J. J. Soares Neto, J. Chem. Phys., 1998, 109, 8801–8808 CrossRef CAS.
  37. A. C. P. Bittencourt, F. V. Prudente and J. D. M. Vianna, Chem. Phys., 2004, 297, 153–161 CrossRef CAS.
  38. S. Manzhos and T. Carrington Jr., J. Chem. Phys., 2007, 127, 014103 CrossRef PubMed.
  39. M. Malshe, R. Narulkar, L. M. Raff, M. Hagan, S. Bukkapatnam, P. M. Agrawal and R. Komanduri, J. Chem. Phys., 2009, 130, 184102 CrossRef CAS PubMed.
  40. T. Carrington Jr., J. Chem. Phys., 2017, 146, 120902 CrossRef PubMed.
  41. M. Gastegger, J. Behler and P. Marquetand, Chem. Sci., 2017, 8, 6924–6935 RSC.
  42. T. Morawietz, V. Sharma and J. Behler, J. Chem. Phys., 2012, 136, 064103 CrossRef PubMed.
  43. V. Quaranta, M. Hellström, J. Behler, J. Kullgren, P. Mitev and K. Hermansson, J. Chem. Phys., 2018, 148, 241720 CrossRef PubMed.
  44. T. Morawietz, O. Marsalek, S. R. Pattenaude, L. M. Streacker, D. Ben-Amotz and T. E. Markland, J. Phys. Chem. Lett., 2018, 9, 851–857 CrossRef CAS PubMed.
  45. S. Shepherd, J. Lan, D. M. Wilkins and V. Kapil, J. Phys. Chem. Lett., 2021, 12, 9108–9114 CrossRef CAS PubMed.
  46. G. M. Sommers, M. F. Calegari Andrade, L. Zhang, H. Wang and R. Car, Phys. Chem. Chem. Phys., 2020, 22, 10592–10602 RSC.
  47. R. J. Bartlett and M. Musial, Rev. Mod. Phys., 2007, 79, 291–352 CrossRef CAS.
  48. M. Herman, R. Georges, M. Hepp and D. Hurtmans, Int. Rev. Phys. Chem., 2000, 19, 277–325 Search PubMed.
  49. R. Georges, M. Freytes, D. Hurtmans, I. Kleiner, J. Vander Auwera and M. Herman, Chem. Phys., 2004, 305, 187–196 CrossRef CAS.
  50. P. Zielke and M. A. Suhm, Phys. Chem. Chem. Phys., 2007, 9, 4528 RSC.
  51. Z. Xue and M. A. Suhm, J. Chem. Phys., 2009, 131, 054301 CrossRef CAS PubMed.
  52. F. Kollipost, R. W. Larsen, A. V. Domanskaya, M. Nörenberg and M. A. Suhm, J. Chem. Phys., 2012, 136, 151101 CrossRef CAS PubMed.
  53. A. Nejad and M. A. Suhm, J. Indian Inst. Sci., 2020, 100, 1–15 CrossRef.
  54. A. Nejad, K. A. E. Meyer, F. Kollipost, Z. Xue and M. A. Suhm, J. Chem. Phys., 2021, 155, 224301 CrossRef CAS PubMed.
  55. C. Qu and J. M. Bowman, Phys. Chem. Chem. Phys., 2016, 18, 24835–24840 RSC.
  56. B. J. Braams and J. M. Bowman, Int. Rev. Phys. Chem., 2009, 28, 577–606 Search PubMed.
  57. C. Qu and J. M. Bowman, J. Phys. Chem. Lett., 2018, 9, 2604–2610 Search PubMed.
  58. C. Qu and J. M. Bowman, J. Chem. Phys., 2018, 148, 241713 CrossRef PubMed.
  59. C. Qu and J. M. Bowman, Faraday Discuss., 2018, 212, 33–49 RSC.
  60. A. Martín Santa Daría, G. Avila and E. Mátyus, Phys. Chem. Chem. Phys., 2021, 23, 6526–6535 RSC.
  61. E. Mátyus, G. Czakó and A. G. Császár, J. Chem. Phys., 2009, 130, 134112 CrossRef PubMed.
  62. S. Käser and M. Meuwly, Phys. Chem. Chem. Phys., 2022, 24, 5269–5281 RSC.
  63. R. C. Millikan and K. S. Pitzer, J. Am. Chem. Soc., 1958, 80, 3515–3521 CrossRef CAS.
  64. J. E. Bertie and K. H. Michaelian, J. Chem. Phys., 1982, 76, 886 CrossRef CAS.
  65. V. V. Matylitsky, C. Riehn, M. F. Gelin and B. Brutschy, J. Chem. Phys., 2003, 119, 10553–10562 CrossRef CAS.
  66. F. Ito and T. Nakanaga, Chem. Phys., 2002, 277, 163–169 CrossRef CAS.
  67. J. E. Bertie, K. H. Michaelian, H. H. Eysel and D. Hager, J. Chem. Phys., 1986, 85, 4779–4789 CrossRef CAS.
  68. O. Birer and M. Havenith, Annu. Rev. Phys. Chem., 2009, 60, 263 CrossRef CAS PubMed.
  69. M. Ortlieb and M. Havenith, J. Phys. Chem. A, 2007, 111, 7355 CrossRef CAS PubMed.
  70. G. Avila and E. Mátyus, J. Chem. Phys., 2019, 150, 174107 CrossRef PubMed.
  71. G. Avila and E. Mátyus, J. Chem. Phys., 2019, 151, 154301 CrossRef PubMed.
  72. G. Avila, D. Papp, G. Czakó and E. Mátyus, Phys. Chem. Chem. Phys., 2020, 22, 2792–2802 RSC.
  73. X.-G. Wang and T. Carrington, J. Chem. Phys., 2020, 152, 204311 CrossRef CAS PubMed.
  74. P. M. Felker, D. Lauvergnat, Y. Scribano, D. M. Benoit and Z. Bačić, J. Chem. Phys., 2019, 151, 124311 CrossRef PubMed.
  75. Y. Liu, J. Li, P. M. Felker and Z. Bačić, Phys. Chem. Chem. Phys., 2021, 23, 7101–7114 RSC.
  76. A. Chen, D. M. Benoit, Y. Scribano, A. Nauts and D. Lauvergnat, J. Chem. Theory Comput., 2022, 18, 4366–4372 CrossRef CAS PubMed.
  77. J. Behler, J. Chem. Phys., 2011, 134, 074106 CrossRef PubMed.
  78. J. Behler, Int. J. Quantum Chem., 2015, 115, 1032–1050 CrossRef CAS.
  79. J. Behler, Angew. Chem., Int. Ed., 2017, 56, 12828–12840 CrossRef CAS PubMed.
  80. J. Behler, Chem. Rev., 2021, 121, 10037–10072 CrossRef CAS PubMed.
  81. E. B. Wilson, J. C. Decius and P. C. Cross, Molecular Vibrations: The Theory of Infrared and Raman Vibrational Spectra, Dover Publications, 1980 Search PubMed.
  82. D. A. Clabo, W. D. Allen, R. B. Remington, Y. Yamaguchi and H. F. Schaefer, Chem. Phys., 1988, 123, 187–239 CrossRef CAS.
  83. G. Amat, H. H. Nielsen and G. Tarrago, Rotation-Vibration of Polyatomic Molecules, Dekker, New York, 1971 Search PubMed.
  84. D. Papoušek and M. R. Aliev, Molecular Vibrational-rotational Spectra: Theory and Applications of High Resolution Infrared, Microwave and Raman Spectroscopy of Polyatomic Molecules, Elsevier Science Ltd, 1982 Search PubMed.
  85. M. R. Aliev and J. K. G. Watson, in Molecular Spectroscopy: Modern Research, ed. K. N. Rao, Academic Press, 1985, pp. 1–67 Search PubMed.
  86. M. Piccardo, J. Bloino and V. Barone, Int. J. Quantum Chem., 2015, 115, 948–982 CrossRef CAS PubMed.
  87. H. H. Nielsen, in Handbuch der Physik, ed. S. Flügge, Springer, Berlin, 1959, vol. 37, part I, pp. 173–313 Search PubMed.
  88. S. Califano, Vibrational States, John Wiley & Sons, Inc., London, 1976 Search PubMed.
  89. J. C. Light and T. Carrington Jr., Discrete-Variable Representations and their Utilization, John Wiley & Sons, Ltd, 2000, pp. 263–310 Search PubMed.
  90. A. Martín Santa Daría, G. Avila and E. Mátyus, J. Mol. Spectrosc., 2022, 385, 111617 CrossRef.
  91. H.-J. Werner, P. J. Knowles, G. Knizia, F. R. Manby, M. Schütz, P. Celani, W. Györffy, D. Kats, T. Korona, R. Lindh, A. Mitrushenkov, G. Rauhut, K. R. Shamasundar, T. B. Adler, R. D. Amos, S. J. Bennie, A. Bernhardsson, A. Berning, D. L. Cooper, M. J. O. Deegan, A. J. Dobbyn, F. Eckert, E. Goll, C. Hampel, A. Hesselmann, G. Hetzer, T. Hrenar, G. Jansen, C. Köppl, S. J. R. Lee, Y. Liu, A. W. Lloyd, Q. Ma, R. A. Mata, A. J. May, S. J. McNicholas, W. Meyer, T. F. Miller III, M. E. Mura, A. Nicklass, D. P. O’Neill, P. Palmieri, D. Peng, K. Pflüger, R. Pitzer, M. Reiher, T. Shiozaki, H. Stoll, A. J. Stone, R. Tarroni, T. Thorsteinsson, M. Wang and M. Welborn, MOLPRO, version, a package of ab initio programs, see https://www.molpro.net Search PubMed.
  92. T. B. Adler, G. Knizia and H.-J. Werner, J. Chem. Phys., 2007, 127, 221106 CrossRef PubMed.
  93. G. Knizia, T. B. Adler and H.-J. Werner, J. Chem. Phys., 2009, 130, 054104 CrossRef PubMed.
  94. H.-J. Werner, G. Knizia, T. B. Adler and O. Marchetti, Z. Phys. Chem., 2010, 224, 493–511 CrossRef CAS.
  95. R. A. Kendall, T. H. Dunning and R. J. Harrison, J. Chem. Phys., 1992, 96, 6796–6806 CrossRef CAS.
  96. T. Dunning, J. Chem. Phys., 1989, 90, 1007–1023 CrossRef CAS.
  97. F. Weigend, Phys. Chem. Chem. Phys., 2002, 4, 4285–4291 RSC.
  98. B. Ziegler and G. Rauhut, J. Chem. Phys., 2016, 144, 114114 CrossRef PubMed.
  99. R. Ramakrishnan and G. Rauhut, J. Chem. Phys., 2015, 142, 154118 CrossRef PubMed.
  100. T. B. Blank and S. D. Brown, J. Chemometrics, 1994, 8, 391–407 CrossRef CAS.
  101. A. Singraber, J. Behler and C. Dellago, J. Chem. Theory Comput., 2019, 15, 1827–1840 CrossRef CAS PubMed.
  102. S. Plimpton, J. Comput. Phys., 1995, 117, 1 CrossRef CAS.
  103. F. Ito and T. Nakanaga, Chem. Phys., 2002, 277, 163 CrossRef CAS.
  104. O. L. Polyansky, A. G. Császár, S. V. Shirin, N. F. Zobov, P. Barletta, J. Tennyson, D. W. Schwenke and P. J. Knowles, Science, 2003, 299, 539–542 CrossRef CAS PubMed.
  105. D. Feller, K. A. Peterson and D. A. Dixon, Mol. Phys., 2012, 110, 2381–2399 CrossRef CAS.
  106. M. B. Gardner, B. R. Westbrook, R. C. Fortenberry and T. J. Lee, Spectrochim. Acta, Part A, 2021, 248, 119184 CrossRef CAS PubMed.
  107. B. Schröder and P. Sebald, J. Mol. Spectrosc., 2022, 386, 111628 CrossRef.
  108. K. A. E. Meyer and M. A. Suhm, J. Chem. Phys., 2018, 149, 104307 CrossRef PubMed.
  109. C. Qu and J. M. Bowman, Phys. Chem. Chem. Phys., 2019, 21, 3397–3413 RSC.
  110. M. Gantenberg, M. Halupka and W. Sander, Chem. – Eur. J., 2000, 6, 1865–1869 CrossRef CAS PubMed.
  111. D. S. Rasheeda, A. M. Santa Daría, B. Schröder, E. Mátyus and J. Behler, Replication Data for: High-dimensional neural network potentials for accurate vibrational frequencies: The formic acid dimer benchmark, GRO.data, V1, 2022 DOI:10.25625/ZDGKYA.

Footnotes

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2cp03893e
Present address: Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany, and Atomistic Simulations, Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr.

This journal is © the Owner Societies 2022