Silvan
Käser
and
Markus
Meuwly
*
Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland. E-mail: m.meuwly@unibas.ch
First published on 26th October 2021
The vibrational dynamics of the formic acid monomer (FAM) and dimer (FAD) is investigated from machine-learned potential energy surfaces at the MP2 (PESMP2) and transfer-learned (PESTL) to the CCSD(T) levels of theory. The normal mode (MAEs of 17.6 and 25.1 cm−1) and second order vibrational perturbation theory (VPT2, MAEs of 6.7 and 17.1 cm−1) frequencies from PESTL for all modes below 2000 cm−1 for FAM and FAD agree favourably with experiment. For the OH stretch mode the experimental frequencies are overestimated by more than 150 cm−1 for both FAM and FAD from normal mode calculations. Conversely, VPT2 calculations on PESTL for FAM reproduce the experimental OH frequency to within 22 cm−1. For FAD the VPT2 calculations find the high-frequency OH stretch at 3011 cm−1, compared with an experimentally reported, broad (∼100 cm−1) absorption band with center frequency estimated at ∼3050 cm−1. In agreement with earlier reports, MD simulations at higher temperature shift the position of the OH-stretch in FAM to the red, consistent with improved sampling of the anharmonic regions of the PES. However, for FAD the OH-stretch shifts to the blue and for temperatures higher than 1000 K the dimer partly or fully dissociates using PESTL. Including zero-point energy corrections from diffusion Monte Carlo simulations for FAM and FAD and corrections due to basis set superposition and completeness errors yields a dissociation energy of D0 = −14.23 ± 0.08 kcal mol−1 compared with an experimentally determined value of −14.22 ± 0.12 kcal mol−1.
Generating full-dimensional, reactive PESs even for small molecules is a challenging task.12,13 This often requires datasets consisting of tens of thousands of ab initio calculations to adequately describe configurational space of the system of interest. While calculations at the density functional theory (DFT) or Møller–Plesset perturbation (MP2) levels of theory are cost-efficient, reaction barriers are less accurate.14 On the other hand, the “gold standard” coupled cluster with perturbative triples (CCSD(T)) approach scales as N7 (with N being the number of basis functions)15 which becomes quickly computationally prohibitive for full-dimensional PESs even for moderately sized molecules (Natoms ∼ 10). Recent applications of novel machine learning (ML) approaches combined with transfer learning (TL)16–18 and related Δ-ML19 in physical/computational chemistry14,18,20–23 were shown to be data and cost-effective alternatives which will also be explored in the present work.
Formic acid (HCOOH), the simplest carboxylic acid, is an important intermediate in chemical synthesis and relevant to atmospheric chemistry.24 It is also a promising fuel and H2 carrier25,26 which can be produced by electrocatalytic CO2 reduction reactions and, thus, may contribute to decreasing atmospheric CO2 levels.26,27 Formic acid monomer (FAM) and its dimer (formic acid dimer, FAD) have been the subject of several experimental28–35 and theoretical34,36–45 studies. In the vapor phase, formic acid exists as hydrogen bonded dimers46 making it a prototype for complexes with hydrogen bonds such as enzymes or DNA base pairs.47 The experimental IR spectrum has been reported for both, FAM and FAD.28,30,34,48–53
For FAM a recent study of both, cis- and trans-FAM, analyses a global PES constructed using regression techniques with energies from CCSD(T)(F12*)/cc-pVTZ-F12 calculations as the reference.39 Vibrational eigenstates for both conformers were calculated using vibrational configuration interaction (VCI) and the fundamentals for trans-FAM were found to be in close agreement with experiment (RMSD = 3 cm−1). Subsequently, a second full-dimensional CCSD(T)-F12a/aug-cc-pVTZ based PES for the two FAM conformers was used in multi-configuration time-dependent Hartree (MCTDH) vibrational calculations.45 The trans fundamental transitions were found to agree to within 5 cm−1 with experiment. Additionally, vibrational states of deuterated cis- and trans-FAM were calculated using the block-improved relaxation method.54 The two PESs39,45 are further analyzed in detail in ref. 55 and are used for comparison with experiments for FAM and deuterated isotopologues of both isomers. Vibrational eigenstates were obtained from high order canonical Van Vleck perturbation theory (CVPT). Such high level anharmonic treatments of vibrations on high quality PESs are crucial for understanding and potentially (re)assigning experimental spectra as was found for trans-FAM:39,45,55 assignment of the O–H in plane bend ν5 which is in resonance with the O–H torsional overtone 2ν9 has been ambiguous for a long time. Based on the high-level calculations the fundamental was assigned to 1306 cm−1 and the overtone to a spectroscopic feature at 1220 cm−1. This assignment was subsequently supported by Raman jet experiment.50 Interestingly, this reassignment was recently supported from second order vibrational perturbation theory (VPT2) calculations using a neural network-(NN) based PES.23
For FAD a considerable amount of computational work has been done on its dynamics,56,57 infrared (IR) spectroscopy34 and tunneling splittings.40,41 A recent PES based on 13475 CCSD(T)-F12a energies with the cc-pVTZ and aug-cc-pVTZ basis sets for H and C/O atoms (CCSD(T)-F12a/haTZ), respectively, was fit to permutationally invariant polynomials40 which reported a barrier for double proton transfer of 2853 cm−1 (8.16 kcal mol−1). This was used for computing the zero-point energy (ZPE) based on diffusion Monte Carlo (DMC) calculations, for vibrational self consistent field (VSCF)/VCI calculations of fundamentals and to determine ground-state tunneling splittings in a reduced-dimensional approach. The tunneling splitting obtained was 0.037 cm−1 which is larger than the experimentally reported value58,59 of 0.016 cm−1. Subsequently,41 the same PES was used for computing tunneling splittings without reducing the dimensionality of the problem using the ring-polymer instanton approach.60 The value of 0.014 cm−1 was considerably closer to experiment and a more recent measurement based on IR techniques reported a tunneling splitting of 0.011 cm−1.61
More recently the CCSD(T)-F12a PES was extended with a new dipole moment surface (DMS) at the MP2/haTZ level of theory and used for the analysis of the FAD IR spectrum.42 Both, VSCF and VCI calculations were performed and compared with results from classical and “semi-classically” prepared quasiclassical MD simulations and experiment.43 Classical MD corresponds to NVE simulations run at 300 K whereas the harmonic ZPE is added in the “semi-classically” prepared approach. For both approaches, the X–H stretch frequencies remain at higher frequencies compared to experiment. This effect is ascribed to the inability of MD simulations to correctly sample anharmonicities. The high-quality PES and DMS42 were recently employed to assess the fingerprint region of FAD using variational vibrational computations and curvilinear kinetic energy operator representation.62
Experimentally, the dissociation energy of FAD into two FAMs has been determined from spectroscopy and statistical thermodynamics to be D0 = 59.5(5) kJ mol−1 (∼14.22 kcal mol−1).33 For the double proton transfer barrier, the most recent value from microwave spectroscopy (measured tunneling splitting of 331.2 MHz) is 2559 cm−1 (30.6 kJ mol−1 or 7.3 kcal mol−1) from analysis of a 3D model.63 This is consistent with a best value of 7.2 kcal mol−1 from a morphed MP2 surface that was determined from atomistic simulations and compared with IR experiments of the proton transfer band.34 Earlier experiments reported a larger splitting of 0.0158 cm−1 (corresponding to 473.7 MHz) which yields a somewhat higher barrier for double proton transfer.58 Yet more recent work in the infrared found 0.011367 cm−1 (340.8 MHz) which is closer to the most recent microwave data.61
In order to assess the expected magnitude of error cancellation due to shortcomings in the electronic structure method used and the fact that MD-based spectroscopy only samples the bottom of the potential well at ambient conditions, extensive higher-level calculations at the MP2 and CCSD(T) levels of theory were carried out together with the aug-cc-pVTZ basis set. This reference data was then represented as a NN based on the PhysNet architecture.64 Additionally, TL16–18 to the CCSD(T)/aug-cc-pVTZ level of theory was used to further improve the quality of the PES. Using these PESs for formic acid monomer and dimer, the harmonic and anharmonic vibrations, and the IR spectrum from finite-temperature MD simulations are determined. From this the complexation-induced red shifts can be determined and compared with experiment. Furthermore, DMC simulations are run to obtain an estimate for the ZPE of both molecules. The DMC calculations are a meaningful probe for the robustness of the PES and the resulting ZPEs are used for the determination of the binding energy of the dimer.
First, the methods are presented and discussed. Next, the accuracy of the NN-based PESs is assessed and the vibrational spectra for FAM and FAD are determined and compared with experiment. Then, the results from DMC simulations are presented and, finally, the results are discussed and conclusions are drawn.
A dataset of reference structures is required for training PhysNet. An ensemble of 20000 FAM (5000) and FAD (15
000) geometries are generated from Langevin dynamics at 2000 K using the atomic simulation environment (ASE)75 at the PM7 level of theory.76,77 For FAD, the transition state (TS) region is sampled by harmonically biasing the geometry toward the TS structure, similar to the umbrella sampling approach.78 This dataset is extended with geometries of fragments of the FAM molecule (H2, CH4, H2O, CO, H3COH, H2CO) following the amons approach.79 For the amons, 1000 geometries each are generated using Langevin dynamics at 1000 K. Based on initial PhysNet representations of the PES, the dataset was extended with one round of adaptive sampling.80,81 The final MP2 dataset contains 26
000 reference structures and was split randomly according to 20
800/2600/2600 for training/validation/testing. Henceforth, this PES is referred to as PESMP2.
To further improve the quality of the PES, TL16–18 from the MP2 to the CCSD(T) level of theory was performed. The data set for TL contained 866 geometries: 425 for FAM and 441 for FAD. For FAM, they were generated from normal mode sampling82 at different temperatures (between 10 and 2000 K) and geometries for FAD were those along the minimum energy path (MEP), along particular normal modes and geometries obtained from normal mode sampling. Ab initio energies, forces and dipole moments for the 866 geometries were determined at the CCSD(T)/aug-cc-pVTZ73,83,84 level of theory using MOLPRO.74 Then, the PhysNet model was retrained on the TL-data set by initializing the NN with the parameters from the trained NN at the MP2 level as a good initial guess. The data set was split randomly according to 85/10/5% into training/validation/test set and the learning rate was reduced to 10−4 compared with 10−3 when learning a model from scratch. The final TL PES is referred to as PESTL.
Finite-temperature MD simulations were carried out using ASE75 and PESMP2 and PESTL. The trajectories are started from the equilibrium geometry of FAM and FAD, respectively, and initialized with random momenta drawn from a Maxwell–Boltzmann distribution corresponding to 300 K and the simulations are carried out in the NVE ensemble with a time step of Δt = 0.5 fs. After equilibration for 50 ps the simulations are propagated for 200 ps. The molecular dipole moment μ(t) is recorded for structures separated by 0.5 fs and the dipole–dipole autocorrelation is computed as C(t) = 〈μ(0)μ(t)〉. Infrared spectra C(ω) are then obtained by Fourier transformation of C(t) and applying a Blackman filter. All finite-temperature spectra are averages over 1000 independent NVE MD simulations.
Pdeath = 1 − e−(Ei−Er)Δτ (Ei > Er) | (1) |
Pbirth = e−(Ei−Er)Δτ − 1 (Ei < Er). | (2) |
![]() | (3) |
Following ref. 14, the DMC calculations were performed in Cartesian coordinates and in full dimensionality. For both, FAM and FAD, ten independent DMC simulations were performed. To obtain an estimate of the ZPEs 30000 walkers were advanced for 50
000 time steps following an equilibration period of 5000 time steps. A step size of Δτ = 5.0 a.u. was used. For converging the DMC simulations outlined above and to obtain statistically meaningful results more than 3 × 1010(= 30
000 × 55
000 × 10 × 2) energy evaluations were required in total. The computational efficiency of a PES is crucial for DMC simulations which requires a large number of energy evaluations. The Tensorflow library,91 in which PhysNet is implemented, allows training and evaluation of the NN on graphics processing units (GPUs). The use of GPUs for the evaluation of PhysNet speeds up the calculations as they can be performed in a highly parallel manner.92 Here, all DMC calculations were performed on a GeForce RTX 2080Ti GPU with 12 Gb of RAM. Each DMC simulation for FAM (FAD) takes 8.7 (23.7) h on average.
PES1MP2 | PES2MP2 | PESTL | |
---|---|---|---|
MAE(E) | 0.012 | 0.014 | 0.007 |
RMSE(E) | 0.019 | 0.028 | 0.016 |
MAE(F) | 0.026 | 0.029 | 0.076 |
RMSE(F) | 0.195 | 0.411 | 0.580 |
MAE(μ) | 0.001 | 0.001 | 0.002 |
RMSE(μ) | 0.002 | 0.002 | 0.008 |
1 − R2(E) | 3.7 × 10−9 | 8.5 × 10−9 | 4.1 × 10−9 |
The better of the two PESMP2 is characterized by MAE(E) = 0.012 kcal mol−1, RMSE(E) = 0.019 kcal mol−1, MAE(F) = 0.026 kcal mol−1 Å−1 and RMSE(F) = 0.195 kcal mol−1 Å−1. Accuracies similar to PESMP2 are achieved for PESTL, although the test set (44 geometries) is smaller than for learning at the MP2 level: the performance is MAE(E) = 0.007 kcal mol−1, RMSE(E) = 0.016 kcal mol−1, MAE(F) = 0.076 kcal mol−1 Å−1, and RMSE(F) = 0.580 kcal mol−1 Å−1, see Table 1 which also reports results for the dipole moments.
The quality of PESMP2 can also be assessed in terms of molecular geometries and the energy barrier for double proton transfer. The MP2/AVTZ optimized geometries of FAM, FAD and the proton transfer TS (see Tables S1–S3, ESI†) are reproduced by PESMP2 with RMSEs smaller than 0.003 Å. The energy barrier for PESMP2 is 6.69 kcal mol−1 compared with 6.71 kcal mol−1 from ab initio calculations, i.e. a difference of 0.02 kcal mol−1. The proton transfer barrier, which is underestimated at the MP2 level of theory, is one of the essential features of the FAD PESTL. Earlier studies at the coupled cluster level of theory yielded barrier heights of 7.95, 8.16 and 8.30 kcal mol−1 using the CCSD(T)-F12a/haDZ, CCSD(T)-F12a/haTZ//CCSD(T)-F12a/haDZ and CCSD(T)/aV5Z//MP2/aV5Z level of theory, respectively.37,40 The barrier for DPT at the CCSD(T)/aug-cc-pVTZ level was found to be 7.9 kcal mol−1.34 This agrees favourably with a barrier of 7.92 kcal mol−1 from the present PESTL. The experimental barrier height is estimated to be 7.3 kcal mol−1 using a 3D-model to reproduce the tunneling splitting of 331.2 MHz.63 A somewhat lower barrier for DPT (7.2 kcal mol−1) was also found from morphing an MP2-based full-dimensional PES to reproduce the experimentally observed, broad IR absorption from finite-temperature MD simulations.34
The dissociation energy De on PESTL for FAD into two FAMs is −16.79 kcal mol−1 which is identical to that from explicit ab initio calculations at the CCSD(T)/AVTZ level of theory.38 Because the training set for the full-dimensional PES contains structures of FAM and FAD, dissociation along the reaction path connects the minimum energy structures of the two asymptotic states (2 separate FAMs and FAD, respectively).
![]() | ||
Fig. 2 The accuracy of harmonic frequencies using PESMP2 and PESTL compared with the corresponding reference ab initio values. The FAD CCSD(T) values are taken from ref. 36. Almost no error is visible for FAM whereas errors up to ∼30 cm−1 are found for FAD. |
Similarly for FAD, the PhysNet harmonic frequencies are compared with those from direct ab initio calculations in Fig. 2 (circles) and Table S5 (ESI†). Frequencies from PESMP2 match the ab initio harmonic frequencies with a MAE of 2.1 cm−1 and a maximum unsigned deviation of 7 cm−1 for mode 24. For FAD the harmonic frequencies on PESTL show a somewhat larger MAE than frequencies from PESMP2. The reference CCSD(T)/aug-cc-pVTZ harmonic frequencies were those from ref. 36. The MAE between reference frequencies and those from PESTL is 10.7 cm−1 and deviations up to ∼30 cm−1 for single frequencies occur.
![]() | ||
Fig. 3 Infrared spectrum of FAM obtained from PESMP2 (blue) and from PESTL (red) using finite-T simulations (IR) and the VPT2 approach. The results are complemented with experimental frequencies for comparison.29,39 The gray dashed lines indicate the positions of the experimental frequencies and the XH peaks are scaled for better readability. The intensities of the VPT2 calculation capture the general trend of the experimental measurements whereas it is less clear for the finite-T spectra. The finite-T spectra (IR) are calculated from the dipole–dipole auto-correlation function of 1000 MDs each run for 200 ps with PESMP2 and PESTL at 300 K. |
![]() | ||
Fig. 4 Infrared spectrum of FAD obtained from the PESMP2 (panel A) and from the PESTL (panel B) using finite-T simulations (A and B) and the VPT2 approach (C). The results are complemented with experimental frequencies33,36 and intensities for 7 fundamentals104 (D). Six intensities of ref. 104 were estimated based on experimental data from ref. 105 and extended with one intensity from ref. 106. Experimentally, the intensity of the OH stretch is distributed over a much wider frequency range.30 The gray dashed lines indicate the positions of the observed fundamental frequencies. The finite-T spectra (IR) are calculated from the dipole–dipole auto-correlation function of 1000 MDs each run for 200 ps with the PESMP2 and PESTL at 300 K. |
FAM | FAD | |||||||
---|---|---|---|---|---|---|---|---|
PESMP2 | MP2 | PESTL | Expa | PESMP2 | MP2 | PESTL | Exp | |
1 | 619.9 | 619.5 | 621.6 | 626.17 | 78.0 | 70.2 | 91.6 | 69.2b |
2 | 643.0 | 641.8 | 632.2 | 640.73 | 178.5 | 160.4 | 164.4 | 161.0c |
3 | 1036.7 | 1036.4 | 1029.1 | 1033.47 | 185.0 | 171.0 | 200.9 | 168.5b |
4 | 1097.2 | 1098.2 | 1098.6 | 1104.85 | 206.2 | 197.3 | 229.4 | 194.0c |
5 | 1222.2 | 1220.6 | 1296.0 | 1306.2 | 255.8 | 246.0 | 265.8 | 242.0c |
6 | 1380.7 | 1381.0 | 1374.9 | 1379.05 | 281.1 | 271.9 | 327.6 | 264.0d |
7 | 1761.3 | 1760.5 | 1768.2 | 1776.83 | 686.4 | 678.5 | 679.6 | 682.0c |
8 | 2975.4 | 2967.8 | 2935.3 | 2942.06 | 713.4 | 707.6 | 707.4 | 698.0b |
9 | 3549.8 | 3554.9 | 3548.3 | 3570.5 | 945.2 | 936.0 | 925.6 | 911.0e |
10 | 971.6 | 966.1 | 958.1 | 942f | ||||
11 | 1065.2* | 1062.6 | 1075.4 | 1050g | ||||
12 | 1083.9* | 1076.6 | 1084.6 | 1060h | ||||
13 | 1237.1 | 1233.0 | 1238.2 | 1214h | ||||
14 | 1238.4 | 1240.6 | 1240.1 | 1233.9i | ||||
15 | 1370.4 | 1369.2 | 1371.2 | 1371.78i | ||||
16 | 1376.7 | 1373.2 | 1374.4 | 1375h | ||||
17 | 1403.0* | 1407.6 | 1406.6 | 1415g | ||||
18 | 1426.7* | 1430.7 | 1435.0 | 1454b | ||||
19 | 1662.6 | 1659.1 | 1661.6 | 1666k | ||||
20 | 1732.9 | 1732.9 | 1734.5 | 1741k | ||||
21 | 2763.0* | 2763.5 | 2875.0 | 2900d | ||||
22 | 2945.4* | 2961.7 | 2912.3 | 2939.7l | ||||
23 | 2961.8 | 2962.9 | 2920.1 | 2949h | ||||
24 | 2968.5 | 2963.4 | 3011.2 | 3050b | ||||
MAE | 2.0 | 8.4 | 6.4 | 19.3 |
One interesting observation concerns the correct assignment55 of the fundamental ν5 (mode 5 in Table 2) and 2ν9 (first overtone of mode 2 in Table 2) modes in FAM. VPT2 calculations using PESTL find ν5 at 1296.0 cm−1 and 2ν9 = 1211.9 cm−1 which is consistent with recent Raman data measured in a jet reporting a frequency of 1306 cm−1 and 1220 cm−1 for the fundamental and the overtone, respectively.50 VPT2 calculations at the MP2 level using PESMP2, on the other hand, find ν5 = 1220.6 and 2ν9 = 1297.9 cm−1, i.e. the opposite assignment compared with experiment and calculations at the higher level of theory.
For the errors of the VPT2 calculation of FAD there are several potential reasons. First, PESTL is less accurate in reproducing harmonic frequencies at the CCSD(T) level. Secondly, the low frequency harmonic modes 1 to 6 agree well with experiment (see Table S5, ESI†). However, VPT2 determines anharmonic corrections to be added to or subtracted from the harmonic frequencies. Hence, if the harmonic frequencies agree well with experiment, the VPT2 frequencies potentially differ from experimental values and the disagreement is larger for larger anharmonic corrections as is often the case for low frequency vibrations. Thirdly, VPT2 calculations also may incur larger errors, in particular for proton-bound dimers with large amplitude motions.98 Finally, the above errors can also accumulate which then leads to worse agreement between experimentally observed and computed anharmonic frequencies despite the high quality of PESTL. It should be noted that VPT2 calculations at the CCSD(T) level using conventional electronic structure codes for FAM is computationally demanding and for FAD it is unfeasible. To this end, more approximate, hybrid schemes using harmonic frequencies from a higher calculation (e.g. CCSD(T)) and anharmonic corrections calculated at a lower level of theory (e.g. MP2) were applied for FAD.38 Alternatively, reduced-dimensionality (RD) approaches such as RD-VPT2 exist.99
The experimental Fourier Transform transmittance spectrum of FAD recorded in a jet reported30 a broad band with superimposed sharp features extending from below 2600 cm−1 up to at least 3300 cm−1. The sharp features arise mostly from combination bands and make it difficult to assign one specific absorption feature to the OH stretch vibration. However, it is reasonable to assume that it is located to the blue side of the CH-stretch band rather than to the red side of it. Interestingly, the width of the jet-cooled spectrum is comparable to that recorded at room temperature. Hence, cooling only leads to sharpening of certain features but not to a simpler spectrum. This is consistent with other studies on formic and acetic acid.31,101 A potential assignment of the OH stretch vibration in FAD was made30 to a signature at 3084 cm−1 by comparing with earlier calculations (SCF with double zeta basis set).102 Similarly, earlier Raman spectra of (DCOOH)2 reported a broad absorption around and above 3000 cm−1 and assigned features between 2565 cm−1 and 3427 cm−1 to the dimer OH stretching band.103 Finally, a broad absorption was also found from lower resolution, room temperature IR spectroscopy34 for which spectral subtraction techniques were used to obtain the IR spectrum of FAD. The transition assigned to the OH-stretch was the region between 2600 and 3400 cm−1 with a maximum at ∼3100 cm−1.34 Hence, assignment of the OH-stretch frequency is not straightforward. Here, a position of 3050 ± 100 cm−1 was used to compare with. However, it needs to be stressed that considerable uncertainty about both, the position and the width of this band exist.
The failure of finite-T MD simulations to even qualitatively capture the spectroscopy of high-frequency modes is primarily related to the fact that at T ∼ 300 K, at which most MD studies are carried out, the anharmonicities of X–H vibrations are not sampled adequately. Because constraining the correct amount of ZPE in the modes of polyatomic molecules is as of now an unsolved problem,43 an alternative that has been considered is to run MD simulations at higher temperature.43,100 Here, simulations with increasing amount of energy in the OH stretch of FAM and FAD have been carried out using PESTL. The NVE simulations are run with excess energies of 0.1, 0.3, 0.6, and 0.9 eV for FAM and of 0.1 and 0.3 eV for FAD in each of the OH stretch modes. For comparison, the (harmonic) ZPE of FAM in this mode is 1647 cm−1 (equivalent to 0.2 eV). This energy is initially placed into the OH stretch(es) as kinetic energy. The E-dependent IR spectra (Fig. 5) correspond to an ensemble of 45 trajectories run for 200 ps each. For FAM it was found that with increasing energy content in the OH stretch the associated band shifts progressively towards the red as expected. With an excess of 0.5 eV the computed OH stretch aligns with that reported from experiment.
![]() | ||
Fig. 5 IR spectra from E-dependent MD simulations for FAM and FAD with PESTL. For comparison, for FAM the harmonic ZPE of the OH stretch is 0.23 eV and the total ZPEs are ∼0.9 and 1.9 eV. The grey vertical lines mark the peak positions of the experiments29,39 and the grey spectra are calculated from room temperature simulations on PESTL (see Fig. 3 and 4). For FAM a clear energy dependence is seen: the OH peak ∼3725 cm−1 with an energy of 0.1 eV shifts to the red and reaches ∼3435 cm−1 with an energy of 0.9 eV. This energy dependence was not achieved for FAD, for which the OH stretch peak was mainly broadened and blue shifted for 0.3 eV. In contrast, the C–H peak is red shifted as expected. For FAD, the IR signals above 2500 cm−1 are scaled by a factor of 5 for better visibility. |
Even though the energy redistribution to the lower frequency modes does occur on the time scale of the simulations the spectroscopic features below 2000 cm−1 do not respond strongly to the increased temperature. To assess the redistribution of vibrational energy a supplemental trajectory is run for FAM with an excess of 0.5 eV in the OH stretch mode. The energy remains in this mode for at least 150 ps after which it slowly redistributes into other degrees of freedom, see Fig. S1 (ESI†).
For FAD running the simulations at higher temperature apparently does not solve the problem of overestimating the location of the XH bands. As one additional complication, the binding energy, i.e. EFAD − 2EFAM, of the dimer is only −16.79 kcal mol−1 (−0.73 eV). Hence, excitation of the two OH stretch vibrations with more than ∼8.5 kcal mol−1 (∼0.35 eV) leads to dissociation of the dimer and the spectroscopy of the H-bonded OH⋯O motifs can not be probed. For the simulations with 0.1 eV of excess the OH⋯O hydrogen bond (distance between O10 and H5 or O2 and H6, see Fig. 1) extends up to 2.3 Å compared with an equilibrium separation of 1.68 Å. With an excess of 0.3 eV the two OH⋯O “bonds” elongate up to 3.3 Å. This indicates considerable destabilization of the dimer due to partial breaking of the hydrogen bonds and the two FAMs in the dimer start to behave more like monomers without fully dissociating FAD. Loosening the OH⋯O contact affects, however, the vibrational frequency of the OH stretch. Visual inspection also shows that an out of plane bending motion of the two monomers is excited due to vibrational energy redistribution which further weakens the OH⋯O hydrogen bond. For the simulations of FAD with 0.3 eV excess the OH stretch shifts to the blue by ∼70 cm−1 whereas the CH stretch shifts to the red as expected for simulations at higher temperature.
Based on these ZPEs the FAD dissociation energy D0 = DCe − 2ZPEFAM + ZPEFAD can be determined. Here, DCe is the corrected (see below) binding energy (“from the bottom of the potential”) and ZPEFAM and ZPEFAD are the ZPEs of FAM and FAD, respectively. The dissociation energy De needs to be corrected (a) to account for basis set superposition error (BSSE) and (b) to include effects due to finite-size basis set effects by extrapolation to the complete basis set (CBS) limit. These corrections were determined in previous work38 and will be used here. This is possible because De = −16.79 kcal mol−1 from the present work for dissociation of FAD into two FAMs each at their respective equilibrium structure, is identical to the value reported from the earlier CCSD(T)/aug-cc-pVTZ calculations.38 Including corrections for BSSE at the CBS limit yields DCe = −16.11 kcal mol−1.38 Together with the ZPEs of FAM and FAD determined above this yields a best estimate of D0 = −14.23 ± 0.08 kcal mol−1 where the error is that of the ZPEs from the DMC calculations. This value is consistent, within error bars, with the experimentally reported value of −14.22 ± 0.12 kcal mol−133 and compares with D0 = −14.3 ± 0.1 kcal mol−1 from computations with ZPEs obtained from a hybrid VPT2 approach (harmonic frequencies from CCSD(T)/AVQZ and correcting for anharmonic effects using MP2/AVDZ).38
The present work also allows to make contact with a recent assessment35 of various sources of errors in the spectroscopy and thermodynamics of FAD which reported that “More often than not, the overestimation of harmonic downshifts in DFT is qualitatively compensated by the inability of classical dynamics to sample the anharmonic region probed by the quantum nature of the hydrogen atom. This frequently provides right answers for the wrong reasons whenever high-frequency XH stretching spectra are simulated, because in this case, temperatures of several 1000 K would be needed to sample the relevant fundamental vibrational displacements.” While some of these statements are supported by the present results, others require amendments.
In agreement with this assessment the present work finds that energies up to 0.5 eV (∼5800 K) are required for the OH stretch vibration in FAM to agree with experiment. This is consistent with the temperature required to reproduce the stretching fundamental of a simple Morse oscillator with harmonic wavenumber ωe.108 On the other hand, the CCSD(T)-level quality of PESTL allows to probe whether or not accidental agreement with experiment is found for the complexation induced red shifts Δω and Δν. For the harmonic frequencies, the red shifts are Δω = −511 cm−1 and Δω = −433 cm−1 from normal mode calculations at the MP2 and CCSD(T) levels of theory, compared with −518 and −449 cm−1 from normal modes on the respective NN-learned PESMP2 and PESTL, see Tables S4 and S5 (ESI†). MD simulations with PESMP2 at 300 K find Δν = −477 cm−1 compared with −521 ± 100 cm−1 estimated from experiments,30 see Table 3. With PESTL the red shift is Δν = −412 cm−1 and the VPT2 calculations yield a shift of −537 cm−1. It should be noted that VPT2 calculations are known to have limitations for proton bound dimers.98 This is also consistent with the present finding that the ZPEs of FAD from DMC simulations and VPT2 calculations using PESTL differ by 0.14 kcal mol−1 (50 cm−1) whereas that for FAM is virtually identical.
The shifts from finite-T MD simulations still suffer from the limitation that the anharmonicity of the OH stretch is not sufficiently sampled, irrespective of the level of theory at which the PES was determined. However, it is likely that the magnitude and the direction of the error in the anharmonic OH stretch frequencies differ for FAM compared with FAD. This can be seen most clearly in Fig. 5 where the OH stretch in FAM/FAD shifts to the red/blue with increasing internal energy, respectively. The complexation induced red shift from simulations on PESMP2 is within 44 cm−1 of the experimentally observed value whereas that on (higher-level) PESTL is lower by 109 cm−1. Given the close agreement for the anharmonic OH-stretch for FAM from simulations on the two PESs it is expected that PESTL for FAD is of similar quality as for FAM and only minor improvement is expected by adding additional reference points in the transfer learning. This is supported by the finding that for the OH stretch Δω from MP2 calculations differs from that on PESMP2 by 7 cm−1 whereas for CCSD(T) and PESTL the difference is 16 cm−1. Likewise, using a larger basis set, such as aug-cc-pVQZ, is unlikely to change ω and ν of the OH stretch appreciably because the harmonic frequencies from CCSD(T)/aug-cc-pVTZ and CCSD(T)/aug-cc-pVQZ differ only by 3 cm−1, see Table 3. Thus, the apparent “agreement” between experiment and spectra computed from the present MD simulations on PESMP2 is due to differential undersampling the anharmonicity of the OH coordinate at 300 K for FAM and FAD and also due to the uncertainty in assigning the OH-stretch in FAD more definitively.
This changes as soon as DPT occurs, because then the full anharmonicity of the energy function along the XH coordinate must be sampled. From simulations with multiple forward and backward crossings in the MD simulations it is expected that the anharmonicity is comprehensively sampled which should provide more realistic IR spectra. Previous work on FAD accomplished this by using the Molecular Mechanics with Proton Transfer (MMPT)109 PES for which trajectories 250 ns in length were carried out which sampled 25 DPT events.34 Such long simulation times become accessible because these energy functions can be evaluated at the speed of a conventional empirical force field. Furthermore, the MMPT function used did not allow FAD to dissociate into two monomers.
In summary, the present work uses machine-learned, full dimensional and reactive PESs at the MP2 and CCSD(T) levels of theory for FAM and FAD to characterize their vibrational dynamics. It is established that for framework vibrational modes (below 2000 cm−1) computed frequencies from VPT2 and finite-T MD simulations agree well with experiments. However, for the high-frequency XH stretch modes – in particular the OH modes – MD simulations do not sufficiently sample the anharmonicities which leads to a considerable overestimation of the experimentally observed frequencies. Hence, it is primarily the use of MD simulations that is the source of errors for the disagreement between experiments and simulations. The estimated experimentally observed red shift of −521 ± 100 cm−1 for the OH stretch peak30 compares with −477 cm−1 from finite-T simulations on PESMP2 and −412 cm−1 when using PESTL in the simulations. The red shift from the VPT2 calculations is −537 cm−1. However, there is considerable uncertainty on the position and width of the OH-stretch fundamental which makes it less suited for direct comparison with computations.
For FAM, running simulations at elevated temperatures does not affect the framework modes but shifts the OH stretch modes towards the frequencies observed in experiments. On the other hand, for FAD destabilization of the dimer prevents using such an approach. Hence, computing accurate complexation-induced spectral shifts remain a challenge even on high-level PESs due to shortcomings in the dynamics simulations underlying the spectroscopy. This is also not expected to change if approximate quantum methods such as quasiclassical MD (QCMD) or ring-polymer MD (RPMD) simulations are used, as was shown explicitly.44,100 The computed dissociation energy of D0 = −14.23 ± 0.08 kcal mol−1 from a combination of electronic structure calculations and DMC simulations on the CCSD(T)-quality PESTL is in favourable agreement with the value of −14.22 ± 0.1 kcal mol−1 from experiments.
Full-dimensional PESs for medium-sized molecules as the one used here provide an ideal starting point for the accurate exploration of molecular motions. Notable efforts in this direction include a PIP-based PES for 15-atom tropolone110 or reproducing kernel Hilbert space (RKHS)-based representations for molecular systems up to 10 atoms.111 For the motion of H2 in hydrates, which is inherently a high-dimensional problem, an NN-based energy function has been successfully developed to follow quantum diffusion.112 Similarly, NN-based energy functions have been used for accurate spectroscopic studies of van der Waals complexes such as HCl–H2O and its isotopologues.113 Finally, NN-based energy functions have also been used for bond-forming/bond-breaking processes.67,114 These developments together with suitable embedding in surrounding environments provide considerable scope to more quantitatively characterize challenging problems in physical chemistry at molecular-level detail.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1cp04393e |
This journal is © the Owner Societies 2022 |