Taija L.
Fischer
a,
Margarethe
Bödecker
a,
Sophie M.
Schweer
a,
Jennifer
Dupont
b,
Valéria
Lepère
b,
Anne
Zehnacker-Rentien
b,
Martin A.
Suhm
a,
Benjamin
Schröder
a,
Tobias
Henkes
c,
Diego M.
Andrada
d,
Roman M.
Balabin
e,
Haobam Kisan
Singh
f,
Himangshu Pratim
Bhattacharyya
f,
Manabendra
Sarma
f,
Silvan
Käser
g,
Kai
Töpfer
g,
Luis I.
Vazquez-Salazar
g,
Eric D.
Boittier
g,
Markus
Meuwly
g,
Giacomo
Mandelli
h,
Cecilia
Lanzi
h,
Riccardo
Conte
h,
Michele
Ceotto
h,
Fabian
Dietrich
i,
Vicente
Cisternas
i,
Ramachandran
Gnanasekaran
j,
Michael
Hippler
k,
Mahmoud
Jarraya
l,
Majdi
Hochlaf
l,
Narasimhan
Viswanathan
m,
Thomas
Nevolianis
m,
Gabriel
Rath
m,
Wassja A.
Kopp
m,
Kai
Leonhard
m and
Ricardo A.
Mata
*a
aInstitut für Physikalische Chemie, Universität Göttingen, Tammannstraße 6, Göttingen, Germany. E-mail: rmata@gwdg.de
bInstitut des Sciences Moléculaires dOrsay, Université Paris-Saclay, CNRS, 91405 Orsay, France
cDepartment of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
dInstitute for Inorganic Chemistry, Saarland University, 66123 Saarbrücken, Germany
eBond Street Holdings, Long Point Road, KN-1002 Henville Building 9, Charlestown, KN10 Nevis, St. Kitts and Nevis
fDepartment of Chemistry, Indian Institute of Technology Guwahati, Assam-781039, India
gDepartment of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
hDipartimento di Chimica, Università degli Studi di Milano, via C. Golgi 19, 20133 Milano, Italy
iDepartment of Physics Science, Universidad de La Frontera, Francisco Salazar 01145, Temuco, Chile
jVellore Institute of Technology, School of Advanced Sciences (SAS), ChemistryDivision, Chennai 600 027, India
kDepartment of Chemistry, University of Sheffield, Sheffield S3 7HF, UK
lU. Gustave Eiffel, COSYS/IMSE, 5 BD Descartes 77454, Champs-sur-Marne, France
mInstitute of Technical Thermodynamics, RWTH Aachen University, Schinkelstraße 8, D-52072 Aachen, Germany
First published on 3rd August 2023
Vibrational spectroscopy in supersonic jet expansions is a powerful tool to assess molecular aggregates in close to ideal conditions for the benchmarking of quantum chemical approaches. The low temperatures achieved as well as the absence of environment effects allow for a direct comparison between computed and experimental spectra. This provides potential benchmarking data which can be revisited to hone different computational techniques, and it allows for the critical analysis of procedures under the setting of a blind challenge. In the latter case, the final result is unknown to modellers, providing an unbiased testing opportunity for quantum chemical models. In this work, we present the spectroscopic and computational results for the first HyDRA blind challenge. The latter deals with the prediction of water donor stretching vibrations in monohydrates of organic molecules. This edition features a test set of 10 systems. Experimental water donor OH vibrational wavenumbers for the vacuum-isolated monohydrates of formaldehyde, tetrahydrofuran, pyridine, tetrahydrothiophene, trifluoroethanol, methyl lactate, dimethylimidazolidinone, cyclooctanone, trifluoroacetophenone and 1-phenylcyclohexane-cis-1,2-diol are provided. The results of the challenge show promising predictive properties in both purely quantum mechanical approaches as well as regression and other machine learning strategies.
Blind challenges in the field of computational chemistry have become more and more popular over the past years. Particularly noteworthy examples thereof are the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges initiatives3,4 and the Cambridge crystallographic database structure challenges.5,6 This is followed in smaller scale by individual groups initiatives,7,8 or even in structured round-robin tests.9 Whatever the case may be, these challenges provide unique opportunities to pit different methods to the same observable in a collegiate (optimally unbiased) fashion. A previous challenge8,10 promoted by some of the groups involved was the furan-methanol challenge, whereby the binding preferences of methanol and a series of modified furan molecules was under scope. The question posed to the theoretical groups was to identify the most stable binding motif, an OH⋯O or OH⋯π bound conformer. Although the experimental data for the challenge were jet-cooled vibrational spectra, the target quantities were energy differences. The latter were estimated from the relative heights of the observed absorption peaks for the different binding modes. Albeit this being an indirect measurement, conservative error estimates were still below 1 kJ mol−1, well beyond the error bar of commonly accepted theory benchmarks.11
In this new joint effort, the observable to be tested is no longer inferred from the spectra. The goal is to correctly predict the experimental red shift of the OH bond stretch in 1:1 monohydrate complexes relative to the free symmetric stretch. This builds a much more rigorous connection point between theory and experiment in the important field of microhydration,12,13 as a quantity that both sides of the table feel comfortable estimating with minimal modelling on the experimental side. With its sensitivity to both electronic structure and nuclear motion aspects,14 it promises interesting error cancellation and thus Pauling point situations. As monohydrates are vibrationally15 far less characterised than rotationally,16,17 the topic also leaves enough playground for blind testing. As the layout of this blind test has been described extensively before,18 we refrain from repeating the experimental tools19–21 and caveats.22
As detailed in our first publication introducing the challenge,18 a set of 10 hydrate systems were selected as ‘training set’, in order for the participants to fine tune or validate beforehand their approaches. The latter consisted of: acetone (ACE), acetophenone (APH), 1,2,4,5-tetrafluorobenzene (TFB), 1-phenylethanol (POH), imidazole (IMZ), aniline (ANL), dibenzofuran (DBZ), di-tert-butyl nitroxide (DBN), o-cyanophenol (OCP) and cyclobutanone (CBU). Another set of 10 structures was chosen18 and presented as the target of the challenge – from now on referred to as ‘test set’. The latter consisted of: cyclooctanone (CON), 1,3-dimethyl-2-imidazolidinone (DMI), formaldehyde (FAH), methyl lactate (MLA), 1-phenylcyclohexane-cis-1,2-diol (PCD), pyridine (PYR), tetrahydrofuran (THF), tetrahydrothiophene (THT), 2,2,2-trifluoroacetophenone (TPH) and 2,2,2-trifluoroethan-1-ol (TFE), all previously uncharacterised in terms of the experimental water stretching vibrations in the vacuum-isolated complex. The participants were given 6 months between the test set reveal and a submission deadline. Following said deadline, any inconsistencies in the submissions were addressed. One issue we had to address was the use of different levels of theory within a single submission (i.e., higher levels of theory for the smaller systems, computationally cheaper approaches for the larger cases). Our solution was in general to ask the participants to split the data into different sets. This was done to increase the amount of information about the different levels of theory featured, while ensuring a strict unique procedure (level of theory) per computational data set. There were also some issues with the definition of the reference – the symmetric water OH stretch. This was the only value which we allowed to be revised. Otherwise, all of the data was kept as submitted, even including mistakes in the molecular structure. TFE and PCD values in submission PH1 were based on the wrong molecules, same as with DMI in PH4 and FA3. These issues are discussed in further detail later in the text.
As previously discussed,18 the training and test sets covered roughly the same range of wavenumber downshifts, from a very low (10 and 8 cm−1 for TFB and TFE respectively) to a rather high end (199 and 203 cm−1 for IMZ and PYR respectively). Fig. 1 provides the Lewis structure of the compounds selected for both sets, as well as the experimentally obtained absolute position of the water OH donor stretch bands.
Fig. 1 Experimental wavenumbers in cm−1 of the hydrogen-bonded OH stretching fundamentals of water in the monohydrates of the training18 and test23 systems without any anharmonic deperturbation attempt, together with their acronyms and CAS registry numbers, see also ESI.† |
PH purely harmonic: results at any given level of theory which were directly obtained by diagonalisation of the mass-weighted Hessian matrix. In some cases, these were provided as extra data sets by the groups upon request of the organisers.
AC anharmonic corrected: results which are derived from the use of a nuclear Hamiltonian with a non-harmonic potential in one or more degrees of freedom involving the target OH stretch. The calculations are not fully-dimensional as in the following category.
FA fully anharmonic: results which are derived from a full anharmonic vibrational calculation of the monohydrate. This includes both molecular dynamics and static calculations with an appropriate nuclear wave function method, not necessarily using the same electronic structure method as for the harmonic part.
LS learning strategies: results which are obtained on the basis of regression techniques. We include in this category both least-squares fitting and more advanced machine learning approaches.
The naming of the data sets followed the above mentioned category acronyms, with numbers attached and incremental within each category. The order for the numbers is somewhat arbitrary, depending on the date of registration and when the final data was submitted. An overview of the participants is provided in Table 1, with a short summary of the techniques employed available in Table 2. It should be noted that in some cases the description is rather short and does not really do justice to the overall process. The readers should consult the ESI,† files for a full description and computational details.
Participants | Submissions |
---|---|
Gnanasekaran | PH1 |
Mandelli, Lanzi, Conte, Ceotto | FA1 |
Dietrich, Cisternas | LS1 |
Henkes, Andrada | AC1, PH3 |
Töpfer, Boittier, Meuwly | AC2 |
Vazquez-Salazar, Boittier, Meuwly | LS2 |
Käser, Boittier, Meuwly | LS3 |
Singh, Bhattacharyya, Sarma | PH4, FA3 |
Hippler | PH5, AC4 |
Viswanathan, Nevolianis, Rath, Kopp, Leonhard | AC3 |
Balabin | LS4 |
Jarraya, Hochlaf | PH2, PH6, PH7, |
FA2, FA4, FA5 |
Submission | Computational protocol |
---|---|
PH1 | B3LYP-D3(BJ)/def2-TZVPP harmonic values |
PH2 | PBE0-D3/aug-cc-pVTZ harmonic values |
PH3 | ωB97xD/def2-TZVP harmonic values |
PH4 | B3LYP-D3(BJ)/def2-TZVP harmonic values |
PH5 | MP2/6-31++G(2d,p) counterpoise-corrected harmonic values |
PH6 | CCSD-F12/aug-cc-pVDZ harmonic values |
PH7 | MP2-F12/aug-cc-pVDZ harmonic values |
AC1 | ωB97xD/def2-TZVP with 1D numerical differences along the donor OH stretch normal mode |
AC2 | Reproducing Kernel Hilbert Space+DVR3D based on B3LYP-D3(BJ)/aug-cc-pVTZ structures |
AC3 | B2PLYP-D3(BJ)/aug-cc-pVTZ with selected anharmonic modes calculations |
AC4 | MP2/6-311++G(d,p) counterpoise-corrected 2-D QFF cc-VSCF |
FA1 | Quasi-classical B3LYP-D3/aug-cc-pVDZ trajectories |
FA2 | VPT2 PBE0-D3(BJ)/aug-cc-pVTZ anharmonic values |
FA3 | VPT2 B3LYP-D3(BJ)/def2-TZVP anharmonic values |
FA4 | CCSD-F12/aug-cc-pVDZ harmonic+PBE0-D3(BJ)/aug-cc-pVTZ anharmonic corrections |
FA5 | MP2-F12/aug-cc-VDZ harmonic+PBE0-D3/aug-cc-pVTZ anharmonic corrections |
LS1 | PAW PBE-D3 adjusted with a 2-parameter linear regression function extracted from the training set values |
LS2 | Kernel prediction of harmonic frequencies + delta learning correction with respect to the experiment |
LS3 | Neural network on B3LYP-D3(BJ)/aug-cc-pVTZ structures + transfer Learning to experiment |
LS4 | PBE0-D3(BJ)/may-cc-pVTZ|aug-cc-pVQZ adjusted with a 5-parameter linear regression function |
As clearly evidenced by Table 2 there is a wide variety of approaches to critically compare. There is no repetition of methods, although some entries do come close (e.g., PH1 and PH3 which only differ by the addition of extra polarisation functions in the atomic orbital set). In some cases (PH3, PH4, PH6 and PH7), we requested harmonic values to estimate the impact of the correction/regression approaches, although the original submission only featured anharmonic predictions.
In many cases, the assignments are straightforward and leave little room for interpretation, already based on infrared spectroscopy alone.19,21 In a few cases, complementary Raman spectroscopy20 was needed to safely identify the hydrogen-bonded OH stretch of the water unit in the monohydrate. Here, we only discuss some key results for the most complicated case of MLA, where there is evidence for mode mixing between the hydrogen-bonded OH groups of the two complexed molecules.
As illustrated in Fig. 2 and proven by rotational spectroscopy,24 the water molecule inserts into the intramolecular hydrogen bond between the α-hydroxy group and the carbonyl group of the ester, leading to a O–H⋯O–H⋯O chain of two coupled oscillators. The two top Raman spectra show the effect of adding water to a highly diluted MLA expansion. A strong band of the 1:1 complex emerges at 3474 cm−1. Minor satellite contributions near 3477 and 3491 cm−1 may be due to traces of larger complexes or due to dark overtone and combination states stealing some intensity from the OH stretching fundamental by perturbation. If the latter is true, the deperturbed (zeroth-order) state which should be compared to a harmonic or low-order anharmonic prediction is slightly higher than the peak wavenumber, perhaps around 3475–3478 cm−1. However, this is not so relevant, because the Raman spectrum is most sensitive for the in-phase stretching of the two OH oscillators. The out-of-phase stretching mode is better seen in the Raman spectra of the more concentrated expansions (central traces) at 3524 cm−1 and it is very prominent in the IR spectra (lowest spectrum in Fig. 2), because it involves a larger change of the dipole moment. As shown in the ESI,† it is the only strong transition attributable to the monohydrate in this spectral window and therefore corresponds to the requested wavenumber for the challenge. However,18 O-substitution in the water molecule demonstrates that both the out-of-phase and the in-phase OH stretching signals involve some water character, because they both downshift (see ESI†). As the downshift is more pronounced for the 3524 cm−1 band (8 vs. 5 cm−1), this is the band which should be compared to single theoretical predictions of the water OH stretch. Evidently, a comparison of both wavenumbers (listed in Fig. 1 for MLA) to their theoretical counterparts could be valuable as well in this particular case, but will be left for the future and the individual theory groups.
The OH group contained in TFE might have caused a similar mode mixing problem,25 but here the coupling to the water OH is sufficiently weak for an unambiguous assignment. This is also the case for PCD,26 where the monohydrate features a network of three coupled OH oscillators but only one of them shows a strong isotope shift upon18 O substitution (ESI†).
The experimental downshifts from the symmetric water monomer stretch in the test set span a range of 8–203 cm−1 and thus exclude the more challenging strong hydrogen bonds, where the likelihood of anharmonic resonances increases substantially. None of the 10 test systems involves a proven b2lib resonance,22 where the water bending overtone (b2) and hydrogen bond libration (lib) ternary combination band steals intensity from the OH stretch fundamental. Therefore, the raw experimental data can be compared directly to the theoretical predictions, keeping in mind that there may be some mode mixing issue for the multi-chromophore systems MLA, TFA and PCD. Only in the MLA case is this mixing supported by experiment.
The experimental uncertainties of the OH stretches vary slightly. They only exceed ±2 cm−1 for the MLA case due to satellite bands close to the MLA-centered mode and for the PCD case due to the convolution between the rotational contour and the resolution of the laser used in the IR-UV experiments. Therefore, the individual experimental uncertainties of the test systems do not enter the following analysis.
Given that the training data was available for all the groups, any participant who also computed the training set (even if partially) would have been in a position to provide uncertainty estimates based on the performance of the method. Learning strategies heavily dependent on the latter would likely underestimate the uncertainty, but could provide this information nonetheless. LS1 provided a rather small uncertainty of 8 cm−1, similar to that of LS4 (7 cm−1). Both made use of linear fitting expressions, with comparable fitting quality for the test set. In the case of FA1, the only other submission with uncertainty quantification, the value was 16 cm−1, taken to be slightly above the mean average error of the training set quasi-classical simulation results.
Finally, one should note that one could make use of other deviation measures instead of relying on the root-mean-square deviation (RMSD).27 It is for example common to report the mean absolute error (MAE). However, the latter will give the smallest error estimates, as it tends to lessen the weight of outliers. Given that we are critically assessing the predictive power of computational protocols, a measure like RMSD seems a more natural choice. For fairness, we will also consider results when removing the largest outliers of each set, trying to set an equal footing for all submissions, whether complete or incomplete.
A close inspection of the individual submissions showed that a few submissions made in fact use of the wrong organic molecule. We will mention this in the text. Of less concern was the computation of different minima. The most problematic system in this respect was PCD, with two alcohol groups. Otherwise, only small deviations were observed. It was commonly found in the TPH and FAH geometries that the water molecule was binding out of plane, breaking Cs symmetry. This seems to have a minimal impact on the predicted wavenumbers. Independent calculations were carried out at the CCSD(T)/aug-cc-pVTZ29–31 level of theory, confirming that the global minimum for both complexes should localise the water molecule in plane.
In order to give an overview of the submitted values, we provide in Fig. 3 the errors for each entry. For ease of visualisation, a kernel density estimator has been used, showing in bright yellow value ranges with a large number of submissions. The RMSD of the null hypothesis (±62.3 cm−1) is delimited by the dotted white lines. A few observations can be readily made. Most of the submitted values do outperform the null hypothesis. Although this is a low bar, it is not necessarily a given. If the quantity of interest had a low spread (small range of test values), the null hypothesis would be hard to compete with, as has been observed in other blind challenges (see for example ref. 32). Another observation to make is that there is no apparent systematic error (too small or too large shifts) with a fairly balanced distribution around 0. The density distribution readily highlights DMI, FAH and TFE as the systems with the best quality estimates, despite the fact that a few outliers (with errors beyond 62 cm−1) are observed. Rewardingly, these three systems span relatively strong, intermediate and very weak hydrogen bonding.
Fig. 3 Deviations of the submitted OH stretch shifts relative to the measured experimental values (raw values) in cm−1 for the 10 test systems. A positive error means that the predicted wavenumber downshift from the monomer is too large. Each value is represented as a white dot (only one value is outside of the visualisation window, AC1 submission for CON, with an error of 460.5 cm−1). The color coding of the graph is provided by the probability density function of the error for each system. Yellow indicates larger density values, dark blue low density. The bandwidth selection has been made with Scott's rule.33 The dotted line delimits the RMSD range of the null hypothesis (62.3 cm−1), obtained as described in the text. |
It is rather hard to analyse the results bundling all submissions together. In Fig. 4 an overview of the performance for all submissions is provided, taking the RMSD in the test set as a measure. This choice is not without bias, given that not all submissions included all 10 test systems. Nonetheless, it does allow for some first general considerations in regard to the different types of calculations. The best fully anharmonic submission (FA3) still exhibits a RMSD of 38.5 cm−1, a rather large value considering that the shifts only go up to 203 cm−1. It is in general the class with the lowest performance. Only 6 submissions are below the 30 cm−1 mark (PH5, AC2, AC4, LS1, LS3 and LS4) and 4 below 15 cm−1 (PH5, AC2, LS3 and LS4). Somewhat surprisingly, this shows that a similar accuracy can be reached with wildly varying approaches to the problem. Learning strategies do show some advantage, but the differences are not so significant that one would immediately exclude other approaches to the problem.
In the following text, we mention correct and incorrect structures based on the large agreement observed for the best submissions. These are considered as reference global minima and are then used to evaluate the other entries. The organisers of the challenge have not independently confirmed all of the minima for their stability.
The PH2 submission worst value (for MLA, error of 155 cm−1) does not appear to be due to the minimum chosen, but solely the approach used. The PBE0-D3/aug-cc-pVTZ46 level of theory reveals large errors for the whole set, at least as submitted by the participating group. The only geometry which was not correctly predicted was for TPH. The water binds to the side of the trifluoro group, when it should be docking to the carbonyl group from the opposite end. Despite the wrong geometry, and through error compensation, the deviation is only 23 cm−1, the second best prediction in PH2. Curiously enough, despite the similarity in method between PH1 and PH4, the smaller basis set chosen in the latter submission seems to have a sizeable impact (an almost constant 15 cm−1 increase in RMSD). All structures of PH4 were in line with the best results, with the exception of DMI, where 2,5-dimethylpyrrole was wrongly used. Attempting to build a summary for this class of submissions, and removing the impact of wrong structure submissions, there was a decisive advantage in the use of the MP2 method and a small basis set. This could be due to the well documented fact that MP2 will overcorrect Hartree–Fock in predicting OH stretching hydrogen bond shifts (too large downshifts).34 By reducing the basis set, the amount of electron correlation described decreases, compensating the harmonic and method errors. This is partly confirmed in the harmonic estimates underlying the AC4 submission, where the basis was expanded to 6-311++G(d,p) and the results were overall worst (if no anharmonic corrections were included).
The submission AC3 only included entries for FAH, THF, THT and TFE. The conformers of these four systems were found using CREST54 at GFN2-xTB55 level of theory. The resulting lowest-energy conformers were then optimized using the B2PLYP-D3(BJ)56,57 level of theory with an aug-cc-pVTZ basis set and harmonic vibrational frequency calculations were carried out. On paper, this was perhaps the highest-level method for the underlying PES. The same level of theory was then used for a VPT2 treatment of some selected modes that showed a significant contribution from the OH stretch. Preliminary calculations on the acetone-water test system showed significant improvement in the accuracy of the downshift (by 10 cm−1 with respect to experiments) when nine modes were selected instead of three. The previously mentioned modes were the ones that specifically involved the OH stretch. AC3 therefore chose an anharmonic treatment of nine selected vibrational modes throughout.
As one can extract from the discussion of the individual entries, the choice of dimensions used for the partial anharmonic corrections wildly varied. The water-benzene dimer (a system which was not featured in this challenge) was recently studied by Felker and Bačić.58 In their study they obtained a striking agreement between the symmetric OH shift calculated from an adapted PES considering only the water degrees of freedom and a larger 9 dimensional PES whereby coupling to large amplitude motions was included. However, the agreement is found to be precarious, as replacing one of the hydrogens by deuterium (HOD) shows. Deviations rise from under 1 cm−1 to an order of magnitude larger. A more systematic review of the systems featured in HyDRA could be warranted to observe how error compensation is at work from weakly to strongly interacting dimers.
Finally, we have the best set of values under this category, the FA1 submission. It falls close to FA3 for the best predictions, with the largest errors for MLA and THF, 101 and 76 cm−1 in absolute deviations. The total RMSD falls significantly once the two outliers are removed (RMSD of 27 cm−1). It is a unique set of calculations within the challenge. The authors carried out adiabatic ab initio molecular dynamics at the B3LYP-D3(BJ)/aug-cc-pVTZ level of theory. The fundamentals were taken from the power spectrum which is obtained by Fourier transform of the velocity autocorrelation function. Further details to the method are available in the corresponding FA1 ESI.† The single trajectories were started from optimised minima which fall in good agreement with the best predictions available, aside from a small difference in the conformation of the PCD hydrate. The source of the deviations should not be linked to the starting structures or the initial momenta. One should note that for this particular submission, the results for the training set were significantly better than for the test set, 18 vs. 47 cm−1. Most of this difference is due to the two aforementioned outliers, and which the authors anticipated as problematic. MLA and THF spectra were characterized by complex power spectra and more investigations should have been undertaken to best identify the target spectroscopic signal. Clearly this task is eased in the case of the training set.
The other two submissions (LS1 and LS4) approached the problem making use of computed harmonic frequencies corrected by an empirical function. In the case of LS1, the base harmonic predictions were obtained at the PBE-D3 level of theory using the projector augmented wave approach (PAW). For calculation details please consult the corresponding ESI.† Eight complexes from the training set were used to fit a linear regression formula (two were excluded due to issues in the optimisation), the latter being applied to the harmonic predictions of the test set. The training and test sets roughly covered the same range of OH shifts, with similar types of interactions, providing a reasonable basis for the correction function. Albeit LS1 was not among the top group of submissions, the RMSD was still overall below 30 cm−1. Lastly, LS4 used a somewhat more complex linear correction function. The geometries and corresponding harmonic wavenumbers were obtained with the PBE0-D3(BJ) functional. The basis sets used were may-cc-pVTZ for the organic molecules, aug-cc-pVQZ for the water molecule in the complex. An empirical linear function was then used to correct the harmonic predictions to target experimental values. The training data consisted of 42 hydrate complexes with published spectroscopic data. The linear function included two parameters which were fitted to the aforementioned data, weighing in the OH bond elongation. Further details are provided in the LS4 ESI.† The performance of this approach was very similar to that of LS3 (Fig. 9).
For all the other training set members, there is no new evidence which would cause a revision of the originally proposed fundamental wavenumbers, with or without deperturbation, although the DBN case is currently under reinvestigation. Hence, the following training set wavenumber alternatives remain, depending on whether raw results for the strongest signal or deperturbation for the high order anharmonic resonance22 is attempted. For ACE, the raw/deperturbed alternatives are 3538/3531 cm−1, for APH 3536/3530 cm−1, for DBN 3484/3487 cm−1. Note that Fig. 1 only lists the raw values. The methods shown in Fig. 10 would profit by at most 10% in their RMSD when switching to deperturbed reference values. Other deficiencies of the models clearly dominate. Still, it is rewarding to see that none of the discussed methods deteriorate with the deperturbation. On the other hand, none of the submissions has attempted to model the high-order resonance explicitly. Therefore, this subtle anharmonic aspect22 is left for future iterations of HyDRA, with better performing theoretical methods.
Keeping the discussion on the best performing methods, it is interesting to consider the individual errors plotted in Fig. 10. PH5, LS3 and LS4 predominantly underestimate the stretch frequencies shifts, whereas AC2 is more balanced in this regard. The major flaw in the AC2 prediction is PYR for which a large shift overestimation is found. The accumulated negative errors (underestimates) are −97.1, −18.6, −85.0 and −62.6 cm−1 compared with accumulated shift overestimates of 25.6, 68.6, 12.0 and 28.6 cm−1 for PH5, AC2, LS3 and LS4, respectively. For the case of AC2 it is interesting to note that PYR has the farthest red shift (i.e. strongest interaction between water and probe molecule) which is consistent with results on the training set (see AC2 ESI†) for which the bands shifted most to the red are typically captured least reliably. This points towards a genuine deficiency of B3LYP for the calculation of the PES, which is required in the DVR3D calculations.
As an example for the power of this collaborative approach, we suggested to the authors of the best submissions to compute the results for the OH donor vibration in the water dimer, which has been a rather controversial observable over the last few years.22,70–73 A well-converged full-dimensional variational treatment of the donor OH stretch band origin on a high quality analytical potential energy hypersurface is now available74 yielding a value of 3599 cm−1. This calculation matches the experimental consensus of about 3601 cm−1 well within the ≈10 cm−1 spread due to quantum tunneling of indistinguishable nuclei present in this floppy dimer. It is of interest how close different local strategies without such tunneling contributions come to experiment. The results are summarised in Table 3. One can see that none of the predictions from this work supports the multiply refuted22,71–73 recent IR-VUV result and interpretation.70 AC2 comes closest but its focus on monomer anharmonicity is likely to perform poorly for the lightest, strongly librating hydrates, i.e. water dimer. All others consistently predict the water donor OH stretching downshift some 20 cm−1 above experiment and above the full-dimensional variational treatment, perhaps again a consequence of the particularly light and thus delocalised nature of the binding partner. This is also qualitatively true for the second-lightest case of formaldehyde (FAH, Fig. 11). It is encouraging that the three particularly successful independent approaches based on a harmonic reference are so consistent for such an extreme testing case.
Fig. 11 Individual errors in predicted shifts for the four submissions with the lowest RMSD overall (PH5, AC2, LS3 and LS4). |
The other side of the coin are the purely harmonic (PH) calculations, which heavily rely on error compensation. PH5 is built on a very conscious decision of using the ‘Pauling point’. Albeit calculations with larger basis sets were available (and provide the basis for AC4), cutting the number of functions leads to an improved error compensation. The robustness is impressive given the amount of approximations involved. We find two questions of interest for the future. The first one is whether this robustness will still hold when increasing the number or diversity of systems under study. This could be verified in future editions of the challenge or simply calculating some of the other systems which were suggested for the training set. The other question is whether this type of error cancellation can be found in other (relatively cheap) electronic structure methods, in particular DFT functionals. The three models used in this category (B3LYP, PBE0 and ωB97xD) seem to not hold to the task. However, the PH data sets are mostly subproducts of other computational protocols, and the ‘Pauling point’ was not being explicitly sought for. There could be alternatives.
The fully anharmonic submissions were perhaps the biggest upset overall. Some submissions were able to break below an RMSD of 40 cm−1, but they were far from reaching the top 4. One should say that this is partly due to the way the challenge was conceived. The test set consisted of both large and small systems, significantly limiting the level of theory that could be applied across all (or at the least the majority of) systems. It would have been perhaps of interest to have a few higher-level calculations for the smallest hydrates. This can be a task for future studies, although one will miss the blind character of the challenge. In future editions, one could also attempt to include more groups with a focus on ab initio vibrational spectroscopy which could potentially restrict themselves to the smallest systems featured. Another deciding factor was to estimate the shift in the OH frequency, i.e. to estimate a difference of frequencies instead of the absolute value. This decreases the importance of anharmonic effects and strongly favours methods which most heavily rely on error compensation (e.g., the PH submissions). The fully anharmonic (FA) methods have a much better comparative performance when looking at absolute fundamental band positions.
Finally, one should note that the conditions were not the best for machine learning approaches. Only a very small training set was made available. All LS submissions (with the exception of LS1) compensated for this fact by performing additional calculations on molecular datasets. Moving forward, it would be well-advised to combine the experimental information (e.g., for the 20 systems featured in this challenge) and libraries of computed vibrational spectra, even if just under the harmonic approximation. The curated data should reduce the overhead in building such models, besides significantly improving their accuracy.
Footnote |
† Electronic supplementary information (ESI) available: Experimental data for test set members, detailed analysis of the theoretical submissions and computational details for individual submissions. See DOI: https://doi.org/10.1039/d3cp01216f |
This journal is © the Owner Societies 2023 |