Robin
Dohmen
,
Denis
Fedosov
and
Daniel A.
Obenchain
*
Georg-August University, Tammannstraße 6, Göttingen, Germany. E-mail: daniel.obenchain@uni-goettingen.de
First published on 14th December 2022
Rotational spectroscopy relies on quantum chemical calculations to interpret observed spectra. Among the most challenging molecules to assign are those with additional angular momenta coupling to the rotation, contributing to the complexity of the spectrum. This benchmark study of computational methods commonly used by rotational spectroscopists targets the nuclear quadrupole coupling constants of chlorine containing molecules and the geometry of its complexes and clusters. For each method, the quality of both structural and electronic parameter predictions is compared with the experimental values. Ab initio methods are found to perform best overall in predicting both the geometry of the complexes and the coupling constants of chlorine with moderate computational cost. This cost can be reduced by combining these methods with density functional theory structure optimization, which still yields adequate predictions. This work constitutes a first step in expanding Bailey's quadrupole coupling data set to encompass molecular clusters. [W. C. Bailey, Calculation of Nuclear Quadrupole Coupling Constants in Gaseous State Molecule, 2019, https://nqcc.wcbailey.net/]
The analysis of the large amount of spectroscopic data collected, both in number of spectra and in spectral complexity, can currently be very time consuming. An accurate initial guess of the spectroscopic parameters is valuable because it drastically minimize the time spent on assignment. Accurate theoretical predictions are also valuable for automatic fitting programs,4,5 since it reduces calculation time. The aim of this paper benchmark a number of theoretical methods to test their performance in predicting the complex hyperfine splitting of weakly bound complexes.
In microwave spectroscopy, rotational constants are the fundamental spectroscopic parameters, which are directly related to the geometry of the molecule as they depend on the moments of inertia. This direct link to the mass distribution in the molecule establishes microwave spectroscopy as a powerful tool for determining the molecular structure and underscores the necessity for an accurate prediction of these parameters from theory for assignment. In addition to rotational constants, quantum chemical methods produce predictions for another characteristic important for assignment, the Nuclear Quadrupolar Coupling Constant (NQCC) χij. Provided the target molecule contains one or more quadrupolar nuclei (I > 1/2), such as 14N or 35Cl,6–9 a hyperfine structure is observed that arises from the coupling of the nuclear angular momentum with the molecular angular momentum.10 The coupling is dependent on the intrinsic nuclear quadrupole moment Q and the electronic environment expressed by the Electronic Field Gradient (EFG) qij(i,j = abc).
(1) |
The NQCC χij is experimentally observed in the principal axis system abc based on the construction of the rotational Hamiltonian. Through rotation in the nuclear axis system of the quadrupolar nucleus, the NQCC tensor is transformed to χij(i,j = xyz) by diagonalization. The steepest gradient of the EFG is along the chlorine carbon bond in the nuclear axis system, where this axis is labeled z by convention. To clarify which axis system is used, χabc and χxyz will be used to denote the NQCC in the principal axis system or the nuclear axis frame, respectively.
Bailey et al. performed extensive benchmarking of NQCC11 with a focus on chlorine-containing molecules which are present in many important compounds for chemical industry.12 This is in addition to Bailey's numerous examinations of similar nuclei containing high-spins that are found in rotational spectroscopy studies. Bailey found a linear correlation between NQCC values predicted at the B1LYP/TZV(3df,2p)13,14 level and experimental results of 22 molecules. He established an empirical correction Qeff of the intrinsic nuclear quadrupole moment that acts on the EFG of the molecule. The reference data set encompasses a wide range of chlorine compounds probed with experimental methods that are added sequentially to the data set after the initial publication. The data set is limited exclusively to chlorine-containing monomers, disregarding chlorine-containing species with intermolecular dispersive bonding interactions.
Chlorine containing weakly-bound complexes have been studied experimentally for multiple decades.9,15–33 The data on weakly bound complexes allows one to review the predictive power of quantum chemical calculations for dispersive interactions by utilizing chlorine as a probe for the EFG. The goal is to improve the quantum chemical predictions for non-covalent complexes to assign weak transitions in rotational spectroscopy. Thereby, it is also a step toward the investigation of rarely sampled atoms.
These subgroups provide trends within the data set. A large number of structures have chlorine as a component of an organic molecule such as 2-chlorothiophene or CF3Cl which interact with a number of binding partners, mostly small molecules, both organic and inorganic compounds. In particular, CF3Cl has been explored with a number of binding partners offering a wide selection of different interactions with the same system. Another group is argon as a rare gas binding partner to a number of small chlorine compounds such as Cl2 and HCl, uncovering the impact of the covalent environment of chlorine. To further diversify the binding partners, structures containing metals are also included in the data set.34
This is a particular complication if a single method is used both for structure optimization and for the prediction of the NQCC, as a singular method is not necessarily optimal for both computations. To address this issue, well established methods were used to optimize structures of complexes, and then a variety of methods were used to predict the NQCC of chlorine in these optimized complexes.
An unreliable geometry prediction of dimers is also particularly relevant when it comes to large amplitude motions. Low binding energies compared to covalently bound structures facilitate low energetic barriers for large-amplitude motions. This challenge is not unique to NQCC predictions, but is relevant for dispersive complexes in general. Nevertheless, it is an important factor to consider when approaching NQCC predictions and corrections.
(2) |
Structure | A/MHz | B/MHz | C/MHz | χ aa | χ bb | χ cc | χ ab | χ ac | χ bc | Citation |
---|---|---|---|---|---|---|---|---|---|---|
a Included in single point calculations. b B not in literature, value for (B + C)/2 given instead. c No parameterization for Ag, Au, Kr, Xe in 6-311++G(d,p) and 6-311++G(2d,2p) basis sets, therefore excluded from MP2 optimization. | ||||||||||
Ar–ClFa | 0 | 1327.113 | 0 | −140.869 | — | — | 0 | 0 | 0 | 15 |
Ar–HCla | — | 1678.511b | — | −23.027 | — | — | 0 | 0 | 0 | 16 |
Ar2–HCla | 1733.857 | 1667.932 | 844.491 | −28.123 | 12.471 | 15.658 | 0 | 0 | 0 | 17 |
Ar3–HCla | 0 | 843.8974 | 0 | −31.008 | — | — | 0 | 0 | 0 | 18 |
CF3Cl–Ara | 3373.118 | 988.2529 | 879.5788 | 36.57 | −75.485 | 38.915 | 15.4 | 0 | 0 | 19 |
CF3Cl–CH3Fa | 21826 | 728.2164 | 702.9987 | −73.3 | 34.05 | 39.25 | 29 | 0 | 0 | 20 |
CF3Cl–CO2a | 2985.3104 | 890.7548 | 780.1095 | 2.617 | −41.1605 | 38.5435 | 52.89 | 0 | 0 | 21 |
CF3Cl–DMEa | 11738 | 595.399 | 565.836 | −76.587 | 39.5235 | 37.0635 | −0.44 | 0 | 0 | 22 |
CF3Cl–OCH2a | 27295 | 787.4896 | 764.373 | −73.91 | 34.695 | 39.215 | 22.96 | 0 | 0 | 23 |
F2CCFCl–NH3a | 2236.0689 | 1317.19498 | 1219.63324 | −21.817 | −14.4725 | 36.2895 | 55.9 | 10.3 | 12 | 24 |
Fluorobenzene–HCl | 1863.8635 | 1107.99873 | 918.09242 | −36.6494 | 10.9547 | 25.6947 | 30.48 | 0 | 0 | 25 |
H2–AgClc | 1588778.3 | 3453.034 | 3443.7141 | −31.608 | 15.232 | 17.384 | 0 | 0 | 0 | 26 |
H2–AuClc | 1209646 | 3297.117 | 3287.332 | −41.6877 | 18.581 | 23.107 | 0 | 0 | 0 | 9 |
H2–CuCl | — | 4781.8759b | — | −24.43 | — | — | 0 | 0 | 0 | 27 |
HCCH–CH2ClFa | 5262.899 | 1546.8071 | 1205.4349 | 28.497 | −65.618 | 37.121 | −22.2007 | 0 | 0 | 28 |
HCCH–ClHCCF2a | 2671.139543 | 988.154084 | 721.384869 | 36.22802 | −72.54615 | 36.31813 | 20.5876 | 0 | 0 | 29 |
HCCH–ClHCCH2a | 5679.6182 | 1523.02024 | 1200.10948 | 35.7483 | −67.1371 | 31.3888 | 11.2 | 0 | 0 | 30 |
HCCH–H2CCFCla | 9147.85663 | 889.015856 | 811.975763 | −70.63491 | 37.3203 | 33.31461 | 12.356 | 0 | 0 | 31 |
Kr–HClc | 0 | 1200.62374 | 0 | −29.2376 | — | — | 0 | 0 | 0 | 32 |
Xe–HClc | 0 | 989.26113 | 0 | −34.67 | — | — | 0 | 0 | 0 | 33 |
For errors the standard deviation of the MARD is calculated.
ORCA 4.2.135–37 was chosen for all quantum chemical calculations, as it is widely used through open access and offers customization of the basis sets.38–40
Most of the methods were chosen due to their prevalence among rotational spectroscopists such as B3LYP41–43 and its D344–46 and D447–49 dispersion corrected extensions with Becke–Johnson damping.45,46,50 The prevalence of the double hybrid functional B2PLYP51 similarly justifies its inclusion. B1LYP14,41,52 was chosen for comparison with Bailey's studies. B97M-D3BJ53 is a meta-GGA functional chosen to compare with hybrid functionals based on previous studies.49,54 MP2 is also a popular method among rotational spectroscopists for structure predictions and serves as an ab initio method comparison with moderate computational cost.55
The basis sets used for DFT and meta-GGA methods are the def2 basis sets def2-TZVPD and def2-TZVPPD,56–59 a very common all-purpose basis set. To compare with another basis with an improved description of core electrons, which potentially has an impact on the shielding effect of the quadrupolar nucleus by the electrons, x2c-TZVPP-2c60 and x2c-TZVPP-s61 are included. An exception is B1LYP, where VTZ(3df,2d)62–72 was used to compare with Bailey's calculations. For the MP2 calculations 6-311++G(d,p) and 6-311++G(2d,2p)64–69 basis sets where used to compare basis sets of different sizes. ORCA's integrated basis sets were used unless the basis was unavailable, then the basis set was downloaded from Basis Set Exchange.38–40
Single point calculations was performed for select geometries optimized at the levels of B3LYP/def2-TZVPPD, B3LYP-D3(BJ)/def2-TZVPPD, and B3LYP-D3(BJ)/x2c-TZVPPall-2c. These structures were chosen to keep the scope limited within the B3LYP hybrid functional group, as these calculations provided a large range of quality in their respective geometry optimizations with respect to structure. As these methods were used for the optimization in this part of the study, each method was subsequently used as a single point calculation method on the other two optimized structures. For example, a single point calculation was done with B3LYP/def2-TZVPPD on the optimized geometry at the B3LYP-D3(BJ)/def2-TZVPPD level. This was done for completeness of the comparison. For these geometries, a mix of calculations were used. In keeping with the previous standards, the NQCC was determined with a single point calculation using B1LYP/TZV(3df,2p) according to Bailey,62 which includes transforming the raw EFG tensor to the NQCC tensor using both the nuclear quadrupole moment Q from the literature and Bailey's corrected Qeff. For each of the three optimized geometries, B3LYP/def2-TZVPPD. Lastly, CCSD(T)/cc-pVQZ73–76 was chosen to compare with a high-level ab initio method. The treatment of all geometries was repeated in these single point calculations, regardless of the quality of the predicted structure, with the exception of many electron systems. As CCSD(T)77,78 is a very expensive method, Ag, Au, Cu, Kr, Xe and fluorobenzene were excluded from this sub-test set, as the computational cost was too high (see Table 1).
For χij, the distinction between the NQCC in the nuclear axis frame and principle axis system has to be made. No previous study has rigorously investigated dispersive dimers, but Bailey corrected Qeff by 0.1 ± 0.9% for 35Cl containing monomers.
Fig. 1 Comparison of MARD of NQCC and geometry predictions made by examined combinations of methods and basis sets after optimization. Error bars were omitted when the values were to large for a readable depiction. The values can be found in the ESI† (Tables S1–S5) in the dataverse.79 |
The DFT hybrid functionals like B1LYP and B3LYP show large discrepancies between predicted geometries and experimental structures when omitting Grimme's dispersion correction and Becke–Johnson damping as demonstrated by the large MARD for (B + C)/2. This is expected behavior, as long-range dispersive interactions are underestimated by theory and lead to distortion in geometry for the dimer data set governed by dispersion.45
Reliable structural data with mean relative deviations below <4%80 are provided by B3LYP-D3/def2-TZVPD, B3LYP-D4/def2-TZVPD, B3LYP-D4/def2-TZVPPD, and MP2/6-311++G(2d,2p) calculations. Overall, B3LYP-D3/def2-TZVPD and B3LYP-D4/def2-TZVPD performed best among all methods in predicting rotational constants, demonstrating no significant differences beyond the methods precision between D3 and D4 dispersion corrections. A close second is MP2/6-311++G(2d,2p) in archiving quality geometric predictions. However, it is important to note that a precise prediction is not expected, as we are comparing equilibrium structure calculations with the ground state structure observed in experiments. Nevertheless, spectroscopists have reliably used these methods for conformational assignment, and the benchmark data presented here confirm once more that these methods provide a robust description of weakly bound complexes at moderate computational costs.54,81–83 With a shift to flexible systems with internal motions, this mismatch in comparisons needs to be carefully considered in a case-by-case method.
The description of the quadrupole tensor orthogonal to the bond axis Δ(χxx − χyy) is less accurate than for χzz. As the NQCC tensor is traceless, χxx and χyy are of similar magnitude in most of the chlorine compounds selected. For this reason, the difference Δ(χxx − χyy) leads to higher relative errors compared to χzz. Indeed, the non-dispersion corrected B3LYP and B1LYP methods outperform other DFT hybrid functional approaches with Grimme corrections in predicting Δ(χxx − χyy). Due to inaccurate geometry predictions, this is most likely due to the much larger error spread and error compensation. Therefore, it is important to view the Δ(χxx − χyy) predictions in conjunction with χzz and (B + C)/2, to ensure that the NQCC description is correctly described and not just by chance. The method that strikes that balance best with a single optimization is MP2/6-311++G(2d,2p), as this method performs the best in predicting the NQCC while simultaneously occupying fourth place for geometry predictions among the methods tested here, suggesting that the electronic description of the EFG is reliable.
The difference between D3 and D4 for B3LYP is minor, they have almost identical NQCC MARDs, as the test set lacks ionic species, the characteristic D4 improves over D3.47Fig. 1 indicates that overall B1LYP produces better predictions of NQCC than other DFT methods despite the imperfect structure obtained by optimization, which is discussed in another section.
The basis set has a significant impact on the description of NQCC. The x2c basis set is an all-electron basis that was fit using NMR parameters. NMR data were chosen specifically to improve the shielding constants for inner electrons.60,61 This approach can theoretically affect NQCC if the description of the core potential for chlorine is inadequate. In contrast to expectations, the x2c basis set does not outperform the def2 basis set for the description of quadupole coupling. While χzz is comparatively well described, the errors in Δ(χxx − χyy) rise compared to the Karlsruhe def2 basis set. Overall, the relative error of Δ(χxx − χyy) is much larger than that of χzz, thereby making this parameter less reliable in judging the overall quality of the prediction of the NQCC tensor. As a result, the x2c basis sets still outperform the def2 basis sets for DFT calculations. However, MP2 with a 6-311++G(2d,2p) basis set still overall provides the most reliable predictions of the entire set of methods.
When comparing ab initio MP2 methods, the importance of the basis set is highlighted. The addition of diffuse functions to the basis set results in robuster NQCC predictions in the molecular axis frame. Therefore, among the methods tested, MP2/6-311++G(2d,2p) is the best choice among the methods tested if only a single calculation is performed, if the basis set can be computationally afforded.
Omitting B3LYP dispersion corrections does not have a significant effect on the prediction of NQCC, since the coupling is primarily dependent on the description of the electronic environment. The impact of dispersion correction on this calculation is diminished by other computational decisions, such as the choice of basis sets, which is illustrated in Fig. 2. Comparison of B3LYP and B3LYP-D3 indicates that B3LYP did not accurately predict χ for the right reason, as suggested in Fig. 1. Instead, the low NQCC MARD is caused by error compensation between the distorted geometry and an inaccurate description of the nuclear environment.
Despite the inaccuracies in optimization discussed above, B1LYP/TZV(3df,2p) outperforms all other methods in describing χzz, albeit prediction for (χxx − χyy) contains the largest errors compared to B3LYP-D3/def2-TZVPPD and CCSD(T)/cc-pVQZ. Employing Qeff to transform the EFG according to eqn (1) fails to correct the NQCC, as it breaks down when applied to dimers.
The CCSD(T) results improve the prediction of χ compared to B3LYP, although it is still outperformed by B1LYP in terms of χzz. The ab initio method performs best for (χxx − χyy). This result was a surprise, as CCSD(T) is often praised as the gold standard in computational chemistry. It can likely be attributed to the basis set cc-pVQZ used in the CCSD(T) calculation, which is not augmented with diffuse functions compared to the TZV(3df,2p) employed for the B1LYP calculation. More benchmarking needs to be done to confirm this hypothesis. It demonstrates the importance of the basis set, as in this case a chosen DFT method with a large basis set can compete with CCSD(T) calculations in accuracy to predict the local EFG.
A major issue for correlating prediction and experiment is that fundamental approximations made by optimizing equilibrium structures cause much larger errors in loosely bound systems and in more rigid monomers. This is particularly apparent in the argon–HCl complexes, which show a large deviation from the linear fit in Fig. 3. The average ground state structure deviates greatly from the equilibrium geometry due to large amplitude motion.16 For cases such as this where the equilibrium structure is an insufficient approximation, Qeff as a single correction parameter falls short. When Ar complexes are excluded from linear regression, the effective quadrupole moments in the data set can be determined to be Qeff = −79.6 ± 0.9 mb. This is a significant improvement of the fit, but, the fact that simple calculations become less reliable still holds true for this data set, and using this correction factor for the NQCC for other systems still requires a chemical understanding in case the approximations of the calculations break down. In that case, a larger benchmarking set is required to determine Qeff for similar cases, which probably requires the definition of individual Qeff for complexes dominated by the same effects. Grouping the complexes in this data set and investigating them separately has been done, where the Arx–HCl complexes show the most pronounced deviations, but the experimental data set is very limited (see Fig. SF1, ESI†). Alternatively, calculations with less approximation and higher time investment, such as vibrational calculations, might be used to treat these cases. In either case, simply using Bailey's Qeff determined from monomers for complexes is a fallacy.
Returning to our large shift of Qeff = −73.4 ± 5.9 mb compared to Bailey's value Qeff = −81.568 ± 0.091 mb, Bailey's reasoning that inner-core electron shielding effects explain the shift in monomers is more clear. In our study of complex NQCC, we now connect the large contribution of revs. r0 differences in geometry to the NQCC prediction. Ultimately, we cannot disentangle the contributions of either from the effective quadrupole moment of −73.4 ± 5.9 mb, and must conclude that both contribute. The large shift in the complexes does suggest a significant contribution from structural difference between equilibrium and ground state geometries.
The primary focus was on established methods for structure determination, with MP2 and dispersion corrected B3LYP-D3(BJ) and B3LYP-D4 performing best, proving their status as robust methods for geometry optimization. To reliably obtain accurate predictions for the NQCC, MP2/6-311++G(2d,2p) as an ab initio method outperformed all DFT methods when requiring both reliable geometry and NQCC predictions with a single calculation. We showed that Bailey's empirical correction of chlorine monomers is not applicable for dimers. An empirical correction factor based on the accessible dimer data set was defined by pre-optimization of the geometry and calculation of the NQCC at B1LYP/TZV(3df,2p) level. It is at the user's discretion if the correction is applicable, because flexible systems with large amplitude motions or energetically close conformers require computational treatment of these phenomena to improve the NQCC predictions.
Future work on this project would broaden the variety of methods employed, in particular, more different ab initio methods for comparison, since MP2 performed well compared to DFT methods. An important step would be the further investigation flexible cases such as the argon complexes featured in this study with a more expensive anharmonic treatment to improve the NQCC. The experimental data set should be extended to include different quadrupolar nuclei, such as nitrogen, boron, bromine, and iodine. These extended data sets present a challenge in that they are typically harder to analyze and have fewer examples in the literature. The benefit of extending to these data sets to bromine and iodine is that they are more likely to have full, experimentally determined quadrupole coupling tensors due to the larger Q values of the nuclei.
Footnote |
† Electronic supplementary information (ESI) available: Files and data basis of the plots can be accessed digitally at https://doi.org/10.25625/VMY3WD. See DOI: https://doi.org/10.1039/d2cp04067k |
This journal is © the Owner Societies 2023 |