Benchmarking the quadrupolar coupling tensor for chlorine to probe weak-bonding interactions

Rotational spectroscopy relies on quantum chemical calculations to interpret observed spectra. Among the most challenging molecules to assign are those with additional angular momenta coupling to the rotation, contributing to the complexity of the spectrum. This benchmark study of computational methods commonly used by rotational spectroscopists targets the nuclear quadrupole coupling constants of chlorine containing molecules and the geometry of its complexes and clusters. For each method, the quality of both structural and electronic parameter predictions is compared with the experimental values. Ab initio methods are found to perform best overall in predicting both the geometry of the complexes and the coupling constants of chlorine with moderate computational cost. This cost can be reduced by combining these methods with density functional theory structure optimization, which still yields adequate predictions. This work constitutes a first step in expanding Bailey’s quadrupole coupling data set to encompass molecular clusters. [W. C. Bailey, Calculation of Nuclear Quadrupole Coupling Constants in Gaseous State Molecule, 2019, https://nqcc.wcbailey.net/ ]


Introduction
With the rise of broadband rotational spectroscopy in the last two decades, 1,2 the demands for accurate predictions of rotational spectroscopic constants from quantum chemical calculations have increased.More precise calculations are needed to interpret the increasing complexity of molecular targets.The complexity is introduced through an increase in system size, leading to a large conformational space or several angular momenta coupling to the total angular momentum.In addition, below the threshold of molecular mass transferable into the gas-phase, broadband microwave spectroscopy is not size selective.During detection, many impurities and weakly bound complexes containing two or more molecules are observable simultaneously. 3he analysis of the large amount of spectroscopic data collected, both in number of spectra and in spectral complexity, can currently be very time consuming.An accurate initial guess of the spectroscopic parameters is valuable because it drastically minimize the time spent on assignment.Accurate theoretical predictions are also valuable for automatic fitting programs, 4,5 since it reduces calculation time.The aim of this paper benchmark a number of theoretical methods to test their performance in predicting the complex hyperfine splitting of weakly bound complexes.
In microwave spectroscopy, rotational constants are the fundamental spectroscopic parameters, which are directly related to the geometry of the molecule as they depend on the moments of inertia.This direct link to the mass distribution in the molecule establishes microwave spectroscopy as a powerful tool for determining the molecular structure and underscores the necessity for an accurate prediction of these parameters from theory for assignment.In addition to rotational constants, quantum chemical methods produce predictions for another characteristic important for assignment, the Nuclear Quadrupolar Coupling Constant (NQCC) w ij .Provided the target molecule contains one or more quadrupolar nuclei (I 4 1/2), such as 14 N or 35 Cl, [6][7][8][9] a hyperfine structure is observed that arises from the coupling of the nuclear angular momentum with the molecular angular momentum. 10The coupling is dependent on the intrinsic nuclear quadrupole moment Q and the electronic environment expressed by the Electronic Field Gradient (EFG) q ij (i, j = abc).w ij ¼ w aa w ab w ac w ba w bb w bc w ca w cb w cc The NQCC w ij is experimentally observed in the principal axis system abc based on the construction of the rotational Hamiltonian.Through rotation in the nuclear axis system of the quadrupolar nucleus, the NQCC tensor is transformed to w ij (i, j = xyz) by diagonalization.The steepest gradient of the EFG is along the chlorine carbon bond in the nuclear axis system, where this axis is labeled z by convention.To clarify which axis system is used, w abc and w xyz will be used to denote the NQCC in the principal axis system or the nuclear axis frame, respectively.
Bailey et al. performed extensive benchmarking of NQCC 11 with a focus on chlorine-containing molecules which are present in many important compounds for chemical industry. 12his is in addition to Bailey's numerous examinations of similar nuclei containing high-spins that are found in rotational spectroscopy studies.Bailey found a linear correlation between NQCC values predicted at the B1LYP/TZV(3df,2p) 13,14 level and experimental results of 22 molecules.He established an empirical correction Q eff of the intrinsic nuclear quadrupole moment that acts on the EFG of the molecule.The reference data set encompasses a wide range of chlorine compounds probed with experimental methods that are added sequentially to the data set after the initial publication.The data set is limited exclusively to chlorine-containing monomers, disregarding chlorine-containing species with intermolecular dispersive bonding interactions.

Selection of molecular targets
We selected 20 complexes for our data set in this study.Reoccurring substructures such as common monomers or rare gas binding partners appear in multiple structures within the data set.There were several selection parameters that guided the choice of molecular structures to be included in the data set.

Full description of NQCC
From all structures picked from literature, the full description of the NQCC tensor of a quadrupolar nucleus was required, including the off-diagonal elements of the tensor.A full experimental description of the NQCC tensor was required to compare with the computational results.The experimental determination of the full tensor is easier when the number of nonzero off-diagonal elements is low; hence, a number of chosen molecules have symmetry elements.

Gas-phase complex
All chosen structures display non-covalent intermolecular interactions in an effort to expand on Bailey's data set.All are heterodimers and contain only a singular quadrupole of interest.There are a variety of bond strengths in the chosen complexes, from van der Waals interactions to hydrogen bonds and interactions with a weakly covalent character.The data set includes several subgroups with similar monomers for different interactions with organic, inorganic, and metallic bonding partners.
These subgroups provide trends within the data set.A large number of structures have chlorine as a component of an organic molecule such as 2-chlorothiophene or CF 3 Cl which interact with a number of binding partners, mostly small molecules, both organic and inorganic compounds.In particular, CF 3 Cl has been explored with a number of binding partners offering a wide selection of different interactions with the same system.Another group is argon as a rare gas binding partner to a number of small chlorine compounds such as Cl 2 and HCl, uncovering the impact of the covalent environment of chlorine.To further diversify the binding partners, structures containing metals are also included in the data set. 34

Computational cost
To keep the computational cost affordable, the number of complexes with heavy atoms such as Au and Xe was kept to a minimum.The number of atoms was limited by the computational cost and the lowered experimental accessibility of large molecules in the gas-phase, a boundary condition set by the experimental data preexisting in literature.

Chlorine nucleus
This data set is meant to be an extension of Bailey's work, and therefore it is focused exclusively on chlorine as a quadrupolar nucleus.There are also other high-spin nuclei present in selected systems, but at least one chlorine atom was required for inclusion in the benchmarking set.

Challenges of NQCC benchmarking
When discussing NQCC predictions, there is an inherent overlap with structure that complicates analysis.Geometry, chemical environment, and large amplitude motions all inform quadrupolar coupling constants, both in their inherent changes to the local EFG, and in the projection of the NQCC into the molecular axis system.The reliability of these factors from the calculations is informed by the structure type, specifically the treatment of dispersive complexes compared to that of monomer structures.Overall, dispersive dimers tend to be more floppy and have many local minima on the potential energy surface.A NQCC prediction should be correct for the right reasons.If the optimized structure already has discrepancies from the experimentally observed one, it contaminates the prediction of the EFG and its projection from the nuclear into the principal axis system.This is a particular complication if a single method is used both for structure optimization and for the prediction of the NQCC, as a singular method is not necessarily optimal for both computations.To address this issue, well established methods were used to optimize structures of complexes, and then a This journal is © the Owner Societies 2023 variety of methods were used to predict the NQCC of chlorine in these optimized complexes.
An unreliable geometry prediction of dimers is also particularly relevant when it comes to large amplitude motions.Low binding energies compared to covalently bound structures facilitate low energetic barriers for large-amplitude motions.This challenge is not unique to NQCC predictions, but is relevant for dispersive complexes in general.Nevertheless, it is an important factor to consider when approaching NQCC predictions and corrections.

Optimization
The geometries of all the structures in the benchmark set were optimized with the methods mentioned above.An exception are the heavy atom containing complexes, which were excluded from MP2 optimizations in the absence of parameterization of the basis set for Au, Ag, Kr and Xe.In addition to geometry, the EFG of chlorine was calculated for the optimized structure.

Single point calculations
It proved difficult to compare the computational results for the NQCC independently of the structural predictions.Several methods yielded inaccurate geometries, displaying relative deviations beyond 10% for structural parameters, which is generally unsuitable for fitting and auto-fitting when hyperfine splitting is also present.Therefore, single point calculations were performed on select pre-optimized structures to obtain predictions of the NQCC based on different starting geometries.While this approach cannot negate the inherent link between geometry and EFG, predictions of the NQCC based on a common geometry share the same systematic error through structural distortion, hence a relative comparison of EFG predictions between methods is possible.This is the same methodology prescribed by Bailey that we now extend to complexes.
Single point calculations was performed for select geometries optimized at the levels of B3LYP/def2-TZVPPD, B3LYP-D3(BJ)/def2-TZVPPD, and B3LYP-D3(BJ)/x2c-TZVPPall-2c.These structures were chosen to keep the scope limited within the B3LYP hybrid functional group, as these calculations provided a large range of quality in their respective geometry optimizations with respect to structure.As these methods were used for the optimization in this part of the study, each method was subsequently used as a single point calculation method on the other two optimized structures.For example, a single point calculation was done with B3LYP/def2-TZVPPD on the optimized geometry at the B3LYP-D3(BJ)/def2-TZVPPD level.This was done for completeness of the comparison.For these geometries, a mix of calculations were used.In keeping with the previous standards, the NQCC was determined with a single point calculation using B1LYP/TZV(3df,2p) according to Bailey, 62 which includes transforming the raw EFG tensor to the NQCC tensor using both the nuclear quadrupole moment Q from the literature and Bailey's corrected Q eff .For each of the three optimized geometries, B3LYP/def2-TZVPPD.Lastly, CCSD(T)/cc-pVQZ [73][74][75][76] was chosen to compare with a highlevel ab initio method.The treatment of all geometries was repeated in these single point calculations, regardless of the quality of the predicted structure, with the exception of many electron systems.As CCSD(T) 77,78 is a very expensive method, Ag, Au, Cu, Kr, Xe and fluorobenzene were excluded from this sub-test set, as the computational cost was too high (see Table 1).

Optimization results and interpretation
To assess performance in benchmarking, an estimated behaviour from other test sets is advantageous.For rotational constants, a back correction to experimental ground state B 0 from equilibrium values B e of À1(AE1)% is expected for monomers and À1(AE3)% for hydrogen bonded dimers. 80Currently, this is the highest possible accuracy that can be expected with the approximations in place for the rotational constant estimation.
For w ij , the distinction between the NQCC in the nuclear axis frame and principle axis system has to be made.No previous study has rigorously investigated dispersive dimers, but Bailey corrected Q eff by 0.1 AE 0.9% for 35 Cl containing monomers.

Geometry
The methods in Fig. 1 are compared with the experiment with three key observables: (B + C)/2 as structural parameter and the diagonal elements of NQCC w ij in both the nuclear axis system and the principal axis system to probe the electronic description.The sign of the MARD values determined by the definition the w xyz axis system for chlorine, its steepest gradient along the z-axis is negative in all cases.
The DFT hybrid functionals like B1LYP and B3LYP show large discrepancies between predicted geometries and experimental structures when omitting Grimme's dispersion correction and Becke-Johnson damping as demonstrated by the large MARD for (B + C)/2.This is expected behavior, as long-range dispersive interactions are underestimated by theory and lead to distortion in geometry for the dimer data set governed by dispersion. 45eliable structural data with mean relative deviations below o4% 80 are provided by B3LYP-D3/def2-TZVPD, B3LYP-D4/def2-TZVPD, B3LYP-D4/def2-TZVPPD, and MP2/6-311++G(2d,2p) calculations.Overall, B3LYP-D3/def2-TZVPD and B3LYP-D4/def2-TZVPD performed best among all methods in predicting rotational constants, demonstrating no significant differences beyond the methods precision between D3 and D4 dispersion corrections.A close second is MP2/6-311++G(2d,2p) in archiving quality geometric predictions.However, it is important to note that a precise prediction is not expected, as we are comparing equilibrium structure calculations with the ground state structure observed in experiments.2][83] With a shift to flexible systems with internal motions, this mismatch in comparisons needs to be carefully considered in a case-by-case method.

v xyz
Direct comparison of the nuclear quadrupole moment is best done in the nuclear axis frame xyz.The steepest gradient of NQCC along the axis of the chlorine covalent bond w zz is described with reasonable precision by all methods.The best predictions for this parameter are provided by B97M-V-D3 and MP2 calculations, with these methods producing the same quality of predictions.In contrast, the double hybrid functional B2PLYP performs worst among the methods discussed.It is also remarkable that B1LYP/VTZ (3df,2d) outperforms most other methods, despite the lack of dispersion corrections.As (B + C)/2 demonstrates, the geometry is distorted and does not agree with the experimental findings, bringing the reliability of the prediction of NQCC into question.Among the basis sets, the x2c basis set always outperformed the def2 basis sets independent of the method.
The description of the quadrupole tensor orthogonal to the bond axis D(w xx À w yy ) is less accurate than for w zz .As the NQCC tensor is traceless, w xx and w yy are of similar magnitude in most  S1-S5) in the dataverse. 79f the chlorine compounds selected.For this reason, the difference D(w xx À w yy ) leads to higher relative errors compared to w zz .Indeed, the non-dispersion corrected B3LYP and B1LYP methods outperform other DFT hybrid functional approaches with Grimme corrections in predicting D(w xx À w yy ).Due to inaccurate geometry predictions, this is most likely due to the much larger error spread and error compensation.Therefore, it is important to view the D(w xx À w yy ) predictions in conjunction with w zz and (B + C)/2, to ensure that the NQCC description is correctly described and not just by chance.The method that strikes that balance best with a single optimization is MP2/6-311++G(2d,2p), as this method performs the best in predicting the NQCC while simultaneously occupying fourth place for geometry predictions among the methods tested here, suggesting that the electronic description of the EFG is reliable.
The difference between D3 and D4 for B3LYP is minor, they have almost identical NQCC MARDs, as the test set lacks ionic species, the characteristic D4 improves over D3. 47 Fig. 1 indicates that overall B1LYP produces better predictions of NQCC than other DFT methods despite the imperfect structure obtained by optimization, which is discussed in another section.
The basis set has a significant impact on the description of NQCC.The x2c basis set is an all-electron basis that was fit using NMR parameters.NMR data were chosen specifically to improve the shielding constants for inner electrons. 60,61This approach can theoretically affect NQCC if the description of the core potential for chlorine is inadequate.In contrast to expectations, the x2c basis set does not outperform the def2 basis set for the description of quadupole coupling.While w zz is comparatively well described, the errors in D(w xx À w yy ) rise compared to the Karlsruhe def2 basis set.Overall, the relative error of D(w xx À w yy ) is much larger than that of w zz , thereby making this parameter less reliable in judging the overall quality of the prediction of the NQCC tensor.As a result, the x2c basis sets still outperform the def2 basis sets for DFT calculations.However, MP2 with a 6-311++G(2d,2p) basis set still overall provides the most reliable predictions of the entire set of methods.

v abc
Although NQCC w xyz in the nuclear axis system provides insight into the electronic description of the nucleus, for the purpose of fitting a spectrum, w abc is required, which is the projection of w into the principal axis system of the molecule.An accurate description of the geometry and the EFG is important to prevent distortion of the NQCC tensor.However, methods that fail to optimize geometry sufficiently, such as B1LYP and B3LYP without dispersion correction, have poorer precision than others, such as B3LYP-D3.It resulted in the omission of error bars in Fig. 1 to enhance legibility, which can be found in the ESI.† The non-dispersion corrected methods yield slightly better MARD for w aa with lower precision, but as with the Grimme corrected counterpart of the B3LYP methods, it lacks precision.
When comparing ab initio MP2 methods, the importance of the basis set is highlighted.The addition of diffuse functions to the basis set results in robuster NQCC predictions in the molecular axis frame.Therefore, among the methods tested, MP2/6-311++G(2d,2p) is the best choice among the methods tested if only a single calculation is performed, if the basis set can be computationally afforded.
Omitting B3LYP dispersion corrections does not have a significant effect on the prediction of NQCC, since the coupling is primarily dependent on the description of the electronic environment.The impact of dispersion correction on this calculation is diminished by other computational decisions, such as the choice of basis sets, which is illustrated in Fig. 2. Comparison of B3LYP and B3LYP-D3 indicates that B3LYP did not accurately predict w for the right reason, as suggested in Fig. 1.Instead, the low NQCC MARD is caused by error compensation between the distorted geometry and an inaccurate description of the nuclear environment.
Despite the inaccuracies in optimization discussed above, B1LYP/TZV(3df,2p) outperforms all other methods in describing w zz , albeit prediction for (w xx À w yy ) contains the largest errors compared to B3LYP-D3/def2-TZVPPD and CCSD(T)/cc-pVQZ.Employing Q eff to transform the EFG according to eqn (1) fails to correct the NQCC, as it breaks down when applied to dimers.
The CCSD(T) results improve the prediction of w compared to B3LYP, although it is still outperformed by B1LYP in terms of w zz .The ab initio method performs best for (w xx À w yy ).This result was a surprise, as CCSD(T) is often praised as the gold standard in computational chemistry.It can likely be attributed to the basis set cc-pVQZ used in the CCSD(T) calculation, which is not augmented with diffuse functions compared to the TZV(3df,2p) employed for the B1LYP calculation.More benchmarking needs to be done to confirm this hypothesis.It demonstrates the importance of the basis set, as in this case a chosen DFT method with a large basis set can compete with CCSD(T) calculations in accuracy to predict the local EFG.

NQCC correlation
To directly compare the electronic description around the nucleus, selected geometries pre-optimized with B3LYP-D3/ def2-TZVPPD were used to determine the EFG at the level of B1LYP/TZV(3df,2p) and plotted against the experimental values of NQCC.This is following Bailey's method of optimizing the geometry with one's preferred method and calculating the EFG using B1LYP/TZV(3df,2p) to apply an empirical correction factor.When correlating the calculated EFG and the experimental NQCC values, the data spread is much more pronounced compared to Bailey's monomers, as the geometry of dimers are generally harder to predict (see Chapter 3).Additionally, the dimer data set is smaller, with 14 comparable systems which had existing basis set parameterisation for all atoms, while Bailey included 22 systems in his correlation.When applying a linear fit, we derive a value of eQ eff /h = À17.3AE 1.4 MHz a.u.À1 for 35 Cl compared to Bailey's eQ eff /h = À19.166AE 0.021 MHz a.u.À1 62 Bailey's values suggests shielding of chlorine nucleus as the effective quadrupole moment Q eff = À 81.568 AE 0.091 mb is smaller than the literature value of Q = À 81.65 AE 0.80 mb.Therefore, the calculations underestimate the electronic interactions with the quadrupole moment.This effect ostensibly is enhanced in complexes, where the effective quadrupole moment is À73.4 AE 5.9 mb, but may have influence from other sources, as we suggest below.
A major issue for correlating prediction and experiment is that fundamental approximations made by optimizing equilibrium structures cause much larger errors in loosely bound systems and in more rigid monomers.This is particularly apparent in the argon-HCl complexes, which show a large deviation from the linear fit in Fig. 3.The average ground state structure deviates greatly from the equilibrium geometry due to large amplitude motion. 16For cases such as this where the equilibrium structure is an insufficient approximation, Q eff as a single correction parameter falls short.When Ar complexes are excluded from linear regression, the effective quadrupole moments in the data set can be determined to be Q eff = À79.6AE 0.9 mb.This is a significant improvement of the fit, but, the fact that simple calculations become less reliable still holds true for this data set, and using this correction factor for the NQCC for other systems still requires a chemical understanding in case the approximations of the calculations break down.In that case, a larger benchmarking set is required to determine Q eff for similar cases, which probably requires the definition of individual Q eff for complexes dominated by the same effects.Grouping the complexes in this data set and investigating them separately has been done, where the Ar x -HCl complexes show the most pronounced deviations, but the experimental data set is very limited (see Fig. SF1, ESI †).Alternatively, calculations with less approximation and higher time investment, such as vibrational calculations, might be used to treat these cases.In either case, simply using Bailey's Q eff determined from monomers for complexes is a fallacy.
Returning to our large shift of Q eff = À73.4AE 5.9 mb compared to Bailey's value Q eff = À81.568AE 0.091 mb, Bailey's reasoning that inner-core electron shielding effects explain the shift in monomers is more clear.In our study of complex NQCC, we now connect the large contribution of r e vs. r 0 differences in geometry to the NQCC prediction.Ultimately, we cannot disentangle the contributions of either from the effective quadrupole moment of À73.4 AE 5.9 mb, and must conclude that both contribute.The large shift in the complexes does suggest a significant contribution from structural difference between equilibrium and ground state geometries.

Conclusion
In this work, the quantum chemical predictions for NQCC were compared to the experimental spectroscopic results to determine the best performing methods to obtain initial guesses for future fitting efforts of spectroscopic data.
The primary focus was on established methods for structure determination, with MP2 and dispersion corrected B3LYP-D3(BJ) and B3LYP-D4 performing best, proving their status as robust methods for geometry optimization.To reliably obtain accurate predictions for the NQCC, MP2/6-311++G(2d,2p) as an ab initio method outperformed all DFT methods when requiring both reliable geometry and NQCC predictions with a single calculation.We showed that Bailey's empirical correction of chlorine monomers is not applicable for dimers.An empirical correction factor based on the accessible dimer data set was defined by pre-optimization of the geometry and calculation of the NQCC at B1LYP/TZV(3df,2p) level.It is at the user's discretion if the correction is applicable, because flexible systems with large amplitude motions or energetically close conformers require computational treatment of these phenomena to improve the NQCC predictions.
Future work on this project would broaden the variety of methods employed, in particular, more different ab initio methods for comparison, since MP2 performed well compared to DFT methods.An important step would be the further investigation flexible cases such as the argon complexes featured in this study with a more expensive anharmonic treatment to improve the NQCC.The experimental data set should be extended to include different quadrupolar nuclei, such as nitrogen, boron, bromine, and iodine.These extended data sets present a challenge in that they are typically harder to analyze and have fewer examples in the literature.The benefit of extending to these data sets to bromine and iodine is that they are more likely to have full, experimentally determined quadrupole coupling tensors due to the larger Q values of the nuclei.

Fig. 1
Fig. 1 Comparison of MARD of NQCC and geometry predictions made by examined combinations of methods and basis sets after optimization.Error bars were omitted when the values were to large for a readable depiction.The values can be found in the ESI † (TablesS1-S5) in the dataverse.79

Fig. 3
Fig. 3 Linear correlation of calculated EFG and experimental NQCC values.Optimization of the structure performed with B3LYP-D3 and secondary calculations of the EFG with the optimized structure with B1LYP/TZV(3df,2p).The red dots mark the argon complexes with HCl (Ar-HCl, Ar 2 -HCl) and Ar 3 -HCl.The red linear regression includes these complexes in the data set, the black one does not.

Table 1
Rotational constants and NQCC of chlorine containing dispersive complex from literature