Marco
Mendolicchio
*a,
Lina
Uribe
ab,
Federico
Lazzari
b,
Luigi
Crisci
b,
Giovanni
Scalmani
c,
Micheal J.
Frisch
c and
Vincenzo
Barone
*d
aScuola Normale Superiore, Piazza dei Cavalieri 7, 56126 Pisa, Italy. E-mail: marco.mendolicchio@sns.it
bScuola Superiore Meridionale, Largo San Marcellino 10, 80138 Napoli, Italy
cGaussian, Inc., Wallingford 06492, Connecticut, USA
dINSTM, Via Giuseppe Giusti 9, 50121 Firenze, Italy. E-mail: vincebarone52@gmail.com
First published on 11th July 2025
Equilibrium molecular geometries are essential for understanding molecular systems, particularly in the gas phase, where intrinsic stereoelectronic effects can be disentangled from environmental influences. High-resolution rotational spectroscopy offers direct structural information and is now applicable to molecules with up to 50 atoms, significantly expanding its scope and demanding more advanced computational support to account for vibrational averaging and spectral complexity. Herein, we present an automated workflow that integrates the Pisa composite schemes (PCS) with efficient vibrational correction models, interfacing seamlessly with Gaussian and MSR programs. The protocol is designed for medium-sized molecules where relativistic and static correlation effects can be neglected, and is demonstrated on a set of prebiotic and biologically relevant compounds. Reliable equilibrium geometries are obtained for both semi-rigid and flexible species, provided that a second-order vibrational perturbation theory (VPT2) treatment is adequate; proline is included as a representative flexible case. Additionally, the phenyl radical is considered as a prototypical open-shell system, supported by extensive isotopic experimental data. This strategy enables the accurate and cost-effective determination of equilibrium geometries for molecules beyond the small-molecule regime, outperforming conventional methods and offering broad applicability in astrochemistry, prebiotic chemistry, and molecular spectroscopy.
The growing importance of precise geometries is especially pronounced in astrochemistry and atmospheric chemistry, where accurate structures are required to model exotic environments, including interstellar space1 and planetary atmospheres.2 In astrochemistry, structural precision is critical for interpreting spectroscopic data and identifying novel species under extreme conditions. Similarly, atmospheric models depend on reliable geometrical inputs to simulate reaction pathways and assess the impact of reactive species. In both areas, gas-phase data are indispensable, as they provide access to intrinsic molecular properties unaffected by solvation or matrix effects.
Among experimental techniques, high-resolution rotational spectroscopy stands out for its ability to probe molecular structures with exceptional detail.3–6 The resulting rotational constants are directly linked to the molecular moments of inertia and hence to the geometry. However, these constants correspond to vibrationally averaged structures, as measured in the ground vibrational state, and thus differ from the true equilibrium geometry. Consequently, recovering equilibrium structures from spectroscopic data requires the application of vibrational corrections, which can be derived experimentally only for small, rigid molecules. For larger or more flexible species, accurate theoretical predictions become indispensable—not only to determine structural parameters but also to assist spectral assignment and interpretation, particularly in congested spectra or in the presence of multiple conformers.
Here, we present a computational protocol that delivers accurate equilibrium geometries and vibrational corrections at moderate computational cost. This integrated strategy extends the applicability of rotational spectroscopy to increasingly complex molecular systems.
Our study focuses on closed-shell, neutral molecules composed exclusively of light elements (up to the third period) and free of transition metals, ensuring that relativistic and multireference effects can be neglected. While the protocol is also applicable to open-shell7 and charged species,8 current high-resolution data for such systems are generally limited to small molecules. As an illustrative exception, we include the phenyl radical, which has been extensively characterized across several isotopologues. Second-order vibrational perturbation theory (VPT2) delivers reasonable vibrational corrections also for flexible molecules, provided that a well-defined equilibrium geometry exists and anharmonicity remains moderate. As a representative example, we analyze proline, a biologically relevant molecule that exemplifies internal flexibility.
For small species, spectral assignment and interpretation can often proceed with minimal computational support.9,10 However, advances in instrumentation have extended the reach of rotational spectroscopy to molecules with up to 50 atoms,11–13 significantly increasing the size and complexity of accessible systems. This expansion introduces challenges such as spectral congestion and the presence of several low-energy conformers, which make purely experimental analysis impractical without reliable computational guidance.
A key difficulty arises from the already mentioned mismatch between experimentally derived rotational constants and theoretical equilibrium geometries. Since the former are affected by vibrational averaging, accurate vibrational corrections are essential for meaningful structural interpretation. Yet, high-level computations of such corrections remain costly for large molecules, while experimental determinations are feasible only for a limited subset of rigid systems.
These considerations underscore the need for computational methods that balance accuracy and efficiency. Standard approaches based on density functional theory (DFT) or second-order Møller–Plesset perturbation theory (MP2) provide equilibrium geometries that are not sufficiently accurate and neglect vibrational effects,14–16 thereby preventing unambiguous spectral assignment and interpretation. In practice, empirical adjustments or chemical intuition are frequently used to compensate for these shortcomings.
To address these challenges, we have developed the Pisa composite schemes (PCS),17,18 a family of hierarchical protocols that combine high-level electronic structure corrections—either explicitly computed or efficiently estimated—with manageable computational cost. These schemes are tailored to the needs of rotational spectroscopy, providing accurate geometrical data for medium-sized molecules.
A key component of the protocol is the implementation of vibrational correction models within the VPT2 framework.19,20 These models are fully automatable and computationally affordable, as the cost of computing vibrational corrections for any set of isotopologues is essentially the same as that for the parent molecule.
However, these methods risk remaining underutilized in many experimental workflows, largely due to their perceived complexity and the manual intervention they require. To bridge this gap, we introduce a fully automated, black-box workflow that integrates the Gaussian21 and MSR (molecular structure refinement)22 programs. This tool enables the routine prediction of spectroscopically accurate geometries and vibrational corrections with minimal user input, making advanced computational methods accessible to non-specialists.
We demonstrate the performance of this workflow through a diverse set of case studies spanning a wide range of molecular sizes and structural characteristics, from rigid frameworks to highly flexible backbones. These examples showcase the robustness and generality of our approach, and its ability to support both spectral assignment and structural refinement. Ultimately, our strategy enables the routine application of high-precision quantum chemical tools in rotational spectroscopy, at a computational cost comparable to that of conventional, less accurate methods.
Let us consider a molecule characterized by Na atoms and N normal modes of vibration, with N = 3Na − 6 for non-linear molecules and N = 3Na − 5 for linear ones. The starting point is the well-known Watson ro-vibrational Hamiltonian Hvr, which can be expanded in terms of the dimensionless normal coordinates q = {q1, …, qN} leading to the following expression:
![]() | (1) |
B0τ = Beqτ + ΔBvibτ | (2) |
ΔBvibτ = ΔBharmτ + ΔBCorτ + ΔBanhτ | (3) |
![]() | (4) |
![]() | (5) |
![]() | (6) |
In the above equations ai,τη is the derivative of the τη component of the moment of inertia tensor with respect to the qi coordinate, Ieqη is one of the principal moments of inertia, ωi is the harmonic wavenumber (in cm−1) associated with the ith normal mode, ζij,τ is the element of the Coriolis matrix coupling modes i and j along the τ axis, c is the speed of light in vacuum, h is the Planck constant, and the following notation has been introduced,
![]() | (7) |
Within the VPT2 framework, the calculation of anharmonic transition energies and intensities requires both the full set of anharmonic force constants up to semi-diagonal quartic terms (fijkk) and the derivatives of the relevant property (e.g., dipole moment for infrared) up to the semi-diagonal third-order terms.50,51 These data are typically obtained using central finite differences by displacing the molecular geometry along normal coordinates and performing a frequency calculation at each point, which also provides the property gradient.45,52 This methodology, from now on referred to as full Hessian (FH), implies 2N + 1 force constant computations (including the equilibrium geometry).
As can be deduced from eqn (4)–(6) and outlined in previous works,9,19 the quantities allowing for the calculation of ΔBharmτ, ΔBCorτ can already be computed at the harmonic level, while the anharmonic term only requires the semi-diagonal cubic force constants, paving the way to computational strategies far less demanding. Vibrational corrections to the rotational constants of one or more isotopic species are required in different situations. In the former case, the computational protocol is similar to the FH method, but the evaluation of force constants is replaced by the much cheaper computation of analytical gradients. Firstly, a harmonic calculation is carried out, providing both ΔBharmτ and ΔBCorτ contributions, together with the normal coordinates necessary to perform the displacements and finite differences. Then, at each geometry only the analytical gradient is evaluated, as this piece of information is sufficient to compute the semi-diagonal cubic force field,
![]() | (8) |
BSEτ = B0,expτ + ΔBvibτ | (9) |
Once the experimental rotational constants for the isotopic species of interest have been determined, the corresponding vibrational corrections must be evaluated with efficient protocols. A widely used approach involves performing a number of FH calculations equal to the number of isotopologues. This is related to the availability of several QC packages able to compute analytical second derivatives of the energy with respect to Cartesian coordinates and then to carry out full anharmonic calculations, with the ro-vibrational analysis generated as a byproduct. However, from a computational perspective, this approach is highly inefficient when the primary focus is on ro-vibrational spectroscopic parameters. In this work, we present two computational strategies designed to provide the same information while significantly reducing computational cost. The first approach is based on the FG method and consists of two main steps. First, a harmonic calculation is performed for a single isotopic species, typically the parent molecule. Since all isotopologues share the same equilibrium geometry and Cartesian Hessian matrix, a single matrix can be systematically applied to the harmonic analysis of all isotopic species, incorporating their respective nuclear masses. As a result, a single Hessian matrix enables the calculation of the ΔBharmτ and ΔBCorτ terms for all isotopologues.
At this point, the best course of action would be to calculate the semi-diagonal cubic force field of the parent species, followed by its tensorial transformation to obtain the ΔBanhτ term for all other isotopic species. Unfortunately, this approach requires the full cubic force field of the original species. As demonstrated in previous work,19,20 the removal of terms that involve three different normal modes (fijk with i ≠ j ≠ k) can cause serious problems in final vibrational corrections. Hence, the anharmonic contributions are estimated by applying the FG method to each isotopologue separately, leading in any case to a significant reduction in computational cost through the replacement of analytical force constants with gradients.
An alternative strategy relies on a single FH calculation, in which both second and third energy derivatives with respect to Cartesian coordinates are evaluated at the same time,
![]() | (10) |
The purely harmonic and Coriolis contributions to ΔBvibτ can be determined following the same route discussed for the FG method. Conversely, the calculation of the semi-diagonal cubic force constants for all isotopic species is carried out through a tensorial transformation,
![]() | (11) |
The choice of the most effective strategy depends on the type of molecule, the number of isotopic species, the level of calculation, the computational resources, and, in particular, the possibility of parallelization. An indicative comparison of the computational costs of the different procedures is reported in Table 1.
Method | Computational cost |
---|---|
FH |
N
I![]() ![]() ![]() ![]() |
FG | C H + NI·CG·2N |
TH |
C
H![]() ![]() |
In fact, after the evaluation of the Hessian at the equilibrium geometry, the FG method is more convenient than its TH counterpart whenever
![]() | (12) |
The optimization can be carried out through different algorithms, the default one being the Gauss–Newton method, over a set of nuclear coordinates defined by the user. In the present work, Z-matrix internal coordinates have been systematically adopted. After convergence of the optimization, in addition to reporting the final structure, a detailed error analysis is carried out. In particular, the standard deviations and confidence intervals for the different structural parameters are computed, together with the leverages and correlation coefficients. Regardless of the internal coordinates employed in the optimization process, a robust error propagation allows for the calculation of uncertainties in the desired coordinates, ranging from the canonical set of bond lengths, angles, and dihedrals to the distance between the centers of mass of different fragments. Furthermore, the Hessian matrix of the residual function is computed to confirm that the optimized geometry corresponds to a true minimum. While convergence to the absolute minimum cannot be guaranteed, different starting geometries and coordinate sets can be employed to mitigate this issue. In practice, the PCS starting geometries are sufficiently close to the SE structure that the final result typically corresponds to the absolute minimum.
More in detail, three main output files are provided:
• Main output file, containing all the information concerning the calculation;
• Summary file, containing the optimized geometry, the general trend of the optimization (which could in principle be plotted), part of the error analysis, and the rotational constants related to the different isotopologues with the corresponding uncertainties;
• A standard xyz file containing the Cartesian coordinates of the parent species in Eckart orientation, readable by any molecular editor.
The integration of the TH method is simpler, since it requires the calculation of the second and third derivatives of the energy with respect to Cartesian coordinates at the equilibrium geometry (freq = cubic keyword of Gaussian). Then, the vibrational corrections are computed by MSR through the tensorial transformation introduced in eqn (11).
Let us underline that the new developments have been fully integrated with the calculation of the SE structure, so that a single run of MSR allows for the calculation of the vibrational corrections, their use in the optimization process, and the determination of the structural parameters. For the sake of completeness, a diagram representing the whole engine is sketched in Fig. 1.
In general, this issue can be addressed by removing ill-defined parameters from the fit and fixing them at specified values, usually obtained through refined QC calculations. A viable alternative is the method of predicate observations (also known as mixed regression),59 in which accurate estimates of structural parameters and their uncertainties are included in the fit as additional sources of information. While the MSR code has incorporated these procedures since its inception, this work explores a new method that provides valuable support, particularly in ill-conditioned fits.
Previous studies have shown that different types of coordinates (e.g., C–H bond lengths) exhibit systematic errors,56,58,59 paving the way for methods that refine structural parameters through corrections derived from linear regressions. Therefore, a whole set of related coordinates can be optimized using a single parameter. More specifically, the parameters in question share the same correction at each step of the optimization while preserving different initial guesses and final values.
From a mathematical perspective, this is equivalent to imposing that the difference between one of these parameters and the others in the set remains equal to its initial guess value.
The cheapest variant (referred to as HPCS2 and used in the computation of anharmonic contributions) corresponds to a hybrid functional (B3LYP) paired with the 6-31+G* basis set, and complemented by empirical dispersion (D4).62 A more accurate variant (referred to as DPCS3 and used for the evaluation of harmonic force fields) corresponds to the revDSD-PBEP86-D4 double hybrid functional63 in conjunction with the 3F12− basis set,18 which is obtained from its standard cc-pVTZ-F12 (3F12) counterpart64 by removing d functions on first-row atoms and replacing the two f functions on second- and third-row atoms by a single f function taken from the cc-pVTZ basis set.65 Note that the D3BJ model for empirical dispersions66 employed in previous PCS versions has now been replaced by the more accurate D4 model,62 which has been implemented in a development version of the Gaussian software.57 Although this modification has a negligible impact on the molecular parameters of interest in the present work, it should allow a better reproduction of intermolecular interactions. At the same time, other quantum chemical models have been implemented in the Gaussian software (HF-3c,67 r2SCAN-3c,68 …) and tested as alternatives to the quite old B3LYP hybrid functional for the computation of anharmonic force fields and equilibrium geometries. Unfortunately, none of the tested methods improved the B3LYP results and some of them actually provided unphysical results for some molecules (e.g., HF-3c for molecules containing oxygen atoms). As a consequence, we have retained the HPCS2 and DPCS3 variants used in previous works, except for the use of D4 empirical dispersions.
The most accurate PCS variant (referred to as PCS2 and used in geometry optimizations) is based on the CCSD(T)-F12b ansatz60 (CC-F12) in conjunction with the cc-pVDZ-F12 (2F12) basis set18,69 to evaluate valence contributions for molecules containing first- and second-row atoms. For third-row atoms, an additional f function (taken from the cc-pVTZ basis set) is needed to obtain comparable errors, with the resulting basis set being referred to as 2F12+.17 A benchmark study70 showed that further extension of the basis set has a negligible effect provided that the contribution of the complementary auxiliary basis set (CABS) is taken into account in the Hartree–Fock (HF) component.71,72 On the other hand, the conventional MP273 model in conjunction with the cc-pwCVTZ74 (wC3) basis set is sufficient for evaluating the contribution of core–valence (CV) correlation.
ΔrCV = r(ae-MP2/wC3) − r(fc-MP2/wC3) | (13) |
Therefore, the PCS2 geometrical parameters are given by
rPCS2 = rCC-F12 + ΔrCV | (14) |
Equilibrium rotational constants require an accuracy on the order of 1 mÅ for bond lengths and 0.1 degrees for valence angles,18,75 which is attained by the PCS2 variant. However, the computational cost of this approach becomes prohibitive for molecules with more than 20 atoms. To address this limitation, a more efficient variant—denoted BDPCS3—has been developed. This replaces the coupled-cluster treatment with the DPCS3 scheme described above and estimates core–valence (CV) correlation effects on bond lengths using the following one-parameter expression:
![]() | (15) |
In the above expressions, Kronecker δ's are used to restrict the corrections to specific bond types, and, by convention, the atomic number of atom i is greater than or equal to that of atom j. Pij denotes the Pauling bond order, which is used to identify bonded pairs when its value exceeds 0.3:
Pij = exp[(rcovi + rcovj − rij)/0.3] | (16) |
The final BDPCS3 bond lengths are given by:
rBDPCS3 = rDPCS3 + ΔrBCVij + ΔrBVij | (17) |
All the quantum chemical computations have been performed with the Gaussian package,21,57 except the CC-F12 ones, which have been performed with the Molpro program.79 The standard convergence thresholds for geometry optimizations (maximum gradient < 4.5 × 10−3 a.u. and maximum step < 1.8 × 10−3 a.u., with RMS values of 3 × 10−3 a.u. (gradient) and 1.2 × 10−3 a.u. (step) are too large both for the sought spectroscopic accuracy and the computation of converged vibrational corrections, especially in the presence of soft modes. Therefore, tighter criteria (opt = tight keyword of Gaussian, i.e., max gradient 1.5 × 10−3 a.u., gradient RMS 1.0 × 10−5 a.u.; max step 6 × 10−5 a.u., step RMS 4 × 10−5 a.u.) have been always used. In the same vein, a pruned (99590) grid is used for numerical integrations (about 15
000 points per atom) and a pruned (75
302) grid (about 7000 points per atom) in the CPHF step, together with RMS convergence thresholds for the density matrix of 10−8 in the SCF step and 10−10 in the CPHF step.
![]() | ||
Fig. 2 Distributions of differences between computed (PCS2, DPCS3, or HPCS2) and semi-experimental equilibrium CH bond lengths for the molecules contained in the SE111 database. |
The average difference between DPCS3 and SE values is 0.0027 Å and the differences follow a Gaussian distribution with a standard deviation of 0.0007 Å, demonstrating the systematic nature of the error source. The largest positive bond length deviation (0.0052 Å) is observed for pyruvic acid and the largest negative deviation (−0.0003 Å) is observed for 1-chloro-2-fluoroethene. Therefore, a constant shortening of DPCS3 bond lengths rounded to 0.0025 Å provides results with spectroscopic accuracy for molecules with a large variety of bond patterns.
Finally, the HPCS2 variant is characterized by a much larger systematic error (0.0064 Å), and the noise is also higher, with a standard deviation of 0.0010 Å. These positive values indicate the tendency of HPCS2 to overestimate bond lengths. The largest deviation from SE values (0.0092 Å) is observed for the CH bond of pyruvic acid, while the smallest deviation (0.0032 Å) is observed for the CH bonds of the imidazole system.
In conclusion, only the PCS2 variant provides CH bond lengths with spectroscopic accuracy, but a systematic correction of DPCS3 values (as done in the BDPCS3 model) leads to comparable accuracy at DFT cost. On the other hand, much larger error bars affect HPCS2 values irrespective of systematic corrections. As the systematic overestimation of CH bond lengths at the DPCS3 level is more than twice the current BDPCS3 correction (0.0012 Å), the second term in square bracket of eqn (15) has been added in the new release of the BDPCS3 model. Although the BDPCS3 correction is possibly too small also for other XH bond lengths, the number of available SE structures containing those bonds is too small to permit an unbiased revision of the corresponding BDPCS3 corrections.
The first example is thiophene (see Fig. 3) for which very accurate equilibrium geometry, vibrational, and electronic corrections are available, together with the experimental ground state rotational constants of a large number of isotopologues.81 The main results of our analysis are collected in Table 3. Of course, optimization of all the geometrical parameters reproduces exactly the results of ref. 81. Already, BDPCS3 and PCS2 equilibrium geometries are in remarkable agreement with their SE counterpart, and optimization of the positions of heavy atoms (freezing the bond lengths and valence angles of hydrogen atoms at their PCS2 or BDPCS3 values) provides a final structure where all the geometrical parameters are within the error bars of their full SE counterparts. At the same time, the relative errors of the rotational constants are reduced by about one order of magnitude when going from purely theoretical equilibrium structures (either PCS2 or BDPCS3) to their counterparts issued from the refinement of the positions of heavy atoms. While optimization of the positions of hydrogen atoms further reduces the errors on the rotational constants, the effect is quite small and, above all, well within the error bar of vibrational corrections.
Parameter | r eq |
r
SEeq
![]() |
|||||
---|---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2c | BDPCS3d | PCS2e | Ref. 81 | |
a The symbol ✓ denotes structural parameters kept fixed at their guess value. b Vibrational and electronic corrections at the CCSD(T)/cc-pCVTZ level of theory taken from ref. 81. c Data taken from the molecular database70 (https://www.skies-village.it/databases/). d SE structure obtained starting from the BDPCS3 geometry. e SE structure obtained starting from the PCS2 geometry. f Maximum absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. g Mean absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. | |||||||
r(S–C1) | 1.7354 | 1.7137 | 1.7103 | 1.7102 | 1.7104(2) | 1.7103(1) | 1.71049(18) |
r(C1–C2) | 1.3705 | 1.3698 | 1.3680 | 1.3663 | 1.3654(3) | 1.3655(2) | 1.36564(31) |
r(C1–H1) | 1.0819 | 1.0795 | 1.0770 | 1.0765 | 1.0770✓ | 1.0765✓ | 1.07714(17) |
r(C2–H2) | 1.0847 | 1.0816 | 1.0791 | 1.0788 | 1.0791✓ | 1.0788✓ | 1.07856(14) |
ϑ(C4–S–C1) | 91.50 | 92.10 | 92.10 | 92.06 | 92.02(1) | 92.04(1) | 92.047(15) |
ϑ(S–C1–C2) | 111.60 | 111.54 | 111.54 | 111.61 | 111.630(7) | 111.632(5) | 111.608(16) |
ϑ(H1–C1–S) | 120.31 | 120.33 | 120.33 | 120.30 | 120.33✓ | 120.30✓ | 120.065(28) |
ϑ(H2–C2–C1) | 123.35 | 123.37 | 123.38 | 123.44 | 123.38✓ | 123.44✓ | 123.414(23) |
MAX%f | 2.0302 | 0.4273 | 0.0878 | 0.0290 | 0.0104 | 0.0042 | 0.0023 |
MAE%g | 1.8327 | 0.4161 | 0.0468 | 0.0368 | 0.0020 | 0.0014 | 0.0005 |
Parameter | r eq |
r
SEeq
![]() |
|||||||
---|---|---|---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2c | BDPCS3d | RD(BDPCS3)e | PCS2f | RD(PCS2)g | Ref. 82 | |
a The symbol ✓ denotes structural parameters kept fixed at their guess value. b Vibrational corrections at the fc-CCSD(T)/ANO0 level of theory taken from ref. 82. c Data taken from the molecular database70 (https://www.skies-village.it/databases/). d SE structure obtained starting from the BDPCS3 geometry, taken from ref. 82. e SE structure obtained starting from the BDPCS3 geometry, taken from ref. 82, and refined for the C–C and C–C–C classes. f SE structure obtained starting from the PCS2 geometry, taken from ref. 82. g SE structure obtained starting from the PCS2 geometry, taken from ref. 82, and refined for the C–C and C–C–C classes. h Parameter derived from the SE structure reported in ref. 82. i Maximum absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. j Mean absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. | |||||||||
r(C1–C2) | 1.5070 | 1.5016 | 1.4988 | 1.5005 | 1.4982(7) | 1.49892(2) | 1.4995(2) | 1.499790(4) | 1.49906(34) |
r(C2–C3) | 1.3521 | 1.3479 | 1.3466 | 1.3468 | 1.3465(6) | 1.34672(2) | 1.3461(2) | 1.346090(4) | 1.34635(35) |
r(C1–H1) | 1.1011 | 1.0969 | 1.0944 | 1.0940 | 1.0944✓ | 1.0944✓ | 1.0940✓ | 1.0940✓ | 1.09441(17) |
r(C2–H2) | 1.0841 | 1.0812 | 1.0787 | 1.0787 | 1.0787✓ | 1.0787✓ | 1.0787✓ | 1.0787✓ | 1.07865(24) |
r(C3–H3) | 1.0853 | 1.0820 | 1.0795 | 1.0795 | 1.0795✓ | 1.0795✓ | 1.0795✓ | 1.0795✓ | 1.07907(22) |
ϑ(C5–C1–C2)/2 | 51.64 | 51.59 | 51.59 | 51.54 | 51.56(3) | 51.5700(6) | 51.528(9) | 51.5258(5) | 51.541(15) |
ϑ(C1–C2–C3) | 109.07 | 109.24 | 109.24 | 109.32 | 109.34(4) | 109.2700(6) | 109.32(1) | 109.3048(5) | 109.330(24) |
ϑ(H2–C2–C1) | 124.02 | 124.11 | 124.11 | 124.11 | 124.11✓ | 124.11✓ | 124.11✓ | 124.11✓ | 124.01(11) |
ϑ(H3–C3–C2) | 126.27 | 126.14 | 126.14 | 126.13 | 126.13✓ | 126.13✓ | 126.13✓ | 126.13✓ | 126.082(60) |
![]() |
106.04 | 106.40 | 106.40 | 106.60 | 106.40✓ | 106.40✓ | 106.60✓ | 106.60✓ | 106.580(12)h |
MAX%i | 0.9744 | 0.3957 | 0.1970 | 0.1120 | 0.0100 | 0.0118 | 0.0033 | 0.0039 | 0.0772 |
MAE%j | 0.9744 | 0.2775 | 0.1050 | 0.0608 | 0.0045 | 0.0050 | 0.0014 | 0.0016 | 0.0660 |
The second example is that of cyclopentadiene (see Fig. 3), whose equilibrium SE structure has been recently obtained from the experimental rotational constants and fc-CCSD(T)/ANO0 vibrational corrections.82 Also in this case, refinement of all the geometrical parameters by the MSR code reproduces exactly the results of the original work (see Table 4).
Furthermore, the results obtained freezing the relative positions of hydrogen atoms at their BDPCS3 or PCS2 values are even better than those issued from the complete optimization, possibly because of the well known numerical instabilities often affecting the optimization of hydrogen atom positions.59
As a further example we have considered cis-hexatriene (see Fig. 4), which was previously investigated by Demaison and coworkers.83
The results collected in Table 5 show that PCS2 and BDPCS3 provide more accurate reference values with respect to those employed in the mixed regression approach of ref. 83. Once again, the relative positions of hydrogen atoms can be safely frozen at their computed values while optimizing the positions of heavy atoms in a reduced-dimensionality SE approach.
Parameter | r eq |
r
SE![]() |
|||||
---|---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2c | Ref. 83 | BDCPS3d | PCS2e | |
a The symbol ✓ denotes parameters kept fixed at their guess value. b Vibrational corrections at the HPCS2 level of theory. c Data taken from the molecular database70 (https://www.skies-village.it/databases/). d SE structure obtained starting from the BDPCS3 geometry. e SE structure obtained starting from the PCS2 geometry. f Maximum absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. g Mean absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. | |||||||
r(C1–C2) | 1.3457 | 1.3408 | 1.3996 | 1.3401 | 1.3400(4) | 1.3397(2) | 1.3394(2) |
r(C2–C3) | 1.4524 | 1.4533 | 1.4508 | 1.4525 | 1.4512(4) | 1.4511(2) | 1.4511(1) |
r(C3–C4) | 1.3568 | 1.3506 | 1.3492 | 1.3493 | 1.3488(9) | 1.3481(2) | 1.3492(2) |
r(C1–H1c) | 1.0883 | 1.0849 | 1.0824 | 1.0827 | 1.0826(3) | 1.0824✓ | 1.0827✓ |
r(C1–H1t) | 1.0859 | 1.0826 | 1.0801 | 1.0800 | 1.0798(2) | 1.0801✓ | 1.0800✓ |
r(C2–H2) | 1.0883 | 1.0850 | 1.0825 | 1.0823 | 1.0825(8) | 1.0825✓ | 1.0823✓ |
r(C3–H3) | 1.0897 | 1.0869 | 1.0844 | 1.0845 | 1.0841(5) | 1.0844✓ | 1.0845✓ |
ϑ(C1–C2–C3) | 123.52 | 122.91 | 122.91 | 122.75 | 122.77(3) | 122.92(2) | 122.86(2) |
ϑ(C2–C3–C4) | 126.80 | 126.40 | 126.40 | 126.22 | 126.28(2) | 126.29(1) | 126.27(2) |
ϑ(H1c–C1–C2) | 121.45 | 121.12 | 121.12 | 120.98 | 121.02(2) | 121.12✓ | 120.98✓ |
ϑ(H1t–C1–C2) | 121.69 | 121.51 | 121.51 | 121.48 | 121.46(2) | 121.51✓ | 121.48✓ |
ϑ(H2–C2–C1) | 118.57 | 118.78 | 118.78 | 118.90 | 118.91(9) | 118.78✓ | 118.90✓ |
ϑ(H3–C3–C4) | 117.70 | 117.91 | 117.91 | 118.05 | 118.02(7) | 117.91✓ | 118.05✓ |
MAX%f | 1.6068 | 0.4469 | 0.3385 | 0.1550 | 0.1536 | 0.0086 | 0.0052 |
MAE%g | 1.0007 | 0.2419 | 0.2585 | 0.0795 | 0.0817 | 0.0022 | 0.0020 |
The situation is slightly different in the case of azulene (see Fig. 4), as the optimization of all the geometrical parameters faces numerical instabilities, and only a stepwise procedure with some frozen parameters allowed obtaining a reasonable solution.55 Under such circumstances, freezing the positions of hydrogen atoms to their BDPCS3 equilibrium values guarantees the stability of the optimization process and provides very reliable results (see Table 6).
Parameter | r eq |
r
SEeq
![]() |
|||||||
---|---|---|---|---|---|---|---|---|---|
HPCS2c | DPCS3c | BDPCS3 | PCS2c | Ref. 55 | RD1(BDCPS3)d | RD2(BDCPS3)e | RD1(PCS2)f | RD2(PCS2)g | |
a The symbol ✓ denotes parameters kept fixed at their guess value. b Vibrational corrections at the HPCS2 level of theory. c Data taken from ref. 55. d SE structure obtained starting from the BDPCS3 geometry and optimizing the C–C–C classes. e SE structure obtained starting from the BDPCS3 geometry and optimizing the C–C and C–C–C classes. f SE structure obtained starting from the PCS2 geometry and optimizing the C–C–C classes. g SE structure obtained starting from the PCS2 geometry and optimizing the C–C and C–C–C classes. | |||||||||
r(C1–C9) | 1.4077 | 1.4035 | 1.4014 | 1.4012 | 1.402(4) | 1.396(2) | 1.401151(9) | 1.399(2) | 1.401200(7) |
r(C4–C10) | 1.3917 | 1.3883 | 1.3864 | 1.3872 | 1.379(7) | 1.389(3) | 1.386171(9) | 1387(3) | 1.387200(7) |
r(C4–C5) | 1.3985 | 1.3950 | 1.3930 | 1.3939 | 1.399(4) | 1.394(2) | 1.392811(9) | 1.395(2) | 1.393900(7) |
r(C5–C6) | 1.3987 | 1.3949 | 1.3929 | 1.3940 | 1.393(3) | 1.3933(7) | 1.392701(9) | 1.3936(6) | 1.394000(7) |
r(C1–H11) | 1.0837 | 1.0811 | 1.0786 | 1.0790 | 1.079✓ | 1.0786✓ | 1.0786✓ | 1.0790✓ | 1.0790✓ |
r(C2–H12) | 1.0849 | 1.0824 | 1.0799 | 1.0800 | 1.080✓ | 1.0799✓ | 1.0799✓ | 1.0800✓ | 1.0800✓ |
r(C4–H14) | 1.0899 | 1.0877 | 1.0852 | 1.0854 | 1.087(2) | 1.0852✓ | 1.0852✓ | 1.0854✓ | 1.0854✓ |
r(C5–H15) | 1.0877 | 1.0850 | 1.0825 | 1.0827 | 1.082(1) | 1.0825✓ | 1.0825✓ | 1.0827✓ | 1.0827✓ |
r(C6–H16) | 1.0889 | 1.0862 | 1.0837 | 1.0840 | 1.081(2) | 1.0837✓ | 1.0837✓ | 1.0840✓ | 1.0840✓ |
ϑ(C1–C9–C10) | 106.57 | 106.64 | 106.65 | 106.64 | 106.5(3) | 106.67(2) | 106.6653(2) | 106.65(2) | 106.6426(2) |
ϑ(C4–C5–C6) | 128.66 | 128.68 | 128.68 | 128.68 | 128.68✓ | 128.70(2) | 128.6990(2) | 128.68(2) | 128.6746(2) |
ϑ(C5–C4–C10) | 129.03 | 128.80 | 128.81 | 128.70 | 128.70✓ | 128.83(2) | 128.8252(2) | 128.69(2) | 128.6896(2) |
ϑ(C5–C6–C7) | 129.83 | 129.90 | 129.90 | 129.98 | 129.9(1) | 128.70(2) | 128.6989(2) | 128.68(2) | 128.6746(2) |
ϑ(C1–C2–H12) | 125.08 | 125.07 | 125.07 | 125.08 | 125.08✓ | 125.07✓ | 125.07✓ | 125.08✓ | 125.08✓ |
ϑ(C6–C5–H15) | 115.68 | 115.61 | 115.61 | 115.59 | 115.7(2) | 115.61✓ | 115.61✓ | 115.59✓ | 115.59✓ |
ϑ(C9–C1–H11) | 125.17 | 125.14 | 125.15 | 125.09 | 125.09✓ | 125.15✓ | 125.15✓ | 125.09✓ | 125.09✓ |
ϑ(C10–C4–H14) | 115.27 | 115.32 | 115.31 | 115.32 | 115.32✓ | 115.31✓ | 115.31✓ | 115.32✓ | 115.32✓ |
MAX%e | 0.9986 | 0.2996 | 0.1568 | 0.0701 | 0.0081 | 0.0066 | 0.0086 | 0.0062 | 0.0062 |
MAE%f | 0.7941 | 0.2341 | 0.0786 | 0.0366 | 0.0024 | 0.0020 | 0.0030 | 0.0019 | 0.0021 |
As previously noted in the Introduction, our computational protocol can be applied to open-shell species7 and charged systems,8 both of which are highly relevant in astrochemistry. However, currently available experimental data for such species are mostly limited to very small molecules, which fall outside the main scope of this study. As an illustrative exception, we include the phenyl radical, for which experimental data are available for a sufficient number of isotopologues to enable the determination of an accurate SE equilibrium structure (Table 7).84
Parametera | r eq | r SEeq | |||||
---|---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2b | HPCS2 Δvib | DPCS3 Δvib | CCSD(T) Δvibc | |
a The symbols Ci, Co, Cm, and Cp refer to ipso, ortho, meta, and para carbon atoms, respectively. b Data taken from the PCS141 database70 (https://www.skies-village.it/databases/). c Data taken from ref. 84. d DPCS3 value augmented by 0.005 Å (see main text for details). | |||||||
r(Ci–Co) | 1.3784 | 1.3675 | 1.3725d | 1.3745 | 1.3726(7) | 1.3728(7) | 1.3722(3) |
r(Co–Cm) | 1.4060 | 1.3951 | 1.4001d | 1.3999 | 1.3989(3) | 1.3990(3) | 1.3984(1) |
r(Cm–Cp) | 1.3988 | 1.3879 | 1.3929d | 1.3932 | 1.3936(3) | 1.3937(2) | 1.3931(1) |
r(Co–H) | 1.0866 | 1.0829 | 1.0804 | 1.0809 | 1.0799(2) | 1.0799(2) | 1.0808(1) |
r(Cm–H) | 1.0875 | 1.0844 | 1.0819 | 1.0820 | 1.0811(2) | 1.0813(2) | 1.0819(1) |
r(Cp–H) | 1.0865 | 1.0833 | 1.0806 | 1.0811 | 1.0802(2) | 1.0804(2) | 1.0809(1) |
ϑ(Co–Ci–Co) | 125.86 | 126.18 | 126.18 | 125.73 | 125.87(3) | 125.82(3) | 125.85(1) |
ϑ(Ci–Co–Cm) | 116.57 | 116.38 | 116.38 | 116.61 | 116.59(3) | 116.62(3) | 116.60(1) |
ϑ(Co–Cm–Cp) | 120.18 | 120.28 | 120.28 | 120.16 | 120.17(2) | 120.15(2) | 120.16(1) |
ϑ(Cm–Cp–Cm) | 120.63 | 120.57 | 120.57 | 120.66 | 120.62(2) | 120.63(2) | 120.63(1) |
ϑ(Ci–Co–H) | 122.44 | 122.40 | 122.40 | 122.36 | 122.97(3) | 122.96(3) | 122.90(1) |
ϑ(Co–Cm–H) | 119.63 | 120.13 | 120.13 | 120.19 | 120.41(3) | 120.44(3) | 120.39(1) |
ϑ(Cm–Cp–H) | 119.69 | 119.59 | 119.59 | 119.65 | 119.69(1) | 119.68(1) | 119.68(1) |
The data clearly show that vibrational corrections computed at the HPCS2 level lead to SE structures that closely match those obtained using the more expensive DPCS3 and even CCSD(T)/ANO0 approaches. In all cases, the results are well converged, and the final errors are comparable to those observed for closed-shell systems. However, the accuracy of the QC equilibrium geometries is lower than usual, due to the known moderate multi-reference character of the phenyl radical.84,85 In the case of PCS2, this effect primarily affects the Ci–Co bond length and can be mitigated through a localized correction, as previously demonstrated for other multi-reference systems.86,87 Conversely, the DPCS3 variant tends to underestimate all bond lengths involving atoms with significant spin density.7 Although a general reparametrization of the method for open-shell species is currently hindered by the lack of accurate benchmark geometries—either SE equilibrium structures or computational results of at least CCSDT(Q) quality10 for a sizeable set of radicals—reasonable results can be obtained for the phenyl radical by applying an empirical offset of 5 mÅ to the DPCS3 equilibrium values of all the CC bond lengths.
Cyanides and isocyanides have attracted considerable attention in the field of astrochemistry as proxies of their hydrocarbon analogues, which are microwave silent due to vanishing dipole moments. After systematic investigations of polyaromatic cyanides and isocyanides,88–93 branched carbon-chain analogues are now under extensive investigation.94,95 Among isocyanides, experimental rotational constants are available for all the isotopologues of tert-butyl isocyanide (see Fig. 5) involving single substitutions of heavy atoms.96
Due to the high symmetry of the molecule (C3v) the number of independent parameters is quite limited and, above all, only two values are needed for CH bond lengths and two others for HCC valence angles. The results obtained from different quantum chemical models are compared in Table 8 with those issued from partial SE optimizations in which the CH bond lengths and HCC valence angles are frozen at their PCS2 or BDPCS3 values.
Parameter | r eq |
r
SEeq
![]() |
||||
---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2 | BDPCS3c | PCS2d | |
a The symbol ✓ denotes parameters kept fixed at their guess value. b Vibrational corrections at the HPCS2 level of theory. c SE structure obtained starting from the BDPCS3 geometry. d SE structure obtained starting from the PCS2 geometry. | ||||||
r(C1–N1) | 1.1751 | 1.1730 | 1.1707 | 1.1712 | 1.1703(4) | 1.1703(4) |
r(N1–C2) | 1.4429 | 1.4413 | 1.4386 | 1.4406 | 1.439(1) | 1.439(1) |
r(C2–C3) | 1.5379 | 1.5289 | 1.5259 | 1.5255 | 1.5249(4) | 1.5251(4) |
r(C3–H1) | 1.0945 | 1.0911 | 1.0885 | 1.0887 | 1.0885✓ | 1.0887✓ |
r(C3–H3) | 1.0952 | 1.0919 | 1.0894 | 1.0895 | 1.0894✓ | 1.0895✓ |
ϑ(N1–C2–C3) | 108.08 | 107.98 | 107.98 | 108.02 | 108.02(5) | 108.02(5) |
ϑ(C2–C3–H1) | 110.67 | 110.58 | 110.58 | 110.52 | 110.58✓ | 110.52✓ |
ϑ(C2–C3–H3) | 109.57 | 109.47 | 109.47 | 109.40 | 109.47✓ | 109.40✓ |
φ(N1–C2–C3–H1) | 60.18 | 60.19 | 60.19 | 60.19 | 60.19✓ | 60.19✓ |
MAX%c | 1.7026 | 0.6613 | 0.2224 | 0.0999 | 0.0004 | 0.0004 |
MAE%d | 1.1329 | 0.3332 | 0.0997 | 0.0889 | 0.0002 | 0.0002 |
The remarkable accuracy of the BDPCS3 and PCS2 equilibrium geometries is well evidenced by the small deviations for all the available rotational constants, but in this case, the sought spectroscopic accuracy is reached only at the PCS2 level. Optimization of the relative positions of heavy atoms by a reduced-dimensionality SE approach leads to a slight increase in the N1C2 bond length and a corresponding shortening of the C2C3 bond length, while the C1N1 bond length and the N1C2C3 valence angle are only negligibly affected. Although this trend points out some residual problems in the balanced description of inductive and delocalization effects at the BDPCS3 level, the final SE equilibrium structures employing either PCS2 or BDPCS3 values for the hydrogen positions are nearly indistinguishable and reproduce all the available rotational constants with very small residual errors. Therefore, those geometrical parameters represent significant improvements with respect to those obtained for the so-called substitution structure (rs) in ref. 96.
Another interesting case is that of 1,4-dehydronaphtalene (see Fig. 5), which has been recently investigated.16 The results collected in Table 9 show that the deviations of BDPCS3 results are one order of magnitude smaller than their B3LYP counterparts reported in ref. 16. Actually, only a slight improvement is brought by the further optimization of heavy atom positions by the SE approach.
Parameter | r eq |
r
SEeq
![]() |
|||
---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | RD1(BDCPS3)c | RD2(BDCPS3)d | |
a The symbol ✓ denotes parameters kept fixed at their guess value. b Vibrational corrections at the HPCS2 level of theory. c SE structure obtained starting from the BDPCS3 geometry and optimizing the C–C–C class and C–C bonds separately. d SE structure obtained starting from the BDPCS3 geometry and optimizing the C–C and C–C–C classes. e Maximum absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. f Mean absolute percentage error in the rotational constants (for all isotopic species) of the refined structure relative to their experimental counterparts. | |||||
r(C1–C2) | 1.5019 | 1.4973 | 1.4945 | 1.495(14) | 1.49445(3) |
r(C2–C3) | 1.5155 | 1.5100 | 1.5071 | 1.507(20) | 1.50708(3) |
r(C3–C4) | 1.4035 | 1.4007 | 1.3986 | 1.398(22) | 1.39852(3) |
r(C4–C5) | 1.3924 | 1.3877 | 1.3857 | 1.386(15) | 1.38567(3) |
r(C10–C1) | 1.3357 | 1.3331 | 1.3320 | 1.332(26) | 1.33192(3) |
r(C1–H1) | 1.0889 | 1.0859 | 1.0834 | 1.0834✓ | 1.0834✓ |
r(C2–H2) | 1.1013 | 1.0973 | 1.0948 | 1.0948✓ | 1.0948✓ |
r(C4–H4) | 1.0886 | 1.0860 | 1.0835 | 1.0835✓ | 1.0835✓ |
r(C5–H5) | 1.0868 | 1.0838 | 1.0813 | 1.0813✓ | 1.0813✓ |
ϑ(C1–C2–C3) | 114.32 | 114.12 | 114.12 | 114.1(3) | 114.120(1) |
ϑ(C2–C3–C4) | 118.95 | 118.80 | 118.80 | 118.8(3) | 118.796(1) |
ϑ(C3–C4–C5) | 121.41 | 121.39 | 121.39 | 121.4(3) | 121.387(1) |
ϑ(C4–C5–C6) | 119.46 | 119.47 | 119.47 | 119.5(3) | 119.475(1) |
ϑ(H1–C1–C2) | 116.54 | 116.83 | 116.83 | 116.83✓ | 116.83✓ |
ϑ(H2–C2–C3) | 109.32 | 109.22 | 109.22 | 109.22✓ | 109.22✓ |
ϑ(H4–C4–C5) | 119.69 | 119.72 | 119.72 | 119.72✓ | 119.72✓ |
ϑ(H5–C5–C6) | 120.39 | 120.37 | 120.37 | 120.37✓ | 120.37✓ |
φ(H2–C2–C3–C4) | −57.09 | −57.24 | −57.24 | −57.24✓ | −57.24✓ |
MAX%e | 1.0428 | 0.3329 | 0.0211 | 0.0187 | 0.0186 |
MAE%f | 0.9437 | 0.3142 | 0.0146 | 0.0103 | 0.0102 |
In this connection, it is remarkable that very accurate results are obtained by collecting all carbon atoms in a single class (i.e., optimizing only two parameters, one for bond lengths and one for valence angles). In any case, the rotational constants of all the available isotopologues are reproduced with remarkable accuracy (see Table 10).
Isotopic species | A | B | C |
---|---|---|---|
Parent | 0.0095 | 0.0159 | 0.0074 |
13C2/9 | 0.0091 | 0.0162 | 0.0069 |
13C1/10 | 0.0094 | 0.0159 | 0.0074 |
13C4/7 | 0.0101 | 0.0157 | 0.0077 |
13C5/6 | 0.0093 | 0.0157 | 0.0075 |
13C3/8 | 0.0096 | 0.0155 | 0.0074 |
To further evaluate the performance of the methodology, we have considered a flexible molecule, namely the most stable conformer of proline (see Fig. 6).
The results collected in Table 11 show that, as expected, the errors are larger than for rigid molecules, possibly due to both limitations of the VPT2 model and of the underlying quantum chemical computations. In any case, the accuracy of the final BDPCS3 geometrical parameters is comparable to that of their PCS2 counterparts, while the computational cost is reduced by at least two orders of magnitude. Furthermore, the reduced-dimensionality SE equilibrium structures obtained by freezing the positions of hydrogen atoms at their BDPCS3 or PCS2 values do not show any numerical instability, while delivering an accuracy comparable (if not better) than that of the structure obtained in ref. 97 through a mixed regression approach based on a CCSD(T) geometry optimization.
Parameter | r eq |
r
SEeq
![]() |
|||||
---|---|---|---|---|---|---|---|
HPCS2 | DPCS3 | BDPCS3 | PCS2 | Ref. 97 | RD(BDCPS3)c | RD (PCS2)d | |
a The symbol ✓ denotes parameters kept fixed at their guess value. b Vibrational corrections at the HPCS2 level of theory. c SE structure obtained starting from the BDPCS3 geometry. d SE structure obtained starting from the PCS2 structure. | |||||||
r(N–C1) | 1.4845 | 1.4788 | 1.4759 | 1.4772 | 1.4785(18) | 1.475(6) | 1.478(2) |
r(C1–C2) | 1.5449 | 1.5355 | 1.5325 | 1.5324 | 1.5307(18) | 1.532(6) | 1.533(2) |
r(C1–C5) | 1.5425 | 1.5359 | 1.5329 | 1.5338 | 1.5321(18) | 1.532(6) | 1.534(2) |
r(C2–C3) | 1.5364 | 1.5291 | 1.5262 | 1.5271 | 1.5262(17) | 1.525(6) | 1.528(2) |
r(C3–C4) | 1.5321 | 1.5249 | 1.5220 | 1.5231 | 1.5215(15) | 1.521(6) | 1.524(2) |
r(C5–O1) | 1.2113 | 1.2043 | 1.2016 | 1.2019 | 1.2027(18) | 1.2016✓ | 1.2019✓ |
r(C5–O2) | 1.3405 | 1.3365 | 1.3338 | 1.3342 | 1.3343(19) | 1.3338✓ | 1.3342✓ |
r(N–H1) | 1.0152 | 1.0100 | 1.0089 | 1.0085 | 1.0078(11) | 1.0089✓ | 1.0085✓ |
r(C1–H2) | 1.0951 | 1.0919 | 1.0894 | 1.0897 | 1.0891(23) | 1.0894✓ | 1.0897✓ |
r(C2–H3) | 1.0963 | 1.0928 | 1.0903 | 1.0907 | 1.0903(23) | 1.0903✓ | 1.0907✓ |
r(C2–H4) | 1.0923 | 1.0888 | 1.0863 | 1.0866 | 1.0865(23) | 1.0863✓ | 1.0866✓ |
r(C3–H5) | 1.0969 | 1.0935 | 1.0910 | 1.0915 | 1.0909(23) | 1.0915✓ | 1.0915✓ |
r(C3–H6) | 1.0939 | 1.0899 | 1.0874 | 1.0875 | 1.0872(23) | 1.0874✓ | 1.0875✓ |
r(C4–H7) | 1.1003 | 1.0961 | 1.0936 | 1.0937 | 1.0931(23) | 1.0936✓ | 1.0937✓ |
r(C4–H8) | 1.0946 | 1.0908 | 1.0883 | 1.0884 | 1.0880(23) | 1.0883✓ | 1.0884✓ |
r(O2–H2) | 0.9965 | 0.9844 | 0.9833 | 0.9821 | 0.9861(23) | 0.9833✓ | 0.9821✓ |
ϑ(N–C1–C2) | 105.57 | 105.71 | 105.71 | 105.82 | 105.79(13) | 105.5(5) | 105.8(1) |
ϑ(N–C1–C5) | 109.82 | 110.31 | 110.31 | 110.21 | 110.09(12) | 110.31✓ | 110.21✓ |
ϑ(C1–C2–C3) | 102.96 | 102.39 | 102.39 | 102.23 | 102.304(80) | 102.2(5) | 102.2(1) |
ϑ(C1–C5–O1) | 122.65 | 122.60 | 122.60 | 122.50 | 122.41(21) | 122.60✓ | 122.50✓ |
ϑ(C1–C5–O2) | 113.86 | 114.00 | 114.00 | 114.04 | 114.17(21) | 114.00✓ | 114.04✓ |
ϑ(C2–C3–C4) | 102.53 | 102.17 | 102.17 | 102.23 | 102.121(81) | 102.0(5) | 102.2(1) |
ϑ(C1–N–H1) | 112.44 | 111.79 | 111.79 | 111.53 | 111.84(19) | 111.79✓ | 111.53✓ |
ϑ(N–C1–H2) | 112.07 | 111.78 | 111.78 | 111.94 | 112.03(33) | 111.78✓ | 111.94✓ |
ϑ(C1–C2–H3) | 109.55 | 109.70 | 109.70 | 109.73 | 109.87(34) | 109.70✓ | 109.73✓ |
ϑ(C1–C2–H4) | 111.81 | 111.71 | 111.71 | 111.60 | 111.69(34) | 111.71✓ | 111.60✓ |
ϑ(C2–C3–H5) | 110.46 | 110.29 | 110.29 | 110.21 | 110.17(36) | 110.29✓ | 110.21✓ |
ϑ(C2–C3–H6) | 113.15 | 113.39 | 113.39 | 113.43 | 113.54(34) | 113.39✓ | 113.43✓ |
ϑ(C3–C4–H7) | 110.19 | 110.00 | 110.00 | 109.92 | 109.92(34) | 110.00✓ | 109.92✓ |
ϑ(C3–C4–H8) | 113.29 | 113.26 | 113.26 | 113.21 | 113.23(34) | 113.26✓ | 113.21✓ |
ϑ(C5–O2–H9) | 104.02 | 103.70 | 103.70 | 103.72 | 102.88(31) | 103.70✓ | 103.72✓ |
φ(N–C1–C2–C3) | 23.94 | 26.11 | 26.11 | 26.82 | 26.47(18) | 26.11✓ | 26.82✓ |
φ(N–C1–C5–O1) | −179.98 | −177.87 | −177.87 | −176.73 | −177.36(35) | −177.87✓ | −176.73✓ |
φ(N–C1–C5–O2) | 0.30 | 2.11 | 2.11 | 3.24 | 2.50(31) | 2.11✓ | 3.24✓ |
φ(C1–C2–C3–C4) | −38.09 | −39.91 | −39.91 | −40.25 | −40.23(17) | −39.91✓ | −40.25✓ |
φ(N–C1–C2–H3) | −93.17 | −90.62 | −90.62 | −89.81 | −90.76(56) | −90.62✓ | −89.81✓ |
φ(N–C1–C2–H4) | 147.00 | 149.05 | 149.05 | 149.64 | 148.91(56) | 149.05✓ | 149.64✓ |
φ(C1–C2–C3–H5) | 79.38 | 77.17 | 77.17 | 76.73 | 77.19(53) | 77.17✓ | 76.73✓ |
φ(C1–C2–C3–H6) | −159.42 | −161.26 | −161.26 | −161.59 | −161.58(53) | −161.26✓ | −161.59✓ |
φ(C1–C5–O2–H9) | 0.21 | −0.64 | −0.64 | −1.02 | −0.41(56) | −0.64✓ | −1.02✓ |
φ(C2–C3–C4–H7) | −80.97 | −79.92 | −79.92 | −79.91 | −79.57(55) | −79.92✓ | −79.91✓ |
φ(C2–C3–C4–H8) | 157.67 | 158.75 | 158.75 | 158.70 | 159.03(56) | 158.75✓ | 158.70✓ |
φ(C5–C1–N–H1) | −114.12 | −117.67 | −117.67 | −119.35 | −119.13(18) | −117.67✓ | −119.35✓ |
φ(H1–N–C1–C2) | 125.65 | 122.38 | 122.38 | 120.94 | 121.09(17) | 122.38✓ | 120.94✓ |
φ(H1–N–C1–H2) | 4.10 | 0.44 | 0.44 | −1.25 | 0.01(56) | 0.44✓ | −1.25✓ |
The performance of the proposed computational strategy is illustrated by the rotational spectrum of progesterone (see Fig. 7), which has been recently reported and interpreted in terms of DFT computations neglecting vibrational corrections.15
The results collected in Table 12 show that the BDPCS3 equilibrium geometry in conjunction with HPCS2 vibrational corrections provides results about one order of magnitude more accurate than the computations reported in ref. 15, reaching spectroscopic accuracy with reasonable computational efforts. Optimization of the average value of the bond lengths between heavy atoms (by means of a single offset) with respect to the experimental rotational constants of the parent species further reduces the error, but already the starting values can be considered fully satisfactory.
HPCS2 | DPCS3 | BDPCS3 | Fita | SEb | |
---|---|---|---|---|---|
a Rotational constants of the equilibrium geometry, obtained by refining the C–C and C–O bond lengths as a single class using the parent species. b SE rotational constants of the parent species, obtained by combining the experimental ground-state data from ref. 15 with vibrational corrections at the HPCS2 level of theory. c Vibrational corrections to the rotational constants of the parent species at the HPCS2 level of theory. | |||||
B eqa | 677.71(−3.14c) | 676.72 | 679.45 | 679.68 | 679.52 |
B eqb | 130.61(−1.37c) | 132.12 | 132.55 | 132.60 | 132.62 |
B eqc | 119.85(−1.34c) | 121.37 | 121.77 | 121.81 | 121.79 |
MAE% | 0.884 | 0.378 | 0.028 | 0.018 | — |
A similar accuracy has been obtained with analogous methods for several other hormones and biomolecule building blocks18 with a cost comparable to that of standard DFT approaches. Therefore, the implementation of the whole computational protocol in a black box and user-friendly tool paves the way for its widespread use also by experiment-oriented researchers.
The workflow has been tested on diverse molecular systems, demonstrating robustness and accuracy for both semi-rigid and flexible species. The phenyl radical illustrates the applicability to open-shell systems, while proline exemplifies the treatment of flexibility within the VPT2 framework, provided that vibrational anharmonicity remains moderate.
A key strength of the approach is its ability to bridge the gap between theoretical predictions and experimental observations, enabling the extraction of equilibrium structures from vibrationally averaged spectroscopic data. This is particularly valuable in gas-phase studies, where intrinsic stereoelectronic properties can be analyzed without interference from environmental effects.
Overall, the proposed methodology enhances both the accuracy and accessibility of structural predictions, outperforming standard computational approaches at comparable cost. It offers a practical and general solution for the routine interpretation of high-resolution spectra, with applications in astrochemistry, atmospheric chemistry, and molecular pharmacology, and lays the groundwork for extending spectroscopic-quality structural analysis to increasingly complex molecular systems.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5cp01611h |
This journal is © the Owner Societies 2025 |