Open Access Article
Bruno A. Piscelli
ab,
Tiger Swithenbank-Michel
b,
Rodrigo A. Cormanich
a,
David O’Hagan
b and
Michael Bühl
*b
aUNICAMP, Universidade Estadual de Campinas, Instituto de Química, Rua Monteiro Lobato 270, 13083-862, Campinas, São Paulo, Brazil
bEaStCHEM, School of Chemistry, University of St Andrews, St Andrews, Fife KY16 9ST, UK. E-mail: buehl@st-andrews.ac.uk
First published on 7th October 2025
A series of GFN-xTB methods were benchmarked against high-level DFT and ab initio thermodynamic data for a set of conformational equilibria and driving forces for the formation of non-covalent complexes involving the Janus-face fluorocyclohexanes based on the all-syn-C6FnR12−n motif (n = 3, 5, 6). When used alone, GFN methods showed moderate performance, with mean absolute errors (MAEs) from the high-level benchmarks of approximately 2.5 kcal mol−1 for conformational equilibria and ∼5.0 kcal mol−1 for molecular complexes. However, applying DFT-level single-point energy corrections on GFN-optimised geometries significantly improved the accuracy, reducing MAEs to ∼0.2 and ∼1.0 kcal mol−1 for the same systems. This hybrid approach achieves DFT-D3-level accuracy while maintaining a low computational cost, offering up to a 50-fold reduction in computational time. As such, it provides a new cost-efficient and accurate tool for the computational modeling of Janus-face systems. An illustrative application to a flexible system, C6F5H6O2C(CH2)3NHCOC6H2(OR)3, is reported (R = alkyl), highlighting the relative stabilities of folded and extended forms and their supramolecular assembly into helical stacks.
However, as a class, PFAS are non-biodegradable and are often referred to as ‘forever chemicals’ due to their long half-lives.12–16 They are also highly resistant to metabolic degradation, often breaking down into other stable PFAS rather than innocuous products.17,18 As a result, their widespread use has raised concerns due to their high persistence in the environment,19–21 propelling the search for new classes of organo-fluorine motifs as an alternative to the existing PFAS. In recent years, selectively fluorinated all-syn cyclohexanes with all of the fluorine atoms on one face of the cyclohexane ring have emerged as extraordinarily polar aliphatics, with all-syn-1,2,3,4,5,6-hexafluorocyclohexane 1 (Fig. 1(A)), first prepared by O’Hagan et al., being a prominent representative of this new class of compounds.22 In this molecule, three C–F bonds are co-aligned in an axial orientation. This results in a large dipole moment (calculated at 6.2 D), which polarizes the molecule. Also, the strong electron-withdrawing nature of the fluorine atoms polarizes the hydrogens, rendering them highly electropositive and leading to an unprecedented electrostatic profile for an aliphatic molecule, with cyclohexane 1 exhibiting a negatively charged fluorine face and a positively charged hydrogen face, as illustrated in Fig. 1. These aspects result in high thermal stability with decomposition, rather than melting, occurring above 200 °C.
Santschi and Gilmour aptly coined the term ‘Janus Face’ cyclohexanes to describe this dual characteristic, referencing the Roman god Janus, known for his two opposing faces.23 This unique Janus characteristic induces highly organized supramolecular assembly around this motif, with molecular stacking of such Janus rings enabled by intermolecular interactions of the fluorine faces with hydrogen faces.24–26
Despite their unique properties and potential applicability, to our knowledge only one report, by Pavan and Delius et al. in 2021, has used the Janus characteristic of these cyclohexanes to induce and control dynamic supramolecular assembly.27 In our opinion, a computational method able to deal with the large systems used in supramolecular applications (in the order of thousands of atoms), while maintaining a good compromise between computational time and accuracy, would help facilitate the use of these fluorinated rings in supramolecular chemistry. Therefore, we propose a set of systems divided into 4 groups (Fig. 1) to assess the accuracy of cheap semiempirical DFT variants, specifically tight-binding GFN1-xTB and GFN2-xTB methods28,29 as well as xTB's universal forcefield GFN-FF30 on predicting: (1) conformational preferences; (2) non-covalent assemblies; and (3) ion complexation thermodynamics. The low computational cost and large applicability of the xTB methods make them good options for this case. Finally, the molecules used by Pavan and Delius et al.27 are used as a proof of concept for a ‘real’ supramolecular application of these methods (Group 4).
For the conformational analysis of compound 10, a conformational search was carried out using the iterative-static metadynamics algorithm as implemented in the CREST 2.11.2 software package.37,38 The GFN-FF, GFN1, and GFN2 methods were employed both for the conformational sampling and subsequent reoptimisation steps. Preliminary calculations revealed that the majority of conformational flexibility in 10 arose from the large alkyl chains appended to the aromatic core. Consequently, most of the conformers identified during the sampling process reflected variations in the orientation of these chains, with minimal changes to the central scaffold. Given that the primary objective was to assess the geometry of the molecular core, the long alkyl substituents were truncated and replaced with methyl groups (10′, R = Me in Fig. 1). Under these conditions, each xTB method yielded approximately 1000 conformers. These structures were then ranked using a principal component analysis (PCA) and k-means approach, also implemented in CREST, to identify the 100 most representative conformers per method. The 4 conformers qualitatively resembling the relevant structures I–IV of 10 determined through well-tempered metadynamics calculations, as reported by Pavan and Delius et al.,27 were selected by visual inspection and subsequently reoptimised at the corresponding theoretical level. Finally, harmonic frequency calculations were performed at the xTB levels to obtain the relative Gibbs free energies for each conformer.
Overall, force-field calculations using GFN-FF, which was originally designed to describe intermolecular (and not intramolecular) interactions in supramolecular assemblies and large molecules,30 do not provide reasonable results compared to reference DFT calculations, exhibiting a mean absolute error (MAE) near 3 kcal mol−1 across compounds 2, 3, and 4. The only exception is the GFN1//GFN-FF composite method, which achieves an MAE of 0.94 kcal mol−1. Although single-point energy calculations at the DFT-D3 level (DFT-D3//GFN-FF) should theoretically yield better results than those at GFN1//GFN-FF, this is not the case. Additionally, the second-generation semi-empirical xTB method, GFN2, also exhibits poor performance, with an MAE of 2.51 kcal mol−1 when associated with GFN-FF geometry optimisations (GFN2//GFN-FF). Thus, the relatively good accuracy of the GFN1//GFN-FF approach likely arises from error cancellation between the two methods, rather than true chemical accuracy. Importantly, the F⋯F contact distances predicted by GFN-FF are consistently shorter (2b and 4b) or longer (3b) compared to those obtained with GFN1 and GFN2, which in turn provide very similar geometries (Fig. S1). This suggests that GFN1 treats F⋯F contacts in a less distance-dependent manner, such that the energetic penalties associated with overly short or long contacts in GFN-FF geometries are mitigated when single-point energies are evaluated at the GFN1 level. In contrast, for the DFT-D3//GFN-FF and GFN2//GFN-FF composite methods, the computed energies appear to be more sensitive to the precise F⋯F distances. Since the fluorine atomic charges predicted by GFN-FF and GFN1 are generally similar, the observed error cancellation most likely originates from the dispersive contribution included in GFN1 rather than from electrostatics. Moreover, the poor performance of the GFN1//GFN-FF method for the conformational equilibrium of 3 suggests that this approach may not adequately capture the steric destabilization associated with hydrophobic alkyl–alkyl contacts and may be sensitive to the system studied, highlighted by the relatively high standard deviation across the series.
In contrast, the GFN1 and GFN2 methods were benchmarked on largely the same test sets for conformational equilibria in their original papers,28,29 and both display strong overall performance, reflecting their reliable description of intramolecular interactions. Moreover, both methods deliver geometries of sufficiently high quality that the corresponding composite methods, DFT-D3//GFN1 and DFT-D3//GFN2, perform remarkably well, with MAEs of 0.37 and 0.20 kcal mol−1, respectively. Notably, the DFT-D3//GFN2 approach even matched full DFT-D3 calculations for the studied systems (MAE = 0.20 kcal mol−1), highlighting that these hybrid strategies are cost-effective options for computing relative Gibbs free energies of Janus cyclohexane conformational equilibria without compromising chemical accuracy.
Importantly, the methods GFN-FF, GFN2//GFN-FF, DFT-D3//GFN-FF, and GFN2 failed to correctly predict the preferred tri-axial conformation in 4b, underscoring significant limitations in their ability to accurately capture conformational energy differences in this case (see Table S1 for further details).
Molecular complex 5a was chosen to study how the xTB methods deal with hydrophobic CH–π interactions between hexafluorocyclohexane 1 and benzene.40 To our great surprise, electronic energy corrections at DFT-D3CP on GFN structures continues to perform well compared to the pure DFT-D3CP calculations in Group 2, in light of their good performance in Group 1. DFT-D3CP, DFT-D3CP//GFN1 and DFT-D3CP//GFN2 methods exhibit AEs of 0.05, 0.46 and 0.65 kcal mol−1, respectively, on predicted binding ΔG's, a strong performance compared to the high-level wave function reference values (DLPNO-CCSD(T)/CBS//B3LYP-D3CP/def2-TZVP). Pure xTB methods also perform well for the binding Gibbs free energy of complex 5a, especially GFN1, with an AE of 0.06 kcal mol−1, and GFN-FF and GFN2 maintaining strong performance with AEs of 0.93 and 0.52 kcal mol−1, respectively. However, when the electronic energy is calculated at the semi-empirical or DFT level over GFN-FF geometries, the absolute errors progressively move away from the reference values when we go from GFN1, GFN2 and DFT-D3CP, in all cases with AEs over 1 kcal mol−1.
It is noteworthy that the global minimum conformer for the 1-benzene complex is not C3v-symmetric 5a, but the “slipped-sandwich” complex 5b instead, which is more stable than 5a by 0.47 kcal mol−1 according to B3LYP-D3/def2-TZVP calculations in terms of ΔG, or by 2.30 kcal mol−1 when CP corrections are applied. This sheds light on an important limitation of the GFN methods, which may be insensitive to distinct conformers which are very close in energy and separated by small energy wells in the potential energy surface (PES). Considering that geometry optimisations in GFN-FF, GFN1 and GFN2 all led to the c3v complex, even when the starting structure was an already DFT pre-optimised slipped sandwich geometry, most probably the latter is not even an energy minimum on the PES of the xTB methods. However, considering both interaction modes are very close in energy, this limitation is not critical in this case and should not imply major concerns when one chooses one over another method.
The dimeric arrangement of hexafluorocyclohexane 1, complex 6, was chosen to assess the ability of the computational methods to estimate the energy of the strong CFax–HaxC contacts that are electrostatic in nature and drive supramolecular arrangements in Janus-face cyclohexanes. In this case, the absolute errors of DFT-D3CP, DFT-D3CP//GFN1 and DFT-D3CP//GFN2 methods are at 0.22, 2.15 and 1.32 kcal mol−1. Pure GFN-FF and GFN1 calculations exhibit moderate AEs of 2.78 and 1.32 kcal mol−1, respectively, while GFN2 performs exceptionally better, with an AE of only 0.03 kcal mol−1. Semi-empirical single-point energy corrections on GFN-FF geometries greatly improve chemical accuracy in GFN1//GFN-FF to an AE of 1.07 kcal mol−1, best performance of the series, while single-point energies at DFT-D3CP and GFN2 provide the worst performing methods among the composite methods with GFN-FF, with absolute errors higher than 7 kcal mol−1.
Molecular complex 7 is similar to 6, with an additional three ethyl groups that increase molecular complexity and the conformational degrees of freedom of the system. In this case, the DFT-D3CP results were taken as a reference, and DFT-D3CP single points on GFN-FF structures yield an absolute error of 2.40 kcal mol−1. The pure GFN methods exhibit varied performance, with GFN2 yielding the highest AE of 7.23 kcal mol−1, whereas GFN1 performs slightly better with an AE of 5.91 mol−1. Hybrid approaches involving DFT-D3CP corrections on GFN geometries perform considerably better, with DFT-D3CP//GFN1 and DFT-D3CP//GFN2 achieving AEs of 0.77 and 0.39 kcal mol−1, respectively. Pure GFN-FF calculations perform considerably worse than GFN2, with an AE of 13.10 kcal mol−1, while composite methods utilizing GFN//GFN-FF geometries show improvements, both for GFN1//GFN-FF (AE = 4.87 kcal mol−1) and GFN2//GFN-FF (AE = 5.08 kcal mol−1).
In order to study possible cooperativity effects, we studied the trimeric arrangement of all-syn 1,3,5-trifluorocyclohexane 8, where fluorine atoms occupy the axial positions of the cyclohexane and engage in strong CFax–HaxC electrostatic interactions, leading to supramolecular assembly. The reference Gibbs free energy of trimerization is ΔG = 8.94 kcal mol−1 relative to three separated monomers 2a at the DLPNO-CCSD(T)/CBS//B3LYP-D3CP/def2-TZVP level. The results reveal that DFT-D3CP calculations yield an AE of 4.77 kcal mol−1, while pure GFN methods exhibit a broad range of performances, with GFN1 yielding an impressive AE of 0.15 kcal mol−1, whereas GFN2 shows a higher error of 5.70 kcal mol−1, indicating worse agreement with high-level reference calculations. Hybrid approaches incorporating DFT-D3CP corrections over GFN structures also demonstrate strong performance, particularly for DFT-D3CP//GFN1 and DFT-D3CP//GFN2, which achieve AEs of 1.84 and 2.14 kcal mol−1, respectively. Interestingly, GFN-FF alone produces a high AE of 6.83 kcal mol−1, while the composite methods relying on GFN//GFN-FF geometries display contrasting trends. While GFN1//GFN-FF yields an impressively low AE of 0.31 kcal mol−1, GFN2//GFN-FF performs considerably worse with an AE of 7.52 kcal mol−1, indicating strong method-dependent variations. Corrections at the DFT level over GFN-FF structures result in an AE of 8.04 kcal mol−1 for DFT-D3CP//GFN-FF, once again suggesting that the great accuracy of GFN1//GFN-FF is most probably due to the cancellation of errors between both methods.
The computational methods were further tested against a more structurally complex system, comprising the dimer of an all-syn pentafluorinated bis-cyclohexyl compound with a long alkyl linker (complex 9). In this case, in addition to the strong electrostatically-driven CFax–HaxC interactions, hydrophobic contacts between the alkyl chains may also play a role in complexation. No conformational analysis was undertaken; the alkyl chains were constructed in linear all-trans conformations, mimicking the X-ray diffraction data.24 Due to the system size, the reference value in this case was taken as calculations at the B3LYP-D3CP/def2-TZVP theoretical level (ΔG = −1.86 kcal mol−1 relative to two separated monomers). Upon optimisation, noticeable intermolecular contacts formed between the H atoms of the alkyl chains (with the H⋯H contact down to 2.395 Å at the DFT-D3CP level, see Fig. S2). Pure GFN methods exhibit a wide range of errors, with GFN1 yielding a relatively high AE of 4.62 kcal mol−1, whereas GFN2 performs substantially better, achieving an AE of 2.35 kcal mol−1, close to DFT-D3 accuracy. Hybrid approaches incorporating DFT-D3CP corrections on GFN geometries display varied performance. DFT-D3CP//GFN1 and DFT-D3CP//GFN2 yield AEs of 2.20 and 0.45 kcal mol−1, respectively, reinforcing the strong performance of GFN2 geometries in this case. GFN-FF alone exhibits a high AE of 7.05 kcal mol−1, making it one of the least reliable methods for this system. However, composite methods using GFN-FF geometries display interesting trends. GFN1//GFN-FF results in an AE of 5.83 kcal mol−1, while GFN2//GFN-FF delivers a remarkably low AE of just 0.32 kcal mol−1. DFT-D3CP corrections on GFN-FF structures yield a very respectable AE of 0.13 kcal mol−1.
In general, basis set superposition error corrections proved to be rather important in systems from Group 2 and consistently improved the results when either incorporated directly in geometry optimisation at the DFT-D3CP level or even when incorporated only in single-point energy corrections, as in the composite approaches with the GFN methods. The only exception is the binding Gibbs free energy of system 5a, in which inclusion of CP correction led to a small overestimation (see Fig. S3 for further details). Pure GFN-FF and GFN2 calculations were the worst performers for molecules in Group 2, with MAEs of 5.12 and 2.64 kcal mol−1, respectively. Even though single-point energy corrections obtained at higher level semi-empirical or DFT-D3 on GFN-FF geometries improved the results against the reference values in some cases, the results are not consistent and appear to arise from random error-cancellation interactions between the methods and proved to be very system-dependent. Surprisingly, the best overall performing method was achieved by composite methods based on GFN2, with an MAE of 0.82 kcal mol−1, virtually the same as full DFT-D3CP calculations (MAE = 0.84 kcal mol−1). Strong performances were obtained by the hybrid approach DFT-D3CP//GFN1, which rendered a similar MAE (1.24 kcal mol−1).
The results reported herein are consistent with the original GFN publications. GFN1, which was primarily benchmarked against noncovalent complexes stabilized by London dispersion and classical hydrogen bonding,28 performs worst among the xTB methods. In contrast, GFN2 incorporates higher multipole electrostatics instead of the monopole-based treatment in GFN1,29 a feature that is particularly relevant for the electrostatically driven packing of Janus-face cyclohexanes, and consequently shows improved performance in the present set. As for GFN-FF, although noncovalent interactions are treated at the force-field level, its incorporation of flexible atomic charges and tailored lone-pair potentials is expected to bring its performance closer to that of the semiempirical GFN1 method,30 an outcome that is indeed observed for the Group 2 systems. Nevertheless, none of the xTB methods were parametrized for the assembly of highly polarized aliphatic systems such as Janus cyclohexanes. Yet, their reasonable performance in this context underscores their practical usefulness.
The varying accuracies of GFN methods in predicting the binding Gibbs free energies of Group 2 non-covalent complexes prompted us to evaluate the quality of their corresponding geometries in comparison to those obtained with DFT-D3 and DFT-D3CP. As shown in Fig. 4(a), the CP correction has little effect on the predicted inter-ring distances, with both DFT-D3 and DFT-D3CP yielding similar equilibrium geometries. Notably, GFN-FF consistently predicts significantly shorter inter-ring contacts, whereas GFN1 and GFN2 geometries tend to be slightly shorter and longer, respectively, compared to DFT-D3CP across all systems studied. The PES of the inter-ring distance in dimer 6 (Fig. 4(b)) reveals that the GFN-FF equilibrium geometry corresponds to a much shorter contact and lies in a repulsive region of the DFT-D3CP PES, approximately 4 kcal mol−1 above the minimum. Consequently, applying higher-level single-point energy corrections to GFN-FF geometries does not consistently enhance accuracy and appears to be system-dependent, as it depends on how far the GFN-FF geometry is from the equilibrium geometry at other methods. It is worth noting that GFN-FF geometries most closely resemble those from GFN1, and thus GFN1//GFN-FF consistently outperforms other GFN-FF-based combinations. In contrast, GFN1 and GFN2 geometries, though distinct from DFT results, lie within a shallow region near the PES minimum (within 0.5 kcal mol−1). Therefore, single-point energy corrections at higher level DFT-D3 and DFT-D3CP over GFN1 and GFN2 geometries generally improve the chemical accuracy of these composite methods.
Thus, only results at DFT-D3, DFT-D3CP and their corrections to GFN1 and GFN2 geometries will be discussed in the main text. To account for the higher polarizability of anionic species, the def2-TZVPD basis set (with additional diffuse functions) was used on anions during DFT calculations.
The 1-Li+ complex, dominated by localized electrostatics, shows a moderate absolute error of 3.31 kcal mol−1 with the pure DFT-D3CP method. Among the composite approaches, DFT-D3CP//GFN2 stands out with a similar AE of 4.15 kcal mol−1. DFT-D3CP//GFN1 performs slightly worse (5.80 kcal mol−1). In all cases, neglecting BSSE affords slightly lower MAEs (compare DFT-D3 and DFT-D3CP bars in Fig. 5).
The 1-Na+ complex, involving a larger monovalent ion with more diffuse interactions, shows a poor baseline performance with DFT-D3CP (7.46 kcal mol−1). Composite approaches using GFN geometries yield AEs of 11.72 kcal mol−1 and 13.17 kcal mol−1 (DFT-D3CP//GFN1 and DFT-D3CP//GFN2, respectively).
The 1-Mg2+ complex, featuring a highly polarizing divalent ion, challenges all methods. DFT-D3 and DFT-D3CP remain the most reliable, with AEs of 1.80 and 2.90 kcal mol−1, respectively. Note, however, that the binding Gibbs free energy of this complex is the strongest of all, with a target ΔG of −150.68 kcal mol−1. All composite methods overshoot the target significantly, affording AEs of 11.72 kcal mol−1 and 13.17 kcal mol−1 at DFT-D3CP//GFN1 and DFT-D3CP//GFN2, respectively.
For the 1-NH4+ complex, likely governed by directional hydrogen bonding mediated by the F atoms of 1 and electrostatics, the results are generally more favourable. Standard DFT-D3 and DFT-D3CP calculations produce AEs of 1.07 and 0.84 kcal mol−1, respectively. DFT-D3CP//GFN1 and DFT-D3CP//GFN2 deliver excellent AEs of 1.22 kcal mol−1 and 0.27 kcal mol−1, respectively. The presence of N–H⋯F hydrogen bonds likely aligns 1-NH4+ more closely with systems in the training set used during the development of the xTB methods, enhancing the accuracy of force-field and semiempirical geometries and contributing to the strong overall performance of GFN approaches in this case.
As for the anionic complexes, 1-F−, a small and highly basic anion, DFT-D3 yields a significantly absolute deviation of 9.61 kcal mol−1, which remains virtually the same at 9.83 kcal mol−1 upon application of the CP correction. Unexpectedly, CP corrections have a strong worsening effect on GFN-based calculations for this system, where the hybrid methods, DFT-D3CP//GFN1 (AE 25.31 kcal mol−1) and DFT-D3CP//GFN2 (AE 24.86 kcal mol−1), show much higher deviations compared to the one non-corrected for BSSE calculations (12.03 and 11.81 kcal mol−1, respectively).
For the complex of 1 with chloride in 1-Cl−, a larger and less basic anion compared to fluoride, a moderate deviation profile is observed. DFT-D3 gives an AE of 2.55 kcal mol−1, increased to 2.74 kcal mol−1 upon CP correction, a slight increase in error in this case. Among the hybrid methods, DFT-D3//GFN1 and DFT-D3//GFN2 yield decent results (5.24 and 6.13 kcal mol−1, respectively), though CP correction leads to AEs of 11.98 and 12.74 kcal mol−1 in DFT-D3CP//GFN1 and DFT-D3CP//GFN2, respectively, again highlighting an important decrease in accuracy upon correction for BSSE.
In complex 1-BF4−, where 1 interacts with the weakly coordinating anion BF4− primarily through dispersion and diffuse electrostatics, standard DFT-D3 provides a reliable AE of 0.01 kcal mol−1, while DFT-D3CP brings it slightly up to 0.44 kcal mol−1. Hybrid methods vary widely: DFT-D3//GFN2 delivers an exceptionally low AE of 0.57 kcal mol−1, which unexpectedly increases to 3.78 kcal mol−1 after CP correction. A similar trend occurs with GFN1 geometries (from 2.70 to 6.05 kcal mol−1 upon CP correction), maintaining good accuracy.
Lastly, the 1-SO42− complex features a highly charged, polarizable anion with multiple oxygen atoms capable of forming non-conventional hydrogen bonds with the polarized C–Hax bonds of the Janus cyclohexane. DFT-D3 and DFT-D3CP show moderate deviations (4.29 and 4.85 kcal mol−1, respectively), with the CP correction marginally decreasing accuracy. Hybrid methods yield mixed results: DFT-D3//GFN1 produces a higher AE (7.86 kcal mol−1), but CP correction increases the error to 18.76 kcal mol−1. Similarly, GFN2-based methods jump from 14.34 to 24.57 kcal mol−1 with CP, revealing an unexpected and substantial error increase.
Among the various approaches, full DFT calculations remain the most reliable and consistent approach for the more challenging Group 3 ionic complexes, yielding mean absolute errors of 3.60 kcal mol−1 with DFT-D3 and 4.05 kcal mol−1 with DFT-D3CP with low standard deviations (3.41–3.46 kcal mol−1). However, the use of GFN1 geometries in hybrid protocols delivers a cost-effective alternative to pure DFT calculations and renders results in relatively good agreement with the reference values, with MAEs of 6.62 kcal mol−1 for DFT-D3//GFN1 (st. dev. of 4.13) and 11.25 kcal mol−1 for DFT-D3CP//GFN1 (st. dev. of 7.66)—closely approaching DFT-level accuracy at a significantly reduced computational cost. In contrast, geometries obtained from GFN2 exhibit greater variability and tend to perform worse than those from GFN1, with corresponding MAEs of 8.30 kcal mol−1 (DFT-D3//GFN2) and 12.69 kcal mol−1 (DFT-D3CP//GFN2).
In terms of methodological performance, the application of counterpoise correction proved beneficial for pure DFT-D3 calculations for 1-NH4+, but consistently led to higher AEs on other systems and on hybrid approaches based on GFN geometries, which prompted us to investigate how the contact distances between the ions and cyclohexane 1 vary across equilibrium geometries obtained at different theoretical levels. Notably, contact distances between 1 and the molecular anions BF4− and SO42− calculated at the GFN1 level are significantly shorter than those from DFT-optimised geometries, and slightly shorter than those predicted by GFN2 (Fig. 6). In these cases, the overly close ion–cyclohexane proximity predicted by the GFN methods amplifies the effect of the CP correction, resulting in an apparent overcorrection. This reflects an error-cancellation phenomenon, where the uncorrected values fortuitously agree better with the reference. Nevertheless, because BSSE systematically decreases with increasing basis set size, it is expected that larger basis sets in the single-point calculations would mitigate this behaviour and make CP corrections consistently beneficial.
It is worth noting that GFN1 incorporates the third generation of Grimme's empirical dispersion correction (D3),28,35 whereas GFN2 employs the more advanced D4 correction.29,41 Consequently, for molecular anions—where dispersion interactions are expected to play a more significant role in determining complex geometries and energetics—GFN2 generally offers comparable or improved performance over GFN1 due to its more refined treatment of dispersive contributions. Conversely, for cationic species, the trend reverses: GFN1 geometries are in better agreement with the DFT results than GFN2, even though contact distances are still shorter than those predicted by DFT, indicating that D4 dispersive corrections may overestimate binding contacts to more localized charged species.
GFN-FF geometries, on the other hand, show substantial deviations from DFT geometries. They predict contact distances up to ∼1.25 Å shorter, in cases such as the Li+ complex, and up to ∼0.70 Å longer, as seen with the hard anion F−. While GFN-FF also includes a modified version of the D4 dispersion correction,30,41 these discrepancies reflect the intrinsic limitations of force-field approaches, in which atomic ions are modeled as point charges. As a result, electrostatic interactions are often exaggerated, and stereoelectronic effects are poorly handled through parametrization that is highly system dependent. Therefore, for charged complexes, GFN-FF geometries are generally of poor quality, and applying single-point energy corrections at higher theoretical levels is insufficient to yield reliable results (see the SI for further details).
The impact of the optimiser on the resulting geometries was also evaluated. For this purpose, Gaussian16's Berny algorithm was used to optimise all Group 3 systems using the GFN-FF, GFN1, and GFN2 methods. As shown in Fig. S5, the ion–cyclohexane distances remained largely unchanged regardless of whether the optimisations were performed in Gaussian16 or with the xTB optimiser, indicating that the differences discussed above arise from the methods themselves, while the choice of optimisation algorithm has only a minimal impact on the geometry of these systems.
O⋯H–N hydrogen bonds in 10-II, and π-stacking interactions in 10-IV. In contrast, the open-state conformer 10-III is promoted upon the addition of well-defined seed molecules, enabling the stacking of Janus cyclohexanes and triggering kinetically controlled supramolecular aggregation.
Our first objective was to assess whether the GFN methods could reliably identify these four conformers of the core structure of 10 through unbiased conformational sampling. To this end, we performed conformational searches using the workflow detailed in the Materials and Methods section with GFN-FF, GFN1, and GFN2. To enhance sampling of the conformational space around the core, the long alkyl chains appended to the aromatic ring were substituted with methyl groups. The methyl-capped systems are referred to hereafter as 10′. As shown in Fig. 8(A), representatives of all four conformers were qualitatively reproduced across the three theoretical levels, with the exception of conformer 10-IV at the GFN-FF level, which predicted a non-conventional C–F⋯π interaction rather than the experimentally observed C–H⋯π interaction. The selected conformers taken forward for further analysis are shown in Fig. 8(A).
For a more quantitative assessment, we fully optimised the selected conformers obtained at each level and subsequently re-optimised them using DFT-D3 (including frequency calculations). We then compared both the relative ΔG and structural parameters, namely the root-mean-square deviation (RMSD) of heavy atoms, between the GFN-derived and DFT-D3-optimised structures (Fig. 8(B)). In contrast to the behavior observed for Group 1, where single-point DFT-D3 corrections over GFN geometries improved ΔG predictions, such corrections in the case of 10 consistently increased the absolute deviations from the DFT-D3 reference. GFN-FF performed worst among the xTB methods, with a mean absolute error (MAE) of 4.74 kcal mol−1 and the highest RMSD values across nearly all conformers, averaging 0.74 Å. This structural inaccuracy carried over into composite methods involving GFN-FF, which yielded even larger MAEs: 11.46 and 7.60 kcal mol−1 for GFN2//GFN-FF and DFT-D3//GFN-FF, respectively. GFN1//GFN-FF was a notable exception, with a reduced MAE of 3.16 kcal mol−1.
On the other hand, GFN1 emerged as the most accurate among the xTB methods, with an MAE of only 0.61 kcal mol−1 and an average RMSD of 0.42 Å, indicating good agreement with DFT-D3 geometries. Surprisingly, the composite method DFT-D3//GFN1 performed worse, with an MAE of 3.56 kcal mol−1. GFN2, despite yielding geometries in close agreement with DFT-D3 (average RMSD of 0.41 Å), exhibited a higher MAE of 1.53 kcal mol−1. Again, DFT-D3 single-point corrections worsened the results, raising the MAE to 3.63 kcal mol−1 for DFT-D3//GFN2.
In addition to the AE analysis, we evaluated whether the tested methods could correctly identify conformer 10′-IV as the global minimum, as reported for conformer 10-IV in the work of Pavan and Delius et al.27 Another key aspect is the relative stability of all conformers, which was previously reported to lie within approximately 2.6 kcal mol−1 of the global minimum, with 10-I and 10-II being nearly isoenergetic at around 1.2 kcal mol−1 (Fig. 7). However, these relative Gibbs free energies were obtained for the full system 10 (not the truncated 10′) and at the GAFF force field level, which may not accurately capture the energetics of this system.42 Therefore, the discussion here focuses on the general trends in relative conformational stability. As shown in Fig. 9, GFN-FF correctly predicts 10′-IV as the most stable conformer. However, it severely overestimates the energy of the “open-state” 10′-III, placing it ∼25 kcal mol−1 above the minimum, while 10′-I and 10′-II are predicted to be comparably stable, lying within 1.5 kcal mol−1 of the global minimum. When higher-level single-point energies are computed over GFN-FF geometries, the energetic ordering changes: 10′-IV is no longer the global minimum, providing evidence that the C–F⋯π interactions predicted by GFN-FF are not as stabilizing at the semi-empirical or DFT-D3 levels. Instead, 10′-I becomes the most stable conformer in GFN1//GFN-FF, while the “open-state” 10′-III is the lowest in both GFN2//GFN-FF and DFT-D3//GFN-FF, indicating the poor quality of GFN-FF geometries for this system. GFN1 offers improved predictions, placing 10′-IV as the global minimum and locating 10′-I and 10′-II at 2.8 and 2.3 kcal mol−1, respectively. In this case, 10′-III is the least stable at 7.0 kcal mol−1. DFT-D3 single-point calculations over GFN1 geometries (DFT-D3//GFN1) preserve the general energetic trends but increase the relative energy differences. GFN2 nearly reproduces the DFT-D3//GFN1 results, while DFT-D3//GFN2 slightly increases the energy gaps further and inverts the order of 10′-I and 10′-II.
![]() | ||
| Fig. 7 Representative conformers of 10 and their relative ΔGs found by Pavan and Delius et al.27 | ||
![]() | ||
| Fig. 9 Relative conformational stabilities of conformers 10′-I, 10′-II, 10′-III and 10′-IV obtained using different theoretical methods, with Pavan and Delius et al.27 results as references. | ||
Pavan and Delius et al.27 also reported the supramolecular arrangement of polymeric 10, which forms a double-stranded, non-covalently bound complex stabilized by electrostatic attraction between the opposing faces of the Janus cyclohexanes (Fig. 8(C)). As a final stress test, we used a double-stranded helical structure of an oligomeric 10 with a ∼10 Å pitch consisting of 40 monomeric units (in extended conformations resembling 10-III), equilibrated after a 1 μs MD simulation at the GAFF level, extracted from the work of Pavel and Delius et al.,27 as the starting point for full geometry optimisations using GFN-FF, GFN1, and GFN2. Note that we employed the full system 10 with the bulky side chains for this purpose. Remarkably, all methods produced structures with relatively low RMSDs compared to the MD-equilibrated reference: 2.91, 3.13, and 2.71 Å for GFN-FF, GFN1, and GFN2, respectively. These results suggest that all tested xTB methods are capable of handling large and complex supramolecular systems, as demonstrated by their reasonable performance on a non-covalent assembly containing 5480 atoms.
000-fold reduction in computational time for pure GFN methods (see Table S10), making them suitable for systems where DFT is computationally prohibitive, while still maintaining a good compromise between chemical accuracy and computational cost across most of the systems studied.
Across different levels of molecular complexity, the performance of the xTB methods varied, as outlined below.
However, the best accuracy was achieved using DFT-D3 single-point corrections over GFN-optimised geometries, with DFT-D3//GFN2 matching full DFT-D3 calculations in accuracy and surpassing it in computational efficiency for the tested set.
In summary, xTB methods, especially the semiempirical GFN1 and GFN2, offer a powerful combination of speed and accuracy for the study of Janus cyclohexanes and their supramolecular assemblies. Their reliability across a range of systems highlights their importance as a tool for exploring new applications and guiding the design of advanced materials based on Janus motifs.
We hope that the insights provided in this work will stimulate further development and application of Janus cyclohexane platforms in supramolecular chemistry.
Supplementary information: additional computational details, supporting results and optimised cartesian coordinates. See DOI: https://doi.org/10.1039/d5cp02879e.
| This journal is © the Owner Societies 2025 |