Sopanant Datta and Taweetham Limpanuparb*
Science Division, Mahidol University International College, Mahidol University, Salaya, Nakhon Pathom 73170, Thailand. E-mail: taweetham.lim@mahidol.edu
First published on 10th June 2021
A quantum chemical investigation of the stability of compounds with identical formulas was carried out on 23 classes of compounds made of C, N, P, O and S atoms as core structures and H and the halogens F, Cl, Br and I as substituents. All possible structures were generated and investigated by quantum mechanical methods. The prevalence of formulas for which the Z configuration, gauche conformation or meta isomer is the most stable form was calculated and is discussed. Quantitative and qualitative models to explain the stability of the 23 classes of halogenated compounds are also proposed.
Many findings in contrast to steric predictions exist in the literature. Table 1 shows experimental and theoretical investigations in which the Z configuration, gauche conformer or meta isomer is the most stable form in carbon-backbone compounds. The experiments include heat of combustion or hydrogenation and spectroscopic measurements, while the theoretical studies mainly use quantum mechanical methods.
Case | Exceptions to steric prediction and reasonings |
---|---|
Z configuration18,31,60–72 | • Early experimenters such as Demiel conjectured that more electronegative atoms are on the same side in the most stable isomers.11,13 |
• Representative examples compiled by Eliel et al.73 include CHF![]() ![]() ![]() ![]() ![]() ![]() |
• Yamamoto et al. proposed four delocalization effects, σ-LP (nσ → ![]() ![]() ![]() ![]() |
|
Gauche conformation16,17,74–84 | • For CH2F–CH2F, the gauche form is preferred73 due to hyperconjugative interactions. The dominant one is the antiperiplanar σCH to ![]() |
• Potential energy surfaces of rotamers have been thoroughly investigated. For CH2F–CH2F, the twofold (V2) potential actually has an energy minimum when the F–C–C–F torsional angle is ±90°.79 Rotational barriers can be small such that the shift in equilibrium can be easily observed when polar solvents promote the interconversion of anti to gauche conformers. | |
• “Bent bond” may offer an explanation for the destabilization of the anti conformer.17,80,81 | |
Meta isomer85–87 | • For difluorobenzene, heat of combustion results clearly showed that the meta isomer is the most stable.87 |
• Computational studies showed that meta isomers are the most stable forms in most cases of dihalobenzenes. Taskinen attributed this to the absence of electronic interactions (shown below) between the two halogen substituents when they are at 1,3-positions.86 | |
![]() |
Even when steric effect reasoning correctly predicts the result, controversy ensues. For example, a number of organic chemistry textbooks attributed the relative stability of the staggered conformation of ethane to steric factors alone. This attribution led to a controversy that was discussed at length across the scientific community for over eight years.1–8
Electron delocalization effects, on the other hand, are relatively more complicated. The reasoning for energy prediction often involves resonance structures9–15 (formerly called mesomeric effect) or hyperconjugation16–20 (delocalization) of orbitals. Specific reasonings for each case of exception to steric prediction are shown in Table 1. The preferences for the Z configuration and the gauche conformation are primarily due to hyperconjugation, in a similar vein to the ethane case,17,18 but a rationale for the preference for the meta isomer is still lacking.
In addition to carbon-backbone compounds in Table 1, there are many experimental and theoretical studies for other backbones discussed in this paper, namely, C3,21 C=N,22,23 C=P,24 N=N,25–31 N=P,32 P=P,33,34 C–N,35 C–P,36,37 C–O,38–40 C–S,41 N–N,38,42–44 N–P,38,43 P–P,38,43,45,46 N–O,38,42,47 N–S,47 P–O,48,49 P–S,50 O–O,38,42,51,52 O–S,53,54 and S–S.38,55
Inspired by Bent's rule,21,56 which uses orbital hybridization to explain trends in bond lengths and bond angles across a series of compounds where the steric argument fails, this paper aims to advance the understanding of energy prediction for chemical structures derived from the same molecular formula.
Non-superimposable structures of the same molecular formula can be enantiomers, diastereomers, conformers and constitutional isomers (structural isomers). As energies of enantiomers are identical, they are excluded from our investigation. For the other three types of isomerism, E and Z configurations in A=A′ compounds and halocyclopropanes represent diastereomerism, gauche and anti conformers in A–A′ compounds represent conformational isomerism and ortho, meta and para structures in halobenzenes represent constitutional isomerism.
Pople's 6-311++G(d,p) basis set was used because it is available for iodine and has a reasonable computational cost. However, Pople's basis sets are well known to produce imaginary frequencies under certain conditions.88 Therefore, sample frequency calculations at MP2/6-311++G(d,p) were performed on all classes, and only the benzene class was found to have imaginary frequencies. In agreement with the previous study,88 we found that the imaginary frequencies disappear at MP2/6-311G(d,p) and that the electronic energies are very close to those at MP2/6-311++G(d,p). While the mean absolute deviation is 12.4 kcal mol−1 over 1505 structures, R2 and the slope for the energies from the two basis sets are virtually unity. In other words, the energies of all structures are shifted by similar magnitudes in the same direction. As a result, we use the 6-311++G(d,p) basis set consistently for all classes of compounds in this study.
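The difference between a sizeable mean absolute deviation and a near-unity slope and R2 can be illustrated with a minimal sketch. The energies below are made-up placeholders rather than values from the dataset, and the function name `compare_energy_sets` is hypothetical; the point is only that a nearly constant shift between two sets of energies gives a non-zero deviation while leaving the linear relationship intact.

```python
# Illustrative only: the energies are made-up placeholders, not data from the study.

def mean(values):
    return sum(values) / len(values)

def compare_energy_sets(e_ref, e_alt):
    """Return (MAD, slope, R^2) for a least-squares fit of e_alt against e_ref."""
    mad = mean([abs(b - a) for a, b in zip(e_ref, e_alt)])
    mx, my = mean(e_ref), mean(e_alt)
    sxx = sum((x - mx) ** 2 for x in e_ref)
    syy = sum((y - my) ** 2 for y in e_alt)
    sxy = sum((x - mx) * (y - my) for x, y in zip(e_ref, e_alt))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return mad, slope, r_squared

# Hypothetical energies (kcal/mol) obtained with one basis set ...
energies_diffuse = [-40.0, -12.5, 3.0, 18.2, 55.1]
# ... and the same structures shifted by a constant amount with the other.
energies_no_diffuse = [e + 12.4 for e in energies_diffuse]

mad, slope, r2 = compare_energy_sets(energies_diffuse, energies_no_diffuse)
print(f"MAD = {mad:.1f}, slope = {slope:.3f}, R^2 = {r2:.3f}")
```

Because relative stabilities within a formula depend on energy differences, a uniform shift of this kind would leave the ranking of isomers unchanged.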
Optimized geometries of selected AA′, A–A′ and halobenzene compounds were compared to gas-phase experimental data in previous studies.18,89,90,101 The current level of theory and/or basis set yields acceptable results. Additional confirmation with solid-phase X-ray structures from the Cambridge Structural Database (CSD)91 shows good agreement between calculated geometries from the current work and experimental results from the database for dichlorobenzenes (see the supporting information† for the results).
A revised dataset of all classes of compounds in the previous studies, together with additions from this work, is available online in the Open Science Framework. This new data repository is intended to supersede the three separate datasets.57–59 Raw output files from Q-Chem,92 lists of structures, energies, PubChem CIDs, detailed methodology, source codes, scripts and templates are included. Unless specified otherwise, default settings of Q-Chem were used for the calculations. For some difficult cases of rotamers, different convergence criteria for energies and forces were applied. These attempts can be clearly seen in the datasets in the supporting information.†
For E and Z configuration of diastereomers, if all four substituents are different, there are six possible isomers of a halo-substituted C=C. To differentiate them, labels of Ea, Eb, E1, E2, E3, Za, Zb, Z1, Z2, Z3, G0, G1, G3 (G stands for geminal) were used for C=C and six other classes of compounds in accordance with the previous study.57 Therefore, energy comparison can be made within a diastereomeric pair (same label such as E1 vs. Z1) and geminal compounds were excluded from the current investigation.
In a similar manner, for gauche and anti conformers, the torsional angle between the highest-priority substituents (per the CIP rules) at the two ends of the molecule is considered. Angles within the interval (−120°, 120°) are treated as gauche and angles within the interval [−180°, −120°) or (120°, 180°] are counted as anti. Unlike the previous definition of gauche effect,38 for simplicity, ambiguous cases (compounds with at least one conformer having ambiguity in labelling) are not considered. For example, all conformers of CBr2Cl–CF2Cl are not considered since the presence of the two Br atoms as the highest priority atoms on the left leads to an ambiguity in labelling the conformers as gauche or anti. However, compounds with more than one gauche conformer are considered normally in this paper.
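The interval definition above can be sketched as a small classifier. This is an illustration rather than the authors' script: the function name `classify_torsion` is hypothetical, the wrapping of angles into a single period is a convenience added here, and returning `None` at exactly ±120° is an assumption, since those two values fall in neither interval as written.

```python
def classify_torsion(angle_deg):
    """Label a torsional angle in degrees as 'gauche' or 'anti'.

    Intervals follow the text: (-120, 120) is gauche; [-180, -120) and
    (120, 180] are anti. Exactly +/-120 lies in neither interval, so this
    sketch returns None there (an assumption, not the authors' rule).
    """
    # Wrap the input into [-180, 180); +180 maps to -180, which is still anti.
    a = (angle_deg + 180.0) % 360.0 - 180.0
    if -120.0 < a < 120.0:
        return "gauche"
    if a < -120.0 or a > 120.0:
        return "anti"
    return None

for angle in (60.0, -150.0, 180.0, 300.0, 120.0):
    print(angle, classify_torsion(angle))
```

Note that under this definition the gauche label also covers torsional angles near 0°, because only the two intervals given in the text are distinguished.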
For constitutional isomers of halobenzenes, we extended the standard nomenclature ortho, meta, para in disubstituted benzenes to highly substituted benzenes if it can be done by using the two highest priority substituents without ambiguity for all isomers in an empirical formula. For example, C6F4Cl2 isomers can be considered but C6Cl4F2 isomers are not included in our analysis.
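One possible reading of this labelling rule is sketched below. The function names and the list-based ring representation are hypothetical, and the ambiguity test (the highest-priority substituent occurs exactly twice, or the highest and the second-highest each occur exactly once) is an interpretation of the text rather than the authors' code. Priority is taken to follow atomic number, as in the CIP rules.

```python
# CIP priority among the substituents considered here follows atomic number.
ATOMIC_NUMBER = {"H": 1, "F": 9, "Cl": 17, "Br": 35, "I": 53}

def top_two_positions(ring):
    """Return ring indices of the two highest-priority substituents, or None if ambiguous."""
    ranked = sorted(set(ring), key=ATOMIC_NUMBER.get, reverse=True)
    first = [i for i, s in enumerate(ring) if s == ranked[0]]
    if len(first) == 2:
        return first[0], first[1]
    if len(first) == 1 and len(ranked) > 1:
        second = [i for i, s in enumerate(ring) if s == ranked[1]]
        if len(second) == 1:
            return first[0], second[0]
    return None

def label_isomer(ring):
    """Label a benzene given as six substituents in ring order; None if ambiguous."""
    if len(ring) != 6:
        raise ValueError("expected six ring substituents")
    positions = top_two_positions(ring)
    if positions is None:
        return None
    i, j = positions
    separation = min((i - j) % 6, (j - i) % 6)
    return {1: "ortho", 2: "meta", 3: "para"}[separation]

print(label_isomer(["Cl", "F", "Cl", "F", "F", "F"]))    # a C6F4Cl2 isomer
print(label_isomer(["Cl", "Cl", "Cl", "Cl", "F", "F"]))  # a C6Cl4F2 isomer
```

Under this reading the outcome of the ambiguity test depends only on the formula, so either all isomers of a formula are labelled or none are, which matches the C6F4Cl2 and C6Cl4F2 examples in the text.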
As per the definition above, steric effects therefore predict that the E configuration, the anti conformer and the para isomer for compounds in this study are the most stable structures. Herein, deviations from these expectations are called Z configuration effect, gauche conformation effect and meta isomer effect respectively. Preliminary exclusion of irrelevant structures mentioned above reduced the total number of structures for the three groups from 710, 8365 and 1505 in previous studies57–59 to 530, 4980 and 830, respectively. Table 2 shows breakdowns of these numbers for each class.
The numbers of structures shown in Table 2 may differ from those in the results and analysis due to the following reasons.
• In the N=N class, NBr=NI and NI=NI disintegrated during the CCSD optimization process and are therefore excluded.
• Some conformers interconverted during the geometry optimization process and are excluded from the analysis.
• Enantiomeric structures exist in many conformer classes. Only one from the pair was chosen for quantum chemical calculation. The excluded structures are still included in the analysis using energetic data from their enantiomers.
The uncertainty in computational results can be quantified in several ways. The change in level of theory from MP2 in previous studies57–59 to CCSD(T) in the present study leads to a change in prevalence rate in Fig. 1 of at most 7% (C–N). The prevalence rates at various other levels of theory (HF, B3LYP, MP2) with codes that generate them are available in the supporting information.† The basis set change from 6-311++G(d,p) to 6-311G(d,p) in MP2 optimization jobs of halobenzene compounds has no effect on the distribution of prevalence rates. Fig. 2 shows detailed distributions of energy differences as an extension to Fig. 1. Most of the distributions appear to be approximately bell-shaped if not uniformly distributed. The range of differences can be very small (e.g. 0.3 to 0.8 kcal mol−1 for P=P) or considerably large (e.g. −11.6 to 2.1 kcal mol−1 for N–N).
There are borderline cases in both experimental and computational results as the difference in energy can be extremely small. For the example of CHBr=CHBr in Table 1, the gas-phase experimental value for a configuration conversion from E to Z is −100 ± 160 cal mol−1 in one source60 and revised to 90 ± 240 cal mol−1 in another.66 The present CCSD(T) electronic energy agrees with the latter source that the E configuration is more stable. However, in a similar vein to conformers discussed in Table 1, the Z structure is preferred in the liquid phase.66
The main results of prevalence rates here agree with experimental and computational studies previously mentioned in the introduction. The well-known cases in carbon compounds in Table 1, summarized in the well-known book by Eliel, Wilen and Mander,73 were reproduced in the current work with only one exception mentioned above. Moreover, the extreme cases of 0% (P=P) and 92% (N=N) are also in line with previous work by others.28,30,33,34 For the majority of halobenzenes, in contrast to steric prediction, meta isomers are the most stable forms. This is confirmed by previous computational results for dihalobenzenes.86 Similar observations in polychlorinated compounds also confirm this meta preference, e.g. for the first few chlorine substitutions to biphenyls (PCB), dibenzo-p-dioxins (PCDD) and dibenzofurans (PCDF), the most stable chlorination occurs at meta positions with respect to the other ring.93
Simple predictors with obvious periodic trends were selected for our preliminary analysis here. To represent steric bulk, one of three measures of atomic size, covalent radius (RC), van der Waals radius (RV) and atomic radius (RA), was used; the first two exhibit the typical trend of R(H) < R(F) < R(Cl) < R(Br) < R(I) whereas the last leads to the trend of R(F) < R(H) < R(Cl) < R(Br) < R(I). To represent electron delocalization, one of two measures, electronegativity (Pauling's scale, EN(H) < EN(I) < EN(Br) < EN(Cl) < EN(F)) and pKb of the conjugate base X− (pKb(H−) < pKb(F−) < pKb(Cl−) < pKb(Br−) < pKb(I−)), was used.
The models for all classes of compounds with all combinations of steric and electron delocalization factors, together with detailed descriptions, can be found in the supporting information.† This analysis can be conducted on a class of compounds or a larger group of classes. In the latter case, additional term(s) are required to represent the class a compound belongs to (mean Ef within a class or central atom properties). An RV vs. pKb model created for the diastereomers group is shown in Table 3 as an example. In this model, the additional predictor A is the mean Ef of a class of compounds. The expectation for coefficient trends of cr2 > cr3 and ce2 < ce3 would represent the aforementioned counteracting effects in diastereomers. The resulting coefficients are as expected. As shown in the table, CHF=CHI is predicted to be more stable in its Z configuration whereas CHI=CHI is predicted to be more stable in its E configuration. Cases in which the final prediction indicates a more stable E configuration can be regarded as those in which steric effects overpower electron delocalization effects. The converse applies if the Z configuration turns out to be more stable. The RV (or RC) vs. pKb models resulted in expected coefficient trends for conformers (all classes of compounds in one model) and constitutional isomers as well.
Regression model for the group of diastereomers (R2 = 0.8537, adjusted R2 = 0.8531, RMSE = 11.85). All coefficients are significant at p < 0.01 with four exceptions: cr2, cr3, ce2 and ce3.

Term | Type of interaction | Predictor (a) | Coefficient | Value | Standard error |
---|---|---|---|---|---|
– | – | 1 | c0 | −968.1540 | 17.5752 |
– | – | A | c1 | 1.0242 | 0.0270 |
ERV | 1-body | ra + rb + rc + rd | cr0 | 150.7690 | 6.0272 |
ERV | 2-body, geminal | rarb + rcrd | cr1 | 23.2610 | 2.2489 |
ERV | 2-body, vicinal – Z | rarc + rbrd | cr2 | −2.0951 | 2.4623 |
ERV | 2-body, vicinal – E | rard + rbrc | cr3 | −3.7969 | 2.4623 |
EpKb | 1-body | ea + eb + ec + ed | ce0 | −2.5909 | 0.0296 |
EpKb | 2-body, geminal | eaeb + eced | ce1 | −0.0047 | 0.0005 |
EpKb | 2-body, vicinal – Z | eaec + ebed | ce2 | 0.0006 | 0.0005 |
EpKb | 2-body, vicinal – E | eaed + ebec | ce3 | 0.0011 | 0.0005 |

(a) ri and ei are values of the steric factor r (RV in Å) and the electron delocalization factor e (pKb – unitless), respectively, of substituent i (i ∈ {a, b, c, d}). The unit of coefficients can be inferred from this information and the nature of one and two-body terms.
Example of prediction

Structure | Configuration | CCSD(T) Ef | Predicted Ef | ERV | EpKb |
---|---|---|---|---|---|
CHF=CHI | Z | −37.6837 | −27.4818 | 953.467 | 21.6087 |
CHF=CHI | E | −37.1600 | −27.1276 | 953.108 | 22.3213 |
CHI=CHI | Z | 19.6092 | 26.2601 | 1040.1556 | −11.3382 |
CHI=CHI | E | 18.2007 | 26.2321 | 1039.1202 | −10.3308 |
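The way the steric and delocalization terms counteract each other in a diastereomeric pair can be reproduced with a short sketch. It assumes that the predicted Ef is the sum of the intercept, the class term and the two energy terms (an assumption that is consistent with the tabulated values), so that the intercept and class term cancel in a Z minus E difference. The van der Waals radii are assumed Bondi values, not taken from the paper, and the function name `steric_term` is hypothetical.

```python
# Coefficients of the steric (R_V) terms, copied from the regression table.
CR0, CR1, CR2, CR3 = 150.7690, 23.2610, -2.0951, -3.7969

# Assumed van der Waals radii in angstrom (Bondi values); the radii actually
# used by the authors are given in their supporting information.
R_VDW = {"H": 1.20, "F": 1.47, "I": 1.98}

def steric_term(a, b, c, d):
    """E_RV for a structure with a, b on one atom and c, d on the other.

    a/c and b/d are the same-side (vicinal Z) pairs; a/d and b/c are the
    opposite-side (vicinal E) pairs, matching the predictors in the table.
    """
    ra, rb, rc, rd = (R_VDW[x] for x in (a, b, c, d))
    one_body = CR0 * (ra + rb + rc + rd)
    geminal = CR1 * (ra * rb + rc * rd)
    vicinal_z = CR2 * (ra * rc + rb * rd)
    vicinal_e = CR3 * (ra * rd + rb * rc)
    return one_body + geminal + vicinal_z + vicinal_e

# CHF=CHI: Z has F and I on the same side, E has them on opposite sides.
e_rv_z = steric_term("H", "F", "H", "I")
e_rv_e = steric_term("H", "F", "I", "H")

# E_pKb values for the same two structures, copied from the example table.
e_pkb_z, e_pkb_e = 21.6087, 22.3213

# The intercept and the class term are identical within a pair and cancel.
steric_diff = e_rv_z - e_rv_e
deloc_diff = e_pkb_z - e_pkb_e
print(f"steric Z-E: {steric_diff:+.3f}")
print(f"delocalization Z-E: {deloc_diff:+.3f}")
print(f"predicted Z-E: {steric_diff + deloc_diff:+.3f}")
```

With these inputs the steric term favours E (positive Z minus E difference) while the delocalization term favours Z by a larger margin, which is how the model arrives at a more stable Z configuration for CHF=CHI.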
However, regression models using other combinations of steric and electron delocalization factors, especially with EN as the electron delocalization factor, often did not result in expected coefficient trends.
For each class of compounds, four contingency tables from four different combinations of steric and electron delocalization factors were constructed and can be found in the supporting information.† The RC vs. pKb model provides results that can be considered the same as those in Fig. 1, as the two factors give identical substituent rankings. The prevalence rates are obtained from the counts of structures classified as Z configuration, gauche conformer or meta isomer by both factors.
An example of a contingency table is shown in Fig. 3 for the class of C=C compounds under the RC vs. EN model. For this model, the assumption is that the steric factor favours the E configuration (ERC) whereas the electron delocalization factor favours the Z configuration (ZEN).11,13 Therefore, a final 100% is expected from an ERC and ZEN structure and a 0% from its isomeric counterpart (ZRC and EEN). There are 24 C=C structures of this classification, of which 20 structures are the most stable in their diastereomeric pairs. This shows that 20/24 = 83% of ERC and ZEN structures are the most stable compared to their diastereomers. Neither 100% nor 0% was attained from the other two classifications (ZRC and ZEN, and ERC and EEN) in this class of compounds. Expected results (0% and 100%) were achieved for C=N, C=P and cyclopropane as described in the supporting information.† However, this number of classes is relatively small compared to the number of classes of compounds in which expected results were not achieved. Thus, this qualitative model did not give conclusive results across all classes of compounds of diastereomers, conformers and constitutional isomers.
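The classification behind such a contingency table can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, the covalent radii and Pauling electronegativities are assumed literature values rather than numbers from the paper, and a structure is labelled Z under a given scale when the higher-ranked substituents at the two ends lie on the same side.

```python
# Assumed substituent properties (not taken from the paper): covalent radii
# in angstrom and Pauling electronegativities.
COVALENT_RADIUS = {"H": 0.31, "F": 0.57, "Cl": 1.02, "Br": 1.20, "I": 1.39}
PAULING_EN = {"H": 2.20, "F": 3.98, "Cl": 3.16, "Br": 2.96, "I": 2.66}

def configuration_by(scale, a, b, c, d):
    """Classify an A=A' structure as 'Z' or 'E' under one ranking scale.

    a, b sit on one backbone atom and c, d on the other; a/c share one side
    and b/d share the other. Returns None when either end has a tie.
    """
    if scale[a] == scale[b] or scale[c] == scale[d]:
        return None
    top_left_is_a = scale[a] > scale[b]
    top_right_is_c = scale[c] > scale[d]
    return "Z" if top_left_is_a == top_right_is_c else "E"

def contingency_cell(a, b, c, d):
    """Return (steric, electronic) labels, e.g. ('E', 'Z') for E_RC and Z_EN."""
    return (configuration_by(COVALENT_RADIUS, a, b, c, d),
            configuration_by(PAULING_EN, a, b, c, d))

def prevalence(n_most_stable, n_total):
    """Percentage of structures in a cell that are the most stable of their pair."""
    return 100.0 * n_most_stable / n_total

# CFCl=CHBr with F and Br on the same side: E by size, Z by electronegativity.
print(contingency_cell("F", "Cl", "Br", "H"))
# The 20 of 24 C=C structures quoted in the text.
print(round(prevalence(20, 24)))
```

The two scales disagree only when an end carries substituents whose size and electronegativity rankings differ, as F and Cl do here; ends that pair H with a halogen are ranked the same way by both scales.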
Firstly, there could be a third factor affecting the results. For example, the deviation from idealized geometry was considered by performing both quantitative and qualitative analyses on the unoptimized structures, using standard bond lengths and bond/torsional angles. Improved trends were observed as described in the supporting information.†
Secondly, steric and electron delocalization predictors are highly correlated as per the periodic trend. A more appropriate electronic predictor may help improve the model. Steric factors may, in fact, be negligible when compared to electron delocalization factors for the studied classes of compounds once the electron delocalization terms are treated appropriately.
Lastly, in cases of qualitative models for conformers and constitutional isomers, only considering the pair of highest priority substituents has an inherent flaw and may not reflect the summative effect of all substituents.
The implications for teaching are manifold. Many general, organic and biochemistry (GOB) textbooks95–97 mention the relative stability of E/Z (cis–trans or geometric) isomers but neglect to mention these phenomena, probably for simplicity or because the phenomena were thought to be rare. There are two possible changes. First, one must be cautious when steric reasoning is used to make stability predictions of compounds based on the size of halogen substituents. Second, for reasoning about these phenomena, there should be a more balanced view or a shift from teaching of VSEPR (steric-driven reasoning) to Bent's rule98 and hyperconjugation (electron delocalization-driven reasoning). The call to move away from VSEPR99,100 has been discussed elsewhere and this work only provides an additional piece of supporting evidence. It is important to note that even when the steric prediction is right, the hyperconjugation energy can still be dominant, as in the controversial case of the ethane rotational barrier.1–8
The data presented should lead to a renewed interest in finding a new approach to describe stability of chemical compounds. The dataset is open for further analysis and utilization in many ways. For example, the models can be applied in molecular mechanics force field construction. Further analysis of bond lengths and bond angles for different structural classifications may reveal important insights into the three phenomena. Also, we are aware that constitutional isomers exist within the group of diastereomers (only C=C, C=N, and C=P). There is currently no specific naming convention for the relationship. Preliminary analysis shows that the failure rate of steric prediction from 1,2-interchange of substituents is 22/55, 20/50 and 17/50, respectively, for the three classes. For example, in C2F2I2, the geminal structure is more stable than the E structure (see the supporting information† for the complete list). Similarly, constitutional isomers also exist within the group of conformers, but an exchange of two substituents will have the constitutional isomer effect intertwined with the conformational isomer effect.
It is possible to further improve upon the levels of theory and basis sets used in this study. Specifically, Pople's basis sets are known for issues with post-HF calculations. A well-balanced approach should be developed for this particular area of study. Application of machine learning techniques may also help make better sense of the dataset and reduce the number of structures required to undergo expensive quantum mechanical calculations.
Footnote
† The data presented in this study are openly available in Open Science Framework with the reference number 6ECP4. See DOI: 10.17605/OSF.IO/6ECP4.
This journal is © The Royal Society of Chemistry 2021