H-bond competition experiments in solution and the solid state †‡

The H-bonding outcomes in crystal structures of simple molecules, where two potential H-bonds can be formed, have been used to calculate relative H-bond probabilities for 59 combinations of H-bond donors and H-bond acceptors. H-bond probabilities are shown to correlate well with the difference in solution phase free energy between the two competing H-bonds.


Analysis of intermolecular contacts in the Cambridge Structural Database (CSD) has provided important fundamental insights into the nature of non-covalent interactions. [1–7] The structural properties of functional group interactions in the solid state correlate well with ab initio calculations of interaction potentials in the gas phase and with solution phase spectroscopic properties, such as infrared bond stretching frequencies. [8–15] Studies of intermolecular interactions in the solid state have focussed on geometrical properties that can be directly measured in X-ray crystal structures: interatomic distances, and the relative distributions of functional groups in three-dimensional space. [16,17] However, the frequency distributions of functional group contacts in the CSD should also provide thermodynamic information about relative functional group interaction energies. [18,19] In this paper, we show that experimentally determined solution phase H-bond parameters provide a good prediction of the probability of functional group interactions in the CSD. The results imply that the frequency distribution of functional group contacts in the CSD could provide a useful tool for the prediction of interaction energies in different environments.
The relative stabilities of solution phase functional group interactions can be quantified using experimentally derived H-bond parameters, α and β (Fig. 1(a)). [20] By summing over pairwise contacts between solvent and solute, the Gibbs free energy of complex formation, ΔG°, can be reliably estimated using eqn (1).
$$\Delta G^{\circ} = -(\alpha - \alpha_{\mathrm{S}})(\beta - \beta_{\mathrm{S}}) + 6\ \mathrm{kJ\,mol^{-1}} \tag{1}$$
where α and β are the H-bond donor and acceptor parameters of the solutes, α_S and β_S are the corresponding H-bond parameters of the solvent, and 6 kJ mol−1 is the adverse free energy for solution-phase bimolecular association. A similar approach can be used to estimate the difference in free energy between two different H-bonded complexes in solution (Fig. 1(b), eqn (2)). This H-bond competition experiment provides a convenient tool for tackling the relationship between solution phase and solid state interaction energies.
$$\Delta G^{\circ}_{1} = -\Delta G^{\circ}_{2} = -(\alpha - \alpha_{\mathrm{S}})(\beta_{1} - \beta_{2}) \tag{2}$$
where ΔG°1 and ΔG°2 are the free energy changes for the forwards and backwards equilibrium in Fig. 1(b), and β1 and β2 are the H-bond acceptor parameters of the two competing acceptors.
Eqn (2) and the frequency distribution of functional group interactions in the CSD can be used to compare the competition between two different H-bonding interactions in solution with the outcome of the competition between the same two interactions in the solid state. If we select molecules that contain one H-bond donor (D), two different H-bond acceptors (A1 and A2) and hydrocarbon only, then two different H-bonded states are possible in the solid state (Fig. 2). If sufficient X-ray crystal structure data are available for reliable statistics, the populations of the two different H-bonded states in the solid state should be related to the solution phase free energy for the competition equilibrium in a hydrocarbon solvent (Fig. 1(b)).
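The two solution phase estimates described above can be illustrated numerically. The following is a minimal sketch, assuming that the association free energy has the form ΔG° = −(α − α_S)(β − β_S) + 6 kJ mol−1 and that the competition free energy is the difference between two such terms for the same donor and solvent. The function names are hypothetical, and the α and β values are illustrative placeholders rather than values taken from the H-bond parameter database; only the solvent α of 0.7 is quoted from the text.

```python
def complex_free_energy(alpha, beta, alpha_s, beta_s, penalty=6.0):
    """Estimate the free energy (kJ/mol) of forming a 1:1 H-bonded complex.

    alpha, beta     : donor and acceptor parameters of the solutes
    alpha_s, beta_s : donor and acceptor parameters of the solvent
    penalty         : adverse free energy of bimolecular association (kJ/mol)
    """
    return -(alpha - alpha_s) * (beta - beta_s) + penalty


def competition_free_energy(alpha, beta_1, beta_2, alpha_s):
    """Free energy (kJ/mol) of forming D.A1 at the expense of D.A2.

    The solvent beta and the 6 kJ/mol penalty cancel in the difference.
    """
    return -(alpha - alpha_s) * (beta_1 - beta_2)


# Illustrative (hypothetical) parameters, NOT values from the paper.
ALPHA_D = 2.7   # donor
BETA_A1 = 5.3   # acceptor 1
BETA_A2 = 3.8   # acceptor 2
ALPHA_S = 0.7   # hydrocarbon solvent alpha, as chosen in the text
BETA_S = 0.6    # hypothetical solvent beta

dg_a1 = complex_free_energy(ALPHA_D, BETA_A1, ALPHA_S, BETA_S)
dg_a2 = complex_free_energy(ALPHA_D, BETA_A2, ALPHA_S, BETA_S)
ddg = competition_free_energy(ALPHA_D, BETA_A1, BETA_A2, ALPHA_S)

print(f"Complex D.A1: {dg_a1:.2f} kJ/mol")
print(f"Complex D.A2: {dg_a2:.2f} kJ/mol")
print(f"Competition (A1 vs A2): {ddg:.2f} kJ/mol")
```

Because the solvent β and the association penalty cancel, the competition free energy in this sketch depends on the solvent only through α_S.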
We therefore searched the CSD [21] for molecules that contain only two functional groups, one of which is both a H-bond donor and a H-bond acceptor, and the other is only a H-bond acceptor. The frequency of occurrence of the two different H-bonds in the CSD was used to calculate the probability of H-bond formation in the solid state, p_i (eqn (3)).
$$p_{i} = \frac{N_{i}}{N_{1} + N_{2}} \tag{3}$$
where N_i is the number of structures in the CSD which contain both functional groups as the only non-hydrocarbon functionality and where the H-bond of interest is formed (i = 1 or 2 for the D·A1 or D·A2 interaction respectively).
The error in p_i, ε_i, is given by eqn (4). For some systems, p_i values of 1 and 0 were obtained, because one of the two H-bonds was never observed. For these systems, the error in p_i cannot be estimated.
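The calculation of H-bond probabilities from search counts can be sketched as follows. This is a minimal illustration that assumes p_i is the fraction of the combined D·A1 and D·A2 hits in which H-bond i is formed. The error estimate shown is a binomial standard error, which is an assumption made for this sketch and is not necessarily the expression used for ε_i in this work. The function names and the counts are hypothetical.

```python
import math


def hbond_probabilities(n_1, n_2):
    """Return (p_1, p_2) from the numbers of structures with each H-bond."""
    total = n_1 + n_2
    if total == 0:
        raise ValueError("no structures found for either H-bond")
    return n_1 / total, n_2 / total


def binomial_standard_error(p, n_total):
    """Binomial standard error of p (an assumed error model).

    Returns None when p is exactly 0 or 1, because a competition in which
    one H-bond was never observed gives no meaningful error estimate.
    """
    if p in (0.0, 1.0):
        return None
    return math.sqrt(p * (1.0 - p) / n_total)


# Hypothetical counts for illustration only.
p_1, p_2 = hbond_probabilities(30, 10)
err_1 = binomial_standard_error(p_1, 30 + 10)

print(f"p_1 = {p_1:.2f}, p_2 = {p_2:.2f}, error = {err_1:.3f}")
```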
Table 1 lists the functional groups studied in the H-bond competition experiments reported in this work. For each pair of functional groups in Table 1, the CSD was searched using a formula constraint to ensure that one of each functional group was present in the molecule with only a variable hydrocarbon skeleton connecting them. The criterion used to detect a H-bond was any contact between the specified atoms that was closer than the sum of the van der Waals radii, and the number of structures containing D·A1 and D·A2 H-bonds was recorded (see ESI†).
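The distance criterion described above can be sketched in a few lines. This example assumes Bondi van der Waals radii and Cartesian coordinates in Å; the set of radii actually used in the CSD searches is not specified here, and the function name is hypothetical.

```python
import math

# Bondi van der Waals radii in angstroms (assumed for this sketch).
VDW_RADII = {"H": 1.20, "C": 1.70, "N": 1.55, "O": 1.52}


def is_close_contact(elem_a, xyz_a, elem_b, xyz_b, radii=VDW_RADII):
    """True if two atoms are closer than the sum of their vdW radii."""
    separation = math.dist(xyz_a, xyz_b)
    return separation < radii[elem_a] + radii[elem_b]


# Two oxygen atoms 2.8 angstroms apart count as a contact (limit 3.04).
short_contact = is_close_contact("O", (0.0, 0.0, 0.0), "O", (2.8, 0.0, 0.0))
# Two oxygen atoms 3.2 angstroms apart do not.
long_contact = is_close_contact("O", (0.0, 0.0, 0.0), "O", (3.2, 0.0, 0.0))

print(short_contact, long_contact)
```

A purely distance-based test of this kind cannot tell whether a hydrogen atom actually lies between the two atoms, so hits that satisfy it may still need to be inspected.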
For certain functional group combinations, the same crystal structure was retrieved in searches for D·A1 and for D·A2, i.e. the crystal contained both possible H-bonds. These structures were checked manually for artefacts. For example, Fig. 3 shows two results where both interactions were found in the competition experiment between a ketone acceptor and an alcohol acceptor for an alcohol donor. In Fig. 3(a), the presence of two ketone-alcohol H-bonds (O1A·O2B and O1B·O2A) forces the alcohol oxygens (O1A and O2A) into close proximity, so that the interatomic distance is shorter than the sum of the van der Waals radii. There is no hydrogen atom between O1A and O2A, so this contact was removed as a hit from the search. In contrast, Fig. 3(b) shows that in a different structure both ketone-alcohol (O3B·O2A) and alcohol-alcohol (O1A·O2A) H-bonds are present.
The values of p_i calculated from the CSD data can be compared with the free energy change for the corresponding solution phase competition experiment, ΔG°i. The values of solution phase H-bond parameters for a specific functional group vary with substituent (see ESI†). We therefore used a database of experimentally derived H-bond parameters to obtain generic values of α and β for each functional group, and these values were used in eqn (2) to calculate the free energy change for the solution phase H-bond competition experiment in a hydrocarbon solvent. [22] The α value for the solvent was chosen as 0.7 (the average of the α values for an aliphatic CH and an aromatic CH, which are 0.4 and 1.0 respectively). The results are not very sensitive to the precise value used, because a hydrocarbon is a very weak competitor for H-bonds. The results are compared with the corresponding H-bond frequency distribution in the solid state in Fig. 4. The 59 different functional group combinations studied produce 118 data points in Fig. 4, because each competition experiment gives two results, p_1 and p_2.
Although there is considerable scatter in Fig. 4, there is a clear correlation between the solution phase ΔG°i values and solid state p_i values. If the frequency distribution of solid state H-bond interactions were determined by the solution phase free energy change for the competition experiment, then we would expect a Boltzmann distribution according to eqn (5), where R is the gas constant and T is the temperature. Eqn (5) is represented by the red line in Fig. 4.
$$p_{i} = \frac{e^{-\Delta G^{\circ}_{i}/RT}}{1 + e^{-\Delta G^{\circ}_{i}/RT}} \tag{5}$$
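The Boltzmann prediction for a two-state competition can be sketched as follows. This is a minimal illustration that assumes a temperature of 298 K, which is not stated in the text, and takes ΔG°i as the free energy change for forming H-bond i at the expense of the competing H-bond. The function name is hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_probability(delta_g, temperature=298.0):
    """Two-state Boltzmann population of the H-bond with free energy delta_g.

    delta_g is in kJ/mol, measured relative to the competing H-bond.
    The temperature of 298 K is an assumption of this sketch.
    """
    return 1.0 / (1.0 + math.exp(delta_g / (R_KJ * temperature)))


p_equal = boltzmann_probability(0.0)       # no preference between H-bonds
p_favoured = boltzmann_probability(-3.0)   # H-bond favoured by 3 kJ/mol
p_disfavoured = boltzmann_probability(3.0)

print(f"{p_equal:.2f} {p_favoured:.2f} {p_disfavoured:.2f}")
```

Under these assumptions, a H-bond that is favoured by 3 kJ mol−1 is predicted to have a population of about 0.77, and the two populations in any competition sum to one.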
The root mean square error between the theoretical line and the experimental data in Fig. 4 is large (RMSE = 0.20). One of the outliers in Fig. 4 is the competition of an alcohol H-bond donor for an alcohol H-bond acceptor and an aryl ether H-bond acceptor. The alcohol-alcohol H-bond is 3 kJ mol−1 more stable than the alcohol-aryl ether H-bond according to the solution phase H-bond parameters, but the populations of these H-bonds in the CSD are almost equal (52% and 48% respectively, see ESI†). Fig. 5 illustrates some of the structures in the CSD that give rise to this behaviour. Of the 22 structures that contain an alcohol-aryl ether H-bond, 9 have the generic structure shown in Fig. 5(a). In these structures, the C-OH bond is coplanar with the aromatic ring, which sterically blocks H-bond donors from interacting with the alcohol lone pairs. Fig. 5(b) shows another example where a rigid polycyclic ring structure sterically blocks the alcohol group from acting as a H-bond acceptor. This outlier in Fig. 4 is therefore due to a small number of structures with particular steric properties. This is a general problem with functional group combinations that yielded a relatively small number of hits in the CSD search.
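The issue of poor sampling can be addressed by combining data from different searches. When the H-bond populations are summed over 2 kJ mol−1 windows in ΔG°i (Fig. 6), the agreement between solid state and solution phase H-bond free energies is significantly better than that obtained for the raw data in Fig. 4 (RMSE = 0.07 compared with 0.20).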
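One way of combining data from different searches is sketched below. This example assumes that the counts for each H-bond are pooled into windows of fixed width in ΔG°i, that the pooled probability is assigned to the window centre, and that the RMSE is measured against a two-state Boltzmann curve at 298 K. The exact pooling procedure used for Fig. 6 may differ, and all function names and counts are hypothetical.

```python
import math
from collections import defaultdict

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_probability(delta_g, temperature=298.0):
    """Two-state Boltzmann population for a free energy change in kJ/mol."""
    return 1.0 / (1.0 + math.exp(delta_g / (R_KJ * temperature)))


def pool_by_window(records, width=2.0):
    """Pool (delta_g, n_formed, n_total) records into free energy windows.

    Returns a sorted list of (window_centre, pooled_probability).
    """
    sums = defaultdict(lambda: [0, 0])
    for delta_g, n_formed, n_total in records:
        window = math.floor(delta_g / width)
        sums[window][0] += n_formed
        sums[window][1] += n_total
    return [((window + 0.5) * width, formed / total)
            for window, (formed, total) in sorted(sums.items())
            if total > 0]


def rmse_against_boltzmann(points, temperature=298.0):
    """Root mean square error between observed p and the Boltzmann curve."""
    squared = [(p - boltzmann_probability(dg, temperature)) ** 2
               for dg, p in points]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical records: (delta_g in kJ/mol, structures with this H-bond,
# structures with either H-bond). Each competition contributes two records.
records = [(-3.0, 24, 46), (3.0, 22, 46), (-2.5, 30, 40), (2.5, 10, 40)]

pooled = pool_by_window(records)
error = rmse_against_boltzmann(pooled)

for centre, p in pooled:
    print(f"window centre {centre:+.1f} kJ/mol: p = {p:.3f}")
print(f"RMSE = {error:.3f}")
```

Pooling counts rather than averaging probabilities means that sparsely sampled competitions carry less weight than well sampled ones, which is the point of combining the searches.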
These results indicate that the solution phase H-bond parameters α and β provide an accurate indicator of the probability of forming a H-bonding interaction in the solid state. This observation is consistent with the success of cocrystal screening approaches based on this assumption. [23–25] The results also suggest that the CSD could provide a valuable resource for quantifying the relative strengths of intermolecular functional group interactions. The frequency distributions of functional group contacts in the CSD are directly related to the corresponding interaction energies in solution and may ultimately be useful for calibration of intermolecular potentials for use in molecular modelling applications.
Fig. 6 Relationship between ΔG°i and p_i obtained by summing H-bond populations over 2 kJ mol−1 windows in ΔG°i (RMSE = 0.07). The red line corresponds to the Boltzmann distribution in eqn (5).
‡ Data Access. All data supporting this study are provided as supplementary information accompanying this paper.