Open Access Article
Maria Chiara
Storer
and
Christopher A.
Hunter
*
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK. E-mail: herchelsmith.orgchem@ch.cam.ac.uk
First published on 22nd November 2022
The functional properties of molecular systems are generally determined by the sum of many weak non-covalent interactions, and therefore methods for predicting the relative magnitudes of these interactions is fundamental to understanding the relationship between function and structure in chemistry, biology and materials science. This review focuses on the Surface Site Interaction Point (SSIP) approach which describes molecules as a set of points that capture the properties of all possible non-covalent interactions that the molecule might make with another molecule. The first half of the review focuses on the empirical non-covalent interaction parameters, α and β, and provides simple rules of thumb to estimate free energy changes for interactions between different types of functional group. These parameters have been used to have been used to establish a quantitative understanding of the role of solvent in solution phase equilibria, and to describe non-covalent interactions at the interface between macroscopic surfaces as well as in the solid state. The second half of the review focuses on a computational approach for obtaining SSIPs and applications in multi-component systems where many different interactions compete. Ab initio calculation of the Molecular Electrostatic Potential (MEP) surface is used to derive an SSIP description of a molecule, where each SSIP is assigned a value equivalent to the corresponding empirical parameter, α or β. By considering the free energies of all possible pairing interactions between all SSIPs in a molecular ensemble, it is possible to calculate the speciation of all intermolecular interactions and hence predict thermodynamic properties using the SSIMPLE algorithm. SSIPs have been used to describe both the solution phase and the solid state and provide accurate predictions of partition coefficients, solvent effects on association constants for formation of intermolecular complexes, and the probability of cocrystal formation. SSIPs represent a simple and intuitive tool for describing the relationship between chemical structure and non-covalent interactions with sufficient accuracy to understand and predict the properties of complex molecular ensembles without the need for computationally expensive simulations.
The basis of the approach is the description of a molecule as a set of Surface Site Interaction Points (SSIP) that capture the properties of all possible non-covalent interactions that the molecule might make with another molecule.10 Each SSIP represents an area of approximately 9 Å2 on the van der Waals surface, which corresponds to the footprint of a single hydrogen bond (H-bond).10 SSIPs can exist in one of two states, either non-bonded or paired with an SSIP from another molecule, and interactions between SSIPs are mutually exclusive. The non-polar interaction between two SSIPs is described by a constant van der Waals energy, but SSIPs also have polarity, so there is an additional polar contribution to the interaction energy when a positive SSIP interacts with a negative SSIP. Positive SSIPs represent regions of local positive electrostatic potential, associated with H-bond donors or σ-holes, and interactions at these sites are quantified by the non-covalent interaction parameter α. Negative SSIPs represent regions of local negative electrostatic potential, associated with lone pair H-bond acceptor sites or π-electron density, and interactions at these sites are quantified by the non-covalent interaction parameter β.
Fig. 1 shows the SSIP representation of two simple molecules.11 Methanol is described by six SSIPs: the two negative SSIPs (red) represent the lone pair H-bond acceptor sites on the oxygen; the large positive SSIP (blue) represents the H-bond donor site on the hydroxyl group; and the small positive SSIPs represent the less polar CH groups. Pyridine is described by eight SSIPs: the large negative SSIP represents the H-bond acceptor site at the nitrogen lone pair; there is one small negative SSIP on each face of the ring, representing sites of interaction with the π-electron density; and five positive SSIPs represent the CH groups. These SSIPs allow prediction of the energies of non-covalent interactions with these molecules. For example in a mixture of pyridine and methanol, the most favourable interaction is between the negative SSIP that represents the lone pair of the pyridine nitrogen, which has the largest β value, and the positive SSIP that represents the methanol hydroxyl proton, which has the largest α value. The structure of the resulting H-bonded complex is illustrated in Fig. 1, and the polar contribution to the interaction energy is quantified by the product of the non-covalent interaction parameters that describe the two SSIPs, −(α × β).
Here we provide an overview of the development of the SSIP approach and discuss applications to understanding the thermodynamic properties of intermolecular complexes and solvation free energies. The review is broadly divided into two parts. The first half focuses on the empirical non-covalent interaction parameters, α and β, which have been developed to describe the properties of different functional groups and form the basis for describing the interaction properties of SSIPs. These empirical parameters have been used to establish a quantitative understanding of the role of solvent in solution phase equilibria, and to describe non-covalent interactions at the interface between macroscopic surfaces as well as in the solid state. The second half of the review focuses on a computational approach to predicting the properties of molecular ensembles using SSIPs. Ab initio calculation of the Molecular Electrostatic Potential (MEP) surface is used to derive an SSIP description of a molecule, where each SSIP is assigned a non-covalent interaction parameter, which is equivalent to the corresponding empirical parameter, α or β. By considering the free energies of all possible pairing interactions between all SSIPs in a molecular ensemble, it is possible to calculate the speciation of all intermolecular interactions and hence predict thermodynamic properties. Applications of this approach to both the solution phase and the solid state are discussed, including methods that have been developed for virtual cocrystal screening, prediction of partition coefficients and solvent effects on association constants for formation of intermolecular complexes.
:
1 complexes formed in non-polar organic solvents have been used to probe non-covalent interactions between two solutes. IR, NMR and UV-Vis titrations were used to measure association constants (K) for the formation of H-bonded complexes between thousands of different compounds. These experiments provided insight into the H-bonding properties of a wide range of different functional groups, because simple molecules were chosen, so that complexation was dominated by a single H-bonding interaction between two well-defined polar sites. The experimental data were used to establish empirical scales, such as pKHB for H-bond acceptors,16,17 and Abraham developed the approach further with the H-bond donor and acceptor parameters, αH2 and βH2, that can be multiplied together to predict the relative strengths of interactions between different solutes.6 Hunter generalised the application of these parameters to any solvent environment by explicitly including the competition between solute and solvent interactions (Fig. 2).18 The assumption is that the equilibrium is dominated by interactions between the most polar sites on the solute and the most polar sites on the solvent. Therefore the experimentally determined non-covalent interaction parameters that describe the properties of the most polar sites of the two solutes (α and β) and of the solvent (αS and βS) can be used to estimate the free energy change associated with the exchange of pairwise interactions illustrated in Fig. 2 (eqn (1)).18| ΔΔG/kJ mol−1 = (αβS + αSβ) − (αβ + αSβS) | (1) |
The value of the association constant for formation of a 1
:
1 complex (K) is obtained by addition of a constant c, which is related to the entropy changes associated with bimolecular association and desolvation (eqn (2)).
−RT ln K/kJ mol−1 = − (α − αS)(β − βS) + c | (2) |
The value of c was experimentally determined to be 6 kJ mol−1 in carbon tetrachloride solution, and this value has been shown to provide a good description of a range of different solvents.18–20 The four non-covalent interaction terms in eqn (2) make it clear that the position of equilibrium in Fig. 2 depends on the relative polarities of the solutes and the solvent. The key point is that the solvent is treated in exactly the same way as the solute, and so the non-covalent interaction parameters that have been determined for solutes can be used to describe the same functional groups when they occur in a solvent.18,21,22
Fig. 3 illustrates the range of values of α and β found for different functional groups. The values of α generally fall in the range 0–5, and the values of β for most functional groups fall in the range 0–10. Variations in the values of the non-covalent interaction parameters can be understood based on the polarity of the atom at the site of interaction. For example, the presence of a nearby electron-withdrawing group would increase the value of α and decrease the value of β. Fig. 3 also includes parameters for anions and cations (red and blue boxes respectively). These parameters can be used in eqn (2) to describe interactions between a neutral solute and a charged solute, but interactions between two charged solutes are beyond the scope of the current treatment, because additional Coulombic interactions come into play. A more complete listing of experimentally determined α and β parameters derived from literature data on the formation of 1
:
1 complexes is provided in the (ESI†).23–57 Although these parameters were determined from measurements of H-bonded complexes involving polar functional groups, they can be applied to all classes of non-covalent interaction. A H-bond is defined as an attractive interaction between the hydrogen atom of an XH group, in which X is more electronegative than H, and another atom,58 but the non-covalent interaction parameters in Fig. 3 encompass a wider range of functional groups. For example, the α parameter of the iodo compound in Fig. 3 describes a positive σ-hole, which is a region of electron density deficiency located on the surface of the iodine atom on the extension of the carbon–iodine bond axis.59,60 A halogen-bonding interaction between this σ-hole and the lone pair of another compound is described in the same way as a H-bond, i.e. using −(α × β).61 Similarly, different combinations of the functional groups in Fig. 3 can be used to describe any other class of non-covalent interaction, such as aromatic interactions62,63 or weak polar interactions with non-polar functional groups that do not form conventional H-bonds. As will be explained below, the non-covalent interaction parameters illustrated in Fig. 3 are the basis for the SSIP approach, which is used to describe the entire surface of a molecule, rather than single point contacts at the most polar site.
![]() | ||
| Fig. 3 The α and β parameters used to describe non-covalent interactions with different functional groups. The atoms highlighted in blue (α) and red (β) are the atoms involved in the interaction, and ions are separated in shaded boxes. X represents an electron-withdrawing group (see ESI† for complete listing of experimentally determined parameters). | ||
Eqn (2) can be used to obtain a quantitative description of the effects of solvent on the properties of non-covalent interactions. For example, the association constant measured for the 1
:
1 complex formed between perfluoro-t-butanol and tri-n-butylphosphine oxide decreases by five orders of magnitude when the solvent polarity is increased from benzene to N-methyl formamide, and the values measured in 10 different solvents were accurately predicted by eqn (2) (RMSE in log
K = 0.18).19 Similarly, eqn (2) accurately predicted the values of the association constants for 254 different 1
:
1 complexes, which were experimentally measured in seven different organic solvents (RMSE in log
K = 0.32).20
However, when eqn (2) was applied to H-bonded complexes in alcohol solvents, the predicted association constants were more than an order magnitude higher than the values measured experimentally.19 The reasons for this discrepancy, and the properties of alcohol solvents will be discussed in more detail below.
The original H-bond scales were developed by measuring association constants for formation of complexes in non-polar solvents. In effect, these experiments fixed the values of αS and βS in eqn (2) and used association constants for different solute combinations to determine the corresponding α and β parameters. An alternative approach is to use association constants for combinations of solutes with known α and β parameters in eqn (2) to determine values of αS and βS for the solvent. This method has been used to determine empirical non-covalent interaction parameters for non-polar solvents like hydrocarbons and perfluorocarbons.20
![]() | ||
| Fig. 4 Generalised Functional Group Interaction Profiles (FGIP) which shows the free energy landscape for interaction between two solutes (α and β) in a solvent (αS and βS) as defined by ΔΔG in eqn (1). The blue regions indicate solute combinations that make favourable interactions, and the red regions indicate solute combinations for which interactions are unfavourable. The most stable pairwise interaction that dominates the equilibrium in Fig. 2 is illustrated for each quadrant. | ||
Fig. 5 shows how the non-covalent interaction parameters of common solvents, αS and βS, map onto the FGIP, and Fig. 6 shows how the appearance of the FGIP changes for different types of solvent. Non-polar solvents like hydrocarbons and perfluorocarbons have low values of αS and βS and lie in the bottom left corner of the solvent map in Fig. 5. The corresponding FGIP for cyclohexane shown in Fig. 6(a) is almost completely blue, because the solute–solute interaction always dominates the equilibrium. All interactions between solute functional groups are favourable, which accounts for the poor solvating properties of these solvents.
![]() | ||
| Fig. 5 Map of the non-covalent interaction parameters that describe the properties of common solvents (αS and βS). Polar interaction sites are highlighted in blue (αS) and red (βS) (see ESI† for complete listing of experimentally determined parameters). | ||
Polar protic solvents like water fall in the centre of the solvent map in Fig. 5 and 6(b) shows the corresponding FGIP. The four quadrants of the FGIP are approximately the same size, because the solvent parameters lie in the middle of the solute α and β scales. As a result, solvophobic effects and polar interactions play an equally important role in determining the properties of non-covalent interactions between solutes in water. Other polar protic solvents, like alcohols and carboxylic acids, are not shown in Fig. 5, because they have non-polar functional groups that play a role in solvation of solutes. A treatment for obtaining FGIPs for these more complicated solvents and the consequences for solvophobic region will be discussed later.
Polar aprotic solvents are characterised by low values of αS, but this group of solvents spans a wide range of βS values. The most polar examples, like dimethylsulfoxide (DMSO) and hexamethylphosphoramide (HMPA), are grouped in the bottom left corner of the solvent map in Fig. 5, and the corresponding FGIP is shown in Fig. 6(c). The FGIP for DMSO is almost completely red, because solvation by the solvent H-bond acceptor dominates the equilibrium. Almost all interactions between solute functional groups are unfavourable, which explains why DMSO is such a good solvent. THF is also a polar aprotic solvent, but in this case, the value of βS lies in the middle of the solute β scale. The resulting FGIP shown in Fig. 6(d) is dominated by two quadrants, one in which polar solutes make favourable interactions (blue) and one in which solvation by the solvent H-bond acceptor dominates (red).
Polar fluorinated solvents are grouped in the top left corner of the solvent map in Fig. 5 and constitute a special class of solvents with unique properties. The FGIP for hexafluoroisopropanol (HFIP) shown in Fig. 6(e) is almost completely red, because solvation by the solvent H-bond donor dominates the equilibrium. There is a clear relationship between the FGIPs for DMSO in Fig. 6(c) and HFIP in Fig. 6(e). Both solvents break up almost all solute–solute interactions, but DMSO solvates H-bond acceptors and anions poorly, whereas HFIP solvates H-donors and cations poorly. These selective solvation properties can lead to enhanced rates of chemical reaction in these solvents, if a key reactant is poorly solvated.64–67
Fig. 6(f) shows the FGIP for chloroform. Halogenated solvents have values of βS that are similar to hydrocarbons but the values of αS are significantly higher. As a result, the FGIP is dominated by two quadrants, one in which polar solutes make favourable interactions (blue) and one in which solvation by the solvent CH group dominates (red). Fig. 6 provides a qualitative illustration of the solvation properties of six different classes of solvent, but it is also possible to construct quantitative FGIPs that map the value of ΔΔG for any solute combination in any solvent or solvent mixture. Application of this approach to obtain FGIPs for alcohols and solvent mixtures is discussed in detail below.
:
1 complex formed between perfluoro-t-butanol and tri-n-butyl phosphine oxide in mixtures of chloroform and tetrahydrofuran (THF) is an order of magnitude lower than the value measured in either of the pure solvents.73 Chloroform is a good H-bond donor and weak H-bond acceptor, so it preferentially solvates the phosphine oxide. THF is a good H-bond acceptor and weak H-bond donor, so it preferentially solvates the alcohol. As a result, both solutes are better solvated in the mixture leading to a reduction in the stability of the complex.
![]() | ||
| Fig. 7 Solvent competition model for formation of a complex between two solutes (α and β) in a mixture of two different solvents (white and grey). The black frames draw a comparison with Fig. 2, grouping the multiple solute–solvent and solvent–solvent interactions that are possible in the solvent mixture. The solvent with the largest αS preferentially solvates β, and the solvent with the largest βS preferentially solvates α. | ||
The competing solvation equilibria illustrated in Fig. 7 have been quantified for mixtures of polar solvents and alkanes.68–70Fig. 8 shows how the association constant for formation of a H-bonded complex changes with varying solvent composition for a mixture of a non-polar solvent (S1) and a polar aprotic solvent (S2). Low concentrations of S2 have no effect on the association constant, because the solutes are exclusively solvated by S1, i.e. only the species shown in the orange box are populated, and the observed association constant is equal to K1. However, once the concentration of S2 is high enough, the H-bond donor is preferentially solvated by S2, and this equilibrium competes with the solute–solute interaction, reducing the observed association constant in proportion to the concentration of S2. The concentration of S2 at which solvent competition becomes apparent can be accurately predicted by using the relevant non-covalent interaction parameters in eqn (2) to estimate the equilibrium constant K2 for the solvation equilibrium shown in the green box in Fig. 8.68,69
The association constant for formation of the 1
:
1 complex between 4-phenylazophenol and tri-n-butylphosphine oxide was measured in mixtures of n-octane as the non-polar solvent S1 and a number of different ethers and polyethers as the polar solvent S2.70 When the values of log K for all of alkane–ether mixtures were plotted as a function of the total concentration of ether oxygen atoms present in the solvent mixture ([O]), the same profile shown in Fig. 8 was obtained except with log[S2] replaced by log[O].70 This result shows that the solvation states of solutes are determined by the concentrations and the polarities of the functional groups present in a solvent. It makes no difference whether the oxygen acceptor sites in an ether solvent are all separated on different molecules or multiple oxygen acceptor sites are located on the same polyether molecule: the resulting solvation of H-bond donor solutes is indistinguishable. The implication for understanding solvation phenomena is that solvents can be described in a straightforward manner as a collection of independent non-covalent interaction sites (SSIPs). Competing solvation equilibria and interactions with solutes can be quantitatively predicted based on the concentrations and non-covalent interaction parameters of the SSIPs. This finding forms the basis for the Surface Site Interaction Model for the Properties of Liquids at Equilibrium (SSIMPLE), which is discussed in more detail below.
The solvent mixture experiment illustrated in Fig. 8 provided insight into the properties of alcohol solvents.71,72 When alcohols were used as the polar solvent (S2) in mixtures with n-octane (S1), the association constants measured for the formation of H-bonded complexes between two solutes showed a much more dramatic dependence on the concentration of S2 than had been observed for other polar solvents. The reason for the anomalous behaviour of alcohols is that they have both H-bond donor and H-bond acceptor sites, which leads to two important consequences.
The green box in Fig. 8 highlights how a polar aprotic solvent preferentially solvates H-bond donor solutes, reducing the observed association constant (log
K). When alcohols are used as S2, the same preferential solvation of the H-bond donor solute occurs, but in this case, there is also preferential solvation of the H-bond acceptor solute, so at high concentrations of S2, the effect of an alcohol on log K is double that of a polar aprotic solvent, which only solvates one of the two solutes.71
The other consequence of the presence of both H-bond donor and acceptor sites in alcohol solvents is that there is strong self-association. At high S2 concentrations, the H-bond donor and acceptor sites at the ends of the resulting oligomeric chains interact more strongly with solutes than the monomeric alcohols present at low S2 concentrations (Fig. 9). The solvent mixture experiments were used to determine αS and βS for alcohol solvents, and the values are significantly higher than the corresponding values of α and β measured for alcohols as monomeric solutes in dilute solution.72 The assumption that solute and solvent non-covalent interaction parameters can be used interchangeably breaks down for this class of solvents, but the behaviour appears to be unique to alcohols, and a large increase in polarity was not observed for secondary amides, which also self-associate into oligomeric chains.71
The α and β parameters can also be used to accurately predict the relative probability of formation of different non-covalent interactions in the solid state.77,78 For example, for compounds that contain only one H-bond donor (α) and only two H-bond acceptors (β1 and β2), there are two possible modes of interaction in a crystal lattice (Forms I and II in Fig. 11). An analysis of crystal structures of such compounds using the Cambridge Structural Database (CSD)79 showed that the frequency of occurrence of Forms I and II can be accurately predicted based on the difference in interaction energy calculated using eqn (3).77
| ΔE/kJ mol−1 = −αβ1 + αβ2 | (3) |
![]() | ||
| Fig. 11 Compounds that contain one H-bond donor (blue) and two different H-bond acceptors (red) can adopt two different packing motifs in the solid state. | ||
Note that eqn (3) treats interactions in the solid using parameters derived from free energy measurements made in solution. The contributions of entropy to interactions in the solid and in solution are different, so ΔE is used to compare the relative probability of formation of different non-covalent interactions in a solid rather than to determine the free energy difference between two solids.
This idea can be generalised to an analysis of all pairwise functional group interactions in the CSD. The non-covalent interactions observed in a crystal structure are the outcome of a competition between different potential functional group interactions. The frequency of occurrence of an interaction between two specific functional groups in the CSD is therefore related to both the strength of this interaction and the strength of the alternative interactions that it had to compete with. A statistical analysis of the occurrence of non-covalent interactions in the CSD was used to make a quantitative ranking of the H-bond acceptor properties of different functional groups (RA). The solid state RA values obtained from the CSD correlate well with the corresponding β parameters measured in solution (Fig. 12).78 These analyses of interactions in solids, as well as at surfaces, indicate that the α and β parameters derived from solution phase experiments are not limited to liquids, but have general applicability for making quantitative predictions of non-covalent interaction energies in a range of different environments.
![]() | ||
| Fig. 12 Relationship between β parameters and solid state H-bond acceptor ratings RA for different functional groups. The best fit line is shown (R2 = 0.88).78 | ||
| α = 1.2 × 10−5Emax2 + 1.14 × 10−2Emax | (4) |
| β = cFG(8.33 × 10−5Emin2 − 2.08 × 10−2Emin) | (5) |
![]() | ||
| Fig. 13 Comparison of non-covalent interaction parameters calculated using eqn (4) and (5) with the corresponding experimental values. (a) α (RMSE = 0.32) (b) β (RMSE = 0.64).11 | ||
This result suggests that the exchange of solute and solvent interactions illustrated in Fig. 2 is dominated by differences in electrostatic interactions and that the other contributions largely cancel out. However, the relationships in eqn (4) and (5) are both quadratic, and additional functional group specific constants (cFG) were required to obtain accurate values of the β parameter, which implies that other factors also play a role.
The number of simultaneous interactions that a molecule can make is limited by steric packing of the interacting partners, so a molecular surface is described as a finite number of SSIPs. The area of an SSIP is defined based on the footprint of a H-bonding interaction at the van der Waals surface (9 Å2).10 A footprinting algorithm was developed to convert the MEP surface of a molecule into a set of SSIPs, where each SSIP is described by a non-covalent interaction parameter (α or β) calculated using eqn (4) and (5) and the value of Emax or Emin for the patch of surface under that footprint.11,89Fig. 1 illustrates the results for methanol and pyridine, showing how the SSIP description captures the properties of both polar and non-polar interaction sites. The COSMO methodology developed by Klamt et al. also uses surface patches to characterise non-covalent interactions across the entire surface of a molecule.90,91 However, an important difference between the COSMO and SSIP methods is the use of empirical calibration. SSIPs represent surface patches that correspond to the area involved in a H-bonding interaction, which means that the SSIP non-covalent interaction parameters can be calibrated by using experimental data on H-bonded complexes to make a direct connection between ab initio MEP calculations and solution phase free energies (Fig. 13).
Experimental data on vapour–liquid equilibria for noble gases and alkanes show that van der Waals or dispersion interactions are proportional to molecular surface area, regardless of atom type.92 The use of SSIPs that each represent the same area on the van der Waals surface therefore provides a relatively straightforward treatment of dispersion, because the contribution to SSIP interactions can be treated as a constant (EvdW = −5.6 kJ mol−1).92 For solution phase equilibria such as the solvent competition model shown in Fig. 2, contributions due to the exchange of dispersion interactions largely cancel out, because there is an exchange of surface contacts of equal surface area: there are two SSIP interactions on each side of the equilibrium.18,93 However, the magnitude of the dispersion contribution becomes important for an accurate description of phenomena where the number of SSIP contacts changes, for example in the hydrophobic effect (see below).92 There is also experimental evidence that the contribution from van der Waals interactions does not fully cancel out in solution phase interactions between aromatic rings, where there is exceptionally good geometric complementarity,94 and in the interactions between perfluorocarbons, which pack poorly.92,95
![]() | (6) |
An alternative packing arrangement for these compounds would be crystallisation separately as the pure substances. Eqn (6) can be used to calculate the difference between the SSIP pairing energy for the cocrystal and SSIP pairing energies for the two pure substances. This calculated energy difference has been successfully used as a selection tool in virtual screening of large compound libraries to identify coformers that have a high probability of forming cocrystals with an active pharmaceutical ingredient (API).99,100 For example, new cocrystals of nalidixic acid, sildenafil, griseofulvin and spironolactone have been discovered using the SSIP virtual screening method.98,101,102
Treatment of SSIP interactions in the solid state is relatively straightforward, because the molecules are static and closely packed, so that each SSIP is in contact with a partner. In solution, all possible SSIP interactions are in dynamic equilibrium, and there are void spaces, so that non-bound states are also accessible. Fig. 15 illustrates non-covalent interactions in a solution phase mixture of a solute (square) and solvent (circles). The dots represent the relatively large volume of void space present in a liquid (approximately 45% for organic solvents at room temperature).10 The probability of forming an interaction between two SSIPs, i and j, is given by the equilibrium constant Kij in eqn (7).10
![]() | (7) |
The inclusion of the van der Waals term in the interaction energy (EvdW) is important for accurately describing the probability of formation of non-bonded states, which do not make van der Waals interactions and are treated as contact with void space.
The solution phase can therefore be treated as a Boltzmann ensemble of all possible SSIP interactions, and the matrix of Kij values calculated using eqn (7) can be used to determine the speciation of intermolecular contacts for any mixture of solvents and solutes. The Surface Site Interaction Model for the Properties of Liquids at Equilibrium (SSIMPLE) is a general implementation of this approach that is used to calculate solvation free energies from the SSIP description of molecules. The method is described in detail elsewhere,10,11,110 but the assumption is that SSIPs can adopt either bonded or non-bonded states, and that the non-bonded state has the same potential energy in any environment. For an individual SSIP in a given phase, calculation of the speciation of intermolecular contacts is used to obtain the total amount that interacts with other SSIPs and how much is non-bonded. These concentrations give an effective equilibrium constant for interaction of a single SSIP with the solvent environment and are used to calculate the solvation free energy of the SSIP relative to an infinitely dilute phase, where there are no intermolecular interactions.
Fig. 16 shows how the solvation free energy (ΔGS) calculated for a single SSIP varies with non-covalent interaction parameter in different solvent environments. n-Hexadecane (grey) is a non-polar solvent, so the interaction is dominated by the van der Waals term, and the solvation free energy does not depend strongly on the polarity of the SSIP. Chloroform (blue) has a relatively polar CH group, which strongly solvates negative SSIPs with large β parameters. However, there are no polar negative sites in chloroform, so the solvation profile is similar to n-hexadecane for positive SSIPs. THF (red) has a polar H-bond acceptor but no polar positive sites, so it solvates positive SSIPs with large α parameters strongly but behaves similarly to n-hexadecane for negative SSIPs. Water (black) has both H-bond donor and acceptor sites, so it solvates SSIPs with either large α or large β values strongly. However, water is unique in that the solvation free energy is positive for non-polar solutes. The solvation profile in Fig. 16 shows that for SSIPs with values between −1.5 and +1.0, the water–water interaction is more favourable than the water–solute interaction. This behaviour is a manifestation of the hydrophobic effect and shows that this phenomenon can be understood simply based on competition for pairwise interactions between solvent and solute. Solvation profiles such as those illustrated in Fig. 16 can be used in a quantitative manner to calculate solvent similarity indices to compare the solvation properties of different solvents with respect to all possible solutes. Solvent similarity indices provide a tool for the design of cost effective or environmentally-friendly alternatives to conventional solvents.111
Fig. 17(a) shows the solvation profile for ethanol, which is a more complex solvent than the examples shown in Fig. 16, because it contains both polar and non-polar functional groups. Three different solvation regimes are apparent, and the changes in gradient in 17a are indicative of changes in the polarity of the solvent functional group that interacts with the solute. Fig. 17(b) shows the populations of the three different solvation states that determine the properties of ethanol. Negative SSIPs with large β values are solvated by the OH H-bond donor of ethanol (blue line). Negative SSIPs with β values less than 6.9 are solvated by the CH groups of ethanol (grey line). The reason for this switch from solvation by OH to CH is that the β value of the H-bond acceptor sites associated with the oxygen lone pairs of ethanol is 6.9. The concentration of solvent is overwhelmingly higher than the concentration of the solute, so solutes with negative SSIPs that are less polar than the solvent will be outcompeted for interaction with solvent OH H-bond donors. In the third regime in 17a, positive SSIPs are preferentially solvated by the H-bond acceptor sites associated with the oxygen lone pairs of ethanol (red line in 17b). The red and blue lines in 17b show that the total population of solvation states is nearly 100% for polar SSIPs with large α or β values. For non-polar SSIPs, solvation is dominated by the van der Waals term, so the grey line in 17b only reaches a maximum of 60% and there is 40% of the non-bonded state.
Solvation profiles can also be used to obtain insights into the properties of solvent mixtures. Fig. 18(a) compares the solvation profiles of chloroform (blue), THF (red) and a 1
:
1 mixture (black). The solvation profile of the mixture closely resembles THF for positive SSIPs, whereas for negative SSIPs, the solvation profile of the mixture is closer to chloroform (blue). Fig. 18(b) provides an explanation for this behaviour. The blue line shows the extent to which SSIPs are solvated by chloroform in the mixed solvent, and the red line shows the proportion of SSIPs solvated by THF. The CH group in chloroform is more polar than the CH groups in THF, so chloroform preferentially solvates negative SSIPs. The H-bond acceptors sites associated with the oxygen lone pairs of THF are more polar than any of the negative sites in chloroform, so THF preferentially solvate positive SSIPs. For non-polar SSIPs, the van der Waals term dominates, so there is an equal probability of interaction with either of the two solvents. One consequence of the preferential solvation illustrated in Fig. 18 is that interactions between two polar solutes are generally less favourable in mixtures than in the pure solvents because they are more strongly solvated.
:
1 complex between two solutes. Thus the difference between the free energy change for formation of an interaction between two solute SSIPs (−αβ + EvdW) and the sum of the two solute SSIP solvation free energies can be used to make predictions about solvent effects on association constants for formation of intermolecular complexes.110Fig. 19 compares calculated and experimental values for 351 different complexes in 12 different solvents (RMSE in log
K = 0.37), and the values calculated using SSIMPLE solvation free energies have comparable accuracy to association constants predicted using empirical solvent parameters in eqn (3) (RMSE in log
K = 0.29).20,112,113
![]() | ||
Fig. 19 Comparison of association constants (log K) for formation of 1 : 1 complexes between two solutes calculated using SSIMPLE with the corresponding experimental values (RMSE in log K = 0.37) for 351 complexes involving a range of different solutes and solvents (1,1,1-trichloroethane, 1,2-dichloroethane, benzene, chlorobenzene, chloroform, cyclohexane, dichloromethane, n-heptane, n-hexane, o-dichlorobenzene, acetonitrile, acetone).20,112,113 The black points are interactions between neutral compounds, red points indicate that one of the compounds is an anion, and blue points indicate that one of the compounds is a cation. The line corresponds to y = x. | ||
The advantage of the SSIMPLE approach over the use of empirical parameters in eqn (2) is that SSIMPLE can be applied to any solvent mixture of any composition and predictions can be made for solvents for which experimental parameterisation is not possible. Fig. 20 shows that SSIMPLE provides an excellent description of the effect of solvent composition on the association constant for formation of a 1
:
1 complex between tri-n-butylphosphine oxide 4-phenylazophenol.69,71 This complex was studied in mixtures of n-octane (S1) and seven more polar solvents (S2). For each solvent mixture, the relationship between log K and log[S2] follows the general trend illustrated in Fig. 7, and SSIMPLE accurately predicts the concentration of S2 at which preferential solvation starts to compete with the solute–solute interaction in each case.
![]() | ||
Fig. 20 The association constant (log K) for formation a 1 : 1 complex between tri-n-butylphosphine oxide and 4-phenylazophenol in mixtures of n-octane (S1) and a more polar solvent (S2) plotted as a function of solvent composition (log[S2]). S2 is di-n-butyl sulfoxide (red), N,N-diethyl acetamide (green), 2-ethylhexyl-acetamide (black) 2-heptanone (orange), ethyl heptanoate (brown), di-n-octyl ether (purple), n-butyl cyanide (pink), 1,1,2,2-tetrachloroethane (blue) and toluene (grey). The points are experimental measurements,69,71 and the lines were calculated using SSIMPLE. | ||
SSIMPLE can also be used to construct quantitative versions of the generic Functional Group Interaction Profile (FGIP) shown in Fig. 4 for any solvent or solvent mixture, providing a useful design tool for guiding the choice of functional group or solvent composition in order to optimise the relative contributions of different types of non-covalent interaction.110Fig. 21 shows the SSIMPLE FGIP for water. There are two regions in which functional group interactions are favourable and two regions where interactions with the solvent dominate (cf. Fig. 4). The value of ΔΔG for the interaction between two non-polar solutes in the solvophobic quadrant is about −3 kJ mol−1. This result differs from the prediction obtained using eqn (1), which gives a value of ΔΔG of about −12 kJ mol−1. The reason for the difference is that the solvent competition model illustrated in Fig. 2 assumes that interactions with the solvent are either fully made or fully broken. In water, the solvent shell around non-polar solutes reorganises to minimise the disruption of water–water H-bonds, and consequently there is less H-bonding to be gained on desolvation. In other words, the gain in solvent–solvent interactions on the right hand side of the equilibrium in Fig. 2 is only a quarter of a H-bond rather than a full water–water interaction, because both solvents on the left hand side of the equilibrium retain 75% of a H-bond with the surrounding solvent. These solvation equilibria are accurately described by the SSIMPLE treatment, and the resulting value of ΔΔG is consistent with experimental data on the magnitude of the hydrophobic effect. For example, the contribution of the hydrophobic effect to the stability of biomolecular complexes is 0.3 kJ mol−1 per Å2 of buried non-polar surface, which corresponds to −3 kJ mol−1 for the interaction between two non-polar SSIPs with a contact surface area of 9 Å2.114
![]() | ||
| Fig. 21 FGIP for water calculated using SSIMPLE. Red regions represent unfavourable interactions, blue regions represent favourable interactions, and the contours are labelled with ΔΔG values in kJ mol−1. Representative solute functional groups are illustrated (reproduced from ref. 110 with permission from RSC Publishing, copyright 2020). | ||
Fig. 22 shows how the FGIP of water changes when cosolvents are added. In a mixture of water and THF, the hydrophobic region disappears, but the other three quadrants are unaffected (Fig. 22(a)). The reason for this behaviour is that in a mixture of water and THF non-polar solutes are preferentially solvated by the CH groups of THF. As a result, the driving force for the hydrophobic effect is eliminated, because no water–water H-bonds are formed when the solutes are desolvated. In a mixture of water and HMPA, the hydrophobic quadrant again disappears due to preferential solvation of non-polar solutes by the CH groups of the cosolvent (Fig. 22(b)). In this case, the favourable interactions between polar solutes in the other blue quadrant of the FGIP are also suppressed. The reason for this behaviour is that HMPA has an exceptionally polar H-bond acceptor (β = 10.9), so H-bond donor solutes are more strongly solvated in the mixed solvent than in pure water. Quantitative measurements of the effects of cosolvents on protein denaturation show that THF and HMPA are two of the most effective solvents at unfolding proteins, and the FGIPs in Fig. 22 provide an explanation.115
![]() | ||
| Fig. 22 (a) FGIP for an equimolar mixture of water and THF. (b) FGIP for an equimolar mixture of water and HMPA. Red regions represent unfavourable interactions, blue regions represent favourable interactions, and the contours are labelled with ΔΔG values in kJ mol−1 calculated using SSIMPLE.110 | ||
The generalised FGIP in Fig. 4 suggests that all polar protic solvents should show solvophobic properties similar to water, but this is not necessarily the case. Fig. 23 shows the FGIPs for two polar protic solvents where the solvophobic quadrant is absent. The FGIP for ethanol (Fig. 23(a)) resembles the FGIP for a mixture of water and THF shown in Fig. 22(a), and the reasons for the absence of a solvophobic region are similar. In ethanol, non-polar solutes are solvated by the non-polar CH groups of ethanol rather than the polar OH group (see Fig. 17(b)), and so the new solvent–solvent interactions that are made when non-polar solutes are desolvated are not H-bonds. Exchange of non-polar contacts leads to no overall change in interaction energy, so the bottom left quadrant of the FGIP is now white rather than blue. Fig. 23(b) shows the FGIP for liquid ammonia. In this case, the solvent has one H-bond acceptor and three H-bonds donors, so oligomeric chains of H-bonded solvent molecules predominate. However, only one of the three H-bond donors can be paired, and the excess H-bond donors present in the solvent are always available for interaction with H-bond acceptor solutes.116 The result is that the corresponding red quadrant where solute–solvent interactions dominate is enlarged to fill most of the bottom half of the FGIP, and the solvophobic quadrant disappears. Solvophobic effects are only observed if the solvation of a non-polar solute has to compete with polar solvent–solvent interactions. If the solvent has additional interaction sites that are not fully H-bonded, these sites will be used to solvate non-polar solutes, and there is no solvophobic region in the FGIP. Other than water, two solvents that have an FGIP with a large solvophobic quadrant are formamide and ethylene glycol, which is consistent with Abrahams experimentally determined solvophobicity scale.103,110
![]() | ||
| Fig. 23 (a) FGIP for ethanol (b) FGIP for ammonia. Red regions represent unfavourable interactions, blue regions represent favourable interactions, and the contours are labelled with ΔΔG values in kJ mol−1 calculated using SSIMPLE.110 | ||
![]() | ||
| Fig. 24 Comparison of calculated and experimental phase transfer free energies (ΔG, mole fraction standard state) for transfer of 214 compounds from n-octanol to water n-hexadecane to water, 177 compounds from the pure liquid to water, 177 compounds from the pure liquid to n-hexadecane, and for transfer of water from water to 115 different pure liquids (RMSE = 3.4 kJ mol−1).11 | ||
Phase transfer free energies calculated using SSIMPLE have been used to parametrise beads that are used to represent fragments of molecules in Dissipative Particle Dynamics (DPD) simulations. The SSIMPLE approach is ideally suited to the description of molecular fragments, since any SSIPs that would sit at the junction point between two fragments in the complete molecule can be removed and do not contribute to the calculated solvation energies. A DPD description of a range of surfactants has been developed using this methodology, and simulations of aqueous solutions of these compounds accurately predict self-assembly properties such as critical micelle concentration and aggregation number.117,118
:
1 complex formed between molecular iodine and tetramethyl thiourea, which has been studied in 15 different solvents. Solvent effects on the stability of this complex are quite different from closely related halogen-bonded and H-bonded complexes (Fig. 25). Using the non-covalent interaction parameter α for molecular iodine in eqn (2) accurately predicts the association constants measured for the halogen-bonded complex formed with tetramethyl urea (green points). Similarly, the β values for both tetramethyl urea and tetramethyl thiourea and accurately predict the association constants measured for the H-bonded complex formed with 4-azophenylphenol (blue and red points respectively). In contrast, the iodine–tetramethyl thiourea complex is four orders of magnitude more stable than predicted by eqn (2) (black points). Spectroscopic and structural evidence supports the conclusion that a significant covalent contribution leads to the anomalous stability. A survey of cocrystals of molecular iodine and thiocarbonyl compounds in the CSD shows that short intermolecular sulfur–iodine distances are correlated with increases of up to 0.4 Å in the iodine–iodine bond length.57
![]() | ||
Fig. 25 Comparison of association constants (log K) for formation of 1 : 1 complexes calculated using eqn (2) with the corresponding experimental values for four different complexes in 15 different solvents.57 The structures of the complexes are shown, and the colouring of the boxes indicates which data points belong to which complex. The line corresponds to y = x. | ||
Covalent interactions probably make a contribution in all of the complexes discussed in this review, but in most cases, these effects are small enough to be captured as perturbations in the empirical values of the non-covalent interaction parameters α and β. Observation that a complex is significantly more stable than predicted by the non-covalent interaction parameters is an indication that covalent contributions may be important.57,137,138
Software that can used to footprint a molecule in order to obtain the SSIP description and to calculate solvation energies using SSIMPLE is freely available at https://gitlab.developers.cam.ac.uk/ch/hunter/ssiptools. More details can be found in ref. 89.
Footnote |
| † Electronic supplementary information (ESI) available: Lists of experimentally determined non-covalent interaction parameters α, β, αS and βS. See DOI: https://doi.org/10.1039/d2cs00701k |
| This journal is © The Royal Society of Chemistry 2022 |