Katarzyna J.
Zator
and
Christopher A.
Hunter
*
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK. E-mail: herchelsmith.orgchem@chem.ac.uk
First published on 19th May 2025
Atomic surface site interaction points (AIP) provide a complete description of the non-covalent interactions that one molecule can make with another. The surface site interaction model for the properties of liquids at equilibrium (SSIMPLE) algorithm can be used to calculate the free energy change associated with the pairwise interaction between two AIPs on two different molecules in any solvent. Summing these pairwise AIP interactions across an intermolecular interface that involves multiple interacting sites can be used to calculate solution phase binding free energies and association constants. A computational tool that converts the three-dimensional structure of a complex into a set of AIP contacts has been developed along with a visualisation tool to display AIP interaction maps, allowing straightforward identification of the key intermolecular contacts that contribute most to the overall binding free energy in a complex. The method successfully reproduces solution phase association constants (to within an order of magnitude) for a range of host–guest complexes involving H-bonding, aromatic and hydrophobic interactions, but performs less well for halogen-bonds and complexes involving interactions between the extended π-surfaces of fullerene-type compounds.
We have been developing computational tools for the analysis of binding interfaces based on the molecular surface properties of the interacting molecules.14–21 Atomic surface site interaction points (AIP) are used to describe individual interaction sites on the surface of a molecule and can be calculated ab initio from molecular electrostatic potential surfaces (MEPS) using density functional theory (DFT).22 Each AIP represents 9 Å2 on the van der Waals surface of the molecule (i.e. the footprint of a H-bonding interaction)20 and is assigned an interaction parameter, εi, based on the maximum or minimum electrostatic potential for that patch of surface. This set of AIPs describes all possible interactions that the molecule can make with the surroundings, and the SSIMPLE algorithm can be used to calculate solvation energies for each AIP in any solvent or solvent mixture.17,20 We have previously shown that this approach can be used to make accurate predictions of phase transfer free energies between two solvents (e.g. water and n-hexadecane) and to identify cocrystal coformers by virtual screening.23 Here, we use AIPs to predict association constants for the formation of intermolecular complexes by summing the free energy contributions due to pairwise AIP contacts in the binding interface.
Fig. 1 shows the AIP representation of different functional groups. In order to obtain the AIP description of a molecule, three different MEPS are calculated using DFT.22 Minima on the 0.0300 e bohr−3 electron density isosurface identify the H-bond acceptor sites highlighted in Fig. 1, and the corresponding interaction parameters are assigned using eqn (1).
εi = 0.0336Emin + cβ | (1) |
Local minima on patches of the 0.0020 e Bohr−3 electron density isosurface associated with the non-polar acceptor sites highlighted in Fig. 1 are used in eqn (2) to obtain the corresponding interaction parameters.
εi = 0.0232Emin | (2) |
εi = 0.0132Emax − 2.80 | (3) |
For non-polar positive AIPs associated with CH and SH groups, local maxima on patches of the 0.0104 e Bohr−3 electron density isosurface were used in eqn (4) to obtain the interaction parameters.
εi = 0.0078Emax − 0.64 | (4) |
Fig. 2 illustrates the approach for two molecules that form a 1:
1 complex. Each molecule is represented by a set of AIPs, and the interaction between the two molecules is described by a set of pairwise interactions between AIPs on one molecule and AIPs on the other. The free energy change associated with each AIP contact can be calculated from the corresponding interaction parameters, as described previously,22,24 and the total free energy change for formation of the complex can be obtained by summing pairwise AIP interactions.14 For H-bonding interactions (Fig. 2a), when the AIPs of the two functional groups are projected onto the three-dimensional structure of the complex, the H-bond donor AIP coincides in space with the H-bond acceptor AIP, so this contact can be straightforwardly and unambiguously identified. In general, non-covalent interaction interfaces are not so simple, and the AIPs representing two interacting sites may not coincide as neatly as in Fig. 2a. For example, Fig. 2b shows an aromatic stacking interaction. In this case, when the AIPs of the two functional groups are projected onto the three-dimensional structure of the complex, multiple AIPs are close in space, and it is not obvious which AIPs should be paired to best represent the interaction.
![]() | ||
Fig. 2 Atomic interaction point (AIP) description of (a) a H-bonding interaction between an amide and a pyridine, and (b) an aromatic stacking interaction. |
Here we describe a general solution to the problem of identifying a unique set of pairwise AIP contacts to describe the non-covalent interactions present in an intermolecular complex. We use a collection of supramolecular host–guest complexes to show that this new computational approach can be used to analyse the three-dimensional structure of a complex in order to obtain a reliable estimate of the solution phase association constant in a wide range of different solvents.
These SASA values were used to identify which AIPs are desolvated on binding and which remain in contact with the solvent. Parameterisation of the SASA criteria was based on X-ray crystal structures of protein–ligand complexes, which have interstitial water molecules in the binding interface. The three-dimensional structures of complexes from the CASF dataset were retrieved from the protein databank and analysed for water molecules that were resolved in the intermolecular binding interface. The AIPs that interact with the water molecules in these structures are easily identified as close contacts. For each structure, the interstitial water molecule was removed (highlighted in blue in Fig. 4a), and SASA were calculated for each AIP in the remaining pocket (highlighted in yellow in Fig. 4b). In general, the AIPs had been in contact with the water molecule in the original structure had the largest SASA values, but for seven complexes (1NC1, 1NC3, 3G2N, 3G2Z, 4K18, 4LLX, 4TWP), the SASA values were all rather small, and the discrimination between AIPs that did and did not contact the interstitial water molecule was less clear cut. These structures were therefore used to optimise the SASA criteria for identifying solvated AIPs.
The SASA probe radius was varied between 0.0 and 1.4 Å in increments of 0.05 Å, and the resulting AIP SASA values for all seven complexes were analysed for differences between the set of AIPs that contact water and the set of AIPs that do not. The 0.35 Å probe radius gave the largest difference, with 9.8 Å2 providing the threshold value for assignment of a solvated site. This value can be compared with the footprint of an AIP: 9 Å2 on the van der Waals surface of an oxygen atom is equivalent to 14 Å2 projected onto the 0.35 Å probe radius SASA. In other words, an AIP is considered solvated, if more than 70% of the associated surface is exposed to solvent. For fractional AIPs, the threshold SASA was therefore scaled by the corresponding fractional parameter f, which represents the surface area of these sites. Although this parameterisation is based on finding a small number of sites solvated by water, the resulting analysis is based purely on the shape of the SASA of the solutes, and we assume that the parameters do not depend on the solvent or AIP type.
This SASA analysis eliminates a large number of AIPs, and only the remaining AIPs are candidates for involvement in intermolecular interactions that contribute to the overall stability of the complex.
The value of dmax is the only adjustable parameter that determines the performance of the pairing algorithm. Analysis of X-ray crystal structures of protein–ligand complexes from the CASF dataset was used to test different values of dmax.29 Values less than 1 Å led to almost no AIP pairings, and values of greater than 2 Å led to networks that were too large for computation using the MBP algorithm. Values of dmax midway between these two extremes gave AIP pairings consistent with the close contacts observed between atoms in the interaction interfaces. We use a default value of dmax of 1.7 Å, which was sufficiently large to capture all of the potential AIP contacts in an intermolecular interface and small enough to keep the computational cost down (about a minute on a desktop computer for interaction surfaces involving 200 atoms).
An additional complication in the analysis of these networks arises due to the fractional AIPs that are used to represent some functional groups. The original formulation of surface site interaction points (SSIP) used the same molecular surface area (9 Å2) for each SSIP, because this approach allows a straightforward treatment of the van der Waals (or dispersion) contribution to the interaction energy between two SSIPs.20 However, AIPs that describe π-sites on aromatic carbon atoms are assigned a fraction parameter, f = 0.5, because they represent half the molecular surface area of a standard SSIP.22 Thus an AIP with a fraction parameter f = 1.0 could interact with two different AIPs with a fraction parameter f = 0.5. For example in the pairing process shown in Fig. 2, if A1 and A2 are both f = 0.5 AIPs, and B2 is a f = 1.0 AIP, then A1·B2 and A2·B2 are no longer mutually exclusive pairings. This situation is handled by copying instances of f = 1.0 AIPs into two identical f = 0.5 AIPs at the same position, allowing two different pairwise interactions with the same site. In this case, a fractional weighting of 0.5 is also applied to the distances between the pairs of interacting AIPs used to calculate dtotal, the distance parameter used to determine the set of pairings that best represents the non-covalent interactions present in the interface.
ΔG(i,j) = ΔGB(i) + ΔGB(j) − ΔGS(i) − ΔGS(j) | (5) |
The free energy of each AIP in the bound state is given by eqn (6).
![]() | (6) |
![]() | (7) |
![]() | (8) |
The solvation free energy of each AIP in the free state can be calculated for any solvent using SSIMPLE.20 In SSIMPLE, all pairwise interactions between solvent and solute AIPs in the liquid phase are described by eqn (8), which allows calculation of the Boltzmann distribution of AIP contacts and hence the solvation energy for an individual solute AIP, ΔGS(i). We have previously shown that for a specific solvent the relationship between ΔGS(i) and the AIP interaction parameter εi can be described accurately using a simple polynomial (eqn (9)).24,30
![]() | (9) |
The overall free energy change for complexation is simply the sum of the individual contributions from each AIP contact scaled by the fractional parameter, f, for interactions involving fractional AIPs (eqn (10)).
![]() | (10) |
Since each AIP represents 9 Å2 of molecular surface area, the AIP contact analysis of even relatively small intermolecular interfaces turns out to be rather complicated, because a large number of interactions are involved. A graphical method has therefore been developed to visualise the most important AIP contacts that contribute to the overall stability of a complex. These AIP interaction maps provide a three-dimensional display of all of the AIP contacts that contribute more than a user-specified threshold to the free energy change of complexation in a given solvent. The default setting is that attractive interactions worth more than 5 kJ mol−1 are plotted as dark green spheres, attractive interactions worth more than 0.5 kJ mol−1 as light green spheres, repulsive interactions worth more than 0.5 kJ mol−1 as orange spheres, and all weaker contacts are plotted as small grey spheres, so that they do not confuse the image unnecessarily.
Complex | Major NCI | Solvent | Error (kJ mol−1) | Ref. | ||
---|---|---|---|---|---|---|
1 | Aromatic | Chloroform | −17.6 | −15.6 | −2.0 | 32 |
2 | Aromatic | Chloroform | −5.9 | −2.6 | −3.3 | 32 |
3 | Aromatic | Dichloromethane | −6.3 | 2.0 | −8.3 | 33 |
4 | Aromatic | Dichloromethane | −7.5 | 0.6 | −6.9 | 33 |
5 | Aromatic | Chloroform | −21.7 | −9.0 | −12.7 | 34 |
6 | Aromatic | Chloroform | −19.2 | −9.2 | −10.0 | 34 |
7 | H-bond | Chloroform | −34.7 | −19.8 | −14.9 | 35 |
8 | H-bond | Chloroform | −13.8 | −13.7 | −0.1 | 35 |
9 | H-bond | Chloroform | −48.9 | −44.0 | −4.9 | 36 |
10 | H-bond | Dichloromethane | −26.3 | −30.9 | 4.6 | 37 |
11 | H-bond | 1,1,2,2-Tetrachloroethane | −11.8 | −11.6 | −0.2 | 38 |
12 | H-bond | 1,1,2,2-Tetrachloroethane | −15.0 | −14.5 | −0.5 | 38 |
13 | H-bond | 1,1,2,2-Tetrachloroethane | −11.1 | −7.6 | −3.5 | 39 |
14 | Desolvated cage | Mesitylene | −41.8 | −39.8 | −2.0 | 40 |
15 | Desolvated cage | Mesitylene | −37.6 | −44.9 | 7.3 | 40 |
16 | Hydrophobic | Water | −12.5 | −12.0 | −0.5 | 41 |
17 | Hydrophobic | Water | −20.5 | −22.8 | 2.3 | 41 |
18 | Hydrophobic | Water | −58.9 | −58.3 | −0.3 | 42 |
19 | Halogen bond | Cyclohexane | −2.9 | −30.9 | 28.0 | 43 |
20 | Halogen bond | Cyclohexane | −21.3 | −48.8 | 27.5 | 43 |
21 | Extended π | Chloroform | −23.0 | 17.7 | −40.7 | 44 |
22 | Extended π | Chloroform | −9.2 | 22.9 | −32.1 | 44 |
23 | Extended π | Toluene | −22.2 | 2.8 | −25.0 | 45 |
24 | Extended π | Toluene | −21.3 | 3.5 | −24.8 | 45 |
25 | Extended π | Carbon disulfide | −18.4 | 0.0 | −18.4 | 46 |
26 | Extended π | Carbon disulfide | −17.6 | −0.3 | −18.9 | 46 |
The X-ray crystal structure of the complex was used as the starting point for the calculation, and the coordinates of the two components of the complex were extracted as separate files. For each molecule, the MEPS was calculated without any geometry optimisation at the 0.0300, 0.0104 and 0.0020 e Bohr−3 electron density isosurface using NWChem 7.0.228 and DFT with the B3LYP functional and a 6-31G* basis set (or 6-31G** for iodine). Footprinting software described previously was used to convert the MEPS to AIPs, and then the AIPs of the two components were superimposed on the X-ray crystal structure of the complex. The AIP pairing analysis described above was then used to identify contacts and calculate the total solution phase interaction energy.
There is an assumption implicit in this approach that the three-dimensional structure observed in the crystal does not change in solution. Since the analysis is based on assignment of AIP contacts, precise interatomic distances are not critical in determining the interaction energies. There is an inherent fuzziness in the pairing of AIPs based on the distance criteria, so small variations in the geometry of the complex will not have a significant effect on the calculated interaction energy. However, a gross rearrangement of the structure of the complex in solution would render the analysis invalid.
Fig. 5–10 shows the AIP interaction maps calculated for each of the complexes in Table 1, and the corresponding free energy changes calculated using eqn (10) are listed in Table 1 (see ESI† for a detailed list of individual AIP contacts and associated free energy contributions for each complex). It was possible to treat the desolvated cages in the same way as the other complexes by simply omitted the ΔGS term describing solvation of the host AIPs from eqn (5).
![]() | ||
Fig. 7 AIP interaction maps for complexes where the host is a cage that is desolvated in the free state. Chemical structures of the components are shown. |
![]() | ||
Fig. 9 AIP interaction maps for complexes where halogen bonding is the major non-covalent interaction. Chemical structures of the components are shown. |
The aromatic interaction complexes all feature guests with electron-withdrawing groups and hosts with more electron-rich π-systems (Fig. 5). In complex 1, the guest π-face AIPs are sufficiently positive to make attractive interactions with the negative AIPs that represent the π-faces of the host. There are also four attractive edge-to-face contacts leading to a large favourable free energy change that agrees well with experiment. In complex 2, the guest π-face AIPs are not positive enough to make attractive interactions with the host π-face AIPs, and complexation is dominated by the four edge-to-face interactions. The reduction in binding affinity compared with complex 1 agrees well with the experimental results. For complexes 3 and 4, the AIPs are not sufficiently polar for any contacts to compete with desolvation, and the calculated free energy changes are all very small. In complexes 5 and 6, the guest π-face AIPs are positive and make a large number of attractive interactions with the host π-face AIPs. However, the net interaction energy is somewhat less favourable than experiment in both cases. Although the AIP contact maps provide a good qualitative description of the significance of different contacts in these complexes, summing over a large number of relatively non-polar AIP interactions results in a large error in the totals for the calculated free energy changes.
In contrast, the H-bond complexes are characterised by a small number of very polar AIP interactions (Fig. 6). In this case, the accuracy of the calculated free energy changes depends on the accuracy of the AIP polar interaction parameters εi. In all cases, the H-bonds are clearly identified as green balls in the AIP interaction maps, and the agreement with the experimental free energy changes is excellent. The one discrepancy is complex 7, which is significantly more stable than the calculation predicts. This discrepancy appears to be due to the rather low polarity assigned to the amide AIPs involved in the H-bonding interactions.
The properties of the desolvated cage complexes are well-described by the AIP interaction maps in Fig. 7. In both cases, there are a large number of non-polar AIP contacts, which are each worth about −3 kJ mol−1 due to the lack of solvent competition for van der Waals interactions with the host sites. The total calculated free energy changes agree well with experiment.
The hydrophobic complexes in Fig. 8 are also characterised by a large number of non-polar AIP contacts, each worth −3 to −4 kJ mol−1. However, in this case, the driving force comes from the favourable desolvation term associated with the poor solvation of non-polar sites by water. Again the total calculated free energy changes agree well with experiment.
The halogen bond and extended π-surface complexes are less well-described. Although the halogen bonds are clearly identified in the AIP interaction maps (Fig. 9), the calculated free energy changes for these complexes are significantly more favourable than the experimental values.
We have previously investigated the description of halogen-bonded complexes using the empirical H-bond parameters, α and β, on which the AIP polar interaction parameters εi were parameterised. For many simple complexes that make a single halogen-bond, it was possible to accurately reproduce the experimentally measured association constants K using eqn (11).22,47
−RT![]() ![]() | (11) |
However, some halogen-bonded complexes are not well-described by eqn (11), specifically complexes of perfluoroiodocarbons with tertiary amines and the complex formed by molecular iodine and tetramethylthiourea, where there appears to be a significant covalent contribution to the interaction energy that leads to a stabilisation of the complex.47,48 In contrast, halogen-bonded complexes 19 and 20 are significantly less stable than predicted by the AIP calculations. The AIP pairing method is based on point contacts with no consideration of the precise arrangement of the interacting groups, and it is possible that there is some geometrical mismatch in these complexes that prevents optimal pairwise interactions at all of the halogen-bond sites, which results in a destabilisation that is not captured by the calculations.
The AIP interaction maps of the extended π complexes are characterised by a large number of weakly unfavourable contacts (Fig. 10). For complexes 21 and 22, contacts between negative π-face AIPs lead to positive calculated free energy changes. For the fullerene complexes 23–26, most of the guest AIP values are close to zero, and the calculated free energy changes are small. These results suggest that there is an additional contribution to the stability of these complexes that is not captured by the AIP contact analysis. Since these complexes feature interactions between non-polar surfaces, the main driving force for binding is presumably dispersion. In the SSIMPLE model, dispersion is introduced through a constant van der Waals interaction energy of −5.6 kJ mol−1 per AIP contact (see eqn (8)). However, dispersion interactions between solute AIPs are largely cancelled by dispersion interactions with solvent AIPs, which is consistent with experimental measurements of non-polar interactions in non-polar solvents.49,50 As a result, the net contribution of dispersion to the calculated intermolecular interaction energies is small, except in the case of the desolvated cages where this cancellation does not occur. The AIP model makes reasonable predictions for the smaller aromatic interaction complexes, so the discrepancy for the extended π complexes appears to be related to the very large surface area of contact. One possible explanation is that there are cooperative effects across larger flat surfaces, and the AIP model, which treats individual AIP contacts as mutually independent, would not capture such effects.
The calculated free energy changes can be used to estimate association constants for the complexes (K = exp(−ΔG°/RT)). Fig. 11 shows the correlation between calculation and experiment for the complexes featuring conventional interaction types that are well-described by the AIP interaction maps (aromatic interactions, H-bonds, hydrophobically driven complexation, and desolvated cages). Although there are some outliers, the AIP description of these complexes is rather good, with predicted binding affinities accurate to within one order of magnitude.
![]() | ||
Fig. 11 Comparison of the experimentally measured association constants for formation of 1![]() ![]() |
A range of host–guest complexes were used to investigate the utility of the methods. X-ray crystal structures of the complexes were used to identify AIP contacts and hence calculate binding free energies. Complexes that are dominated by H-bonding and aromatic interactions are described well by the AIP pairing calculations, which provide reasonably accurate predictions of the experimentally measured association constants in different solvents (i.e. within one order of magnitude). Similar results were obtained for complexes in water, where binding is driven by contacts between hydrophobic surfaces. The treatment of solvation can easily be adapted to describe complexes formed by cages in solvents that are too large to fit inside the host cavity. In this case, the term describing solvation of the host AIPs was simply removed from the free energy calculation to obtain accurate predictions of the association constants. In contrast, halogen-bonded complexes and complexes involving interactions between the extended π-surfaces were not well-described by the AIP method. Nevertheless, AIP interaction paps of intermolecular complexes are relatively straightforward to calculate and provide useful insight into the nature of the interactions that lead to complex formation and the role of solvent in determining observed binding affinities.
Footnote |
† Electronic supplementary information (ESI) available: Details of the algorithm, polynomials for calculation of solvation energies in 261 different solvents, AIP contacts in host–guest complexes. The code to calculate AIPs is available at https://github.com/k-zator/AIP and https://github.com/k-zator/AIP_MEPS. The code to identify AIP–AIP pairings has been developed in Python and has been documented and deployed at https://github.com/k-zator/AIP_map. See DOI: https://doi.org/10.1039/d5cp00398a |
This journal is © the Owner Societies 2025 |