Matthew R.
Davis
and
Dennis A.
Dougherty
*
Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena CA 91125, USA. E-mail: dadougherty@caltech.edu
First published on 8th October 2015
Cation–π interactions are common in biological systems, and many structural studies have revealed the aromatic box as a common motif. With the aim of understanding the nature of the aromatic box, several computational methods were evaluated for their ability to reproduce experimental cation–π binding energies. We find the DFT method M06 with the 6-31G(d,p) basis set performs best of several methods tested. The binding of benzene to a number of different cations (sodium, potassium, ammonium, tetramethylammonium, and guanidinium) was studied. In addition, the binding of the organic cations NH4+ and NMe4+ to ab initio generated aromatic boxes as well as examples of aromatic boxes from protein crystal structures were investigated. These data, along with a study of the distance dependence of the cation–π interaction, indicate that multiple aromatic residues can meaningfully contribute to cation binding, even with displacements of more than an angstrom from the optimal cation–π interaction. Progressive fluorination of benzene and indole was studied as well, and binding energies obtained were used to reaffirm the validity of the “fluorination strategy” to study cation–π interactions in vivo.
Even if a structure is available and it shows a close contact between an ion and a π system, it cannot be assumed that a functionally significant noncovalent interaction exists. Over the past 20 years we have addressed this issue using non-canonical amino acid mutagenesis.4,8 The aromatic of interest (the side chain of a phenylalanine (Phe), tyrosine (Tyr), or tryptophan (Trp)) is progressively fluorinated. Fluorine is well known to be deactivating in a cation–π interaction, and its effects are typically additive. One thus expects a correlation between protein function and/or ligand binding and degree of fluorination if a cation–π interaction is important. In a number of systems we have found a linear trend between the activation of a receptor by a cationic ligand and the calculated binding of a sodium ion to a series of fluorinated aromatic rings (indoles to mimic the side chain of Trp or benzenes to mimic Phe/Tyr). We considered this compelling evidence for a cation–π interaction.
This “fluorination strategy” is surprisingly general. Linear plots have been seen in over 30 cases, spanning a range of proteins and ligand types. Drug-like molecules with widely differing structures have been studied, including quaternary ammonium ions (acetylcholine) and protonated amines, including primary (glycine, GABA, serotonin), secondary (epibatidine, cytidine, varenicline) and tertiary (nicotine). In addition, more complex cations such as granisetron, ondansetron,9 and the guanidinium toxin tetrodotoxin (TTX)10 have shown linear fluorination plots. In contrast, a study of another guanidinium compound, meta-chlorophenyl biguanide (mCPBG) binding to the 5-HT3 (serotonin) receptor showed behavior that was difficult to interpret.11 In all cases we compared experimental data to the binding of Na+ to the appropriate aromatics. While it may be reasonable to assume that a primary ammonium ion (RNH3+) is well modeled by Na+, more complex ions such as a quaternary ammonium or a guanidinium show much different charge distributions (Fig. 1) and so may display different binding behaviors.
To address this issue we have computationally evaluated fluorination effects on cation–π interactions involving the more complex cations ammonium (NH4+), tetramethylammonium (NMe4+), and guanidinium (Fig. 1). Substituent effects on cation–π interactions and related noncovalent interactions involving benzene have been the subject of several recent investigations, including some with very high levels of theory.12–14 These studies have revealed some unanticipated effects in such noncovalent interactions. The more modest goals of the present work involve the trends in cation–π binding energies in response to progressive fluorination for several combinations of cation and aromatic. When constrained to a cation–π binding geometry, these larger cations mimic the trends seen with Na+ as probe ion.
BE = (E+ + Eπ) − ETOT | (1) |
Existing aromatic box motifs were imported from the Protein Data Bank using Gaussview 5,17 and protons were added to the structures. Cation–π binding energies were calculated using Gaussian09.18 Aromatic boxes were generated by deleting backbone and β atoms of the amino acids as well as the noncationic portion of acetylcholine or trimethyllysine (to generate NMe4+ in both cases). The cation–π binding energy was computed by taking the single-point energy (M06/6-31G(d,p)) with and without the tetramethylammonium ion. Individual cation–π interactions between tetra-methylammonium and aromatic residues were found by deleting all but the cation and aromatic group of interest and computing a single point energy.
Several cation–π structures represented relatively shallow local minima, and were challenging to isolate. For the three separate orientations of the tetramethylammonium ion (one, two and three methyl groups facing the benzene), starting structures were the molecular mechanics-minimized structure. This enforced a symmetry on the structure that was maintained while optimizing at a higher level of theory. To achieve a stacked conformation for guanidinium–benzene, the distance from the center of the guanidinium molecule to three meta carbons of benzene was constrained to 3.4 Å and the (constrained) equilibrium geometry was calculated at M06/6-31G(d,p). The final stacked geometry was obtained using this restricted geometry as a starting point for full geometry optimization.
Ion | HF 6-31G(d,p) | HF 6-311+G(d,p) | MP2 6-31G(d,p) | MP2 6-311+G(d,p) | B3LYP 6-31G(d,p) | B3LYP 6-311+G(d,p) | M06 6-316(d,p) | M06 6-311+G(d,p) | Expt. |
---|---|---|---|---|---|---|---|---|---|
a Computed values are ΔE; experimental values are ΔH. All data in the gas phase. b Mean absolute error of the four values. | |||||||||
Na+ | 27.1 | 23.2 | 29.9 | 24.9 | 28.4 | 23.8 | 26.8 | 22.2 | 28.0 |
K+ | 19.3 | 15.6 | 23.8 | 19.5 | 20.8 | 16.1 | 20.3 | 16.3 | 19.2 |
NH4+ | 15.3 | 14.5 | 19.0 | 19.2 | 17.5 | 15.8 | 19.5 | 18.2 | 19.3 |
NMe4+ | 6.6 | 5.6 | 11.3 | 11.7 | 7.7 | 6.0 | 10.8 | 10.0 | 9.4 |
Errorb | 1.9 | 4.3 | 2.2 | 1.5 | 1.4 | 3.6 | 0.98 | 2.6 | — |
In recent years, a wide array of DFT methods has been developed, each targeting a specific computational challenge. Several methods have been specifically parameterized for noncovalent interactions, including M06-L, M06, M06-2X and others. A previous comparison of over a dozen DFT methods showed that M06 and M06-2X both performed well.16 For cation–π interactions, we find that M06 performs very well (Table 1), even with a modest basis set. The most significant effect is a substantial improvement in the binding to tetramethylammonium. Surprisingly, M06-2X overestimated the binding energies to benzene of Na+ and K+ by 2 and 4 kcal mol−1, respectively, and so we did not consider it further. Clearly, when considering DFT methods one must make a choice on a case-by-case basis. Interestingly, for all the methods considered except MP2, the larger 6-311+G(d,p) basis set gave poorer results than 6-31G(d,p). This likely reflects a cancellation of effects to some extent, but the results are consistent across the various cations considered. Given the results of Table 1, we have settled on M06/6-31G(d,p) as the optimal level of theory. It gives quite good results for all the systems considered. We note that the simpler HF/6-31G(d,p) method reproduces all the trends seen with M06; the largest differences are seen in the quantitative results for tetramethylammonium and guanidinium ions.
Cation–π energies of guanidinium ions to benzene were calculated in two separate conformations: T-shaped and stacked (see Fig. 2). As has been seen in other studies,31,32 T-shaped interactions have a much stronger binding energy. Interestingly, according to M06/6-31G(d,p) the guanidinium ion adopts a propeller shape in the stacked conformation, with the amine groups adopting a 25 degree torsion angle from the plane of the cation. All cation–π binding energies are listed in Table 2.
Benzene | F-benzene | F2-benzene | F3-benzene | |||
---|---|---|---|---|---|---|
1,3 | 1,4 | 1,2,3 | 1,3,5 | |||
a M06/6-31G(d,p) calculations; full geometry optimization unless otherwise noted. b Binding energy obtained via single point method. | ||||||
Sodium | 26.8 | 22.7 | 19.2 | 18.9 | 16.1 | 15.6 |
Potassium | 20.3 | 17.7 | 14.6 | 14.4 | 12.0 | 11.7 |
Ammonium | 19.5 | 16.2 | 12.8 | 12.9 | 10.1 | 10.1 |
Tetramethylammonium (1) | 6.5 | 5.3b | 4.3b | — | 2.9b | 3.0 |
Tetramethylammonium (2) | 8.7 | 7.2b | 5.5b | 5.7 | 4.0b | — |
Tetramethylammonium (3) | 10.8 | 9.0b | 7.1b | 7.0 | 5.3b | 5.4 |
Guanidinium (T-shaped) | 14.9 | 12.3 | 10.4 | 9.8 | 7.0b | 7.9 |
Guanidinium (stacked) | 8.6 | 7.2b | 5.8b | — | 4.6b | 5.1 |
We first probed the aromatic box by studying the simultaneous binding of 4 benzene rings to a tetramethylammonium.40 As seen in Table 3 and Fig. 4a, a tetramethylammonium can easily bind to multiple benzenes simultaneously. The total binding energy is 39.2 kcal mol−1 (Table 3). Each pairwise interaction has three methyl groups en face, and so four optimal interactions of this sort would produce a binding energy of 43.2 kcal mol−1 (4 × 10.8; listed as “Theoretical” in Table 3). The individual binding energies are nearly optimal, as evidenced by the “Sum” column in Table 3. Similar results are seen for 2 or 3 or 5 benzenes (ESI†), although the 5th benzene binds less tightly than the others.
Binding energy | Theoreticala | Sumb | Discrepancyc | |
---|---|---|---|---|
a Four times the binding energy for a single ion–benzene complex. b Four times the binding energy of a single cation-box aromatic. c The difference between the theoretical binding energy and the calculated binding energy. | ||||
NH4+ | 59.6 | 78.0 | 71.1 | 18.4 |
NMe4+ | 39.2 | 43.2 | 42.3 | 4.0 |
![]() | ||
Fig. 4 Geometry-optimized complexes of four benzenes to (a) one tetramethylammonium and (b) one ammonium ion. Binding energies computed at M06/6-31G(d,p). |
Receptors that bind primary ammonium ions (RNH3+) such as GABA or serotonin also have a cluster of aromatic amino acids at the agonist binding site, so we considered the binding of ammonium to an aromatic box. Again, four benzenes can fit comfortably around an ammonium ion with no obvious steric conflicts (Fig. 4b). Experimental studies of K+, which is similar in size to ammonium, have shown that multiple benzenes can complex the ion.21,41,42 The total binding energy of ammonium to four benzenes is 59.6 kcal mol−1, but now the individual interactions are not optimal; the sum value (four times the binding energy of a single cation-box aromatic) differs considerably from the true binding energy. The even larger difference between the theoretical value and the binding energy could reflect a small negative cooperativity due to conflicts between induced dipoles generated by the binding.
Such induced dipoles are expected to be smaller when tetramethylammonium is the ion, consistent with the smaller deviation seen in Table 3.
The aromatic box motif found in biological systems uses Phe, Tyr, and Trp, although there is a general bias toward Tyr and Trp.32 Based on studies using Na+ as a probe ion,44 Trp is expected to provide the strongest cation–π binding. Indeed, calculated binding energies for tetramethylammonium (3 methyls down) to benzene (Phe), phenol (Tyr), and indole (Trp) are 10.8, 12.5, and 15.1 kcal mol−1, respectively. As in previous studies, the cation is significantly offset from the center of the phenol,45 and it lies over the six-membered ring of the indole.32
We replaced the ACh of AChBP with tetramethyl-ammonium and evaluated the total binding energy as well as the interaction to each individual ring (Fig. 5a and Table 4). We find a total binding energy of 28.9 kcal mol−1, with four of the five aromatics contributing significantly. The largest contribution comes from TrpB, the residue that has been shown experimentally to make a functionally important cation–π interaction in most nAChRs.4
AChBP | Histone binding protein | ||
---|---|---|---|
a M06/6-31G(d,p) calculations; single-point energies. b Binding energy if all interactions were optimal and independent. | |||
Aromatic box | 28.9 | Aromatic box | 32.3 |
Predictedb | 67.7 | Predictedb | 42.7 |
TrpB | 8.3 | Trp50 | 13.9 |
TyrC2 | 8.0 | Trp47 | 12.0 |
TyrC1 | 7.5 | Tyr26 | 9.0 |
TrpD | 4.8 | ||
TyrA | 2.3 | ||
Sum | 30.9 | Sum | 34.9 |
We applied a similar analysis to the binding of KMe3+ to a histone binding protein (Fig. 5b).43 We find a total binding energy of 32.3 kcal mol−1 for the tetramethylammonium. Trp47 and Trp50 contribute the bulk of the binding energy, but Tyr26 is still significant (Table 4). Based on the distance dependence of tetramethylammonium binding from Fig. 3, it is evident that all eight potential methyl⋯aromatic interactions of Fig. 5b contribute to the total binding energy. For both protein structures, the calculated total binding energy is slightly less than the sum of the individual interactions, as was seen in the model structures. Also for both systems, the total binding energy is much less than what would be expected if all cation–π interactions were optimal; deviations of 39 and 10 kcal mol−1 are seen for AChBP and the histone binding protein, respectively.
![]() | ||
Fig. 5 The aromatic box motif binding sites of (left) ACh binding to AChBP (PDB: 3WIP).34 Residues are labeled according to their conventional designation in the nAChR. (right) Trimethyllysine binding to the polycomb chromodomain from Drosophila histone H3 protein (PDB: 1PFB).43 Distances (Å) are from a carbon to the ring centroid; the calculated optimal distance for such an interaction is 3.6 Å. |
The exceptions occurred with the larger ions tetramethylammonium and guanidinium binding to the fluorobenzenes and highly fluorinated indoles. In these cases full geometry optimization led to edge-on binding, with the cation attracted to the fluorine(s). Full details of these results are presented in the ESI.†
While these edge-on structures are interesting with regard to possible gas phase studies, the focus of the present work is the binding of cationic ligands to proteins, where structural constraints are likely to discourage edge-on geometries. We developed two strategies to determine the effect of fluorination on such constrained systems. First, we simply added fluorines to the ion–benzene complex (which is in an en face cation–π geometry) and performed single point calculations. Alternatively, we could use the symmetrical 1,4- and 1,3,5-fluorobenzenes, for which the ions stayed centered over the ring. The two strategies produced similar results, with the expected trends on progressive fluorination.
![]() | ||
Fig. 6 Cation–π binding plots for (a) glycine at the glycine receptor using computed cation–π binding energies for ammonium ions to fluorinated benzenes using in vivo data collected for activation of glycine receptor mutants46 and (b) acetylcholine at the α4β2 nicotinic acetylcholine receptor (nAChR) using in vivo data collected for activation of α4β2 nAChR mutants47 using cation–π energies for tetramethylammonium ions binding to fluorinated indoles. |
We have several demonstrations of strong cation–π interactions between ACh and nicotinic ACh receptors.47 The relevant calculation then is tetramethylammonium binding to fluorinated indoles. Fig. 6 shows such an analysis, considering tetramethylammonium binding to the indole ring with three methyl groups en face (Fig. 6). In each case, the calculated cation–π binding energies line up well with experimental results, confirming that the results of such an analysis of receptor binding are independent of the nature of the probe cation.
The aromatic box motif is common in protein cation–π binding sites. This suggests that multiple aromatics can simultaneously have favorable interactions with a cation, although it seems likely that not all interactions would be optimal in a protein structure. This prompted us to revisit the issue of the distance dependence of the cation–π interaction. As in an earlier, preliminary study,1 we find that the distance dependence is not steep (Fig. 3). For example, it could be argued that the cation–π interaction should be treated as an ion–quadrupole interaction. Ideally, an ion–quadrupole interaction should show a 1/r3 distance dependence for the binding energy. However, the cation–π interaction is better modeled by a 1/r2 distance dependence for the prototype Na+ and K+ ions. For larger ions a roughly 1/r3 distance dependence is seen, likely reflecting the increased importance of polarizability for the larger ions. In any case, displacement of the ion by a full angstrom from van der Waals contact (the optimal binding arrangement) diminishes the interaction by less than half.
We have taken two approaches to evaluating the aromatic box motif. First, we evaluated the binding of multiple benzenes to both tetramethylammonium and ammonium ions under optimal conditions. We find that four aromatics can easily bind simultaneously to either ion. For tetramethylammonium, the total binding energy is close to what would be expected if all four individual benzene tetramethylammonium interactions were optimal and independent. For ammonium there is a significant deviation from expectation based on four optimal ammonium⋯benzene interactions. Mostly, this is because the individual interactions are not optimal; clearly the ammonium ion is a bit too small to perfectly bind four benzenes simultaneously. There is a second, non-additivity factor that we attribute to a clash between induced dipoles in the four benzenes. Nevertheless, the global conclusion is that multiple aromatics can readily contribute to the binding of a cation through cation–π interactions.
We next considered two exemplary aromatic boxes from protein structures: the acetylcholine binding protein (AChBP) and a polycomb chromodomain binding to a peptide model of trimethyllysine from the histone H3 protein. In each case we truncated the ligand (ACh or trimethyllysine) to a tetramethylammonium. We anticipated that all contacts between tetramethylammonium and an aromatic of the box would not be optimal, and indeed that is the case. As shown in Table 4, the “predicted” binding energy – that which would arise if each cation–π interaction was optimal – is substantially larger than the actual binding energy. Recalling that an optimal cation–π interaction for tetramethylammonium to Tyr or Trp would be 12.5 or 15.1 kcal mol−1, it is clear that each interaction evaluated individually is weaker than optimal. The histone binding protein has three interactions approaching the ab initio-generated geometries and interaction energies, whereas some components of the AChBP box make only relatively minor contributions to the overall binding energy. Nevertheless, it is clear that a ligand can benefit simultaneously from multiple cation–π interactions to components of an aromatic box. Gratifyingly, the largest interaction in the AChBP is to TrpB, which is the residue most commonly involved in a functionally significant cation–π interaction in studies of nAChRs.
We also considered the extent to which the total binding energy deviates from the sum of the individual interactions. In each case, the deviation is not large, suggesting a relative lack of cooperativity (positive or negative) in the biological aromatic boxes.
We conclude that while the positioning of the aromatics in the biological structures is not optimal, a cation can still make energetically significant cation–π interactions to multiple amino acid side chains. The shallow distance dependence of the cation–π interaction presaged this observation.
While not as large as is theoretically possible, total binding energies for the two aromatic boxes are quite large. Like most noncovalent interactions, the cation–π interaction is attenuated on moving from the gas phase to a condensed medium, although the attenuation is not as large as is typically associated with a hydrogen bond or an ion pair, since the desolvation penalty is less severe.48 Additionally, these calculations do not include any adjustments that the protein structure must make to adapt to the ligand; here we are considering perfectly preorganized binding sites.
It is also interesting that, despite the presence of 5 cation–π interactions, the AChBP shows a smaller binding energy than the histone binding protein with 3 cation–π interactions, because the cation–π interactions in the latter are individually much stronger. Experimentally, AChBP binds ACh with a ∼4 mM affinity,33 while affinities on the order of 0.3 mM have been determined for the binding of KMe3+ to a histone binding protein.49
An additional goal of the present work was to determine whether previous experimental studies of complex biological receptors – in which cation–π binding to a receptor was correlated to the affinity of a Na+ to a substituted aromatic ring – would be altered if more realistic models of the biological ligand were used. The ligands studied included simple protonated amines, (RNH3+), ACh and related compounds (RNMe3+), and derivatives of guanidinium. As such, we have evaluated the binding of NH4+, NMe4+, and guanidinium to a series of fluorinated benzenes and indoles.
As noted above, several highly detailed analyses of noncovalent interactions involving aromatics have appeared in recent years.49 These have emphasized direct interactions between the ion and the substituent, an interaction that might be expected to favor geometries in which the ion migrates away from the center of the ring toward the substituent(s). We find that for the simpler ions (Na+, K+, and ammonium) the ion stays over the center of the ring as fluorines are introduced. In contrast, for tetramethylammonium and guanidinium, edge-on geometries are competitive with and in some cases superior to the cation–π geometries for multiply fluorinated aromatics (ESI†). However, when in a cation–π binding geometry – achieved either through constraints or by using symmetrically substituted benzenes – tetramethylammonium and guanidinium follow the same patterns as the smaller ions.
Our results allow more reasonable modeling of the experimental studies. We find, perhaps not surprisingly, that the linear trends between receptor activation and cation–π binding energy arise regardless of the precise nature of the cation used in the calculations. These more realistic calculations do, however, allow some interesting quantitative comparisons to be made. We have several studies of ACh binding to a series of fluorinated Trp derivatives. The largest effect we have seen is a 540-fold drop in affinity on going from Trp to F4-Trp, corresponding to 3.7 kcal mol−1.4 Our calculations for tetramethylammonium show a corresponding drop of 6.1 kcal mol−1. For a simple ammonium ion, our biggest effect has been seen for GABA binding to the GABAA receptor, with a 16500-fold drop in affinity on going from Phe to 3,4,5-F3-Phe, corresponding to 5.75 kcal mol−1.4 Our calculations on ammonium show a corresponding drop of 8.4 kcal mol−1. We consider these results to be quite encouraging. As noted above, one might expect some attenuation of a cation–π interaction on going from the gas phase to a receptor binding site. The fact that the trend between the two studies is in the right direction, with the protein binding being roughly 60% of the gas phase binding in both cases, suggests that the present computational model may be able to provide semi-quantitative analyses of protein cation–π binding sites.
All the ions studied here show a linear fluorination trend when in a cation–π (en face) binding geometry, enforced either by computational constraints or symmetrical substitution patterns. This supports the use of fluorination plots as a means to evaluate potential cation–π interactions.
Footnote |
† Electronic supplementary information (ESI) available: Description of binding substituent effect: Fig. S1–S4 and Tables S1–S3. See DOI: 10.1039/c5cp04668h |
This journal is © the Owner Societies 2015 |