 Open Access Article
 Open Access Article
      
        
          
            Vyshnavi 
            Vennelakanti
          
        
       ab, 
      
        
          
            Helena W. 
            Qi
          
        
      ab, 
      
        
          
            Rimsha 
            Mehmood
          
        
      ab and 
      
        
          
            Heather J. 
            Kulik
ab, 
      
        
          
            Helena W. 
            Qi
          
        
      ab, 
      
        
          
            Rimsha 
            Mehmood
          
        
      ab and 
      
        
          
            Heather J. 
            Kulik
          
        
       *a
*a
      
aDepartment of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. E-mail: hjkulik@mit.edu;   Tel: +1-617-253-4584
      
bDepartment of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
    
First published on 19th November 2020
Hydrogen bonds (HBs) play an essential role in the structure and catalytic action of enzymes, but a complete understanding of HBs in proteins challenges the resolution of modern structural (i.e., X-ray diffraction) techniques and mandates computationally demanding electronic structure methods from correlated wavefunction theory for predictive accuracy. Numerous amino acid sidechains contain functional groups (e.g., hydroxyls in Ser/Thr or Tyr and amides in Asn/Gln) that can act as either HB acceptors or donors (HBA/HBD) and even form simultaneous, ambifunctional HB interactions. To understand the relative energetic benefit of each interaction, we characterize the potential energy surfaces of representative model systems with accurate coupled cluster theory calculations. To reveal the relationship of these energetics to the balance of these interactions in proteins, we curate a set of 4000 HBs, of which >500 are ambifunctional HBs, in high-resolution protein structures. We show that our model systems accurately predict the favored HB structural properties. Differences are apparent in HBA/HBD preference for aromatic Tyr versus aliphatic Ser/Thr hydroxyls because Tyr forms significantly stronger O–H⋯O HBs than N–H⋯O HBs in contrast to comparable strengths of the two for Ser/Thr. Despite this residue-specific distinction, all models of residue pairs indicate an energetic benefit for simultaneous HBA and HBD interactions in an ambifunctional HB. Although the stabilization is less than the additive maximum due both to geometric constraints and many-body electronic effects, a wide range of ambifunctional HB geometries are more favorable than any single HB interaction.
Given the inherently quantum mechanical (QM) nature of the HB, care must be taken in defining and observing it in structural or computational studies. Use of geometric considerations (i.e., van der Waals radii) alone to determine the presence or absence of HBs can lead to erroneous conclusions.39,41,42 Some classical electrostatic models or empirical correlations43–45 have been developed along with first-principles investigations46–50 to understand the strength and nature of hydrogen bonding. Nevertheless, HBs can challenge conventional modeling methods, including approximate density functional theory (DFT) treatments51–53 that fail to accurately model long-range electron correlation. Although protein structures can be used to validate DFT and more accurate correlated wavefunction theory (WFT) methods,15,54–57 challenges remain. For example, short hydrogen bonds are ubiquitous in proteins, but they are often penalized during structural refinement.58 Many noncovalent distances in a range of 2.6–2.8 Å that would be deemed to be unfavorable in crystal structures are instead observed to be favorable and weakly stabilizing when evaluated with correlated wavefunction theory.59 These interactions cannot be captured with the classical force fields predominantly used to simulate proteins.59 Similar observations have recently been made in noncovalent interactions in DNA.60 Although force fields have been noted in recent years to be broadly improving in agreement with experiment,61 they can fail to describe noncovalent interactions essential for modeling protein structure in globular55,62 or intrinsically disordered proteins63 due to inherent limitations in the physics that can be captured by their functional forms.64,65
Cooperative, strong hydrogen bonds are an exemplary subset of HBs where modeling beyond the force field level is essential. In low-barrier hydrogen bonds (LBHBs),1,66 the barrier to hydrogen transfer is similar to the zero-point vibrational energy.67,68 These HBs are typically69 characterized by O⋯O separations around 2.6 Å or lower, and similarly short interactions are observed in charge-assisted hydrogen bonds and salt bridges.67,70–73 These functionally important interactions1,66,68,74 would be missed by standard force fields that do not treat charge transfer or polarization and disfavor short non-covalent distances.
The same assumptions that limit the modeling of LBHBs give rise to uncertainty about the interplay and balance of sidechain–sidechain,5,6 sidechain–backbone23 and intraresidue HBs75 in protein structure and function. For instance, it is poorly understood, given the number of potential hydrogen-bonding partners that a sidechain can form, the extent to which single or multiple hydrogen bonds are observed for a particular sidechain.76 Part of this challenge arises from the fact that the structure of most amino acids will require some compromise of the preferred linear and directional nature of the HB to form multiple interactions. This is in contrast to the lack of compromise required for base pairing interactions comprising simultaneous short, linear N–H⋯O and N–H⋯N HBs in RNA77,78 and DNA.79,80 As an example of the potential functionality offered by multiple, compromised interactions, we recently observed that simultaneous donor and acceptor HB interactions between the aliphatic hydroxyl of a Thr sidechain and a neighboring Asn played an essential role in native substrate recognition of a non-heme iron halogenase.81 This interaction, validated by experimental mutagenesis,82 was revealed with QM modeling but was challenging to replicate with force fields. Similarly, a large-scale screen of protein interactions suggests that many simultaneous donor/acceptor interactions can form between the aromatic Tyr and Asn or Gln residues.59
In this work, we thus investigate the balance of hydrogen bond acceptor (HBA) and donor (HBD) interactions in representative models of amino acid sidechains (i.e., Tyr/Ser/Thr with Asn/Gln) that can act simultaneously as HBAs and HBDs. Because the ability of charged residues (e.g., Arg, Lys, His, or Glu/Asp) to form multiple HBDs/HBAs depends upon their protonation state, we focus on a subset of neutral residues capable of forming multiple interactions. Given the challenges with ensuring the accuracy of computational models of HBs, we first carry out careful coupled cluster theory modeling of individual HBA or HBD interactions. The WFT-level analysis carried out here is critical, as force-field modeling fails to reproduce essential differences between aromatic and aliphatic amino acids. We confirm the suitability of our models and level of theory for capturing the structures favored in a curated set of 4000 HBs obtained from high-resolution (<1.5 Å) X-ray crystal structures of proteins. Using this analysis, we quantify the degree and nature of the benefit observed in the formation of ambifunctional (i.e., simultaneous HBA/HBD) HB interactions.
From the resulting set of 6114 residue pairs, we carried out further refinements to identify a subset for which HBs were most likely to be present. A wide range of HB distance (2.2–4.0 Å) and angle criteria (90–180°) have been proposed87–89 in the literature. Because consensus about HB distances and angles is not established, and the optimal HB distance or angle is strongly species dependent, we developed a quantum-mechanically derived approach to selecting distance and angle cutoffs. The presence of bond critical points (BCPs90) from the quantum theory of atoms in molecules (QTAIM)91 and the potential energy density evaluated at that point provides a heuristic estimate of HB strength.92 After adding and optimizing hydrogen atoms on representative (ca. 10%) residue pairs from PDB structures, we quantified the presence of BCPs and the potential energy density at the BCP using Multiwfn93 (ESI Text S1 and Table S2†). Our final range of HBA/HBD distances (N⋯O: 2.5–3.2 Å, O⋯O Ser/Thr: 2.4–3.1 Å, and O⋯O Tyr: 2.4–3.2 Å) and angles (N–H⋯O: 105–180° and O–H⋯O: 110–180°) was selected based on structures where BCPs were detected for N–H⋯O and O–H⋯O HBs. Our definition of an ambifunctional HB requires at least one of the O⋯O and N⋯O HB distances to fall within their specified HB distance criteria, whereas the second must only be within the 120% vdW radii sum criteria. Using these distance criteria, the refined protein data set consists of 3908 residue pairs. HB angle distributions were evaluated over all of these residues by an automated procedure that added hydrogen atoms, evaluated HB angles, and also classified N–H⋯O HBs as syn or anti (ESI Text S2†).
In both the protein residues and their truncated models, the planar amide hydrogen atoms can be either syn or anti with respect to the C![[double bond, length as m-dash]](https://www.rsc.org/images/entities/char_e001.gif) O bond (Fig. 1 and ESI Fig. S1†). The WFT N–H⋯O HB interaction energies of syn and anti N–H⋯O HBs agree within 1 kcal mol−1 for both models, so we select syn as the representative case for further comparison (Fig. 2 and ESI Table S3†). Between models, the interaction energies of N–H⋯O HBs in acetamide–p-cresol are smaller for both conformations (−6.1 kcal mol−1) than in the acetamide–methanol model (syn: −7.0 kcal mol−1), consistent with our expectations (Fig. 2 and ESI Table S3†). Although the aromatic p-cresol hydroxyl exhibits weaker interaction energies than the aliphatic hydroxyl in methanol, the difference remains modest (<1 kcal mol−1, Fig. 2 and ESI Table S3†). The optimized geometries support energetic observations; longer N⋯O HB distances (∼0.1 Å) and larger HB angles (∼6.3°) are observed for p-cresol than methanol N–H⋯O HBs (ESI Tables S4 and S5†).
O bond (Fig. 1 and ESI Fig. S1†). The WFT N–H⋯O HB interaction energies of syn and anti N–H⋯O HBs agree within 1 kcal mol−1 for both models, so we select syn as the representative case for further comparison (Fig. 2 and ESI Table S3†). Between models, the interaction energies of N–H⋯O HBs in acetamide–p-cresol are smaller for both conformations (−6.1 kcal mol−1) than in the acetamide–methanol model (syn: −7.0 kcal mol−1), consistent with our expectations (Fig. 2 and ESI Table S3†). Although the aromatic p-cresol hydroxyl exhibits weaker interaction energies than the aliphatic hydroxyl in methanol, the difference remains modest (<1 kcal mol−1, Fig. 2 and ESI Table S3†). The optimized geometries support energetic observations; longer N⋯O HB distances (∼0.1 Å) and larger HB angles (∼6.3°) are observed for p-cresol than methanol N–H⋯O HBs (ESI Tables S4 and S5†).
To quantify how well our energetic models reproduce the N⋯O HB distances in high-resolution crystal structures, we calculated one-dimensional (1D) potential energy curves (PECs) (see Section 5) as a function of the N⋯O HB distance and compared distances in the protein data set to the representative model systems (Fig. 1 and see Section 2). Features of the 1D PECs for the representative syn N–H⋯O HB conformation in both models are broadly consistent with the freely optimized model structures, both in terms of a deeper energy minimum for the methanol model (by ca. 0.9 kcal mol−1) and a shorter (methanol: 2.94 Å vs. p-cresol: 3.06 Å) N⋯O HB distance (Fig. 3, ESI Fig. S2 and S3†).
Despite differences in the hydroxyl placement on Ser or Thr, the most frequently observed N⋯O HB distances in Ser/Thr–Asn/Gln pairs agree well with the 1D PEC minimum for the acetamide–methanol model (Fig. 3, ESI Fig. S2 and S3†). This good agreement further supports the choice of truncated models (i.e., methanol for both Ser and Thr) to represent the protein residues. Although a range of distances are observed over the full protein set, only a small fraction of structures have distances different from the value at the minimum of the acetamide–methanol N⋯O HB 1D PEC (Fig. 3). Consistent trends hold for HBs with Tyr, where the shallower 1D PEC model coincides with both the slightly wider range and longer distances of X-ray crystal structures (Fig. 3). Overall, our truncated models capture key interactions from protein crystal structures whether in syn or anti N–H⋯O HB conformations, but sub-kcal mol−1 energetic differences (e.g., relative preference for syn or anti) in the models can be expected to be affected by competing backbone and environmental stabilization in the crystal structures (Fig. 1, 2, and ESI Text S3, Tables S3, S6 and S7, Fig. S4 and S5†).
The directionality of HBs provides a key indicator of their strength and character. We thus evaluated N–H⋯O HB angle distributions over the X-ray crystal structures (see Section 2). Ser/Thr and Tyr N–H⋯O angle distributions exhibit similar trends, with the highest angle probability between 150 and 170° and a rapid decay outside of that range (Fig. 4, ESI Fig. S6 and S7†). In our model systems, we had observed more obtuse N–H⋯O angles, with the acetamide–p-cresol angle (178°) larger than acetamide–methanol (171°), both of which exceed many of the angles observed in our X-ray structure data set (Fig. 4). Because we approximate the placement of hydrogen atoms to compute angles in the X-ray structures (see Section 2), this additional uncertainty limits determination of the relative directionality of Tyr versus Ser/Thr N–H⋯O HBs with Asn/Gln amino acid pairs (Fig. 4). For the N–H⋯O HB angles for X-ray structures of all amino acid pairs, average values (154°) are reduced by ca. 20° with respect to the model systems, suggesting a possible effect of the protein environment on favored HB angles (see Section 2 and ESI Table S6†). In the extended X-ray crystal structures of amino acid pairs with reduced HB angles, interactions with solvent molecules, backbone (i.e., N–H hydrogen or carbonyl oxygen) atoms of nearby residues or other interactions in the greater environment absent from the models are present to varying degrees (ESI Fig. S8†). We also optimized our small models in a dielectric medium (ε = 10) to capture the screening effect of the protein environment (ESI Table S8†). This approximate incorporation of environment effects leads to only small changes in HB distances (ca. 0.02 Å in both directions) and angles (ca. 2–5° increase) for either model (ESI Table S8†).
To examine the interplay of optimal HB distances and angles, we computed two-dimensional potential energy surfaces (PESs) of the N–H⋯O HB interaction. Qualitatively similar PES shapes are obtained for the acetamide–methanol and acetamide–p-cresol model systems (Fig. 5). The main difference between the two arises from the less favorable minimum in p-cresol that corresponds to a shallower overall PES, leading to more comparable interaction energies for the two models when the angles or distances are displaced from the minimum (Fig. 5, ESI Fig. S9 and S10†). Although in both cases, the strongest interaction energies are at the expected HB distances (i.e., between 2.8 and 3.2 Å) and HB angles (i.e., between 160° and 180°), structures over a wide 140–180° angle range are within 1–2 kcal mol−1 of the minimum (Fig. 5). In both cases, structures at a fixed distance always favor larger HB angles, but displacement from the equilibrium angle incurs considerably less penalty than distance displacement (Fig. 5).
Comparing protein structures to our model systems, the majority (i.e., over 85%) of Ser/Thr–Asn/Gln structures reside within 2 kcal mol−1 of the minimum on the computed model 2D PES (Fig. 5 and ESI Table S9†). Results for Tyr–Asn/Gln are consistent, but the shallower nature of the 2D PES for p-cresol leads to a smaller fraction (70%) residing within 2 kcal mol−1 of the model global minimum (Fig. 5 and ESI Table S9†). Over all amino acid pairs, a minority (10%) of X-ray structures sample smaller than expected HB angles (i.e., between 110° and 130°) that correspond to less favorable model interaction energies due to a short N⋯O distance but relatively long HBD to HBA (i.e., H⋯O) distance (ESI Table S9†). Examining the full protein in representative cases reveals competing HB interactions in the surrounding protein environment with alternative HB partners (e.g., solvent or other amino acids) that likely compensate for the formation of these weaker N–H⋯O HBs (ESI Fig. S11†).
While the 1D PEC O⋯O HB energetics are largely consistent with those of the freely optimized structures, some minor differences are apparent due to differences in the level of theory used (see Section 5, Fig. 3, ESI Fig. S2 and S3†). The 1D PEC acetamide–p-cresol O⋯O heavy-atom HB distance is further reduced (2.73 Å vs. 2.81 Å) with respect to methanol (Fig. 3, ESI Tables S4 and S5†). Overlaying X-ray crystal structure O⋯O distances on these 1D PECs confirms the suitability of the model systems for O–H⋯O HBs, with a significant fraction of Tyr–Asn/Gln O⋯O distances that are shorter than those for Ser/Thr–Asn/Gln (Fig. 3, ESI Fig. S2 and S3†). The distribution of X-ray crystal structure O⋯O HB distances is especially narrow for Tyr HBs, consistent with the steeper 1D PEC for Tyr in comparison to Ser/Thr (Fig. 3, ESI Fig. S2 and S3†). The most frequently observed O⋯O distances in X-ray crystal structures are slightly shorter (by ca. 0.1 Å) than the 1D PEC minima for both methanol and p-cresol (Fig. 3, ESI Fig. S2 and S3†). This effect is relatively modest, as it corresponds to interaction energies approximately 0.3 kcal mol−1 above the model 1D PEC minimum. Both the omission of the protein environment and our neglect of quantum nuclear effects58,68 particularly relevant at short HB distances could explain this discrepancy. Indeed, incorporation of a dielectric during optimization of our model systems leads to shorter (0.06–0.07 Å) HB distances (ESI Table S8†).
Analyzing the O–H⋯O HB angle distribution over the X-ray crystal structure data set highlights a greater preference for near-linear (i.e., 170–180°) angles in comparison to N–H⋯O HB angles, especially for the Tyr residue pairs (Fig. 4, ESI Fig. S6 and S7†). The greater strength of Tyr O–H⋯O HBs is consistent with a slightly higher fraction of the most linear angles in comparison to Ser or Thr (Fig. 4, ESI Fig. S6 and S7†). Average O–H⋯O HB angles (ca. 165°) are comparable for all residue pairs and higher than those for N–H⋯O HB angles by around 11° (ESI Table S10†). This increase in favored angles in the X-ray crystal structures suggests greater consistency with the O–H⋯O HB angles of fully optimized models in comparison to the N–H⋯O HB case (ESI Tables S4 and S5†).
Evaluation of the O–H⋯O HB 2D PESs highlights good agreement between the optimal model structures and observed X-ray crystal structures for all amino acid pairs (Fig. 5, ESI Fig. S9 and S10†). In comparison to N–H⋯O HBs, the joint distribution of X-ray crystal structure distances and angles is both more compact and more aligned with the lowest-energy regions of the model systems for both aromatic (i.e., Tyr or p-cresol) and aliphatic (i.e., Ser/Thr or methanol) hydroxyl HBDs (Fig. 5, ESI Fig. S9 and S10†). Almost all (ca. 95%) of the Ser/Thr X-ray crystal structures sample distances and angles within 2 kcal mol−1 of the O–H⋯O model minimum (Fig. 5 and ESI Table S9†). The percentage of Tyr–Asn/Gln O–H⋯O HB X-ray structures (ca. 85%) within 2 kcal mol−1 of the model 2D PES minimum is also increased (Fig. 5 and ESI Table S9†). Thus, the steeper PES and greater directionality of O–H⋯O HBs is likely due to local interactions that are well described in gas-phase models.
Of the four HBs considered thus far, the acetamide–p-cresol O–H⋯O HB is significantly (∼4.9 kcal mol−1) stronger than the acetamide–p-cresol N–H⋯O HB as well as the HBs in the acetamide–methanol model (Fig. 5 and ESI Table S3†). Displacements of the acetamide–p-cresol O–H⋯O HB distance by ca. 0.7 Å or angle by ca. 60° lead to interaction energies still as strong as the alternative N–H⋯O HB (Fig. 5, ESI Fig. S9 and S10†). The differentiation of HB strength for aromatic hydroxyls diverges from the aliphatic hydroxyl (i.e., the methanol model) where the O–H⋯O HB is only slightly stronger (0.9–1.4 kcal mol−1) than the N–H⋯O HB (ESI Table S3†).
We used energy decomposition analysis (i.e., at the SAPT2+3 level of theory98) to understand the origins of the O–H⋯O HB interaction being more favorable than the N–H⋯O HB (see Fig. 2 and ESI Table S11†). In the acetamide–p-cresol model, stronger electrostatic attraction (by 8 kcal mol−1) and induction (by 3 kcal mol−1) terms both contribute to the stronger O–H⋯O HB, and similar but smaller effects are observed for acetamide–methanol (ESI Table S11†). Although the shorter O–H⋯O HB distances expectedly give rise to more destabilizing exchange repulsion, this effect is outweighed by the other factors, including greater dispersion stabilization for the O–H⋯O HB especially for acetamide–p-cresol (by ca. 3 kcal mol−1) that contributes to the overall more favorable O–H⋯O HB (ESI Table S11†).
Given the greater favorability of O–H⋯O over N–H⋯O HBs, we expected to observe a significantly larger number of O–H⋯O HBs especially for interactions with Tyr. Contrary to both our expectations and prior observations over a larger data set,59 a greater number of N–H⋯O HBs is observed in our data set than O–H⋯O HBs for either Ser/Thr or Tyr with Asn/Gln (ESI Table S12†). Supporting our choice of the methanol model to represent either Ser or Thr, we also observe that these trends are consistent when comparing the relative number of O–H⋯O and N–H⋯O HBs for Ser or Thr with Asn or Gln individually (ESI Table S12†). Of all 3908 HBs in the curated set, slightly more are observed for Thr than Ser (1332 vs. 1174), but both residues have a similar relative percentage of N–H⋯O HBs with Gln (63%) that exceeds those with Asn (54–57%, ESI Table S12†). One potential source of this counterintuitive difference in HB abundance is the compensation of weaker N–H⋯O sidechain–sidechain HBs by additional sidechain–backbone or sidechain–solvent interactions. Indeed, inspection of the protein around representative N–H⋯O HBs reveals simultaneous formation of sidechain–backbone HBs and additional sidechain (i.e., Ser/Thr/Tyr hydroxyl or Asn/Gln carbonyl) to solvent HB interactions that could not form in O–H⋯O HB conformations (ESI Fig. S8 and S12†). Additionally, Asn/Gln can form two N–H⋯O HBs per sidechain, which we also observe in our data set (ESI Fig. S13†). Thus, raw counts in a limited data set likely capture the relative favorability of globally compensated HB interactions, whereas the relative strengths of the individual HBs appear better captured by comparison of the distributions of X-ray structures and model systems.
At its upper limit, the ambifunctional HB would correspond to the sum of the two single HBs (methanol: −14.9 kcal mol−1 and p-cresol: −17.1 kcal mol−1), with a more favorable acetamide–p-cresol ambifunctional HB expected due to its strong constituent O–H⋯O HB (ESI Table S13†). The acetamide–p-cresol ambifunctional HB (−12.2 kcal mol−1) is indeed stronger by ca. 2 kcal mol−1 than that in acetamide–methanol (−10.6 kcal mol−1), but both are weaker than their theoretical limit by a comparable 4–5 kcal mol−1 (Fig. 2 and ESI Table S13†). Despite the overall weaker interaction energy for methanol, its ambifunctional HB is significantly (ca. 3–4 kcal mol−1) more favorable than its individual O–H⋯O or N–H⋯O HBs (Fig. 2 and ESI Table S3†). Conversely, the acetamide–p-cresol ambifunctional HB provides a limited (ca. 1 kcal mol−1) benefit over the O–H⋯O HB conformation, suggesting the dominant role of the O–H⋯O HB even in the ambifunctional conformation (Fig. 2 and ESI Table S3†). Beyond this purely electronic interaction energy picture, the ambifunctional HB could be expected to incur a relative entropic penalty in comparison to the individual HBs. While relative free energies do disfavor the ambifunctional HB very slightly (ca. 0.2–0.5 kcal mol−1) with respect to the single O–H⋯O HB, this difference is significantly smaller than the free energy penalty increase from the single N–H⋯O to O–H⋯O HB (ESI Table S14†).
To understand if the reduced interaction energies in the ambifunctional HB arise due to a geometric compromise, we compared the geometry of the constituent HBs with the corresponding single O–H⋯O and syn N–H⋯O HBs. In both model system ambifunctional conformations, the N⋯O and O⋯O HB distances are mostly unchanged (shortened by ca. 0.05 Å and 0.1 Å, ESI Tables S4, S5 and S13†). Comparing interaction energies at these shortened HB distances from the individual 1D PECs, we determine that no significant energy penalty is incurred (ESI Table S15†). The ambifunctional HB angles are more distinct, being significantly reduced (O–H⋯O: ca. 10° and N–H⋯O: ca. 30–40°) with respect to their near-linear values in single HBs (ESI Tables S4, S5 and S13†). In contrast to the negligible distance penalties, the penalty for displacing both HB angles from their optimal values comes at an energetic cost of 1.5 kcal mol−1 in both model systems (ESI Tables S13 and S15†). The remaining difference between the ambifunctional HB interaction energy and the additive sum of the single HBs (ca. 3 kcal mol−1) is thus likely due to many-body, electronic effects such as those that limit the simultaneous HBA/HBD strength of the hydroxyl oxygen. Indeed, if we had used a typical force field for biomolecular simulations (i.e., the generalized amber force field, or GAFF99) we would have failed to distinguish differences between the aromatic and aliphatic hydroxyls, resulting in underestimation of the higher O–H⋯O HB strength in p-cresol and failure to predict the relatively high benefit of the ambifunctional HB in the methanol model (ESI Table S16†).
To quantify energetic relationships among the syn N–H⋯O HB, O–H⋯O HB, and ambifunctional HB conformations, we identified an approximate reaction coordinate for the minimal structural rearrangement that describes the transition between these conformations (see Section 5 and ESI Text S4†). A suitable approximate reaction coordinate to capture this transition is the rotation of the alcohol (i.e., methanol or p-cresol) with respect to the amide, a quantity well described by the (H)O⋯C![[double bond, length as m-dash]](https://www.rsc.org/images/entities/char_e001.gif) N intermolecular angle (ESI Fig. S14†). Although the intermolecular angle was constrained during optimization, the remaining degrees of freedom (e.g., HB distance and angle) were fully relaxed. We made this choice rather than obtaining explicit free energies and minimum energy pathways because the reaction coordinate energetics are evaluated at a higher level of theory (i.e., local coupled cluster at the complete basis set limit) than is feasible for geometry optimization and vibrational characterization (see Section 5). This approximation introduces at most one small rotational imaginary mode, typically corresponding to rotation about the O–H⋯O HB axis (ESI Tables S17 and S18†). This approximate reaction coordinate can be transformed to the N–H⋯O angle between acetamide and the alcohol, a quantity more intuitively linked to the hydrogen bond (ESI Text S4 and Fig. S15†).
N intermolecular angle (ESI Fig. S14†). Although the intermolecular angle was constrained during optimization, the remaining degrees of freedom (e.g., HB distance and angle) were fully relaxed. We made this choice rather than obtaining explicit free energies and minimum energy pathways because the reaction coordinate energetics are evaluated at a higher level of theory (i.e., local coupled cluster at the complete basis set limit) than is feasible for geometry optimization and vibrational characterization (see Section 5). This approximation introduces at most one small rotational imaginary mode, typically corresponding to rotation about the O–H⋯O HB axis (ESI Tables S17 and S18†). This approximate reaction coordinate can be transformed to the N–H⋯O angle between acetamide and the alcohol, a quantity more intuitively linked to the hydrogen bond (ESI Text S4 and Fig. S15†).
Along the approximate reaction coordinate, increasing N–H⋯O angles correspond to the transformation from a single O–H⋯O HB to the formation of an additional N–H⋯O interaction in the ambifunctional HB until the O–H⋯O HB interaction is lost and only the N–H⋯O HB remains (Fig. 6). The hydroxyl group of the alcohol in either model rotates freely along the chosen reaction coordinate, making it difficult to map the approximate reaction coordinate to the O–H⋯O angle. Still, the O–H⋯O angle generally behaves as expected with changing N–H⋯O angle: a near-linear O–H⋯O angle must coincide with an acute N–H⋯O angle or vice versa (ESI Fig. S16†).
For both model systems, the N–H⋯O HB conformation appears only as a shoulder along the approximate reaction coordinate and thus the transition to the lowest-energy ambifunctional HB conformation is barrierless (Fig. 6). This is consistent with vibrational analysis along the reaction coordinate, which shows an imaginary mode toward the ambifunctional basin (ESI Tables S17 and S18†). Conversely, while the O–H⋯O HB is a local minimum along the approximate reaction coordinate, the transition from the O–H⋯O HB conformation to the ambifunctional HB conformation has a small (methanol: 1.7 kcal mol−1, p-cresol: 2.3 kcal mol−1) approximate barrier in both model systems (Fig. 6 and ESI Table S19†). For the acetamide–methanol model, it is apparent that this approximate barrier can be partly attributed to the reduction of the O–H⋯O HB angle from its ideal value (180° to 160°) before the N–H⋯O angle approaches the larger values (ca. 120–140°) near the ambifunctional HB global minimum (Fig. 6 and ESI Fig. S16†). However, no such geometric distortion is observed for the acetamide–p-cresol case (ESI Fig. S16†). The thermodynamic driving force for forming the ambifunctional HB (e.g., with respect to a single O–H⋯O HB) on the approximate reaction coordinate is lower for p-cresol than methanol, despite both p-cresol conformations having stronger overall interaction energies (Fig. 6). Exiting the ambifunctional HB global minimum requires 4–5 kcal mol−1 for both model systems to break the ambifunctional N–H⋯O interaction and reform a single O–H⋯O HB, with the greater stability of the ambifunctional HB in acetamide–methanol leading to a slightly higher energetic cost than for acetamide–p-cresol (Fig. 6).
Although geometric arguments can partly explain the significant barrier on the acetamide–methanol approximate reaction coordinate for rearranging from the single O–H⋯O HB to the ambifunctional HB, it does not explain the equally large approximate barrier for the p-cresol model. To understand the electronic origins of the barrier for rearrangement, we used symmetry-adapted perturbation theory (i.e., at the SAPT2+3 level of theory98) to decompose relative electrostatic and dispersion contributions in both minima on the approximate reaction coordinate as well as at the peak of the DLPNO-CCSD(T) approximate barrier (ESI Table S11†). This energy decomposition reveals that a loss of dispersion and induction stabilization occurs equivalently (by ca. 2 kcal mol−1) for both model systems at the maximum energy point between the O–H⋯O and ambifunctional HB configurations (ESI Table S11†). At this energetic peak on both approximate reaction coordinates, dispersion stabilization is also weaker than at the N–H⋯O HB geometry (ESI Table S11†). Although favorable electrostatic interactions also weaken at the barrier maximum, this effect is counteracted by a reduction in the exchange repulsion (ESI Table S11†). Thus, a delicate interplay of electronic effects can be expected to govern rearrangement between single and ambifunctional HB configurations. It is therefore unsurprising that using a standard biomolecular force field (e.g., GAFF99) fails to capture this approximate barrier for rearrangement in addition to underestimating the stabilization of the ambifunctional HB (ESI Fig. S17 and Table S16†).
In both model systems, we identify an ambifunctional HB basin as the range of approximate reaction coordinate geometries around the global minimum that remain stabilized with respect to the most stable single HB (i.e., O–H⋯O) conformation (Fig. 6). Due to the greater relative stability of the methanol ambifunctional HB with respect to the constituent HBs, its basin corresponds to a larger, nearly 3 kcal mol−1 energy window instead of approximately 1.5 kcal mol−1 for p-cresol (Fig. 6). These differences are reflected in the geometric properties of structures within the basin: methanol N–H⋯O angles span a nearly 40° range (123 to 160°), whereas favorable p-cresol N–H⋯O angles span only 20° (126 to 146°, Fig. 6, ESI Fig. S18 and S19†). An even wider range of O–H⋯O angles is observed for methanol (135–167°), whereas the favored O–H⋯O angles for p-cresol (151–168°) are more restrictive (Fig. 6, ESI Fig. S18 and S19†). This observation confirms that for the aromatic hydroxyls in p-cresol or Tyr residues, the O–H⋯O interactions dominate and an ambifunctional HB provides limited additional stabilization.
Nevertheless, structural variations in the ambifunctional HB basin are comparable for both models, with the higher, productive O–H⋯O angles compensated by a monotonic linear reduction to less productive N–H⋯O angles or vice versa (ESI Fig. S18 and S19†). Approaching the ambifunctional HB global minimum from either side of the basin corresponds to a shortening of the heavy-atom distance for the forming HB (i.e., N⋯O or O⋯O) while the other HB distance changes minimally (ESI Fig. S18 and S19†). As a result, the sum of the two HB distances is lowest in the ambifunctional HB global minimum and rises in either direction (ESI Fig. S20†). The basin O⋯O distances span a significantly narrower 0.18 Å range for p-cresol (2.72–2.90 Å) than the 0.55 Å range for methanol (2.75–3.30 Å, ESI Fig. S18 and S19†).
Given the increased favorability of ambifunctional HBs over both O–H⋯O and N–H⋯O HBs for model systems, we expect to observe them in protein crystal structures. Indeed, we observe ambifunctional HBs between all pairs of amino acids, corresponding to around 15% (559 of 3908) of all HBs in our data set (ESI Table S12†). Although one may expect a more significant fraction of Ser/Thr HBs to be ambifunctional than Tyr HBs due to the enhanced relative benefit for the aliphatic hydroxyl, the difference in relative abundance is modest (ca. 15–18% vs. 11–12%, ESI Table S12†). This relative abundance in crystal structures is likely dictated by a combination of both geometric constraints for ambifunctional HB formation as well as competition with other interactions as previously observed for single HBs (see Section 3.2).
To characterize the geometries of X-ray crystal structure ambifunctional HBs, we computed their joint N⋯O and O⋯O HB distance distributions (Fig. 7 and ESI Fig. S21†). In this set, one of the HB distances is typically closer to its optimal value than the other, with a small (ca. 10%) fraction consisting of two HB distances close to their optimal values in single N–H⋯O or O–H⋯O HBs (Fig. 7, ESI Fig. S21 and S22 and Table S20†). While competing interactions with solvent molecules or other residues can partially rationalize why symmetric ambifunctional HB interactions are infrequently observed, such competing interactions are also observed simultaneously with highly symmetric ambifunctional HBs (Fig. 8). For the asymmetric Ser/Thr pairs with Asn or Gln, a slight majority (57%) has shorter N⋯O than O⋯O HB distances, whereas for Tyr this subset represents a minority (43%), consistent with differences in relative O–H⋯O HB strength (Fig. 7 and ESI Fig. S21†). We also observe lengthening of the N–H⋯O HB distance when the protein environment is incorporated as a dielectric during our optimizations, which could also support the reduced number of symmetric ambifunctional HBs in the X-ray crystal structure set (ESI Table S8†). Although the HB distances in many cases are asymmetric, the positioning of the residue pairs still orients both sets of HBAs and HBDs in sufficient proximity for two simultaneous interactions that are each characterized by the presence of a BCP (Fig. 8 and ESI Table S21†).
|  | ||
| Fig. 8 Representative proteins showing the three different types of ambifunctional HBs in Ser–Asn (top) and Tyr–Gln (bottom) residue pairs in our data set: shorter d(N⋯O) (left), equivalent length d(N⋯O) and d(O⋯O) (middle), and shorter d(O⋯O) (right). The Ser–Asn pairs correspond to PDB IDs (left to right): 1SFS, 2VOV, and 1O4Y, and the Tyr–Gln pairs to PDB IDs (left to right): 3EPW, 3BVU, and 4B9F. Specific HBs are indicated by black dashed lines between the heavy atoms in residue pairs with annotated distances (black, in Å) and N/O–H⋯O angles (light green, in °). Hydrogen atoms added for all proteins and relaxed with constrained heavy atoms are shown as translucent spheres. The orange and green dashed lines indicate additional stabilizing interactions observed with respect to the residue pair with nearby residues and solvent molecules, respectively. All residues are labeled by one-letter amino acid codes and residue numbers. | ||
In the model systems, we attributed lower than expected interaction energies to the difficulties associated with simultaneous formation of two productive HB angles. In the X-ray structure set, more ambifunctional HBs have near-linear (i.e., 170–180°) O–H⋯O angles and small (i.e., 110–130°) N–H⋯O angles than the reverse, but average angles differ only by around 5° from the average values of the single HB angles in X-ray structures (Fig. 8, ESI Fig. S23 and Tables S4, S5 and S22†). Consistent with the HB distance analysis, a minority of all Tyr (9%) and Ser/Thr (20%) ambifunctional HBs simultaneously form relatively obtuse O–H⋯O and N–H⋯O angles as high as those observed in the model systems (Fig. 8 and ESI Fig. S23†). This small fraction overlaps significantly with the minority of strong, symmetric ambifunctional HBs that nearly all have two obtuse angles (Fig. 8 and ESI Fig. S23†). Analyzing the secondary structure motifs in proteins with symmetric ambifunctional HBs reveals the two residues are most commonly located on adjacent strands of β-sheets, on the same β-sheet strand but separated by a single residue, or on a flexible loop. These symmetric ambifunctional HBs are less frequently observed on α-helices. When these are observed, it is typically between an α-helix and an adjacent loop or when two α-helices are oriented perpendicular to each other. Consistent with observations on protein structures, model system ambifunctional HB geometries also exhibit a greater reduction of the N–H⋯O angle than the O–H⋯O angle, especially for p-cresol (20°) versus methanol (10°, ESI Tables S4 and S5†).
While ambifunctional HBs are apparent in protein structures, the benefit of a near-linear O–H⋯O angle especially in interactions with Tyr can be expected to dominate. Thus we can expect the Tyr hydroxyl to act as a simultaneous HBA and HBD only when limited deviation of HB angles is necessary. Conversely, we should anticipate this motif to be apparent in proteins with Ser or Thr in close proximity to Gln or Asn. We expect more such structures could be uncovered with even larger-scale and more inclusive examination of proteins from crystal structures, molecular dynamics, or with other (e.g., NMR) spectroscopic techniques.
We showed that simultaneous O–H⋯O and N–H⋯O interactions are stabilizing in an ambifunctional HB. While this energetic benefit was most significant for aliphatic hydroxyl groups, it was less than the theoretical limit (i.e., sum of two individual HBs) and only slightly more favorable than the O–H⋯O HB alone for aromatic hydroxyls. We determined this reduction in interaction strength was due both to geometric constraints on the formation of two productive HB angles and distances along with the reduced ability of a single hydroxyl to act as a simultaneous HBA/HBD. These many-body effects could not be captured by conventional force fields widely used to study proteins. While evaluating the reaction coordinate that ccaptured the transformation between HB conformations revealed rearrangement from the N–H⋯O to ambifunctional HB to be approximately barrierless, the basin of stable ambifunctional HB structures accommodated a wide range of distances and angles especially for Ser/Thr–Asn/Gln. Consistent with model system observations, we observed a range of ambifunctional HB structures in X-ray crystal structures, especially for Ser/Thr over Tyr. These studies set the stage for systematic100–102 and quantitative study of representative models to illuminate mechanistic roles for ambifunctional HBs that may have been missed when studied with conventional force fields. It is expected that the unique energetic and geometric properties of these hydrogen bonds could play an important role in substrate recognition and in controlling enzyme selectivity through substrate positioning.
Unconstrained and constrained geometry optimizations were performed using both hybrid (i.e., B3LYP105–107) density functional theory (DFT) and Møller–Plesset second-order perturbation theory (MP2). All optimizations were carried out using the 6-31G* basis set,108 followed by single-point energies evaluated using larger basis sets (ESI Tables S4 and S5†). Semi-empirical D3 (ref. 109) dispersion with Becke–Johnson110 damping was incorporated in the B3LYP optimizations, although its effect on geometries was limited (ESI Tables S4 and S5†). B3LYP-D3 DFT geometry optimizations were carried out in a developer version of TeraChem111 v1.9 in Cartesian coordinates using L-BFGS algorithm, as implemented in DL-FIND.112 Default thresholds of 4.5 × 10−4 hartree per bohr for the maximum gradient and 1 × 10−6 hartree for self-consistent field (SCF) convergence were employed. Geometry optimizations with MP2 in ORCA113 v.4.0.1.2 were carried out in redundant internal coordinates using the BFGS method with default thresholds of 3 × 10−4 hartree per bohr for the maximum gradient and 5 × 10−6 hartree for SCF convergence. The MP2 and B3LYP-D3 HB distances and angles as well as intramolecular bonds were comparable, with only equilibrium O⋯O distances exhibiting a slight dependence on basis set or method choice (ESI Tables S4, S5 and S24†). Comparisons of single-point energies at higher levels of theory on MP2 and B3LYP-D3 geometries revealed very limited differences (≤0.1 kcal mol−1) on evaluated interaction energies (ESI Table S25†). All initial and optimized structures are provided in the ESI.†
For the acetamide–methanol model system, single-point energy calculations were carried out on MP2/6-31G* geometries with both canonical coupled cluster singles doubles with perturbative triples (i.e., CCSD(T)) and domain-localized pair natural orbital CCSD(T) (i.e., DLPNO-CCSD(T)114,115). Dunning-style correlation consistent double-ζ and triple-ζ (i.e., aug-cc-pVDZ and aug-cc-pVTZ) basis sets were employed to enable two-point116–118 extrapolation to the complete basis set (CBS) limit. Given the larger size (i.e., 25 atoms) of the acetamide–p-cresol system, only DLPNO-CCSD(T) single-point energies were evaluated on MP2/6-31G* structures with aug-cc-pVXZ (X = D, T) basis sets to enable extrapolation to the CBS limit (ESI Table S3†). All reported DLPNO-CCSD(T) energies correspond to those obtained from Tight PNO thresholds, after testing the effect of threshold choice on interaction energies (ESI Table S3†). For the acetamide–methanol model system where canonical CCSD(T) could be carried out, interaction energies obtained from DLPNO-CCSD(T)/CBS with Tight PNO thresholds and CCSD(T)/CBS agreed to within 0.3 kcal mol−1, with relative interaction trends in even closer agreement (ESI Table S3†).
One-dimensional (1D) potential energy curves (PECs) and two-dimensional (2D) potential energy surfaces (PESs) were obtained by generating initial geometries for constrained geometry optimizations. Constrained optimizations were all carried out in ORCA v4.0.1.2 at the MP2/6-31G* level of theory, followed by single-point energies evaluated with DLPNO-CCSD(T)/CBS or with the generalized amber force field (GAFF99). We select this protocol to ensure we are predictive across full HB potential energy curves, but B3LYP-D3/aug-cc-pVTZ interaction energies underestimate the DLPNO-CCSD(T)/aug-cc-pVTZ values by only ca. 1 kcal mol−1, indicating limited method sensitivity of some properties (ESI Table S26†). The 1D PECs were obtained for syn N–H⋯O HBs and O–H⋯O HBs in acetamide–methanol and acetamide–p-cresol systems by varying the constrained HB distance (i.e., N⋯O or O⋯O) in steps of 0.01 Å from 2.4 Å to 4.0 Å. The 2D PESs were obtained for the syn N–H⋯O HB and O–H⋯O HB of acetamide–methanol and acetamide–p-cresol by varying the constrained HB distance from 2.40 up to 5.00 Å in steps of 0.05 Å and simultaneously varying the angle from 110° to 180° in steps of 5°. Reaction coordinates for the transformation between N–H⋯O, ambifunctional, and O–H⋯O HBs for the acetamide–methanol and acetamide–p-cresol model systems were sampled from initial geometries in which HB partners were translated and rotated with respect to each other using an in-house Python script. The reaction coordinate corresponded to an intermolecular angle, which was selected by trial and error, and initial geometries for constrained optimizations were generated by rotation of this angle in 0.1° increments (ESI Text S4†).
| Footnote | 
| † Electronic supplementary information (ESI) available. See DOI: 10.1039/d0sc05084a | 
| This journal is © The Royal Society of Chemistry 2021 |