Increasing protein stability by engineering the n → π* interaction at the β-turn†

Abundant n → π* interactions between adjacent backbone carbonyl groups, identified by statistical analysis of protein structures, are predicted to play an important role in dictating the structure of proteins. However, experimentally testing the prediction in proteins has been challenging due to the weak nature of this interaction. By amplifying the strength of the n → π* interaction via amino acid substitution and thioamide incorporation at a solvent exposed β-turn within the GB1 proteins and Pin 1 WW domain, we demonstrate that an n → π* interaction increases the structural stability of proteins by restricting the ϕ torsion angle. Our results also suggest that amino acid side-chain identity and its rotameric conformation play an important and decisive role in dictating the strength of an n → π* interaction.


Introduction
An array of noncovalent interactions in a polypeptide chain, including electrostatic forces, hydrogen bonds, van der Waals interactions and hydrophobic effects, dictates its three-dimensional structure and governs its folding [1]. In particular, owing to their high abundance, the noncovalent interactions originating from the backbone (main chain) atoms of a polypeptide chain [2], including the classical hydrogen bonds [3], C–H···O hydrogen bonds [4], C5 hydrogen bonds [5] and n → π* interactions [6], play a crucial role in stabilizing protein structures. The n → π* interaction originates from the donation of the lone pair (n) electron density of the carbonyl oxygen (Oᵢ) into the empty π* orbital of the adjacent carbonyl group (C=Oᵢ₊₁) [7–9]. The distance (d ≤ 3.2 Å) and angular (θ = 109 ± 10°) criteria defining an n → π* interaction are in agreement with the Bürgi–Dunitz trajectory for nucleophilic attack [10], which, along with the associated i → i + 1 (N-term → C-term) directionality (Fig. 1), is indicative of its possible role in the folding and stabilization of protein secondary structures [11–16]. The contribution of the n → π* interaction to the stability of protein structure was initially reported in collagen mimetics [17]. The enhanced thermostability of a collagen mimetic with the 4R-configured proline derivative compared to that with the 4S-configured proline derivative was attributed to the stronger n → π* interaction in the 4R-configured proline derivative with the exo-pucker of the pyrrolidine ring [18]. The finding was later substantiated by the high-resolution crystal structure of the oligoproline PPII helix, where the n → π* interaction was favored by the Cγ-exo pucker and disfavored by the Cγ-endo pucker of the pyrrolidine ring [19]. Furthermore, the stability of this PPII helix in the absence of intramolecular hydrogen bonds and hydration emphasizes the role of the n → π* interaction in the structural stability of collagen.
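The distance and angular criteria quoted above can be turned into a simple geometric test. The following is a minimal sketch, not the authors' analysis code: it assumes that d is the Oᵢ···Cᵢ₊₁ distance and that θ is the Oᵢ···Cᵢ₊₁=Oᵢ₊₁ angle, and the function names `n_pi_star_geometry` and `is_n_pi_star` are hypothetical.

```python
import math

def n_pi_star_geometry(o_donor, c_acceptor, o_acceptor):
    """Return (d, theta) for a donor O(i) and an acceptor C(i+1)=O(i+1).

    d     : O(i)...C(i+1) distance, in the units of the input (Å for PDB data)
    theta : O(i)...C(i+1)=O(i+1) angle, in degrees
    Assumed definitions; the source only quotes the cut-off values.
    """
    to_donor = [a - b for a, b in zip(o_donor, c_acceptor)]
    to_oxygen = [a - b for a, b in zip(o_acceptor, c_acceptor)]
    d = math.sqrt(sum(x * x for x in to_donor))
    bond = math.sqrt(sum(x * x for x in to_oxygen))
    cos_theta = sum(a * b for a, b in zip(to_donor, to_oxygen)) / (d * bond)
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding error
    return d, math.degrees(math.acos(cos_theta))

def is_n_pi_star(o_donor, c_acceptor, o_acceptor,
                 d_max=3.2, theta_ideal=109.0, theta_tol=10.0):
    """Apply the d <= 3.2 Å and theta = 109 ± 10° criteria from the text."""
    d, theta = n_pi_star_geometry(o_donor, c_acceptor, o_acceptor)
    return d <= d_max and abs(theta - theta_ideal) <= theta_tol
```

Each argument is an (x, y, z) tuple of Cartesian coordinates, for example taken from the ATOM records of a crystal structure.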
For an idealized geometry, the n → π* interaction between amides contributes ≥0.3 kcal mol⁻¹ [20], which may seem modest. However, given the ubiquity of carbonyl groups in a polypeptide chain, n → π* interactions could make a significant collective contribution to the overall energetics of protein stability [2]. The distribution of the n → π* interaction obtained from analyses of protein crystal structures reveals that >70% of residues in α-helices, as opposed to <5% of residues in β-sheets, engage in this interaction [11,12]. Furthermore, since one third of all amino acids in the random coil have torsion angles in the α-helical region [21], the n → π* interaction might have an important role in restricting the conformational ensemble of unfolded proteins [22]. In this context, it is worth noting that random coils and turn regions of proteins show a high abundance of reciprocal n → π* interactions (back and forth donation between adjacent carbonyl pairs) [9]. Evidence for the n → π* interaction has been obtained by microwave and IR spectroscopy in various small-molecule systems [23–27]. However, despite enormous excitement in this area, experimental measurements of the energy of an n → π* interaction in proteins and of its practical consequence for protein structural stability have so far been lacking. Therefore, we sought to engineer an n → π* interaction at the β-turn within a protein to understand its influence on the protein structure and its stability.
β-Turns (Fig. 2a) are the third most important protein secondary structure, representing ≥20% of all protein residues [28], and have an important role in protein folding [29–31]. Furthermore, substituting non-proline residues with proline residues in the β-turn leads to increased stabilization of the turn [32] and enhanced protein stability [33,34]. The increased stability results from the decreased backbone conformational entropy of the denatured state due to the restricted rotation about the N–Cα bond, also known as the ϕ torsion angle. Since an n → π* interaction also restricts the ϕ angle of an amino acid residue [11], we speculated that engineering an n → π* interaction at the β-turn would have a direct consequence on protein stability.
Here, by using bioinformatic analysis of the β-turn in proteins, we find an interplay between the conformational flexibility of the peptide backbone and the abundance of n → π* interactions at the two central residues, i + 1 and i + 2, of the β-turn. Through subsequent X-ray crystallography and computational analysis of synthetic GB1 proteins with amino acid substitutions at the i + 2 residue of the β-turn, we show that amino acid side-chain identity and its rotameric conformation have a direct influence on the strength of an n → π* interaction. Gratifyingly, the thermal denaturation of the GB1 proteins shows a good correlation between their stability and the strength of an n → π* interaction at the β-turn. Finally, we validate this observation in the Pin 1 WW domain, wherein amplifying the strength of an n → π* interaction at the β-turn by thioamide incorporation increased the thermal stability of the thioamidated Pin 1 WW domain.

Results and discussion
n → π* interaction and conformational flexibility at the β-turn
Previous computational analyses predicted that n → π* interactions confer conformational stability to the i + 1 residue in common type-I and type-II β-turns, and thus have a special role to play in the stability of turns [11]. Therefore, we sought to examine the abundance of n → π* interactions and their possible correlation with the conformational flexibility of the peptide backbone in the β-turns.
By analyzing a non-redundant subset of high-resolution (≤2.0 Å) protein crystal structures in the Protein Data Bank (PDB), we curated 500 β-turns (identified using Promotif) representing the common type-I, type-II, type-I′, and type-II′ turns. Next, using the distance and angular criteria defining an n → π* interaction (Fig. 1), we determined the abundance of n → π* interactions at the i + 1 and i + 2 residues in the β-turns. We noted that 40–80% of the residues engage in […] To identify the underlying cause of this behavior, we determined the torsion angles ϕ and ψ of the i + 1 and i + 2 residues in all the β-turns and plotted them on the Ramachandran map. It was interesting to note the broader distribution of ϕ and ψ angles at the i + 2 residue (Fig. 2d) as opposed to the i + 1 residue (Fig. 2c). This is suggestive of restricted conformational freedom at the i + 1 residue, which is associated with the higher abundance of n → π* interactions at this site. We also calculated the difference in the mean ϕ and ψ angles (Δϕ and Δψ) in the presence and absence of the n → π* interaction in the respective β-turns (Fig. 2e and f). The differences were significantly higher at the i + 2 residue than at the i + 1 residue. This further indicates that the lower abundance of n → π* interactions at the i + 2 residue is associated with greater conformational flexibility of the peptide backbone.
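The backbone torsion angles used for the Ramachandran analysis can be computed directly from atomic coordinates. The sketch below is illustrative only and is not taken from the authors' workflow: it uses the standard four-point dihedral formula with the usual sign convention (clockwise rotation is positive when looking down the central bond), and the names `dihedral` and `phi_psi` are hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points p0-p1-p2-p3."""
    u1, u2, u3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(u1, u2), _cross(u2, u3)  # normals of the two planes
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """phi = C(i-1)-N-CA-C and psi = N-CA-C-N(i+1) for residue i."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)
```

Because torsion angles are periodic, values near ±180° should be averaged with care (for example with circular statistics) before mean differences such as Δϕ and Δψ are compared.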
Inuence of the amino acid side-chain on the n / p* interaction As b-turns are stabilized by the intramolecular hydrogen bond between C]O i /HN i+3 (Fig. 2a), the higher abundance of n / p* interactions at the i + 1 residue is perhaps linked with the conformational restriction of C]O i via hydrogen bonding. Thus, we surmised that the C]O i / C]O i+1 n / p* interaction and the conformational space at the i + 1 position might be insensitive to amino acid substitution. Instead, the relatively exible i + 2 residue of the b-turn (Fig. 2a), where neither the donor (n) C]O i+1 nor the acceptor (p*) C]O i+2 is constrained by the intramolecular hydrogen bond, is an ideal site to probe the role of the n / p* interaction in the protein structure and its stability. Additionally, the solvent exposure of b-turns allows for amino acid substitution and examining the inuence of the amino acid side-chain on the n / p* interaction. Thus, we chose to engineer the loop L1 of the 56-residue immunoglobulin-binding domain B1 of the streptococcal protein G (GB1). [35][36][37][38] The solvent exposed loop L1 in wild type GB1 is a type-I b-turn (Fig. 3a) with lysine at the i + 1 position and threonine at the i + 2 position that lacks a C]O i+1 / C] O i+2 n / p* interaction. However, to have a better control over the turn conformation, we decided to introduce a type-II 0 bturn. 39 Thus, we synthesized a GB1 variant where -KTin loop L1 was substituted with D-Ala-L-Ala (1) to induce a type-II 0 b-turn (Fig. 3a).
Alanine was chosen due to its preference for an n → π* interaction [11] and its high helix propensity [40] (the preference of an amino acid to be in α-helices). Although thermal denaturation of 1 monitored by variable-temperature circular dichroism (CD) showed cooperative unfolding similar to that of GB1 (Fig. S8 and S9†), multiple attempts to crystallize 1 were unsuccessful. This is possibly a consequence of the conformational flexibility introduced in loop L1 by alanine substitution. Our earlier results indicated that the β-branched amino acid D-Val at the i + 1 site stabilizes a type-II′ β-turn more than D-Ala [41]. Thus, we synthesized 2, with D-Val-L-Ala in loop L1 (Fig. 3a), which readily crystallized, and X-ray diffraction data were collected to a maximum resolution of 1.9 Å. The structure of 2 overlaps closely with the tertiary structure of GB1 (backbone RMSD 0.39 Å) (Fig. 3b), although with a significant displacement of loop L1.
Gratifyingly, the D-Val C=Oᵢ₊₁ and L-Ala C=Oᵢ₊₂ in the type-II′ β-turn engage in an n → π* interaction, where the torsion angles of L-Ala at the i + 2 site (ϕ, ψ = −59.7°, −42.2°) are remarkably close to the mean torsion angles of a right-handed α-helix (ϕ, ψ = −62°, −41°) [42] (Fig. 4).
Encouraged by this result, we next incorporated serine, with moderate helix propensity (3), and valine (4) and threonine (5), with low helix propensity [40], at the i + 2 site of the type-II′ β-turn (Fig. 3a). With the decreasing helix propensity of amino acids in the order Ala > Ser > Val, we noted an increase in both d and θ between C=Oᵢ₊₁···C=Oᵢ₊₂, suggesting a gradual weakening of the n → π* interaction at the i + 2 residue (Fig. 4). Thus, to obtain a quantitative estimate of the n → π* interaction energy, E(n→π*), at the i + 2 residue in 2, 3, and 4, we resorted to NBO analysis [43], which clearly indicated a decreasing E(n→π*) in the order 2 > 3 > 4 (Fig. 4).
Despite the low helix propensity of threonine, we were surprised to note the smallest d and θ between C=Oᵢ₊₁···C=Oᵢ₊₂ at the type-II′ β-turn in 5, with an E(n→π*) of 0.46 kcal mol⁻¹ (Fig. 4). An overlay of the type-II′ β-turns of both the β-branched amino acids valine (4) and threonine (5) revealed a clear difference in the side-chain rotamer conformation (Fig. 5a). Valine in 4 crystallized in a gauche− (g−) side-chain rotameric conformation, whereas threonine in 5 crystallized in a gauche+ (g+) conformation. From the statistical analyses of protein structures, it is known that β-branched amino acids favor the g+ side-chain conformation over g− in helices [44–47]. Consistent with this, Thr25 in the α-helix of GB1, with a g+ conformation, engages in an n → π* interaction, whereas Thr11 at the i + 2 residue in loop L1, with a g− conformation, lacks the n → π* interaction (Fig. 5b). Moreover, our dataset revealed that Thr with a g− conformation at the i + 2 residue in the β-turns does not engage in an n → π* interaction (Table S4†). Therefore, despite the low helix propensity, the n → π* interaction in threonine at the type-II′ β-turn in 5 is a result of the altered side-chain rotamer conformation.
The Ramachandran plot of the i + 2 residue (Fig. 5c) in the type-II′ β-turn of 2, 3, 4, and 5 revealed that, as the strength of the n → π* interaction increases, the torsion angles of an amino acid in a non-helical region in the absence of the stabilizing intramolecular hydrogen bond are gradually altered to occupy the right-handed α-helical region. Thus, proline, with its high propensity to engage in an n → π* interaction [11], is a strong helix initiator [48,49]. Hence, our result further supports the crucial role of the n → π* interaction in helix nucleation, as hypothesized earlier [11].

Implication of the n → π* interaction on protein stability
Since an n → π* interaction results in a restricted ϕ torsion angle (Fig. 2e and f) [11], we sought to examine the influence of the n → π* interaction on the conformational stability of 2, 3, 4, and 5. The midpoint of the thermal transition (T_M), which is a measure of structural stability, was determined by variable-temperature CD (Fig. S10–S13†). Variant 5, with the strongest C=Oᵢ₊₁···C=Oᵢ₊₂ n → π* interaction, displayed the maximum stability (T_M), and 4, with no detectable C=Oᵢ₊₁···C=Oᵢ₊₂ n → π* interaction, showed the least stability (Fig. 4). We were surprised to note a very good correlation (Fig. S7†) between the T_M of these proteins and the E(n→π*) between C=Oᵢ₊₁···C=Oᵢ₊₂, in the absence of the stabilizing C=Oᵢ···HNᵢ₊₃ hydrogen bond (Fig. 4). An n → π* interaction rigidifies the β-turn by reducing the conformational entropy at the i + 2 residue, which is presumably responsible for the increased stability of the protein in solution. However, as the amino acid side-chains at the i + 2 residue of the type-II′ β-turn are different in 2–5, there might be additional factors that contribute to the stability of these proteins. Therefore, we adopted an orthogonal strategy to validate the role of the n → π* interaction in protein stability.
By employing a prolyl-based torsion balance system, Raines et al. have shown that a thioamide (C=Sᵢ) engages in a stronger C=Sᵢ···C=Oᵢ₊₁ n → π* interaction than the amide C=Oᵢ [20,50]. However, due to the longer C=S bond length (1.71 Å) [51] and the larger van der Waals radius of sulfur (1.85 Å) [52], thioamide substitution perturbs the local secondary structure of proteins where the amide oxygen participates in a shorter hydrogen bond [53–57]. On the other hand, thioamide substitution at a site where the amide oxygen is involved in a longer hydrogen bond or is solvent exposed leads to minimal perturbation of the secondary structure [53,55,57–59]. Therefore, we chose to substitute the solvent-exposed C=Oᵢ₊₁ in the type-II′ β-turn of 2 by C=Sᵢ₊₁. The NBO analysis of the C=Oᵢ₊₁ to C=Sᵢ₊₁ substituted type-II′ β-turn in 2, 3, 4, and 5 clearly indicated a significant enhancement in E(n→π*), due to the amplified C=Sᵢ₊₁···C=Oᵢ₊₂ n → π* interaction (Table S3†).
Thus, towards the synthesis of i + 1 thionated GB1 (D-Valᵗ-L-Ala; 2a) (the thionated residue is denoted by superscript "t"), we obtained a clean 46-mer polypeptide up to the L-Alaᵢ₊₂. However, on completion of the 56-mer 2a on a solid support, following the acidolytic removal of protecting groups, the mass spectrum corresponded to a 45-mer fragment without the L-Alaᵢ₊₂ (Fig. S16B†). To circumvent the undesirable peptide cleavage, we coupled the tetrapeptides Fmoc-Asn(Trt)-Gly-D-Valᵗ-L-Ala-COOH and Fmoc-Asn(Trt)-Gly-D-Val-L-Ala-COOH onto two individual 45-mer polypeptides. After acidolytic cleavage, although we obtained the 49-mer oxo-polypeptide, the thio-tetrapeptide coupling repeatedly resulted in the 45-mer fragment without the L-Alaᵢ₊₂ (Fig. S16C and D†). This suggests a spontaneous acid-catalyzed cleavage of the peptide bond C-terminal to L-Alaᵢ₊₂ in thioamidated GB1, 2a.
After numerous failed attempts to synthesize 2a, we turned to the 32-mer Pin 1 WW domain, a three-stranded β-sheet protein that shows cooperative two-state folding [60,61]. The Pin 1 protein is amenable to loop modification that retains the global fold while altering its thermodynamic stability, making it an excellent model protein for structure-folding studies [60]. We selected a Pin 1 variant with a type-I′ β-turn in loop 1 and substituted the -Asn-Gly- with D-Val-L-Ala- (6) and D-Ala-L-Ala- (7) (Fig. 6a) to adopt a type-II′ β-turn, which was confirmed by characteristic NOEs at the β-turn (Fig. S29†). Subsequently, we synthesized Pin 1 variants with thioamidation at the i + 1 site (D-Valᵗ-L-Ala; 6a and D-Alaᵗ-L-Ala; 7a). Remarkably, the acidolytic removal of the protecting groups to obtain 6a resulted in both the desired product and the N- and C-terminal fragmented peptides resulting from the nucleophilic attack of D-Val C=Sᵢ₊₁ onto L-Ala C=Oᵢ₊₂ (Fig. 7a), as observed in 2a.

Fig. 6 (a) […] (1ZCN) with -Asn-Gly- in loop 1 forming a type-I′ β-turn that has been modified to form a type-II′ β-turn in 6 and 7. The n → π* interaction at the i + 2 residue is amplified by the C=Sᵢ₊₁ substitution in 6a and 7a. (b) The midpoint of the thermal transition (T_M ± S.D.) was derived from variable-temperature CD. The free energy of folding (ΔG_f) was obtained by fitting the guanidine hydrochloride denaturation (4 °C) curves to a two-state model. ΔΔG_f = 6a − 6 and 7a − 7.
An identical fragmentation was reported by Heimgartner et al. during the aqueous acidolytic workup of the thioacylated Aib-Pro dipeptide (Fig. 7b), towards the synthesis of Ph-(C=S)-Aib-Pro-Aib-N(Me)Ph [62]. However, by bubbling HCl gas through the dipeptide in THF, the thiazolone intermediate could be characterized, which results from the nucleophilic attack of Ph C=Sᵢ onto Aib C=Oᵢ₊₁. To our excitement, the crystal structure of the final product Ph-(C=S)-Aib-Pro-Aib-N(Me)Ph revealed the C=Sᵢ···C=Oᵢ₊₁ n → π* interaction, leading to a high degree of pyramidalization, Δ = 0.059 Å at Aib C=Oᵢ₊₁, a firm indicator of the n → π* interaction [19]. Thus, the directional (i + 1 → i + 2) fragmentation observed in 2a (Fig. S16†), 6a (Fig. 7a) and 7a (Fig. S17B†) is a chemical signature of the amplified n → π* interaction between C=Sᵢ₊₁ and C=Oᵢ₊₂.
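For readers who want to check a pyramidalization value against coordinates, the following sketch computes an out-of-plane displacement. It assumes, as is common in the n → π* literature but not stated explicitly in this text, that Δ is the distance of the acceptor carbonyl carbon from the plane defined by its three bonded atoms (O, Cα and N); the function name `pyramidalization` is hypothetical.

```python
import math

def pyramidalization(c, o, ca, n):
    """Distance of carbonyl carbon `c` from the plane through `o`, `ca`, `n`.

    All arguments are (x, y, z) tuples; the result has the same length
    units as the input (Å for crystallographic coordinates).
    """
    v1 = (ca[0] - o[0], ca[1] - o[1], ca[2] - o[2])
    v2 = (n[0] - o[0], n[1] - o[1], n[2] - o[2])
    # Plane normal from the cross product of two in-plane vectors
    normal = (v1[1] * v2[2] - v1[2] * v2[1],
              v1[2] * v2[0] - v1[0] * v2[2],
              v1[0] * v2[1] - v1[1] * v2[0])
    length = math.sqrt(sum(x * x for x in normal))
    if length == 0.0:
        raise ValueError("o, ca and n are collinear; the plane is undefined")
    offset = (c[0] - o[0], c[1] - o[1], c[2] - o[2])
    return abs(sum(a * b for a, b in zip(offset, normal))) / length
```

A perfectly planar carbonyl gives Δ = 0, so larger values indicate a carbon pulled out of the plane towards the donor.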
Next, we assessed the folding of Pin 1 variants 6, 6a, 7, and 7a in sodium phosphate buffer (pH 7.4). All the Pin 1 proteins showed the characteristic 227 nm maximum in the CD spectrum, indicating the presence of a folded protein with β-sheets (minimum centered around 215 nm) (Fig. S22A–S25A†). The virtually identical Hα chemical shift perturbation deduced from TOCSY and NOESY experiments indicated that a single atom substitution (O to S) at the solvent-exposed C=Oᵢ₊₁ did not lead to major structural perturbation in 6a and 7a (Fig. S28†). We next performed thermal and chemical denaturation (Fig. S22–S25†) to understand the effect of the amplified n → π* interaction. The proteins showed a two-state unfolding and we were delighted to note that the C=Sᵢ₊₁···C=Oᵢ₊₂ n → π* interaction enhanced the stability of 6a by 0.8 kcal mol⁻¹ and 7a by 0.3 kcal mol⁻¹ (Fig. 6b).
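A two-state analysis of chemical denaturation can be sketched as follows. This is an illustration under stated assumptions rather than the fitting procedure used here: it assumes the linear extrapolation model, in which the folding free energy varies linearly with denaturant concentration, and all function names, the m-value and the numbers in the usage note are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def fraction_folded(denaturant_molar, dg_f_water, m_value, temp_kelvin=277.15):
    """Two-state fraction folded at a given denaturant concentration.

    Assumes dG_f([D]) = dG_f(water) + m * [D], with dG_f(water) < 0 for a
    stable fold and m > 0 (linear extrapolation model, an assumption here).
    The default temperature corresponds to 4 °C.
    """
    dg_f = dg_f_water + m_value * denaturant_molar
    k_fold = math.exp(-dg_f / (R_KCAL * temp_kelvin))  # K = [folded]/[unfolded]
    return k_fold / (1.0 + k_fold)

def denaturation_midpoint(dg_f_water, m_value):
    """Denaturant concentration at which half of the protein is folded."""
    return -dg_f_water / m_value

def ddg_f(dg_f_variant, dg_f_parent):
    """Change in folding free energy; negative values mean stabilization."""
    return dg_f_variant - dg_f_parent
```

With purely illustrative inputs, `ddg_f(-3.8, -3.0)` evaluates to about −0.8 kcal mol⁻¹, the sign convention under which a thioamide variant that folds more favorably than its oxo parent is reported as stabilized.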
The increased stability arises from the reduced conformational flexibility of the amino acid residue engaged in an n → π* interaction, a feature that is analogous to the ring constraint in proline, which restricts its conformational space compared to other amino acids and increases protein stability by reducing the entropy of the unfolded state [64,65]. An n → π* interaction restricts the conformational space of an amino acid residue with the adoption of torsion angles as depicted in the Ramachandran plot of the β-turn residues (Fig. 2c and d). This would also be expected in an n → π* interaction amplified by thioamide substitution. The adoption of such torsion angles is favorable at the i + 1 and i + 2 positions of a β-turn (Fig. 2a). Furthermore, since β-branched amino acids restrict the backbone conformation more than the unbranched residues [65], the C=Sᵢ₊₁···C=Oᵢ₊₂ n → π* interaction stabilizes 6a more than 7a.
Thus, our results in the Pin 1 WW domain re-emphasize the role of the amino acid side-chain in tuning the n → π* interaction energy. Not only does the side-chain rotamer of the amino acid involved in an n → π* interaction dictate its strength (Fig. 4), but the steric interactions imposed by the side-chain of the amino acid bearing the donor carbonyl (C=Sᵢ₊₁ in this case) (Fig. 6) can also influence an n → π* interaction.

Conclusions
In summary, our bioinformatic analysis indicates that the reduced conformational freedom of the donor C=Oᵢ, imposed by the intramolecular C=Oᵢ···HNᵢ₊₃ hydrogen bond in β-turns, is associated with the high abundance of n → π* interactions at the i + 1 residue, whereas the absence of an intramolecular hydrogen bond constraining either the C=Oᵢ₊₁ or C=Oᵢ₊₂ results in conformational flexibility of the i + 2 residue, which could be restricted by introducing a C=Oᵢ₊₁···C=Oᵢ₊₂ n → π* interaction. The experimental results at the i + 2 residue of the type-II′ β-turn in GB1 variants suggest that amino acid side-chain identity and the rotamer conformation can modulate the strength of an n → π* interaction. Although it is challenging to estimate the exact contribution of this energetically subtle interaction to the global stability of the protein, we note that an altered rotamer conformation resulting from local structural changes can amplify or weaken an n → π* interaction, affecting the backbone torsion angles (ϕ, ψ) and thereby influencing protein stability. With an enhanced n → π* interaction in the absence of the stabilizing intramolecular hydrogen bond, we observe a clear shift of amino acid torsion angles (ϕ, ψ) from a non-helical to the right-handed α-helical region. It is worth noting that the i → i + 1 directionality (N-term → C-term) associated with the n → π* interaction coincides with the formation of the productive helix nucleus at the N-terminus of a polypeptide [66–68], highlighting an important contribution of the n → π* interaction to helix nucleation. Furthermore, the recent report of a long-range n → π* interaction stabilizing the α-helical conformation of a synthetic peptide in water re-emphasizes the potential of this noncovalent interaction in engineering helical structures [69]. To conclusively demonstrate the influence of the n → π* interaction on protein stability, we chose to amplify this weak noncovalent interaction by thioamide substitution.
Since a strong n → π* interaction induces a "kink" in the polypeptide backbone by optimizing the ϕ, ψ torsion angles for orbital overlap, thereby reducing the conformational entropy at the β-turn, the thioamide substitution increased the protein stability. It is worth noting that thio-Gly465 in the natural protein methyl-coenzyme M reductase, which is suggested to stabilize the protein secondary structure near the active site, induces a kinked conformation (ϕ, ψ = −68.5°, −47.2°) by engaging in an n → π* interaction with the C=O of Phe466 (Fig. S30†) [70,71]. With the recent advancement in ribosome-mediated incorporation of thioamides into proteins and polypeptides, thioamide substitution could potentially be utilized to stabilize turns and enhance protein stability [72,73], aided by exogenous factors like salt concentration [11] and solvation by water molecules [69] that have been shown to influence the n → π* interaction in protein secondary structures.

Conflicts of interest
There are no conicts to declare.