Chen Chen,‡ab Zeting Zhang,‡*ab Mojie Duan,*c Qiong Wu,a Minghui Yang,ab Ling Jiang,ab Maili Liu ab and Conggang Li *ab
aKey Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences, Wuhan 430071, China. E-mail: conggangli@apm.ac.cn; zhangzeting@wipm.ac.cn
bGraduate University of Chinese Academy of Sciences, Beijing 100049, China
cInterdisciplinary Institute of NMR and Molecular Sciences, School of Chemistry and Chemical Engineering, The State Key Laboratory of Refractories and Metallurgy, Wuhan University of Science and Technology, Wuhan 430081, China. E-mail: mduan@wust.edu.cn
First published on 18th December 2024
Proteins typically adopt a single fold to carry out their function, but metamorphic proteins, with multiple folding states, defy this norm. Deciphering the mechanism of conformational interconversion of metamorphic proteins is challenging. Herein, we employed nuclear magnetic resonance (NMR), circular dichroism (CD), and all-atom molecular dynamics (MD) simulations to elucidate the mechanism of fold switching in proteins GA95 and GB95, which share 95% sequence homology. The results reveal that long-range interactions, especially aromatic π–π interactions involving residues F52, Y45, F30, and Y29, are critical for the protein switching from a 3α to a 4β + α fold. This study contributes to understanding how proteins with highly similar sequences fold into distinct conformations and may provide valuable insights into the protein folding code.
In recent years, Alexander et al. have designed proteins GA and GB by introducing mutations in two structural domains of the streptococcal protein G.15,16 These proteins exhibit high sequence homology but distinct folding patterns: GA folds into a 3α structure, while GB folds into a 4β + α structure. Their fold switching is accomplished through only a small number of amino acid mutations. Notably, in proteins GA95 and GB95, all three mutation sites are hydrophobic amino acids, two of which are aromatic amino acids. Aromatic residues are known to cluster in the hydrophobic core of folded proteins, where they stabilize proteins through the hydrophobic effect, as well as through π–π or cation–π interactions17 due to their quadrupolar electrostatic character.18,19 The configuration of the aromatic cluster plays crucial roles in protein stability, molecular recognition processes, and the catalytic functions of enzymes.20–24 Despite this relevance, the impact of aromatic cluster side-chain interactions on fold switching between different protein structures is still poorly characterized.
Research on the mechanisms of fold switching between GA and GB has attracted widespread interest. Studies have suggested that structural differences in the denatured states play key roles in determining their folding pathway.25–28 Additionally, from thermodynamic and kinetic perspectives, fold switching of GA and GB is related to the N- and C-terminal sequences,29–32 and is supported by certain local interactions.33–35 However, the mechanism underlying the fold switch between GA and GB remains insufficiently defined. Current studies primarily rely on theoretical models or computer simulations because soluble protein samples are difficult to obtain and inherently unstable. Although AlphaFold is a powerful tool for predicting protein static structures36–39 and protein–protein interactions,40 the current default algorithms still face challenges in predicting metamorphic protein conformations and the effects of sequence variation.41,42 Importantly, AlphaFold cannot provide protein folding pathways,43,44 limiting its applications in exploring folding mechanisms for metamorphic proteins. Therefore, experimental data are essential for elucidating the exact mechanism of fold switching between GA and GB.
This study investigates how the proteins GA95 and GB95, which share 95% sequence homology, fold into different conformations, using nuclear magnetic resonance (NMR) and all-atom molecular dynamics (MD) simulations to reveal the intrinsic mechanism of their fold switching. The findings highlight the impact of a transition in the early contacts of aromatic residues and the subsequent formation of an aromatic cluster in driving fold switching.
Fig. 1 Structures of GA95 and GB95. (A) and (B) The tertiary structures of GA95 (PDB id: 2KDL) and GB95 (PDB id: 2KDM). Aromatic amino acid residues are shown in stick representations with different colors: Y is shown in orange, W is shown in blue, and F is shown in green; (C) sequence comparison and secondary structure of GA95 and GB95. Three residue differences (20, 30, 45) of GA95 and GB95 are highlighted in red. Aromatic amino acid residues are highlighted with colors consistent with the stick representations.
We further extended the truncated proteins GA95-n and GB95-n (n is the number of residues) to various lengths (Table 1) to explore the minimal sequence required for tertiary structure formation. HSQC spectra (Fig. 2A–J) were used to assess their folds; well-dispersed spectra suggest stable tertiary structure formation. We found that GA becomes partially folded once the sequence reaches 52 residues (GA95-52), while GB needs three more residues, 55 in total, to become partially folded (GB95-55) (Fig. 2D and I). The structural transition from disordered to partially folded structures was also observed in one-dimensional 19F spectra of GA95-n and GB95-n, consistent with the HSQC results (Fig. S1 and S2†). These observations suggest that the three mutation sites affect the final folding of the tertiary structure mainly through long-range interactions with the C-terminal residues.
Proteins | Sequence |
---|---|
GA95-38 | [sequence image] |
GA95-45 | [sequence image] |
GA95-51 | [sequence image] |
GA95-52 | [sequence image] |
GA95-53 | [sequence image] |
GA95-(38–56) | [sequence image] |
GB95-38 | [sequence image] |
GB95-45 | [sequence image] |
GB95-54 | [sequence image] |
GB95-55 | [sequence image] |
GB95-(38–56) | [sequence image] |

a Mutation sites are in red.
We also noticed that Gly41 had weaker signal intensity in GB95-45 compared to GA95-45 in 1H–15N HSQC spectra (Fig. S3†), and its chemical shifts changed with sequence length (Fig. S4†). This suggests that although the proteins had not yet folded into a tertiary structure, the secondary structure may have been formed, resulting in observable dynamic changes. We then utilized circular dichroism (CD) to measure the secondary structure content of the truncated proteins containing two or three mutation sites in GA95 and GB95 (Fig. 2K, L, S5 and S6†). As expected, differences in secondary structures were indeed observed. There was a small difference between GA95-45 and GB95-45, whereas an increased disparity was observed between GA95-51 and GB95-54. GA95-51 had significantly more α-helices, while GB95-54 had more β-sheets.
The NMR and CD results of these truncation variants suggest that the folding of GA95 and GB95 is closely related to the three mutation sites and potentially dependent on key interactions involving the C-terminal region. Truncation of the N-terminal region was also performed. CD experiments showed that the C-terminal fragments GA95-(38–56) and GB95-(38–56) containing L45Y are disordered, with a negative peak appearing at wavelengths of less than 200 nm (Fig. S7†), which is different from the results in GB1,45 possibly due to sequence differences. This observation indicates that the folding of GA and GB into different conformations relies on long-range interactions between the N- and C-terminal regions. This is also demonstrated by structural changes in GA and GB that highlight the importance of these regions in folding.16
As indicated in Fig. 2C and D, the transition from a disordered structure to a tertiary structure occurs from residue 51 to 52 in GA95, suggesting that F52 is crucial for folding, particularly as some studies have also pointed out that F52 is involved in key interactions within protein GB.16,35,46 The interwoven aromatic interactions within proteins are well-established as pivotal for structural stability.20,24,47,48 GA95's NMR structure (PDB id: 2KDL) indicates a potential π–π interaction between F52 and Y29, whereas in GB95's NMR structure (PDB id: 2KDM) the aromatic ring of F52 lies in spatial proximity to, and can interact simultaneously with, the rings of both F30 and Y45. Therefore, it is reasonable to assume that the π–π interactions involving these three residues form an aromatic cluster, which contributes to the fold switch between GA95 and GB95. To analyze the π–π interaction network during folding, we counted the occupancy of structural states in which residue F52 interacts with either Y29 or I30/F30 in the different folded states of the protein. In GA95, only the Y29–F52 interaction was observed (Fig. 3B). In contrast, in GB95, the F52–Y29 interaction was observed only in state 4 and state 3, while the F52–F30 interaction began to occur in state 3 and became fully dominant in the subsequent folded states (state 2 and state 1) (Fig. 3D). These findings indicate that the F52–Y29 interaction persists throughout the folding process of GA95, whereas in the mid-to-late stages (from state 3 to state 2) of GB95 folding, there is a transition in interactions, with the F52–Y29 interaction being entirely replaced by the F52–F30 interaction. This shift is possibly due to a higher density of the π-electron cloud in the F30 side chain.49,50
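The occupancy counting described above can be illustrated with a minimal sketch. It assumes that ring-centroid positions have already been extracted per MD frame and that each frame carries a folding-state label; the function name `contact_occupancy`, the 7.0 Å centroid cutoff, and the six-frame toy trajectory are all hypothetical choices for illustration and are not the criteria or data used in this work.

```python
import numpy as np

def contact_occupancy(centroids_a, centroids_b, state_labels, cutoff=7.0):
    """Per-state fraction of frames in which two ring centroids are in contact.

    centroids_a, centroids_b: arrays of shape (n_frames, 3), in angstroms.
    state_labels: array of shape (n_frames,), one folding-state label per frame.
    cutoff: centroid-centroid distance threshold (illustrative assumption).
    """
    a = np.asarray(centroids_a, dtype=float)
    b = np.asarray(centroids_b, dtype=float)
    labels = np.asarray(state_labels)
    # Euclidean distance between the two centroids in every frame
    dist = np.linalg.norm(a - b, axis=1)
    in_contact = dist <= cutoff
    # Mean of a boolean mask = fraction of frames in contact within that state
    return {int(s): float(in_contact[labels == s].mean()) for s in np.unique(labels)}

# Made-up toy trajectory (NOT simulation data): F52 centroid fixed at the origin
f52 = np.zeros((6, 3))
y29 = np.array([[5, 0, 0], [6, 0, 0], [5, 0, 0],
                [12, 0, 0], [15, 0, 0], [14, 0, 0]], dtype=float)
f30 = np.array([[13, 0, 0], [12, 0, 0], [6, 0, 0],
                [5, 0, 0], [5, 0, 0], [4, 0, 0]], dtype=float)
states = np.array([4, 4, 3, 3, 2, 2])  # two frames per hypothetical state

occ_y29 = contact_occupancy(f52, y29, states)
occ_f30 = contact_occupancy(f52, f30, states)
print("F52-Y29 occupancy by state:", occ_y29)
print("F52-F30 occupancy by state:", occ_f30)
```

In this toy input the F52–Y29 contact is present in the earlier states and the F52–F30 contact takes over in the later ones, which is only meant to mirror the qualitative pattern reported for GB95. A real analysis would likely also consider ring orientation, not just centroid distance.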
To understand what causes the shift in long-range π–π interactions from F52–Y29 to F52–F30 during the β3–β4 formation step, in which GB95 transitions from the unfolded state (state 3) to the partially folded state (state 2) (Fig. 3D, E and G), we analyzed inter-residue interactions at different folding states of GB95. We observed that the interaction between Y45 and F52 occurs at state 2 (Fig. 3E). Combined with the previous observations that the F30–F52 interaction initiates at state 3 and fully replaces F52–Y29 at state 2 (Fig. 3D), this suggests a significant synergistic role for Y45 in rearranging F52 interactions, resulting in the fold switch from 3α to 4β + α. This finding also aligns with Sikosek et al.'s report that the interaction between F52 and Y45 is critical in the protein folding switch.35
The literature and our MD simulations suggest that π–π interactions between F52 and F30 also play a key role in the folding of the 4β + α structure of GB.16,35 A previous study showed that the F52A mutation induces the unfolding of GB95.16 In this study, we found that the F30I mutation also disrupted the structure and drove the unfolding of the protein (Fig. 4C). These results demonstrate that the long-range π–π interaction between F30 and F52 is essential in GB95 folding. Y29 also plays a significant role in GB95 folding. Introducing the Y29A mutation resulted in partially unfolded structures of GB95, as observed in the 1H–15N HSQC spectrum (Fig. 4D). However, compared to the complete unfolding of GB95 induced by F30I, the Y29A mutation only led to partial unfolding, suggesting that F30 plays a more critical role than Y29 in GB95 folding. Based on the NMR and MD simulation results above, we propose that the shift in interaction from Y29 to F30 is crucial for the fold switching between GA95 and GB95.
Based on the MD simulations, we found that the F30–F52 interaction facilitates the formation of main-chain hydrogen bonds of the central α-helix, ultimately triggering the folding of the 4β + α structure. However, it remains uncertain whether the F30–F52 interaction solely influences the formation of the final α-helix or plays a key role in the early stages of GB95 folding. To address this, we introduced a leucine-to-proline mutation at residue 32, known to disrupt the central α-helix.51 In GB95, this mutation only causes the disappearance of signals from some residues in the central α-helix in the 1H–15N HSQC spectrum, with new signals emerging in the center of the spectrum (Fig. 5D), indicating that the L32P mutation disrupts the central α-helix but not the overall GB95 structure. Given that F30I disrupts the structure of GB95, as mentioned earlier, we conclude that the long-range π–π interaction between F30 and F52 is crucial for the overall structure formation of GB95 during early folding stages. Furthermore, the side-chain π–π interaction involving Y45 and F52, as well as the folding of β34, is heavily reliant on F30.
Based on these results, we conclude that the Y45–F52 π–π interaction stabilizes β34 and forms an aromatic cluster with F30, facilitating the complete shift of F52's interaction from Y29 to F30 and ultimately stabilizing the structure of GB95. It follows that long-range interactions, particularly the preference for self-association among hydrophobic residues52–54 and the formation of aromatic clusters, exert a significant influence on protein structures.24,47,48 More importantly, it is plausible that metamorphic proteins have more than one potential intramolecular interaction pattern, in which a specific interaction, termed a folding switch, plays a decisive role in conformational transitions. These folding switches may be the fundamental regulatory mechanism in metamorphic protein systems, capable of being formed or disrupted in response to external factors such as mutation, ligand binding, or solution conditions,55,56 which facilitate changes in the population of conformational ensembles57 and affect structural transitions.58,59
To investigate the roles of T55 and E56 in GB95 folding, we analyzed the NMR structure of GB95 (PDB id: 2KDM) and found that the side chain OH group of T55 lies within hydrogen bond distance to the side chain carbonyl group of N8, and the side chain carbonyl group of E56 lies within hydrogen bond distance to the main chain NH group of K10 (Fig. 6G). It is reasonable to hypothesize that the hydrogen bonds between T55–N8 and E56–K10 contribute significantly to the folding of GB95.60 To test this hypothesis, we introduced T55S, T55A, or N8A mutations into GB95-55, respectively. As expected, the 1H–15N HSQC spectrum of GB95-55 T55S closely matched that of GB95-55 (Fig. S13†). In contrast, the spectra of GB95-55 T55A and GB95-55 N8A exhibited poor signal dispersion (Fig. 6B and C), indicating that the absence of the T55 and N8 side chains significantly destabilized the native structure. These results confirm that the hydrogen bonds involving the side chain OH group of T55 and the side chain carbonyl group of N8 play essential roles in stabilizing the 4β + α fold of GB95.
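As a rough illustration of what "within hydrogen bond distance" means in practice, the sketch below checks whether a donor and an acceptor heavy atom lie within a distance cutoff. The 3.5 Å threshold is a commonly used heuristic and is an assumption here, not the criterion applied to the 2KDM structure; the function name and the coordinates are hypothetical and made up for the example.

```python
import math

def within_hbond_distance(donor_xyz, acceptor_xyz, cutoff=3.5):
    """Return True if two heavy atoms are within `cutoff` angstroms.

    This is a distance-only check; it ignores the donor-H...acceptor angle,
    which stricter hydrogen-bond definitions also require.
    """
    return math.dist(donor_xyz, acceptor_xyz) <= cutoff

# Made-up coordinates (NOT taken from PDB 2KDM), in angstroms
donor_oxygen = (0.0, 0.0, 0.0)      # stands in for a side-chain OH oxygen
acceptor_near = (2.8, 0.0, 0.0)     # stands in for a nearby carbonyl oxygen
acceptor_far = (4.5, 0.0, 0.0)      # too far for a hydrogen bond

near_ok = within_hbond_distance(donor_oxygen, acceptor_near)
far_ok = within_hbond_distance(donor_oxygen, acceptor_far)
print(near_ok, far_ok)
```

For an NMR ensemble, the same check would typically be repeated over all models, since a distance satisfied in only a few models is weaker evidence for a persistent hydrogen bond.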
Furthermore, we introduced E56A, K10A, and K10E mutations into GB95. Comparing the 1H–15N HSQC spectra of these mutants with that of GB95 revealed that the E56A mutation results in a significant loss of the native structure (Fig. 6E), implying that the side chain of E56 plays a crucial role in the 4β + α fold in GB95. However, the K10A mutation did not affect the protein structure (Fig. 6F), nor did the K10E mutation (Fig. S14†), indicating that the side chain of K10 is not involved in key interactions. Therefore, we conclude that hydrogen bonds involving the side chain carbonyl group of E56 and the main chain NH group of K10 contribute significantly to the stabilization of the tertiary structure of GB95.
Surprisingly, neither the T55A nor the N8A mutation affected the tertiary structure of GB95 when the E56–K10 hydrogen bond was preserved (Fig. S15 and S16†). This suggests that the hydrogen bonds involving T55–N8 and E56–K10 act synergistically, akin to a pair of “double-breasted buckles.” When the N- and C-termini of the protein are bridged by the E56–K10 interaction, the T55–N8 hydrogen bond has little effect on protein conformation. However, if E56 is disrupted or absent, the hydrogen bond involving T55 and N8 becomes essential for maintaining conformational stability in GB95. In summary, additional long-range interactions involving the C-terminal residues are essential for the ultimate stabilization of the protein structure. In GA95, the 3α fold is stabilized by the T53–Y29 interaction, whereas in GB95, the N- and C-termini of the protein are connected by the T55–N8 and E56–K10 interactions, further stabilizing the 4β + α fold.
In summary, our work provides experimental evidence on the dynamic mechanisms underlying fold switching between different structures of proteins with highly similar sequences. It shows how reshuffling aromatic side-chain interactions within a molecule can lead to the remodeling of secondary structures, and it suggests that understanding these critical inter-residue interactions may help address future challenges in protein folding.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc04951a
‡ C. Chen and Z. Zhang contributed equally to this work.
This journal is © The Royal Society of Chemistry 2025