Qiang Shao*
Drug Discovery and Design Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China. E-mail: qshao@mail.shcnc.ac.cn
First published on 8th December 2014
Integrated-tempering-sampling molecular dynamics simulation is utilized to investigate the folding of a 67-residue three-α-helix bundle, α3W. Reversible folding and unfolding can be observed in individual trajectories and a total of 28 folding events are achieved within 7 μs simulation, giving sufficient sampling on the configuration space of protein folding. The native-like state with a left-handed topology constitutes the largest fraction of the conformational ensemble sampled by the simulation. In addition, a misfolded state with mirror-image (right-handed) topology is observed with smaller population. The free energy landscape analysis demonstrates that the folding of α3W is initiated by the formation of α-helical secondary structures and is followed by the assembling of folded α-helices to construct tertiary structure. The “correct” α-helix assembling which leads to the native structure is mainly dominated by inter-helical hydrophobic interactions whereas the “incorrect” assembling which leads to a misfolded mirror-image structure is highly affected by not only hydrophobic but also charge interactions. It is speculated on the basis of the present study on α3W and other studies on the B domain of protein A and α3D that the misfolding probability of α-helix bundle proteins is dependent on the strength of intra-protein hydrophobic and charge interactions: proteins containing stronger hydrophobic interactions but weaker charge interactions should have smaller misfolding probability. The importance of intra-protein hydrophobic interactions in preventing protein misfolding has been also seen in our previous studies on β-hairpins. Therefore, the present study along with our previous studies provide comprehensive, atomic-level picture of the folding/misfolding of α-helix bundle and β-hairpin proteins.
As popular models for in vitro investigation of protein folding, three-α-helix bundles are also geometrically interesting and has potential of misfolding. After the first two helices are aligned, the third one can be packed against either side of the plane involving the former two helices, corresponding to left-handed or right-handed conformation.9 The representative three-α-helix bundle, the B domain of protein A from Staphylococcus aureus (BdpA, 46 residues), has been indicated to adopt both native (left-handed) and topologically mirror-image (right-handed) conformations in several coarse-grained and all-atom simulations.7,8,10–13 The molecular simulation with coarse-grained united-residue (UNRES) force field by Maisuradze et al. identified the mirror-image conformation of BdpA with very low population at room temperature.7 More recent replica exchange molecular dynamics (REMD) and structure-based simulations by Noel et al. further quantified the mirror-image conformation with the occupation of 5% among folded structures of BdpA.8 According to its small population, the mirror-image conformation of BdpA has been regarded as a misfolded state by the authors. In addition, both previous replica exchange Monte Carlo (REMC) simulation by Yang et al.14 and more recent molecular dynamics (MD) simulation enhanced by dihedral-biased tempering method9 indicated that another α-helix bundle protein, α3W (67 residues) which has natively left-handed topology, could also misfold to non-native mirror-image conformation with observable population. The common observation of the mirror-image conformation in the simulations of multiple protein systems and implemented with various force fields suggests that it should be not the force field artifact.
Understanding the molecular mechanism underlying protein folding and (possible) misfolding is of fundamental importance to biology. The difficulty of experimental technologies in distinguishing the fold and misfolded states of abovementioned proteins and their respective transient states in the folding transition pathways, however, limits our approach to explore the folding/misfolding mechanism of these proteins. All-atom MD simulation can essentially provide detailed dynamic information of protein motion, but the low efficiency in sampling high-dimensional and rugged configuration space of protein makes conventional MD simulation very computationally expensive.15 Enhanced sampling methods are thus often used to improve the sampling efficiency of MD simulation, ensuring that the simulation can sufficiently sample protein configuration space.9,16–22 The associated free energy landscape analysis of simulation data can identify and energetically characterize protein folding/misfolding pathway.
Recently, we used integrated-tempering-sampling molecular dynamics (hereinafter referred to as ITS-MD) simulation23,24 to systematically investigate the folding and misfolding of β-hairpin systems.3,5,25–27 The estimated folding kinetics and thermodynamics parameters (e.g., the folding time and individual backbone hydrogen bond stability of TRPZIP2, the cold denaturation tendency of five β-hairpins of MrH1, MrH4a, MrH3a, MrH3b, and MrH4b) are in well agreement with experimental data, indicating the validity of the simulation methodology in describing β-hairpin folding.3,28–30 More importantly, the misfolded states observed in most of β-hairpin systems are consistent with those reported by aforementioned simulation studies from other research groups.1,2,6 The detailed free energy landscape analysis sheds light on the mechanism of β-hairpin misfolding which is induced by the alterable turn configuration and its competition with the cross-strand hydrophobic side chain interactions.25 In the presence of strong β-turn-promoting sequence, the turn formation precedes over the formation of other two structural elements (the packing of cross-strand hydrophobic core cluster and backbone hydrogen bond assembling). On the other hand, the strong β-turn-promoting sequence might also have high risk for the turn adopting alternative conformations of which the native configuration does not but the non-native one does conflict with the packing of hydrophobic core cluster. As a result, the native turn configuration collaborates with the hydrophobic core to form correctly folded structure whereas the non-native turn configuration interrupts the packing of the hydrophobic core to reach misfolded structure, especially for protein system containing weak hydrophobic interactions. In the presence of weak β-turn-promoting sequence and strong hydrophobic residues on the strand, the turn formation becomes the rate-limiting step and the folding process is initiated by the packing of hydrophobic core cluster, which strongly limits the approach to β-hairpin misfolding. For α-helix bundle proteins, although previous studies suggested that right-handed conformation is dominated by charge interactions,9,31 how charge interactions take into effect in the folding process to help the protein find a routine to target structure and what is the interplay between the charge interactions and the driving force for protein structure collapse are still unclear.
In this article, we utilized the same enhanced sampling simulation method (ITS-MD) to simulate the folding of a 67-residue α-helix bundle protein, α3W (PDB code: 1LQ7 (ref. 32)). In comparison to other α-helix bundle analogues with relatively smaller sizes such as BdpA, the simulation studies on α3W are quite few, some of which have been performed using coarse-grained models.9,33,34 The detailed folding mechanism of the protein is thus not clear yet. The present all-atom simulation captures not only low-energy native-like (left-handed) and misfolded mirror-image (right-handed) conformations but also various high-energy configurations along the folding transition pathway back and forth in individual trajectories. The captured mirror-image conformation has the same topology as that reported in previous simulations of α3W.9,14 With sufficient protein folding/misfolding events collected in simulation, the folding and misfolding pathways of α3W can be identified through the free energy landscape analysis. It is observed that the folding pathway of α3W is mainly dominated by inter-helical hydrophobic interactions whereas the misfolding is also highly affected by inter-helical charge interactions. In addition, the present study suggests that the misfolding of mirror-image conformation should be easier to occur for α-helix bundle protein containing less hydrophobic but more charged residues. Such effects of intra-protein hydrophobic interaction strength on the misfolding tendency can be also seen for β-hairpin system.25 Therefore, the present study together with our previous studies on β-hairpins3,5,25 provide comprehensive, atomic-level picture of the folding and misfolding of α-helix bundle and β-hairpin.
Multiple independent simulation trajectories were performed. In each trajectory, the initial extended structure of the protein without residual secondary and tertiary structure elements was first subjected to 2500 steps of minimization. Then the system temperature was increased upon 300 K using velocity rearrangement from a Maxwell–Boltzmann distribution. After that, the system temperature was maintained at 300 K using the weak-coupling algorithm with a coupling time constant of 0.5 ps.39 The long-time equilibrium simulation (production run) was performed with a step size of 0.002 ps and at the constant temperature of 300 K, but with a potential modified via the ITS method to encourage thorough sampling over a large potential energy range. The calculation data were collected every 2.0 ps and all data from the production run were used for data analysis.
The details of ITS method have been described previously.23,24 Briefly, a modified potential energy is obtained from an integration function over temperature:
![]() | (1) |
![]() | (2) |
Then the integration in eqn (1) could be written as a summation over a series of discrete temperature values . Compared to conventional MD simulation, the sampled energy range in ITS-MD simulation could be greatly expanded depending upon the range of βk used. In the present study, 300 temperatures (βk), evenly distributed in the range of 270–390 K, were used. A preliminary iteration process was employed to obtain converged values of discrete nk's, which were then used in the long-time production run to evenly sample the entire energy range. After the data were collected in production run using the modified potential with the set of f(βk)'s, the thermodynamic properties of the simulated system at a desired temperature β were then calculated by reweighting of each term by timing a factor
.
The number of folding events is accumulated to ∼28 in all trajectories. The sufficient sampling of protein folding events allows us to statistically analyze the weight of various structures including the unfolded, transient, and folded structures along protein folding pathway. The one-dimensional free energy profile as a function of backbone heavy-atom RMSD was calculated at room temperature (Fig. 2(a)). As shown in this figure, the local free energy minimum at small RMSD position (∼3.90 Å) has the lowest free energy, indicating that the native-like (left-handed) state of α3W is the most populated. In addition, another local minimum with higher free energy is also presented at RMSD ≈ 9.2 Å, corresponding to the mirror-image (right-handed) state. The free energy difference between the native and misfolded states is about −0.37 kcal mol−1, making the former state roughly 1.86 times more populated than the latter state. The free energy profile depicted as a function of another reaction coordinate, the fraction of native contacts of α3W, also indicates that the native state is more thermodynamically favorable than the misfolded state (the free energy difference is −0.88 kcal mol−1, Fig. 2(b)). In addition, a free energy barrier needs to be overcome for the transition from the misfolded state to native state (1.40 kcal mol−1 in Fig. 2(a) and 0.628 kcal mol−1 in Fig. 2(b)).
To illuminate clearly the structure difference of the two states, we next calculated their contact maps using the contact map analysis (CMA) server in SPACE tools.40 One can see from Fig. 3 that the main difference is in the residue–residue contacts between the helix H3 and the other two helices (see the regions circled in Fig. 3). The observed mirror-image structure of α3W here has similar geometric features as that in previous explicit solvent MD simulation of the same protein:9 both mirror-image structures have similar chirality (opposite to the native structure) and have less side chain-side chain contacts among hydrophobic residues than the native structure.
![]() | ||
Fig. 3 The contact maps of (a) left-handed and (b) right-handed structures of α3W. The contact map is calculated by the contact map analysis (CMA) server in SPACE tools.40 |
It is worth noting that the use of ITS-MD simulation can save the computational resource in comparison to other enhanced sampling methods. In our previous study on an Ala–Pro peptide,41 we found that ITS-MD has an apparently higher efficiency than other bias potential and generalized ensemble methods, e.g., accelerated molecular dynamics (AMD) and replica-exchange molecular dynamics (REMD) simulations, in generating the thermodynamics parameters for the trans and cis interstate transition of the peptide. The present simulation ran 12 independent trajectories and each trajectory lasted ∼600 ns. To achieve converged free energy profiles for a smaller α-helix bundle protein, engrailed homeodomain protein (EnHD, 54 residues), Joshi and co-workers performed implicit solvent REMD simulations of which either 30 parallel trajectories (replicas) covering the temperature range of 286–373 K (ref. 42) or 26 replicas covering the temperature range of 292–367 K (ref. 43) were implemented and the simulation per replica was lasted for 600 ns–2.0 μs. The required computational resource and simulation time are greater than those in the present study. In addition, the coarse-grained canonical and multiplexed REMD simulation on the 46-residue BdpA protein used 80 replicas to cover the temperature range of 250–500 K.7 The use of explicit solvent REMD to study the folding of protein is even more computationally expensive, e.g., a total of 64 replicas with the temperature range of 287–643 K were used by Noel et al. and 82 replicas with the temperature range of 277–548 K were used by Garcia et al. in their respective folding simulations on BdpA protein.8,44
Molecular interactions | Native (left-handed) conformation | Misfolded (right-handed) conformation | ||||
---|---|---|---|---|---|---|
H1–H2 | H1–H3 | H2–H3 | H1–H2 | H1–H3 | H2–H3 | |
Inter-helical hydrophobic interactions | V4–I41 | V4–V50 | I27–I64 | V4–L44 | L7–V53 | I27–I64 |
V4–L44 | L7–V50 | I27–L67 | A6–L44 | L7–V57 | I27–L67 | |
L7–L37 | L7–V53 | L30–L60 | L7–L37 | V11–V53 | W34–V57 | |
L7–I41 | L7–V57 | L30–I64 | L7–I41 | V11–V57 | W34–L60 | |
L7–L44 | V11–V57 | L30–L67 | L7–L44 | V11–L60 | W34–I64 | |
V11–W34 | L14–V57 | W34–V57 | L14–L30 | L14–L60 | L37–V57 | |
V11–L37 | L14–L60 | W34–L60 | L14–W34 | L14–I64 | L37–L60 | |
V11–I41 | L14–I64 | L37–V53 | L14–L37 | V18–L60 | I41–V53 | |
L14W34 | V18–I64 | L37–V57 | V18–I27 | V18–I64 | I41–V57 | |
V18–I27 | L21–I64 | L37–L60 | V18–L30 | V18–L67 | — | |
V18–L30 | L21–L67 | I41–V53 | L21–I27 | — | — | |
V18–W34 | — | L44–V53 | — | — | — | |
Inter-helical favorite charge interactions (salt bridges) | E15–K38 | R3–E54 | E29–K59 | K17–E29 | R3–E54 | K38–E61 |
— | K10–E61 | K33–E56 | — | K12–E56 | — | |
— | K17–E61 | K40–E56 | — | E15–K59 | — | |
— | — | — | — | K19–E63 | — | |
Inter-helical unfavorite charge interactions | K12–K38 | R3–K51 | E29–E63 | K10–K33 | E8–E56 | — |
K19–K31 | K10–K58 | K33–K59 | K17–R26 | E15–E63 | — | |
— | — | K40–K52 | — | — | — |
To understand the interplay of the formation of secondary and tertiary structures in the folding/misfolding pathways of α3W, we used the total number of backbone hydrogen bonds formed (NHB) and the number of inter-helical side chain contacts (CNat) to depict two-dimensional free energy landscapes for both native and mirror-image conformations. The former reaction coordinate is often used to describe secondary structure and the latter corresponds to tertiary structure. It is noteworthy that the detailed backbone hydrogen bonds are similar but the components of inter-helical side chain contacts are different for the two conformations.
As shown in Fig. 4, three distinct local minima exist in these profiles. A large fraction of secondary structure contents are formed even in local minima containing fewest side chain contacts. Thus, the three local minima could be treated as two partially folded/misfolded states (PL1 and PL2 in Fig. 4(a) corresponding to the partially folded left-handed states and PR1 and PR2 in Fig. 4(b) corresponding to the partially misfolded right-handed states) and one folded (F) or misfolded (M) state. The connection of the three distinct states in each profile constructs the lowest free energy folding or misfolding pathway for protein. One can see that the number of backbone hydrogen bonds increases and reaches the maximum in very early stage of the folding of α3W (PL1 or PR1 state) where only a small portion of inter-helical side chain contacts are formed. The further folding of α3W involves the formation of more side chain contacts while the secondary structure is fully formed. Therefore, for either native or misfolded conformation, the formation of secondary structures precedes over that of tertiary structure, consistent with the framework mechanism: in the folding process of protein, its secondary structures are formed incipiently and independently of tertiary structure, then the formed secondary structure elements coalesce into native structure.45 This observation is in well agreement with previous large-scale Monte Carlo simulation study of α3W.33
The fractions of hydrophobic and charge interactions in each distinct state along the folding and misfolding pathways of α3W, respectively, were calculated. One can see from Table 2 that the fraction of hydrophobic interactions increases gradually in the transition pathways of both native and misfolded conformations. For left-handed conformation, the fraction of charge interactions is always smaller than that of hydrophobic interactions in all three states, suggesting that hydrophobic interactions but not charge interactions largely contribute for the folding of left-handed conformation. In contrast, although only a small fraction of charge interactions are formed in the very early stage of the formation of right-handed conformation (PR1 state), most of charge interactions are formed in the middle and all of them are formed at the end of the misfolding pathway, strongly suggesting that charge interactions also play an important role in the misfolding of α3W to right-handed conformation. This observation is consistent with the speculation that charge interactions dominate the formation of right-handed conformation of α-helix bundle.9,31
Left-handed conformation | Right-handed conformation | |||||
---|---|---|---|---|---|---|
PL1 state | PL2 state | F state | PR1 state | PR2 state | M state | |
Fraction of hydrophobic interactions | 31.26% | 43.50% | 70.08% | 32.07% | 52.88% | 79.16% |
Fraction of charge interactions | 18.50% | 26.98% | 35.66% | 16.10% | 65.10% | 87.08% |
Next, to distinguish the folding pathway of left-handed conformation from the misfolding pathway of right-handed conformation, we calculated the free energy landscape as the function of the fraction of all inter-helical side chain contacts for both conformations (QLeft and QRight). The local minima corresponding to the partially folded/misfolded and fully folded/misfolded states are also presented in Fig. 5. It is clear to be seen that the lowest free energy folding and misfolding pathways are roughly mirror symmetric along the diagonal axis. Therefore, the divergence of the folding of left-handed and the misfolding of right-handed conformations of α3W occurs in the very early stage of α3W folding process. The average number of inter-helical hydrophobic contacts (the main content of tertiary interactions) formed within each subdomain in the distinct states along the folding/misfolding pathways of left-handed and right-handed conformations was calculated and organized in Table 3. For both conformations, more hydrophobic contacts are formed in H1–H2 subdomain than in the other two subdomains in the early stage of α3W folding (PL1 or PR1 state) and the further folding leads to the formation of more hydrophobic contacts in all subdomains.
Left native conformation | Right non-native conformation | |||||
---|---|---|---|---|---|---|
PL1 state | PL2 state | FL state | PR1 state | PR2 state | FR state | |
NHC, H1–H2 | 5.25 | 5.53 | 8.25 | 4.98 | 6.42 | 10.35 |
NHC, H1–H3 | 2.84 | 6.50 | 8.52 | 2.88 | 3.02 | 5.84 |
NHC, H2–H3 | 2.85 | 3.17 | 7.75 | 3.68 | 5.76 | 6.28 |
![]() | ||
Fig. 6 The temperature dependence of the average fraction of inter-helical side chain interactions for left- and right-handed conformations of α3W. |
To understand the different temperature resistance of the native and mirror-image conformations of α3W, we calculated the average fraction of inter-helical hydrophobic and charge interactions formed, respectively (Fig. 7). Both kinds of interactions have their fraction decreased with temperature in native conformation, corresponding to the low temperature resistance of this conformation. However, while the fraction of hydrophobic interactions keeps decreasing, the fraction of charge interactions in the non-native right-handed conformation keeps small but steady in a large temperature range. The small value of the fraction of charge interactions corresponds to the low population of non-native right-handed conformation and its steadiness explains the relatively strong temperature resistance of this conformation.
![]() | ||
Fig. 7 The temperature dependence of the average fraction of inter-helical hydrophobic and charge interactions for (a) left- and (b) right-handed conformations of α3W. |
In principle, the accuracy of theoretical simulation largely depends on the molecular force field used. The popular combination of AMBER FF03 force field and GB implicit solvent model, which has been widely used in many previous simulations on the members of α-helix bundle proteins,42,46–50 was used here to model protein amino acids and water solvent, respectively. Although it is well-known that FF03 favors α-helical conformations,51–54 these abovementioned simulations can still provide reasonable description on the folding of α-helix bundle proteins, with the calculated thermodynamics parameters (e.g., melting temperature and folding free energy) consistent with experimental data. Very recently, Simmerling and co-workers indicated that the molecular simulation combined with the improved generalized Born (GB) implicit solvent model, GB-Neck2 (igb = 8),37 can fold proteins with diverse topologies.55 The combination of AMBER FF03 force field and GB-Neck2 model in the present simulation can not only fold α3W into its NMR experimentally identified native (left-handed) structure but also capture a misfolded mirror-image (right-handed) structure “hidden” behind the native one which has the similar topology as that observed in previous explicit solvent MD simulation,9 indicating its efficiency in the sampling of protein configuration space. The former conformation occupies the largest fraction of the equilibrium conformational ensembles sampled by the simulation whereas the population of the latter is much smaller.
The detailed free energy landscape analysis indicates that the folding of α3W starts with the formation of all α-helical secondary structures and is then followed by the assembling (packing) of folded α-helices to construct the tertiary topology. The α-helix assembling which involves the formation of inter-helical side chain contacts e.g., by hydrophobic and charge interactions, however, could divaricate at the early stage of folding process, depending on which kinds of interaction makes dominant contribution in the structure assembling. The “correct” assembling which leads to native topology is solely dominated by inter-helical hydrophobic interactions whereas the “incorrect” assembling which leads to misfolded mirror-image structure is highly affected by not only inter-helical hydrophobic interactions but also charge interactions. The important role of hydrophobic interactions in folding native structure can be seen from the larger amount of hydrophobic interactions in native structure than those in mirror-image structure (Table 1) as well as the large fraction of hydrophobic interactions formed in the folding pathway (Table 2). Similarly, the large fraction of charge interactions in the misfolding pathway (Table 2) reflects its importance in α3W misfolding.
Considering the importance of hydrophobic interactions to folding and the importance of charge interactions to misfolding of α3W, one may speculate that the misfolding probability could be increased in the presence of less hydrophobic but more charged residues in amino acid sequence for α-helix bundle protein. Using α3W, BdpA, and another α-helix bundle protein α3D as examples, one can see that the hydrophobic and charged residues occupy 31.34% and 53.74% in the amino acid sequence of α3W, and the counterparts in BdpA occupy 41.30% and 26.08% and in α3D occupy 51.8% and 33.2%, respectively. Accordingly, the mirror-image structure of α3W has larger population than the mirror-image structure of BdpA (roughly 34.96% of folded structures of α3W are in mirror-image conformation as shown in the present study but the number is only 5% for BdpA in ref. 8). In addition, both the explicit solvent MD simulation by Zhang and Ma9 and our recent implicit solvent MD simulation50 showed that no mirror-image structure is adopted for α3D which has the largest fraction of hydrophobic residues in the sequence among the three proteins under study. Therefore, the misfolding probability of an α-helix bundle protein could be evaluated by its amino acid sequence information. It is worth noting that our previous simulation suggested that the folding mechanism of an α-helix bundle protein is also dependent on its sequence: α-helical proteins containing higher percentage of hydrophobic residues than charged ones fold via nucleation–condensation mechanism (concurrent formation of secondary and tertiary structures) whereas proteins having opposite tendency in amino acid composition more likely fold via framework mechanism (incipient and independent formation of secondary structures before tertiary interactions).50 In this scenario, both BdpA and α3D fold via nucleation–condensation mechanism whereas the folding of α3W follows framework mechanism.50 Therefore, one may further speculate that the cooperative folding (nucleation–condensation) probably decrease the probability of misfolding. Combining the present study and our previous studies on β-hairpins,5,23–26 we can find one common feature in the misfolding of α-helix bundle and β-hairpin: proteins containing stronger intra-protein hydrophobic interactions have smaller probability to misfold.
This journal is © The Royal Society of Chemistry 2015 |