Open Access Article
Jiayu Li and Hongbin Li*
Department of Chemistry, University of British Columbia, 2036 Main Mall, Vancouver, BC V6T 1Z1, Canada. E-mail: hongbin@chem.ubc.ca
First published on 30th May 2022
Metalloproteins account for over one-third of all proteins in nature and play important roles in biological processes. The formation of the native structures of metalloproteins requires not only the correct folding of the polypeptide chains but also the proper incorporation of metal cofactors. Understanding the folding mechanism of metalloproteins has been challenging. Horse heart cytochrome c (cytc) is a classical model system for protein folding studies. Although a large number of ensemble studies have been carried out to characterize the folding mechanism of cytc, there is still significant debate about the folding mechanism and the existence of the proposed “foldons”. Here, we used single-molecule optical tweezers to probe the mechanical folding–unfolding behaviors of cytc at the single-molecule level. By directly monitoring the folding and unfolding of holo-cytc, we revealed novel insights into the folding of cytc. Our results showed that the structural elements that are distant from the N- and C-termini can exist as a short-lived intermediate, a finding that contrasts with the general belief that the folding and packing of the N- and C-terminal helices are prerequisites for the folding of other structural elements in cytc. In addition, our results present strong evidence that apo-cytc, which has long been believed to be a random coil, is not a true random coil, and that weak interactions exist within the unfolded polypeptide chain. Our results bring new insights into the folding mechanisms of heme proteins as well as the role of heme in the folding process.
Fig. 1 The structure of horse heart cytc. (A) Three-dimensional structure of horse heart cytc (PDB code: 1hrc). Cytc is a 104 aa helical protein containing a c-type heme cofactor. The heme is covalently bound to the polypeptide chain through two thioether bonds, and the heme iron forms axial Fe–N and Fe–S coordination bonds with a histidine and a methionine residue. (B) Schematics of the structure of cytc. The black circle indicates the porphyrin ring, the spirals indicate the helices, and the blue dotted lines represent the coordination bonds. (C) Schematic of the optical tweezer experiment to investigate the mechanical unfolding–folding of cytc.
The folding–unfolding mechanisms of apo- and holo-forms of cytc have been studied extensively using various experimental techniques at the ensemble level. Early equilibrium chemical denaturation studies suggested that cytc exhibits two-state folding behavior,7 while more recent spectroscopic and calorimetric studies provide strong evidence showing the existence of both on-pathway and off-pathway folding intermediate states.8–12 In addition to equilibrium studies, kinetic studies on cytc have also confirmed the complexity of the folding process and identified possible misfolded conformations either involving the non-native coordination to the heme iron of wrong histidines, lysines, and even the N-terminal amino group, or arising from proline isomerization.13–16 The hydrogen exchange method has also been used to study the folding–unfolding of cytc, and a foldon dependent hierarchical multistep folding–unfolding mechanism was proposed.17–19 In contrast to holo-cytc, the heme-free apo-form cytc (apo-cytc) has long been considered as a random coil with no folded structures based on results from multiple spectroscopic experiments.20,21 It is thus believed that the heme plays a decisive role in transforming the random coil structure into the folded globular conformation of cytc, and the flexible and disordered conformation of apo-cytc is believed to facilitate the accepting and enveloping of the heme cofactor during the folding of holo-cytc.20,21 Despite this progress, the detailed folding mechanism of holo-cytc is still under debate, and the random coil conformation of apo-cytc remains to be substantiated.
Over the last two decades, single-molecule force spectroscopy (SMFS) techniques, including atomic force microscopy (AFM), optical tweezers (OT) and magnetic tweezers, have evolved into powerful tools to probe protein folding–unfolding mechanisms and nucleic acid conformational dynamics at the single-molecule level.22–29 By mechanically stretching/relaxing a protein from its two chosen residues, one can use SMFS techniques to probe the folding–unfolding reaction of a protein in real-time along a well-defined reaction coordinate at the single-molecule level, revealing unique insights into the protein folding–unfolding mechanism. These features have enabled SMFS techniques to become an important new tool to probe the folding–unfolding mechanism of metalloproteins that are otherwise difficult to study using traditional biophysical methods in vitro.30–39 Here, we used the single-molecule OT technique to investigate the mechanical folding–unfolding behaviors of the holo- and apo-forms of horse heart cytc. On the one hand, OTs enable one to investigate the unfolding/folding of apo-cytc in a mechanical setting that partially mimics that of the translocation process; on the other hand, OTs allow investigation of the unfolding/folding of holo-cytc along a well-defined reaction coordinate set by the applied stretching force. Our results showed that holo-cytc is mechanically stable and unfolds following two distinct pathways: a two-state unfolding pathway and a three-state unfolding pathway involving an intermediate state. In contrast, the folding of holo-cytc followed an apparent two-state pathway without the accumulation of any observed intermediate state. Moreover, our results showed that apo-cytc demonstrates some intrachain interactions and may form structures that exhibit low mechanical resistance. 
Our results revealed some new insights into the conformation of apo-cytc and the folding–unfolding mechanism of holo-cytc and help elucidate the role played by the heme in the folding of cytc.
The gene of cytc (C14, 17A) was obtained by standard site-directed mutagenesis methods. The genes of cytc and cytc (C14, 17A) were amplified using standard PCR to carry 5′ BamHI (G′GATCC) and 3′ KpnI restriction sites. They were then subcloned into two modified pQE80L (Qiagen, Valencia, CA) expression vectors, which allow for adding a cysteine, or a cysteine together with an NuG2 domain, to both termini of the protein. All the sequences were confirmed by direct DNA sequencing.
All the recombinant proteins were overexpressed in the Escherichia coli strain BL21 (DE3) at 37 °C. To express holo-cytc, 5 mL of preculture was inoculated in 2 L of rich medium (12 g L−1 tryptone, 24 g L−1 yeast extract, 4 mL L−1 glycerol, 2.3 g L−1 KH2PO4, and 12.5 g L−1 K2HPO4) containing 100 μg mL−1 ampicillin, and the protein expression continued for 30 h.40 To express apo-cytc, 3 mL of preculture was inoculated in 200 mL of 2.5% Luria–Bertani medium containing 100 mg L−1 ampicillin, and when the OD600 of the culture reached ∼0.7, protein overexpression was induced with 0.5 mM isopropyl-β-D-1-thiogalactopyranoside (Thermo Fisher Scientific, Waltham, MA) and continued for 4 h. Cells from both cultures were pelleted by centrifugation at 4000g for 10 min at 4 °C and resuspended in 10 mL of phosphate-buffered saline (PBS, 10 mM, pH 7.4). After adding 10 μL of protease inhibitor cocktail (Sigma-Aldrich, St. Louis, MO), 100 μL of 50 mg mL−1 lysozyme from egg white (Sigma-Aldrich, St. Louis, MO), 1 mL of 10% (w/v) Triton X-100 (VWR, Tualatin, OR), and 50 μL of 1 mg mL−1 DNase I (Sigma-Aldrich, St. Louis, MO) and RNase A (Bio Basic Canada Inc, Markham, ON), cells were lysed for 40 min on ice. Cell debris was then removed by centrifugation at 22 000g at 4 °C, and the supernatant was loaded into a Co2+ affinity chromatography column (Takara Bio USA Inc, Mountain View, CA). After washing the column with 50 mL of washing buffer (10 mM PBS, 300 mM NaCl, 7 mM imidazole, pH 7.4), the protein was eluted with 2 mL of elution buffer (10 mM PBS, 300 mM NaCl, 250 mM imidazole, pH 7.4).
Because the plots of ln α0 and ln β0 do not show any obvious curvature, we used the Bell–Evans model for fitting the data. In doing so, we did not need to assume a certain shape of the free energy profile for the mechanical unfolding/folding of cytc, which is required for other models, such as the Dudko–Hummer–Szabo model.45
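The link between the absence of curvature and the choice of model can be written out explicitly. In its standard form, the Bell–Evans model assumes that force tilts the energy landscape linearly, so the logarithm of each rate constant is a straight line in force. The symbols Δx_u and Δx_f (the unfolding and folding distances), k_B T, and the force-dependent rates α(F) and β(F) are notation assumed here for illustration and are not taken from the text above.

```latex
\alpha(F) = \alpha_0 \exp\!\left(\frac{F\,\Delta x_u}{k_B T}\right),
\qquad
\beta(F) = \beta_0 \exp\!\left(-\frac{F\,\Delta x_f}{k_B T}\right)

\ln \alpha(F) = \ln \alpha_0 + \frac{\Delta x_u}{k_B T}\,F,
\qquad
\ln \beta(F) = \ln \beta_0 - \frac{\Delta x_f}{k_B T}\,F
```

Under these assumptions, the zero-force intercepts give α0 and β0 and the slopes give the two distances; a visibly curved plot would instead call for a model that specifies the shape of the free energy profile.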
For OT experiments, we coupled two dsDNA handles with Cys-holo-cytc-NuG2-Cys via thiol-maleimide coupling chemistry to create the DNA-protein-DNA chimera. Stretching Cys-holo-cytc-NuG2-Cys allowed us to stretch the reduced holo-cytc between its N- and C-termini. Fig. 3A shows the representative force–distance curves of the protein-DNA chimera at a pulling speed of 50 nm s−1. Each curve displayed two distinct unfolding/folding events. As expected, the fingerprint domain NuG2 (colored in cyan) unfolded at ∼20–40 pN and folded at ∼8 pN with a ΔLc of 18 nm. Hence, the unfolding/folding events of holo-cytc (colored in red and blue) can be readily identified from the force–distance curves. Evidently, the unfolding of cytc mostly occurred between ∼25 and 30 pN. The great majority of the native holo-cytc unfolded via an apparent two-state pathway (∼92%, 258 out of 279 events), and a small percentage occurred following a three-state pathway involving one short-lived intermediate state (∼8%, 21 out of 279 events) (Fig. 3A). Fitting the force–extension relationships of holo-cytc using the worm-like chain model (WLC) of polymer elasticity yielded a ΔLc of 34.6 ± 1.2 nm (average ± standard deviation) for the complete unfolding of holo-cytc (Fig. 3B).49 For the three-state unfolding pathway, cytc displayed a ΔLc1 of ∼15 nm (from the native to the intermediate state) and a ΔLc2 of ∼20 nm (from the intermediate state to the unfolded state). Based on the structure of holo-cytc, it is expected that the complete mechanical unfolding of holo-cytc should lead to a ΔLc of 35.7 nm (104 aa × 0.36 nm/aa − 1.7 nm = 35.7 nm, where 0.36 nm/aa is the length of an aa residue and 1.7 nm is the distance between the N- and C-termini). This value is in close agreement with the experimentally determined ΔLc of ∼34.6 nm, confirming that the two-state and three-state unfolding events we observed indeed correspond to the complete unfolding of holo-cytc.
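The contour length arithmetic above can be checked with a few lines of Python. The sketch below also includes the commonly used Marko–Siggia interpolation formula for the worm-like chain; whether this exact form was used for the fits is an assumption, and the persistence length (0.8 nm) and thermal energy (4.11 pN nm, roughly room temperature) are illustrative values that do not come from the text.

```python
# Illustrative sketch only: parameter values marked "assumed" are not from the paper.

KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def expected_delta_lc(n_residues, length_per_aa_nm=0.36, end_to_end_nm=1.7):
    """Contour length gain on complete unfolding, in nm.

    Fully stretched chain length minus the N-to-C distance in the folded state.
    """
    return n_residues * length_per_aa_nm - end_to_end_nm


def wlc_force(extension_nm, contour_length_nm, persistence_length_nm=0.8):
    """Marko-Siggia interpolation of the worm-like chain force, in pN.

    persistence_length_nm is an assumed, illustrative value.
    """
    x = extension_nm / contour_length_nm
    if not 0.0 <= x < 1.0:
        raise ValueError("extension must be in [0, contour length)")
    prefactor = KBT_PN_NM / persistence_length_nm
    return prefactor * (0.25 / (1.0 - x) ** 2 - 0.25 + x)


if __name__ == "__main__":
    predicted = expected_delta_lc(104)  # 104 aa, values quoted in the text
    print(f"predicted delta Lc: {predicted:.2f} nm")
    for ext in (10.0, 20.0, 30.0):
        print(f"extension {ext:.0f} nm -> force {wlc_force(ext, predicted):.1f} pN")
```

The predicted value can then be compared directly with the fitted ΔLc of 34.6 ± 1.2 nm.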
It is important to note that in our OT experiments, cytc is stretched from its N- and C-termini. Both termini pack against each other in the folded structure of cytc, effectively shielding the rest of the cytc structure from the stretching force (Fig. 1A). Due to this feature, it is apparent that in order to extend cytc, its N- and C-termini must be separated (and/or unfolded) in the first step. In the three-state unfolding pathway, the mechanical unfolding of cytc into its unfolding intermediate shows a ΔLc of ∼15 nm, suggesting that the N- and C-termini are separated by ∼15 nm. Hence, in the mechanical unfolding intermediate state, the structure formed by the N- and C-terminal helices must have been separated and likely unravelled. However, it remains unknown whether other parts of cytc are unravelled.
Most of these refolded holo-cytc molecules (265 out of the 294 refolding events) subsequently unfolded at ∼20–30 pN with a ΔLc of ∼35 nm, in the same way as the pristine holo-cytc unfolded, suggesting that these refolded holo-cytc molecules were correctly folded into their native states in these cases. To obtain the spontaneous unfolding/folding rate constants and unfolding/folding distances, we measured the unfolding and folding rate constants as a function of force using the Oesterhelt method.42 Fitting the experimental data to the Bell–Evans model (Fig. 3D) yielded a spontaneous unfolding rate constant α0 of (1.83 ± 1.13) × 10⁻⁸ s−1 and a folding rate constant β0 of (4.94 ± 2.00) × 10² s−1 at zero force.43 It is worth noting that the spontaneous unfolding rate constant α0 of cytc is extremely small (1.83 × 10⁻⁸ s−1), which is consistent with that measured from chemical denaturation studies (∼3 × 10⁻¹⁰ s−1),10 yet cytc can be readily unfolded at ∼28 pN, an acceleration of ∼10⁹ times. This significant acceleration is achieved by the extremely malleable native state, which is characterized by a large unfolding distance of cytc (∼2.7 nm).
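To see how a large unfolding distance translates into a large force-induced acceleration, the standard Bell–Evans expression can be evaluated with the numbers quoted above. This is a minimal sketch: the thermal energy (4.11 pN nm, roughly room temperature) is an assumed value, and the function name is hypothetical. Because the result depends exponentially on the inputs, it should be read as an order-of-magnitude estimate and need not reproduce the quoted acceleration exactly.

```python
import math

KBT_PN_NM = 4.11  # assumed thermal energy near room temperature, in pN*nm


def bell_evans_unfolding_rate(alpha0_per_s, force_pn, distance_nm, kbt=KBT_PN_NM):
    """Unfolding rate under force in the standard Bell-Evans form.

    alpha(F) = alpha0 * exp(F * dx / kBT)
    """
    return alpha0_per_s * math.exp(force_pn * distance_nm / kbt)


# Values quoted in the text: alpha0, typical unfolding force, unfolding distance.
alpha0 = 1.83e-8   # s^-1
force = 28.0       # pN
distance = 2.7     # nm

alpha_at_force = bell_evans_unfolding_rate(alpha0, force, distance)
acceleration = alpha_at_force / alpha0

if __name__ == "__main__":
    print(f"unfolding rate at {force:.0f} pN: {alpha_at_force:.3g} s^-1")
    print(f"acceleration relative to zero force: {acceleration:.3g}x")
```

Halving the distance in this sketch reduces the exponent by half, which shows why the size of the unfolding distance, rather than the force alone, dominates the acceleration.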
It is important to note that, in these refolding events, holo-cytc did not always refold successfully into its native state. We observed that in ∼10% of the refolding events (29 out of 294), their subsequent unfolding occurred at forces lower than 15 pN, significantly lower than that of the native holo-cytc (∼28 pN) (Fig. 4A, cycle 1), suggesting that these molecules folded into a structure that is mechanically much more labile than the native state. In addition, about 5% of the relaxation traces (17 out of 316) showed a refolding event of holo-cytc with shorter folding ΔLcs (Fig. 4A, cycle 2); and in rare cases (∼2%, 5 out of the 316 relaxation traces), cytc did not fold at all (i.e. no folding event in the relaxation curve, and no unfolding event in the subsequent stretching curve) (Fig. 4A, cycle 3). These phenomena indicated that, besides folding back into its native state, unfolded holo-cytc may also misfold into non-native structures or not fold at all during the relaxation. The refolded holo-cytc with a lower mechanical unfolding force (<15 pN) or a shorter ΔLc is classified as misfolded holo-cytc. It is worth noting that refolding and misfolding can occur in the same molecule (Fig. 4B left panel), and the misfolding occurs at a low frequency (Fig. 4B right panel). In addition, after the unfolding of a misfolded holo-cytc, holo-cytc can refold back to its native state and regain its mechanical stability (with an unfolding force of ∼28 pN). These results suggest that the misfolding of holo-cytc is not irreversible, and the misfolded state can be mechanically unfolded to allow its subsequent correct refolding. Similarly, not being able to fold in a few cycles does not render a permanent loss of the ability of the unfolded holo-cytc to refold, as the unfolded holo-cytc could refold into its native state in the subsequent relaxation cycles (Fig. 4C). 
We also found that, after the holo-cytc started to misfold or did not fold, extending the folding time at 0 pN to up to 60 s did not change the folding probability of cytc to its native state significantly (Fig. 4C), implying that such states were not productive folding intermediate states.
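The event counts reported above can be tallied to check that the quoted percentages are mutually consistent. The sketch assumes that the four outcomes are mutually exclusive and together account for all 316 relaxation traces, which the counts are compatible with but which is not stated explicitly; the category labels are informal.

```python
# Counts quoted in the text; category labels are informal shorthand.
outcomes = {
    "refolded, native-like unfolding": 265,
    "refolded, unfolds below 15 pN": 29,
    "refolded with shorter delta Lc": 17,
    "no folding observed": 5,
}

# Assumption: the categories are exclusive and cover all relaxation traces.
total_traces = sum(outcomes.values())
full_length_refolding = (
    outcomes["refolded, native-like unfolding"]
    + outcomes["refolded, unfolds below 15 pN"]
)


def percent(part, whole):
    """Percentage of `part` in `whole`."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"total relaxation traces: {total_traces}")
    print(f"full-length refolding events: {full_length_refolding}")
    low_force = percent(outcomes["refolded, unfolds below 15 pN"], full_length_refolding)
    print(f"low-force fraction of refolding events: {low_force:.1f}%")
    for label in ("refolded with shorter delta Lc", "no folding observed"):
        print(f"{label}: {percent(outcomes[label], total_traces):.1f}% of traces")
```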
Previous OT studies50 showed that the molten globule of apo-myoglobin, in which most of the secondary structures have formed in a native-like geometry but the tertiary contacts are not fully formed,51,52 displayed mechanical unfolding behaviors similar to those of misfolded holo-cytc, i.e. significantly lower unfolding forces but the same ΔLc as that of the native state. Due to this similarity, we cannot completely rule out the possibility that the “misfolded” conformation of holo-cytc with a lower stability but the same ΔLc is the molten globule state of holo-cytc. However, the molten globule state usually leads to the correct folding of the native state. Our observation that this misfolded conformation of cytc can persist for some extended time without folding into its native state suggests that this conformation is likely to be a misfolded state or a kinetically trapped off-pathway folding intermediate, rather than the molten globule state. Nonetheless, the molecular mechanism underlying this misfolding behavior in the mechanical unfolding–folding experiments at the single-molecule level is unknown. It is possible that proline isomerization and the mis-ligation of heme play some roles, as proposed for the misfolding of cytc in the chemical folding–unfolding process.17
We then used OT to examine the mechanical response of apo-cytc to further examine this seemingly random coil. Apo-cytc has two free endogenous cysteine residues (Cys 14 and Cys 17), which may react with two extra dsDNA handles in the coupling process and result in an altered mechanical response of the protein or only a part of the protein being stretched in the OT experiments. To mitigate these potential complications, we mutated both Cys residues to Ala and engineered Cys-NuG2-cytc (C14, 17A)-NuG2-Cys for OT experiments, so that apo-cytc would only be stretched from its N- and C-termini. Since cytc (C14, 17A) is flanked by two NuG2 fingerprint domains, force–distance curves that contain the unfolding/folding signatures of two NuG2 domains must contain the mechanical response of apo-cytc (C14, 17A), thus allowing for the unambiguous identification of the mechanical features of apo-cytc (C14, 17A).
Previous SMFS studies showed that stretching random coils resulted in a monotonic increase of force with extension, and no “unfolding”- or “folding”-like events were present in the force–extension curves.53–55 These include true random coil proteins, such as the apo-form RTX domain of CyaA,26 and unfolded globular proteins that are not able to refold, such as protein MJ0336 with a bound BiP domain.56 In our OT experiments, ∼77% (325 out of 423 events measured from 11 different molecules) of the stretching–relaxing cycles of single Cys-NuG2-apo-cytc-NuG2-Cys molecules showed only the unfolding and folding events of the two NuG2 domains (Fig. 6A, cycle 1), suggesting that apo-cytc did behave like a random coil in these stretching–relaxing cycles, with no detectable intrachain interactions along the polypeptide chain. However, intriguingly, in 23% (98 out of 423) of the trajectories, some “unfolding”-like rupturing events were observed at forces below ∼10 pN (Fig. 6A, cycles 2–5), in addition to the unfolding and folding events of the two NuG2 domains. This observation suggests that some apo-cytc molecules displayed intrachain interactions that give rise to the observed “rupturing” events. Moreover, these low-force rupturing events were observed in all 11 apo-cytc molecules that we measured, indicating that such deviations from the behaviors of random coils are a general feature among all apo-cytc molecules. Looking closely at these rupturing events, we found that they occurred with different ΔLcs and different rupturing behaviors, including one-step and multiple-step rupturing. Plotting the force against the ΔLc of each rupturing event (Fig. 6B) showed no clear clusters of data points, which would have indicated the rupturing of a specific interaction. 
This result indicates that, although there are some weak intrachain interactions, the amino acid residues in apo-cytc do not have a well-defined interaction mode, consistent with the previous observations from ensemble experiments that apo-cytc lacks a folded structure.21 In addition, the fact that the observed ΔLc is always smaller than the contour length of cytc indicates that these interactions are intrinsic to the apo-cytc polypeptide chain itself and are not due to any interaction between apo-cytc and the dsDNA handles (Fig. S3, ESI†). Nevertheless, the formation of such interactions did not result in any “folding”-like events in the relaxation curves, implying that the weak intrachain interactions can form only when the polypeptide chain is relaxed close to zero force.
Although the use of fingerprint domains is a widely used method in SMFS studies on proteins, including intrinsically disordered proteins (IDPs),57–60 fusing NuG2 to apo-cytc raised the question of whether the weak intrachain interactions observed in apo-cytc could be due to interactions between NuG2 and apo-cytc. To address this issue, we engineered dsDNA-apo-cytc-dsDNA for OT measurements. In the force–distance curves of dsDNA-apo-cytc-dsDNA (Fig. S2 in the ESI†), we observed similar unfolding-like events occurring at low forces, suggesting that there exist weak, nonlocal intrachain interactions in apo-cytc. This finding is similar to that obtained with dsDNA-NuG2-apo-cytc-NuG2-dsDNA, corroborating our conclusion that the weak intrachain interactions indeed originate from the polypeptide chain of apo-cytc itself.
The mechanism of the chemical folding of holo-cytc has been studied extensively by using hydrogen exchange techniques.17–19 It was found that holo-cytc is composed of five cooperative folding units (termed foldons), and the folding and unfolding processes go through the same foldon-dependent native-like intermediates in a reversible fashion. In the unfolding process, the substructures in the middle of the protein's sequence unfold first, followed by the unfolding of the foldon containing the N- and the C-terminal helices; and in the folding process, the N- and the C-terminal helices bind first on the millisecond time scale, and the other structures form subsequently in ∼2 s.17 It was concluded that the folded substructure formed by the N- and the C-terminal helices is more thermodynamically stable and has faster folding kinetics than the other part of holo-cytc, and the folding of the substructures in the middle of the protein's sequence may rely on the correct folding of the N- and the C-terminal helices.
However, in the mechanical folding–unfolding experiments, the unfolding–folding mechanism of cytc may well differ from that in chemical ones due to the directional nature of the mechanical unfolding–folding experiments. In the OT experiments, the folded substructure formed by the N- and the C-terminal helices of cytc is directly subject to the stretching force and acts as the force-bearing structural unit. Upon stretching, detachment of the N- and the C-terminal helices and/or unraveling of these two helices is the first step of the mechanical unfolding of cytc. In the mechanical two-state pathway, it seems that this first step directly led to the complete unfolding of the whole protein without populating any “foldon” structure. In the mechanical folding pathway, we did not observe the formation of any foldon structures, other than the ultimate folding of cytc, which is indicated by the formation of the N- and C-terminal helices. These results suggested that if these foldons existed in the mechanical unfolding/folding pathway, their existence depended on the formation of the foldon formed by the N- and C-terminal helices. From a mechanical perspective, our results again highlighted the critical importance of the N- and C-terminal helices in the folding of cytc. Moreover, this insight also raises the intriguing question of whether the observed “misfolded” cytc (with a much weaker mechanical stability) corresponds to a conformation in which the N- and C-terminal helices are fully formed while the internal helices (or foldons) are yet to form. Future protein engineering work will be needed to test this possibility.
Another novel insight from our OT results is the observation of the short-lived unfolding intermediate state, in which the N- and C-terminal helices are unraveled. In previous chemical (un)folding studies of cytc, the formation and packing of the N- and C-terminal helices is the first step in the folding process, while their unfolding is the last step in the unfolding process.12,17,18 In other words, the folding of the N- and C-terminal helices is the prerequisite for the folding of other structural elements. In contrast to this view, our results clearly showed that other structural elements can exist in the absence of the N- and C-terminal helices, revealing new structural information on cytc in the absence of chemical denaturants.
Intrinsically disordered proteins behave as random coils, and chemically denatured proteins are often assumed to be random coils too.61–63 Although a variety of techniques have been used to characterize random coils,62,64 it remains difficult to detect and characterize residual structures in unfolded polypeptide chains, due to the insensitivity of these techniques to the residual and/or transient structures in polypeptide chains. For example, the hydrodynamic radius, which can be measured by different techniques, such as fluorescence correlation spectroscopy and dynamic light scattering, is often used to characterize the random coil behaviors of unfolded proteins. However, the hydrodynamic radius is insensitive to the native contacts/transient structures in unfolded polypeptide chains.62,65 In fact, a random coil-like radius can be generated in a protein by randomizing only 8% of its native contacts.66 These prior studies clearly showed that observation of a random coil-like radius in an unfolded polypeptide chain does not necessarily rule out the possibility of transient structural formation. As a result, residual folded structures may not be detected in some spectroscopic measurements, leading to the mischaracterization of random coils.67,68
Nuclear magnetic resonance (NMR) spectroscopy is a sensitive technique that allows for the detection of residual structures in unfolded proteins.64,69 For example, NMR studies revealed hydrophobic clusters in the urea-denatured 434 repressor, which cause intense medium-range interactions as indicated by the nuclear Overhauser effect; hydrophobic interactions between aromatic sidechains that maintain a β-hairpin secondary structure in a 16-residue peptide from protein G; and interactions between charged sidechains that stabilize helices in the S-peptide from ribonuclease A.70–72 Although these proteins exhibit random coil-like properties in several spectroscopic measurements, they are not true random coils, as they contain residual structures and/or intrachain interactions as revealed by NMR.
In the case of apo-cytc, its conformation has been considered a random coil since the early 1970s based on results from multiple spectroscopic methods, including CD spectroscopy, intrinsic viscosity, sedimentation coefficients, and the UV absorption, reactivity and ionization of certain amino acid residues.20,21 However, these methods do not characterize the fine details of the interactions between residues. In contrast to these prior studies, our single-molecule OT measurements have yielded evidence that apo-cytc is not a true random coil. Instead, there exist intrachain interactions in apo-cytc, leading to the formation of mechanically labile but detectable structures. It is likely that, due to hydrophobic interactions, apo-cytc folds into an ensemble of collapsed conformations with weak non-local interactions in the polypeptide chain. This finding is similar to the results of some recent SMFS studies on several IDPs, including α-synuclein and the neuronal RNA binding protein Orb2,73–75 which showed that these IDPs are not true random coils and that weak, nonlocal interactions exist in these polypeptide chains. This common feature among these supposedly random coil proteins revealed new features of IDPs and highlighted the unique suitability of the SMFS technique as an effective tool to evaluate the conformations of these proteins from a mechanical perspective at the single-molecule level, placing SMFS and NMR among the few available techniques that can detect residual/transient structures in unfolded polypeptide chains.
Footnote
† Electronic supplementary information (ESI) available. See https://doi.org/10.1039/d2sc01126c
This journal is © The Royal Society of Chemistry 2022