David
De Sancho
and
Robert B.
Best
*
Cambridge University, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK. E-mail: rbb24@cam.ac.uk; Tel: +44-1223-336470
First published on 2nd September 2011
Intrinsically disordered proteins that acquire their three dimensional structures only upon binding to their targets are very important in cellular signal regulation. While experimental studies have been made on the structures of both bound (structured) and unbound (disordered) states, less is known about the actual folding–binding transition. Coarse grained simulations using native-centric (i.e.Gō) potentials have been particularly useful in addressing this problem, given the large search space for IDP binding, but have well-known deficiencies in reproducing the unfolded state structure and dynamics. Here, we investigate the interaction of HIF1α with CBP using a hierarchy of coarse-grained models, in each case matching the binding affinity at 300 K to the experimental value. Starting from a pure Gō-like model based on the native structure of the complex we go on to consider a more realistic model of helix propensity in the HIF1α, and finally the effect of non-native interactions between binding partners. We find structural disorder (i.e. “fuzziness”) in the bound state of HIF1α in all models which is supported by the results of atomistic simulations. Correcting the over-stabilized helices in the unbound state gives rise to a more cooperative folding–binding transition (destabilizing partially bound intermediates). Adding non-native contacts lowers the free energy barrier for binding to an almost barrierless scenario, leading to higher binding/unbinding rates relative to the other models, in better agreement with the near diffusion-limited binding rates measured experimentally. Transition state structures for the three models are highly disordered, supporting a fly-casting mechanism for binding.
Molecular simulations provide information complementary to experiment, namely an atomically resolved description of the binding process at very high time resolution. However state-of-the-art molecular dynamics simulations either in an implicit or explicit solvent, that have been attempted in a few cases for IDPs,21–26 are generally too expensive computationally for system sizes of interest (typically two proteins comprising a total of more than 100 amino acids). This is especially true if, as is desirable, we want to observe a large number of binding events in order to obtain meaningful statistics. As an alternative, theorists have turned to coarse-grained simulation models based on the native conformation of the complex, that have been very successful in the last two decades for the study of the related process of protein folding.27–33
Simple topology based (i.e.Gō) models embody some of the predictions of energy landscape theory,34,35i.e. that the folding energy landscape is funneled and the free energy minimum corresponds to the native conformation. For binding, the topology of protein–protein complexes has also been shown to be the major determinant of the mechanism, at least for dimers of single domain globular proteins.36 Therefore the application of this type of model to IDP binding seems justified. Nonetheless, structure-based models usually rely only on the native contact map and they may lack some details that are expected to be important in IDPs, especially if we consider the importance of the unfolded state description.
Here we investigate the binding of the C-TAD domain of HIF1α to the TAZ1 domain of CBP, resulting in a high affinity complex (with KD reported to be between 7 nM for the dehydroxylated HIF1α and 143 nM for HIF1α-OH).13,37 This system is especially important for its involvement in the response to hypoxia during tumour growth.38HIF1α is actually expressed as a long 826 residue protein with several functional domains (see Fig. 1a), which are tightly regulated by O2 concentration. Under normoxic conditions HIF1α is hydroxylated and degraded by the proteasome.39 However, when the oxygen pressure decreases HIF1α is not hydroxylated and accumulates in the nucleus, where it binds to the TAZ1 domain of CBP through the C-TAD domain, triggering the hypoxic response. Although the binding of HIF1α to CBP is clearly of great interest for therapeutic applications,40 it has been studied using biophysical techniques only recently.13,37
![]() | ||
| Fig. 1 (a) Domain structure of the full sequence HIF1α (Uniprot: Q16665) and disorder score for the full sequence (green) and the isolated C-TAD domain (red) calculated from PONDR-FIT.41 The cyan dashed line marks the threshold in the disorder score (0.5) over which residues are predicted to be disordered. (b) Cartoon representation of the experimental NMR structure of the complex formed by the TAZ-1 domain of CBP in blue and the CAD domain of HIF1α in different shades of red (PDB id: 1L8C). (b) Contact map with the inter-Cα distances for the HIF1α–CBP complex. The blue and red segments mark the limits of CBP and helices αA, αB and αC of HIF1α. | ||
In the context of the natural sequence the C-TAD domain of HIF1α is predicted to be disordered in its C-terminus by a state-of-the-art disorder meta-predictor41 (see Fig. 1a), while in isolation the C-TAD domain is predicted to become more disordered. C-TAD is also predicted to have a very small population of α-helix by Agadir42 (only 1.76%), with the region corresponding to helix αC of the CBP-bound state being the most helical (11%). In fact the C-TAD domain of HIF1α has been shown to exhibit random-coil properties in CD and NMR experiments.37 Relaxation dispersion experiments have shown that HIF1α binds to CBP in a coupled folding and binding reaction with on-rates that are as fast as ≃109 M−1 s−1.13 The structure of the bound complex is also intriguing, consisting of the IDP wrapping completely around the TAZ1 domain, and forming three helices (αA, αB and αC; see Fig. 1b and c). This distinguishes it from most of the previous IDPs studied using molecular simulation, which generally form a more localized binding interface involving only one or two helices. Examples include pKID-KIX,28–32 p53/S100Bβ32 and recently the protein inhibitor IA3.16,33 The extensive binding interface raises obvious questions about possible alternative binding mechanisms and the role of “fly-casting”43 (i.e. rate enhancement due to an increased capture radius) in locating the folded complex.
The paper is organized as follows. We first describe a series of models, starting with a well-established “flavoured” Gō model44 (i.e. one that includes sequence effects), and progressively including more detail: (i) we correct the torsional description so that it resembles that obtained from atomistic simulations from an accurate force field,45 in order to better capture the helical population in the unbound state; (ii) we consider the effects of non-native interactions and electrostatic terms, combining the Gō model with a transferable coarse grained model for protein–protein binding.46 In all cases, we balance the energies to match the KD in the experimental range at 300 K. By comparing the different models under conditions where the KD is similar, we remove the effects of differing stability in the comparison of energy landscapes, free-energy barriers and binding rates. From this we are able to assess the effect of more realistic descriptions of molecular interactions in the binding scenarios as produced from coarse-grained simulation models.
![]() | (1) |
![]() | (2) |
In this version of the model, all (originally repulsive) non-native interactions are replaced by a transferable (sequence-based) potential which can be either attractive or repulsive, depending on the value of the interaction strength εij in the Miyazawa–Jernigan-derived contact potential between residues i and j.46,48 These contact energies accurately reflect the strength of inter-residue interactions due to the different hydrophobicity of amino-acid pairs.48 This description is therefore suitable for IDPs, that typically have less hydrophobic sequences than globular proteins.49 For attractive pairs (εij < 0) the energy is calculated using a standard Lennard-Jones potential
![]() | (3) |
![]() | (4) |
Since we are mainly interested in the dynamics of HIF1α upon binding, we restrain CBP so that it remains folded at all temperatures. We do this by imposing a biasing potential on the fraction of intramolecular contacts (QCBP ≃ 0.9, see below). This bias is necessary to mimic the stabilizing effect of the bound Zn+2 ions on CBP, since the ions are not explicitly present in our model.
![]() | (5) |
To estimate the degree of residual structure in the unbound state we calculate the fraction helix, fhelix, using a Lifson–Roig type of definition.55 Since in the coarse-grained model there is no access to the Ramachandran angles, we base our definition on the dihedral angle θi between beads (i, i + 4). A dihedral is considered to be helical if the value of θi is between −35° and 145°, corresponding to the helical well. A helical segment is then the one that has at least three contiguous dihedrals which are helical.
We calculate the dissociation constant from the fraction of bound (pb) and free (pu) states of the molecule, and the total protein concentration ([Protein], including bound and unbound) as:
![]() | (6) |
and pb = (1 − pu), with Q* = 0.1 being the value of a dividing line in the PMF between the bound and unbound states.
To calculate rates we use time correlation functions for the minimum distance between the two proteins (dmin),
![]() | (7) |
To analyze reaction coordinates for binding and identify transition states, we use a Bayesian criterion.56,57 Transition paths are defined for the projection on an order parameter r as those fragments of the trajectory connecting the unbound state (with r = rU) and the bound state (r = rN). From the observed transitions we obtain the conditional probability density of r for transition paths, p(r|TP), and using Bayes' theorem we obtain the probability of being on a transition path given the value of the order parameter r.
![]() | (8) |
![]() | ||
| Fig. 2 Description of the binding process from the structure-based models. Time series for the fraction of native contacts, QHIF1α–CBP, at 310 K (top) and corresponding potentials of mean force (bottom). (a, b) Gō model; (c, d) Gō-fixα model; and (e, f) Gō-nonnative model. The dashed lines mark the positions in QHIF1α–CBP of the intermediate and bound free energy wells. | ||
We first show the projection of the simulation data on the fraction of native intermolecular contacts (QHIF1α–CBP; see Fig. 2a, c and e) and the corresponding PMFs (panels b, d and f). At 310 K we see that in all three models the system hops between three different states: the unbound state (U) at QHIF1α–CBP ≃ 0, a broad fully bound state at high values of QHIF1α–CBP (B) and an intermediate (I). To calculate KD we lump together the intermediate and fully bound states into a single bound state B–I, separated from the unbound by Q* = 0.1 and obtain KD using eqn (6) (see Fig. 3). We note that the resulting value is not very sensitive to small changes in Q*. The same type of calculation using the projection on the distance between protein centers of mass yields very similar results (Fig. 3, dashed lines), confirming the robustness of our estimate of KD for the simulation models.
![]() | ||
| Fig. 3 Dissociation constant (KD) as a function of temperature for the three coarse grained models: Gō (blue), Gō-fixα (green) and Gō-nonnative (red). Note that the experimental KD at room temperature is in the range of 10−9–10−7 M. Straight lines and circles correspond to the calculation by integrating over the PMFs on the fraction of native intermolecular contacts QHIF1α–CBP. Dashed lines are the values of KD obtained using the projection on the distance between centers of mass of HIF1α and CBP. | ||
Remarkably, given its simplified nature, the Gō model is able to produce a KD in the experimental range (1.03 ± 0.24 × 10−7 M) without any adjustment. The Gō-fixα and Gō-nonnative models however require some fine-tuning of the potential. This may be justified since the stability of the Gō model is set by balancing the total native contact energy against the loss of entropy on folding, which for an N-residue protein is approximated by Nωres, where ωres is a per-residue folding entropy.44 The value of ωres has been calibrated with the original torsion potential, and a different value may need to be used in conjunction with the Gō-fixα model, for example. We use uniform scaling of the interaction strengths as an efficient and unbiased way to calibrate the potential.32 We introduce these changes using the Gō model as a reference for the parametrization, because we would ultimately like to compare all models at similar stability.
For the Gō-fixα model to reproduce the experimental KD we uniformly increase the strength of all native contacts by 6% (see Fig. 3). The Gō-fixα model produces a very similar overall picture of binding with three distinct states (Fig. 2c and d), although the system spends less time in intermediate values of QHIF1α–CBP. In the Gō-nonnative model the additional terms from the Kim–Hummer model for non-native interactions would result in too high an affinity for the complex. The Kim–Hummer model was carefully parameterized to reproduce low affinity protein–protein interactions, using experimental second virial coefficients and protein–protein dissociation constants in the micromolar range.46 Therefore, we instead tune the strength of the native contacts, which produce binding specificity. We recover the experimental KD reducing the strength of all intermolecular native interactions by 10% with reference to the Gō model (Fig. 3). Interestingly, in the resulting Gō-nonnative model the transitions seem much more frequent than in the Gō and Gō-fixα versions (Fig. 2e), as we discuss further below. Furthermore, the bound state appears to be no longer split into two metastable states but now consists in a very broad free energy basin (Fig. 2f). This suggests that the inclusion of non-native interactions causes a substantial shift of binding mechanism, as we investigate in detail below.
Further insight into the contributions of the different helices to the stability of the bound state has been obtained from experiments on different shorter constructs of HIF1α. These showed that the fragment comprising helices αB and αC (790–826) was able to bind with a KD only 5 fold larger than that of the full sequence (776–826).37 We have run simulations on this fragment and we observe the same change in the affinity (KD increases from 1.03 × 10−7 M to 6.74 × 10−7 M), confirming that αA contributes little to the total affinity. However, a shorter fragment (808–826) which had a much decreased affinity in the experiments retains approximately the same affinity as the longer fragment in our simulations (KD = 6.39 × 10−7 M). This suggests that the Gō model over-stabilizes helix αC and therefore the intermediate state with only αC bound.
An interesting prediction of the Gō model is a very heterogeneous bound state, as revealed by the broad free energy basin in the projection on QHIF1α–CBP (see Fig. 5d). The protein shifts rapidly between highly ordered native-like conformations (with QHIF1α–CBP ≃ 0.8) and less ordered conformations (with QHIF1α–CBP as low as 0.4), where helix αA is not perfectly packed against CBP. Other regions that transiently dissociate from the complex are helix αB and the loop between αB and αC (QHIF1α–CBP ≃ 0.7). This seems consistent with the observation of a poorly defined N-terminal region in the NMR structure37 and the heterogeneity in the NMR models between helices αB and αC. However the NMR models are clearly much less heterogeneous than the bound state in the coarse grained simulations as seen when we calculate the RMSD from the initial structure (see Fig. 5e).
In order to test the prediction of a heterogeneous bound state we have run a set of atomistic MD simulations of the HIF1α–CBP complex with a force field45 optimized to match experimental data on short peptides when combined with an accurate model for water (TIP4P/2005)51 (see Methods). We examine the dynamics of the bound state in 20 ns runs at 300 K and 350 K from the RMSD of the energy minimized structure (see Fig. 5e). We use the run at higher temperature to enhance sampling of the bound conformation and provide an upper bound to the expected fluctuations at a lower temperature. We compare the results of the atomistic MD runs with the conformations sampled at 280 K, where only the B basin is populated.
We find that the structures from the atomistic MD simulations both at 300 K and 350 K are considerably more heterogeneous than the experimental NMR models, regardless of the short time-scales explored in these simulations (see Fig. 5e and f). In fact, the fluctuations observed in the atomistic simulations at 300 K are remarkably close to those obtained with the Gō model under stabilizing conditions, with the exception of residues 20–35 where larger fluctuations are observed for the Gō model. However, these larger fluctuations are nonetheless within the range sampled by the atomistic models at 350 K, which we use to estimate an upper bound for likely fluctuations in the bound state (since atomistic simulations cannot be run for the length of time which may be needed to sample larger bound state fluctuations). Taken together, these results suggest that HIF1α forms a fuzzy complex with CBP, i.e. one that retains a significant degree of disorder even in the bound state.58
![]() | ||
| Fig. 4 Two dimensional PMFs for the fraction of native intermolecular contacts QHIF1α–CBPvs. the intramolecular contacts in the IDP, QHIF1 (top), and the distance between centers of mass, dCM (bottom). The insets show the effective pair potential between the protein centres of mass in the unbound state (defined as QHIF1α–CBP < 0.1). (a, b) Gō model; (c, d) Gō-fixα; (e, f) Gō-nonnative. All energies are in kcal mol−1. | ||
Previous studies have noted that Gō-like models result in a large amount of preformed helical structure32 in unfolded or unbound proteins. In our model, this is a consequence of the statistical nature of the pseudo-dihedral potential, built from an analysis of a subset of structures of the (predominantly helical) PDB.44 Ganguly and Chen have corrected this bias by introducing weaker helical contacts in the IDP.32 Here we adopt an alternative approach in which we rebalance the backbone potential to obtain a more realistic helix–coil equilibrium. We use as a guide the relative weights of helical and extended states in atomistic molecular dynamics (MD) simulations59 carried out with an energy function specifically optimized to reproduce the helix–coil transition.45
In Fig. 6(d) we compare the distribution in the pseudo-dihedral angle in the isolated HIF1α in the Gō model simulation with that of the Ala5peptide obtained from atomistic MD. While in the atomistic model the minimum corresponding to the α-helical population (ϕ ≃ 45) has a higher free energy than the β/coil well (ϕ ≃ −120), the statistical potential in the Gō model produces the opposite effect. In the Gō-fixα model we introduce a uniform correction for all dihedrals (see Methods). We see that with this subtle change we approximately recover the correct balance between the α and β wells obtained from atomistic simulations (see Fig. 6d). In turn the amount of preformed structure in the unbound state is considerably reduced, with 〈fhelix〉 decreasing to 0.08 (see Fig. 6b). The distribution of the radius of gyration is also shifted to slightly higher distances (see Fig. 6a).
![]() | ||
| Fig. 5 Bound state description from the Gō model, atomistic simulations and experiments. (a) Time series for the fraction of intermolecular contacts QHIF1α–CBP at 300 K. The red dashed lines mark the mean values for the intermediate (I) and bound (B) states. We show snapshots for the reference structure for the native complex (b), the intermediate (c) and the heterogeneous bound state found in the simulations (d). (e) Heterogeneity of the bound state in the experimental NMR models, atomistic simulations and coarse grained simulations. The reference for the Cα-RMSD calculations is the first experimental NMR model but for the atomistic simulations, where we use the energy minimized structure after 1 ns dynamics (see text). The positions of the different helices in the sequence are shown schematically in red. (f) Representative snapshots of the heterogeneous bound state in the atomistic simulations at 300 K. | ||
![]() | ||
| Fig. 6 Effects of the dihedral potential correction in the unfolded state description. Probability distributions of the radius of gyration Rg (a) and the fraction helix fhelix (b) in the Gō and Gō-fixα. (c) Representative snapshots from the simulations of isolated HIF1α. (d) Comparison of the PMFs of the pseudo-dihedral angle distribution for HIF1α from coarse grained simulations with the Gō and Gō-fixα models and that of Ala5 from atomistic simulations. In all panels blue and green correspond to the Gō and Gō-fixα models, respectively. | ||
![]() | ||
| Fig. 7 Effects of the dihedral correction in the binding mechanism. (a) Projection of the simulation at 310 K on the fraction of native contacts for the whole HIF1α, QHIF1α–CBP (top), for helix αA (center) and for helix αC (bottom). The projection for helix αB (center, gray lines) is shown with that of αA since they occur simultaneously. (b) Snapshots from the simulation trajectory of the two intermediate states IA and IC. | ||
These new features of the Gō-fixα models are in fact all interconnected. The change of mechanism, with flux going through two different channels, is a by-product of the rebalancing of the α and β/coil wells. The correction in the dihedrals destabilizes the more helical IC intermediate, that in the Gō model was very stable due to the larger helicity of helix αC (8 helical residues compared to only 4 for αA, according to DSSP assignment).60 In the Gō-fixα model both intermediates can form because of the increased requirement for intermolecular contacts to stabilize secondary structure elements. Native inter-protein contacts are very similar in number in the interface of CBP with helix αA (25 contacts) and αC (20). This also explains that αB with high helicity in the bound state (7 helical residues) but few contacts (only 14) is not observed to bind independently of the other helices.
Therefore we find that small modulations in the structure of the unbound state can have a strong effect on binding mechanism, emphasizing the importance of carefully constructing coarse-grained simulation models for the study of IDPs. We note however that the changes in the torsional potential also decrease the helicity in the bound state, since contacts can form without the need to populate helical torsion angles. A desirable future development of structure based models would be a potential for hydrogen bonded interactions that allows us to better recover the geometry of α-helices without artificially biasing the torsional potential.61
In the simulations with the resulting Gō-nonnative model we find that the additional non-native terms have important consequences for the folding/binding energy landscapes (see Fig. 2e, f and 4e, f). First, the bound ensemble can no longer be easily divided into different microstates (i.e. IA, IC, B) as for the Gō and Gō-fixα models. Now we find a very broad free energy basin that spans the whole range of bound values for QHIF1α–CBP. Second, the binding transition appears less cooperative, with the proteins spending much more time at intermediate values of QHIF1α–CBP (see Fig. 2e). This translates into a marginal apparent free energy barrier between unbound and bound states for the Gō-nonnative model in this projection (Fig. 2f). At midpoint conditions the barriers decrease from the ∼5 kcal mol−1 (i.e. ∼7 kBT) for the Gō model to less than ∼2 kcal mol−1 (i.e. ∼3 kBT), reminiscent of a “downhill” scenario proposed for protein folding.35,62–65
The two-dimensional projection on dCM and QHIF1α–CBP (Fig. 4) also suggests very low free energy barriers between the different minima, and significant differences from the other two models explored. Unlike the situation for Gō and Gō-fixα models, in which the decrease in dCM is strongly correlated with formation of intermolecular contacts (QHIF1α–CBP), we observe that binding can now occur from non-specific bound conformations (i.e. those with low dCM and low QHIF1α–CBP in Fig. 4f). We illustrate the formation of these encounter complexes in more detail by zooming in a binding transition path between unbound and fully bound states (Fig. 8). From the unbound state (high dCM, t ≃ 30–40 ns) HIF1α binds by forming a contact between the C-terminal region (i.e. that corresponding to helix αC) and a pocket in CBP that in the native structure is bound to helix αA (t ≃ 50 ns). Importantly, non-native contacts start to form first in this transition path, and only later do native interactions take over. Specifically, contacts between the αA region are established (t ≃ 50 ns). The inclusion of the non-native contacts results in a much longer transition path length (more than 200 ns in this instance) due to the frustration induced by the non-native contacts. At t = 150 ns helix αA is tightly bound to CBP, with the binding being completed by the formation of native contacts with αB and αC (the corresponding increase in the fraction of non-native contacts is due to interactions in the vicinity of native pairs).
![]() | ||
| Fig. 8 Transition path from the Gō-nonnative model at 320 K. We show the projection of the simulation data on the fraction of native contacts for the whole HIF1α, QHIF1α–CBP and each individual helix QαA, QαB and QαC (top); the fraction of non-native contacts QNon-native (center); and the distance between centers of mass of the two proteins, dCM (bottom). For selected time points we show snapshots from the simulation. | ||
In order to avoid arbitrary definitions of binding and unbinding events, we study the binding kinetics of the different models using time correlation functions. We calculate the auto-correlation function for the minimum distance between HIF1α and CBP, dmin, as defined in eqn (7), near the binding midpoint temperature. The resulting decays (see Fig. 9) have a fast initial decay that cannot be completely resolved, which occurs at a similar time-scale (∼0.1 ns) for all three models and a slower decay which occurs on different time scales for each model. We fit the correlation functions to double exponentials with the parameters being the correlation time of the fast phase (τfast) that we fit globally and the amplitudes and correlation times for the slow phase (Aslow and τslow) that are independent in the fit (i.e. a total of 7 fitting parameters, see Table 1). The fitted correlation time for the fast phase of 0.2 ns can be attributed to the fast fluctuations in the unbound state (see Fig. 9, inset). The slow phase, corresponding to the binding process, is slowest for the Gō model and fastest for the Gō-nonnative model. We note that although the data can be reasonably well fit by the double exponential, a higher quality fit to the relaxation in the Gō-nonnative model can be obtained using the sum of a single and an stretched exponential function of the type C(t) = (1 − Aslow)exp[−t/τfast] + Aslowexp[−(t/τslow)β], with β = 0.52. This fit is similar to the “strange kinetics” observed for barrierless protein folding63 (see Fig. 9, red dashed line).
![]() | ||
| Fig. 9 Time correlation functions for the minimum distance between proteins (dmin) for the Gō (blue), Gō-fixα (green) and the Gō-nonnative models (red) under isostability conditions. The lines correspond to double exponential fits to the data. The dashed red line corresponds to a stretched exponential fit. In the inset we show time series for the dmin for the three models. | ||
| τ fast | A slow | τ slow | k off × 104 | k on × 107 | |
|---|---|---|---|---|---|
| Gō | 0.2 | 0.46 | 78.9 | 0.548 | 0.622 |
| Gō-fixα | 0.2 | 0.66 | 38.5 | 1.29 | 6.38 |
| Gō-nonnative | 0.2 | 0.42 | 3.2 | 14.4 | 87.3 |
From the relaxation times fitted to the double exponential expression we can calculate on and off-rates for the binding process (kon and koff), using the values for the KD and protein concentration ([Protein]) and assuming that the observed exponential decay has contributions from both rates (i.e. 1/τslow = koff + kon[Protein]). The resulting values for kon and koff (see Table 1) can be compared with experimental time-scales adjusting for the fact that the dynamics in the simulation are faster due to the low friction (0.2 ps−1; while typical values are 50–100 ps−1).66,67 We therefore scale the rates by 1/100, corresponding to the relative folding rates of proteins at frictions of 0.2 ps−1 and 50 ps−1, obtaining on-rates kon of ∼106 M−1 s−1 for the Gō model and ∼108 M−1 s1 for the Gō-nonnative model. This very fast rate for the association process is in reasonable agreement with experiment (kon = 1.29 × 109 M−1 s−1). Since we have computed binding rates at the binding midpoint temperature, we expect our results to be in better agreement with experiment at lower temperature where the binding rate would be slightly higher as the energy landscape would be more tilted toward the bound state. However our results suggest that the lower barrier given by the Gō-nonnative model is most consistent with the near diffusion-limited experimental binding rates observed for HIF1α–CBP.13
With respect to the mechanism, it would be tempting to consider that the formation of the intermediate with the preformed helix αC indicates that the description of the process corresponds to conformational selection, at least for the Gō model. However, under the hypothesis that this structure is preformed in the unbound state, it would most likely also be present during the initial binding (prior to forming the intermediate). Here we study transition paths between the unbound and intermediate state as these contain all the information on the binding mechanism.68 We use a Bayesian procedure (see Methods)56,57 to identify the most reactive states (i.e. transition states) on the transition paths. First we show that our chosen progress variables are good reaction coordinates. For the Gō and Gō-fixα models (see Fig. 10a and c) we find that QHIF1α–CBP is a reasonably good coordinate, with maximum values of p(TP|Q) ≥ 0.4, near the theoretical maximum of 0.5 for diffusive dynamics. Although the maximum value of p(TP|Q) is relatively low (∼0.3) for QHIF1α–CBP applied to the Gō-nonnative model, we find that we obtain an improved coordinate by a simple linear combination QHIF1α–CBP + λQnonnative, with optimal λ ≈ 1.03 (Fig. 10e). This confirms the important influence of non-native interactions in this model, even in the transition state for binding. This is an interesting contrast with protein folding, where similar non-native interactions did not appreciably affect the transition state structure or quality of the native contact reaction coordinate.69
![]() | ||
| Fig. 10 Reaction coordinates and transition state structure for HIF1α–CBP binding. Left: Bayesian analysis of transition paths. We show equilibrium probabilities peq and probabilities of being in a transition path, p(TP|Q). Right: Snapshots corresponding to transition states for binding. (a, b) the Gō (blue), (c, d) Gō-fixα, (e, f) Gō-nonnative. | ||
In all three cases we find that the maximum in p(TP|Q) appears early in the value of the order parameter, indicating that the transition states are probably very unstructured. We can obtain a picture of the transition states by looking at a number of conformations from the equilibrium trajectory that lie in a narrow region around the maximum of p(TP|Q). In all three models we find that these structures correspond to marginally bound states with great heterogeneity. For the Gō and Gō-fixα encounters seem to occur at multiple different positions in the binding groove of CBP. For the Gō non-native model encounter complexes can bind even partially outside of the native binding interface due to non-native steering effects. This may result in the observed enhancement in the rate of binding.
A common feature of all of the models is some “fuzziness” in the bound state, i.e. structural heterogeneity in the complex.58 This is already present in the Gō model, but is extended to include a broader ensemble of non-specific configurations in the most detailed (Gō-nonnative) model considered. We find a significant effect of the chosen model on the binding mechanism, with a more accurate backbone potential destabilizing the single-helix intermediate and resulting in greater heterogeneity of binding. Further adding a realistic sequence-based model of non-native interactions results in an increased capture radius, accelerating binding via a mechanism similar to the “fly-casting” proposed in the context of native interactions,43 in agreement with results obtained with a more generic model of non-native interactions.28 The non-native interactions also lower the apparent barrier to binding, as has also been observed for protein folding in the presence of weak non-native interactions.70,71 We note, however, that this effect was not observed in some previous simulation studies of IDPs considering non-native interactions.31 We hypothesize that these effects of non-native contacts may be important for IDPs to rapidly search for and bind to their cognate binding partners.
While the coarse-grained models used here are clearly a very reduced representation of IDP binding, they have already illustrated several factors which may be important in controlling the mechanism of binding, and binding rates. These hypotheses may be further tested in the future, both by experiment and by means of more detailed simulations.
Footnotes |
| † Published as part of a Molecular BioSystems themed issue on Intrinsically Disordered Proteins: Guest Editor M. Madan Babu. |
| ‡ D.D.S. is supported by a FEBS Long Term Postdoctoral fellowship and R.B.B. by a Royal Society University Research Fellowship and the BBSRC. |
| This journal is © The Royal Society of Chemistry 2012 |