Salla I.
Virtanen‡
,
Anne M.
Kiirikki‡
,
Kornelia M.
Mikula
,
Hideo
Iwaï
and
O. H. Samuli
Ollila
*
Institute of Biotechnology, University of Helsinki, Helsinki, Finland. E-mail: samuli.ollila@helsinki.fi
First published on 4th September 2020
Importance of disordered protein regions is increasingly recognized in biology, but their characterization remains challenging due to the lack of suitable experimental and theoretical methods. NMR experiments can detect multiple timescale dynamics and structural details of disordered protein regions, but their detailed interpretation is often difficult. Here we combine protein backbone 15N spin relaxation data with molecular dynamics (MD) simulations to detect not only heterogeneous dynamics of large partially disordered proteins but also their conformational ensembles. We observed that the rotational dynamics of folded regions in partially disordered proteins is dominated by similar rigid body rotation as in globular proteins, thereby being largely independent of flexible disordered linkers. Disordered regions, on the other hand, exhibit complex rotational motions with multiple timescales below ∼30 ns which are difficult to detect from experimental data alone, but can be captured by MD simulations. Combining MD simulations and backbone 15N spin relaxation data, measured applying segmental isotopic labeling with salt-inducible split intein, we resolved the conformational ensemble and dynamics of partially disordered periplasmic domain of TonB protein from Helicobacter pylori containing 250 residues. To demonstrate the universality of our approach, it was applied also to the partially disordered region of chicken Engrailed 2. Our results pave the way in understanding how TonB transfers energy from inner membrane to the outer membrane receptors in Gram-negative bacteria, as well as the function of other proteins with disordered domains.
We have recently resolved the problem of spectral overlap for partially disordered TonB protein with 101 folded and 149 disordered residues using segmental isotopic labeling with salt-inducible split intein.16 Moreover, we have used molecular dynamics (MD) simulations to interpret the rotational dynamics of asymmetrically shaped folded proteins from NMR spin relaxation times.15 However, the accuracy of force field parameters limits the applications of MD simulations to interpret the NMR data. For folded proteins, the main issue has been the underestimated water viscosity in the commonly used TIP3P water model, which leads to too fast rotational diffusion of rigid proteins.15,17 Nevertheless, this can be taken into account when calculating NMR parameters even for proteins with anisotropic shape.15 For disordered proteins, canonical force fields predict too condensed conformational ensembles and calculated NMR parameters often disagree with experiments.18–22 Attempts to overcome these issues have been made by reweighting the conformational ensembles23 or rescaling the timescales to reproduce the experimentally measured NMR spin relaxation times.22,24 While these approaches can be useful in the interpretation of individual experiments, they cannot be generally applied for proteins with heterogeneous dynamics in complex environments, such as membrane bound disordered linkers. Furthermore, rescaling of timescales is ambiguous for disordered proteins without clearly distinguishable dynamical processes, and ensemble reweighting can be used only when the original ensemble is not too far from the correct solution.
Here, we show that NMR spin relaxation times and MD simulations can be combined to resolve conformational ensembles of proteins with heterogeneous dynamics such as partially disordered proteins. We resolve conformational ensembles of two large (>100 residues) partially disordered proteins: periplasmic domain (residues 36–285) of TonB protein from Helicobacter pylori (HpTonB) and residues 143–259 of chicken Engrailed 2 (EN2). Periplasmic domain of TonB propagates energy created by proton motive force in the inner membranes of Gram-negative bacteria to the TonB dependent receptors (TBDT) in the outer membranes via unknown mechanism.25 Engrailed 2 is a transcription factor in which partially disordered residues 143–259 are highly conserved and crucial, in particular, in the binding of transcriptional regulators.26–28 The resolved conformational ensembles give novel insight in the rotational timescales of disordered protein regions and in the biological function of TonB proteins. Furthermore, our results pave the way in understanding of relevant properties and functions of disordered protein regions in biology and protein engineering.
Fig. 1 (A) Sequence of the disordered proline rich region (residues 31–184) in HpTonB. The trans membrane (TM) domain is anchored in the inner membrane of Gram-negative bacteria and the folded C-terminal domain interacts with TonB dependent receptors in the outer membrane. Aminoacids bearing positive or negative charge in neutral pH are colored with red and blue, respectively. (B) Overlayed snapshots from MD simulations of HpTonB30–285 with different force fields without additional salt. The orientation of folded domain is fitted in each snapshot. Residues within α-helices and β-strands are red and blue, respectively. (C) Spin relaxation times calculated from simulations with different force fields compared with the experimental values without additional salt. Rotational diffusion of the folded regions in Amber ff99SB-ILDN and CHARMM36m simulations is divided by a factor 2.9 in the spin relaxation time calculation to correct underestimated viscosity of TIP3P water model.15 |
To resolve the conformational ensemble of partially disordered periplasmic domain of HpTonB, we initiated MD simulations of TonB30–285 starting from the fully extended linker conformation using three different force fields. In simulations with the Amber ff99SB-ILDN30 and CHARMM36m31 force fields, the linker rapidly collapsed (40 ns and 120 ns, respectively) and the protein formed compact structures with the radius of gyration of 2.29 ± 0.02 nm and 2.32 ± 0.05 nm, respectively (Fig. 1B). The Amber ff03ws force field32 resulted in significantly more extended structure with the radius of gyration of 3.5 ± 0.2 nm. To evaluate the results from different force fields against experimental data, we calculated the backbone 15N spin relaxation times, T1, T2 and NOE relaxation, from simulations taking into account the effect of incorrect water viscosity in TIP3P water model for the folded region (Fig. 1C).15 Amber ff99SB-ILDN and CHARMM36m force fields with the collapsed structures overestimate the T1/T2 ratios for the C-terminal domain, suggesting that the rotation of the folded region is too slow in these simulations. In Amber ff03ws simulations with more extended conformational ensemble, the rotation of C-terminal domain is faster and the spin relaxation times are in better agreement with experiments. Therefore, we conclude that the simulations with Amber ff99SB-ILDN and CHARMM36m force fields predict overcondensated conformations for the disordered proline rich region, and that the whole protein rotates as a single relatively rigid body in these simulations with significantly slower timescales than the C-terminal domain without the disordered region attached. In more realistic Amber ff03ws simulations, the disordered region with more extended conformations has only a marginal effect on the rotational dynamics of C-terminal domain, in line with the interpretation from experimental spin relaxation data.16 Overcondensation of disordered proteins in Amber ff99SB-ILDN and CHARMM36m simulations is reported also previously.19,20
Even though the spin relaxation times from the Amber ff03ws simulation are close to the experimental values in Fig. 1C, they slightly overestimate the T1/T2 ratios for residues in the C-terminal domain suggesting that the rotational dynamics of the folded region is still too slow when compared with experiments. To simulate the protein in conditions closer to experiments, we added 40 mM NaCl to the system, which approximately corresponds to the ionic strength of 20 mM sodium phosphate buffer solution at pH 6 that was used in the experiments. Indeed, the addition of salt led to the further extension of the disordered linker and faster rotational diffusion of the structured C-terminal domain (Table 1). Spin relaxation times from the simulation with 40 mM of NaCl are in good agreement with the experimental values (Fig. 2). Further increase of NaCl concentration to 150 mM did not affect the results neither in experiments nor in simulations (Fig. 2). The radius of gyration from simulations with 150 mM NaCl (5.3 ± 0.7 nm) was close to the experimental value (5.5 ± 0.7 nm) measured with the same salt concentration. In conclusion, both spin relaxation times and radius of gyration from the Amber ff03ws simulation agree with experiments when electrostatic conditions of solvent match with experiments, suggesting that this simulation can be used to interpret the spin relaxation times of partially disordered HpTonB36–285 protein.
D x | D y | D z | D av | τ c | R g (nm) | |
---|---|---|---|---|---|---|
0 mM | 1.3 ± 0.1 | 1.3 ± 0.1 | 1.4 ± 0.2 | 1.3 ± 0.2 | 13 ± 2 | 3.5 ± 0.2 |
40 mM | 1.80 ± 0.01 | 2.1 ± 0.2 | 2.4 ± 0.3 | 2.1 ± 0.2 | 7.9 ± 0.8 | 6.1 ± 0.2 |
150 mM | 1.65 ± 0.03 | 1.7 ± 0.1 | 4.1 ± 0.3 | 2.5 ± 0.2 | 6.6 ± 0.6 | 5.3 ± 0.7 |
Fig. 2 (A) Overlayed snapshots from Amber ff03ws MD simulations of HpTonB30–285 with 40 mM NaCl. The orientation of folded domain is fitted in each snapshot and color codes for secondary structure are similar to Fig. 1B. (B) Spin relaxation times calculated from Amber ff03ws simulations with 40 mM and 150 mM NaCl compared with experiments without additional salt (this work) and with 150 mM NaCl (from ref. 16). |
Extension of the disordered linker region upon the increase of NaCl concentration from 0 to 40 mM can be explained by the screened attractive interactions between oppositely charged residues in the proline rich linker of HpTonB. This is demonstrated in Fig. 3 by showing the directional correlations of vectors between Cα carbons in consecutive backbone residues in the disordered linker region (residues 31–200). Negative correlations between residues 95–120 and 120–140 in simulation without additional salt can be explained by the hairpin-like turning of the linker due to electrostatic attraction between residues 97–113, bearing a total charge of +9 from lysines, and residues 131–137, bearing a total charge of −2 from glutamic acids. The addition of 40 mM NaCl screens this attraction, thereby straightening the chain and eliminating the negative directional correlations. Our results suggest that the presence of ionic strength corresponding to the typical buffer concentrations used in experiments significantly screens the electrostatic interactions between charged residues in disordered proteins, thereby affecting their conformational ensembles. This should be taken into account when performing MD simulations that mimic the experimental conditions. However, further increase of ion concentration had much smaller effect on directional correlation, spin relaxation times, and conformational ensemble.
To test our approach also using other protein than HpTonB, we ran simulations of partially disordered region (residues 143–259) of chicken EN2 for which the experimental spin relaxation times measured with multiple magnetic fields are available in the literature.28 In Amber ff03ws simulation of EN2 without additional salt, the oppositely charged residues between N-terminal and C-terminal ends interact and the protein forms a loop-like conformational ensemble (Fig. 4A). Similarly to the TonB simulation without additional salt, T1 times measured with 800 MHz magnet are slightly overestimated in this simulation (Fig. 4C), indicating that the loop-like structure rotates too slowly. Furthermore, residues 151–166 in the N-terminal end form an artificial α-helix, which leads to a poor agreement with experimental spin relaxation times in this region. Because serine and aspartatic acids have overestimated tendencies to form α-helices in this force field,33 we mutated serine (148, 150, 153, 154, 155, 156) and aspartic acid (152 and 154) residues in the N-terminal end to threonine and glutamine, respectively. This mutated protein was then simulated with 40 mM NaCl to mimic the electrostatic conditions in experiments where 40 mM sodium succinate buffer was used.28 As observed for HpTonB, the attraction between charged residues is screened by the addition of salt and more extended conformational ensemble is observed (Fig. 4B). Together with the unfolding of artificial α-helix in residues 151–166, this leads to the improved agreement with experimental spin relaxation data (Fig. 4C). Also the results from different magnetic field strengths support this conclusion (Fig. S1 in the ESI†).
Fig. 4 Overlayed snapshots from (A) EN2 simulation without additional salt and (B) from mutated EN2 simulation with 40 mM NaCl. (C) Spin relaxation times from simulations compared with experiments measured in 800 MHz.28 Spin relaxation times with different magnetic field strengths are shown in Fig. S1 (ESI†). Error bars of the experimental data are smaller than the point size. |
Our results for HpTonB and EN2 demonstrate that the 15N backbone spin relaxation times are sensitive to the conformational ensembles of partially disordered proteins, and can be therefore used to evaluate MD simulations of such proteins against NMR experiments. Notably, the differences in chemical shifts (calculated using SPARTA+34 or ShiftX35) between simulations with different force fields that predict distinct conformational ensembles for HpTonB (Fig. 1B) were not large enough to evaluate the simulations against experiments (Fig. S2, ESI†). The Amber ff03ws force field captured the essential features of conformational ensembles for both partially disordered proteins studied here when the ionic strength of solvent corresponded to the experimental conditions. However, the known caveats of this force field, such as overestimated tendency of serine and aspartic acid to form α-helices and underestimated stability of folded structures,19,33 may limit its wider usage. In conclusion, our results not only emphasize the necessity of developing better force fields for disordered protein regions, but also demonstrate that the backbone 15N spin relaxation times are good measures to evaluate conformational ensembles predicted by the force fields.
Because our HpTonB30–285 simulations with Amber ff03ws force field parameters give spin relaxation times in good agreement with experiments when the ionic strength of solvents match (Fig. 2), we analyzed the underlying rotational timescales of both folded and disordered regions from these simulations using the regularized inverse Laplace transformation (ILT),43 which we used also to perform the Fourier transform of the rotational correlation functions during the spin relaxation time calculations.15 In this approach, each backbone N–H bond rotational correlation function is written as a sum of large number of exponentials (here N = 371)
(1) |
For folded proteins, the timescales related to the overall rigid body rotation dominate the rotation processes with the weight factors between 0.5–0.9 in almost all residues of HpTonB194–285 (Fig. 5 and Fig. S6, ESI†) and PaTonB-96 from Pseudomonas aeruginosa15 (Fig. S7, ESI†). Intermediate (1–4 ns) and very fast (10 ps or faster) timescales have small contributions with the weight factors between 0.1–0.3 in the Amber ff03ws simulation of folded HpTonB194–285 (Fig. 5). In Amber ff99SB-ILDN simulations of folded HpTonB194–285 and PaTonB-96 proteins, the very fast timescales have similar contribution, but intermediate timescales are observed only for few residues (Fig. S6 and S7, ESI†). Therefore, our MD simulations suggest that the two-state (also known as Lipari–Szabo) model36,37 with independent overall and internal rotation timescales is a reasonable approximation for rotational dynamics of folded proteins, even though a third timescale with small weight is observed in simulations with Amber ff03ws force field. Moreover, the αi coefficients for rigid body motions approximately corresponds the Lipari–Szabo order parameters37 in folded regions.
Fig. 5 Rotational timescales (top) and related weight coefficients (bottom), defined in eqn (1), from Amber ff03ws simulation of (A) a folded HpTonB194–285 and (B) a partially disordered HpTonB36–285 protein. Only timescales with the weight coefficients larger than αi > 0.05 are shown. Black horizontal lines and the shaded region between the extreme values display timescales from rigid body rotational diffusion coefficients. To guide the eye, the timescales above τi ≈ 4 ns (the fastest rigid body timescale observed for folded HpTonB194–285 protein) are labeled as rigid body motions (black), the timescales corresponding to the saving frequency of coordinates in simulations (τi = 10 ps) are labeled as fast (red), and the timescales in between (10 ps < τi < 4 ns) are labeled as intermediate (turquoise). |
Most residues in the folded region (residues 185–285) of partially disordered HpTonB30–285 exhibit similar distribution of rotational timescales as observed in globular proteins: timescales corresponding to rigid body motions dominate, while very fast and intermediate timescales have minor contributions (Fig. 5). Encouraged by this observation, we calculated the rotational diffusion coefficients of folded domains having linkers with different lengths attached. Even though the mean square angle deviations around inertia axes exhibited some non-linear features with increasing linker length (see ESI†), we think that the estimated rotational diffusion coefficients and related timescales can be used to analyze the influence of a disordered linker on rotational dynamics of the folded region. As expected, the rotation of a folded region slows down upon the increasing length of attached linker (Table 2). However, the overall dynamics of folded domain in HpTonB179–285 is only slightly faster than in HpTonB30–285, even though the latter protein has 2.34 times more residues. This is in line with the small changes in experimental spin relaxation times upon increasing the length of linker observed in our previous work.16 These results suggest that the linker is very flexible and that the rotational dynamics of the folded C-terminal domain is almost independent of the attached flexible disordered linker. This conclusion is supported by the similar results for overall timescales estimated from experimental T1/T2 ratios and from MD simulation analysis in Table 2. The analysis using T1/T2 ratio assumes an isotropic rigid body rotational diffusion for protein,45 while MD simulation analysis takes anisotropic effects fully into account.15 Similarity of the results from these two approaches suggests that the assumption of rigid body rotational diffusion is relatively reasonable for folded domains even with long disordered linkers attached.
Disordered region in HpTonB30–285 (residues 30–184) exhibits clearly different distribution of rotational motions than the folded region, having dispersed timescales between approximately 60 ps and 30 ns without any dominating timescales (Fig. 5). Similar differences in distribution of timescales between folded and disordered regions are seen also in HpTonB30–285 with 150 mM NaCl (Fig. S8, ESI†) and EN2 (Fig. S9, ESI†), although intermediate timescales are more pronounced in the folded region of EN2, most likely due to the less rigid structure than in HpTonB simulations. The large number of detected timescales in disordered regions in both proteins suggests that the models assuming two or three timescales for rotational motions are too simple to capture all relevant rotational processes in disordered proteins. This is in line with the recent study where dihedral rotation and rotational tumbling motions of peptide planes were found to have multiple timescales.42 Because large number of parameters are needed to describe such dynamics, the multiple timescale dynamics is very difficult to detect even with the excessive amount of experimental data measured at different conditions.28,41 Khan et al. have analyzed rotational timescales of EN2 by fitting a sum of exponential correlation functions with six fixed timescales to the experimental data detected at different magnetic fields.28 In line with our results (Fig. S9, ESI†), timescales between ∼10 ps and ∼1 ns were not present in the folded region in their analysis. However, their dominant timescale, 5.27 ns, is shorter than ∼10 ns in our results, and they observed significant weight for a very long 21 ns timescale, which is not present in our results. For disordered region of EN2, Khan et al. observed dispersed timescales between 21 ps and 21 ns which is in line with our results. However, the timescale of 1.33 ns dominated the rotation of disordered region in their analysis, while we did not detect any dominant timescale in the disordered region. We believe that these differences may arise because the dynamical model with six fixed timescales that are equal for all residues may be too simple for partially disordered proteins. Our MD simulations suggest that disordered protein regions have more complex distributions of rotational timescales with multiple components that vary between the residues. While this distribution of timescales is very difficult to detect by fitting to even a vast amount of experimental data, MD simulations with realistic force field parameters can capture the relevant rotational processes detected by spin relaxation times for disordered protein regions, thereby giving an interpretation of experimental data with unprecedented details.
Fig. 6 Schematic figure of conformational ensemble sampled by periplasmic TonB in sub-microsecond timescales in cell envelope of Gram-negative bacteria. Distances from the inner membrane (IM) to the outer membrane (OM) and proteoglycan (PG) layer are from ref. 46. |
In the conformational ensemble of HpTonB36–285 in solution with physiological ion concentration (Fig. 2), resolved here using Amber ff03ws simulations and NMR spin relaxation experiments, the proline rich linker is very flexible with the rotational timescales faster than 30 ns (Fig. 5) and the folded C-terminal domain rotates almost independently of the linker region (Table 2). On the other hand, significant fractions of polyproline helices (PPII) with the probability up to approximately 70% were observed in the linker region in our simulations (Fig. S17, ESI†). This is in line with the previous NMR and CD measurements of TonB sequence from E. coli.47,58,59 Further data supporting the relatively rigid polyproline helix conformations of the TonB linker in E. coli comes from DEER EPR experiments, where distances between spin labels at different locations in the linker were measured.47 The EPR experiments give average length of 4.6 ± 0.1 nm for the residues 88–106 in E. coli TonB consisting mainly of PK repeats. The corresponding sequence in HpTonB between residues 94–112 (Fig. S16B, ESI†) has the same average length of 4.6 ± 0.2 nm in the Amber ff03ws simulation with 40 mM NaCl. In conclusion, the conformational ensemble from Amber ff03ws simulations (Fig. 2) is in line with the experimental spin relaxation data, as well as with the interpretation of previous NMR and EPR experiments from E. coli sequence,47,58,59 suggesting that the linker is disordered but has relatively extended conformation with high propensity of polyproline helices.
Because crystal structures suggest that TonB interacts with TBDTs via the β3 strand in the folded C-terminal domain of TonB (residues 272–281 in HpTonB),53,54 the linker domain should reach across the approximately 20 nm wide periplasmic space46 to bring the C-terminal domain into contact with TBDTs in outer membrane (Fig. 6). In our conformational ensemble, resolved from Amber ff03ws simulations in 40 mM NaCl solution (Fig. 2), the maximum distance between the β3 strand atoms and the end of transmembrane domain (residue 30) is 15.2 nm with the average value of 11.9 ± 0.3 nm, which is close to the typical distance between inner membrane and proteoglycan layer in Gram-negative bacteria, but shorter than the width of periplasmic space (Fig. 6).
In summary, our results suggest that the periplasmic part of TonB exhibits rapid conformational fluctuations with the timescales faster than 30 ns and that the C-terminal domain locates on average close to the proteoglycan layer (Fig. 6). These results are in line with the previously proposed models where the C-terminal domain of TonB reaches only to the proteoglycan layer, while the binding site in TBDTs extends into periplasm to interact with the β3 helix in CTD of TonB at the proteoglycan layer.50,60 Alternatively, the proton-motive force could be used to increase the linker length to push the C-terminal domain of TonB to interact with the TBDTs in the outer membrane, and the relaxation back to the equilibrium length would then mechanically open the TBDT. On the other hand, the high flexibility of the linker and uncorrelated rotational motions of linker and the folded C-terminal domain do not support models where rotational torque is mediated across periplasmic space via rigid linker.50 Also scanning of TBDTs directly from the outer membrane48 seems less likely due to the short average length of the linker. However, we cannot exclude the possibility that changing pH conditions, TonB dimerization or its interactions with membrane, proteoglycan layer or other proteins in TonB complex would increase the length or rigidity of the linker. We believe that MD simulations with realistic conformational ensembles of disordered linkers, evaluated using experimental spin relaxation times, paves the way for further studies to understand the function of TonB complex, as well as other biological machineries where disordered protein regions play crucial roles.
MD simulations that reproduced the experimental spin relaxation times revealed that the timescales from overall rigid body rotation dominate in fully folded proteins and folded regions in partially disordered proteins, while very fast (10 ps or faster relaxation times) and intermediate (1–4 ns relaxation times) timescales give small contributions. Furthermore, the rotational dynamics of folded region was almost independent of the attached disordered linker. Therefore, the estimation of overall rotational timescales from experimental T1/T2 ratio using simple model assuming isotropic rotation45 gave reasonable results even for folded region with long disordered linker attached. In contrast, disordered regions exhibit rotational dynamics with multiple dispersed timescales between 60 ps and 30 ns without any dominating contribution, suggesting that the dynamical modes of disordered proteins are very difficult to capture by fitting to even a large amount of experimental data. We suggest that the dynamical modes of disordered protein regions can be captured using MD simulations with accurate force field and solvent ionic strength matching the experimental conditions.
The resolved conformational ensemble of HpTonB36–285 is in line with the previous NMR and EPR studies47,58,59 of TonB linker from E. coli, but gives unprecedented details of periplasmic domain of TonB that helps to understand its function. In our results, the equilibrium length of the TonB linker is not sufficient to reach across the periplasmic space, suggesting that either TBDTs has to extend the periplasmic space,50,60 or the length of linker has to increase, for example, due to changing pH conditions or interactions with other molecules. One possible scenario is that the proton-motive force would be used to induce such increase in the linker length. The energy released by the relaxation of linker to its equilibrium length could be then used to transport nutrients through the TBDT. Our results pave the way for MD simulations that could be combined with experimental approaches to test this, and other hypotheses on the functional mechanism of TonB complex function.
Using MD simulations to interpret the conformational ensembles and dynamics of disordered protein regions from NMR data has two main advantages over the other available methods.11–13,41,61 First, MD simulations that correctly predict the spin relaxation times can be used to interpret the experiments without a priori assumptions about the amount of rotational timescales. Second, MD simulation force fields that give correct conformational ensembles for disordered protein regions can be straightforwardly used to model these regions within complex biomolecules assemblies under various conditions. For example, our results suggest that TonB complex could be simulated by combining the Amber ff03ws force field with Amber compatible lipid force fields62,63 in order to fully understand its function. However, applications of Amber ff03ws force field are limited by its problems to predict the stability of α-helices in some situations.19,33 Therefore, our results also emphasize the necessity of force field development to correctly capture the conformational ensembles of partially disordered proteins in various conditions. Our results suggest that the backbone 15N spin relaxation times are suitable experimental references for these efforts because they are more sensitive to the conformational ensemble than, for example, chemical shifts which are often used in force field optimization or to resolve conformational ensembles of disordered proteins.19,64
The protein significantly collapsed in both Amber ff99SB-ILDN and CHARMM36m simulations during 40 ns and 120 ns, respectively. Due to the reduced size of the protein, the amount of water molecules were reduced to 45950 and 35892 in these simulations, respectively. The simulation were then continued to have total lengths of 384 ns and 520 ns, respectively. The last 280 ns and 400 ns were used in the analysis, respectively. The size of protein in the Amber ff03ws simulation was much less reduced and the amount of water molecules was reduced to 123497 after 5.3 ns simulation. This system was then simulated 1171 ns and the last 1000 ns was used in the analyzes. The temperature was 310 K in Amber ff99SB-ILDN and CHARMM36m simulations and 298 K in the Amber ff03ws simulation. Simulations were ran using Gromacs versions between 5.0 and 2020 series.67 Rest of the simulation details are described in the ESI.† Simulation files and trajectories are available from ref. 68–70.
Sodium and chloride ions described with parameters containing the electronic continuum correction (ECC),71,72 available from https://bitbucket.org/hseara/ions/, were added into the Amber ff03ws simulation to study the influence of solvent ionic strength on conformational ensemble. The 6 chloride counterions were described by the default Amber ff03ws parameters. Because the protein immediately expanded upon addition of 150 mM NaCl and started to interact with its own periodic image after 200 ns simulation, the number of water molecules was increased for simulations containing ions. Simulation containing 150 mM NaCl had 261780 water molecules with 711 Na+ and Cl− ions, and simulation containing 40 mM NaCl had 262822 water molecules with 190 Na+ and Cl− ions. Simulation files and trajectories with NaCl are available from ref. 73 and 74.
In our previous study, we showed that the underestimated viscosity of TIP3P water model can be corrected during spin relaxation time calculation by dividing the rotational diffusion coefficients of folded proteins with a constant factor of 2.9.15 Here, we apply the same approach to the folded C-terminal domain of HpTonB30–285 when calculating the spin relaxation times from Amber ff99SB-ILDN and CHARMM36m simulations, where TIP3P water was used. For this, we first calculate the rotational diffusion coefficients of the folded C-terminal end (residues 196–280). The rotational correlation functions from N–H bonds of residues 175–285 were then divided to the internal and overall motions, and the contribution of overall motion was corrected during spin relaxation time calculation by dividing the diffusion coefficients with a factor 2.9 as described in our previous work.15 The water viscosity correction was not applied to the disordered linker region because it does not rotate as a rigid body and is therefore expected to be less affected by the fluid viscosity.82 Because spin relaxation times from Amber ff03ws force field are in good agreement with experiments for the folded HpTonB194–285 protein without any correction for the water viscosity (Fig. S3 in the ESI†), the spin relaxation times from Amber ff03ws simulations of HpTonB36–285 and EN2 protein were calculated directly from simulation data without any corrections for water viscosity.
Correlation functions up to one hundredth of the simulation time should give a sufficient statistics for the analysis of rotation from a single molecule MD simulation.83 We observed that insufficient statistics in correlation functions induce artificial timescales with the longest possible timescale (50 ns) when fitting the correlation function to eqn (1). Therefore, when calculating spin relaxation times directly from Amber ff03ws simulations without using diffusion coefficients, we selected the time frames to be analyzed from correlation functions such that the number of weights above 0.05 for artificial 50 ns timescale is minimized. The analyzed time frames from correlation functions were the first 15 ns for HpTonB194–285, the first 10 ns for HpTonB30–285 without NaCl, and the first 25 ns for HpTonB30–285 with NaCl and EN2.
Quantitative error analysis of calculated spin relaxation times is complicated because each step in the analysis has different sources of errors. However, independent simulation replicas of folded PaTonB-96 protein in our previous work15 gives an average deviations of ΔT1 = 0.06 s, ΔT2 = 0.02 s, and ΔNOE = 0.04, which can be considered as the estimates of accuracy of the calculated values.
Footnotes |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0cp03473h |
‡ These authors contributed equally to this work. |
This journal is © the Owner Societies 2020 |