Yang
Huang
,
Kwangho
Nam
and
Per-Olof
Westlund
*
Department of Chemistry, Umeå University, Umeå, Sweden. E-mail: perolof.westlund@chem.umu.se
First published on 19th June 2013
The hydration of a protein, Peroxiredoxin 5, is obtained from a molecular dynamics simulation and compared with the picture of hydration which is obtained by analysing the water proton R1 NMRD profiles using a generally accepted relaxation model [K. Venu, V. P. Denisov and B. Halle, J. Am. Chem. Soc., 1997, 119, 3122]. The discrepancy between the hydration pictures derived from the water R1(ω0)-NMRD profiles and MD is relevant in a discussion of the factors behind the stretched NMRD profile, the distribution of orientational order parameters and residence times of buried water used in the NMRD model.
For an introduction to the NMRD experiment and the old interpretational controversy concerning the nature of the protein-associated waters that contribute to the relaxation dispersion we recommend the papers by Halle et al.2–4 They have presented a consistent NMRD model describing 1H, 2H and 17O R1-NMRD profiles of several proteins, which is based on the locations, orientational order parameters, and residence times of protein associated water molecules. The model introduces two types of water molecules, namely, α and β waters. The α waters are characterized by an effective correlation time τc short enough compared with the inverse of the maximum Larmor frequency (1 Tesla) of the NMRD experiment (ωmax0τc ≪ 1) and the β waters are characterized by long τc satisfying the non-extreme narrowing regime (ωmax0τc ≥ 0.1), respectively. Consequently, in the NMRD-model reported by Halle et al.,2–4 the α-waters give rise to a field-independent relaxation contribution, whereas β-waters, which are more interesting, give rise to relaxation dispersion profiles. The effective correlation time of a β-water molecule is described by the residence time τW of the water at the protein interface and the protein reorientational correlation time τR
(1) |
The observed water R1(ω0) dispersion profile may be stretched over a wide frequency range and generally it cannot be described by a single Lorentzian spectral density function. The reasons for the dispersion stretching have been explained in terms of a distribution of water inter-molecular dipole–dipole coupling constants and a distribution of water residence times.5,6 In the present study, we focus on slow tumbling proteins where a distribution of different residence times, which are shorter than or comparable with the protein reorientational correlation time τR, becomes important. The β water molecules are then characterized by different τc values (cf.eqn (1)) and the observed dispersion profile is a sum of many field-dependent dispersion curves. Consequently, the relaxation contribution from β water molecules with τW ≤ τR may explain the common observation of stretching to higher frequencies as well as in the low field region of the NMRD profile. The effect of the quadrupolar dip on the amide proton relaxation rate clearly modifies its NMRD profile. However, based on the explanation of the presence of this dip for immobilized protons,7 the relaxation properties of the amide protons are not transferred to the water protons for slow tumbling proteins and thus do not contribute to the stretching of the water dispersion profile.8
Although the water R1-NMRD profiles can be measured accurately and analyzed using the NMRD-model,2–4 the reliability of the analysis and the corresponding results are difficult to access only from the quality of the fitting of the NMRD model to the experimental NMRD profile. Complementary information can be obtained from crystallographically determined waters. However, it does not provide any information about the time-scales of the dynamics of protein-associated waters, and the resolution of the structure and the model bias in the structure refinement potentially affect the results.
The usefulness of molecular dynamics (MD) simulation for analysing different model assumptions used in spectroscopic studies has previously been demonstrated. Several examples are found, for instance in the study of the pure dephasing vibrational relaxation mechanism in the symmetric vibrational modes of acetonitrile10 and in the ESR lineshape analysis of spin labeled lipid molecules in the lamellar phase of different bilayer membranes.11 MD was also used in a study of the pseudo-rotation model which is often used in nuclear paramagnetic spin relaxation theory.12 In a combined MD and 14N-NMR relaxation study the problem of urea induced protein unfolding was discussed for the urea–lysozyme system.13
Regarding the problem discussed in this paper, Likie and Prendergast14 have used MD simulations to investigate the dynamics of the internal water molecules of fully solvated rat intestinal fatty acid binding protein (I-FABP). They found that the number of long lived intrinsic β-waters could be different from the number of intrinsic waters indicated from the X-ray crystal structure. This is also observed in our study. These results indicate that crystal waters only give an approximative picture of the intrinsic waters present in a soluble protein. However, their MD simulation is relatively short (about 1 ns)14 and consequently, they were unable to determine the distribution function of the residence times of β waters. In the study reported by Garcia and coworkers,15 the hydration of cavities in the interior of staphylococcal nuclease (SN) was characterized in terms of residence times and average location. A 10 ns long MD simulation was performed and 3 internal water molecules were identified for the wild type SN having residence time of 4–10 ns. The water orientational motion is hindered since the mean square deviations were small but the orientational order parameter was not determined. The NMRD study shows two water molecules with residence times larger than 10 ns.16 In exploring the conserved water sites and hydration of a coiled-coil trimerisation motif, a 10 ns MD simulation did not reveal any long residence water and all crystallographic waters had residence times less than 10 ps.17 Henchman and McCammon18 also used MD simulations to characterize the water hydration sites around the protein acetylcholinesterase (AChE). They carried out (10 ns) MD simulation and explored different approaches to identify hydration sites. They found that the number of identified hydration sites is larger than the number of waters identified using X-ray crystallography and that the residence times of those waters vary between several ps and 10 ns. However, these studies do not calculate water T1-NMRD-profiles based on their MD simulations and the stretching property was not discussed.
In this study, we have used MD simulations to evaluate the accuracy and limitations of the water R1-NMRD relaxation model and its analysis in detail. First, we have determined the theoretical NMRD profiles based on the information of the dynamics of protein-associated waters obtained from MD, and analyzed them using the R1-NMRD model. Since the relaxation contributions of different types of water molecules are explicitly known in the present study, they provide reference data to compare the molecular picture of the protein hydration extracted from the R1-NMRD model analysis. Next, the stretching of the NMRD dispersion due to the distribution function of β-water residence times is studied for different reorientation correlation times and compared to the NMRD relaxation model. Last, we have identified a number of buried β water molecules through the analysis and determined their orientational order parameters.
The paper is organized as follows. First, the NMR relaxation theory is reviewed followed by the presentation of the MD simulations of a fully hydrated protein (Peroxiredoxin 5).9 The analysis of the NMRD profiles obtained from the MD simulation is then presented and discussed. Finally, we sum up and some conclusions are given.
(2) |
(3) |
(4) |
(5) |
Fig. 1 The snapshot structure of the protein Peroxiredoxin 5 (PrxV). The distribution of each long-lived β-water is shown by the color-coded surface representation: the red-colored surface is for the distribution of waters with longer than 10 ns residence time, the yellow-color is for the waters with the residence time between 5 ns and 10 ns, the blue-color is for the residence time between 2 ns and 5 ns, respectively. For clear presentation, we only show several representative long-lived waters. Crystallographic waters that overlap with the long-lived waters are shown by the atom-cleared sphere representation. During the simulation, many crystal waters leave the initial position and bulk waters replace those positions. Not all of the long-lived waters are found at the positions where the crystallographic waters are found. The five most long-lived β-water molecules (with residence time longer than 10 ns) are labeled W1, and B1 to B4, in which W1 is the crystal water and B1 to B4 are bulk waters that are added in the simulation to fully solvate the system. |
The system is prepared on the basis of the recent high resolution structure of PrxV (PDB ID: 3MNG).20 It has been suggested that the protein functions by interconverting between the monomeric and the dimeric states,21 and in the present study we use the monomeric form of the protein to simplify the simulation study. The structure of the monomeric PrxV is presented in Fig. 1. In the structure, 288 water molecules are identified as crystal waters and included in the simulation. The system is further solvated with a rhombic dodecahedron (RHDO) box of 3803 water molecules, followed by the removal of any added water molecule that overlaps with the protein and crystal waters. The resulting system contains a total of 11393 atoms (2414 protein atoms and 2993 water molecules). Thus, the total number of water molecules in the system is NT = 2993. The CHARMM22 force fields22 and the CMAP correction23 are used to describe the protein. Each water molecule is represented by the three sites TIP3P water model.24 The RHDO periodic boundary conditions with the lattice length 53.8 Å are used, and the electrostatic interactions are evaluated using the particle mesh Ewald summation (PME) method.25
The system is first energy minimized to relax the coordinates of water molecules and hydrogen atoms of PrxV. After heating up the system to 300 K over 23 ps, the MD simulations are run for 20 ns. The temperature is maintained at 300 K using the Langevin thermostat.26 The simulations are carried out with the 2.0 fs integration time step and the SHAKE algorithm27 is applied to constrain the bonds involving hydrogen atoms. The NAMD program28 is used to perform the MD simulations and the CHARMM program29 in the system preparation, energy minimizations, and trajectory analysis. During the simulation, the coordinates of the protein and water molecules are saved every 1 ps. The saved coordinates are used in the subsequent analysis.
(6) |
Fig. 2 The full distribution function of residence times displaying the number of water molecules N(τW) characterized by a residence time τW. The α water has τW ≤ 0.5 ns and the rest are β water molecules. |
Fig. 3 The distribution function of residence time displays the number of β water molecules with residence times in the interval 0.5 < τW ≤ 20 ns. The inset shows the distribution function of residence times displaying the number of water molecules within the selected time range. |
We present in ESI,† a Movie S1 showing the dynamic exchange of β-waters (τW > 2.5 ns) around several hydration sites of PrxV. In the movie, the sizes of water molecules are proportional to their τW values and colored according to Fig. 1. The movie suggests that the dynamics of perturbed waters are more complicated than the simple picture based on the analysis of crystallographically determined waters. In particular, it is not possible to determine whether the crystallographically identified waters are either α- or β-waters, exclusively by analyzing their positions and the number of interactions with protein. Many exchanges of waters are observed at each hydration site. Some of them are β-waters with long residence time and some are with relatively short residence time. Sometimes, those sites are occupied by α-waters (thus for much short period of time), before β-waters move in and replace them, making the total number of hydration waters roughly constant.
If the average residence time of water is computed for each hydration site, it may be used to characterize each hydration site, but it may require an MD simulation much longer than 20 ns to achieve a full convergence of the computed (average) residence times. However, this kind of analysis would lose detailed information about the nature of the dynamics of hydration waters, and the Nα and Nβ values determined based on such analysis should be interpreted carefully.
(7) |
Fig. 4 The orientational order parameter S0 for β-waters versus their residence time τW. |
Fig. 5 displays three (exact) water R1-NMRD profiles obtained from eqn (2) using τR = 50 ns (squares), 10 ns (pyramids), 1 ns (stars), respectively. The τR values are chosen to cover a large range of possible τR values in real experiments. This provides an ideal case to explore the effects of τR on the residence time distribution function, whereas it is not possible to carry out experimentally, because we are unable to manipulate τR independently over a large range. The τR = 1 ns value is a very short and rather unrealistic reorientational correlation time for a global protein, but is included here as an illustrative example of a single τcR1-NMRD dispersion profile. In this case, the NMRD profile is almost uninfluenced by the τW distribution function. The NMRD profile, on the other hand, is influenced by the whole τW distribution function for the slow tumbling protein, such as τR = 50 ns.
Fig. 5 The “exact” NMRD profiles obtained from the MD simulation (eqn (2)) with τR = 50 (squares), 10 (pyramids), 1 (stars) ns are displayed with best fitting NMRD profiles (solid line) calculated using the NMRD model of eqn (5), using one, two and three effective correlation times τc (see Table 1). The model NMRD profile defined by exp-parameters given in Table 1 is not distinguishable from the exact NMRD profile. |
The fitting is carried out sequentially. Starting from the shortest τR we first fit eqn (5) against the exact NMRD profile using Nβ1 and one residence time τW,1. We then proceed to τR = 10 ns and add a second dispersion profile with τW,2 and Nβ2. Finally for τR = 50 ns a third residence time τW,3 is introduced with Nβ3 water molecules. This fitting procedure results in three τW distribution functions shown as a–c in Fig. 6. The modeled NMRD profiles are displayed in Fig. 5 as a solid line and the model parameters determined are summarized in Table 1. The analyses indicate that three residence times can be extracted and produce a model NMRD profile conforming well to the exact NMRD displayed in Fig. 5. However, as presented in Fig. 6, the resultant τW-distributions are much simpler than the exact distribution given in Fig. 3. We note that alternative fitting procedures can be applied, possibly producing a different set of τW and Nβ values. This, together with the present fitted results, suggests that the residence time distribution can be interpreted in several ways, and alternative different distribution functions can produce equally good and excellent fitting to the “exact” NMRD profile. One such interpretation is denoted exp in Fig. 6. The exp-distribution is obtained from Fig. 3 by averaging the distribution values from 20 ns to 10 ns, 10 ns to 2 ns and 2 ns to 0.5 ns to give three residence times, τW = 12, 4 and 0.7 ns, respectively, with weight factors that are equal to the natural number of water molecules (Table 1). This model NMRD profile obtained with the exp-distribution fits excellent to the “exact” NMRD profile with τR = 50 ns. The present analysis suggests that detailed information about the intrinsic water dynamics is partly lost in the NRMD model analysis and the residence time distribution differs from the exact distribution of Fig. 3.
τ R (ns) | τ W1 (ns) | N β1 | τ W2 (ns) | N β2 | τ W3 (ns) | N β3 | χ 2 | Fig. 6 |
---|---|---|---|---|---|---|---|---|
50 | 0.7 | 7 | 4.0 | 6 | 12.0 | 4 | 0.00163 | exp |
1 | 0.7 | 17 | — | — | — | — | 0.00039 | a |
10 | 0.7 | 9 | 5.1 | 8 | — | — | 0.00634 | b |
50 | 0.7 | 7 | 5.1 | 8 | 15.7 | 2 | 0.00275 | c |
Fig. 6 The distribution of residence times obtained from Fig. 5 and listed in Table 1 using the NMRD model. The residence times obtained from the NMRD analysis are displayed in (a) when τR = 1 ns, in (b) when τR = 10 ns and in (c) when τR = 50 ns. The residence times of (c) (see Table 1) can be compared with the averaged distribution function of Fig. 3 displayed in exp. |
The results presented in Fig. 5 show that three effective correlation times reproduce well the stretching of the exact NMRD profiles with τR = 50 ns and τR = 10 ns. In Fig. 7, we have analyzed further the stretching of the NMRD profile with τR = 50 ns, by comparing the exact NMRD profile to the model NMRD profiles assuming a single τc value, i.e., 20 ns, 10 ns, and 5 ns. The figure shows that a single Lorentzian function cannot explain the stretching of the NRMD profiles both at low and high fields and therefore, the entire distribution of residence times contributes to the stretching. Nevertheless, the present analysis provides a simple approach to estimate the range of τc by focusing on the low and high field dispersions. In the present case, the range is between τc,min = 5 ns and τc,max = 20 ns. If the reorientational correlation time τR is known, the range of the residence times τW can also be determined straightforwardly. For example, by assuming τR = 50 ns in the present case, τW,min and τW,max are 5.6 ns and 33.3 ns, respectively, which are comparable to the values given in Table 1.
Fig. 7 The stretched NMRD profile with τR = 50 ns is displayed by a solid line. It is analysed from low fields to high field using a single Lorentzian function . The low field relaxation rate r(0) = 1.4. The correlation times are: (a) τc = 20 ns, (b) τc = 10 ns, (c) τc = 5 ns. |
We found that the main assumptions made in the formulation of the NMRD relaxation model (eqn (5)) are corroborated by the MD simulation analysis. Compared to the large number of short-lived α waters, a relatively small number of β waters (a total of 17 with τW > 0.5 ns in this study) are identified from the MD simulation, all of them with high orientational order parameters. The order parameters of most of the identified β waters are close to 1.0, and only a few have slightly reduced values. Since the relaxation contribution is proportional to the square of the order parameter (eqn (2)), the waters with reduced order parameters will contribute less to the NMRD profile, and the main relaxation contribution determining the NMRD profile is due to the remaining β waters with large order parameters. Our analysis also suggests that the reduced values are not because of local water motion but because they follow local protein fluctuations. Therefore, they should be interpreted cautiously.
However, the NMRD model yields a much simplified distribution function of water residence times relative to the distribution function obtained by analyzing the MD simulation trajectory (Fig. 6). For slowly tumbling protein, displaying a stretched NMRD profile, an excellent fitting does not reproduce the full distribution of residence times of protein-associated waters (compare Fig. 3 to Fig. 6). This implies that the information content of the (experimentally determined) NMRD profile is not sufficient for a unique determination of the full distribution function of residence times. Furthermore, the NMRD model results in residence time distributions that are rather similar to the distribution function of the block-averaged residence times (Table 1 and Fig. 6). This suggests that the information that is extracted from the NMRD model analysis is more or less averaged over waters with similar residence times.
The analysis of the β waters and their “binding sites” in protein shows that different residence times are sometimes correlated with the same location in the protein. This can be understood as the effect of protein dynamics and local structural fluctuations, which consequently change the residence times of the waters bound in that site. In addition, the crystallographically identified waters are not always β waters. Therefore, the rigid picture from crystallographics, where water residence sites are identified for buried water molecules, is not valid for all β-water molecules. Although it is likely to be true for very long lived waters, for the waters with residence time 1–10 ns their averaged residence times cannot be determined with any high accuracy. This implies that the dynamics of protein-associated waters (both α and β) are rather complex.
Taken together, the present study suggests that we cannot extract information about all β waters and their residence times solely with the information obtained from the analysis of the NMRD model and X-ray structures. The information obtained from the NMRD analysis is rather averaged over waters with similar residence times. Nevertheless, the present analysis clearly demonstrates the influence of the full τW-distribution function on the stretching of the water proton NMRD profiles both in the low field and in the high field region of the dispersion.
Finally, it should be noted that Fig. 3 is obtained from the 20 ns MD simulation and thus may not be fully converged. Since the results of MD simulation are dependent on the length of the simulation, we do not attempt to draw any quantitative conclusion on the water dynamics of the protein Peroxiredoxin V in this study. This is also not the focus of this study. In this study, our focus is on assessing the information content of the NMRD profile and the relaxation model by comparing the water residence times and their distribution determined from the NMRD model analysis to the exact distribution of water residence times, which is used as the input in the NMRD model fitting. Therefore, it is sufficient to have any distribution of residence times which we can consider as a real one and be used to determine the theoretical NMRD profiles. Since the R1(ω0)-NMRD experiments cannot discriminate between all individual β water molecules, we used the MD simulations to generate such information in this study. Therefore, the length of the simulation has no consequences for this study, as far as the length roughly corresponds to the time scale of the protein tumbling rate.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c3cp51147b |
This journal is © the Owner Societies 2013 |