Length-scale dependence of protein hydration-shell density

Akash Deep Biswas ab, Vincenzo Barone bc, Andrea Amadei *d and Isabella Daidone *a
aDepartment of Physical and Chemical Sciences, University of L’Aquila, via Vetoio (Coppito 1), 67010 L’Aquila, Italy. E-mail: isabella.daidone@univaq.it
bScuola Normale Superiore di Pisa, Piazza dei Cavalieri 7, I-56126 Pisa, Italy
cNational Institute for Nuclear Physics (INFN) Pisa Section, Largo Bruno Pontecorvo 3, 56127 Pisa, Italy
dDepartment of Chemical Science and Technology, University of Roma Tor Vergata, via della Ricerca Scientifica, 00133 Roma, Italy. E-mail: andrea.amadei@uniroma2.it

Received 17th November 2019 , Accepted 18th March 2020

First published on 23rd March 2020

Here we present a computational approach based on molecular dynamics (MD) simulation to study the dependence of the protein hydration-shell density on the size of the protein molecule. The hydration-shell density of eighteen different proteins, differing in size, shape and function (eight of them are antifreeze proteins), is calculated. The results obtained show that an increase in the hydration-shell density, relative to that of the bulk, is observed (in the range of 4–14%) for all studied proteins and that this increment strongly correlates with the protein size. In particular, a decrease in the density increment is observed for decreasing protein size. A simple model is proposed in which the basic idea is to approximate the protein molecule as an effective ellipsoid and to partition the relevant parameters, i.e. the solvent-accessible volume and the corresponding solvent density, into two regions: inside and outside the effective protein ellipsoid. It is found that, within the model developed here, almost all of the hydration-density increase is located inside the protein ellipsoid, basically corresponding to pockets within, or at the surface of the protein molecule. The observed decrease in the density increment is caused by the protein size only and no difference is found between antifreeze and non-antifreeze proteins.

1 Introduction

The hydration of bio-molecular systems plays important roles in their stability,1 dynamics,2–4 folding,5 ligand recognition6,7 and function.8,9 There have been a variety of theoretical9–18 and experimental3,19–32 studies demonstrating that proteins modify the structural and dynamical organization of their surrounding water molecules. The thickness of protein hydration shells inferred from most studies is in the range of 1–1.2 nm,13–15,18,33 although larger, long-ranged dynamical solvation shells have been hypothesized on the basis of terahertz spectroscopical studies.25,34,35

Although it is now rather well accepted that the density of the protein hydration shell is higher than the bulk one,11,12,15,16,18,20 there is not a consensus about the magnitude of this increment. Depending on the protein, on the experimental technique, or on the computational approach, used and on the distance from the protein surface at which the density increment is evaluated, different estimates have been given ranging from 1% to 50%.11,12,16,20

Here, we focus on the dependence of the protein hydration shell density on the protein size, and partly function by analyzing the hydration properties of eighteen different proteins, out of which eight are antifreeze proteins (AFPs), by means of MD simulation and a recently developed analysis to investigate hydration shell densities.15,18 Previous work have focused on the length-scale dependence of the hydration density of hydrophobic solutes36–39 but, to the best of our knowledge, this is the first systematic analysis of the correlation between hydration shell densities and the size and shape of globular proteins, which are systems with amphipathic surfaces.

2 Theory

The basic idea for constructing a simple, yet reasonable, model describing the hydration shell-density variations as a function of the protein size and geometry, is to approximate the protein molecule as an ellipsoid and to partition the relevant quantities (i.e. the protein excluded volume Vex, the solvent-accessible volume Vacc, and the solvent density ρsh, within the solvation shell of the protein). The hydration shell is basically divided into two parts: one residing inside the effective protein ellipsoid (Vinex, Vinacc and ρin, respectively), the other residing within the solvation shell but outside the effective protein ellipsoid (Voutex, Voutacc and ρout, respectively) (see Fig. 1). For a description of the procedure used to define the effective protein ellipsoid volume, Vell,eff and the solvation shell see the Methods section.
image file: c9cp06214a-f1.tif
Fig. 1 Schematic diagram showing the main parameters used in the model. The inner ellipsoid corresponds to the effective protein ellipsoid volume, Vell,eff, while the outer ellipsoid corresponds to the volume of the ellipsoid defined by the boundary layer of the hydration shell, Vsh. The excluded protein volume, Vex, is represented in red (dark red corresponds to Vinex while light red to Voutex); the solvent-accessible volume, Vacc, is represented in blue (dark blue corresponds to Vinacc while light blue to Voutacc).

Then, we make the following assumptions:

• (i) we assume that the protein excluded volume inside the effective protein ellipsoid can be expressed as a fixed fraction of the protein ellipsoid volume and that this fraction is constant for the different proteins:

image file: c9cp06214a-t1.tif(1)

• (ii) similarly, we assume the ratio between the excluded volume due to protein atoms outside the protein ellipsoid and the excluded volume due to protein atoms inside the protein ellipsoid to be a constant independent of the protein type:

image file: c9cp06214a-t2.tif(2)

• (iii) we assume that the inner shell-density (i.e. ρin) is the same for each protein (this assumption will be confirmed by the MD simulation results, see Results section). For the outer shell-density (ρout) two cases are considered, one in which it is considered the same for each protein, the other in which it is affected by the shell size which depends, in turn, on the protein size. Hence, ρout is given either by a constant independent of the protein considered

image file: c9cp06214a-t3.tif(3)

or by a simple function of the ratio α = Vsh/Vell,eff

image file: c9cp06214a-t4.tif(4)
where Vsh is the volume of the ellipsoid defined by the boundary layer of the hydration shell (see the Methods section), ρb is the solvent bulk density, a0 and a1 are parameters identical for all the proteins considered and we used the fact that image file: c9cp06214a-t5.tif.

From the above assumptions we may obtain the solvent-accessible volume inside the protein ellipsoid, Vinacc, and the solvent-accessible volume outside the protein ellipsoid, Voutacc, as follows

Vinacc = Vell,effVinexVell,eff(1 − χ)(5)
Voutacc = VshVell,effVoutexVell,eff[α − (1 + γχ)](6)

The total accessible volume within the solvation shell, Vacc, can be then expressed as follows

Vacc = Vinacc + VoutaccVell,eff[αχ(1 + γ)](7)

Similarly, the density within the solvation shell can be expressed as

image file: c9cp06214a-t6.tif(8)

Finally, we consider the relative increment of ρsh with respect to the bulk water-density ρb, η = ρsh/ρb − 1

image file: c9cp06214a-t7.tif(9)
with m = (1 + γ)χ.

From eqn (1) and (2) we can express the total excluded volume via the parameter m

Vex = Vinex + VoutexχVell,eff + γχVell,eff = (1 + γ)χVell,eff = mVell,eff(10)
which corresponds to a linear relation between the excluded volume and the protein ellipsoid volume, hence providing m once fitting the data with a linear regression (see the Results section).

Similarly, by expressing the mean number of solvent molecules inside the effective protein ellipsoid ninvia the inner shell density

nin = (1 − χ)ρinVell,eff(11)
we can obtain β = (1 − χ)ρin by a linear regression of the ninversus Vell,eff plot as provided by MD simulation data (see the Results section).

In order to obtain a reliable estimate of ρin we compared for each protein the values of nin provided by the protein–solvent simulation with that as obtained removing from a pure solvent simulation (at the same temperature and pressure) the excluded volume of the protein (see Methods section). Given that the accessible volume inside the effective protein ellipsoid is the same for the protein–solvent and pure solvent simulations, the ratio of the mean numbers of solvent molecules is identical to the ratio of the two inner densities, with the inner density of the pure solvent simulation (by definition) identical to the bulk one. It will be shown in the Results section that ρin/ρb over the protein sample used is approximately constant for all proteins, thus confirming that ρin can be considered invariant for all the proteins and providing its estimate by simply multiplying such ratio (i.e. the mean ratio over the protein sample) by the bulk density.

Therefore, once obtained m,ρin and β, we can estimate 1 − χ = β/ρin and hence by using either eqn (3) or (4) into eqn (9) we have either

image file: c9cp06214a-t8.tif(12)
image file: c9cp06214a-t9.tif(13)

By using eqn (12) (the lower-level model) or eqn (13) (the higher-level model) to fit the values of η versus α as provided by the protein sample, either r0 or a1 and a2 parameters can be evaluated to obtain a fully analytical expression of η(α) at the two levels of approximation.

We end this section with a comment on the applicability of our model to larger proteins, i.e. in the range of α values tending to 1. The model has a formal lower limit, αmin, because Voutacc/Vell,eff needs to be ≥0, which implies, according to eqn (6), that Voutacc/Vell,eff = αm − (1 − χ) ≥ 0, and thus αm + (1 − χ) = αmin. However, it is likely that in the proximity of αmin (i.e. close to a vanishing relative outer accessible volume) our model is not fully reliable (see Results section for an estimate of αmin).

3 Methods

3.1 MD simulations

Molecular dynamics simulation of the following fourteen proteins was performed: eight antifreeze proteins (AFPs), namely yeast-AFP (PDB ID: 2LQ0), fish type I AFPI (PDB ID: 1WFA), fish type II AFPII (PDB ID: 2AFP), fish type III AFPIII (PDB ID: 1HG7), insect Tenebrio molitor TmAFP (PDB ID: 1EZG), insect Choristoneura fumiferana CfAFP (PDB ID: 1M8N), Rhagium inquisitor RiAFP (PDB ID: 4DT5), arctic yeast Leucosporidium sp. ice binding protein LeIBP (PDB ID: 3UYV), and ten non-AFPs, namely Trp-cage (PDB ID: 2J0F), Heliomicin (PDB ID: 1I2U), GB1 (PDB ID: 5JXV), BPTI (PDB ID: 5PTI), Ubiquitin (PDB ID: 3M3J), Barnase (PDB ID: 2KF3), Lysozyme (PDB ID: 1LZT) and Myoglobin (PDB ID: 1UFP), HCAII (PDB ID: 3KS3), and COVID-19 main protease (PDB ID: 6LU7). The Gromacs 5.1.4 software package40 in conjunction with the OPLS-AA force field41 was used. Each structure was solvated in a periodic dodecahedral box large enough to ensure at least 1.3 nm distance between the protein surface and the box faces. The simulation box was filled with Simple Point Charge (SPC) water molecules42 and (when needed) a proper number of ions to neutralize the system. The density of the boxes containing the SPC-protein solutions was calibrated, as done in previous work,15,18,43,44 in order to obtain (within the isothermal–isochoric (NVT) MD simulations) a pressure of ≈560 bar, corresponding to the pressure provided by an MD simulation of a pure SPC box at 300 K, with a density corresponding to the experimental liquid water density at the same temperature (≈33.3 molecules per nm3). Each protein was simulated at room temperature (300 K) in the NVT ensemble, using an integration step of 2 fs and keeping the temperature constant by the velocity-rescaling algorithm.45 All bonds were constrained using the LINCS algorithm46 and for short-range interactions a cut-off radius of 1.1 nm was employed. The particle mesh Ewald method47 was used to compute long-range interactions with grid search and cut-off radii of 1.1 nm. After solute optimization and subsequent solvent relaxation, each protein was simulated for 100 ns.

In addition, a 10 ns-long NVT simulation of a pure SPC box at 300 K at the reference pressure of 560 bar is performed in order to compare the properties of the protein hydration shells with the properties of fictitious hydration shells filled with bulk water (see below in the Methods section).

Concerning the choice of the water model, we showed in a previous paper18 for four of the proteins used here that the density results were not dependent on the water model chosen. To do this we performed MD simulations, and subsequent analyses, with both SPC42 and TIP4P/2005 models48 at two different temperatures, i.e. room temperature and a temperature just above the melting temperature for each of the two water models. The results with SPC were fully consistent with the ones obtained with the TIP4P/2005 water model.

3.2 Protein volume and hydration-shell density calculations

The protein excluded volume, i.e. the volume enclosed by the solvent-accessible surface, was calculated according to the method reported by Eisenhaber et al.,49 as implemented in Gromacs, using a probe radius of 0.14 nm. The protein mean excluded volume, Vex, was obtained for each simulation by averaging over the instantaneous excluded volumes evaluated at each MD time frame.

In order to characterize the hydration-shell density of a protein we approximate the protein molecule as an ellipsoid defined, at each MD time frame, by the eigenvectors and eigenvalues of the 3 × 3 geometrical covariance matrix of the x, y, z atomic coordinates, as described in the recent papers.15,18 Briefly, for each time frame of a trajectory the three eigenvectors define the protein ellipsoid axes, with the corresponding lengths provided by the eigenvalues: considering a Gaussian atomic positional distribution along each eigenvector, a semi-axis image file: c9cp06214a-t10.tif was used, with i = 1, 2, 3 and λi the eigenvalue of the ith eigenvector. A set of ellipsoidal layers around the protein defined by the consecutive ellipsoids with semi-axes a(n)i = ai + , with fixed increment δ = 0.03 nm, was then considered. By calculating, and averaging, the instantaneous SPC density within each layer (disregarding the possible presence of protein atoms and/or counter-ions) we obtained the solvent-density profile around the protein, within layers of increasing distance from the protein ellipsoid surface (i.e. the layer-solvent-density profile). Given that such solvent density profile does not account for the effect of excluded volume of the non solvent atoms, the density values within layers including a significant number of protein atoms are lower than the actual density. Typically, single protein atoms can be still present in layers at 0.7–0.8 nm from the protein ellipsoid surface while for layers beyond 1 nm basically no protein atoms are detected. Therefore, the differences among the protein SPC density profiles within the first hydration layers (up to ≈0.4–0.5 nm) largely reflect differences in the spatial arrangement and compactness of protein atoms.

In order to remove the effect on the density profile of the spatial arrangement of the protein atoms that protrude from the protein ellipsoid surface and to compare the solvent density inside each hydration layer with the density of the bulk solvent, we use a previously employed strategy18 to construct a fictitious hydration shell filled with bulk water. To this aim, the protein coordinates obtained from the protein + solvent MD simulations are inserted into a pure SPC box extracted from a MD simulation of water alone. The water molecules of the pure SPC box overlapping with the protein coordinates are removed, providing a protein + solvent box in which the hydration shell is filled with the bulk solvent. Water molecules of the pure SPC box are removed where the distance between any existing atom and any atom of the inserted protein is less than the sum of the van der Waals radii of both atoms. For each protein, this is done for 10[thin space (1/6-em)]000 different protein coordinates, each one inserted in a different bulk solvent box. The solvent density around the protein as a function of the distance from the protein ellipsoid surface is then calculated for these fictitious protein + solvent configurations and we call it ρb,fict. Finally, the ratio ρ/ρb,fict, i.e. the density variations with respect to the bulk density, is reported as a function of the distance from the protein ellipsoid surface (see Results section).

We define an effective protein-ellipsoid volume, Vell,eff, to be used in studying the protein-size dependence of η. This is achieved by adding to each protein ellipsoid semi-axis the solvent radius (we used 0.14 nm as the solvent radius) plus the effective mean thickness due to the excluded volumes of the protein atoms over the ellipsoid surface (0.06 nm) which was obtained via a numerical fit to protein partial molecular volumes, as described in a previous work.15 Briefly, the effective mean thickness of the ellipsoid surface was obtained by tuning its value in order to best reproduce the experimental partial molecular volumes over a sample of 14 globular proteins. A more accurate method50 that could be used to define an effective protein ellipsoid has been proposed based on the calculation of the neutron scattering profile from MD simulation which is used to back-calculate the gyration-tensor of the protein and its closest hydration layers. Nevertheless, such an approach is more time-consuming than the method used here and hence less effective for the analysis of a large number of proteins.

4 Results and discussion

From the MD simulations of the eighteen studied proteins, using the procedure described in the Methods section, we calculated for each protein the layer-solvent-density profile as a function of the distance from the protein ellipsoid surface, see Fig. S1 in the ESI, and the corresponding density variations with respect to the bulk density, ρ/ρb,fict, see Fig. 2. For each of the proteins under investigation the density of the solvent is higher than the bulk density in the proximity of the protein ellipsoid surface and decreases at increasing distances, approaching the bulk density (i.e. ρ/ρb,fict tends to 1). Each layer-density profile approaches the plateau at around 1 nm from the protein ellipsoid surface, thus indicating that beyond such a distance the SPC molecules of the protein–solvent system behave equivalently to the reference pure SPC ones.
image file: c9cp06214a-f2.tif
Fig. 2 Density variations with respect to the bulk density, ρ/ρb,fict, as a function of the distance from the protein ellipsoid surface for the eighteen proteins: Trp-cage (black), Yeast-AFP (red), AFPI (green), Heliomicin (blue), GB1 (yellow), AFPIII (brown), BPTI (grey), TmAFP (violet), Ubiquitin (cyan), CfAFP (magenta), Barnase (orange), Lysozyme (indigo), RiAFP (maroon), AFPII (turquoise), Myoglobin (dark green), LEIBP (light red), HCAII (light blue), and COVID-19 (light green).

By using the mean hydration shell volume, Vsh, corresponding to the mean volume of the ellipsoid defined by the boundary layer of the hydration shell (as obtained by adding 1 nm, i.e. the distance from the protein ellipsoid surface at which the density approaches the plateau value of bulk density, to each semi-axis of the protein ellipsoid), the mean number of SPC molecules within such an ellipsoid, nsh, and the mean protein excluded volume, Vex, we can obtain the mean solvent density within the accessible volume of the protein hydration shell, ρsh, via:

image file: c9cp06214a-t11.tif(14)

We can compare ρsh to the solvent bulk-density, ρb, by calculating the relative increase η = ρsh/ρb − 1. As is shown in Tab. 1 the simulations of the studied proteins provided, by means of eqn (14), a hydration shell mean density in the range of 4–14% higher than the bulk one. This range of η is in agreement with experimental data20 and previous calculations.11,12,15,18

By analyzing the data reported in Table 1, in which also the mean effective-protein ellipsoid, Vell,eff, the mean ellipsoidal hydration-shell volume, Vsh, are reported, one can notice a positive correlation between η and the size of the protein, as provided by the analysis of the correlation between η and the protein volumes. This behavior is opposite to what was observed for hydrophobic solutes in the same length-scale range,39 for which a decrease in hydration density was observed as a function of the protein size.

Table 1 Residue number, effective protein-ellipsoid volume, Vell,eff, ratio between the volume of the shell, Vsh, and the effective ellipsoid volume, Vell,eff, and relative density increment, η, for the eighteen studied proteins. Volumes are given in nm3
Protein Residue N. V ell,eff

image file: c9cp06214a-t12.tif

Note: the error on Vell,eff and Vsh is ≈0.3%, the one on η is ≈1%. Errors on the different quantities were evaluated through the standard error of their mean calculated over 3 (independent) subtrajectories.
Trp-cage 20 4.46 5.95 0.050
Yeast-AFP 25 6.87 4.97 0.052
AFPI 37 8.06 4.86 0.046
Heliomicin 44 9.47 4.38 0.065
GB1 56 11.81 4.00 0.079
AFPIII 66 12.27 3.91 0.071
BPTI 58 12.41 3.94 0.068
TmAFP 82 12.46 4.00 0.083
Ubiquitin 75 14.40 3.70 0.090
CfAFP 121 19.65 3.37 0.088
Barnase 108 20.55 3.30 0.094
Lysozyme 129 22.64 3.19 0.099
RiAFP 143 23.26 3.26 0.089
AFPII 129 25.70 3.06 0.106
Myoglobin 154 28.63 2.99 0.111
LeIBP 241 38.73 2.73 0.123
HCAII 261 46.61 2.56 0.133
COVID-19 306 54.93 2.56 0.125

On the contrary, no clear dependence of η was found on the protein eccentricity, as provided by the analysis of the correlation between η and the eccentricity of the ellipsoids approximating the protein molecules (see Fig. 3A), and on the hydrophobicity of the protein surface, as provided by the analysis of the correlation between η and the hydrophobic fraction at the protein surface (see Fig. 3B). In the ESI, Fig. S2 and S3, a global view of the chemical character of the surfaces of all proteins is given in terms of hydrophilic/hydrophobic atoms.

image file: c9cp06214a-f3.tif
Fig. 3 η as a function of the ellipsoid eccentricity, defined as ε2 = 1 − a2/c2 with a and c the smallest and largest ellipsoid axes, respectively (A), and of the fraction of hydrophobic atoms at the protein surface (B).

In order to interpret the dependence of the hydration shell-density on the protein size through the model proposed in the Theory section, we firstly need to obtain m, ρin/ρb and 1 − χ, the latter evaluated as 1 − χ = β/ρin.

To get m we plotted Vex as a function of the effective protein-ellipsoid volume, Vell,eff, for all the studied proteins, see Fig. 4A, and from a linear fit of the data we obtain m (m = 0.95). From the data and the quality of the fit it is evident that the linear relation we assumed between Vex and Vell,eff is a good approximation ensuring that one of our basic assumptions is reasonable.

image file: c9cp06214a-f4.tif
Fig. 4 (A) Vex as a function of the effective protein-ellipsoid volume, Vell,eff, for all the studied proteins. From the linear fit Vex = mVell,eff we obtain m = 0.95. (B) Mean number of solvent molecules inside the effective protein ellipsoid, nin, as a function of the effective protein-ellipsoid volume, Vell,eff, for all the studied proteins. From the linear fit nin = βVell,eff we obtain β = 11.61. The correlation coefficient is higher than 0.99 for both regressions.

To get ρin/ρb we used the procedure described in the Methods section, i.e. ρin/ρb is calculated from the ratio of the mean number of solvent molecules inside the protein effective ellipsoid and the corresponding number calculated on the fictitious protein + solvent configurations. The ρin/ρb values, calculated over the protein sample and reported in Fig. 5A as a function of Vsh/Vell,eff, show a rather flat distribution with a mean value of 1.56, thus confirming that ρin is constant for all the proteins as was assumed in the model. A similar analysis of ρout/ρb, evaluated as the ratio of the mean number of solvent molecules outside the protein effective ellipsoid, but inside the hydration shell, and the corresponding number calculated on the fictitious protein + solvent configurations was performed (see Fig. 5B). The comparison between ρin/ρb and ρout/ρb shows that almost all of the hydration shell density increase is concentrated inside the protein effective ellipsoid, being the former ≈1.56 and the latter in the range of 1.00 and 1.08. Moreover, ρout/ρb shows a slight dependence on the protein size.

image file: c9cp06214a-f5.tif
Fig. 5 (A) ρin/ρb as a function of α = Vsh/Vell,eff for all the studied proteins. A linear regression of the ρin/ρb data yields a slope coefficient of −0.031 which is within one standard error of the coefficient (0.032), showing that ρin/ρb can be considered basically constant over the protein sample. (B) ρout/ρb as a function of α = Vsh/Vell,eff for all the studied proteins. The black solid line is the analytical function ρout/ρb = (a0 + x)/(x + a1) with a0 = −0.63 and a1 = −0.75 as obtained from the fit of η with eqn (12) (see Results section). The analytical expression provides a good reproduction of the ρout/ρb values obtained from the MD simulations.

Similarly as for m, by expressing the mean number of solvent molecules inside the effective protein ellipsoid ninvia the inner shell density nin = (1 − χ)ρinVell,eff, we can obtain β = (1 − χ)ρin by a linear regression of the ninversus Vell,eff plot (see Fig. 4B) as provided by shell-density calculations (we get β = 11.61). From β and ρin, we can evaluate 1 − χ = β/ρin = 0.22.

Once obtained m, ρin/ρb and (1 − χ) we can use eqn (12) and (13) to fit the values of η as a function of α = Vsh/Vell,eff calculated for the different proteins (see Fig. 6). From the quality of the fits it emerges that both levels of approximation of the model employed are reasonably accurate, with the higher-level model performing slightly better consistently with the slight size-dependence observed from the explicit calculation of ρout/ρb (see Fig. 5). A further validation of our model comes from the fact that if the parameters a0 and a1, as obtained from the fitting of η with eqn (13), are used into the analytical function for ρout/ρb (ρout/ρb = (a0 + x)/(x + a1)) a good reproduction of the ρout/ρb values provided by the MD simulations is obtained (see Fig. 5B).

image file: c9cp06214a-f6.tif
Fig. 6 (A) η as a function of α = Vsh/Vell,eff calculated for the different proteins. The data are fitted with eqn (12) (dashed line) providing r0 = ρout/ρb = 1.04 and eqn (12) (full line) providing a0 = −0.63 and a1 = −0.75; the correlation coefficient is 0.97 for both models. The eight AFPs are highlighted in red.

The relevant parameters (i.e. 1 − χ, ρin/ρb and ρout/ρb which are independent of the level of approximation) can be utilized to understand how the solvent molecules are partitioned inside the hydration shell possibly explaining the variation of the solvent-shell density according to the protein size. According to our model ≈22% of the effective protein-ellipsoid volume is accessible to solvent (i.e. 1 − χ = 0.22) and basically all the solvent density increase is concentrated in this region. In fact, the solvent density in the inner accessible volume is approximately 56% higher than bulk (i.e. ρin/ρb ≈ 1.56), while it is basically equal to the bulk density in the outer region (i.e. ρout/ρb = 1.04 in the case of the lower level of approximation). Thus, almost all of the solvent density increase is confined within the protein and within grooves and pockets at the protein surface, as shown in Fig. 7 in which a representative configuration showing the water molecules within the protein ellipsoid is reported.

image file: c9cp06214a-f7.tif
Fig. 7 Representative configuration of HCAII showing the water molecules within the protein ellipsoid in ball and stick (oxygen and hydrogen atoms are reported in red and white, respectively). Most of the water molecules are localized within grooves and pockets. The behaviour is qualitatively similar for all proteins.

The protein-independent density increment of ≈56% found here is in line with previous findings in which the hydration density increase was calculated by means of MD simulation at a distance of 0.2 nm from the protein surface for lysozyme11 (≈50% increase was found) and from the analysis of the crystal waters in contact with the protein surface for 22 crystal structures51 (an average ≈20% increase was found).

Our protein data-set contains proteins with a specific function, namely antifreeze proteins. AFPs are able to inhibit ice growth by binding to ice nuclei and, although their ice-binding mechanism is still unclear, yet the hydration layer is thought to play a fundamental role. We have previously shown that the hydration density of two of these AFPs, CfAFP and AFPIII, are very similar to non-AFPs.18 Here, we extend the study to a total of eight AFPs spanning a rather large length-scale range, and compare the hydration density with non-AFPs. As shown in Fig. 6, in which the AFPs proteins are highlighted in red, the behavior of the AFPs does not differ from that of non-AFPs. Indeed, if we separate the data into two sets, one comprising non-AFPs and the other AFPs proteins, and fit our model to the two sets independently we basically obtain the same results. Thus, our results do not show any peculiar behavior of AFPs in terms of the protein-size dependence of the hydration-shell density with respect to non-AFPs.

As explained in the theory section, our model has a formal lower limit αmin = m + (1 − χ) = 1.17. However, it is likely that in the proximity of αmin (i.e. close to a null relative outer accessible volume) the model is not fully reliable. We might consider a value of α around 1.5 as a lower limit, which corresponds to a maximum number of residues of approximately 500–600, hence limiting the applicability of the model to small-to-medium-sized globular proteins.

5 Conclusions

By means of molecular dynamics simulation we analyzed the protein hydration-shell density of eighteen proteins, which include eight AFPs, and found an increase of water density with respect to the bulk-density for all the studied proteins. While we did not find any correlation between the magnitude of this increase and the protein eccentricity and protein-surface mean hydrophobicity, we found a clear dependence on the protein length-scale. In particular, the increment in hydration-shell density decreases for decreasing protein size (from ≈14% to ≈4%) and, within the simple model developed in the present work, this decrease is caused by the protein size only. In fact, the hydration shell-density increase is found to be confined within, or in pockets at the surface of the protein (i.e. inside the effective solvent-accessible volume of the ellipsoid approximating the protein molecule) and to be the same for all proteins. Therefore, since the relative size of this inner solvent-accessible volume, with respect to the total accessible volume of the hydration shell, is lower for smaller proteins, the decrease in water-density is explained by the difference in protein size.

The protein-independent increment of the density of the water molecules confined within the protein effective ellipsoid of ≈56% obtained from our model is consistent with previous explicit calculations showing that the density of water molecules in close proximity to the protein surface, and in particular in concave regions, is 20–50% higher than bulk.

Finally, we show that the antifreeze proteins studied here, which cover a broad range of sizes and types (hyperactive from insects, CfAFP, TmAFP and RiAFP, moderately active from fish, AFPI, AFPII and AFPIII, and moderately active from yeast, yeast-AFP and LeIBP), do not behave differently from non-antifreeze proteins in terms of the size-dependence of their hydration density, in line with our previous work in which we showed that the shell thickness and density of two AFPs did not feature any relevant difference with respect to two non-AFPs.18

Conflicts of interest

There are no conflicts to declare.


A. D. B and V. B acknowledge Italian MIUR (PRIN 2017, project “Physico-chemical Heuristic Approaches: Nanoscale Theory Of Molecular Spectroscopy, PHANTOMS”, prot. 2017A4XRCA) for supporting this work. I. D. acknowledges the CINECA award IscrC CSC under the ISCRA initiative for the availability of high-performance computing resources and support and funding from the FORTISSIMO project FP7-2013-NMP-ICT-FOF.


  1. H. R. Drew and R. E. Dickerson, J. Mol. Biol., 1981, 151, 535–556 CrossRef CAS PubMed.
  2. J. G. Kempf and J. P. Loria, Cell Biochem. Biophys., 2002, 37, 187–211 CrossRef CAS.
  3. S. Dellerue and M.-C. Bellissent-Funel, Chem. Phys., 2000, 258, 315–325 CrossRef CAS.
  4. S. Toba, G. Colombo and K. M. Merz, J. Am. Chem. Soc., 1999, 121, 2290–2302 CrossRef CAS.
  5. Y. Levy and J. N. Onuchic, Annu. Rev. Biophys. Biomol. Struct., 2006, 35, 389–415 CrossRef CAS PubMed.
  6. S. Park and J. G. Saven, Proteins: Struct., Funct., Bioinf., 2005, 60, 450–463 CrossRef CAS PubMed.
  7. C. M. Stegmann, D. Seeliger, G. M. Sheldrick, B. L. de Groot and M. C. Wahl, Angew. Chem., Int. Ed., 2009, 48, 5207–5210 CrossRef CAS PubMed.
  8. A. M. Klibanov, Nature, 2001, 409, 241 CrossRef CAS PubMed.
  9. D. R. Nutt and J. C. Smith, J. Am. Chem. Soc., 2008, 130, 13066–13073 CrossRef CAS PubMed.
  10. M. Levitt and R. Sharon, Proc. Natl. Acad. Sci. U. S. A., 1988, 85, 7557–7561 CrossRef CAS PubMed.
  11. F. Merzel and J. C. Smith, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 5378–5383 CrossRef CAS PubMed.
  12. A. Kuffel and J. Zielkiewicz, J. Phys. Chem. B, 2012, 116, 12113–12124 CrossRef CAS PubMed.
  13. A. Kuffel, D. Czapiewski and J. Zielkiewicz, J. Chem. Phys., 2014, 141, 08B605 CrossRef PubMed.
  14. A. Kuffel, D. Czapiewski and J. Zielkiewicz, J. Chem. Phys., 2015, 143, 10B601 CrossRef PubMed.
  15. S. Del Galdo, P. Marracino, M. D'Abramo and A. Amadei, Phys. Chem. Chem. Phys., 2015, 17, 31270–31277 RSC.
  16. F. Persson, P. Söderhjelm and B. Halle, J. Chem. Phys., 2018, 148, 215101 CrossRef PubMed.
  17. S.-Y. Sheu, Y.-C. Liu, J.-K. Zhou, E. W. Schlag and D.-Y. Yang, J. Phys. Chem. B, 2019, 123, 6917–6932 CrossRef CAS PubMed.
  18. L. Zanetti-Polzi, A. D. Biswas, S. Del Galdo, V. Barone and I. Daidone, J. Phys. Chem. B, 2019, 123, 6474–6480 CrossRef CAS PubMed.
  19. F. T. Burling, W. I. Weis, K. M. Flaherty and A. T. Brünger, Science, 1996, 271, 72–77 CrossRef CAS PubMed.
  20. D. Svergun, S. Richard, M. Koch, Z. Sayers, S. Kuprin and G. Zaccai, Proc. Natl. Acad. Sci. U. S. A., 1998, 95, 2267–2272 CrossRef CAS PubMed.
  21. N. Niccolai, R. Spadaccini, M. Scarselli, A. Bernini, O. Crescenzi, O. Spiga, A. Ciutti, D. Di Maro, L. Bracci and C. Dalvit, et al. , Protein Sci., 2001, 10, 1498–1507 CrossRef CAS PubMed.
  22. S. K. Pal, J. Peon and A. H. Zewail, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 1763–1768 CrossRef CAS PubMed.
  23. J. Peon, S. K. Pal and A. H. Zewail, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 10964–10969 CrossRef CAS PubMed.
  24. B. Bagchi, Chem. Rev., 2005, 105, 3197–3219 CrossRef CAS PubMed.
  25. S. Ebbinghaus, S. J. Kim, M. Heyden, X. Yu, U. Heugen, M. Gruebele, D. M. Leitner and M. Havenith, Proc. Natl. Acad. Sci. U. S. A., 2007, 104, 20749–20752 CrossRef CAS PubMed.
  26. M. Heyden and M. Havenith, Methods, 2010, 52, 74–83 CrossRef CAS PubMed.
  27. S. Khodadadi, J. Curtis and A. P. Sokolov, J. Phys. Chem. B, 2011, 115, 6222–6226 CrossRef CAS PubMed.
  28. J. D. Nickels, H. O'Neill, L. Hong, M. Tyagi, G. Ehlers, K. L. Weiss, Q. Zhang, Z. Yi, E. Mamontov and J. C. Smith, et al. , Biophys. J., 2012, 103, 1566–1575 CrossRef CAS PubMed.
  29. N. Yamamoto, K. Ohta, A. Tamura and K. Tominaga, J. Phys. Chem. B, 2016, 120, 4743–4755 CrossRef CAS PubMed.
  30. S. Khodadadi and A. P. Sokolov, Biochim. Biophys. Acta, Gen. Subj., 2017, 1861, 3546–3552 CrossRef CAS PubMed.
  31. F. Novelli, S. Ostovar Pour, J. Tollerud, A. Roozbeh, D. R. Appadoo, E. W. Blanch and J. A. Davis, J. Phys. Chem. B, 2017, 121, 4810–4816 CrossRef CAS PubMed.
  32. A. Charkhesht, C. K. Regmi, K. R. Mitchell-Koch, S. Cheng and N. Q. Vinh, J. Phys. Chem. B, 2018, 122, 6341–6350 CrossRef CAS PubMed.
  33. R. d. C. Barbosa and M. C. Barbosa, Phys. A, 2015, 439, 48–58 CrossRef CAS.
  34. S. Ebbinghaus, K. Meister, B. Born, A. L. DeVries, M. Gruebele and M. Havenith, J. Am. Chem. Soc., 2010, 132, 12210–12211 CrossRef CAS PubMed.
  35. K. Meister, S. Ebbinghaus, Y. Xu, J. G. Duman, A. DeVries, M. Gruebele, D. M. Leitner and M. Havenith, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 1617–1622 CrossRef CAS PubMed.
  36. D. M. Huang and D. Chandler, Proc. Natl. Acad. Sci. U. S. A., 2000, 97, 8324–8327 CrossRef CAS PubMed.
  37. H. S. Ashbaugh and M. E. Paulaitis, J. Am. Chem. Soc., 2001, 123, 10721–10728 CrossRef CAS PubMed.
  38. N. Giovambattista, P. J. Rossky and P. G. Debenedetti, Phys. Rev. E: Stat., Nonlinear, Soft Matter Phys., 2006, 73, 041604 CrossRef PubMed.
  39. S. Sarupria and S. Garde, Phys. Rev. Lett., 2009, 103, 037803 CrossRef PubMed.
  40. D. Van Der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. Berendsen, J. Comput. Chem., 2005, 26, 1701–1718 CrossRef CAS PubMed.
  41. G. A. Kaminski, R. A. Friesner, J. Tirado-Rives and W. L. Jorgensen, J. Phys. Chem. B, 2001, 105, 6474–6487 CrossRef CAS.
  42. H. Berendsen, J. Grigera and T. Straatsma, J. Phys. Chem., 1987, 91, 6269–6271 CrossRef CAS.
  43. I. Daidone, M. Aschi, L. Zanetti-Polzi, A. Di Nola and A. Amadei, Chem. Phys. Lett., 2010, 488, 213–218 CrossRef CAS.
  44. A. Amadei, I. Daidone and M. Aschi, Phys. Chem. Chem. Phys., 2012, 14, 1360–1370 RSC.
  45. G. Bussi, D. Donadio and M. Parrinello, J. Chem. Phys., 2007, 126, 014101 CrossRef PubMed.
  46. B. Hess, H. Bekker, H. J. Berendsen and J. G. Fraaije, J. Comput. Chem., 1997, 18, 1463–1472 CrossRef CAS.
  47. T. Darden, D. York and L. Pedersen, J. Chem. Phys., 1993, 98, 10089–10092 CrossRef CAS.
  48. J. L. Abascal and C. Vega, J. Chem. Phys., 2005, 123, 234505 CrossRef CAS PubMed.
  49. F. Eisenhaber, P. Lijnzaad, P. Argos, C. Sander and M. Scharf, J. Comput. Chem., 1995, 16, 273–284 CrossRef CAS.
  50. G.-R. Huang, Y. Wang, C. Do, L. Porcar, Y. Shinohara, T. Egami and W.-R. Chen, J. Phys. Chem. Lett., 2019, 10, 3978–3984 CrossRef CAS PubMed.
  51. M. Gerstein and C. Chothia, Proc. Natl. Acad. Sci. U. S. A., 1996, 93, 10167–10172 CrossRef CAS PubMed.


Electronic supplementary information (ESI) available. See DOI: 10.1039/c9cp06214a

This journal is © the Owner Societies 2020