Luisa A. Ferreiraᵃ,
Xiao Fanᵇ,
Pedro P. Madeiraᶜ,
Lukasz Kurganᵇ,
Vladimir N. Uversky*ᵈᵉᶠ and
Boris Y. Zaslavsky*ᵃ
ᵃ Analiza Inc, 3516 Superior Ave, Suite 4407B, Cleveland, OH, USA. E-mail: bz@analiza.com; Tel: +1-216-432-2700
ᵇ Department of Electrical and Computer Engineering, University of Alberta, Canada
ᶜ Laboratory of Separation and Reaction Engineering, Department of Chemical Engineering, Faculty of Engineering of the University of Porto, Rua Dr. Roberto Frias, 4200-465, Porto, Portugal
ᵈ Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, Florida 33612, USA. E-mail: vuversky@health.usf.edu; Fax: +1-813-974-7357; Tel: +1-813-974-5816
ᵉ Department of Biological Science, Faculty of Science, King Abdulaziz University, P.O. Box 80203, Jeddah 21589, Saudi Arabia
ᶠ Laboratory of Structural Dynamics, Stability and Folding of Proteins, Institute of Cytology, Russian Academy of Sciences, St. Petersburg, Russia
First published on 3rd July 2015
Partitioning of 11 globular proteins was examined in aqueous dextran–PEG–sodium/potassium phosphate buffer (0.01 M K/NaPB, pH 7.4) two-phase systems (ATPSs) containing 0.5 M sorbitol. The data obtained were analyzed together with those reported previously for the same proteins in osmolyte-free ATPS and ATPS containing 0.5 M sucrose, TMAO, or trehalose. It was found that all the partition coefficients for proteins determined in the presence of 0.5 M of different osmolytes and in the absence of osmolytes may be described in terms of the differences between solvent properties of the coexisting phases. Solute-specific coefficients characterizing different types of solute–solvent interactions were calculated for each protein. These solute-specific coefficients are linearly interrelated, implying cooperativity of different types of protein–water interactions. The data obtained indicate the lack of any association of the aforementioned osmolytes at a concentration of 0.5 M with proteins. Computational analysis of one of the solute-specific coefficients, Ss, which characterizes dipole–dipole protein–water interactions, shows that it is determined by the peculiarities of the protein surface.
It has long been believed that one of the potential mechanisms explaining the stabilizing action of osmolytes is preferential exclusion of osmolytes from the immediate vicinity of a protein.5,17 Since the osmolyte is preferentially excluded from the immediate protein vicinity, the protein is preferentially hydrated.5,18–21 In other words, the “folding” potential of protective osmolytes can be understood in terms of a model where a globular protein tends to adopt a folded conformation with a minimally exposed surface area due to the tendency of protective osmolytes to be excluded from the protein surface.8,22
Thermodynamically, the stabilization of a protein by protective osmolytes was attributed to the destabilization of the unfolded state of the protein in the presence of osmolyte rather than to the osmolyte-induced stabilization of the native state.5,14,15,21,23 Therefore, protecting osmolytes push the folding equilibrium toward native state by raising the free energy of the unfolded state, whereas denaturing osmolytes push the equilibrium toward the unfolded state by lowering the free energy of the unfolded state.24 The fact that protective osmolytes act primarily on the denatured states leaving native states mostly unaffected explains the ability of these important small organic compounds to stabilize proteins against the denaturation without affecting their biological functions.5
Since protecting/denaturing osmolytes interact unfavorably/favorably with the unfolded state, resulting in preferential depletion/accumulation of osmolyte proximate to the protein surface,24 the important question is by what mechanism osmolytes interact with the protein to affect its stability. It has been emphasized that osmolytes modulate protein stability predominantly affecting the protein backbone,14,25,26 with osmolyte polar groups being able to interact with the protein backbone more favorably than the osmolyte non-polar groups.24 This suggests that protein stabilization in the presence of osmolytes can be attributed to the net repulsive interaction between protecting osmolytes and the backbone of proteins.26–29
An alternative mechanism of osmolyte action is related to the potential effects of these small organic compounds on the solvent properties in the cellular environment. In this scenario, the presence of osmolytes indirectly modifies the stability of biological macromolecules via changes in the solvent properties.30,31 In fact, it was emphasized that the osmolyte-induced shift in the conformational equilibrium toward the protein native state might be rooted in the ability of protective osmolytes to induce asymmetric loss of protein conformational entropy, with greater entropic loss in the unfolded state.29,32 In this scenario, osmolytes reduce the entropy of the ensemble of unfolded conformations by increasing compactness of the unfolded state.29
In other words, in the presence of osmolytes, the unfolded state acquires a residual, partially collapsed structure characterized by a reduced number of solvent-accessible hydrophobic groups, resulting in a decreased number of water molecules that have to be immobilized upon unfolding.33 This ability of osmolytes to induce collapse of the unfolded state was demonstrated for several globular proteins, such as protein S6,34 RNase S,32 chymotrypsin inhibitor 2,35 and cutinase.33 Importantly, in almost all cases studied so far, the addition of osmolytes to the unfolded proteins resulted in the rapid collapse of the unfolded state to a non-native form with retarded refolding capabilities.32–35 Furthermore, it has been pointed out that different osmolytes can induce differently collapsed states in a given protein.35
The observations that osmolytes are able to induce compaction of unfolded states provide indirect support to the idea that the addition of osmolyte might change the properties of solvent. In fact, it is well known that the hydrodynamic dimensions of unfolded polymers dramatically depend on the quality of solvent.36,37 A poor solvent induces the attraction of macromolecular segments, resulting in the squeezing of a chain. On the other hand, in a good solvent, repulsive forces occur between segments, leading to the formation of a loose fluctuating coil.38 It is assumed that the concentrated solutions of urea and guanidinium hydrochloride (GdmHCl) are rather good solvents for polypeptide chains, with GdmHCl being closer to the ideal one.37,39 This difference in the solvent quality accounts for the noticeable divergences in the molecular mass dependencies of the hydrodynamic dimensions of the globular proteins unfolded by urea and GdmHCl.40–43 From this angle, the reported osmolyte-induced compaction of unfolded proteins indicates changes in solvent quality from good in the absence of osmolytes to poor in their presence.
In brief, the mechanism of stabilizing effects of osmolytes on proteins in aqueous solution on molecular level remains unclear. Although the prevalent view is based on the preferential solvation model, according to which osmolytes are excluded from protein surface and increase the free energy of protein unfolding,28,44 it is generally agreed that the water structure is changed in osmolyte solutions.29,45–53
Although the data accumulated so far are somewhat contradictory, the conclusion that the water properties in osmolyte solutions are changed relative to those of pure water seems unavoidable. This conclusion is confirmed by the data reported in the companion paper,54 where, based on the analysis of partitioning of small organic compounds in aqueous dextran–polyethylene glycol (PEG) two-phase systems containing different osmolytes (sorbitol, sucrose, trehalose, and TMAO) at a concentration of 0.5 M, it was concluded that the compound partition behavior may be described in terms of the solvent properties of the coexisting phases. This finding clearly indicates that the partition behavior of a solute is not associated with direct osmolyte–solute interactions and reflects osmolyte-induced changes in the solvent properties.54
The purpose of this study was to examine partitioning of proteins in the dextran–PEG–0.01 M K/NaPB–0.5 M sorbitol ATPS, and estimate the solute-specific coefficients for all the proteins to explore if all the osmolytes employed affect the protein partition behavior solely by affecting the solvent properties of the aqueous media.
000 by light scattering was purchased from USB Corporation (Cleveland, OH, USA).

| Protein* | Abbreviation | Molecular weight, kDa | pI |
|---|---|---|---|
| α-Chymotrypsin | CHY | 25.0 | 8.75 |
| α-Chymotrypsinogen A | CHTG | 25.7 | 8.97 |
| Concanavalin A | ConA | 104.0 | 4.5–5.5 |
| Hemoglobin human | HHb | 64.5 | 6.8 |
| β-Lactoglobulin A | bLGA | 18.3 | 5.3 |
| β-Lactoglobulin B | bLGB | 18.3 | 5.1 |
| Lysozyme | HEL | 14.3 | 11.0 |
| Papain | Pap | 23.4 | 8.75–9.55 |
| Ribonuclease A | RNase A | 13.7 | 9.63 |
| Ribonuclease B | RNase B | 17.0 | 8.88 |
| Trypsinogen | TRY | 24.0 | 8.7; 9.3 |
For the analysis of protein partitioning (with the exception of hemoglobin), aliquots of 30 μL from both phases were transferred into microplate wells and diluted with water up to 70 μL. The microplate was then sealed and briefly centrifuged (2 min at 1500 rpm), and, following moderate shaking for 45 min in an incubator at 37 °C, 250 μL of o-phthaldialdehyde reagent was added. After moderate shaking for 4 min at room temperature, fluorescence was determined using a fluorescence plate reader with a 360 nm excitation filter and a 460 nm emission filter, with a sensitivity setting of 100–125.
For the analysis of hemoglobin partitioning, aliquots of 50–120 μL from both phases were diluted up to 600 μL in 1.2 mL microtubes. Water was used as diluent. Following vortexing and a short centrifugation (12 min), aliquots of 250–300 μL were transferred into microplate wells, and the UV-Vis plate reader was used to measure optical absorbance at the wavelength previously determined to correspond to maximum absorption. In separate experiments we compared the protein partition coefficients under two conditions of mixing. One condition included vortexing of protein added to the polymer mixture. The other condition included vigorous vortexing of the polymer mixture to an apparently homogeneous state, followed by adding protein stock solution and subsequent “gentle” mixing with very brief and mild vortexing. The partition coefficients determined for each protein under these two different mixing conditions were identical.
The partition coefficient, K, is defined as the ratio of the sample concentration in the top phase to that in the bottom phase. The K-value for each protein was determined as the slope of the concentration (fluorescence intensity or absorbance depending on the protein) in the top phase plotted as a function of the concentration in the bottom phase averaged over the results obtained from two to four partition experiments carried out at the specified composition of the system. The deviation from the average K value was always less than 3% and in most cases lower than 1%.
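The slope-based definition of K can be illustrated with a short sketch. It assumes an ordinary least-squares line with an intercept (the text does not state how the slope was fitted), and the readings below are hypothetical values, not data from this study.

```python
def partition_coefficient(bottom, top):
    """Least-squares slope of the top-phase signal versus the bottom-phase signal."""
    n = len(bottom)
    if n != len(top) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_b = sum(bottom) / n
    mean_t = sum(top) / n
    sxx = sum((b - mean_b) ** 2 for b in bottom)
    sxy = sum((b - mean_b) * (t - mean_t) for b, t in zip(bottom, top))
    return sxy / sxx


# Hypothetical fluorescence readings (arbitrary units) for one protein;
# they were constructed as top = 0.42 * bottom + 3.
bottom = [100.0, 200.0, 300.0, 400.0]
top = [45.0, 87.0, 129.0, 171.0]

K = partition_coefficient(bottom, top)
print(round(K, 3))
```

Using the slope rather than a single concentration ratio makes the estimate insensitive to a constant background signal, which is presumably why several protein loadings are measured per system.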
- Voronoia (http://proteinformatics.charite.de/voronoia4rna/tools/v4rna/index)59 that was utilized to characterize packing and pockets in the structure. Ten descriptors characterize the size and quantity of pockets, and four more (average van der Waals volume, solvent-excluded volume, fraction of buried atoms and average packing density) describe packing (14 descriptors).
- CASTp (http://sts.bioe.uic.edu/castp/)60 that generates number, surface area and volume of pockets on the surface. Both raw and normalized by the protein size values (6 descriptors) were used.
- An algorithm based on ref. 61 to compute contact order (1 descriptor).
- YASARA (http://www.yasara.org/) that we used to compute radius of gyration, nuclear and van der Waals radii, content of six types of secondary structure (α-helix, 3₁₀-helix, both helix types, β-sheet, turns and coils), molecular mass, B-factor and occupancy (12 descriptors).
- DSSP (http://swift.cmbi.ru.nl/gv/dssp/)62 that quantifies surface area and secondary structure. Size and properties of the surface were characterized, including the fraction (in the whole protein chain) of surface residues; the fraction of polar, nonpolar, neutral, positively charged, and negatively charged residues on the surface; and the hydrophobicity of surface residues, estimated with three amino acid indices: the Kyte-Doolittle,63,64 Eisenberg,65 and Cid scales.66 Contents of α-helix, 3₁₀-helix, both helix types, β-sheet, β-bridge, both β structure types, turn, bend, and coil secondary structures (18 descriptors) were also computed.
We selected the subset of descriptors empirically using a greedy search. The search maximizes the Pearson correlation coefficient (PCC) between the outputs of the regression that uses a given subset of descriptors and the observed values. The PCC values were measured using three-fold cross validation on the considered set of 9 proteins to minimize overfitting to the dataset. Each descriptor was normalized to the [−1, 1] interval using its maximum absolute value, and the regression model was built using a subset of i descriptors. First, i was initialized to 2, all pairs of descriptors were considered, and the pair giving the highest PCC was selected. Next, i was incremented by 1 for as long as the regression that secured the maximal PCC with i descriptors increased the PCC by a noticeable margin (0.03) compared to the model with i − 1 descriptors. As a result, four descriptors were selected to build the regression model.
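The selection procedure described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes ordinary least squares with an intercept, folds assigned by sample index modulo 3, and growth of the subset by adding one descriptor at a time to the best pair. All function names and the demonstration data are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit(rows, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def pcc(u, v):
    """Pearson correlation coefficient (0.0 if either input is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0


def cv_pcc(x, y, cols, folds=3):
    """PCC between observed values and out-of-fold regression predictions."""
    pred = [0.0] * len(y)
    for k in range(folds):
        train = [i for i in range(len(y)) if i % folds != k]
        beta = fit([[x[i][c] for c in cols] for i in train], [y[i] for i in train])
        for i in range(k, len(y), folds):
            pred[i] = beta[0] + sum(b * x[i][c] for b, c in zip(beta[1:], cols))
    return pcc(y, pred)


def greedy_select(x, y, margin=0.03):
    """Best pair first, then add descriptors while the PCC gain exceeds margin."""
    n_desc = len(x[0])
    scale = [max(abs(r[c]) for r in x) or 1.0 for c in range(n_desc)]
    xs = [[r[c] / scale[c] for c in range(n_desc)] for r in x]  # map to [-1, 1]
    best = max(combinations(range(n_desc), 2), key=lambda cols: cv_pcc(xs, y, cols))
    best_score = cv_pcc(xs, y, best)
    while len(best) < n_desc:
        cand = max((best + (c,) for c in range(n_desc) if c not in best),
                   key=lambda cols: cv_pcc(xs, y, cols))
        score = cv_pcc(xs, y, cand)
        if score - best_score <= margin:
            break
        best, best_score = cand, score
    return list(best), best_score


# Hypothetical data: 9 samples, 4 descriptors; y depends on descriptors 0 and 2 only.
d0 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
d1 = [2, 7, 1, 8, 2, 8, 1, 8, 2]
d2 = [3, 1, 4, 1, 5, 9, 2, 6, 5]
d3 = [1, 4, 1, 4, 2, 1, 3, 5, 6]
x = [list(map(float, row)) for row in zip(d0, d1, d2, d3)]
y = [2.0 * a - 3.0 * c + 1.0 for a, c in zip(d0, d2)]

selected, score = greedy_select(x, y)
print(selected, round(score, 3))
```

Scoring on out-of-fold predictions rather than on the fitted values is what keeps the search from simply rewarding larger models, which matters with only nine proteins.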
It has been shown that the partition coefficient of a solute in an ATPS can be described as:72–78
$$\log K_i = S_s\,\Delta\pi^*_i + B_s\,\Delta\alpha_i + A_s\,\Delta\beta_i + C_s\,c_i \tag{1}$$
The solute specific coefficients may be determined for a given compound (including proteins) by the analysis of partition coefficients of this compound in multiple ATPSs with different polymer but same ionic composition with established solvent properties of the phases. Once Δπ*, Δα, Δβ, and c parameters in 5–10 different ATPSs are determined, the solute specific coefficients can be calculated by multiple linear regression analysis using eqn (1). It was shown75 also that the partition coefficient of a compound with pre-determined solute specific coefficients in a “new” ATPS with established solvent properties of the phases could be predicted with 90–95% accuracy.
It is important to emphasize that the partition coefficients of a solute in multiple ATPSs with different additives would fit eqn (1) only if the solute–solvent interactions would vary due to different solvent properties of the phases and there would be no association of additives with the solute. It was established72,74 that while the minimal number of different ATPSs to be used for determination of solute-specific coefficients is five, using a set of 10 different ATPS provides much more reliable values of the solute-specific coefficients.
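The regression step can be sketched as follows. The sketch assumes that eqn (1) is fitted without an intercept and that the row labelled C in the table of solvent properties plays the role of the parameter c. The solvent-property differences are copied from that table, whereas the "true" coefficients are hypothetical and serve only to generate noise-free log K values that the fit should recover.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_solute_coefficients(phases, log_k):
    """Least-squares (Ss, Bs, As, Cs) for eqn (1), no intercept, normal equations."""
    p = len(phases[0])
    xtx = [[sum(r[i] * r[j] for r in phases) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * k for r, k in zip(phases, log_k)) for i in range(p)]
    return solve(xtx, xty)


# (delta pi*, delta alpha, delta beta, c) for each ATPS, from the table of
# solvent-property differences; c is assumed to be the row labelled "C".
PHASES = [
    (-0.042, -0.051, 0.006, 0.058),  # 0.01 M K/NaPB
    (-0.042, -0.066, 0.006, 0.090),  # 0.5 M sorbitol
    (-0.073, -0.046, 0.023, 0.110),  # 0.5 M sucrose
    (-0.042, -0.081, 0.006, 0.113),  # 0.5 M trehalose
    (-0.031, -0.074, 0.009, 0.083),  # 0.5 M TMAO
]

# Hypothetical solute-specific coefficients (Ss, Bs, As, Cs), illustration only.
TRUE = (5.0, 4.0, -1.0, 7.0)
log_k = [sum(c * v for c, v in zip(TRUE, row)) for row in PHASES]

estimated = fit_solute_coefficients(PHASES, log_k)
print([round(v, 3) for v in estimated])
```

With five systems and four coefficients only one degree of freedom remains, which is consistent with the remark that ten ATPSs give much more reliable values than the minimum of five.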
| Difference between solvent properties of coexisting phases | 0.01 M K/NaPB | 0.5 M sorbitol | 0.5 M sucroseᵃ | 0.5 M trehaloseᵃ | 0.5 M TMAOᵃ |
|---|---|---|---|---|---|
| ᵃ Data reported in ref. 70 and 71 and presented for comparison. | | | | | |
| ΔG(CH2)*, cal mole⁻¹ | −45 ± 1.2 | −43 ± 1.1 | −39.4 ± 0.44 | −47.7 ± 0.6 | −40.9 ± 0.6 |
| E | 0.033 ± 0.001 | 0.032 ± 0.002 | 0.029 ± 0.001 | 0.035 ± 0.001 | 0.028 ± 0.001 |
| C | 0.058 ± 0.003 | 0.090 ± 0.003 | 0.110 ± 0.002 | 0.113 ± 0.002 | 0.083 ± 0.002 |
| Δπ* | −0.042 ± 0.002 | −0.042 ± 0.004 | −0.073 ± 0.004 | −0.042 ± 0.003 | −0.031 ± 0.002 |
| Δα | −0.051 ± 0.003 | −0.066 ± 0.003 | −0.046 ± 0.005 | −0.081 ± 0.003 | −0.074 ± 0.003 |
| Δβ | 0.006 ± 0.004 | 0.006 ± 0.005 | 0.023 ± 0.006 | 0.006 ± 0.005 | 0.009 ± 0.008 |
| Protein | Partition coefficients | | | | |
| α-Chymotrypsin | 0.42 ± 0.01 | 0.427 ± 0.008 | 0.42 ± 0.01 | 0.41 ± 0.01 | 0.42 ± 0.01 |
| α-Chymotrypsinogen A | 1.00 ± 0.01 | 1.5 ± 0.014 | 1.78 ± 0.02 | 1.93 ± 0.01 | 1.37 ± 0.02 |
| Concanavalin A | 0.236 ± 0.003 | 0.237 ± 0.003 | 0.242 ± 0.003 | 0.226 ± 0.003 | 0.233 ± 0.004 |
| Hemoglobin human | 0.129 ± 0.005 | 0.111 ± 0.002 | 0.118 ± 0.003 | 0.091 ± 0.002 | 0.208 ± 0.002 |
| β-Lactoglobulin A | 0.46 ± 0.01 | 0.329 ± 0.004 | 0.309 ± 0.004 | 0.255 ± 0.003 | 0.505 ± 0.005 |
| β-Lactoglobulin B | 0.33 ± 0.01 | 0.191 ± 0.003 | 0.211 ± 0.003 | 0.151 ± 0.003 | 0.27 ± 0.007 |
| Lysozyme | 0.23 ± 0.003 | 0.331 ± 0.004 | 0.325 ± 0.004 | 0.318 ± 0.002 | 0.255 ± 0.009 |
| Papain | 1.05 ± 0.01 | 1.29 ± 0.01 | 1.27 ± 0.01 | 1.37 ± 0.01 | 1.21 ± 0.02 |
| Ribonuclease A | 0.313 ± 0.005 | 0.329 ± 0.003 | 0.332 ± 0.006 | 0.311 ± 0.003 | 0.304 ± 0.006 |
| Ribonuclease B | 0.781 ± 0.004 | 0.334 ± 0.004 | 0.347 ± 0.005 | 0.318 ± 0.004 | 0.768 ± 0.004 |
| Trypsinogen | 0.357 ± 0.005 | 0.432 ± 0.009 | 1.463 ± 0.008 | 0.413 ± 0.006 | 0.431 ± 0.004 |
Since the hydrophobic effect is one of the main driving forces of protein folding, we examined the relationships between the partition behavior of the proteins and the difference between the hydrophobic properties of the phases. However, the results did not provide any insight into the mechanism of osmolyte effects, and hence are not discussed here. The likely reason is that although the hydrophobic effect plays a crucial role in protein folding, the partition behavior of proteins is governed by the interactions of water with the protein surface, and these are the interactions explored in our study.
It has been demonstrated previously71 that logarithms of partition coefficients of proteins in dextran–PEG–0.5 M osmolyte–0.01 M K/NaPB are interrelated according to eqn (2):
$$\ln K_i^{\text{0.5 M TMAO--0.01 M K/NaPB}} = 0.13_{\pm 0.06} + 0.29_{\pm 0.095}\,\ln K_i^{\text{0.5 M sucrose--0.01 M K/NaPB}} + 0.8_{\pm 0.13}\,\ln K_i^{\text{0.01 M K/NaPB}} \tag{2}$$
N = 10; R² = 0.9856; SD = 0.08; F = 239
A similar relationship, illustrated graphically in Fig. 1, exists between the logarithms of partition coefficients of proteins in dextran–PEG–0.5 M osmolyte–0.01 M K/NaPB. The relationship in Fig. 1 may be described as:
$$\ln K_i^{\text{0.5 M trehalose--0.01 M K/NaPB}} = -0.02_{\pm 0.04} - 0.27_{\pm 0.09}\,\ln K_i^{\text{0.01 M K/NaPB}} + 1.36_{\pm 0.07}\,\ln K_i^{\text{0.5 M sorbitol--0.01 M K/NaPB}} \tag{3}$$
N = 11; R² = 0.9963; SD = 0.059; F = 1082
Furthermore, Fig. 2 shows that an analogous relationship can be obtained for the logarithms of partition coefficients of proteins in ATPSs containing three different osmolytes:
$$\ln K_i^{\text{0.5 M sorbitol--0.01 M K/NaPB}} = -0.05_{\pm 0.03} + 0.18_{\pm 0.07}\,\ln K_i^{\text{0.5 M TMAO--0.01 M K/NaPB}} + 0.74_{\pm 0.05}\,\ln K_i^{\text{0.5 M trehalose--0.01 M K/NaPB}} \tag{4}$$
N = 11; R² = 0.9953; SD = 0.058; F = 853.4
It was suggested previously54 that relationships of the type represented by eqn (2)–(4) imply that the proteins respond to their environment in aqueous solutions depending on the environment and the protein structure. These relationships also seem to imply that the responses are governed by changes in the protein–water interactions and not by specific binding with the components of the environment.
| Protein | Ss | As | Bs | Cs | SD, F |
|---|---|---|---|---|---|
| ᵃ Solute specific coefficients represent the following solute–water interactions: Ss – dipole–dipole interactions; As – hydrogen bonding with solute as a donor; Bs – hydrogen bonding with solute as an acceptor; Cs – induced dipole–ion interactions. ᵇ Statistical significance p-value (not shown for p < 0.0001). ᶜ 0, solute-specific coefficients could not be reliably determined (with p < 0.1) and in subsequent calculations are taken as 0. | | | | | |
| α-Chymotrypsinogen, p-valuesᵇ | 5.05 ± 0.03 | −0.60 ± 0.08, 0.017 | 4.42 ± 0.02 | 7.60 ± 0.02 | 0.0005; 171 443 |
| β-Lactoglobulin A, p-valuesᵇ | 6.0 ± 0.4, 0.0002 | 21.0 ± 1.0, 0.002 | −3.1 ± 0.3, 0.008 | −6.3 ± 0.3, 0.002 | 0.006; 6214 |
| β-Lactoglobulin B, p-valuesᵇ | 5.0 ± 1.0, 0.04 | 17 ± 3, 0.01 | 0ᶜ | −6.7 ± 0.5, 0.0009 | 0.02; 1274 |
| RNase A, p-valuesᵇ | 7.61 ± 0.05 | −1.8 ± 0.1, 0.006 | 8.83 ± 0.04 | 4.77 ± 0.04 | 0.001; 360 758 |
| RNase B, p-valuesᵇ | 6.9 ± 0.3, 0.0002 | 0ᶜ | 7.4 ± 0.3 | 3.5 ± 0.3, 0.001 | 0.007; 7036 |
| Papain, p-valuesᵇ | 2.0 ± 0.5, 0.02 | 0ᶜ | 0.9 ± 0.4, 0.07 | 2.7 ± 0.4, 0.008 | 0.01; 153.2 |
| Trypsinogen, p-valuesᵇ | 8 ± 1, 0.004 | 0ᶜ | 8.2 ± 0.8, 0.002 | 6 ± 1, 0.01 | 0.02; 414.9 |
| Lysozyme, p-valuesᵇ | 13 ± 1, 0.003 | 0ᶜ | 13 ± 1, 0.001 | 10 ± 1, 0.005 | 0.03; 507.3 |
| Chymotrypsin | 6.00 ± 0.02 | 0ᶜ | 5.88 ± 0.02 | 3.01 ± 0.02 | 0.0006; 748 714 |
| Concanavalin A, p-valuesᵇ | 9.8 ± 0.3 | 0ᶜ | 10.1 ± 0.2 | 5.1 ± 0.2, 0.0002 | 0.006; 19 303 |
| Hemoglobin, p-valuesᵇ | 20.7 ± 0.5 | 35 ± 1.3, 0.0001 | 4.5 ± 0.2, 0.0001 | 0ᶜ | 0.009; 17 992 |
If one or more coefficients had a p-value > 0.1, equations containing different combinations of coefficients were examined. The equation with the set of coefficients providing p-values below or equal to 0.1 for all parameters was accepted. The solute-specific coefficients determined for each compound are presented in Table 3 together with the corresponding p-values (except the cases when p < 0.001).
The previously reported data for small polar organic compounds54 demonstrated cooperativity between different types of solute–water interactions displayed as a linear interrelationship between different solute-specific coefficients.
Analysis of the data in Table 3 shows that similar cooperativity exists for proteins as well in agreement with the data reported previously.72,74
The interrelationship between solute-specific coefficients Ss, As, and Bs listed in Table 3 is plotted in Fig. 3. This relationship may be described as:
$$S_s^{\text{protein } i} = -0.1_{\pm 0.7} + 0.44_{\pm 0.03}\,A_s^{\text{protein } i} + 1.01_{\pm 0.08}\,B_s^{\text{protein } i} \tag{5}$$
N = 11; R² = 0.9635; SD = 1.1; F = 105.7
Further analysis of the data in Table 3 indicates that there is a linear relationship between the solute-specific coefficients Cs and Bs illustrated graphically in Fig. 4 and described as:
$$C_s^{\text{protein } i} = -3.5_{\pm 0.6} + 0.98_{\pm 0.08}\,B_s^{\text{protein } i} \tag{6}$$
N = 8; R² = 0.9624; SD = 1.0; F = 153.7
We have established54 that there is a linear relationship between solute-specific coefficients Cs, Bs, and Ss for polar organic compounds in the presence of 0.01 M K/NaPB, pH 7.4.
In the case of nonionic polar compounds the value of solute-specific coefficient Cs might be explained by dipole–ion interactions54 and cooperativity between this type of interactions and dipole–dipole and hydrogen-bonding solute–solvent interactions. In the case of proteins with multiple ionizable and nonionic polar groups it is difficult to separate ion–ion, ion–dipole, and dipole–dipole interactions, and therefore it might be expected that the relationship under discussion would be much less clear cut and not hold for different proteins.
In order to compare the solute-specific coefficients for proteins determined in the presence of 0.01 M K/NaPB, pH 7.4 with those for the same proteins determined in different ionic environment we re-calculated the previously reported data72 using the p-value based approach described above.
The solute-specific coefficients for the proteins examined in ref. 72 are presented in Table 4 with the corresponding p-values.
| Protein | Ss | As | Bs | Cs |
|---|---|---|---|---|
| ᵃ Solute specific coefficients represent the following solute–water interactions: Ss – dipole–dipole interactions; As – hydrogen bonding with solute as a donor; Bs – hydrogen bonding with solute as an acceptor; Cs – induced dipole–ion interactions. ᵇ Statistical significance p-value (not shown for p < 0.0001). ᶜ 0, solute-specific coefficients could not be reliably determined (with p < 0.1) and in subsequent calculations are taken as 0. | | | | |
| α-Chymotrypsinogen, p-valuesᵇ | −4 ± 1, 0.01 | −4.0 ± 1.4, 0.03 | −1.5 ± 0.8, 0.1 | 0ᶜ |
| γ-Globulin human, p-valuesᵇ | 9 ± 2.7, 0.02 | 0ᶜ | 12 ± 2.1, 0.001 | 13 ± 2.7, 0.003 |
| γ-Globulin bovine, p-valuesᵇ | 10 ± 2.8, 0.02 | 0ᶜ | 13 ± 2.1, 0.002 | 12 ± 3, 0.01 |
| Hemoglobin bovine, p-valuesᵇ | 7 ± 1.3, 0.002 | 0ᶜ | 8 ± 1, 0.0002 | 7 ± 1.3, 0.001 |
| Hemoglobin human, p-valuesᵇ | 5 ± 1.2, 0.006 | 0ᶜ | 6.1 ± 0.9, 0.0005 | 6 ± 1.2, 0.002 |
| β-Lactoglobulin, p-valuesᵇ | 8 ± 1.7, 0.003 | 0ᶜ | 9 ± 1.3, 0.0005 | 5 ± 1.7, 0.02 |
| Lipase, p-valuesᵇ | 1.8 ± 0.1, 0.0002 | 0.6 ± 0.2, 0.02 | 1.8 ± 0.1 | 0.6 ± 0.1, 0.006 |
| Lysozyme, p-valuesᵇ | −4.3 ± 0.5, 0.0002 | −3.7 ± 0.9, 0.006 | −3.1 ± 0.4, 0.0004 | 2.5 ± 0.5, 0.003 |
| Myoglobin, p-valuesᵇ | 6.3 ± 0.7, 0.0003 | 3 ± 1.1, 0.08 | 6.7 ± 0.6 | 2.9 ± 0.8, 0.01 |
| RNase A, p-valuesᵇ | 2.4 ± 0.3, 0.0003 | 0ᶜ | 3.4 ± 0.3 | 2.5 ± 0.3, 0.0002 |
| RNase B, p-valuesᵇ | 2.4 ± 0.5, 0.002 | 0ᶜ | 3.6 ± 0.4 | 2.3 ± 0.4, 0.0002 |
| Transferrin, p-valuesᵇ | 14 ± 2, 0.0004 | 0ᶜ | 16 ± 1.5 | 12 ± 1.9, 0.0009 |
| Trypsinogen, p-valuesᵇ | −1.5 ± 0.7, 0.08 | −3 ± 1.1, 0.03 | 0ᶜ | 2.4 ± 0.8, 0.02 |
Analysis of the data in Table 4 shows the cooperativity between the solute-specific coefficients for the proteins illustrated graphically in Fig. 5A and B and described as:
$$S_{sj}^{\text{protein } i} = -0.3_{\pm 0.3} + 0.4_{\pm 0.1}\,A_{sj}^{\text{protein } i} + 0.85_{\pm 0.03}\,B_{sj}^{\text{protein } i} \tag{7}$$
N = 13; R² = 0.9915; SD = 0.55; F = 580.7
$$C_{sj}^{\text{protein } i} = -0.7_{\pm 0.8} + 2.5_{\pm 0.5}\,B_{sj}^{\text{protein } i} - 1.9_{\pm 0.5}\,S_{sj}^{\text{protein } i} \tag{8}$$
N = 13; R² = 0.9068; SD = 1.5; F = 48.7
Comparison of the above relationships observed for two sets of proteins (six of which are common to both sets) in different ionic environments shows that the regression coefficients at the As parameter in eqn (5) and (7) are identical within the error limits, whereas the regression coefficient at the Bs parameter differs ∼1.2-fold (1.01 in eqn (5) versus 0.85 in eqn (7)). The possible reason for this difference is that the partitioning of proteins described by eqn (5) and (7) was examined under different ionic compositions of the media: in the presence of 0.01 M K/Na–phosphate buffer (eqn (5)) and in the presence of 0.15 M NaCl in 0.01 M Na–phosphate buffer (eqn (7)). Both relationships for the Cs parameter (eqn (6) and (8)) are less reliable than those for the Ss parameter described by eqn (5) and (7), likely due to the aforementioned reasons.
We computed Pearson correlation coefficients (PCCs) between each of the 57 descriptors that characterize structural properties of the considered proteins (see Materials and Methods) and the observed values of Ss to investigate whether the observed partition-based solute-specific coefficients correlate with these structural properties. In view of linear relationships observed between different solute-specific coefficients for proteins the analysis under consideration may be performed for any single solute-specific coefficient.
We chose solute-specific coefficient Ss as the one with non-zero values for all the proteins. Seven of the structural descriptors have modest correlations above 0.3 and three of them have high correlations above 0.5.
Most of the correlated structural parameters are related to some characteristics of the protein surface. They include the amount of positively charged, neutral, and negatively charged residues on the surface (Fig. 6A), hydrophobicity of the surface residues (Fig. 6B), size and volume of pockets on the surface, and content of β-sheets in the protein fold. Fig. 6A shows that normalized (by protein size) amount of the positively charged residues on the surface is negatively correlated with Ss (PCC = −0.43), while the normalized amount of negatively charged surface residues is positively correlated (PCC = 0.73). Average hydrophobicity of the surface residues is positively correlated with Ss (PCC = 0.53), suggesting that higher value of Ss corresponds to higher hydrophobicity of the surface.
In short, our empirical analysis suggests that the partition behavior of a given protein is determined by the peculiarities of its surface.
$$y = -0.0004_{\pm 39.96}\,x_1 - 781.6914_{\pm 45.05}\,x_2 - 0.9639_{\pm 69.86}\,x_3 + 0.0040_{\pm 103.19}\,x_4 + 21.1173_{\pm 68.79} \tag{9}$$
| Proteins | Ss | x1 mass | x2 fraction of positively charged residues on the surface | x3 normalized area of pockets on the surface | x4 volume of pockets on the surface | Output from regression |
|---|---|---|---|---|---|---|
| CHY | 6.00 | 26 253.159 | 0.004219 | 8.307 | 1512.6 | 5.79 |
| CHTG | 5.05 | 32 534.909 | 0.004149 | 10.913 | 2529.6 | 5.00 |
| bLGA | 6.00 | 17 502.003 | 0.006579 | 9.944 | 1555.8 | 5.91 |
| bLGB | 5.00 | 17 850.429 | 0.006410 | 20.550 | 3891.0 | 5.06 |
| RNaseA | 7.61 | 14 390.182 | 0.008065 | 3.623 | 459.6 | 7.64 |
| TRY | 8.00 | 28 568.447 | 0.000000 | 7.508 | 1305.7 | 8.14 |
| ConA | 9.80 | 29 439.827 | 0.000000 | 0.000 | 0.0 | 9.80 |
| Pap | 2.00 | 25 539.630 | 0.009434 | 6.468 | 1101.6 | 2.11 |
| HHb | 20.70 | 31 490.614 | 0.003484 | 16.990 | 7669.8 | 20.70 |
| Correlation with Ss | | 0.36 | −0.43 | 0.23 | 0.70 | 0.99 |
We report the standard errors for each estimated coefficient in the regression model. The five coefficients are statistically significant with p-values below 0.001. We also estimated the relative contributions of individual descriptors in the regression by normalizing their values and scaling the absolute values of the corresponding coefficients to sum to 1. The rescaled coefficients are −0.1774, −0.1046, −0.2811, and 0.4369 for x1, x2, x3, and x4, respectively. These values indicate that the three descriptors that characterize the surface (x2, x3 and x4) are the main determinants of the value computed by the regression; their absolute values sum to |−0.1046| + |−0.2811| + |0.4369| = 0.8226 out of 1. This suggests that the value of Ss is primarily influenced by the properties of the protein surface, including the positive charge and the area and volume of pockets.
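The relative-contribution arithmetic can be checked with a few lines of code. This is a minimal sketch: the function name is hypothetical, and the input values are the four rescaled coefficients quoted in the text.

```python
def relative_contributions(coefficients):
    """Scale the absolute coefficient values so that they sum to 1."""
    total = sum(abs(c) for c in coefficients)
    return [abs(c) / total for c in coefficients]


# Rescaled coefficients quoted in the text for x1, x2, x3 and x4.
reported = [-0.1774, -0.1046, -0.2811, 0.4369]

shares = relative_contributions(reported)
surface_share = sum(shares[1:])  # x2, x3 and x4 describe the protein surface
print(round(surface_share, 4))
```

Because the quoted values already sum to 1 in absolute value, the function leaves them unchanged apart from the sign; applied to unscaled coefficients of normalized descriptors it would perform the rescaling itself.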
The outputs generated by the regression have high PCC with the observed data that equals 0.999 (0.998 based on the three-fold cross validation). This value is substantially larger than the PCC of 0.73 calculated for the best single descriptor, fraction of negatively charged residues on the surface (Fig. 6A).
The relation between the observed values of Ss and the values generated using regression is illustrated graphically in Fig. 7. The corresponding data for the three-fold cross validation analysis are also shown for comparison.
Our computational analyses reveal that the partition behavior of proteins is primarily determined by the peculiarities of their surfaces including positive charge and the area and volume of cavities on the surface. We observe that higher values of Ss correlate with lower numbers of positively charged surface residues and similarly they correlate with higher areas and volumes of pockets; last row in Table 5 shows the corresponding correlations.
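The single-descriptor correlations can be recomputed from the tabulated data. The sketch below uses the Ss, x2 and x4 columns of Table 5 as printed there; the function name is hypothetical, and the results should come out close to the −0.43 and 0.70 listed in the last row of that table.

```python
def pearson(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den


# Columns of Table 5, in row order:
# CHY, CHTG, bLGA, bLGB, RNaseA, TRY, ConA, Pap, HHb
ss = [6.00, 5.05, 6.00, 5.00, 7.61, 8.00, 9.80, 2.00, 20.70]
x2 = [0.004219, 0.004149, 0.006579, 0.006410, 0.008065,
      0.000000, 0.000000, 0.009434, 0.003484]  # fraction of positively charged surface residues
x4 = [1512.6, 2529.6, 1555.8, 3891.0, 459.6,
      1305.7, 0.0, 1101.6, 7669.8]             # volume of pockets on the surface

r_charge = pearson(ss, x2)
r_volume = pearson(ss, x4)
print(round(r_charge, 2), round(r_volume, 2))
```

With nine proteins a single point can dominate a correlation, so it is worth noting that hemoglobin has by far the largest Ss and the largest pocket volume in the table.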
This journal is © The Royal Society of Chemistry 2015