Structure and kinetics of chemically cross-linked protein gels from small-angle X-ray scattering

Glutaraldehyde (GA) reacts with amino groups in proteins, forming intermolecular cross-links that, at sufficiently high protein concentration, can transform a protein solution into a gel. Although GA has been used as a cross-linking reagent for decades, neither the cross-linking chemistry nor the microstructure of the resulting protein gel has been clearly established. Here we use small-angle X-ray scattering (SAXS) to characterise the microstructure and structural kinetics of gels formed by cross-linking of pancreatic trypsin inhibitor, myoglobin or intestinal fatty acid-binding protein. By comparing the scattering from gels and dilute solutions, we extract the structure factor and the pair correlation function of the gels. The protein gels are spatially heterogeneous, with dense clusters linked by sparse networks. Within the clusters, adjacent protein molecules are almost in contact, but the protein concentration in the cluster is much lower than in a crystal. At the $\sim$1 nm SAXS resolution, the native protein structure is unaffected by cross-linking. The cluster radius is in the range 10-50 nm, with the cluster size determined mainly by the availability of lysine amino groups on the protein surface. The development of structure in the gel, on time scales from minutes to hours, appears to obey first-order kinetics. Cross-linking is slower at acidic pH, where the population of amino groups in the reactive deprotonated form is low. These results support the use of cross-linked protein gels in NMR studies of protein dynamics and for modelling NMR relaxation in biological tissue.


Introduction
Glutaraldehyde (GA; 1,5-pentanedial) has been widely used during the past 50 years to immobilise and stabilise proteins through covalent intermolecular cross-links. 1,2 This bifunctional reagent has been used as a fixative in studies of cell or tissue ultrastructure, 3,4 to stabilise protein crystals for X-ray diffraction, 5,6 and to characterise the quaternary structure of proteins in solution. 7,8][11][12][13] The present study was motivated by yet another application of protein cross-linking: water NMR studies of biological systems. In a protein solution, all anisotropic nuclear spin couplings are averaged out by protein tumbling. As a result, the longitudinal relaxation of the water (¹H, ²H or ¹⁷O) magnetisation only reports on molecular motions faster than the protein's tumbling time (typically, ∼ 10 ns). Protein cross-linking profoundly alters the NMR conditions, allowing motions on time scales up to hundreds of µs to influence the relaxation. In the NMR context, cross-linked protein gels were first used as model systems for biological tissue, [14][15][16][17][18] wherein the proteins are largely immobilised, 19 in efforts to elucidate the molecular determinants of the water ¹H relaxation that governs contrast in magnetic resonance images of soft tissue. More recently, cross-linked protein gels have been used in ²H and ¹⁷O magnetic relaxation dispersion (MRD) studies of intermittent protein dynamics on the ns-µs time scale. 20,213][24][25][26] Despite extensive study, the details of the reaction mechanism remain poorly understood. 1,28][29][30][31][32] These equilibria, which depend on pH, temperature and concentration, may account for the efficiency of GA as a cross-linking agent by allowing it to form linkers of variable length.
In quantitative MRD studies of protein dynamics, the cross-links should ideally inhibit protein tumbling without affecting the internal (conformational) dynamics of the protein. A necessary condition for this is that the protein structure is unaffected by GA cross-linking, except locally at the chemically modified residues. For protein crystals, X-ray diffraction demonstrates that the structural perturbation caused by cross-linking is indeed local. 6,26 For cross-linked protein gels, the evidence is less direct, but the limited results available so far have not revealed any significant differences in internal protein dynamics between gel and solution. 20
The protein gels used in MRD studies are formed by adding GA to protein solutions at concentrations where the protein molecules are separated by several water layers. 20,21 But even if the protein is amply hydrated on average, the protein molecules may not be uniformly distributed. Cross-linking may well produce a gel structure with dense, tightly cross-linked protein clusters connected by more dilute, weakly cross-linked networks. Even if such spatial heterogeneity has little effect on the internal dynamics, water dynamics in the first hydration layer on the protein surface would be affected. 20,21
To our knowledge, the structure of chemically cross-linked protein gels has not been examined directly. Such studies would have implications for the interpretation of MRD data from cross-linked protein gels and, more generally, would further our understanding of protein cross-linking by GA. A technique suitable for this task is small-angle X-ray scattering (SAXS), 33,34 which can provide information about gel structure via the structure factor, essentially the Fourier transform of the protein-protein pair correlation function, as well as about the integrity of the protein's tertiary structure via the form factor.
In protein science, SAXS has proven useful for determining the low-resolution structure of monomeric and oligomeric proteins in dilute solution, 35,36 but this technique has also been used to study protein-protein interactions in more concentrated solutions [37][38][39] and to obtain structural information about more complex protein systems, such as casein micelles, 40 gluten films 41 and gels of heat-denatured proteins. 42
Here, we report SAXS data for GA cross-linked gels of three proteins: bovine pancreatic trypsin inhibitor (BPTI), equine skeletal muscle myoglobin (Mb) and rat intestinal fatty acid-binding protein (IFABP). MRD studies of µs protein dynamics and internal water exchange in these protein gels have already been performed (BPTI and Mb) 20,21 or are currently underway (IFABP). 43 For each protein, we analyse the scattering intensity profiles in terms of the inhomogeneous protein distribution in the gel. We also report time-resolved SAXS measurements of the cross-linking kinetics.

BPTI.
BPTI (trade name Trasylol, batch 9104; 97 % purity by HPLC) was obtained from Bayer HealthCare AG (Wuppertal, Germany). To remove residual salt, the protein was extensively dialysed against MilliQ water (Millipore) and then lyophilised.

Mb.
Equine skeletal muscle Mb (≥ 95 %) was purchased from Sigma. The protein was further purified by cation-exchange chromatography (SP sepharose; GE Healthcare), dialysed against MilliQ water and lyophilised.

IFABP.
The gene encoding rat IFABP was codon-optimised for expression in Escherichia coli and synthesised by DNA2.0 (Menlo Park, CA, USA). The synthetic DNA was inserted into the pNIC28-Bsa4 plasmid 44 for expression. The expression construct yields a fusion protein containing, from the N-terminus, the His6-tag, tobacco etch virus (TEV) protease cleavage site and IFABP. The fusion protein was over-expressed using the E. coli TUNER(DE3) strain (Novagen) in Terrific Broth (Difco). After harvesting, the bacterial cells were suspended in a lysis buffer (50 mM sodium phosphate, 300 mM NaCl, 10 mM imidazole, pH 8.0) and homogenised by French press. The cell lysate was ultracentrifuged and the supernatant was subjected to His6-tag affinity chromatography (HisTrap; GE Healthcare). The His6-tag of the fusion protein was then cleaved off by TEV protease. After the protease digestion, the His6-tag and the protease were removed by passing the solution through the HisTrap column and the flow-through fraction containing IFABP was collected. IFABP was then delipidated by using a Lipidex-1000 (Perkin Elmer) column. The IFABP solution was then dialysed against MilliQ water and lyophilised.

SAXS samples.
The lyophilised proteins were dissolved in MilliQ water (cross-linked BPTI and solution samples for all proteins) or in a buffer solution (50 mM PIPES for cross-linked Mb, 50 mM sodium phosphate for cross-linked IFABP). The solution was then centrifuged at 13 000 rpm for 3 min to remove any insoluble proteins. To prepare cross-linked samples, the protein solution was supplemented with 25 % glutaraldehyde solution (Sigma). After vigorous mixing, the solution was transferred to a 1.5 mm o.d. borosilicate capillary (Hilgenberg GmbH) where the cross-linking reaction proceeded at 6 °C. Approximately 50 µl of the solution was reserved for pH measurement. SAXS measurements were also performed on protein solutions without GA. The pH of these solution samples was adjusted to match that of the corresponding cross-linked sample by adding either HCl or NaOH.

SAXS measurements
SAXS experiments were carried out at the I911-4 beamline 45 of the MAX-lab synchrotron using a wavelength of 0.91 Å. The sample, contained either in a capillary (gels) or in a flow-through cell (solutions), was maintained at 20 °C or, for the kinetics experiments, at 6 °C. For each protein sample, a pure solvent sample (MilliQ water or buffer solution) was also measured. Two-dimensional SAXS images were recorded with a PILATUS 1M detector (Dectris) with an exposure time of 10 s (kinetics series for Mb and IFABP) or 60 s (in all other cases). Control measurements were performed to ensure that the results were not compromised by radiation damage. The scattering vector q range (q = 4π sin θ/λ, where λ is the wavelength and 2θ is the scattering angle) was calibrated with a silver behenate sample. Reported scattering profiles I(q) were obtained as the difference of the azimuthally averaged SAXS 2D images from sample and solvent.
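The definition q = 4π sin θ/λ can be made concrete with a short sketch. The wavelength is the one stated above; the scattering angles and the function names are illustrative assumptions, not values or code from the experiment.

```python
import math


def scattering_vector(two_theta_deg, wavelength_nm):
    """Return q = 4*pi*sin(theta)/lambda in nm^-1, where two_theta_deg is the full scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def real_space_length(q_inv_nm):
    """Return the length scale 2*pi/q (in nm) probed at a given q."""
    return 2.0 * math.pi / q_inv_nm


WAVELENGTH_NM = 0.091  # 0.91 Angstrom, as stated in the text

for two_theta in (0.1, 1.0, 3.0):  # hypothetical angles chosen for illustration
    q = scattering_vector(two_theta, WAVELENGTH_NM)
    print(f"2theta = {two_theta:3.1f} deg: q = {q:.3f} nm^-1, 2*pi/q = {real_space_length(q):.2f} nm")
```

The second helper gives the rule of thumb used later in the text, where a feature at q corresponds to a real-space length of roughly 2π/q.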

SAXS analysis
For a sample of $N_P$ identical protein molecules of volume $V_P$ contained in a volume $V$, the scattering intensity in the decoupling approximation, where the orientation of a protein molecule is taken to be independent of its position and of the configuration of the other protein molecules, can be factorised as 33,34,46
$$I(q) = n_P \, (V_P \, \Delta\rho)^2 \, P(q) \, S(q) , \tag{1}$$
where $n_P = N_P/V$ is the protein number density and the scattering contrast $\Delta\rho$ is the protein-solvent electron density difference. The scattering from an isolated protein molecule is described by the form factor $P(q)$ [Eq. (2)], 33,34,46 while information about the gel structure is contained in the structure factor $S(q)$ [Eq. (3)], which involves an equilibrium configurational average for the isotropic system. Strictly speaking, the structure factor in Eq. (1) should be regarded as an effective structure factor. 46 For the samples studied here, the difference between the effective and the true structure factor due to non-spherical protein shape is likely to be small.
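For orientation, the form factor and structure factor entering Eq. (1) are commonly written as below. This is a standard textbook form given under stated assumptions (a homogeneous particle, the normalisation $P(0) = 1$, an orientational average $\langle \ldots \rangle_\Omega$ and a configurational average $\langle \ldots \rangle$); the notation used in Eqs. (2) and (3) of the paper may differ in detail.

```latex
P(q) = \left\langle \left| \frac{1}{V_P} \int_{V_P} \mathrm{d}\mathbf{r} \,
       e^{i \mathbf{q} \cdot \mathbf{r}} \right|^2 \right\rangle_{\Omega} ,
\qquad
S(q) = 1 + \frac{1}{N_P} \left\langle \sum_{j=1}^{N_P} \sum_{k \neq j}
       e^{i \mathbf{q} \cdot (\mathbf{r}_j - \mathbf{r}_k)} \right\rangle .
```

With these definitions, $S(q) \to 1$ when the positions of different molecules are uncorrelated, which is the dilute-solution limit used to isolate the form factor.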
According to Eq. (1), the structure factor $S(q; n_P)$ for a protein gel at concentration $n_P$ can be obtained from the corresponding SAXS intensity $I(q; n_P)$ and the intensity $I(q; n_P^0)$ measured from a solution of the same protein at a concentration $n_P^0$ sufficiently low that $S(q; n_P^0) = 1$:
$$S(q; n_P) = \frac{n_P^0}{n_P} \, \frac{I(q; n_P)}{I(q; n_P^0)} . \tag{4}$$
This approach neglects, in the q range considered, the direct contribution to P(q) from GA as well as any effects of the cross-links on the protein structure via P(q) and $\Delta\rho$.
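The concentration-normalised intensity ratio described above can be sketched in a few lines. The function name and the synthetic numbers are assumptions for illustration; both profiles are assumed to be solvent-subtracted and sampled on the same q grid.

```python
def structure_factor(i_gel, i_dilute, n_gel, n_dilute):
    """Point-by-point ratio S(q) = (n_dilute / n_gel) * I_gel(q) / I_dilute(q)."""
    if len(i_gel) != len(i_dilute):
        raise ValueError("both profiles must be sampled on the same q grid")
    scale = n_dilute / n_gel
    return [scale * ig / idil for ig, idil in zip(i_gel, i_dilute)]


# Synthetic illustration (made-up numbers, not measured data)
n_dilute, n_gel = 1.0, 20.0        # concentrations in arbitrary but consistent units
i_dilute = [4.0, 3.0, 2.0, 1.0]    # dilute-solution intensity, where S = 1 is assumed
s_true = [3.0, 0.8, 1.1, 1.0]      # structure factor used to build the fake gel profile
i_gel = [n_gel / n_dilute * i * s for i, s in zip(i_dilute, s_true)]

s_recovered = structure_factor(i_gel, i_dilute, n_gel, n_dilute)
print(s_recovered)
```

Because the form factor and contrast cancel in the ratio, any change in either of them on cross-linking would appear as a spurious feature in the recovered structure factor.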
For each protein, the quantity $I(q; n_P^0)$ in Eq. (4), hereafter referred to as the apparent form factor (AFF), was obtained by merging solution SAXS profiles recorded at two different protein concentrations, as shown in Fig. S1. The two profiles were first superimposed in an intermediate q window (q = 1.9 ± 0.3, 1.5 ± 0.1 or 1.7 ± 0.1 nm⁻¹ for BPTI, Mb or IFABP, respectively), indicated by vertical dashed lines in Fig. S1, and a hybrid profile was constructed by merging the low-q part of the lower concentration profile with the high-q part of the higher concentration profile. At low q, where the SAXS profile is sensitive to protein-protein correlations, we thus use data from the most dilute solution, while, at high q, we benefit from the better signal-to-noise ratio of the data from the more concentrated solution. The merged profile was then smoothed with the aid of a Savitzky-Golay filter. 47 For BPTI and Mb, a linear regression on a Guinier plot 33 was performed in a low-q window, indicated by vertical dotted lines in Fig. S1, and used to extrapolate the AFF to q = 0. Finally, the AFF was obtained by scaling the merged profile by a constant factor that minimises the difference between the merged solution profile and the gel profile in a high-q range (2.95-3.05, 2.9-3.1 or 3.9-4.1 nm⁻¹ for BPTI, Mb or IFABP, respectively), where we expect that S(q) = 1. This scaling, which ensures that the gel structure factor derived from Eq. (4) tends to 1 at the highest q, is justified by the different sample containers used for gels (capillary) and solutions (flow-through cell).
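The Guinier extrapolation step can be illustrated with a minimal sketch, assuming the usual Guinier law I(q) = I(0) exp(−q² R_g²/3). The q window, the test values and the function names are hypothetical; the merging, smoothing and high-q scaling steps are not reproduced here.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares line through ln I versus q^2; returns (I0, Rg)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals -Rg^2 / 3 in the Guinier regime
    intercept = mean_y - slope * mean_x    # equals ln I(0)
    if slope >= 0.0:
        raise ValueError("non-negative slope: window is not in the Guinier regime")
    return math.exp(intercept), math.sqrt(-3.0 * slope)


def guinier_intensity(q, i0, rg):
    """Guinier law, used here to extrapolate the profile towards q = 0."""
    return i0 * math.exp(-((q * rg) ** 2) / 3.0)


# Synthetic low-q window (illustrative values, not the measured profiles)
q_window = [0.10 + 0.02 * j for j in range(10)]
profile = [guinier_intensity(qi, 2.0, 1.2) for qi in q_window]
i0_fit, rg_fit = guinier_fit(q_window, profile)
print(f"I(0) = {i0_fit:.3f}, Rg = {rg_fit:.3f} nm")
```

The fit is only meaningful where q R_g is small, which is why the regression is restricted to a low-q window.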
For an isotropic sample of identical protein molecules, the structure factor is related to the Fourier transform of the protein-protein pair correlation function (PCF), g(r). After angular integration, one obtains 48
$$S(q) = 1 + 4\pi n_P \int_0^\infty \mathrm{d}r \, r^2 \, [g(r) - 1] \, \frac{\sin(qr)}{qr} . \tag{5}$$
This relation may be inverted to obtain the PCF from the structure factor as
$$g(r) = 1 + \frac{1}{2\pi^2 n_P} \int_0^\infty \mathrm{d}q \, q^2 \, [S(q) - 1] \, \frac{\sin(qr)}{qr} . \tag{6}$$
To obtain the PCF from Eq. (6), the structure factor, which has been experimentally determined in a finite q range, must be extended to higher and lower q. First, a value $q_{max}$ is selected, above which S(q) is set equal to 1. This value, $q_{max}$ = 3.0, 3.0 or 4.0 nm⁻¹ for BPTI, Mb or IFABP, respectively, is taken as the midpoint of the q range used to scale the AFF. Then a value $q_{min}$ is selected (0.09, 0.065 or 0.075 nm⁻¹ for BPTI, Mb or IFABP, respectively), below which S(q) is extrapolated by fitting a quadratic polynomial to the 20 data points (covering ∼ 0.07 nm⁻¹) just above $q_{min}$. Using cubic spline interpolation, we resample S(q) with fixed spacing ∆q. Finally, we zero-fill S(q) − 1 to obtain a real-space resolution ∆r = 0.065 nm.
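The inversion from S(q) to g(r) can be sketched with direct trapezoidal quadrature. This is a simplified stand-in, not the protocol described above (no spline resampling, zero-filling or fast sine transform); the function name and the Gaussian test case are assumptions, chosen because the transform of a Gaussian pair correlation is known analytically.

```python
import math


def pcf_from_structure_factor(q, s, r_values, n_p):
    """g(r) = 1 + 1/(2 pi^2 n_p) * integral of q^2 [S(q)-1] sin(qr)/(qr) dq.

    Trapezoidal rule on a uniform q grid that starts at q = 0; every r must be > 0.
    """
    dq = q[1] - q[0]
    g = []
    for r in r_values:
        # q^2 [S-1] sin(qr)/(qr) simplifies to q [S-1] sin(qr) / r
        f = [qi * (si - 1.0) * math.sin(qi * r) / r for qi, si in zip(q, s)]
        integral = dq * (sum(f) - 0.5 * (f[0] + f[-1]))
        g.append(1.0 + integral / (2.0 * math.pi ** 2 * n_p))
    return g


# Analytic check: g(r) - 1 = exp(-r^2 / 2a^2) has S(q) - 1 = n_p (2 pi)^{3/2} a^3 exp(-q^2 a^2 / 2)
n_p, a = 0.01, 1.0
q_grid = [0.01 * i for i in range(1201)]
s_grid = [1.0 + n_p * (2.0 * math.pi) ** 1.5 * a ** 3 * math.exp(-0.5 * (qi * a) ** 2)
          for qi in q_grid]
g_values = pcf_from_structure_factor(q_grid, s_grid, [1.0, 2.0], n_p)
print(g_values)
```

The factor q² in the integrand is what makes g(r) at small r so sensitive to small deviations of S(q) from 1 at high q.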
It follows from Eq. (6) that even a small deviation of S(q) from 1 at high q has a large effect on g(r) at small r. Our protocol of setting S(q) = 1 above $q_{max}$ thus causes g(r) to be negative at small r (Fig. S2). This unphysical feature is removed by forcing g(r) = 0 whenever the transform produces negative values.
From the PCF, we compute the running coordination number as
$$N(r) = 4\pi n_P \int_0^r \mathrm{d}r' \, r'^2 \, g(r') . \tag{7}$$

Results and discussion

Structure factor and pair correlation function
Crystal structures [49][50][51] of the three investigated proteins, BPTI, Mb and IFABP, are shown in Fig. 1. BPTI has a pear-like shape with principal dimensions of 2.1 and 3.4 nm and volume $V_P$ = 7.792 nm³. The shape of Mb is oblate-like, with principal dimensions 2.6 and 4.0 nm and $V_P$ = 21.67 nm³. IFABP is also oblate-like, with principal dimensions 2.5 and 3.6 nm and $V_P$ = 18.79 nm³. The effective diameter, $\sigma_P$, for a sphere of volume $V_P$ and the number, $N_{Lys}$, of lysine residues per protein are listed in Table 1, which also presents relevant gel sample characteristics such as the protein volume fraction $\phi_P$ and the centre-to-centre separation $d_{PP}$ between protein molecules assuming a cubic lattice. The protein concentration, pH and GA/protein mole ratio, $N_{GA}$, were chosen to match gel samples used in MRD studies. 20,21,43
The concentration-normalised (gel − solvent) scattering profiles for the three protein gels are shown in Fig. 2. The structure factor, S(q), was deduced from Eq. (4) using an apparent form factor (AFF), $I(q; n_P^0)$, constructed from SAXS profiles recorded on protein solutions at two concentrations (Sec. 2.3, Figs. 2 and S1). For IFABP, the solution SAXS profile, and thus the AFF, increases sharply below q ≈ 0.2 nm⁻¹ even at the lowest concentration (0.5 mM), indicative of protein aggregation. A crude analysis shows that this feature in the scattering profile can be rationalised by a very small fraction of large aggregates (effective diameter ∼ 10 $\sigma_P$). For each protein, the AFF constructed from solution SAXS profiles at two concentrations agrees well (at q ≥ 1 nm⁻¹ for IFABP) with a crysol 52 fit based on the crystal structure of the corresponding protein. We therefore conclude that the proteins are essentially in monomeric form in our solution samples.
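The geometric quantities tabulated for each protein follow from $V_P$ and the number density by elementary relations, sketched below. The protein volumes are those quoted above; the 10 mM concentration and all function names are illustrative assumptions, since the Table 1 entries are not reproduced in this text.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def effective_diameter(v_p_nm3):
    """Diameter (nm) of a sphere with the same volume as the protein."""
    return (6.0 * v_p_nm3 / math.pi) ** (1.0 / 3.0)


def number_density(conc_mM):
    """Convert mM (= mol m^-3) to molecules per nm^3 (1 m^3 = 1e27 nm^3)."""
    return conc_mM * AVOGADRO / 1e27


def volume_fraction(conc_mM, v_p_nm3):
    """Protein volume fraction phi_P = n_P * V_P."""
    return number_density(conc_mM) * v_p_nm3


def lattice_separation(conc_mM):
    """Centre-to-centre separation (nm) on a simple cubic lattice, d_PP = n_P^(-1/3)."""
    return number_density(conc_mM) ** (-1.0 / 3.0)


PROTEIN_VOLUMES_NM3 = {"BPTI": 7.792, "Mb": 21.67, "IFABP": 18.79}  # from the text
EXAMPLE_CONC_MM = 10.0  # hypothetical concentration, for illustration only

for name, v_p in PROTEIN_VOLUMES_NM3.items():
    print(f"{name}: sigma_P = {effective_diameter(v_p):.2f} nm, "
          f"phi_P = {volume_fraction(EXAMPLE_CONC_MM, v_p):.3f}, "
          f"d_PP = {lattice_separation(EXAMPLE_CONC_MM):.2f} nm")
```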
The close agreement of the SAXS profiles from the Mb and IFABP gels with the corresponding AFFs at high q (Fig. 2b,c), where we expect that S(q) = 1, indicates that the only effect of GA is to induce protein-protein correlations. The protein structure thus appears to be the same in solution and gel. We note that although the AFF has been scaled to agree with the gel profile at high q (Sec. 2.3), this scaling does not alter the shape of the AFF. For BPTI, the AFF cannot be scaled to superimpose with the gel profile over an extended high-q range (Fig. 2a), presumably because the S(q) = 1 limit is not reached in the investigated q range for this small protein. Another consequence of the limited q range is that the dip in the AFF, which reflects the size and shape of the protein molecule, is observed for Mb and IFABP (at q ≈ 2.1 and 2.3 nm⁻¹, respectively; see Fig. 2b,c), but not for the smaller BPTI. Using the program crysol 52 to compute the form factor from the BPTI crystal structure 1bpi, 49 we find a shallow dip at q ≈ 4.5 nm⁻¹. Nevertheless, in the subsequent analysis of the BPTI profile, we postulate that S(q) = 1 for q ≥ 3.0 nm⁻¹. The structure factors for the three protein gels, deduced in this way, are shown in Fig. 3. For BPTI, S(q) shows a pronounced maximum at q ≈ 0.3 nm⁻¹, already evident in the gel profile (Fig. 2a). In contrast, for Mb and IFABP, S(q) decreases monotonically with q from the lowest examined q value (0.06 nm⁻¹) up to q ≈ 1 nm⁻¹ (Fig. 3b,c).
Table 1, footnote f: Net protein charge at given pH, calculated with standard pKa values and unmodified lysines.
To characterise the gel microstructure in real space, we transform the structure factor into a pair correlation function (PCF), g(r), with the aid of Eq. (6), implemented as a fast sine transform. Before the transformation, S(q) is truncated at high q and extended to high and low q as described in Sec. 2.3. The modified S(q) is included in Fig. 3 as a blue curve and the resulting PCF is shown in Fig. 4. Because we set S(q) = 1 at high q, the transformation yields negative g(r) values at small r (Fig. S2). If we force g(r) = 0 at these small r values and then inverse transform the corrected PCF according to Eq. (5), we find that the back-calculated structure factor differs very little from the original one (Fig. S3). (However, a significant deviation is seen for Mb at q ≈ 1 nm⁻¹.) This finding is consistent with the expectation that our SAXS data (with q ≤ 4 nm⁻¹) are insensitive to short-range (r ≲ 2 nm) structural features.
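The consistency check described above, clipping negative g(r) and back-calculating S(q), can be sketched as follows. Direct trapezoidal quadrature is used here in place of a fast sine transform, and the function names and the Gaussian test function are assumptions for illustration.

```python
import math


def clip_negative(g):
    """Replace unphysical negative g(r) values by zero."""
    return [max(0.0, gi) for gi in g]


def structure_factor_from_pcf(r, g, q_values, n_p):
    """S(q) = 1 + 4 pi n_p * integral of r^2 [g(r)-1] sin(qr)/(qr) dr.

    Trapezoidal rule on a uniform r grid that starts at r = 0; every q must be > 0.
    """
    dr = r[1] - r[0]
    s = []
    for q in q_values:
        # r^2 [g-1] sin(qr)/(qr) simplifies to r [g-1] sin(qr) / q
        f = [ri * (gi - 1.0) * math.sin(q * ri) / q for ri, gi in zip(r, g)]
        integral = dr * (sum(f) - 0.5 * (f[0] + f[-1]))
        s.append(1.0 + 4.0 * math.pi * n_p * integral)
    return s


# Analytic check with g(r) = 1 + exp(-r^2 / 2), i.e. a Gaussian of unit width
n_p = 0.01
r_grid = [0.01 * i for i in range(1201)]
g_grid = clip_negative([1.0 + math.exp(-0.5 * ri ** 2) for ri in r_grid])
s_back = structure_factor_from_pcf(r_grid, g_grid, [1.0], n_p)
print(s_back)
```

Comparing such a back-calculated S(q) with the original one indicates how much information is lost by the clipping step.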
The PCF reflects the static spatial correlations in the sample, regardless of the origin of these correlations (cross-links or other interactions), and it can therefore be obtained from the structure factor without any assumptions about the structure of the gel (apart from isotropy). The analysis in Sec. 3.2 indicates that the protein gel is inhomogeneous on the length scale probed by our SAXS data, with dense protein clusters connected by less dense regions. This inhomogeneity complicates the interpretation of g(r). One possible approach would then be to model the protein gel as a mixture of clustered and non-clustered protein molecules. Such an approach would, however, introduce more parameters than can be justified by the data. Instead, we assume that the observed scattering is strongly dominated by clustered proteins so that the contribution from non-clustered proteins can be neglected.
The PCF is then used, along with Eq. (7), to obtain the running coordination number, N(r), in the protein gel (Fig. 5). For comparison, we show N(r) for a uniform protein distribution at the same protein concentration as in the corresponding gel.
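A minimal sketch of the running coordination number, N(r) = 4π n_P ∫ r'² g(r') dr' from 0 to r, together with the uniform-distribution reference. The cumulative trapezoidal rule, the grid and the function names are illustrative assumptions.

```python
import math


def running_coordination_number(r, g, n_p):
    """Cumulative trapezoidal estimate of N(r); r must start at 0 and increase."""
    n_run = [0.0]
    for i in range(1, len(r)):
        f_prev = r[i - 1] ** 2 * g[i - 1]
        f_curr = r[i] ** 2 * g[i]
        n_run.append(n_run[-1] + 4.0 * math.pi * n_p * 0.5 * (f_prev + f_curr) * (r[i] - r[i - 1]))
    return n_run


def uniform_coordination_number(r, n_p):
    """N(r) for a structureless system with g(r) = 1 everywhere."""
    return n_p * 4.0 / 3.0 * math.pi * r ** 3


# Sanity check on a uniform system (illustrative density, not a measured value)
n_p = 0.01
r_grid = [0.01 * i for i in range(1001)]
n_uniform_numeric = running_coordination_number(r_grid, [1.0] * len(r_grid), n_p)
print(n_uniform_numeric[-1], uniform_coordination_number(r_grid[-1], n_p))
```

Plotting the measured N(r) against the uniform reference on a log-log scale shows directly where the local density exceeds the bulk value.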

BPTI.
Among the three investigated protein gels, only the BPTI gel produces a low-q maximum in the SAXS profile (Fig. 2), manifested as a peak in S(q) at q ≈ 0.3 nm⁻¹ (Fig. 3a). The PCF has a primary maximum at r = 2.2 nm (Fig. 4a), slightly less than the effective protein diameter (Table 1). For comparison, in the (monomeric) crystal structure 1bpi, 49 there are 10 BPTI neighbours with centre-of-mass (COM) separations in the range 2-3 nm (two each at 2.26, 2.42, 2.56, 2.63 and 2.86 nm). Under certain conditions (high pH, high salt concentration), BPTI forms a tight decamer both in the crystal 53 and in solution. 54,55 But the pronounced minima at q = 1.5 and 2.9 nm⁻¹ in the decamer form factor (Fig. S4) are not evident in our gel or solution SAXS profiles. We therefore conclude that decamers are not present under our conditions. While two BPTI molecules that are directly joined by a cross-link are expected to have a separation exceeding the shortest separation in the (monomeric) crystal, the 2.2 nm separation indicated by the PCF might be due to a tight approach (via the extended neutral and largely hydrophobic face of the BPTI molecule) of two BPTI molecules that are both cross-linked to a third one.
Figure caption (fragment): The apparent form factor, derived from solution SAXS profiles as described in Sec. 2.3, is also shown for each protein (black curve).
Figure caption (fragment): ... Eq. (4) (red) and after truncation at high q and extension to high and low q (blue). Note the linear scale in (a).
For simple liquids, the first-shell coordination number $N_c$ is usually obtained by integrating the PCF up to its first minimum. But, for the BPTI gel, the first minimum in g(r) is very shallow and extended (Fig. 4a), so we define $N_c$ as the integral up to the distance $r_c$ = 8.06 nm where g(r) = 1 (on the large-r flank of the primary peak). This integration yields $N_c$ = 31.5 (Fig. 5a). In contrast, for a uniform protein distribution and at the same BPTI concentration as in the gel, we would have $N_{c0}$ = 21.8 at $r_c$ = 8.06 nm. Within a sphere of radius 8.06 nm around a given BPTI molecule, the local protein density (concentration) defined as $n_P(r) = [N(r) + 1]/V(r)$, where V(r) is the volume of a sphere of radius r centred on the reference protein molecule, is therefore higher than the bulk density by a factor of (31.5 + 1)/(21.8 + 1) = 1.43. From the shoulder seen at r ≈ 5 nm (Fig. 4a), it is clear that the primary peak in the PCF comprises 2, or even 3, overlapping coordination shells. If only the first shell were included, the local density would be even higher. But the local density is not uniform; presumably it is higher along the cross-linked chains than in the regions in between. Indeed, in ∼ 90 % of the volume of the 8 nm sphere (beyond ∼ 3.5 nm), the running coordination number N(r) grows more slowly than for a uniform distribution (Fig. 5a).
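The arithmetic behind the local density enhancement and the quoted volume fraction can be checked with a short sketch. The numbers are the BPTI values from the paragraph above; the function names are assumptions.

```python
def local_density_enhancement(n_neighbours, n_neighbours_uniform):
    """Ratio of local to bulk density, counting the reference molecule in both cases."""
    return (n_neighbours + 1.0) / (n_neighbours_uniform + 1.0)


def outer_volume_fraction(r_inner, r_outer):
    """Fraction of a sphere of radius r_outer that lies beyond r_inner."""
    return 1.0 - (r_inner / r_outer) ** 3


# BPTI gel values quoted in the text
enhancement = local_density_enhancement(31.5, 21.8)
outer = outer_volume_fraction(3.5, 8.06)
print(f"local/bulk density = {enhancement:.2f}")
print(f"volume fraction beyond 3.5 nm = {outer:.2f}")
```

The same ratio applied to the shell counts reported below for the other two gels gives the corresponding enhancement factors.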
In a log-log plot as in Fig. 5, the slope yields the fractal dimension, $d_f$, defined via the scaling relation $N(r) \propto r^{d_f}$. 56 Beyond ∼ 15 nm, N(r) exhibits bulk-like scaling with $d_f$ = 3 (Fig. 5a). To make this more precise, we define a correlation length, ξ, as the distance where $d_f$ has reached a value of 2.9 on its approach to the bulk value 3. Analysis of the slope in Fig. 5a then yields ξ = 16 nm for the BPTI gel.
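The slope analysis can be sketched as a finite-difference estimate of d ln N / d ln r, with ξ taken as the smallest distance beyond which the local slope stays at or above 2.9. This reading of the definition (approach to 3 from below) and the synthetic N(r) are assumptions made for illustration.

```python
import math


def local_fractal_dimension(r, n_run):
    """Slope d ln N / d ln r between neighbouring grid points, as (r_mid, d_f) pairs."""
    pairs = []
    for i in range(len(r) - 1):
        slope = ((math.log(n_run[i + 1]) - math.log(n_run[i]))
                 / (math.log(r[i + 1]) - math.log(r[i])))
        pairs.append((math.sqrt(r[i] * r[i + 1]), slope))
    return pairs


def correlation_length(r, n_run, threshold=2.9):
    """Smallest r beyond which the local slope stays >= threshold (None if never reached)."""
    xi = None
    for r_mid, d_f in reversed(local_fractal_dimension(r, n_run)):
        if d_f < threshold:
            break
        xi = r_mid
    return xi


# Synthetic crossover N(r) = r^2 + r^3, whose exact slope 3 - 1/(1 + r) equals 2.9 at r = 9
r_grid = [0.5 + 0.1 * i for i in range(400)]
n_synthetic = [ri ** 2 + ri ** 3 for ri in r_grid]
xi = correlation_length(r_grid, n_synthetic)
print(f"xi = {xi:.2f}")
```

With noisy experimental data the local slope would normally be smoothed, or fitted over a window, before applying the threshold.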

Mb.
The SAXS profile and S(q) for the Mb gel do not show any peak at q < 1 nm⁻¹ in the examined q range (Figs. 2b and 3b). The PCF clearly reveals at least 3 coordination shells and remains well above (∼ 2) the bulk value 1 even at r = 30 nm (Fig. 4b). The spatial correlations are thus of longer range in the Mb gel than in the BPTI gel. Indeed, the correlation length, ξ = 157 nm, is an order of magnitude longer than for BPTI.
From N(r) in Fig. 5b, we find that the first coordination shell (r < 5.0 nm) contains 5.0, the second shell (5.0 < r < 7.5 nm) 9.0 and the third shell (7.5 < r < 10.0 nm) 14.5 Mb molecules. The first three shells (r < 10 nm) thus contain 28.5 Mb molecules, whereas a uniform distribution would only have 4.0 neighbours within 10 nm. This corresponds to a local density increase by a factor of 29.5/5.0 = 5.9. The spatial heterogeneity is thus more pronounced in the Mb gel than in the more concentrated BPTI gel. Since S(0) is proportional to the mean-square fluctuation (or spatial variation) of the protein concentration, the stronger spatial heterogeneity in the Mb gel can explain the orders-of-magnitude larger S(q) at q ≈ 0 (Fig. 3).
The Mb gel analysis may not be quantitatively accurate since, for Mb, the S(q) back-calculated from the corrected (non-negative) PCF differs somewhat from the original S(q) (Fig. S3b). The PCF becomes negative at small r (Fig. S2b) because the negative S(q) − 1 in the range 0.5 < q < 2.0 nm⁻¹ is not fully compensated by a slightly positive S(q) − 1 at higher q (Fig. 3b). The latter feature is not resolved in the noisy high-q part of the SAXS profile from the dilute Mb gel.

IFABP.
The SAXS profile and S(q) for the IFABP gel are qualitatively the same as for the Mb gel (Figs. 2c and 3c). The PCF reveals multiple coordination shells (Fig. 4c), as for Mb, and it approaches the bulk value with a correlation length, ξ = 52 nm, intermediate between those for BPTI and Mb (Table 1). Unlike the Mb case, the inverse transform of the non-negative g(r) yields a back-calculated S(q) in good agreement with the original S(q) (Fig. S3c).
From N(r) in Fig. 5c, we find that the first coordination shell (r < 4.5 nm) contains 4.0, the second shell (4.5 < r < 7.0 nm) 6.9 and the third shell (7.0 < r < 9.5 nm) 7.8 IFABP molecules. The first three shells (r < 9.5 nm) thus contain 18.7 IFABP molecules, whereas a uniform distribution would only have 7.0 neighbours within 9.5 nm. This corresponds to a local density increase by a factor of 19.7/8.0 = 2.5, intermediate between the corresponding values for BPTI and Mb. As for Mb, the strong spatial heterogeneity in the IFABP gel should give rise to a large S(q) at q ≈ 0 (Fig. 3c).

Cluster characteristics.
For all three protein gels, the position of the primary PCF peak is close to the effective protein diameter, $\sigma_P$ (Fig. 4, Table 1), indicating compact clusters with nearest neighbours almost in contact. Because the Mb gel has a 10-fold lower protein concentration than the BPTI gel, the spatial heterogeneity is stronger, as indicated by the large S(q) at q ≈ 0. The correlation length, ξ, may be regarded as a measure of the average cluster-cluster separation. The 10-fold larger ξ for the Mb gel as compared to BPTI can be explained partly by the 10-fold lower protein concentration and partly by the larger clusters (Fig. 4). Presumably, the Mb clusters are larger because of the larger number (19 versus 4) and more even distribution of lysine side-chains (Fig. 1).

Cross-linking kinetics
To study the kinetics of gel formation by GA cross-linking, we performed time-resolved SAXS measurements where scattering profiles were recorded at regular time intervals after mixing protein and GA solutions. Figure 6 shows, for each protein, 16 profiles from the developing gel, along with the respective apparent form factor (Fig. S1). The timing of each scattering profile, counted from the mixing of protein and GA solutions to the middle of the irradiation period and including a 1 min dead-time between mixing and the first irradiation period, is listed in Table S1. Consistent with the finding that GA cross-linking alters the structure factor with little or no effect on the form factor (Sec. 3.1), the time-dependence in the SAXS profile is most evident in the low-q region. For Mb and IFABP, due to the fast kinetics and the limited time resolution of the experiment, we could reliably monitor the process only at q < 0.5 nm⁻¹.
From the time-resolved SAXS data, we determine the gel formation or cross-linking rate, $k_{CL}$, by assuming first-order kinetics [Eq. (8)]. The $k_{CL}$ values obtained from exponential fits to the time-dependent scattering intensity at q = 0.1 nm⁻¹ are shown in Table 1. These rates correspond to characteristic cross-linking times, $\tau_{CL} = 1/k_{CL}$, of 8.7 h, 3.3 min and 6.2 min for BPTI, Mb and IFABP, respectively.
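A minimal sketch of a first-order analysis, assuming the intensity at fixed q relaxes as I(t) = I_∞ + (I_0 − I_∞) exp(−k_CL t). This functional form is an assumption consistent with "first-order kinetics", not a transcription of Eq. (8). The estimator below is a log-linear shortcut that requires a known plateau I_∞, whereas the rates in the text come from exponential fits; all numbers are synthetic.

```python
import math


def first_order_intensity(t, i0, i_inf, k_cl):
    """Assumed first-order form: I(t) = I_inf + (I_0 - I_inf) * exp(-k_CL * t)."""
    return i_inf + (i0 - i_inf) * math.exp(-k_cl * t)


def rate_from_known_plateau(times, intensities, i_inf):
    """Estimate k_CL as minus the least-squares slope of ln|I_inf - I(t)| versus t."""
    y = [math.log(abs(i_inf - i)) for i in intensities]
    n = len(times)
    mean_t, mean_y = sum(times) / n, sum(y) / n
    slope = (sum((t - mean_t) * (yi - mean_y) for t, yi in zip(times, y))
             / sum((t - mean_t) ** 2 for t in times))
    return -slope


# Synthetic, noise-free series of 16 time points (illustrative values only)
times_min = [1.0 + 0.5 * j for j in range(16)]
data = [first_order_intensity(t, 1.0, 5.0, 0.3) for t in times_min]
k_est = rate_from_known_plateau(times_min, data, 5.0)
print(f"k_CL = {k_est:.3f} min^-1, tau_CL = {1.0 / k_est:.2f} min")
```

When the plateau is not known independently, a three-parameter nonlinear fit of I_0, I_∞ and k_CL is the more robust choice.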
Radiation damage during the multiple irradiations enhances scattering at low q, causing the last profile in the time series to overshoot the equilibrium profile in Fig. 2a. But radiation damage only makes a minor contribution to the, mainly structure-related, build-up of the low-q intensity and should therefore only give rise to a modest overestimate of $k_{CL}$.
For BPTI, $k_{CL}(q)$ was determined as a function of q up to 2.3 nm⁻¹, except near 1 nm⁻¹ where the time-dependent SAXS profiles show a quasi-isosbestic point (Fig. 6a). The obtained $k_{CL}$ values (Fig. 7) indicate, not surprisingly, that the gel structure forms faster on shorter length scales (2π/2.0 ≈ 3 nm) than on longer length scales (2π/0.2 ≈ 30 nm). In the former case, the data suggest, in addition to the principal fast component, one or more minor slow components, but it cannot be excluded that radiation damage plays a role here.
8][59][60] At the much lower protein concentrations examined by these authors, the cross-linking process exhibits two distinct steps, attributed to fast cluster formation by cross-linking of the most reactive lysines followed by slower linkage of clusters. 59 At the higher protein concentrations studied here, the time scales of these two processes may overlap, leading to an apparently exponential build-up of scattering intensity at low q (Fig. 7a).
The rate of gel formation depends on many factors, including protein and GA concentrations, pH, temperature and availability of primary amino groups. While it is outside the scope of this study to systematically explore these factors, we note that the BPTI gel forms 2 orders of magnitude slower than the Mb and IFABP gels. The BPTI gel differs from the two other gels in having a much higher protein concentration (Table 1). But this should accelerate gel formation, so there must be other factors at play. We suspect that the dominant factor here is the ∼ 3 units lower pH in the BPTI gel (Table 1), which means that the fraction of ε-amino groups in the reactive deprotonated (NH₂) form 1,2 is 3 orders of magnitude lower in the BPTI gel. Other, presumably less important, effects of a low pH include suppression of GA aldol condensation, 1,2,[27][28][29][30][31][32] which might lead to shorter cross-links and slower gel formation, and a more positive net protein charge, Z (Table 1). A larger |Z| should retard gel formation and make the clusters smaller due to intra-cluster Coulomb repulsion, but, since cross-links remove positive lysine charges, it is not clear which of the three proteins has the largest |Z|. Yet another factor that could slow down BPTI gel formation is the small number (Table 1) and uneven distribution (Fig. 1) of lysine side-chains in BPTI, which may also be responsible for the smaller size of the BPTI clusters (Sec. 3.2.4).
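The pH argument can be illustrated with the Henderson-Hasselbalch relation for a single titrating group. The pKa of 10.5 is a typical textbook value for a lysine ε-amino group and the two pH values are hypothetical, chosen only to be 3 units apart; neither is taken from Table 1.

```python
def deprotonated_fraction(ph, pka):
    """Fraction of amino groups in the neutral NH2 form for a single titrating site."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


ASSUMED_PKA = 10.5               # typical lysine epsilon-amino pKa (assumption)
PH_LOW, PH_HIGH = 4.0, 7.0       # hypothetical pH values separated by 3 units

f_low = deprotonated_fraction(PH_LOW, ASSUMED_PKA)
f_high = deprotonated_fraction(PH_HIGH, ASSUMED_PKA)
print(f"fraction NH2 at pH {PH_LOW}: {f_low:.2e}")
print(f"fraction NH2 at pH {PH_HIGH}: {f_high:.2e}")
print(f"ratio: {f_high / f_low:.0f}")
```

As long as both pH values lie well below the pKa, each unit of pH changes the reactive fraction by very nearly a factor of 10, which is the origin of the 3 orders of magnitude quoted above.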

Conclusions
Figure caption (fragment): Time-dependent SAXS intensity from BPTI gel (Fig. 6a) at q = 0.2 nm⁻¹ (a) and 2.0 nm⁻¹ (b). The filled data points were used for the exponential fit.
The SAXS data presented here provide quantitative information about microstructure and aggregation kinetics in GA cross-linked gels of three different proteins. While the three protein gels are qualitatively similar, the BPTI gel differs quantitatively from the Mb and IFABP gels. This difference is caused by a combination of factors, among which the most important are the higher protein concentration, the lower pH and the more limited availability of lysine amino groups in the case of BPTI. The most important conclusions derived from our analysis of the SAXS data are as follows.
The native protein structure is retained in the cross-linked gel, at least at the ∼ 1 nm resolution afforded by the SAXS data. This conclusion follows from the close agreement at high q between the SAXS profiles from gel and dilute solution (Fig. 2). The evidence is most clear-cut for Mb and IFABP, whereas, for the smaller protein BPTI, the form factor is masked by S(q) oscillations that persist to higher q. While the lysine side-chains involved in cross-links, and perhaps some nearby side-chains as well, must be conformationally perturbed, the SAXS data rule out a significant degree of unfolding. This conclusion is consistent with the finding, from MRD measurements on BPTI, 20 that the ns-µs dynamics of internal water exchange, and the rate-limiting structural fluctuations, 61 are essentially the same whether the protein is free in solution or cross-linked in a gel. Furthermore, the SAXS results indicate that this is true also for Mb and IFABP.
The protein gel is spatially heterogeneous, with dense clusters linked by sparse networks. The strong spatial heterogeneity in the more dilute Mb and IFABP gels produces intense scattering at low q. The BPTI gel, with a higher concentration of smaller clusters and a shorter correlation length, is less strongly heterogeneous. The low-q scattering is therefore much weaker, allowing us to observe a peak at q ≈ 0.3 nm⁻¹, resulting from the interplay of intra-cluster attraction and inter-cluster repulsion.
Within the clusters, adjacent protein molecules are almost in contact. The number of nearest protein neighbours, estimated by integrating over the first coordination shell (not perfectly well-defined for BPTI) in the PCF (Fig. 4), is 5 ± 1 for all three proteins. Since this exceeds the value of 2 expected for a linear chain, the protein clusters must be multiply connected. Some nearest neighbours may be cross-linked via a third protein molecule while still approaching each other almost to contact via hydrophobic attraction. Despite such close encounters, the cluster is not uniformly dense. For the BPTI gel, the protein concentration within 8 nm of a reference molecule (a spherical volume that includes 2 or 3 coordination shells) is ∼ 24 mM (43 % above the average concentration in the gel), which is still much lower than the concentration of 136 mM in the (monomeric) BPTI crystal 1bpi. 49 Yet, the close protein encounters that do occur within a cluster should lead to a stronger dynamical perturbation of water dynamics than in the hydration layer of a protein in dilute solution. At points of particularly close contact between two protein molecules, some water molecules may be trapped with survival times exceeding 1 ns, as seen for internal water molecules. Both of these phenomena, enhanced dynamical perturbation in the hydration layer and trapped water molecules with survival times in the range 1–10 ns, have been inferred from MRD studies of cross-linked proteins. 20,21

Proteins with a large number of uniformly distributed lysine side-chains make larger clusters. This generalisation is based on the correlation between the range of protein-protein correlations, as reflected in the PCF (Fig. 4), and the number (Table 1) and surface distribution (Fig. 1) of lysine amino groups in the three investigated proteins. The correlation length, ξ, defined in terms of the fractal dimension, is more closely related to the typical cluster-cluster separation and therefore depends on the overall protein concentration as well as on the cluster size.
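The nearest-neighbour count and the local concentration quoted above both rest on simple integrals over the PCF. The following sketch shows one way such numbers could be obtained from a tabulated g(r); it is not the procedure used for the published values. The function names, the shell limits and the flat test g(r) are hypothetical, and the only inputs taken from the text are the 24 mM local concentration and the 8 nm radius.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def mM_to_per_nm3(c_mM: float) -> float:
    """Convert a concentration in mM (= mol m^-3) to a number density in nm^-3."""
    return c_mM * AVOGADRO * 1e-27


def coordination_number(r, g, rho, r_min, r_max):
    """N = 4*pi*rho * integral of g(r) r^2 dr over [r_min, r_max] (trapezoidal rule).

    r   : increasing radii in nm
    g   : pair correlation function sampled at r
    rho : average number density in nm^-3
    """
    eps = 1e-9  # tolerance for comparing grid points with the shell limits
    total = 0.0
    for i in range(len(r) - 1):
        a, b = r[i], r[i + 1]
        if a < r_min - eps or b > r_max + eps:
            continue  # interval lies outside the chosen shell
        total += 0.5 * (g[i] * a * a + g[i + 1] * b * b) * (b - a)
    return 4.0 * math.pi * rho * total


def molecules_in_sphere(c_mM: float, radius_nm: float) -> float:
    """Mean number of molecules in a sphere at a given local concentration."""
    return mM_to_per_nm3(c_mM) * 4.0 / 3.0 * math.pi * radius_nm ** 3


# Hypothetical check with a structureless g(r) = 1 between 2 and 4 nm.
r_grid = [i / 100.0 for i in range(0, 801)]
g_flat = [1.0] * len(r_grid)
n_shell = coordination_number(r_grid, g_flat, rho=0.01, r_min=2.0, r_max=4.0)

# Values from the text: ~24 mM within 8 nm, 43 % above the gel average.
n_local = molecules_in_sphere(24.0, 8.0)
c_average = 24.0 / 1.43
print(n_shell, n_local, c_average)
```

For a real PCF, the choice of r_min and r_max (the limits of the first coordination shell) dominates the uncertainty, consistent with the remark that the shell is not perfectly well-defined for BPTI.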
Gel formation occurs on time scales from minutes to hours under our conditions (Fig. 6). As judged by the scattering intensity at q ≤ 0.2 nm⁻¹, gel formation appears to obey first-order kinetics (Fig. 7). The much slower gel formation for BPTI is attributed to the lower pH and the consequent lower abundance of reactive deprotonated amino groups.
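First-order kinetics implies that the low-q intensity relaxes exponentially towards a plateau, I(t) = I_inf + (I_0 − I_inf) exp(−kt). The sketch below illustrates how a rate constant could be extracted from such a time series; it is not necessarily the fitting procedure used for Fig. 7. The function names, the search range for k and the synthetic data are assumptions made for illustration.

```python
import math


def linear_fit(x, y):
    """Least-squares a, b for y = a + b*x, plus the residual sum of squares."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, rss


def fit_first_order(t, intensity, k_lo, k_hi, n_grid=200, n_refine=60):
    """Fit I(t) = I_inf + (I_0 - I_inf) * exp(-k t); returns (k, I_0, I_inf).

    For each trial k the plateau and amplitude follow from linear least squares,
    so only k is searched: a coarse scan in log k, then golden-section refinement.
    """
    def rss(log_k):
        k = math.exp(log_k)
        return linear_fit([math.exp(-k * ti) for ti in t], intensity)[2]

    lo, hi = math.log(k_lo), math.log(k_hi)
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    best = min(range(n_grid), key=lambda i: rss(grid[i]))
    a = grid[max(best - 1, 0)]
    b = grid[min(best + 1, n_grid - 1)]

    phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(n_refine):
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if rss(c) < rss(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]

    k = math.exp(0.5 * (a + b))
    i_inf, amp, _ = linear_fit([math.exp(-k * ti) for ti in t], intensity)
    return k, i_inf + amp, i_inf


# Synthetic, noise-free data (hypothetical): k = 0.01 min^-1, I_0 = 20, I_inf = 100.
times = [10.0 * i for i in range(61)]
data = [100.0 - 80.0 * math.exp(-0.01 * ti) for ti in times]
k_fit, i0_fit, iinf_fit = fit_first_order(times, data, k_lo=1e-4, k_hi=1.0)
print(k_fit, i0_fit, iinf_fit)
```

Separating the linear parameters from k keeps the search one-dimensional, which makes the fit robust when, as for the slow BPTI gel, the plateau is only partly sampled.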
Figure 1. Crystal structures of BPTI (PDB ID 1bpi 49), Mb (1wla 50) and IFABP (1ifc 51). Ribbon and surface representations are superimposed, while the haem group of Mb is shown in a stick representation (C, N, O and Fe atoms coloured orange, blue, red and brown, respectively). Lysine side-chains are coloured by element (C, green; N, blue).

Figure 2. Concentration-normalised scattering profile from cross-linked BPTI (a), Mb (b) and IFABP (c). The apparent form factor, derived from solution SAXS profiles as described in Sec. 2.3, is also shown for each protein (black curve).

Figure 3. Structure factor, S(q), for cross-linked BPTI (a), Mb (b) and IFABP (c), obtained from the profiles in Fig. 2 according to Eq. (4) (red) and after truncation at high q and extension to high and low q (blue). Note the linear scale in (a).

Figure 6. Time-resolved SAXS profiles at 6 °C from developing gels of BPTI (a), Mb (b) and IFABP (c). Time ordering is indicated by line colour (red → orange → magenta → green → cyan → blue), line type (solid → dotted → dashed) and arrows. The timing of the first and last profiles is given. The AFF is also shown for each protein (black solid curve).

Figure S2. Pair correlation function, g(r), for cross-linked BPTI (a), Mb (b), and IFABP (c), obtained by fast sine transform of the modified structure factor, S(q) (blue curve in Fig. 3). The dotted line indicates the asymptote g(r → ∞) = 1. The negative g(r) at small r is an artifact caused by setting S(q) = 1 at high q.

Figure S4. Scattering cross section for BPTI monomer (red) and decamer (blue). The curves were calculated from the atomic coordinates of the crystal structures 1bpi 49 (monomer) and 1bhc 53 (decamer) using the program crysol 52 without hydration-layer correction.
a Number of lysine residues. b Effective protein diameter. c Protein volume fraction. d

Table S1. Timing of kinetics series.
a Time elapsed from GA addition to the middle of the irradiation period, including a 1 min dead time. b 60 s irradiation period. c 10 s irradiation period.