On the question of two-step nucleation in protein crystallization †

We report a real-time study of protein crystallization in the presence of multivalent salts using small-angle X-ray scattering (SAXS) and optical microscopy, focusing on the nucleation mechanism and on the role of the metastable intermediate phase (MIP). Using bovine β-lactoglobulin as a model system in the presence of the divalent salt CdCl₂, we have monitored the early-stage crystallization kinetics, which demonstrate a two-step nucleation mechanism: protein aggregates form a MIP, followed by the nucleation of crystals within the MIP. Here we focus on characterizing and tuning the structure of the MIP using salt, and on the related effects on the two-step nucleation kinetics. The results suggest that increasing the salt concentration near the transition zone pseudo-c** raises the energy barrier for both MIP and crystal nucleation, leading to slow growth. The structural evolution of the MIP and its effect on subsequent nucleation are discussed based on the growth kinetics. The observed kinetics can be well described using a rate-equation model based on a clear physical two-step picture. This real-time study not only provides direct evidence for a two-step nucleation process in protein crystallization, but also elucidates the role and the structural signature of the MIPs in the non-classical process of protein crystallization.


Introduction
Studies of the early stage of nucleation in various systems have revealed new features which cannot be explained using the classical nucleation theory [1][2][3][4][5][6]. A large body of experimental results supported by theory and simulations suggests that a metastable intermediate phase (MIP) exists before the final crystal structure is formed [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21], i.e. in a first step the solutes in the supersaturated solution form either small clusters or a macroscopic dense liquid phase. In the second step, nucleation occurs within the MIPs. This two-step nucleation mechanism was originally proposed by ten Wolde and Frenkel for the crystallization of a colloidal system with short-range attraction near the critical point of the metastable liquid-liquid coexistence line 14. The two-step nucleation mechanism can be considered as an example of Ostwald's step rule in the microscopic world 22. Since then, this concept has been widely applied under various conditions [1][2][3][4][5][6][17][18][19][20]23.
While the two-step mechanism seems plausible for certain experiments, direct observation of such a process is not easy. Recently, direct visualization of the crystallization kinetics and nucleation pathways in colloidal crystallization has become possible and has provided detailed information on the MIP and the transition in colloidal suspensions. Colloidal systems exhibit phase behavior similar to that of atomic and molecular systems, and their large particle sizes enable visualization on a single-particle level. Using this technique, Tan et al. studied the liquid-solid phase transition and observed the formation of a metastable precursor under their experimental conditions, regardless of the final state and the interaction potential 24. Peng et al. studied the kinetics of a solid-solid phase transition using single-particle resolution video microscopy. They observed that the transition between two different solid states occurs via a two-step diffusive nucleation pathway involving liquid nuclei 25. This pathway is favored over one-step nucleation because the energy of the solid/liquid interface is lower than that of the interface between the solid phases.
While these excellent experimental observations on colloidal systems demonstrate that the two-step nucleation follows Ostwald's step rule for simple liquids 16,19,26,27, the application of this concept to other systems, in particular to protein crystallization, is still challenging. The small dimensions of proteins, on the nanometer scale, limit the applicability of optical methods for the study of MIP formation. Because of the larger size and slow dynamics of colloids, the microstructural arrangement of colloidal particles typically relaxes on a time scale of seconds, which leads to various non-equilibrium phenomena in these systems. Moreover, the interaction potentials in these colloidal systems are isotropic, whereas the effective protein-protein interactions are often anisotropic, involving hydrophobic and electrostatic patches as well as ion bridges. These interactions remain poorly understood on a quantitative level [28][29][30].
For systems in which direct visualization is not possible, another method has to be developed to characterize the MIP and the nucleation and growth kinetics. Here, we argue that two-step nucleation can be distinguished from the classical one-step process by following the overall crystallization kinetics. When a MIP exists, particular care should be taken to distinguish the consequential pathway from the parallel pathway. The consequential pathway corresponds to the real two-step nucleation, in which the crystals nucleate from the MIP. The parallel pathway consists of the parallel formation of MIP and crystals in two one-step nucleation events from the liquid. The different pathways are illustrated in Figure 1.
For a one-step nucleation and growth mechanism with or without MIPs, the nucleation and growth are mainly determined by the supersaturated initial solution [...]

Experimental

Materials and sample preparation
The globular protein β-lactoglobulin (BLG) from bovine milk (product no. L3908) and CdCl₂ (202908) were purchased from SIGMA-ALDRICH. For sample preparation, appropriate amounts of salt stock solution, millipore water and protein stock solution were mixed. Stock solutions were prepared by dissolving the salt or protein powder in deionized (18.2 MΩ) and degassed millipore water. The protein concentration of the stock solutions was determined by UV absorption measurements using an extinction coefficient of 0.96 L g⁻¹ cm⁻¹ at a wavelength of 278 nm 32. All samples in this work were prepared without additional buffer, since buffers can affect the phase behavior of proteins and the solubility of salts. The pH of the solutions was monitored using a Seven Easy pH instrument from Mettler Toledo. The pH values for all experimental conditions were above the pI = 5.2 of BLG 33. Therefore, cation binding rather than pH is the main driving force of the charge inversion 33,34. All experiments were performed at room temperature (293 ± 1 K).
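The concentration determination described above follows the Beer-Lambert law, c = A / (ε l). The short Python sketch below illustrates the arithmetic only; the absorbance reading, the 1 cm path length and the dilution factor are hypothetical values chosen for illustration, and only the extinction coefficient is taken from the text.

```python
# Beer-Lambert estimate of the protein concentration from UV absorption.
# Only EPSILON_BLG comes from the text; all other numbers are hypothetical.

EPSILON_BLG = 0.96  # extinction coefficient at 278 nm, in L g^-1 cm^-1


def protein_concentration_mg_per_ml(absorbance, path_cm=1.0, dilution=1.0,
                                    epsilon=EPSILON_BLG):
    """Return the concentration of the undiluted solution in mg/mL.

    c = A * dilution / (epsilon * path); with epsilon in L g^-1 cm^-1 the
    result is in g/L, which is numerically equal to mg/mL.
    """
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance * dilution / (epsilon * path_cm)


# Hypothetical reading: A = 0.48 in a 1 cm cuvette after a 100-fold dilution.
stock_concentration = protein_concentration_mg_per_ml(0.48, path_cm=1.0, dilution=100.0)
print(f"stock concentration: {stock_concentration:.1f} mg/mL")
```

In practice the dilution should be chosen so that the measured absorbance stays within the linear range of the spectrometer.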

Optical microscopy
Time-dependent protein crystallization was followed using a transmission optical microscope (AXIOSCOPE.A1 from ZEISS). The protein stock solution was filtered (pore size 100 µm) in advance. Samples were prepared using a micro-batch setup with two hydrophobically coated glass slides sealed by silicone (sample thickness ≈ 250-300 µm). Images were taken with the included AXIOCAM ICC5 camera.

Small-angle X-ray and Neutron scattering (SAXS and SANS)
Small-angle X-ray scattering (SAXS) measurements were performed at the beamline ID02 of the ESRF, Grenoble, France. Photon energies of 16047 eV and 12460 eV were used for two different beamtimes. The sample-to-detector distance was 2 m, giving an accessible q-range of 0.06 to 4.3 nm⁻¹ or 0.04 to 3.9 nm⁻¹, respectively. Ex-situ measurements were performed using a flow capillary cell. For real-time measurements, samples were filled into quartz capillaries, which were quickly loaded into a vertical capillary holder and transferred to the sample station after sample preparation. Measurements started about 2-3 min after mixing and were repeated every couple of minutes during the whole crystallization process. The beam position in the sample was shifted after each exposure (duration 0.05 s) to avoid radiation damage. For further details on the beamline, calibration and data collection see Ref. 35.
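For orientation, the relation between photon energy, wavelength and momentum transfer, q = (4π/λ) sin θ with scattering angle 2θ, can be sketched in a few lines of Python. This is a minimal illustration and not the beamline software; a flat detector perpendicular to the beam is assumed, and all function names are hypothetical.

```python
import math

HC_EV_NM = 1239.84198  # h*c in eV nm


def wavelength_nm(energy_ev):
    """Photon wavelength in nm for a photon energy in eV."""
    return HC_EV_NM / energy_ev


def q_from_angle(two_theta_rad, energy_ev):
    """Momentum transfer q in nm^-1 for a scattering angle 2*theta in rad."""
    return 4.0 * math.pi * math.sin(two_theta_rad / 2.0) / wavelength_nm(energy_ev)


def two_theta_from_q(q_inv_nm, energy_ev):
    """Scattering angle 2*theta in rad for a given q in nm^-1."""
    return 2.0 * math.asin(q_inv_nm * wavelength_nm(energy_ev) / (4.0 * math.pi))


def detector_radius_mm(q_inv_nm, energy_ev, distance_mm=2000.0):
    """Radial position on a flat detector (assumed geometry) for a given q."""
    return distance_mm * math.tan(two_theta_from_q(q_inv_nm, energy_ev))


for energy in (16047.0, 12460.0):
    print(f"E = {energy:.0f} eV: wavelength = {wavelength_nm(energy):.4f} nm, "
          f"q = 0.7 nm^-1 at r = {detector_radius_mm(0.7, energy):.1f} mm")
```

At a fixed sample-to-detector distance, the higher photon energy maps a given detector radius to a larger q, in line with the two q-ranges quoted above.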
SANS measurements were carried out at KWS1 at FRMII, Munich, Germany. The applied sample-to-detector distances were 1.5 and 8 m, which cover a q-range from 0.04 nm⁻¹ to 3.1 nm⁻¹ at a wavelength of 7 Å (∆λ/λ = 10 %). Protein-salt solutions in D₂O were filled into rectangular quartz cells with a path length of 2 mm. The beam size on the sample was 6 mm × 12 mm. Plexiglas was used as a secondary standard to calibrate the absolute scattering intensity. The data correction and absolute intensity calibration were performed using the software QtiKWS 36.

Results
Experimental phase diagram of BLG with CdCl₂ or ZnCl₂
We first describe the experimental phase diagram of our system, which provides the basis for the following kinetic studies on protein crystallization. Our studies of globular proteins in solutions containing multivalent metal ions have revealed complex phase behavior including reentrant condensation (RC), metastable liquid-liquid phase separation (LLPS) and crystallization 34,[37][38][39][40][41][42][43]. An experimental phase diagram similar to the one shown in Figure 2 has been observed for several proteins in solution in the presence of trivalent metal ions. This RC behavior arises from the effective charge inversion of the proteins and a cation-mediated attraction, presumably via intermolecular bridges of multivalent cations 38,44. With an isoelectric point below 7, the proteins used in our work are acidic at neutral pH. At low salt concentrations, the proteins carry a negative net charge, and the electrostatic repulsion stabilizes the solution. When trivalent metal ions are added to the solution, their binding to the carboxyl groups on the protein surface reduces the effective net charge. Above a certain salt concentration c*, the electrostatic repulsion is no longer strong enough to balance the attractive potential, and the samples phase separate and become turbid ("regime II"). Interestingly, with further increasing salt concentration, the continued binding of metal ions to the protein surface leads to a charge inversion, which re-establishes the long-range electrostatic repulsion. Therefore, above a second boundary c** (experimentally rather broad), the solutions become completely clear again. The charge inversion and the effective attraction mediated by multivalent metal ions have been further investigated by experiments, simulations and theoretical studies 41,42,[44][45][46].
Previous studies on β-lactoglobulin (BLG) systems with the divalent salts ZnCl₂ and CdCl₂ showed a similar experimental phase behavior (Figure 2) 31,47. For these divalent salts, the samples above a certain salt concentration become gradually less turbid, but not completely clear again, and this transition zone is denoted as pseudo-c**. Both boundaries induced by CdCl₂ and ZnCl₂ are remarkably similar, as shown in Figure 2. In comparison to the trivalent salt YCl₃, both transitions are shifted towards higher salt concentrations 47. Although the reentrant effect is not complete, a charge inversion with increasing divalent salt concentration has been observed in both cases (S.I., Figure S1). Note that the phase behavior shown in Figure 2 was observed for BLG only, but not for bovine or human serum albumin (data not shown), which suggests a more specific interaction between these divalent ions and BLG 47.
We emphasize that the observed protein condensation is not caused by a change of the protein structure induced by CdCl₂, ZnCl₂ or other multivalent salts, as demonstrated in previous work using Fourier transform infrared (FTIR) and circular dichroism spectroscopy over a broad range of protein and salt concentrations 31,48. Both techniques indicate no significant change in the secondary structure of the protein. Moreover, the successful growth of high-quality crystals and the fine structural analysis confirm that the proteins are still in their native state 38,47.
From the images shown in Figure 3, one can see that with 17 mM salt, large aggregates are still formed, but the nucleation rates are also high. We emphasize that nearly all crystals are associated with the network; almost no crystals form in the dilute phase. In contrast to the high nucleation rate, the crystal growth period is short. After about 2 h, no further visible change can be observed. The resulting system contains a large number of small crystals, and most of the network or the amorphous aggregates have turned into the crystalline phase. When the salt concentration is increased by only 1 mM, to 18 mM, the overall phenomenological picture changes dramatically. Large aggregates are still visible, but they are not well connected to each other. The number of crystals is reduced, but the final crystals are bigger. With a further increase of the salt concentration, the protein aggregates become smaller and the number of crystals is further reduced, while the crystal size increases. In the end, the MIP is consumed by crystal growth and the solutions become clear. The average number of crystals, normalized to an area of 1 mm², is plotted as a function of time in Figure 4 for three conditions. The number of crystals first increases with time and then saturates. The nucleation rates in the early stage are obtained from the slope of a linear fit; they are 1.44±0.08, 0.32±0.03 and 0.08±0.01 min⁻¹ for 18 mM, 19 mM and 20 mM CdCl₂, respectively. This decrease of the nucleation rate is expected, as the driving force is reduced with increasing salt concentration.
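The extraction of a nucleation rate from the slope of a linear fit can be sketched as follows. The crystal counts used here are made-up numbers for illustration only and are not the data of Figure 4; the function name is hypothetical, and an ordinary least-squares fit over a hand-selected early time window is assumed.

```python
def linear_fit(times, counts):
    """Ordinary least-squares line through (times, counts).

    Returns (slope, intercept); the slope is the nucleation rate in
    crystals per unit area per unit time.
    """
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two (time, count) pairs")
    t_mean = sum(times) / n
    c_mean = sum(counts) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_tc = sum((t - t_mean) * (c - c_mean) for t, c in zip(times, counts))
    slope = s_tc / s_tt
    return slope, c_mean - slope * t_mean


# Made-up early-stage counts (crystals per mm^2) at times in minutes.
times_min = [10.0, 20.0, 30.0, 40.0, 50.0]
crystals_per_mm2 = [5.0, 14.0, 26.0, 35.0, 45.0]

rate, offset = linear_fit(times_min, crystals_per_mm2)
print(f"nucleation rate: {rate:.2f} per mm^2 per min, intercept: {offset:.1f}")
```

Only the points before saturation should enter such a fit, since the linear regime ends once the number of crystals levels off.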

Structure of MIP revealed by SAXS and SANS
Due to the limited resolution of optical microscopy, SAXS and SANS were used to study the structure of the MIP and its role in the crystallization process. We first show typical SAXS results in Figure 5a for samples with a low protein concentration of 6.5 mg/ml and CdCl₂ concentrations covering all three regimes. In regime I, the scattering curve (with 0.5 mM salt) is dominated by the form factor of the BLG dimer, consistent with literature reports that BLG occurs predominantly as a dimer at room temperature and at pH between 3.5 and 7.5 49. In regime II, with 1 mM salt, the high-q part is still dominated by the form factor, but in the low-q region the increase of intensity indicates the formation of protein aggregates. In the third regime (12-90 mM, not all data are shown), a new feature forms shortly after preparation at q ≈ 0.7 nm⁻¹ and a sharp peak occurs at 2.2 nm⁻¹, as indicated by the arrows in this figure. A previous study of BLG in solution in the presence of YCl₃ has shown that this maximum corresponds to the monomer-monomer (M-M) correlation due to the bridging effect 50. Here, in the presence of CdCl₂, the peak is sharper. A possible explanation is the formation of a highly ordered fiber-like structure, which gives rise to such a diffraction peak corresponding to the axial translation of the subunit (BLG monomer). Similar diffraction peaks have been observed in the solution scattering of F-actin 51.
[...] peak at q around 2 nm⁻¹ is pronounced in all cases, which is in good agreement with the SAXS measurements. The slight shift of the M-M peak to lower q in SANS is due to the hydration effect 52. Secondly, after a certain time, smeared Bragg peaks appear for all samples. At the same time, the broad peak (MIP) decreases in intensity or disappears completely. Although the low resolution of SANS in the high-q region smears the Bragg peaks, their positions are consistent with the SAXS measurements.
Both SAXS and SANS measurements reveal the same structural features of the MIP, i.e. local ordering within the large protein aggregates, characterized by a broad peak at q ≈ 0.7 nm⁻¹ and the M-M correlation peak at q around 2 nm⁻¹. As discussed in the following section, the broad peak is closely related to nucleation and crystal growth. It thus serves as the structural signature of the MIP.
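As a rough guide to the length scales involved, a peak position q can be converted into a characteristic real-space distance via d = 2π/q. This Bragg-type relation is exact only for crystalline reflections; applying it to the broad MIP peak is an assumption that gives an order-of-magnitude estimate. The helper name below is hypothetical.

```python
import math


def characteristic_distance_nm(q_inv_nm):
    """Real-space distance d = 2*pi/q in nm for a peak at q in nm^-1."""
    if q_inv_nm <= 0:
        raise ValueError("q must be positive")
    return 2.0 * math.pi / q_inv_nm


# Peak positions quoted in the text, in nm^-1.
for label, q in (("broad MIP peak", 0.7),
                 ("M-M correlation peak (SAXS)", 2.2)):
    print(f"{label}: q = {q} nm^-1 -> d = {characteristic_distance_nm(q):.2f} nm")
```

With these inputs, the broad peak corresponds to a spacing of about 9 nm and the M-M peak to slightly less than 3 nm.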

Crystallization kinetics followed by real-time SAXS
To extract information on the underlying crystallization process, we employed real-time SAXS measurements of the crystallization kinetics with high time and structural resolution. Figure 7 shows examples of time-resolved SAXS measurements for 33 mg/ml BLG with 17 mM (a) and 20 mM CdCl₂ (b) as 3D surface plots with 2D projections; additional data for samples with 15 and 18.5 mM CdCl₂ are shown in S.I., Figure S2. The bottom 2D projections are created by dividing all curves by the first one and therefore visualize the ongoing changes in the system with time. Selected I(q,t)/I(q,t = 0) curves are further presented in Figure 7c&d. The SAXS curves of both samples feature a strong increase at low q which hardly changes with time, indicating the presence of large aggregates, consistent with the observation by optical microscopy. With increasing time, a broad peak located at q ≈ 0.7 nm⁻¹ develops that has been assigned to the nucleation precursors (MIP) 31. Once Bragg peaks appear, the most pronounced ones at 1.01 and 1.27 nm⁻¹ overlap with the broad peak. The intensity of the Bragg peaks increases with time, while the broad peak shrinks. In the I(q,t)/I(q,t = 0) plots (Figure 7c&d), it is clear that the broad peak appears before the Bragg peaks and becomes stronger with time. At the end of the crystallization, the broad peak shrinks and eventually disappears (Figure 7a&c).
From the optical microscopy experiments (Figure 3) one observes that the MIP forms before crystallization starts and is consumed during crystal growth. The real-time SAXS measurements show that the broad peak typical of the MIP follows the same development: it appears first and grows, and once crystallization starts its intensity decreases until it eventually disappears. Based on these observations, we propose to use the relative change of the area of this peak (representative of the MIP) and of the two Bragg peaks which overlap with it to quantify the relationship between the MIP and the crystalline phase as a function of time. At this point, we use the concept of crystallinity from semi-crystalline polymer systems for further data analysis 53. After subtraction of the intensity at the minimum, the broad peak in the I(q,t)/I(q,t = 0) curves was fitted by a scaled Gaussian function and the Bragg peaks by two further (sharp) Gaussians 31. The crystallization kinetics can be followed via the enveloped area of the broad region, A_interm, and the area of the Bragg peaks, A_Bragg, as a function of time. This method is further illustrated in an animation that can be found in the online S.I. (SAXSdecomposition.gif). Figure 8a displays an example of such an analysis for a sample with 33 mg/ml BLG and 17 mM CdCl₂. The development of the MIP (A_interm) shows a maximum around 40 min, and the overall crystallinity (A_Bragg) has a plateau between 40 and 60 min and then grows faster. Interestingly, the overall growth rate, i.e. the first derivative of A_Bragg with respect to time, also has a maximum located around 40 min, indicating that the overall crystal growth rate in the early stage strongly depends on the development of the MIP.
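Two small numerical steps of this analysis can be sketched in Python: the area under a fitted Gaussian, A = a σ √(2π), and a finite-difference estimate of the crystallization rate dA_Bragg/dt. The fitting itself is not reproduced here. The function names and the time series are hypothetical, and the central-difference scheme is an assumption rather than the procedure used for Figure 8.

```python
import math


def gaussian_area(amplitude, sigma):
    """Area under amplitude * exp(-(q - q0)**2 / (2 * sigma**2))."""
    return amplitude * abs(sigma) * math.sqrt(2.0 * math.pi)


def finite_difference_rate(times, values):
    """Central differences in the interior, one-sided at both ends."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two matching samples")
    rates = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        rates.append((values[hi] - values[lo]) / (times[hi] - times[lo]))
    return rates


# Hypothetical A_Bragg time series (arbitrary units) sampled every 20 min.
t_min = [0.0, 20.0, 40.0, 60.0, 80.0, 100.0]
a_bragg = [0.00, 0.10, 0.40, 0.45, 0.50, 0.90]

for t, r in zip(t_min, finite_difference_rate(t_min, a_bragg)):
    print(f"t = {t:5.1f} min, dA_Bragg/dt = {r:.4f} per min")
```

Because differentiation amplifies noise, experimental A_Bragg(t) curves would typically be smoothed before the derivative is taken.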
We have performed real-time SAXS measurements on all four salt conditions that were followed by optical microscopy. However, as seen from Figure 3, the number of crystals decreases and the size of the crystals increases with increasing salt concentration. This makes the real-time SAXS measurements challenging, as the number of crystals within the illuminated volume drops significantly. We have tried to compensate for this by measuring more positions on the sample. This is partly successful, but the time resolution is reduced, as only one or two out of ten spots show the development of the Bragg peaks. As shown in Figure 8b, fewer data points are available for the Bragg peaks than for the MIP. Nevertheless, one can still recognize the interesting kinetics: first the MIP develops relatively fast and becomes saturated after 120 min. Within the current experimental time scale, only a minor fraction of crystalline phase was detected. The experimental observations on the kinetics, in particular the non-monotonic crystallization rate (red dashed line in Figure 8a), agree well with a simple model, which will be discussed in the following section.

Modeling with rate equations
In order to compare the observed kinetic features more quantitatively with possible crystallization scenarios, we employed rate equation models. As essential variables of the modeling, L, I, C_I and C_L denote the mass fractions of free monomers in the liquid, of the intermediate, of crystals in the intermediate and of crystals in the liquid, respectively. The three paradigmatic cases for the crystallization process shown in Figure 1 are presented as follows:
Classical nucleation L → C_L: The classical nucleation theory contains a one-step nucleation process of the critical nucleus from the homogeneous solution. After nucleation, crystallites grow larger from the solution. In terms of modeling, Δ_nL = k_n L and Δ_gL = k_gL L C_L represent the nucleation and the growth term with rates k_n and k_gL. Using these, the process can be easily modeled by the following set of rate equations: [...]
Parallel process I ↔ L → C_L: In this non-classical process, we assume that in addition to the one-step crystallization process, a reversible intermediate is formed in the solution, competing with crystallization for the free monomers. In addition to the nucleation and growth terms from the classical one-step nucleation, we include the formation of the intermediate with the term Δ_I = k_I (L − I). The corresponding set of rate equations reads [...]
[...] crystallization rate drops again considerably, whereas it increases monotonically for both one-step cases until saturation (Figure 9d). This particular feature of the two-step process is caused by the nucleation and slow growth in the intermediate, while the crystals grow faster once they have emerged into the solution. Thus, the occurrence of the plateau indicates the presence of a multi-step nucleation process.
Figure 9 (caption, partly recovered): [...] Table 1 for further details). The following parameter values were used to show the good qualitative agreement of the model with the data set in Figure 8a: k_e = 0.15 min⁻¹, L_0 = 0.2, α_I = 0.2, k_gL = 0.6 min⁻¹, k_I = 0.03 min⁻¹, k_n = 0.02 min⁻¹, k_gI = 0.2 min⁻¹. An additional model plot reproducing the data set in Figure 8b can be found in the S.I., Figure S3. (d) Comparison of the crystallization rates dC/dt. While the one-step and parallel nucleation processes show a monotonic speed-up until saturation, the two-step process can have a non-monotonic signature with two maxima. The model parameters were chosen to be the same for all three models: k_I = 0.05 min⁻¹, k_n = 0.02 min⁻¹, k_gI = 0.1 min⁻¹, k_gL = 0.2 min⁻¹, k_e = 1.0 min⁻¹, L_0 = 0.7, α_I = 0.2.
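To make the classical one-step case concrete, the sketch below integrates the terms Δ_nL = k_n L and Δ_gL = k_gL L C_L with a simple explicit Euler scheme. It rests on stated assumptions: the balance equations dL/dt = −(Δ_nL + Δ_gL) and dC_L/dt = Δ_nL + Δ_gL are inferred here from mass conservation, the initial condition L(0) = 1 with C_L(0) = 0 is chosen for illustration, and the intermediate (I, C_I, k_e, L_0, α_I) is not modeled at all. The rate constants k_n = 0.02 min⁻¹ and k_gL = 0.2 min⁻¹ are those quoted for the model comparison.

```python
def simulate_one_step(k_n=0.02, k_gl=0.2, l_init=1.0, t_end=100.0, dt=0.01):
    """Explicit Euler integration of the one-step model L -> C_L.

    Assumed balance (mass conservation): dL/dt = -(k_n*L + k_gl*L*C_L) and
    dC_L/dt = +(k_n*L + k_gl*L*C_L). Returns lists (times, C_L, dC_L/dt).
    """
    liquid, crystal = l_init, 0.0
    times, crystals, rates = [], [], []
    for step in range(int(round(t_end / dt)) + 1):
        rate = k_n * liquid + k_gl * liquid * crystal  # nucleation + growth
        times.append(step * dt)
        crystals.append(crystal)
        rates.append(rate)
        liquid -= rate * dt
        crystal += rate * dt
    return times, crystals, rates


def count_rate_maxima(rates):
    """Number of interior local maxima of the crystallization rate."""
    return sum(1 for i in range(1, len(rates) - 1)
               if rates[i] > rates[i - 1] and rates[i] >= rates[i + 1])


times, crystals, rates = simulate_one_step()
peak = max(range(len(rates)), key=rates.__getitem__)
print(f"final crystal fraction: {crystals[-1]:.4f}")
print(f"maximum rate {rates[peak]:.4f} per min at t = {times[peak]:.1f} min")
print(f"number of rate maxima: {count_rate_maxima(rates)}")
```

Under these assumptions the crystallization rate rises to a single maximum and then decays as the liquid is depleted, so the crystal fraction shows no intermediate plateau; this is the one-step behavior that the two-step signature is contrasted with.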
The large number of model parameters does not allow for a reliable extraction of nucleation rates via model fits to the kinetic analysis of the SAXS data. We emphasize that this is not a problem of the model or the data quality, but of the complex pathway, which involves many coupled processes. As an example, the amount of crystal nucleation within the MIP depends not only on the rate k_n, but also on the amount of MIP. The latter is mainly determined by the supersaturation represented by L_0, at least when the formation of the MIP is fast compared to crystal nucleation. Based only on fitting the kinetic data from SAXS, a decrease in L_0 will consequently cause a decrease in k_n. Thus, while additional information, e.g. on the supersaturation, is required for a truly quantitative fit, the qualitative signatures of the two-step process, such as the plateau in the crystal fraction, can still be used to provide evidence for the scenario of a two-step nucleation process.
Nevertheless, this simple model can reproduce the experimental crystallization kinetics at different conditions. This can be achieved either by varying the rate parameters or by choosing the amount of MIP, which approximately approaches L − L_0 = 0.2. This value can be determined experimentally by following the protein concentration in the supernatant with time using UV-visible spectroscopy. A tentative experiment for a sample with 20 mM CdCl₂ leads to a value of 0.2. Further experiments are needed to refine this parameter.

Conclusions and outlook
We have investigated the two-step nucleation process of protein crystallization in solution by following the overall crystallization kinetics using real-time optical microscopy and SAXS. The experimental results, together with a rate equation model, provide solid evidence of two-step nucleation in the early stage of crystallization. The BLG-salt (CdCl₂) solutions were chosen at the transition zone of pseudo-c**, where small aggregates form after sample preparation. These protein aggregates serve as the metastable intermediate phase (MIP) during crystallization.
SAXS and SANS reveal that the MIP consists not of random aggregates but shows a certain local ordering, as indicated by a broad shoulder at intermediate q ≈ 0.7 nm⁻¹ and a monomer-monomer correlation peak at q around 2 nm⁻¹. Real-time SAXS results show that the crystallization kinetics is proportional to the development of the MIP in the early stage of crystallization, i.e. a local maximum in the crystallization rate appears at the maximum quantity of the intermediate. In the late stage of crystallization, a plateau develops due to the transition from nucleation-controlled kinetics in the early stage to growth-controlled kinetics after the consumption of the MIP. This transition in the overall crystallization kinetics is a typical feature of two-step nucleation in the early stage. These experimentally observed kinetics can be reproduced using a rate equation model.
For further real-time measurements, we note that the smaller beam size and scattering volume of SAXS can be compensated by using SANS. The combination of real-time SAXS and SANS can provide more systematic information of the crystallization kinetics.