Solution structure of hyperactive type I antifreeze protein

Antifreeze proteins (AFPs) protect freeze-intolerant fish species living in icy polar waters against freeze damage. A 34 kDa dimeric type I antifreeze protein (wfAFP 1h) with unusually high activity in comparison to all other antifreeze proteins in fish was recently discovered in the winter flounder. We have measured the size and shape of this hyperactive AFP by using small angle X-ray scattering. Our results show that wfAFP 1h adopts a long cylindrical shape with a length of 19 ± 2 nm and a diameter of 2.3 ± 0.2 nm, which means that wfAFP 1h does not form a fully extended helical dimer in solution. These findings call for a revision of the structural model of wfAFP 1h and the concept of a flat, threonine-rich ice-binding site extending down the length of the protein. Instead, the hyperactive nature of wfAFP 1h may be derived from a unique 3D arrangement of the helices, yet to be resolved, by which it is able to bind to ice surfaces.


Introduction
The freezing point of seawater (−1.9 °C) is more than 1 °C below the equilibrium freezing point of the body fluids of fish surviving in icy polar seawaters. To protect themselves in this undercooled state from rapidly freezing upon contact with ice, these fish produce antifreeze plasma proteins that lower the freezing point by stopping the growth of ice crystals [1]. Antifreeze proteins (AFPs) have specific structural properties that enable them to adsorb to the surface of ice crystals, thereby restricting the addition of water molecules to the growing ice surface. The ice surface is forced to grow with a thermodynamically unfavorable curvature, leading to a local depression of the non-equilibrium freezing point, which arrests further ice growth [2]. Over the years, a wide diversity of AFP structures has been found in many different organisms ranging from fish [3] to insects [4], bacteria [5] and fungi [6].
In 2004, Marshall and coworkers reported an unusually potent type I AFP isoform in winter flounder (wfAFP 1h) that was 10 to 100 times more active than the previously discovered type I AFP in the same fish species [7]. Circular dichroism measurements at 4 °C showed that this alanine-rich, 195-residue (16.7 kDa) protein is almost entirely α-helical [8]. Analytical ultracentrifugation and gel permeation chromatography furthermore indicated that this protein forms a dimer in solution and that it is extremely asymmetric, with an estimated axial ratio of L/d = 18 [9]. Based on these data, Marshall and coworkers concluded that wfAFP 1h is a helical dimeric rod of 29 nm in length and 1.6 nm in diameter, consisting of two long helices associated side-by-side. A coiled-coil structure was ruled out based on the paucity of hydrophobic amino acid residues (e.g., Leu, Ile, Val), thereby explaining the instability of the protein (T_denat ≈ 9 °C) [8,10]. Also, a coiled-coil structure would interfere with regularly positioned threonine residues appearing on a flat ice-binding site along the length of the protein. The complex was therefore modelled as an antiparallel dimer composed of two straight helices with relatively weak interactions between the alanine (Ala) side chains, i.e., as an Ala-zipper, resulting in a closer packing with L = 27.5 nm and d = 1.6 nm, giving an axial ratio of L/d = 17 [10]. In this model both subunits could simultaneously engage in ice-binding.
To obtain reliable evidence on the exact dimensions and multimeric state of wfAFP 1h, we have performed small angle X-ray scattering (SAXS) experiments on the protein in solution. SAXS is a powerful tool for characterizing the structure of macromolecules in their native environment [11]. Coherent scattering of X-rays by randomly oriented proteins gives information on their size, shape and molar mass. In contrast to X-ray diffraction, SAXS does not require crystalline samples, which greatly broadens its applicability but reduces the attainable resolution.
The results presented here confirm the dimeric state of wfAFP 1h and provide accurate values for its dimensions. The length of wfAFP 1h as determined from the SAXS experiments is shorter than predicted by the previous model [10], which implies that the helical dimer is arranged differently and is not fully stretched out.

Sample preparation
Recombinant wfAFP 1h was produced as previously described [12] with the following modification. Instead of ice affinity purification, three rounds of ammonium sulfate precipitation were used. This is an effective method for purification because wfAFP 1h precipitates from solution at a relatively low percent saturation of ammonium sulfate (28%) while most Escherichia coli proteins remain soluble. The wfAFP 1h pellet from the last precipitation was re-suspended and dialyzed overnight in 50 mM HEPES buffer (pH 6.5) to remove ammonium sulfate. Purity was estimated by SDS-PAGE, and the protein concentration was measured by amino acid analysis. Recombinant wfAFP 1h at 15 mg mL⁻¹ was stored at 4 °C and used within 2-3 weeks of its preparation.
Small angle X-ray scattering
Data acquisition and data reduction. The synchrotron radiation X-ray scattering data were collected at the high-brilliance beamline ID02 of the ESRF in Grenoble, France [13], operating at 12.46 keV. The scattering intensity was measured as a function of the momentum transfer q = 4π sin(θ)/λ, where λ = 0.1 nm is the radiation wavelength and 2θ is the scattering angle. Two sample-to-detector distances of 1.5 and 3 m were used to cover the range 0.064 < q < 3.85 nm⁻¹. Samples were measured in a polycarbonate (ENKI, KI-Beam) flow-through capillary with a diameter of d = 1.9 mm kept in a temperature-controlled holder at T = 5 °C. The two-dimensional SAXS patterns were normalized to an absolute intensity scale using the calibrated detector response function, the known sample-detector distance, and the measured incident and transmitted beam intensities [13]. These normalized SAXS patterns were subsequently azimuthally averaged to obtain the one-dimensional SAXS profiles. For each sample, 10 frames of 0.3 s were collected and averaged after checking for radiation damage. This corresponds to a total data collection time of 3 s per sample with a reduced flux of about 10¹² photons s⁻¹. To obtain the protein scattering curve, the normalized background scattering profile of the buffer and polycarbonate cell was subtracted from the normalized sample scattering profiles. Finally, the absolute calibration of the scattering curves was verified using the known scattering cross-section per unit sample volume, dΣ/dΩ, of water, I(0) = 0.01665 cm⁻¹ at T = 5 °C [14].
Data analysis. Small angle X-ray scattering is a powerful tool to determine the dimensions and molar mass of proteins and protein complexes directly in solution.
The primary requirement is an accurate measurement of the q-dependence of the scattering intensity of the sample of interest, the buffer, and a calibration standard with known scattering cross-section, such as water [14], to compute the differential scattering cross-section per unit sample volume, dΣ/dΩ. This quantity gives direct access to the size, shape, and average molar mass of the protein (complex) under investigation according to
$$\frac{d\Sigma}{d\Omega}(q) = K\,c\,M_w\,P(q)\,S(q) \qquad (1)$$
with the difference in scattering contrast with the solvent, K, the weight concentration, c, the molecular weight, M_w, and interference effects arising from the structure within the sample, i.e., interparticle and intraparticle interference represented by the structure factor, S(q), and the form factor, P(q). First, the scattering data were analyzed using a Guinier approximation to extract the radius of gyration, R_g, and the forward scattering intensity, I_0, which is dΣ/dΩ(q → 0). For monodisperse globular proteins, the Guinier approximation is valid for qR_g ≤ 1.3, and R_g and I_0 were determined from the slope and y-intercept of the Guinier plot ln(dΣ/dΩ(q)) vs. q² using PRIMUS from the ATSAS software package [15]. For a rod-like particle, a modified Guinier approximation can be used in the q-range where dΣ/dΩ(q) ∝ q^(−α) with α ≈ 1 to extract the cross-sectional radius of the cylinder, R_cs, from the determined cross-sectional radius of gyration, R_g,cs, with R_cs = √2 R_g,cs.
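The Guinier step described above can be sketched in a few lines. This is a minimal illustration, not the PRIMUS implementation: the function name `guinier_fit`, the iterative narrowing of the fit window, and the synthetic test curve (R_g = 5.0 nm, I_0 = 0.43 cm⁻¹) are assumptions made for the example and are not values reported in this work.

```python
import numpy as np


def guinier_fit(q, intensity, qrg_max=1.3, n_iter=20):
    """Estimate R_g and I_0 from ln I(q) = ln I_0 - (R_g**2 / 3) * q**2.

    The fit window is narrowed iteratively until all points used satisfy
    q * R_g <= qrg_max (hypothetical helper, not part of ATSAS).
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    mask = np.ones(q.shape, dtype=bool)
    for _ in range(n_iter):
        # Straight-line fit of ln I against q^2 on the current window.
        slope, intercept = np.polyfit(q[mask] ** 2, np.log(intensity[mask]), 1)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; no R_g can be extracted")
        rg = np.sqrt(-3.0 * slope)
        new_mask = q * rg <= qrg_max
        if new_mask.sum() < 3:
            raise ValueError("too few points inside the Guinier region")
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return rg, np.exp(intercept)


# Synthetic, noise-free Guinier curve (illustrative values only).
q_demo = np.linspace(0.064, 0.5, 100)          # nm^-1
i_demo = 0.43 * np.exp(-(q_demo * 5.0) ** 2 / 3.0)
rg_demo, i0_demo = guinier_fit(q_demo, i_demo)
print(f"R_g = {rg_demo:.2f} nm, I_0 = {i0_demo:.3f} cm^-1")
```

For a rod, the same linear fit applied to ln(q·I(q)) vs. q² has slope −R_g,cs²/2, from which R_cs = √2 R_g,cs follows.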
Using the obtained radius of gyration and the cross-sectional radius from the Guinier approximation, the length of a cylinder can be calculated by
$$L = \sqrt{12\left(R_g^2 - \frac{R_{cs}^2}{2}\right)} \qquad (2)$$
Subsequently, I_0 = dΣ/dΩ(q → 0) was used to calculate the molar mass of the protein (complex), which is listed in Table 1, according to
$$M_{SAXS} = \frac{I_0\,N_{av}}{c\,(\Delta\rho\,v)^2} \qquad (3)$$
with the molecular weight M_SAXS in g mol⁻¹, the forward scattering intensity I_0 in cm⁻¹, the concentration c in g cm⁻³, Avogadro's number N_av, the scattering length density difference Δρ in cm⁻² (ρ_protein − ρ_H2O, where ρ_protein = 1.25 × 10¹¹ cm⁻² and ρ_H2O = 9.44 × 10¹⁰ cm⁻²) [16] and the partial specific volume of the protein in solution, v = 0.7302 cm³ g⁻¹, calculated using SEDNTERP [17].
Finally, we have analyzed the scattering profiles in the entire recorded q-range with various form factor models using the software package SASfit [18], assuming that wfAFP 1h was measured in an ideal dilute solution where interparticle interactions can be neglected (i.e., S(q) = 1). Two form factors were used: one describing a cylindrical object [19] and the other describing a worm-like chain model according to the equations reported by Pedersen and co-workers [20].
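The two relations used in this step, the solid-cylinder identity R_g² = R_cs²/2 + L²/12 and the absolute-scale molar mass M = I_0 N_av / (c (Δρ v)²), can be written as a short sketch. The function names are hypothetical, and the forward intensity of 0.43 cm⁻¹ at 15 mg mL⁻¹ is an illustrative input rather than a measured value; the contrast and partial specific volume are the ones quoted in the text.

```python
import math

N_AVOGADRO = 6.02214076e23  # mol^-1


def cylinder_length(rg, rcs):
    """Length of a homogeneous cylinder from R_g^2 = R_cs^2 / 2 + L^2 / 12."""
    arg = rg ** 2 - rcs ** 2 / 2.0
    if arg < 0:
        raise ValueError("R_g is too small for the given cross-sectional radius")
    return math.sqrt(12.0 * arg)


def molar_mass_from_i0(i0, conc, delta_rho, v_bar):
    """Molar mass in g/mol from absolute forward scattering.

    i0 in cm^-1, conc in g cm^-3, delta_rho in cm^-2, v_bar in cm^3 g^-1.
    """
    return i0 * N_AVOGADRO / (conc * (delta_rho * v_bar) ** 2)


# Contrast and partial specific volume as quoted in the text.
delta_rho = 1.25e11 - 9.44e10   # cm^-2
v_bar = 0.7302                  # cm^3 g^-1

# Illustrative forward intensity (assumed, not a reported measurement).
mass = molar_mass_from_i0(i0=0.43, conc=0.015, delta_rho=delta_rho, v_bar=v_bar)
print(f"M_SAXS = {mass / 1000:.1f} kDa")

# Round trip: build R_g from a known L and R_cs, then recover L.
rg_example = math.sqrt(1.12 ** 2 / 2.0 + 17.6 ** 2 / 12.0)
print(f"L = {cylinder_length(rg_example, 1.12):.1f} nm")
```

Keeping all inputs in CGS units, as in the text, makes the result come out directly in g mol⁻¹ without extra conversion factors.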

Molecular shape reconstruction
To derive information on the solution structure of the protein, the molecular shape was reconstructed using simulated annealing methods. First, the radial distribution function (RDF), P(r), which describes the probable frequency of interatomic vector lengths (r) within the scattering particle, was obtained by indirect Fourier transformation of the scattering data using GNOM [21]. The maximum linear dimension (D_max) was set to approximately 3 × R_g and adjusted to give the best fit to the experimental data. The RDF was constrained to be zero at r = 0 Å and to approach zero at D_max.
The GNOM output files were used as input for simulated annealing calculations over the range 0.08 < q < 3.8 nm⁻¹ using the online version of DAMMIF [22]. For each scattering curve, 10 independent bead models were generated without predefined shape or symmetry. The 10 different models were superimposed using DAMSEL and DAMSUP. Next, DAMAVER was used to average the aligned models and compute the probability map [23]. Finally, DAMFILT was used to filter the averaged model to give a structure that retains the high-density regions of the probability map.

Results and discussion
Guinier and form factor analysis
A Guinier approximation (Fig. 1B) was used to obtain the radius of gyration R_g and molecular weight M_w. The estimated molecular weight of M_w = 34.7 ± 2.5 kDa corresponds to the formation of a dimer in solution. By using a modified Guinier approximation for a rod (Fig. 1C) the cross-sectional radius R_cs is obtained. Using these structural parameters, the length of the protein complex was calculated using eqn (2), confirming that wfAFP 1h adopts a highly elongated structure with a length L = 17.6 ± 0.9 nm and a cross-sectional radius of R_cs = 1.12 ± 0.01 nm, giving an axial ratio of L/d = 7.9 ± 0.4 (Table 2).
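The quoted axial ratio can be reproduced from L and R_cs with d = 2 R_cs. The sketch below assumes that the two uncertainties are independent and uses first-order error propagation; the source does not state how the uncertainty on L/d was obtained, so this is only one plausible route to the same numbers.

```python
import math


def axial_ratio(length, d_length, rcs, d_rcs):
    """Return L/d and its uncertainty, with d = 2 * R_cs.

    Assumes independent errors on L and R_cs (first-order propagation).
    """
    ratio = length / (2.0 * rcs)
    relative_error = math.hypot(d_length / length, d_rcs / rcs)
    return ratio, ratio * relative_error


ratio, error = axial_ratio(17.6, 0.9, 1.12, 0.01)
print(f"L/d = {ratio:.1f} ± {error:.1f}")  # prints: L/d = 7.9 ± 0.4
```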
The experimental data are also well described by a rigid cylindrical form factor (Fig. 1A and Table 3) and a worm-like chain model (Fig. S1 and Table S1 of the ESI). The cylindrical model assumes a rigid cylinder with uniform scattering length density. The structural parameters extracted from fitting the data with a cylindrical form factor are the cross-sectional radius R_cs = 1.26 ± 0.02 nm, which is close to the R_cs obtained from the Guinier analysis, and the length L = 19.7 ± 0.2 nm, which is slightly larger than the Guinier estimate.
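For readers who want to see what a rigid, homogeneous cylinder model looks like numerically, the sketch below evaluates the standard orientationally averaged cylinder form factor. It is not the SASfit code used for the fits: the function names are hypothetical, the Bessel function J1 is computed from Bessel's integral to keep the example dependent on NumPy only, and the midpoint-rule grid sizes are arbitrary choices.

```python
import numpy as np


def bessel_j1(x, n_tau=64):
    """J1(x) from Bessel's integral (1/pi) * int_0^pi cos(tau - x sin tau) dtau."""
    x = np.asarray(x, dtype=float)
    tau = (np.arange(n_tau) + 0.5) * np.pi / n_tau  # midpoints of [0, pi]
    return np.cos(tau - x[..., None] * np.sin(tau)).mean(axis=-1)


def cylinder_form_factor(q, radius, length, n_alpha=200):
    """Orientationally averaged P(q) of a homogeneous cylinder, P(0) = 1."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    # Angle between the cylinder axis and q, sampled at midpoints of (0, pi/2).
    alpha = (np.arange(n_alpha) + 0.5) * (np.pi / 2.0) / n_alpha
    x_rad = q[:, None] * radius * np.sin(alpha)        # radial argument
    x_ax = q[:, None] * length * np.cos(alpha) / 2.0   # axial argument
    safe = np.where(x_rad > 0, x_rad, 1.0)
    radial = np.where(x_rad > 0, 2.0 * bessel_j1(safe) / safe, 1.0)
    axial = np.sinc(x_ax / np.pi)                      # sin(x) / x
    integrand = (radial * axial) ** 2 * np.sin(alpha)
    return integrand.mean(axis=1) * (np.pi / 2.0)


# Fitted cylinder parameters quoted in the text (nm).
q_grid = np.linspace(0.064, 3.85, 50)
p_model = cylinder_form_factor(q_grid, radius=1.26, length=19.7)
print(p_model[:3])
```

At low q this model reduces to the Guinier law with R_g² = R_cs²/2 + L²/12, which is the link between the form factor fit and eqn (2).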

Structural model of the protein complex
From the results of the Guinier and form factor analyses it is evident that wfAFP 1h has a highly extended shape with a length of L = 19 ± 2 nm and a diameter of d = 2.3 ± 0.2 nm. These dimensions imply that wfAFP 1h cannot form a dimer of two fully extended helices, since the latter would correspond to a length of ≈27.5 nm. To derive information on the shape of the dimer, we have reconstructed the ab initio shape of the protein from simulated annealing calculations using DAMMIF, which features an unlimited and adapting search volume that avoids boundary effects, especially in highly elongated objects [21]. The averaged and filtered volumetric map of wfAFP 1h shows a rod-like shape with a length of 18.5 nm (Fig. 2D), consistent with the dimensions found in the Guinier and form factor analysis. Given the dimensions and high helical content of wfAFP 1h, it is evident that the molecular structure of this hyperactive antifreeze protein is unlike that of any antifreeze protein found so far. The β-helical folds of hyperactive AFPs from insects display a flat β-sheet ice-binding surface with regularly positioned threonine residues [4]. Hence, a likely arrangement of the dimer would be two helices associated side-by-side, enabling a uniform presentation of threonine residues along the length of the protein [10]. Yet, this structure is not consistent with the SAXS results, which means that a different conformation of the helices might confer the hyperactive antifreeze activity.
To get more insight into the 3D arrangement of the helices in the protein complex of wfAFP 1h, we have performed rigid body modelling using SASREF [25]. SASREF uses multiple rigid subunits with known atomic coordinates to perform quaternary structure modelling of a protein complex against the experimental data. Since no crystallographic data on the subdomains in the protein complex of wfAFP 1h are available, we have constructed two long α-helices with a length of 18.5 nm and two short helices to account for the remaining number of residues of the protein complex. These subunits and contact conditions (Fig. 3A) were provided as input for SASREF. The result of the rigid body modelling (Fig. 3C) seems consistent with the results of the more coarse-grained modelling approach and comes very close to the hypothesized structure in which the protein complex is able to display regularly positioned threonine residues on its ice-binding face, enabling a large protein surface area to engage in ice-binding, which could explain the hyperactive nature.
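The mismatch between the measured length and a fully extended helix can be made concrete with simple arithmetic. The sketch assumes the textbook axial rise of an ideal α-helix, 0.15 nm per residue, which is not a value given in this work; the function names are hypothetical.

```python
RISE_PER_RESIDUE_NM = 0.15  # assumed axial rise per residue of an ideal alpha-helix


def helix_length_nm(n_residues):
    """Length of a straight, ideal alpha-helix with n_residues residues."""
    return n_residues * RISE_PER_RESIDUE_NM


def residues_in_helix(length_nm):
    """Number of residues that fit in an ideal alpha-helix of the given length."""
    return round(length_nm / RISE_PER_RESIDUE_NM)


n_total = 195                         # residues per monomer, from the text
full_length = helix_length_nm(n_total)
n_fit = residues_in_helix(19.0)       # measured length of the complex
print(f"fully extended helix: {full_length:.1f} nm")
print(f"residues in a 19 nm helix: {n_fit} of {n_total}")
```

Under this assumption a single uninterrupted helix of 195 residues would be roughly 29 nm long, in line with the earlier rod model, whereas a 19 nm rod accommodates only about two thirds of the chain in one straight helix.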
However, we need to draw attention to the limitations of the resulting structural model. Firstly, the initial contact conditions influence the SASREF results. Therefore, about six contact conditions per complex were set and various contact conditions were tested (Fig. S3-5 of the ESI). Reducing the number of contact conditions resulted in one of the subunits losing contact with the complex. Secondly, we propose the structural model in Fig. 3C on the basis of additional information rather than the goodness of fit of the different SASREF models. Thirdly, the subunits are adapted from PDB
Table 2 Structural parameters obtained from a (modified) Guinier analysis. R_g and M_w are obtained using a Guinier approximation (Fig. 1B), R_cs by using a modified Guinier approximation for a rod (Fig. 1C), and L was calculated using eqn (2). Column headings: Conc. (mg mL⁻¹), R_g (nm), M_w (kDa), R_cs (nm), L (nm), L/d.
Table 3 Structural parameters obtained after fitting the SAXS data with a cylindrical form factor. The radius of gyration is calculated from the obtained R_cs and L using eqn (2). Column headings: Conc. (mg mL⁻¹), R_g (nm), R_cs (nm).
Protein complex has unique molecular structure
Given the extraordinary shape of the protein, which is very distinct from that of any other AFP found so far, we have searched the Protein Data Bank (PDB) for proteins with a similar shape based on four criteria: (1) high α-helicity, (2) dimeric state, (3) number of residues close to 195 as found for wfAFP 1h and (4) length and axial ratio. We selected 10 different crystal structures and computed their theoretical scattering curves from the electron density profiles using the known atomic coordinates deposited in the PDB with CRYSOL [26]. The theoretical scattering curves were then compared to the experimental scattering data (see Fig. S6-7 of the ESI). Only a handful of the selected α-helix-rich dimers in the PDB have a structure that is comparable, yet not identical, to that of wfAFP 1h. Most of the matching protein structures originate from the cytoskeleton or muscle. Clearly, the protein complex adopts a structure with high asymmetry, which is unique for a plasma protein.
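Comparing a theoretical scattering curve with experimental data comes down to a goodness-of-fit score after putting both on the same scale. The sketch below shows a generic error-weighted χ² with a least-squares scale factor. It is not CRYSOL's algorithm, which also adjusts hydration-shell and excluded-volume parameters; the function name and the synthetic curves are assumptions for illustration.

```python
import numpy as np


def scaled_chi2(i_exp, sigma, i_calc):
    """Least-squares scale factor and reduced chi^2 between two curves.

    i_exp, sigma: experimental intensities and their standard deviations.
    i_calc: theoretical intensities evaluated at the same q values.
    """
    i_exp = np.asarray(i_exp, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    i_calc = np.asarray(i_calc, dtype=float)
    weights = 1.0 / sigma ** 2
    # Scale that minimises sum(w * (i_exp - scale * i_calc)**2).
    scale = np.sum(weights * i_exp * i_calc) / np.sum(weights * i_calc ** 2)
    chi2 = np.sum(weights * (i_exp - scale * i_calc) ** 2) / (i_exp.size - 1)
    return scale, chi2


# Synthetic example: two Guinier-like curves with different R_g (nm).
q = np.linspace(0.064, 0.5, 60)
data = 0.43 * np.exp(-(q * 5.0) ** 2 / 3.0)
errors = 0.01 * data
good_model = np.exp(-(q * 5.0) ** 2 / 3.0)
poor_model = np.exp(-(q * 3.0) ** 2 / 3.0)
print(scaled_chi2(data, errors, good_model))
print(scaled_chi2(data, errors, poor_model))
```

Ranking candidate structures by such a score only indicates similarity in overall shape at SAXS resolution; it does not establish that two proteins share the same fold.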

Conclusions
Structural data form the basis for understanding the hyperactive nature of wfAFP 1h. Our SAXS experiments provide the first reliable and independent estimate of the length, diameter and molecular weight of wfAFP 1h in solution. We find that wfAFP 1h self-assembles into a dimer with a highly extended (i.e., non-globular) conformation that is unusual for a plasma protein and mostly found in muscle and cytoskeleton proteins. The protein complex has a rigid cylindrical shape with a length and diameter of 19 ± 2 nm and 2.3 ± 0.2 nm, respectively. This demonstrates unequivocally that the two α-helices in the dimer are not fully extended (i.e., L = 27.5 nm), contrary to what was hypothesized previously, irrespective of whether they are arranged in a coiled-coil or parallel conformation. Furthermore, if the helices in the dimer were held together via an interaction of the alanine side chains, the packing would be too close to match the experimentally observed diameter, suggesting that wfAFP 1h adopts a more complex conformation. Further evidence from biochemical methods and/or diffraction studies is necessary to gain more insight into the 3D arrangement of the two monomers in the protein complex and the positioning of regularly placed threonine residues at the ice-binding site.