Establishing the link between fibril formation and Raman optical activity spectra of insulin †

Folding of proteins into insoluble amyloidal fibrils is implicated in a number of biological processes. Optical spectroscopy represents a convenient tool to monitor such structural variations. Recently, characteristic changes in Raman optical activity (ROA) spectra of insulin during a pre-fibrillar stage were reported but not supported by a theoretical model. In the present study, molecular dynamics and density functional theory are used to simulate the spectra and to understand the connection between the structure and the ROA and Raman spectral intensities. Theoretical results are consistent with the observations and confirm the exceptional sensitivity of ROA to the protein tertiary structure. Surprisingly, this sensitivity reflects local conformational changes in the peptide main and side chains, rather than a direct through-space interaction of the protein components. Side chains providing strong ROA signals, such as tyrosine, can additionally report on local conformational features. Theoretical modeling helps in explaining the observed spectral changes and is likely to enable future applications of ROA spectroscopy in protein structural studies.


Introduction
Amyloidal protein aggregates and fibrils are involved in a wide variety of biological processes including serious medical conditions such as neurodegenerative disorders. 1 Insulin itself, a peptide hormone essential for the regulation of carbohydrate metabolism, 2 adopts several fibril forms participating in various metabolic pathways. 3 Insoluble protein precipitates are difficult to study by standard high-resolution techniques; the structure is not regular enough for standard X-ray diffraction and provides a mediocre signal for nuclear magnetic resonance spectroscopy. 4 Vibrational optical activity (VOA), including spectroscopies of vibrational circular dichroism (VCD) and Raman optical activity (ROA), has been suggested as an alternative technique that is extremely sensitive to protein structural variations. 5,6 Indeed, VCD spectroscopy could detect not only the formation of insulin fibrils, but also subtle changes in their macroscopic helical twist caused by pH variation. 7 The ROA technique, although in principle capable of capturing a wider range of molecular vibrations than VCD, has not yet been used extensively for fibrils due to experimental artifacts 8 inherent to inhomogeneous samples. The problem was recently overcome for a pre-fibrillar state of insulin, where ROA spectra could be recorded on an incident circularly polarized (ICP) ROA spectrometer. 9 Amyloidal clear or milky gels were incubated from a solution of bovine insulin and 0.1 M hydrochloric acid at 82 °C. After several hours at 22 °C the amyloid/fibrils refolded to their native form, and the whole process could be monitored spectroscopically. 10 In the future, ROA spectroscopy is thus expected to provide a more detailed insight into the fibrils' world. However, rather minute changes in the spectra have been interpreted predominantly empirically or only on simplified systems.
In the present study we develop a theoretical basis to interpret these experimental results by comparing them to density functional theory simulations of Raman and ROA intensities. This is a challenging task because the amyloid insulin structure is unknown at atomic resolution. In the past, the Cartesian coordinate transfer (CCT) 11,12 technique enabled us to model spectral intensities of the native insulin form 13 or even of larger globular proteins 14 with unprecedented precision. In these cases, however, the geometries in solution were assumed to be rather rigid and close to the published X-ray structures.
Fortunately, earlier studies indicate the most likely insulin secondary and tertiary structure in the fibrils as well. On a simplified insulin peptide sequence, a parallel β-sheet structure has been identified by X-ray as the basis of fibril conformation. 15 The parallel β-sheets are also formed by proteins of similar size and sequences to insulin. 16,17 Finally, ROA and Raman spectra of fibrous insulin resemble those of other proteins forming parallel β-sheets. 10 In the present study, β-sheet structures derived from X-ray geometries of other proteins and molecular dynamics (MD) simulations are jointly used to model the insulin structure.
Quantum-mechanical simulations of ROA spectral intensities stimulated this kind of spectroscopy, making it a reliable tool to determine the absolute configuration 18,19 and conformation 20 of biomolecules. In particular, spectral computations within the density functional theory (DFT) provide sufficient precision at acceptable computational cost. 21,22 They are based on a perturbation treatment of the interactions of molecules with circularly polarized light; 23,24 the origin-independence of results obtained at approximate computational levels is typically ensured by field-dependent atomic orbitals. 25 Normally, a harmonic approximation 26,27 is used for vibrational frequencies, although anharmonic extensions are possible as well. 28 Nevertheless, in spite of efficient implementations 22,29 the direct "ab initio" simulations (including DFT) become very lengthy for larger molecules. 30 Therefore, various simplifications were suggested in the past, such as the intensity-carrying normal mode algorithm 31 and the molecules-in-molecules (MIM) fragment-based approach. 32 We use the CCT methodology 11,12 allowing one to efficiently simulate large-molecule spectra 14 as well as to average a large number of MD snapshots with a minimal loss of DFT accuracy. 13 As we show in the present study, the method enabled us to link the observed spectral features to the protein structure and the native → fibril conformational change. By several computational experiments we could also test the sensitivity of ROA spectra to finer structural variations, investigate the role of contact (non-covalent) interactions of peptide chains, identify aromatic marker bands sampling the local geometry, and test other details of the computational methodology. Although primarily aimed at the insulin fibril experiment, the results confirm the potential of ROA spectroscopy to sense protein structural variations, including subtle conformational changes.

Measurement of Raman and ROA spectra
The spectra of native and fibrous insulin have been measured according to the protocol detailed elsewhere. 9 Briefly, after condensation at elevated temperature and sonication, a metastable fibril state was obtained and its Raman and ICP ROA spectra were recorded simultaneously. The formation of the fibrils was confirmed by electronic circular dichroism and fluorescence of the thioflavin T dye. The sample was kept in a 30 μl fused silica cell; after about 3 hours the conformation changed back to the native one, and its spectra were measured as well.

Molecular dynamics
To account for protein flexibility and temperature fluctuations, molecular dynamics simulations were performed within the Amber10 environment. 33 Monomeric insulin and the insulin trimer based on the β-roll geometry were placed in rectangular boxes (40 Å × 40 Å × 40 Å and 80 Å × 30 Å × 50 Å, respectively) filled with water molecules. In the trimer model only the middle molecule could move. The Amber03 force field, 34 the NVT thermodynamic ensemble, a temperature of 300 K, a 1 fs integration step and periodic boundary conditions were used. The systems were equilibrated for 1 ns; during a production run (8.65 ns) 865 snapshot geometries were taken (i.e., one every 10 ps). An average nuclear density was generated and for the best matching snapshot 35 the spectral parameters were calculated as described below, transferred to the other snapshots, and the resultant spectra were averaged.
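The sampling arithmetic and the choice of a representative snapshot can be illustrated with a minimal Python sketch. It is not the procedure of ref. 35: as a simplifying assumption, the "best matching" snapshot is taken here as the one with the lowest coordinate RMSD from the mean geometry, the snapshots are assumed to be already aligned, and all function names are hypothetical.

```python
import math

def snapshot_times(total_ns, stride_ps):
    """Times (ps) of snapshots saved every stride_ps during a run of total_ns."""
    n = round(total_ns * 1000 / stride_ps)
    return [stride_ps * (i + 1) for i in range(n)]

def mean_geometry(snapshots):
    """Average Cartesian coordinates; each snapshot is a list of (x, y, z)."""
    n_snap = len(snapshots)
    n_atoms = len(snapshots[0])
    return [
        tuple(sum(s[a][k] for s in snapshots) / n_snap for k in range(3))
        for a in range(n_atoms)
    ]

def rmsd(a, b):
    """Root-mean-square deviation between two pre-aligned geometries."""
    sq = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(sq / len(a))

def closest_to_mean(snapshots):
    """Index of the snapshot nearest to the mean geometry (assumed criterion)."""
    ref = mean_geometry(snapshots)
    return min(range(len(snapshots)), key=lambda i: rmsd(snapshots[i], ref))
```

With the values quoted in the text, `snapshot_times(8.65, 10)` yields the 865 snapshots of the production run.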
Other MD runs were performed within the Tinker program 36 modified to simulate the macroscopic twist. 37 The X-ray derived insulin geometry was placed in a rectangular box (12 Å × 110 Å × 110 Å) filled with water molecules. Helical periodic boundary conditions 37 were applied to allow for a minor (0°-6°) twist between neighboring insulin units. The Amber99 force field 38 was used in the NVT run with a 1 fs integration step. During a 0.1 ns production run 100 snapshots were selected, one per ps, and the spectra were generated as in the previous case.
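One way to picture helical periodic boundary conditions is as a screw operation: the image of an atom in the neighboring unit is obtained by a rotation about the fibril axis combined with a shift along it. The sketch below only illustrates this geometry and is not the modified Tinker implementation of ref. 37; the choice of x as the fibril axis and of the 12 Å box edge as the repeat distance are assumptions, and `helical_image` is a hypothetical helper.

```python
import math

def helical_image(point, twist_deg, rise, n=1):
    """Image of `point` in the n-th neighboring unit.

    The point is rotated by n * twist_deg about the x axis (assumed fibril
    axis) and translated by n * rise along the same axis.
    """
    x, y, z = point
    a = math.radians(n * twist_deg)
    return (
        x + n * rise,
        y * math.cos(a) - z * math.sin(a),
        y * math.sin(a) + z * math.cos(a),
    )

# Example: neighbor image for a -3 degree twist and an assumed 12 Angstrom rise
neighbor = helical_image((0.0, 10.0, 0.0), twist_deg=-3.0, rise=12.0)
```

For a zero twist the operation reduces to an ordinary translation, i.e. to standard periodic boundary conditions along the axis.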

Computation of vibrational spectra
The force field and tensor derivatives 23,24,39 needed to generate vibrational Raman and ROA spectra for the native and X-ray mimicking fibrous insulin forms were generated from smaller molecular fragments via the CCT method. 11,12 Following the automatic procedure described elsewhere 14 the insulin molecule was divided into overlapping "covalent" fragments containing four amino acid residues, complemented by "contact" fragments accounting for interactions between close side chains. However, test computations confirmed that non-covalent through-space interactions influence the resultant Raman and ROA spectra only in a minor way (Fig. S3, ESI †); thus the contact fragments were not considered further. This is consistent with previous results on other proteins 14 and benchmark numerical tests. 12,40 The omission of the non-covalent interactions led to a significant saving of computer time.
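The division into overlapping four-residue "covalent" fragments can be sketched as a sliding window along each chain. This is only an illustration: the window shift of one residue is an assumption (the overlap used by the automatic procedure of ref. 14 is not specified here), disulfide bridges and methyl capping are ignored, and `covalent_fragments` is a hypothetical name. The chain lengths (21 residues for the A chain, 30 for the B chain) are those of insulin.

```python
def covalent_fragments(chain_length, size=4, step=1):
    """Residue numbers (1-based) of overlapping fragments along one chain."""
    last_start = chain_length - size + 1
    return [
        tuple(range(start, start + size))
        for start in range(1, last_start + 1, step)
    ]

# Insulin chains: A has 21 residues, B has 30
fragments_a = covalent_fragments(21)
fragments_b = covalent_fragments(30)
```

With a shift of one residue every interior residue appears in several fragments, so the parameters transferred to a given atom can be taken from the fragment in which that atom lies closest to the center.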
The fragments were capped by methyl groups and subjected to partial optimization in vibrational normal mode coordinates, 41 fixing the limiting normal frequency 42 to 100 cm⁻¹. This lower value allowed for a more extensive relaxation of the geometries and provided slightly better spectra (Fig. S4, ESI †) than the limit of 300 cm⁻¹ used as default in the past. 13 The Qgrad program 41 was used for the normal mode optimization; the program is interfaced with the Gaussian 43 software. All quantum chemical computations on the fragments were carried out using the B3PW91 44 functional, which provides excellent results for protein ROA, 45 the standard 6-31++G** basis set and the conductor-like polarizable continuum solvent model (CPCM) 46,47 with water parameters accounting for both protein and aqueous environments.
For the optimized fragments, the harmonic force field (second energy derivatives) and derivatives of the α, G′ and A polarizability tensors 23,24 were calculated using Gaussian and transferred to the insulin molecule, and then back-scattered Raman and ICP 24 ROA intensities were generated as usual. 24,39 From the intensities, spectral curves were generated by a convolution with Lorentzian bands; the full width at half maximum was set to 10 cm⁻¹.
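The final broadening step can be sketched in a few lines of Python, assuming area-normalized Lorentzian bands; the function names and the unit line intensities in the example are illustrative only, while the 10 cm⁻¹ full width at half maximum is the value quoted above.

```python
import math

def lorentzian(x, x0, fwhm):
    """Area-normalized Lorentzian centered at x0 with the given FWHM."""
    hw = fwhm / 2.0
    return (hw / math.pi) / ((x - x0) ** 2 + hw ** 2)

def broaden(lines, grid, fwhm=10.0):
    """Sum of Lorentzian bands; `lines` holds (wavenumber, intensity) pairs.

    Intensities may be negative, as is the case for ROA.
    """
    return [
        sum(intensity * lorentzian(x, x0, fwhm) for x0, intensity in lines)
        for x in grid
    ]

# Hypothetical -/+ couplet with unit intensities at 1660 and 1674 cm^-1
grid = [1600.0 + i for i in range(151)]
spectrum = broaden([(1660.0, -1.0), (1674.0, 1.0)], grid)
```

Because the two bands of such a couplet are separated by little more than the band width, their tails partially cancel, which is one reason why closely spaced ROA features of opposite sign appear weaker than the underlying line intensities.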

Results and discussion
Computation versus experiment
Simulated and experimental spectra of the native and fibrous/amyloidal insulin are compared in Fig. 2. Details of the experimental spectral shapes (upper panels in the figure) were discussed elsewhere. 10 In the Raman spectra, formation of the fibrils is associated with minor spectral changes, such as the shift of the principal amide I band (mostly C=O stretching) from 1659 to 1674 cm⁻¹, and small intensity variations elsewhere. The ROA spectrum changes more; the fibril form gives a more compact "−/+" amide I "couplet" at 1660/1674 cm⁻¹ instead of a broader one at 1640/1668 cm⁻¹ of native insulin. The native −/+ 1240/1313 cm⁻¹ band intensities become smaller and a positive 1271 cm⁻¹ signal appears in this region for the fibril. The ROA signal around 1313 cm⁻¹ has been previously identified as an important marker band for the α-helix, 10,14,48-50 which forms approximately 43% of native insulin, and the 1271 cm⁻¹ band for the β-turn. 10 Negative bands at 282, 993 and 1447 cm⁻¹ in the ROA spectrum of native insulin disappear for the fibril, or are at least much less prominent due to experimental noise.
The simulation (lower part of Fig. 2) reproduces many of these observations, sometimes in surprising detail. The C=O stretching frequencies of amide I and carboxyl (experimentally within 1659-1728 cm⁻¹) are calculated too high, which is common for modeling at this level and has been identified as an error inherent to the dielectric solvent model. 51,52 In the Raman spectra, the observed shift in the average amide I frequency (1659 → 1674 cm⁻¹ for the native → fibril transition) is reproduced reasonably well, as 1730 → 1742 cm⁻¹. In ROA, the fibril form exhibits a sharp 1660(−)/1674(+) "couplet" in the experiment, which is well reproduced by the theory at 1728/1740 cm⁻¹. This seems to be a typical feature of β-sheet structures also observed in globular proteins rich in β-sheets (e.g. concanavalin A), 53 although model simulations indicate a significant dependence on detailed geometry, such as the β-sheet twist. 20 In the native form, the amide I ROA signal is broader and mostly positive. The experimental Raman 1617 cm⁻¹ band (calculated at 1669 cm⁻¹, cf. Table 1) belongs to C=C stretching vibrations in aromatic tyrosine residues, and is accompanied by a close band ("shoulder", experimentally at ~1608 cm⁻¹) due to analogous phenylalanine vibrations. These "aromatic" vibrations generate a fairly strong negative ROA signal at 1620 cm⁻¹ for the amyloid, reproduced by the simulation at 1657 cm⁻¹.
The amide II band (C-N stretching and NH bending, around 1540 cm⁻¹) is weak, which is usual in non-resonance Raman peptide spectroscopy. Only in resonance does its Raman intensity become comparable to that of amide I. 54 In native insulin, however, there is a weak negative ROA signal (1537 cm⁻¹, computed at 1557 cm⁻¹), disappearing for the amyloid, both in theory and experiment.
The histidine C-H bending vibration (calculated 1522 cm⁻¹, experimentally 1446 cm⁻¹) gives strong Raman bands but generates a very weak ROA signal, at spectrometer detection limits. In the extended "amide III" region (~1200-1400 cm⁻¹) the simulation confirms that ROA is relatively strong and changes with the conformation. The 1313/1339 cm⁻¹ (exp./calc.) ROA positive band of native insulin loses intensity and a new 1271/1306 cm⁻¹ positive band appears in the fibril. The simulated spectrum changes even more, and predicts a 1343 cm⁻¹ negative ROA band for the amyloid, undetected experimentally. Previously, vibrational modes in this region were identified as vibrations of the main peptide chain coupled with C-H bending and side chain vibrations, 9,14,48-50 lending them exceptional sensitivity to fine geometry changes.
The negative 1240 cm⁻¹ (exp.) ROA band of native insulin slightly shifts to 1233 cm⁻¹ for the fibril, in accord with the simulations (1239 → 1213 cm⁻¹), although here the observed intensity also changes less than that in theory. Partially, this can be explained by the incomplete conversion of the native state into the fibrillar one in the experiment. 10 The negative 993 cm⁻¹ experimental ROA band of native insulin is not well predicted theoretically either.
What is truly remarkable is the intense ROA signal in the lowest-wavenumber region (200-300 cm⁻¹), comparable with the strongest bands of the extended amide III region. This region was largely ignored in previous protein studies, mostly for experimental reasons (unavailability of narrow filters) and due to difficult interpretation. The +/− native insulin bands at 224/282 cm⁻¹ transform into a positive signal for the fibril, which seems qualitatively reproduced by the theory, although the simulated intensity is lower. Visualization of the dynamic displacement of vibrational normal modes revealed that the negative (282 cm⁻¹) ROA bands largely arise from α-helical-like segments of insulin. Previously, a similar +/− pattern was observed at 229/302 cm⁻¹ for highly helical human serum albumin, 14 thus confirming the potential of the low-energy vibrations for probing the protein structure.

Signal of the aromatic residues
Insulin aromatic residues comprise one histidine, three phenylalanine and four tyrosine groups. Even though the aromatic rings themselves are not chiral, a strong ROA signal can be induced by coupling of their vibrations with neighboring covalent bonds and other peptide parts. 14,55,56 In turn, they can locally sample the protein conformation. Indeed, according to the molecular dynamics modeling, phenylalanine and tyrosine conformer ratios (χ1 and χ2 side chain torsion angles) significantly change when insulin adopts the fibril/amyloidal form (Table S1, ESI †).
To understand how the Phe and Tyr residues may contribute to insulin ROA spectra, we simulated Raman and ROA spectra of NH2-Phe-COH and NH2-Tyr-COH model aldehydes (Fig. S5, ESI †). The simulations revealed significant changes in spectral shapes of aromatic bands at 1005, 1330 and 1650 cm⁻¹ due to the conformation of the aromatic side chain. For NH2-Phe-COH, the conformers were averaged using MD populations; Raman and ROA spectra were generated for the native and fibrillar insulin and are plotted in Fig. 3 (spectra of individual conformers are shown in Fig. S6, ESI †). For the fibril, the C=C stretching band (number III in Fig. 3) generates a relatively strong negative ROA signal, corresponding to the observed band at 1620 cm⁻¹ (Fig. 2). A similar intensity change occurs for the aromatic hydrogen bending (~1350 cm⁻¹, number II) and out-of-plane motion (~1000 cm⁻¹, number I), where, however, correspondence to the experiment is not so obvious due to overlap with other vibrations.
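The averaging of conformer spectra with MD populations amounts to a population-weighted sum, which can be sketched as follows. The conformer labels, populations and intensities in the example are invented for illustration and are not the values of Table S1; `weighted_average` is a hypothetical helper.

```python
def weighted_average(spectra, populations):
    """Population-weighted average of conformer spectra on a common grid.

    `spectra` maps a conformer label to a list of intensities and
    `populations` maps the same labels to (not necessarily normalized) weights.
    """
    total = sum(populations.values())
    n_points = len(next(iter(spectra.values())))
    return [
        sum(populations[c] * spectra[c][i] for c in populations) / total
        for i in range(n_points)
    ]

# Invented two-point ROA spectra for two hypothetical side chain conformers
spectra = {"conf_a": [1.0, -2.0], "conf_b": [3.0, 2.0]}
native_like = weighted_average(spectra, {"conf_a": 0.75, "conf_b": 0.25})
fibril_like = weighted_average(spectra, {"conf_a": 0.25, "conf_b": 0.75})
```

In this toy example the second point changes sign between the two population sets, which mirrors how a shift in conformer ratios alone can invert an ROA band without any change in the spectra of the individual conformers.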

Effect of fibril twist on the spectra
As pointed out above, Raman and ROA insulin spectra simulated with and without the fragments accounting for non-bonding inter-chain interactions were almost identical (Fig. S3, ESI †).
Earlier studies also suggest that the through-space mutual polarization and other effects contribute to the resultant intensities only marginally. 12 However, this does not mean that the ROA technique is insensitive to the tertiary protein structure. It does reflect the fine twist of fibril threads because the twist itself causes local changes in insulin conformation. This is documented in Fig. 4 where the spectra are simulated for three values of the twist (0°, −3° and −6°). The Tinker program was used to simulate the helical periodic boundary conditions, 37 so that the spectra for the 0° twist do not completely coincide with those simulated by the Amber program; the shapes are fairly similar nonetheless.
The twist variation approximately corresponds to that found in the model X-ray protein structures (Fig. 1). While the Raman spectra (bottom of Fig. 4) are more or less insensitive to the twist, the opposite is the case for ROA shapes (top). The ROA intensity variations are less pronounced above 1500 cm⁻¹, e.g., the amide "W" shape is conserved for the three twist values. On the other hand, the extended amide III region 1200-1400 cm⁻¹ undergoes more profound changes. Not all intensity variations are monotonic, e.g., for the −3° twist a positive ROA signal appears around 1275 cm⁻¹, with the intensity being smaller both for 0° and −6°. In the lower-wavenumber region (below ~1000 cm⁻¹) the spectra are less sensitive to the twist and depend on it in a more monotonic way. At the current level of experimental noise and computational precision it would be too speculative to deduce the twist of insulin fibrils merely by comparing the theoretical and experimental spectra; this "computational experiment" nevertheless documents the potential of ROA spectroscopy for future fibril structural studies.
Because the spectra for different fibril twists were generated from a limited number of MD snapshots, we had to ensure that the error of averaging is smaller than the effect of the twist itself. This is documented in Fig. S7 (ESI †): the small differences in Raman and ROA intensities obtained for the 50 and 100 snapshot simulations indicate that the twist-induced changes in the spectra are realistic, i.e. they are mostly caused by structural changes, such as variation of the distribution of torsional angles, rather than by incomplete MD averaging. Examples of the angular distributions are given in Fig. 5, for the main chain φ and ψ torsion angles, and the histidine and tyrosine χ1 side chain angle. Note, for example, that the (φ, ψ) values are close to (−145°, 150°), i.e., corresponding to the canonical β-sheet conformation, 57 and that some twist-induced changes are not monotonic in the row 0° → −3° → −6°, as is the case for the ROA spectra. As may be expected, the individual side chain angle χ1 distributions react more sensitively to the twist than the averaged φ and ψ ones.
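The convergence argument can be made concrete with a small sketch that compares the spectrum averaged over the first half of the snapshots with the full average, and sets this averaging error against the difference between two twist values. The mean absolute difference used as the error measure is an assumption made for illustration (the text compares the spectra visually in Fig. S7), and all names are hypothetical.

```python
def average_spectrum(snapshot_spectra):
    """Point-by-point average of spectra computed for individual snapshots."""
    n = len(snapshot_spectra)
    return [sum(column) / n for column in zip(*snapshot_spectra)]

def mean_abs_difference(a, b):
    """Mean absolute intensity difference between two spectra."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def averaging_error(snapshot_spectra):
    """Difference between the half-run average and the full-run average."""
    half = len(snapshot_spectra) // 2
    return mean_abs_difference(
        average_spectrum(snapshot_spectra[:half]),
        average_spectrum(snapshot_spectra),
    )

def twist_effect_is_resolved(spectra_twist_a, spectra_twist_b):
    """True if the twist-induced change exceeds both averaging errors."""
    effect = mean_abs_difference(
        average_spectrum(spectra_twist_a), average_spectrum(spectra_twist_b)
    )
    return effect > max(
        averaging_error(spectra_twist_a), averaging_error(spectra_twist_b)
    )
```

For 100 snapshots per twist value, `averaging_error` corresponds to the 50 versus 100 snapshot comparison described above.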

ROA versus VCD
From a more practical point of view it may be useful to summarize some specific features of fibril ROA as compared to another form of the vibrational optical activity, vibrational circular dichroism. As mentioned already in the introduction, one needs to be aware that inhomogeneous anisotropic samples, such as the fibrils, may cause instrumental artifacts for both techniques. 8,24,58 This problem seems to be less serious for VCD, where sample rotation or measurement in different orientations usually eliminates it. 59 For ROA, depolarization and reflection effects destroy the signal more, and many researchers including us look for ways to stabilize it. Therefore, in the present work, we restrict the discussion to the "immature" fibril experiment, assuring sample homogeneity. Preliminary data nevertheless suggest that the observed spectral features will also apply to "mature" fibrils consisting of larger insoluble aggregates.
Based on the results shown above, we can also conclude that the formation of the fibrils causes specific changes in ROA spectra. They are less spectacular than those for VCD; in particular, insulin fibrils provided a 10-100-fold VCD enhancement compared to the native form. 7,60 The enhancement seems to stem from a long-range synchronization of amide I (C=O stretching) vibrations. 61 Such an enhancement is to some extent possible also for ROA, 62 but has not been observed for the insulin fibrils. Here, the amide I signal "only" changes the shape (Fig. 2). As indicated by the simulations incorporating the macroscopic twist (Fig. 4), the longer-range order is reflected in the ROA spectra, too, but it is mediated by local conformational changes.
Finally, as a big advantage of ROA and Raman techniques in general, a wider range of wavenumbers is accessible than for VCD. For the insulin fibrils, the secondary structure change visible in the amide I region could be confirmed by the α-C-H bending (around 1310 cm⁻¹) and low-frequency modes (<300 cm⁻¹). Obviously, a combination of multiple techniques, such as ROA and VCD, is desirable for obtaining complete information on protein conformational behavior.

Conclusions
By combining molecular dynamics and density functional theory we simulated Raman and ROA spectra of the fibrillar conformation of insulin. Based on the modeling we could interpret most of the experimentally observed spectral changes, in particular the dependence of the ROA peak signs and intensities on the structure. The geometries derived from the β-roll and β-helix proteins provided similar spectra. Roughly speaking, most spectral changes could be interpreted as an α-helix → β-sheet secondary structure transition. The most significant changes comprised the amide I, extended amide III and the lowest-wavenumber (<300 cm⁻¹) spectral regions.
The aromatic residues provided not only a strong Raman signal, but also an ROA signal that may locally probe the structure. One also has to admit that many spectral features potentially useful for structure determination are obscured by the limited accuracy of the simulations, band overlaps and experimental noise.
Rather surprising and potentially useful for future fibril studies was the sensitive response of the amide III ROA signal to the fine changes of the conformation simulated for different macroscopic twists. The modeling showed that this does not reflect any direct non-covalent interaction of peptide chains, but is mediated by subtle changes in the local protein conformation. All the results confirm that the ROA technique is indeed very sensitive to protein conformational changes including the fibrillation, and that the complex simulations help extract additional details about the molecular structure and dynamics.
Two similar structural models of the amyloid were considered in the theoretical modeling, β-roll and β-helix. The β-roll insulin conformation was based on the β-roll protein (1VH4 in the structural database, http://www.rcsb.org/), i.e. the torsion angles of insulin were set to those in the A256-A306 residues of the protein. As for native insulin, 13 the polar groups were protonated to correspond to the experimental pH of 2.5-3.1. An analogous procedure was used for the β-helix, where the torsion angles matched those in the A113 to A163 residues in the β-helix protein (item 1DAB in the database). The terminal part of the insulin B chain (B22-B30) was kept in its native β-sheet-like conformation. The 1VH4 and 1DAB proteins, and the derived insulin pentamer models are plotted in Fig. 1. An alternate view is provided in Fig. S1 in the ESI. † The β-helix geometry differs from β-roll by an additional loop separating two β-sheet parts, making the structure triangular (Fig. 1, part b). However, the two amyloidal models provided very similar spectra (Fig. S2, ESI †), and only the β-roll-like geometry was used for extensive modeling and analysis.

Fig. 1 X-ray geometries of the β-roll and β-helix fibrous proteins. Torsion angles from the yellow parts (viewed from a different perspective also in (a) and (b)) were used for initial insulin geometries. Examples of regular insulin fibril segments (pentamers) based on these units are shown in (c) and (d).

Fig. 2 Experimental (top) and simulated (bottom) spectra of native and amyloidal insulin; the amide I vibrational region is expanded on the right-hand side. The calculated spectra were obtained as averages from 865 MD snapshot geometries.

Fig. 3 Calculated Raman and ROA spectra of a model NH2-Phe-COH molecule mimicking the Phe and Tyr average conformation in the fibril and native insulin. Characteristic bands of the aromatic ring vibrations (I-III) are indicated.

Fig. 4 ROA and Raman spectra of an insulin fibril simulated for three values of the twist (τ) between neighboring insulin units (average from 100 MD snapshots).

Fig. 5 Distribution of selected torsion angles for the three twist values (τ, cf. Fig. 4): average φ and ψ main chain angles, and the χ1 angle of the histidine and tyrosine side chains.

Table 1 Positions (cm⁻¹) of the most intense Raman and ROA bands in insulin spectra