The secondary structure of diatom silaffin peptide R5 determined by two-dimensional infrared spectroscopy

Asger Berg Thomassen a, Thomas L. C. Jansen *b and Tobias Weidner *a
aDepartment of Chemistry, Aarhus University, Langelandsgade 140, Aarhus C 8000, Denmark. E-mail: weidner@chem.au.dk
bZernike Institute for Advanced Materials, University of Groningen, Groningen 9747 AG, The Netherlands. E-mail: t.l.c.jansen@rug.nl

Received 5th March 2024, Accepted 26th May 2024

First published on 30th May 2024


Abstract

Diatoms, unicellular marine organisms, harness short peptide repeats of the protein silaffin to transform silicic acid into biosilica nanoparticles. This process has been a white whale for material scientists due to its potential in biomimetic applications, ranging from medical to microelectronic fields. Replicating diatom biosilicification will depend on a thorough understanding of the silaffin peptide structure during the reaction, yet existing models in the literature offer conflicting views on peptide folding during silicification. In our study, we employed two-dimensional infrared spectroscopy (2DIR) within the amide I region to determine the secondary structure of the silaffin repeat unit 5 (R5), both pre- and post-interaction with silica. The 2DIR experiments are complemented by molecular dynamics (MD) simulations of pure R5 reacting with silicate. Subsequently, theoretical 2DIR spectra calculated from these MD trajectories allowed us to compare calculated spectra with experimental data, and to determine the diverse structural poses of R5. Our findings indicate that unbound R5 predominantly forms β-strand structures alongside various atypical secondary structures. Post-silicification, there's a noticeable shift: a decrease in β-strands coupled with an increase in turn-type and bend-type configurations. We theorize that this structural transformation stems from silicate embedding within R5's hydrogen-bond network, prompting the peptide backbone to contract and adapt around the biosilica precursors.


1 Introduction

Biomineralization is a process where living organisms produce organic–inorganic composite materials with a high level of structural control such as exoskeletons, shells, bones and teeth.1–3 Scientists have attempted to mimic biominerals to produce biocompatible and sustainable materials with promising applications such as cancer diagnosis and treatment,4 bone regeneration medicine,5 self-healing concrete6 and improving water quality.7

Biosilica is produced by diatoms (Bacillariophyceae) for their cell walls through biosilicification from low concentration (∼70 μM) silicic acid in the ocean.8–11 One of the key advantages of biomineralized silica for applications is that the formation can occur at room temperature, neutral pH and atmospheric pressure, while the reaction is fast and leads to a biomaterial with impressive material properties.8,12,13 Thus, biosilica could be a route to an environmentally friendly production of silica nanostructures.

In 1999 Kröger et al. found that silaffins are one of the major classes of biosilica precipitating proteins in the diatom species Cylindrotheca fusiformis.14 Within this class of proteins are the silaffin-1 peptides, which are derived from the Sil1p polypeptide; its C-terminal part is proteolytically cleaved in vivo into 7 repeat units (R1–R7).8,15 One of these repeat units from the silaffin-1A band of Sil1p is R5 (SSKKSGSYSGSKGSKRRIL), and it has been shown that this peptide produces biosilica nanoparticles from silicic acid in vitro at a pH value of ∼7.14

The precursor for biosilica, silicic acid Si(OH)4, is a weak acid with a pKa of 9.8, so at neutral pH, where R5 produces biosilica, ∼0.18% of the silicic acid is deprotonated and found as the reactive silicate species Si(OH)3O−.16 Silica is produced in a condensation reaction in which silicate reacts with silicic acid and releases water; repeated condensation amounts to a polymerization that eventually leads to larger particles.16 Silaffin peptides are thought to act as templates for this reaction, where the polyamines of post-translational modifications can improve the reaction conditions of the condensation by bringing the reactants together.2
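As a rough check on the size of the deprotonated fraction, the Henderson–Hasselbalch relation can be evaluated directly. The sketch below assumes ideal behaviour (activities equal to concentrations) and a single deprotonation step; the function name is illustrative. With a pKa of 9.8 the fraction is of the order of 0.1–0.2% around neutral pH, and the exact value is sensitive to the pH assumed.

```python
def deprotonated_fraction(pH, pKa):
    """Fraction of a monoprotic weak acid present as its conjugate base.

    Follows from pH = pKa + log10([A-]/[HA]), assuming ideal solutions.
    """
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


# Evaluate for silicic acid (pKa 9.8, from the text) at two near-neutral pH values.
for pH in (7.0, 7.1):
    percent = 100.0 * deprotonated_fraction(pH, 9.8)
    print(f"pH {pH:.1f}: {percent:.2f}% present as silicate")
```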

R5–biosilica particles have been applied to immobilize enzymes,17,18 encapsulate and release cargo molecules,19 as a vaccine delivery platform20 and an R5 derivative was also used to synthesize antimicrobial peptide nanoparticles.21

Regardless of the many applications and vast potential of R5–biosilica, the structure of the peptide during silicification is not yet fully understood.22 In order to achieve increased control of the process for nanotechnology applications, it is crucial to gain further insight into the structure of the peptide at the molecular level. We here use two-dimensional infrared (2DIR) spectroscopy to follow the process of biosilicification.

There have been several contradictory studies describing the structure of R5 during silicification.22 Roehrich & Drobny studied the structure of the neat R5 peptide and the composite material of R5 and silica by using solid-state NMR.23 They observed a significant change in the chemical shifts of the backbone atoms, but no patterns that would resemble a defined change in the secondary structure. Instead, they attributed the large shifts of the lysine residues near the N-terminus to solvent exposure in a micelle assembly.23 These findings backed the micelle-like self-assembly model, which was previously hypothesized by Knecht & Wright due to the positively charged side chains on arginine together with the hydrophobic leucine and isoleucine in the RRIL sequence.24 In another solid-state NMR study by Buckle et al., a possible secondary structure change was found between neat R5 and the R5–SiO2 composite, possibly to increase the exposure of charged side chains in a micelle to silicate.25

The possibility of β-strands and a secondary structure change of R5 in the silica composite and the micelle model were disputed by the findings of Senior et al. in a study of the structure of R5 where they used 1D and 2D solution-state 1H-NMR, circular dichroism (CD) and dynamic light scattering (DLS).26 In their study, they concluded that R5 is unstructured and monomeric in solution and the addition of silicic acid does not induce a structural change.

Since silicification is a surface process, Lutz and coworkers studied the R5 structure during silicification at the air–water interface, by combining experiments and simulations of the surface sensitive technique sum-frequency generation (SFG).27

In contrast to the findings of Senior et al.,26 they found that in addition to undefined structures, R5 contains β-strand structures, which were reduced during silicification at the air–water interface. Since the number of interpeptide bonds was reduced during silicification they concluded that silica possibly affects the H-bond network to induce a conformational change of the backbone during silicification.27

Roeters et al. conducted an SFG study at the peptide–silica interface where they combined MD simulations with spectra calculations based on NMR dihedral angles reported by Drobny et al.22,25 They found that R5 possibly changes structure from a β-sheet to turns during silicification at the peptide–silica interface and that a micelle-type assembly is plausible. The solid-state NMR torsion angles also showed that the peptide is extended in a strand-like structure and then contracts slightly into a coiled, more turn-rich structure.23

Gascoigne and coworkers attempted to reconcile the findings from solution-state NMR with the findings from solid-state NMR by combining scattering techniques with CryoTEM images.28 From their analysis they found that the peptide assembles into a mass fractal of 100–200 nm, which is composed of spherical units of 1 nm radius.28 The authors suggested that these findings28 could unify the theory of a larger assembly from solid-state NMR25 with the monomers found in solution-state NMR.26,28

However, the study by Gascoigne et al.28 provided no further insight into the secondary structure of R5. Thus, the existing structural studies of R5 remain conflicting. The solution-state NMR study indicates that R5 is a monomeric random coil whose secondary structure does not change upon silicification.26 The solid-state NMR and SFG studies suggest micelle assembly with a β-strand-like R5 structure that changes conformation during silicification.22,23,25,27 The scattering study instead suggested fractal assembly.28

More in situ structural studies are thus necessary to determine the secondary structure of R5 during mineralization. Two-dimensional infrared (2DIR) spectroscopy in the amide I region is a well-established method for determining protein and peptide secondary structures. The amide I mode originates from the amide bond and is dominated by CO stretching coupled to the adjacent NH bending modes. Since the mode is found throughout the peptide backbone and is sensitive to hydrogen bonding, it is an excellent reporter of secondary structures. The strength of 2DIR lies in the measurement of amide I resonances and their couplings across two spectral axes, which significantly enriches the lineshapes with detailed structural information.30 This attribute of 2DIR makes it particularly effective for in-depth structural analysis of peptides like R5.

2 Materials and methods

2.1 Sample preparation

The R5 peptide (SSKKSGSYSGSKGSKRRIL) was purchased from GenScript (98.2% purity). 1 M Si(OH)4 was prepared by mixing 150 μL of TMOS (98%, Sigma-Aldrich) with 850 μL of 1 mM DCl (dilution of 35 wt% DCl, 99 atom% D, Sigma-Aldrich) and sonicating the solution for 5 minutes. 10 mM PBS (Sigma-Aldrich) was used in all peptide solutions. For neat R5 samples a 1 mL solution of 5 mg mL−1 (2.5 mM) peptide in a 0.01 M solution of PBS–D2O was prepared in an Eppendorf tube. 20 μL of the solution was added to an IR sample cell between two CaF2 windows with a 200 μm Teflon spacer. For the samples with silicate, a solution of 0.5 mL of 1 mM R5 in D2O and 0.01 M PBS was prepared. 13 μL of 1 M Si(OH)4 were then mixed into the solution in a glass vial, which resulted in white flakes and a cloudy solution after 10 minutes of incubation time. 0.5 mL of the mixture with the precipitate was transferred to an Eppendorf tube, which was centrifuged at 14 000 rpm for 5 minutes and washed with PBS–D2O three times. Finally, 60 μL of PBS–D2O was added to the centrifuged precipitate in the Eppendorf tube to make a suspension of silica–R5 particles where 20 μL was transferred to a 2DIR cell by pipetting.
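The nominal concentrations in the silicification mixture follow from simple dilution. The sketch below uses only the volumes and stock concentrations quoted above and assumes that volumes are additive and that nothing has precipitated yet; the helper name is illustrative.

```python
def mixed_concentration(c_stock_mM, v_stock_uL, v_total_uL):
    """Concentration (mM) of a stock component after dilution to v_total_uL."""
    return c_stock_mM * v_stock_uL / v_total_uL


v_peptide_uL = 500.0   # 0.5 mL of 1 mM R5 in PBS-D2O
v_silicic_uL = 13.0    # 13 uL of 1 M Si(OH)4
v_total_uL = v_peptide_uL + v_silicic_uL  # assumes additive volumes

silicic_mM = mixed_concentration(1000.0, v_silicic_uL, v_total_uL)
r5_mM = mixed_concentration(1.0, v_peptide_uL, v_total_uL)
molar_ratio = silicic_mM / r5_mM  # silicic acid molecules per peptide

print(f"Si(OH)4: {silicic_mM:.1f} mM, R5: {r5_mM:.3f} mM, ratio {molar_ratio:.0f}:1")
```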

2.2 Scanning electron microscopy

After centrifugation, the particles were dried under vacuum. The dry particles were transferred to an aluminium SEM stub with conductive carbon adhesive tape. Unadhered particles were removed from the stub by using a flow of nitrogen. Scanning electron microscopy (SEM) images were obtained using a TESCAN CLARA(S8151) in the analysis mode. An accelerator energy of 15 keV, a current of 7 nA and a working distance of 10 mm were used.

2.3 2DIR measurements

The 2DIR spectrometer is a 10 kHz time domain instrument (2DQuickIR from Phasetech, Madison WI, USA).31,32 A Yb-based laser (Pharos, Light Conversion) is used to generate 1030 nm pulses of ∼100 fs duration with a 10 kHz repetition rate. A broadband mid-IR pulse (∼1500–1800 cm−1) is then generated from an optical parametric amplifier (ORPHEUS, Light Conversion) and a difference-frequency mixer (Lyra, Light Conversion). The broadband pulses are then split into pump and probe pulses using a beamsplitter with most of the power going to the pump. Each pump pulse is split into a pair of pulses using a pulse shaper, which consists of two gratings and a germanium acousto-optic modulator in the 4f geometry. The energy of each pump pulse pair was ∼1 μJ and the probe pulse energy was ∼0.15 μJ. The delay between the pair of pump pulses and the probe pulse is controlled using a delay stage. A half-wave plate (HWP) is used to change the polarization of the probe beam followed by a polarizer to filter the light. The pump and probe beams overlap non-collinearly using two parabolic mirrors. The beam diameter of the pump beam is ∼200 μm and the beam diameter of the probe beam is ∼150 μm at the sample. The probe beam enters a grating spectrometer (Princeton Instruments) with 75 grooves mm−1. Finally, the dispersed light from the grating spectrometer is recorded using a 10 kHz 64-pixel MCT detector array (Jackhammer, Phasetech). A rotating frame at 1350 cm−1 was used in the experiments such that the delay between the pump pulses could be scanned in 35 fs steps up to a total of 2205 fs.31 For every pump time delay the signal was measured with a phase cycling scheme in which the relative phase between the two pump pulses takes four different values and the resulting signals are summed to reduce scattered light.29,30,33 A Hamming function was used for windowing.
To produce the 2DIR spectrum, the measured probe light was Fourier transformed with respect to the time delay between pump pulses. The probe axis was calibrated with respect to an H2O FTIR spectrum. Spectra with absolute intensities of more than 1.5 times the median of the absolute intensities were discarded due to scattering events.
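The link between the rotating-frame frequency and the 35 fs delay step can be made explicit with a short calculation. The sketch below assumes the usual sampling picture, in which scanning the pump delay in steps of Δt in a frame rotating at ν_rot gives an alias-free pump-frequency window from ν_rot to ν_rot + 1/(2cΔt), and it assumes that the scan starts at zero delay. Both assumptions, and all names, are illustrative rather than taken from the instrument documentation.

```python
C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def rotating_frame_window(frame_cm1, step_fs):
    """Alias-free pump-frequency window (cm^-1) for a given delay step.

    Assumes real-valued sampling in a rotating frame: frequencies are
    measured relative to frame_cm1 and are unambiguous up to Nyquist.
    """
    nyquist_cm1 = 1.0 / (2.0 * C_CM_PER_S * step_fs * 1e-15)
    return frame_cm1, frame_cm1 + nyquist_cm1


step_fs = 35          # delay step from the text
max_delay_fs = 2205   # longest pump-pump delay from the text
n_delays = max_delay_fs // step_fs + 1  # assumes delays 0, 35, ..., 2205 fs

low, high = rotating_frame_window(1350.0, step_fs)
print(f"{n_delays} delay points, window {low:.0f} to {high:.0f} cm^-1")
```

Under these assumptions the window extends from 1350 cm−1 to a little above 1820 cm−1, which covers the ∼1500–1800 cm−1 bandwidth of the mid-IR pulses.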

For each sample, spectra were recorded in parallel and perpendicular relative polarizations between the pump and probe pulses. All spectra were recorded with a waiting time between pump and probe pulses of 250 fs with a spectrum at −10 ps subtracted from each spectrum to further reduce scattering. The beam path was purged with dry nitrogen to remove water absorption lines from the spectra.

2.4 Molecular dynamics simulations

The force field for Si(OH)3O− (silicate) was made using Q-force,34 which augments a transferable force field with force field parameters that are calculated from quantum mechanics calculations for a chosen molecule. Default settings were used within Q-force with the overall charge set to −1.

Two different starting structures of R5 were used for two different 2 μs molecular dynamics (MD) simulations with and without silicate. Each simulation contained one R5 peptide and either 6 Cl− in the neat R5 simulation or 6 Si(OH)3O− in the other simulation.

The starting peptide structures were generated by Roeters et al. from solid-state NMR-derived torsion angles reported by Roehrich et al. for lyophilized R5 before and after interaction with silica.22,23 The structures are shown in Fig. 1a. The OPLS-AA force field35 was used for the peptide since a benchmark study has shown that 2DIR spectra that are calculated from simulations using this force field provide a high overlap with experimental spectra.36 SPC/E was used for the water molecules since it predicts the diffusion coefficients of water at a higher accuracy than many other models36–38 and the model has frequently been used for MD simulations in other studies for 2DIR simulations of proteins.39–41 A well-described diffusion constant of water is important since water has a significant contribution to the lineshapes in 2DIR spectra. All the MD simulations were performed using Gromacs 2019.4 (ref. 42–49) and the simulations mostly follow the protocol outlined by Lemkul.50 During the production run the coordinates were saved every 10 ps. Further details of the MD simulations are available in the ESI.


Fig. 1 (a) The neat R5 peptide (top) and R5 during interaction with silica (bottom). The structures are from Roeters et al.22 with torsion angles reported by Roehrich et al.23 (b) An Eppendorf tube containing the R5–silica precipitate produced by R5 with silicic acid. The R5–silica nanoparticles were separated from the solution by centrifugation. (c) SEM image of the R5–silica nanoparticles. (d) Experimental 2DIR spectra of neat R5 (left spectra) and R5–silica particles (right spectra). The top row shows the spectra recorded in parallel polarization (‖) and the bottom row shows the spectra recorded in perpendicular polarization (⊥). A grey square is marked below 1620 cm−1 to show the area that was excluded in the comparison with the calculated spectra since this region mainly contains side-chain modes. All the spectra are normalized to the maximum absolute intensity in the plotted region with contour levels that are symmetric around 0 and spaced evenly.

Representative structures were found by clustering the frames in the 2 μs trajectories. New simulations with a length of 1 ns were started from these structures with the same settings as the 2 μs simulations, but with the coordinates sampled every 20 fs. A simulation time of 1 ns was chosen since, in the 2 μs trajectories, all frames within 1 ns after the representative structure still belonged to the same cluster.

Sampling the coordinates at small enough intervals is crucial since it determines the maximum linewidths that can be calculated with the NISE program.51–57 Furthermore, the amide I frequencies fluctuate on time scales of less than 100 fs due to solvent librations and hydrogen bonding fluctuations,58 so motional narrowing will occur if the sampling timestep is too large.51 If the timestep is too small, the computational costs and disk space requirements increase. Timesteps of 20 fs have been shown in a benchmark study to be sufficiently small and to provide the best balance between lineshapes and computational costs.59 Additionally, 20 fs timesteps have been used in other studies for the purpose of calculating 2DIR spectra.39,51,60,61

The backbone structures of the R5 peptide in the 2 μs MD simulations were clustered based on the root-mean-square deviation (RMSD) of the backbone. TTClust62 was used as the clustering software, which utilizes SciPy's clustering functions63 using Python 3.8.8. The Ward linkage clustering algorithm64 was employed on each of the 2 μs trajectories, where every 10th frame was used due to the heavy computational load of calculating the large RMSD matrix. Ward linkage was chosen since it has been found in a benchmark to perform well at clustering MD trajectories to retrieve representative structures from the ensemble.65 TTClust offers an algorithm, called the elbow method, to determine the minimum number of clusters that should be included.66 The elbow method was used for both 2 μs simulations, which yielded 4 clusters in both cases. However, upon visual inspection two of the clusters in the neat simulation appeared highly similar, so the number of clusters was instead set to 3 for the neat simulation, which merged these two clusters into one. Finally, representative structures were extracted from each cluster. The representative structures are defined as the frames with the lowest average RMSD to the other frames within the cluster.62
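The definition of a representative structure used here, the frame with the lowest average RMSD to the other frames in its cluster, amounts to picking the cluster medoid. The following is a minimal sketch of that selection step only. It does not reproduce TTClust or the Ward clustering itself, and the RMSD matrix and cluster labels are made-up toy values.

```python
def representative_frames(rmsd, labels):
    """Map each cluster label to the frame with the lowest mean RMSD
    to the other frames of the same cluster (the cluster medoid)."""
    members = {}
    for frame, label in enumerate(labels):
        members.setdefault(label, []).append(frame)

    reps = {}
    for label, frames in members.items():
        def mean_rmsd(i):
            others = [j for j in frames if j != i]
            if not others:  # single-frame cluster
                return 0.0
            return sum(rmsd[i][j] for j in others) / len(others)

        # min() returns the first frame in case of a tie
        reps[label] = min(frames, key=mean_rmsd)
    return reps


# Toy symmetric RMSD matrix (nm) for five frames; values are hypothetical.
toy_rmsd = [
    [0.00, 0.10, 0.12, 0.60, 0.65],
    [0.10, 0.00, 0.08, 0.55, 0.62],
    [0.12, 0.08, 0.00, 0.58, 0.60],
    [0.60, 0.55, 0.58, 0.00, 0.15],
    [0.65, 0.62, 0.60, 0.15, 0.00],
]
toy_labels = [1, 1, 1, 2, 2]  # hypothetical cluster assignment per frame

reps = representative_frames(toy_rmsd, toy_labels)
print(reps)
```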

2.5 Spectral calculations

From each of the 1 ns simulations, the Hamiltonian was calculated using the AIM software.51 The Skinner electrostatic map for the backbone amide I vibrations67 was used, with the dipole moments calculated using the Torii method68 and the transition dipole coupling (TDC) map by Tasumi & Torii to calculate the coupling between amide groups that are not nearest neighbors.68 The GLDP method by Jansen & Knoester was used for nearest neighbour coupling and frequency shift.69 This combination of maps together with the OPLS-AA force field was chosen since a benchmark study showed that it performs well for calculating 2DIR spectra.36 Electrostatic effects were not considered when the distance between residues was more than 2 nm, a cutoff that was also used in other studies.51,61 The Hamiltonian and dipole moment obtained from AIM were used to calculate the 2DIR spectra using the NISE_2017 program.52–57 The response functions were averaged over an ensemble of 500 realizations, which were calculated from starting structures taken every 100 frames or 2 ps of the trajectory. The coherence times were set to 5.12 ps and a fixed vibrational lifetime of 1 ps was applied to the coherence times through an exponential apodization function. The anharmonicity was assumed to be 16 cm−1 for the amide I vibration, as measured by Hamm et al. for deuterated NMA using 2D pump–probe spectroscopy.70 Similar settings have also been used in other protein 2DIR studies.36,59,71

The waiting time was set to 240 fs. The waiting time is 10 fs lower than the experimental waiting time but was chosen because the coordinates were sampled every 20 fs in the MD simulations and the calculated waiting time must be an integer multiple of the time between coordinate sampling steps. Alternatively, 12.5 fs sampling time steps in the MD simulations could be used, but that would result in significantly larger files. A 10 fs difference in waiting time should be negligible within the accuracy of the experiments. All spectra were calculated in the frequency range 1540 cm−1 to 1740 cm−1. When simulating 2DIR spectra of proteins it is a common practice to shift the frequency axes either by optimizing the overlap with the experimental spectra such that lineshapes can better be compared to experiments36,51,59 or by using a systematic shift such that the peak positions can also be compared to experiments.36,39,59,72
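The frame bookkeeping behind these settings can be checked with exact arithmetic. The sketch below uses only numbers quoted in this section; the function name is illustrative.

```python
from fractions import Fraction


def steps(duration_fs, sampling_fs):
    """Number of sampling steps spanning duration_fs, or None when the
    duration is not an integer multiple of the sampling interval."""
    n = Fraction(str(duration_fs)) / Fraction(str(sampling_fs))
    return int(n) if n.denominator == 1 else None


# Waiting time: 250 fs does not fit a 20 fs grid, 240 fs does.
print("250 fs on a 20 fs grid:  ", steps(250, 20))
print("240 fs on a 20 fs grid:  ", steps(240, 20))
print("250 fs on a 12.5 fs grid:", steps(250, 12.5))

# Other settings expressed in frames of 20 fs.
print("frames in a 1 ns trajectory:  ", steps(1_000_000, 20))
print("frames between realizations:  ", steps(2000, 20))   # 2 ps spacing
print("steps in a 5.12 ps coherence: ", steps(5120, 20))
```

A 12.5 fs grid would hold 1.6 times as many frames per nanosecond as the 20 fs grid, which is the file-size penalty mentioned above.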

Cunha et al. found a systematic frequency underestimation of 9.6 cm−1 independent of the secondary structure when using the same force field and frequency map as in this work.36 Based on their findings, a systematic shift of +9.6 cm−1 was applied to the frequencies in all the calculated spectra in this paper.

3 Results

3.1 2DIR experiments

The silica particles produced from the mixture of R5 (Fig. 1a) and silicic acid are shown in the SEM image (Fig. 1b and c). The typical diameters of the particles are in the nanometer range. 2DIR spectra of R5 in PBS–D2O (neat R5) and of a suspension of R5–silica precipitate in PBS–D2O were recorded in parallel and perpendicular polarizations. The spectra shown in Fig. 1d have been normalized to the maximum of the absolute spectra within the plotted region. The peak located near 1645 cm−1 on the diagonal is assigned to the amide I resonance.73 There are two additional peaks present at frequencies below the amide I region near 1585 cm−1 and 1610 cm−1. These modes are assigned to the two nearly degenerate C–N stretching modes in the guanidinium groups of the two arginine side chains within the R5 sequence.74 These features may also include a very small contribution from the tyrosine side chain,73 but the bands from this side chain are typically much weaker than the arginine side chain bands73 and in R5 there is only one tyrosine compared to the two arginine residues.

Based on visual inspection, the difference between the amide I peaks in the spectra of neat R5 and R5 with silica appears small. The main difference between the spectra is that the amide I peaks in the neat spectra are rounder and the spectra with silica are more elongated along the diagonal. In addition, the amide I mode was shifted off the diagonal in the silica spectra. Since R5 is a small and flexible protein, it is known to contain little22 or no26 defined secondary structure. Therefore, it is expected that the differences between the amide I peaks are subtle. The main resonances are centred around 1645 cm−1, which is typically assigned to disordered and coiled structures. To extract more detailed structural information from the data, we must go beyond direct spectral interpretation and have therefore performed MD simulations and spectral calculations.

3.2 MD simulations and 2DIR calculations

Two MD simulations with R5 of 2 μs simulation times were performed – one with neat R5 and one with R5 and silicate. We then applied a clustering algorithm to group similar structural poses into clusters based on the RMSD of the backbone atoms. The representative structures from each cluster were extracted. Using the representative structures as starting configurations, 1 ns MD simulations with high coordinate sampling rates were produced.

Theoretical 2DIR amide I spectra were calculated from the 1 ns MD simulations. The calculated spectra are summarized in Fig. 2. Note that the arginine side chain modes have a small overlap with the amide I modes, but these side chain modes have not been included in the calculations because no frequency maps are available.


Fig. 2 Calculated 2DIR spectra from 1 ns MD simulations that were started from the representative R5 structure in each cluster. On the left are the spectra, which were calculated from neat R5 simulations and on the right are spectra which were calculated from simulations containing R5 and silica. For each MD simulation the spectrum was calculated in both parallel (‖) and perpendicular (⊥) polarizations.

The main amide I peaks of neat R5 clusters 2 and 3 are narrower than the experimental features. When comparing the simulated amide I spectra to the experimental spectra, the spectra of clusters 2 and 3 lack intensity in the high-frequency region, whereas cluster 1 has too much intensity in the high-frequency region. All the clusters are missing some width on the low-frequency side of the main amide I peak compared to the experiment, but this could be explained by the overlap between the arginine modes and the amide I modes, which possibly broadens the red side of the amide I peak in the experimental spectrum. Since the arginine modes are not part of the simulation, this broadening on the red side of the amide I peak is not seen in the simulated spectra.

Moving to the simulation clusters for R5 when bound to silica, the 2DIR spectra of cluster 1 qualitatively match the experimental spectra best – in both spectral position and shape. Cluster 3 also matches the resonance position of the experimental data. The peak positions of R5–silica clusters 2 and 4 show a significant mismatch when compared with the experiment.

2DIR experiments probe an ensemble of many different peptide structures whose amide I modes overlap. For this reason, we have investigated how far a linear combination of the structural clusters can capture the experimental data. To determine how much each of the clusters contributes to the experimental ensemble, we have calculated linear combinations of the 2DIR spectra from each cluster by fitting them to the experimental data.

The spectral overlaps between the calculated spectra and the experimental spectra were determined using the normalized standard deviation S2D, which compares the points in the 2D spectra and takes a value between 0 and 1, where 1 means a perfect match.75 In every iteration of the fitting procedure, the calculated spectra were interpolated to the experimental frequencies with the quintic spline interpolation from SciPy's interp2d.63 scipy.optimize.minimize63 was used with the Broyden, Fletcher, Goldfarb, and Shanno (BFGS) method76 to maximize S2D. The spectra were normalized to the maximum of the absolute intensities before calculating the overlap and the fitting coefficients were normalized to the sum of the coefficients. The arginine region below (1620 cm−1, 1620 cm−1) was omitted from the fit. The spectra obtained from the fitting procedure are depicted in Fig. 3, along with the associated fitting coefficients for each cluster. Notably, the spectra, both in the parallel and perpendicular polarizations, demonstrate a balanced distribution of influence between clusters 1 and 3. This distribution corroborates the preliminary observations based on the visual inspection of the spectra.
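The fitting idea can be illustrated with a toy example. The sketch below is not the procedure used in this work: it replaces S2D with a normalized inner product (cosine similarity) as a stand-in overlap measure, replaces BFGS with a coarse grid search over two weights, skips the interpolation step, and uses made-up spectra on a 3 × 3 frequency grid. It does show three ingredients described above: masking the region where both frequencies lie below 1620 cm−1, forming a linear combination of cluster spectra, and normalizing the coefficients to their sum.

```python
import math


def overlap(a, b, mask):
    """Cosine similarity of two 2D spectra over unmasked points.
    Stand-in for S2D; unlike S2D it can become negative."""
    num = sa = sb = 0.0
    for i, row in enumerate(mask):
        for j, keep in enumerate(row):
            if keep:
                num += a[i][j] * b[i][j]
                sa += a[i][j] ** 2
                sb += b[i][j] ** 2
    return num / math.sqrt(sa * sb)


def combine(spectra, weights):
    """Linear combination with coefficients normalized to their sum."""
    total = sum(weights)
    w = [x / total for x in weights]
    rows, cols = len(spectra[0]), len(spectra[0][0])
    return [[sum(wk * s[i][j] for wk, s in zip(w, spectra))
             for j in range(cols)] for i in range(rows)]


def fit_two(spectra, target, mask, n_steps=100):
    """Grid search for the two weights that maximize the overlap."""
    best_score, best_w = -2.0, None
    for k in range(n_steps + 1):
        w = [k / n_steps, 1.0 - k / n_steps]
        score = overlap(combine(spectra, w), target, mask)
        if score > best_score:
            best_score, best_w = score, w
    return best_score, best_w


freqs = [1600.0, 1640.0, 1680.0]  # hypothetical pump/probe grid (cm^-1)
mask = [[not (fi < 1620.0 and fj < 1620.0) for fj in freqs] for fi in freqs]

cluster_a = [[0.0, 0.1, 0.0], [0.1, 1.0, 0.2], [0.0, 0.2, 0.3]]  # made up
cluster_b = [[0.0, 0.0, 0.1], [0.0, 0.4, 0.3], [0.1, 0.3, 1.0]]  # made up
target = [[0.3 * a + 0.7 * b for a, b in zip(ra, rb)]
          for ra, rb in zip(cluster_a, cluster_b)]
target[0][0] = 5.0  # mimics a side-chain feature inside the masked region

score, weights = fit_two([cluster_a, cluster_b], target, mask)
print(f"overlap {score:.4f}, weights {weights[0]:.2f} / {weights[1]:.2f}")
```

Because the masked point is ignored, the large side-chain-like value in the target does not bias the recovered weights.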


Fig. 3 The 2DIR spectra of each cluster were fitted to the experimental spectra to maximize the overlap. The fitted spectra shown were normalized to the maximum of the absolute intensity in the plotted region. The spectra are plotted with 21 evenly spaced and symmetric contour levels. The left spectra show the fitted spectra for neat R5 and the right shows the fitted spectra for R5–silica. The fitted spectra are shown in parallel (‖) and perpendicular (⊥) polarizations. The fitting coefficients for each spectrum generated from a cluster are shown in the spectra. The spectral overlap (S2D) of each fitted spectrum with the corresponding experimental spectrum is also shown.

In the fitted spectra for R5 bound to silica, for parallel polarization, the results show that cluster 1 exhibits the most significant weighting with cluster 4 and cluster 2 following with lower weights. Cluster 3 has no significant contribution. The results are similar for the spectra in perpendicular polarization. The contributions of clusters 1 and 4 to the R5–silica spectrum are approximately equivalent, each accounting for around 50% of the total spectra.

3.3 R5 secondary structure

A Ramachandran difference plot of the best-fitting ensembles before and after silica interaction is shown in Fig. 4a. The dihedral angles of all the residues for all frames of the 1 ns simulations were calculated using MDAnalysis.77,78 Due to the large number of frames, 2D histograms were constructed from 2° × 2° bins and weighted by the spectral fitting coefficients. The bins were normalized to the maximum bin value across all the histograms and this value was set to 100%. The summed histograms of the structure ensembles without silica were subtracted from those with silica to obtain the difference Ramachandran plot shown in Fig. 4a. The plot clearly shows that by interacting with silica, the number of β-strand-like features is reduced, while β-turns are more abundant. Ostensibly, interaction with silica favours more folded and contracted structural motifs. A similar behaviour has been observed for R5 during mineralization at interfaces.22,27

Fig. 4 Analysis of the secondary structure of the R5 peptide in the 1 ns simulations where the total structural content was found by summing and weighting the content of each simulation with the average (between parallel and perpendicular) fitting coefficients. The summed contents of neat R5 were then subtracted from R5–silica so negative and positive values represent a loss and increase, respectively, of the particular structural elements after silicification. (a) Difference in Ramachandran plots found by binning dihedral angles in a 2D histogram with 2° × 2° bins. Each bin was normalized to the max bin value across all plots, which was set to 100%, and then the difference in the weighted sum of each bin was calculated. The typical regions for β-strands and β-turns are shown in the plots. (b) The secondary structure analyzed by using the DSSP algorithm with the difference in the structure.
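The construction of the difference Ramachandran plot can be outlined in a few lines. This is an illustrative sketch with made-up dihedral angles and weights, not the analysis script used here. It follows one reading of the description: weight each cluster histogram by its fitting coefficient, scale by the largest bin found in any single histogram, sum per condition and subtract neat R5 from R5–silica. In practice the (φ, ψ) pairs would come from the trajectories.

```python
BIN_DEG = 2
N_BINS = 360 // BIN_DEG  # 180 bins per axis covering [-180, 180)


def bin_index(angle_deg):
    """Index of the 2-degree bin containing angle_deg (wraps at +180)."""
    return int((angle_deg + 180.0) // BIN_DEG) % N_BINS


def weighted_histogram(phi_psi, weight):
    """Sparse 2D histogram {(phi_bin, psi_bin): weighted count}."""
    hist = {}
    for phi, psi in phi_psi:
        key = (bin_index(phi), bin_index(psi))
        hist[key] = hist.get(key, 0.0) + weight
    return hist


def summed(hists):
    total = {}
    for h in hists:
        for key, value in h.items():
            total[key] = total.get(key, 0.0) + value
    return total


def difference_percent(hists_silica, hists_neat):
    """Silica minus neat, in percent of the largest single-histogram bin."""
    peak = max(v for h in hists_silica + hists_neat for v in h.values())
    s, n = summed(hists_silica), summed(hists_neat)
    return {k: 100.0 * (s.get(k, 0.0) - n.get(k, 0.0)) / peak
            for k in set(s) | set(n)}


# Hypothetical angles: one neat cluster, two silica clusters (weights 0.5 each).
neat = [weighted_histogram([(-120.5, 130.2), (-121.0, 131.0), (-60.0, -30.0)], 1.0)]
silica = [
    weighted_histogram([(-59.0, -29.0), (-59.5, -29.5)], 0.5),
    weighted_histogram([(-59.2, -29.9), (-120.5, 130.2)], 0.5),
]

diff = difference_percent(silica, neat)
strand_bin = (bin_index(-120.5), bin_index(130.2))
turn_bin = (bin_index(-60.0), bin_index(-30.0))
print("strand-like bin:", diff[strand_bin], "turn-like bin:", diff[turn_bin])
```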

DSSP (define secondary structure of proteins) analysis of a structured ensemble can assign the secondary structure for each residue based on hydrogen bond patterns in the backbone.79

The secondary structure was assigned using DSSP with MDTraj80 for each 1 ns simulation. Besides a significant assignment to the random coil structure, there was a significant amount of defined secondary structures identified using DSSP analysis (see Fig. S1, ESI). The structural contents in percentage were summed for both neat R5 and R5–silica, while weighted by the corresponding average fitting coefficients. Finally, the sum of neat R5 was subtracted from the sum of R5–silica to obtain the difference DSSP plot shown in Fig. 4b.
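The weighting of the DSSP contents can be written compactly. In the sketch below all percentages and fitting coefficients are invented placeholders, and the dictionary layout and function names are illustrative; only the sequence of steps (average the parallel and perpendicular coefficients, form the weighted sum per sample, subtract neat R5 from R5–silica) follows the description above.

```python
def average_coefficients(parallel, perpendicular):
    """Mean of the parallel and perpendicular fitting coefficients."""
    return [(a + b) / 2.0 for a, b in zip(parallel, perpendicular)]


def weighted_content(contents, coefficients):
    """Coefficient-weighted sum of per-cluster structure percentages."""
    total = {}
    for content, c in zip(contents, coefficients):
        for structure, percent in content.items():
            total[structure] = total.get(structure, 0.0) + c * percent
    return total


def difference(silica, neat):
    """R5-silica minus neat R5 for every structure class."""
    keys = sorted(set(silica) | set(neat))
    return {k: silica.get(k, 0.0) - neat.get(k, 0.0) for k in keys}


# Placeholder DSSP contents (%) per cluster; not results from this work.
neat_clusters = [
    {"strand": 40.0, "coil": 50.0, "bend": 10.0},
    {"strand": 20.0, "coil": 60.0, "bend": 20.0},
]
silica_clusters = [{"strand": 10.0, "coil": 50.0, "bend": 40.0}]

neat_w = average_coefficients([0.6, 0.4], [0.4, 0.6])
silica_w = average_coefficients([1.0], [1.0])

delta = difference(weighted_content(silica_clusters, silica_w),
                   weighted_content(neat_clusters, neat_w))
print(delta)
```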

The results corroborate the Ramachandran analysis. There is a decrease in β-sheets and coils and a slight decrease in H-bond turns during silicification. In addition, an increase in bends and β-bridges (single pair of β-sheet hydrogen bonds) is observed.

It is important to note that DSSP analysis was developed for the analysis of large globular proteins.79 When exploring shorter peptides with unconventional and diverse structures, it is prudent to interpret these assignments with a grain of salt. One example is the assignment of a small amount of 3₁₀-helical motifs, which is likely based on a minority of very transient species. Nonetheless, it can be concluded that, consistent with the Ramachandran plot, DSSP analysis suggests a shift toward more contracted structural poses when R5 interacts with silica.

4 Discussion

The experimental and simulated spectra showed peaks assigned to random coil structures along with clear traces of defined secondary structure. A more detailed analysis showed transitions from sheet-like extended structures to more contracted turn and bend type motifs. This is also borne out in the experimental data. Anti-parallel β-sheets, for example, typically show a transition near 1630 cm⁻¹, which shifts to higher wavenumbers with a lower number of strands and with twisting.73 This shift is also observed directly in the experimental spectra.

The change in the structure from β-strands to β-turns could be explained by the R5 backbone being more stretched out without silicate, while it bends and contracts around silicate molecules as they are added to the solution (Fig. 5).

Fig. 5 Illustration of the entanglement of R5 with silicate precursors. The peptide gains more turn structures with silicate present. Ostensibly, the interaction bends R5 around the silicate clusters, leading to a more contracted structural pose compared with the solution state structure.

The decrease in β-sheets during silicification could be caused by the silicate molecules disrupting the backbone H-bond network as also suggested by Lutz et al.27

In addition, DSSP analysis showed an increase in turns without H-bonds (bends). Since DSSP only counts H-bonds between backbone atoms,79 silicate possibly binds to the residues that are assigned as turns without H-bonds.

Our findings are in agreement with previous reports. The β-strand folds we identify in our work support previous findings based on solid-state NMR23,25 and SFG studies.22,27 Previous SFG and MD studies of interfacial R522,27 found a decrease in β-strands and an increase in the turn structure content during silicification, which is also seen in our data.

Furthermore, β-sheet formation in neat R5 has also been predicted by others using SFG.22,27 At the same time, the considerable extent of disordered structures we see in the 2D IR data is in agreement with the solution-state NMR results of Senior et al.26 However, in the latter study it is concluded that there are no ordered secondary structures in R5, and that the peptide stays disordered during silicification. Our data provide evidence that, in addition to undefined secondary structures, R5 also folds into ordered and defined secondary structures, and that these structures change during silicification.

5 Summary

In conclusion, by combining experimental and theoretical 2DIR spectra, we found that neat R5 contains a combination of short strands, turns and bends. When R5 interacts with silicate, the more strand-like poses tend to form more bends and turns. This transition may reflect a contraction of R5 around silicate precursors during the silicification process. The data also show that the R5 peptide is partly unstructured in solution, in addition to the defined secondary structures we identified. These findings offer new insights into the fundamental mechanism of R5–silica precipitation, revealing that R5 is not merely disordered. Instead, the peptide exhibits significant structural diversity and exceptional flexibility for interacting with silica.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

A. B. T. thanks Fani Georgieva Madzharova for training and support in using 2DIR, and also for her help with sample preparation. A. B. T. also thanks Adam Chatterley for technical support. A. B. T. additionally thanks Rebekka Klemmt for providing training in using the SEM and thanks Siad Dahir Ali for his help in obtaining SEM images. The Carlsberg Foundation (CF20-0364) and iMAT are acknowledged for funding the TESCAN CLARA SEM. This article is part of a project that has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 819039 F-BioIce). The numerical results presented in this work were obtained at the Centre for Scientific Computing, Aarhus https://phys.au.dk/forskning/faciliteter/cscaa/.

Notes and references

  1. L. A. Estroff, Chem. Rev., 2008, 108, 4329–4331.
  2. C. C. Lechner and C. F. W. Becker, Mar. Drugs, 2015, 13, 5297–5333.
  3. Y. Chen, Y. Feng, J. G. Deveaux, M. A. Masoud, F. S. Chandra, H. Chen, D. Zhang and L. Feng, Minerals, 2019, 9, 68.
  4. Z. Wang, P. Huang, O. Jacobson, Z. Wang, Y. Liu, L. Lin, J. Lin, N. Lu, H. Zhang, R. Tian, G. Niu, G. Liu and X. Chen, ACS Nano, 2016, 10, 3453–3460.
  5. Y. Li, Y. Guo, W. Niu, M. Chen, Y. Xue, J. Ge, P. X. Ma and B. Lei, ACS Appl. Mater. Interfaces, 2018, 10, 17722–17731.
  6. V. Achal, A. Mukherjee, D. Kumari and Q. Zhang, Earth-Sci. Rev., 2015, 148, 1–17.
  7. D. Arias, L. A. Cisternas, C. Miranda and M. Rivas, Front. Bioeng. Biotechnol., 2019, 6, DOI: 10.3389/fbioe.2018.00209.
  8. D. Otzen, Scientifica, 2012, 2012, 867562.
  9. K. Thamatrakoln and M. Hildebrand, J. Nanosci. Nanotechnol., 2005, 5, 158–166.
  10. M. Mishra, A. P. Arukha, T. Bashir, D. Yadav and G. B. K. S. Prasad, Front. Microbiol., 2017, 8, 1239.
  11. P. Tréguer, D. M. Nelson, A. J. Van Bennekom, D. J. DeMaster, A. Leynaert and B. Quéguiner, Science, 1995, 268, 375–379.
  12. R. Gordon and R. W. Drum, in International Review of Cytology, ed. R. Gordon, Academic Press, 1994, vol. 150, pp. 243–372.
  13. C. E. Hamm, R. Merkel, O. Springer, P. Jurkojc, C. Maier, K. Prechtel and V. Smetacek, Nature, 2003, 421, 841–843.
  14. N. Kröger, R. Deutzmann and M. Sumper, Science, 1999, 286, 1129–1132.
  15. M. Sumper and N. Kröger, J. Mater. Chem., 2004, 14, 2059–2065.
  16. D. J. Belton, O. Deschaume and C. C. Perry, FEBS J., 2012, 279, 1710–1720.
  17. H. R. Luckarift, J. C. Spain, R. R. Naik and M. O. Stone, Nat. Biotechnol., 2004, 22, 211–213.
  18. T. Martelli, E. Ravera, A. Louka, L. Cerofolini, M. Hafner, M. Fragai, C. F. Becker and C. Luchinat, Chem. – Eur. J., 2016, 22, 425.
  19. C. C. Lechner and C. F. W. Becker, Bioorg. Med. Chem., 2013, 21, 3533–3541.
  20. D. Reichinger, M. Reithofer, M. Hohagen, M. Drinic, J. Tobias, U. Wiedermann, F. Kleitz, B. Jahn-Schmid and C. F. W. Becker, Pharmaceutics, 2023, 15, 121.
  21. D. M. Eby, K. E. Farrington and G. R. Johnson, Biomacromolecules, 2008, 9, 2487–2494.
  22. S. J. Roeters, R. Mertig, H. Lutz, A. Roehrich, G. Drobny and T. Weidner, J. Phys. Chem. Lett., 2021, 12, 9657–9661.
  23. A. Roehrich, J. Ash, A. Zane, D. L. Masica, J. J. Gray, G. Goobes and G. Drobny, Proteins Interfaces III State Art, 2012, pp. 77–96.
  24. M. R. Knecht and D. W. Wright, Chem. Commun., 2003, 3038–3039.
  25. E. L. Buckle, A. Roehrich, B. Vandermoon and G. P. Drobny, Langmuir, 2017, 33, 10517–10524.
  26. L. Senior, M. P. Crump, C. Williams, P. J. Booth, S. Mann, A. W. Perriman and P. Curnow, J. Mater. Chem. B, 2015, 3, 2607–2614.
  27. H. Lutz, V. Jaeger, L. Schmüser, M. Bonn, J. Pfaendtner and T. Weidner, Angew. Chem., Int. Ed., 2017, 56, 8277–8280.
  28. L. Gascoigne, J. R. Magana, D. L. Atkins, C. C. Sproncken, B. Gumi-Audenis, S. M. Schoenmakers, D. Wakeham, E. J. Wanless and I. K. Voets, J. Colloid Interface Sci., 2021, 598, 206–212.
  29. A. S. Chatterley, P. Laity, C. Holland, T. Weidner, S. Woutersen and G. Giubertoni, Molecules, 2022, 27, 6275.
  30. P. Hamm and M. Zanni, Concepts and Methods of 2D Infrared Spectroscopy, Cambridge University Press, Cambridge, 2011.
  31. S.-H. Shim and M. T. Zanni, Phys. Chem. Chem. Phys., 2009, 11, 748–761.
  32. K. M. Farrell, J. S. Ostrander, A. C. Jones, B. R. Yakami, S. S. Dicke, C. T. Middleton, P. Hamm and M. T. Zanni, Opt. Express, 2020, 28, 33584–33602.
  33. R. Bloem, S. Garrett-Roe, H. Strzalka, P. Hamm and P. Donaldson, Opt. Express, 2010, 18, 27067–27078.
  34. S. Sami, M. F. S. J. Menger, S. Faraji, R. Broer and R. W. A. Havenith, J. Chem. Theory Comput., 2021, 17, 4946–4960.
  35. W. L. Jorgensen, D. S. Maxwell and J. Tirado-Rives, J. Am. Chem. Soc., 1996, 118, 11225–11236.
  36. A. V. Cunha, A. S. Bondarenko and T. L. C. Jansen, J. Chem. Theory Comput., 2016, 12, 3982–3992.
  37. J. R. Schmidt, S. T. Roberts, J. J. Loparo, A. Tokmakoff, M. D. Fayer and J. L. Skinner, Chem. Phys., 2007, 341, 143–157.
  38. M. W. Mahoney and W. L. Jorgensen, J. Chem. Phys., 2001, 114, 363–366.
  39. A. W. Smith, J. Lessing, Z. Ganim, C. S. Peng, A. Tokmakoff, S. Roy, T. L. C. Jansen and J. Knoester, J. Phys. Chem. B, 2010, 114, 10913–10924.
  40. Z. Ganim and A. Tokmakoff, Biophys. J., 2006, 91, 2636–2646.
  41. A. A. Bakulin, D. Cringus, P. A. Pieniazek, J. L. Skinner, T. L. C. Jansen and M. S. Pshenichnikov, J. Phys. Chem. B, 2013, 117, 15545–15558.
  42. E. Lindahl, B. Hess and D. van der Spoel, J. Mol. Model., 2001, 7, 306–317.
  43. B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, J. Chem. Theory Comput., 2008, 4, 435–447.
  44. S. Pronk, S. Páll, R. Schulz, P. Larsson, P. Bjelkmar, R. Apostolov, M. R. Shirts, J. C. Smith, P. M. Kasson, D. van der Spoel, B. Hess and E. Lindahl, Bioinformatics, 2013, 29, 845–854.
  45. E. Lindahl, M. J. Abraham, B. Hess and D. van der Spoel, GROMACS 2019.4 Source code (version 2019.4), Zenodo, 2019.
  46. H. J. C. Berendsen, D. van der Spoel and R. van Drunen, Comput. Phys. Commun., 1995, 91, 43–56.
  47. D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark and H. J. C. Berendsen, J. Comput. Chem., 2005, 26, 1701–1718.
  48. M. J. Abraham, T. Murtola, R. Schulz, S. Páll, J. C. Smith, B. Hess and E. Lindahl, SoftwareX, 2015, 1, 19–25.
  49. S. Páll, M. J. Abraham, C. Kutzner, B. Hess and E. Lindahl, in Solving Software Challenges for Exascale, ed. S. Markidis and E. Laure, Springer International Publishing, Cham, 2015, pp. 3–27.
  50. J. A. Lemkul, Living J. Comput. Mol. Sci., 2018, 1, 33011.
  51. K. E. van Adrichem and T. L. C. Jansen, J. Chem. Theory Comput., 2022, 18, 3089–3098.
  52. T. L. C. Jansen and J. Knoester, J. Phys. Chem. B, 2006, 110, 22910–22916.
  53. T. L. C. Jansen and J. Knoester, Acc. Chem. Res., 2009, 42, 1405–1411.
  54. T. L. C. Jansen, B. M. Auer, M. Yang and J. L. Skinner, J. Chem. Phys., 2010, 132, 224503.
  55. C. Liang and T. L. C. Jansen, J. Chem. Theory Comput., 2012, 8, 1706–1713.
  56. C. Liang, M. Louhivuori, S. J. Marrink, T. L. C. Jansen and J. Knoester, J. Phys. Chem. Lett., 2013, 4, 448–452.
  57. C. D. N. van Hengel, K. E. van Adrichem and T. L. C. Jansen, J. Chem. Phys., 2023, 158, 064106.
  58. M. DeCamp, L. DeFlores, J. McCracken, A. Tokmakoff, K. Kwac and M. Cho, J. Phys. Chem. B, 2005, 109, 11016–11026.
  59. A. S. Bondarenko and T. L. C. Jansen, J. Chem. Phys., 2015, 142, 212437.
  60. Y. El Khoury, G. Le Breton, A. V. Cunha, T. L. C. Jansen, L. J. van Wilderen and J. Bredenbeck, J. Chem. Phys., 2021, 154, 124201.
  61. V. Saxena, R. Steendam and T. L. C. Jansen, J. Chem. Phys., 2022, 156, 055101.
  62. T. Tubiana, J.-C. Carvaillo, Y. Boulard and S. Bressanelli, J. Chem. Inf. Model., 2018, 58, 2178–2182.
  63. P. Virtanen, R. Gommers, T. E. Oliphant, M. Haberland, T. Reddy, D. Cournapeau, E. Burovski, P. Peterson, W. Weckesser, J. Bright, et al., Nat. Methods, 2020, 17, 261–272.
  64. J. H. Ward Jr, J. Am. Stat. Assoc., 1963, 58, 236–244.
  65. J. Peng, W. Wang, Y. Yu, H. Gu and X. Huang, Chin. J. Chem. Phys., 2018, 31, 404–420.
  66. R. Tibshirani, G. Walther and T. Hastie, J. R. Stat. Soc. Series B Stat. Methodol., 2001, 63, 411–423.
  67. L. Wang, C. T. Middleton, M. T. Zanni and J. L. Skinner, J. Phys. Chem. B, 2011, 115, 3713–3724.
  68. H. Torii and M. Tasumi, J. Raman Spectrosc., 1998, 29, 81–86.
  69. T. L. C. Jansen, A. G. Dijkstra, T. M. Watson, J. D. Hirst and J. Knoester, J. Chem. Phys., 2006, 125, 044312.
  70. P. Hamm, M. Lim and R. M. Hochstrasser, J. Phys. Chem. B, 1998, 102, 6123–6138.
  71. T. L. C. Jansen, J. Chem. Phys., 2021, 155, 170901.
  72. S. Roy, T. L. C. Jansen and J. Knoester, Phys. Chem. Chem. Phys., 2010, 12, 9347–9357.
  73. A. Barth, Biochim. Biophys. Acta, Bioenerg., 2007, 1767, 1073–1101.
  74. A. Ghosh, M. J. Tucker and R. M. Hochstrasser, J. Phys. Chem. A, 2011, 115, 9731–9738.
  75. J. F. Kruiger, C. P. van der Vegte and T. L. C. Jansen, J. Chem. Phys., 2015, 142, 054201.
  76. J. Nocedal and S. J. Wright, in Numerical Optimization, Springer, New York, NY, 2006, pp. 135–163.
  77. N. Michaud-Agrawal, E. J. Denning, T. B. Woolf and O. Beckstein, J. Comput. Chem., 2011, 32, 2319–2327.
  78. R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, J. Domański, D. L. Dotson, S. Buchoux, I. M. Kenney and O. Beckstein, in Proceedings of the 15th Python in Science Conference, ed. S. Benthall and S. Rostrup, 2016, pp. 98–105.
  79. W. Kabsch and C. Sander, Biopolymers, 1983, 22, 2577–2637.
  80. R. T. McGibbon, K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L.-P. Wang, T. J. Lane and V. S. Pande, Biophys. J., 2015, 109, 1528–1532.

Footnote

Electronic supplementary information (ESI) available: Simulation details and Ramachandran plots. See DOI: https://doi.org/10.1039/d4cp00970c

This journal is © the Owner Societies 2024