Cesare M.
Baronio
and
Andreas
Barth
*
Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden. E-mail: barth@dbb.su.se
First published on 15th December 2023
Analysis of the amide I band of proteins is probably the most wide-spread application of bioanalytical infrared spectroscopy. Although highly desirable for a more detailed structural interpretation, a quantitative description of this absorption band is still difficult. This work optimized several electrostatic models with the aim to reproduce the effect of the protein environment on the intrinsic wavenumber of a local amide I oscillator. We considered the main secondary structures – α-helices, parallel and antiparallel β-sheets – with a maximum of 21 amide groups. The models were based on the electric potential and/or the electric field component along the CO bond at up to four atoms in an amide group. They were bench-marked by comparison to Hessian matrices reconstructed from density functional theory calculations at the BPW91, 6-31G** level. The performance of the electrostatic models depended on the charge set used to calculate the electric field and potential. Gromos and DSSP charge sets, used in common force fields, were not optimal for the better performing models. A good compromise between performance and the stability of model parameters was achieved by a model that considered the electric field at the positions of the oxygen, nitrogen, and hydrogen atoms of the considered amide group. The model describes also some aspects of the local conformation effect and performs similar on its own as in combination with an explicit implementation of the local conformation effect. It is better than a combination of a local hydrogen bonding model with the local conformation effect. Even though the short-range hydrogen bonding model performs worse, it captures important aspects of the local wavenumber sensitivity to the molecular surroundings. We improved also the description of the coupling between local amide I oscillators by developing an electrostatic model for the dependency of the dipole derivative magnitude on the protein environment.
The theoretical description of the amide I absorption reaches back several decades.5–8 At present, it is able to explain the structural sensitivity in qualitative terms but is still insufficient for a quantitative analysis. Thus, further improvement is desirable in order to extract more structural information from the experiments by computing spectra of structural models. This will be beneficial for the interpretation of absorption spectra as well as of difference spectra of protein reactions, where the interest is to localize and characterize the protein backbone segments that generate the observed conformational changes.
Current approaches9 to model the amide I spectrum of proteins include quantum chemical calculations of small structures10,11 and parameter transfer to larger structures,12–16 as well as Fourier transformation of the dipole autocorrelation function.17,18 However, because quantum chemical calculations are too time-consuming for ordinary proteins, most calculations are based on the floating oscillator mechanism,8,19–22 in which each amide group hosts an amide I oscillator that is characterized by its local wavenumber and its coupling with the oscillators of other amide groups. Coupling results in collective amide I normal modes, to which individual amide I oscillators contribute to varying degrees. Nearest neighbor coupling constants are often taken from quantum chemical calculations23–27 and include mechanical through-bond interactions, whereas interactions between non-nearest neighbor amide I oscillators can be described successfully by transition dipole coupling (TDC)3,4,7,28–30 or transition charge coupling.26,31 Our previous work32 has optimized the parameters for TDC for the main secondary structures – α-helices, parallel and antiparallel β-sheets – by comparison with DFT calculations of the amide I infrared absorption.
If there were no vibrational coupling, each amide group would vibrate independently from the other amide groups with its local wavenumber, which depends on the environment of the amide group.1,2,24,33,34 This environmental influence is typically divided into two components: (i) the contribution from the two closest amide group neighbors within the peptide chain and (ii) the contribution from the rest of the protein and from the water environment. The first influence depends on the local conformation of the protein backbone and is thus termed local conformation effect. It alters the potential energy surface of a specific amide group through a combination of electrostatic interactions and electronic polarization effects.35 In the floating oscillator approach, this influence is usually implemented by applying wavenumber shifts to the local wavenumbers that were obtained from a series of density functional theory (DFT) calculations of diamides with different backbone dihedral angles.24–27 The shifts induced by the preceding and following neighbors of a particular amide group are thought to be additive.
The second influence on the local wavenumber has been modeled differently in calculations of protein spectra: either by parameter transfer from DFT calculations of small structures,36 by local hydrogen bonding models,20,32,37–39 or by electrostatic models that relate the electric field, field gradients and/or the electric potential at the amide group atoms to the local wavenumber.33,35,40–48 Field and potential influence the local wavenumber because they change the electrostatic energy during a molecular vibration – the former couples to electrons that follow the vibrations of the nuclei, whereas the latter couples to those that participate in an interatomic charge flux.49
Most of the electrostatic models were developed from simulations of an amide group that interacts with water molecules. In a different approach, the model was obtained from experimental data.50 To the best of our knowledge, the susceptibility of the local wavenumber on the surrounding protein environment has not yet been modeled from DFT calculations of peptide structures. There is also a need to optimize the set of charges used to describe the spectroscopic effects.21,51 This has been done for a set of diamides and NMA dimers using an electrostatic model developed for the interaction with water (2H2O),35 resulting in a large negative charge on the amide oxygen, a positive charge on the nitrogen and a negative charge on the hydrogen, which differs from common force fields.
Here, we present a combined approach where charges and parameters were optimized for a set of common secondary structures with the aim to describe the effect of the electrostatic environment on the local wavenumber of each amide group. We find that a model considering the electric field at the positions of the O, N, and H atoms is a good compromise between the quality of the description, the number of free parameters, and the robustness of the model. This model is superior to a description by a local hydrogen bonding model.
A further quantity considered in this work is the local dipole derivative, which influences the calculated spectrum in two ways:
(i) It determines the strength of TDC coupling between non-nearest neighbors. Thus, correct dipole derivatives are important for example to calculate the correct splitting between the high and low wavenumber bands of antiparallel β-sheets.
(ii) The square of the dipole derivative is proportional to the dipole strength, which is proportional to the integrated absorbance of an absorption band. Correct dipole derivatives are therefore a prerequisite for the correct calculation of the relative intensities of the normal modes.
It is known that the dipole derivative magnitude is increased by hydrogen bonding by a factor of ∼1.4, corresponding to a change in dipole strength by a factor of ∼2,2,52,53 and it was found to differ by 20% for different amide groups in a β-hairpin.54 Nevertheless it is usually approximated by a constant in protein spectrum calculations. In order to consider the effect of the molecular environment on the dipole derivative magnitude, we have previously presented a model that is based on the strength of hydrogen bonding to the carbonyl group of the considered amide group.32 Here, we present instead an electrostatic model for this purpose and find that it describes the TDC coupling constants better. Both improvements discussed in this work are steps towards a quantitative description of the protein absorption in the amide I range, for which the theory-based parameters eventually need to be recalibrated by comparison to experimental data.
Additionally, we did DFT calculations on di-amides Ac–Ala–NH–CH3 (2 amide groups) with the same secondary structures and trans N-methylacetamide (NMA). The dihedral angles for all structures were: ϕ = −138.6°, φ = 134.5° for the antiparallel β-strands; ϕ = −121.0°, φ = 114.8° for the parallel β-strands; and ϕ = −57.0°, φ = −47.0°, ω = 180.0° for the α-helices.32 Selected structures are shown in Fig. S1 and S2 of the ESI† and pdb files of the structures are contained in a supplementary data set that is available on figshare (https://doi.org/10.17045/sthlmuni.24324886).
The F-matrix was obtained using Hessian matrix reconstruction60 by transforming the wavenumber eigenvalue matrix from the DFT calculations into a reconstructed Hamiltonian that contains the local wavenumbers in the diagonal and the coupling constants in cm−1 outside the diagonal (available for all structures in the supplementary data set at https://doi.org/10.17045/sthlmuni.24324886). The former are termed local DFT wavenumbers ṽDFT. Finally, this matrix was converted into an F-matrix32 using published conversion factors19,61,62 and made symmetrical by averaging the matrix elements that refer to the interaction between the same two amide groups. The result is termed DFT F-matrix.
Φi = Σcj/rij | (1) |
Ei = ΣeCO![]() ![]() | (2) |
Bold print indicates vectors. Φi is the electric potential (unit: Eha0−1), Ei the electric field component parallel to the CO bond (unit: Ehe−1a0−1) at the position of atom i of the amide group of interest, cj the partial charge (unit: elementary charge e) of atom j in an amide group outside the considered amide group, rij the scalar distance (unit: Bohr radius a0) between atoms i and j, eij the unit vector along the direction from atom j to atom i, eCO the unit vector from the C-atom to the O-atom of the amide group of interest and the sums are over all amide atoms j outside the amide group of interest. Ei is positive when the electric field component points from the C to the O atom.
The electrostatic models describe the effect of the electrostatic environment on the local amide I wavenumber ṽ of a particular amide group. This influence is expressed by a shift ΔṽESM (where ESM stands for electrostatic model) of the local amide I wavenumber ṽ from the wavenumber ṽ0 of an isolated amide group.
ṽ = ṽ0 + ΔṽESM | (3) |
The most general model considered the electric field E and electric potential Φ at the locations of all four amide atoms to calculate ΔṽESM
ΔṽESM = Σ![]() ![]() ![]() | (4) |
The other electrostatic models are simplifications of the general 4P4F8 model and consider a subset of atoms and/or either only the fields or the potentials. The number of free parameters in the simpler models is given by the number of atoms at which the electric field and the electric potential are evaluated and is therefore not additionally stated as a subscript. For example, a model which considers the electric field at three atoms is named 3F and contains three independent parameters. The considered atoms are given as subscripts, e.g. 3FONH considers the field at the O, N, and H atom of the amide group for which the wavenumber shift ΔṽESM is calculated.
ΔṽKS = ξO![]() ![]() | (5) |
The parameters ξO and ξH describe the shifts due to hydrogen bonding to the amide oxygen and the amide hydrogen, respectively, and their values are ξO = 2.4 cm−1 mol kJ−1 and ξH = 1.0 cm−1 mol kJ−1.37 Note that the wrong units were stated in our previous publication.39
ΔṽDFT = ṽDFT − ṽNMA | (6) |
Then these shifts were subtracted to obtain the shift difference ΔΔṽ between the DFT and the electrostatic calculations
ΔΔṽ = ΔṽDFT − ΔṽESM | (7) |
Two quality measures for the performance of the considered electrostatic models were calculated: SSDiff is the sum of the squared differences ΔΔṽ for the considered amide groups.
SSDiff = ΣΔΔṽi2 | (8) |
The sum is either over all 105 amide groups in our set of nine structures, or – when mentioned – over the 71 inner amide groups, i.e. those that have an N- and a C-terminal neighbor. SSDiff can give large values even when the variations between the local wavenumbers within a given structure are well reproduced, but when there is a general difference to the DFT shifts for all amide groups in this structure. Such a general difference could for example be caused by an effect of the local conformation which will have a strong impact on the parameters of the model because the effect is present for all amide groups. Also, groups with only a single hydrogen bond are relatively few and might not be modeled well by only minimizing SSDiff, which could result in a poor modeling of the local wavenumber variations. This however, should be avoided because the groups within a secondary structure element are those that couple most strongly due to their closeness and the effect of coupling depends on the variation of their local wavenumbers. Therefore we aimed to increase the importance of the variations in the optimization and to reduce the effect of a general difference to the DFT shifts by introducing the second quality measure: SSDev.
For its calculation (eqn (9)), first an average shift difference ΔΔṽj between DFT and electrostatic shifts was calculated for each of the nine structures j. The average shift difference was calculated either from all amide groups of a given structure, or from only the inner amide groups when SSDev of the inner groups was of interest. Then, the deviation ΔΔṽi − ΔΔṽj of the individual shift differences from the average shift difference for the respective structure was calculated, squared and summed over all amide groups or over the inner groups.
SSDev = Σ(ΔΔṽi − ΔΔṽj)2 | (9) |
To minimize the shift differences and to optimize the parameters, we used the Matlab function lsqnonlin where the SSDiff and SSDev values for each of the nine structures constituted the vector components of the vector-valued function used for minimization, which gives a total of 18 vector components. Most optimizations were done in this way, which is termed optimization method 1. However, we also minimized SSDiff and SSDev directly for a subset of models and charge sets, which is termed optimization method 2. Both optimizations produced similar results, with generally slightly smaller SSDiff and SSDev values for optimization method 2. For the individual large structures however, it also generated the worst RMSDiff and RMSDev values with the most relevant electrostatic models 3FONH (charge set Set-3FONH) and 4P4F8 (charge set Set-4P4F8) as well as the worst RMSDiff with a combination of local conformation effect and local hydrogen bonding model. Anticipating that the large structures represent protein structures better than the smaller models, we focus therefore on optimization method 1.
It turned out that the inclusion of SSDev in the parameter optimization improved its value considerably with only little increase of SSDiff. The parameters of the short-range hydrogen bonding model were optimized in the same way using ΔṽKS instead of ΔṽESM.
We used also previous results on the local conformation effect for the comparison with our diamide data. Choi & Cho,24 (indicated by C in our tables) tabulated local wavenumbers for different secondary structures (their Table 1), which we used without modification although the dihedral angles for the parallel β-sheet differ by 2 degrees from our values.
Model | Number of charge setsa | Best charge set for | Name of charge setb | RMSDiff (cm−1) | RMSDev (cm−1) |
---|---|---|---|---|---|
a For each model, the number of studied charge sets is listed. b The charge sets are specified with a shorthand notation: the partial charges in e are multiplied with 100 and listed in the order C/O/N/H. The minus sign for negative partial charges is omitted and they are indicated by italic type instead. In some cases a good compromise is also listed. The best charge sets of the most relevant models were given names. c LCE refers to parameter optimizations where the local conformation effect was explicitly accounted for (see text). d For some models, two values are stated for RMSDiff and RMSDev: the first value relates to all amide groups, whereas the second in italics relates only to the inner amide groups, i.e. those that have an N-terminal and C-terminal neighbor. e The model named Zero refers to a calculation in which no wavenumber shift was applied, i.e. all amides had the NMA wavenumber and were compared to the DFT wavenumbers. | |||||
4P4F8 | 41 | RMSDev | 40/55/05/10 (set-4P4F8) | 4.39 | 3.57 |
RMSDiff | 50/60/00/10 | 4.35 | 3.68 | ||
4P4F8 and LCEc | 8 | RMSDev and RMSDiff | 35/55/05/15 | 2.44 | 2.19 |
4P4F6 | 41 | RMSDev | 70/90/00/20 | 5.07 | 3.87 |
Compromise | 50/60/10/20 | 4.80 | 3.97 | ||
RMSDiff | 60/60/10/10 | 4.62 | 4.19 | ||
4P | 41 | RMSDev | 40/40/10/10 | 5.23 | 4.50 |
RMSDiff | 42/42/10/10 | 5.23 | 4.50 | ||
4F | 41 | RMSDev | 40/55/00/15 (set-3FONH) | 5.18 | 3.84 |
Compromise | 30/40/00/10 | 5.14 | 3.85 | ||
RMSDiff | 50/65/00/15 | 5.13 | 3.87 | ||
3FONH | 36 | RMSDev | 40/55/00/15 (set-3FONH) | 5.27, 4.61d | 3.88, 3.26d |
RMSDiff | 50/70/00/20 | 5.27, 4.59d | 3.88, 3.24d | ||
Compromise for inner amides | 50/70/10/30 | 5.44, 4.60d | 3.96, 3.15d | ||
3FONH and LCEc | 9 | RMSDev | 35/50/00/15 | 4.68 | 4.31 |
Compromise | 40/55/00/15 (set-3FONH) | 4.68 | 4.31 | ||
RMSDiff | 45/60/00/15 | 4.67 | 4.32 | ||
Zeroe | 38.5, 40.7d | 12.1, 10.2d |
Gorbunov et al.25 (indicated by G in our tables) presented maps of the local wavenumbers for the entire space of the dihedral angles Φ and Ψ using otherwise fully optimized gas phase DFT calculations in the middle panels of their Fig. 8. We identified amide group 1 with the N-terminal group and amide group 2 with the C-terminal group and ignored thus the footnote with reference list entry 40 in the original publication, which states the opposite numbering. Only our numbering provides agreement with our and other published data. The figures were digitized and the wavenumbers for specific backbone conformations extrapolated. Shifts of the local wavenumbers due to the local conformation effect were then obtained by scaling the wavenumbers with a factor of 0.9765 to improve the agreement with experiments and subsequent subtraction of our experimental reference wavenumber of 1707 cm−1 (see section Spectrum calculation).
La Cour Jansen et al.26,27 (indicated by LCJ in our tables) published tabulated shifts of the local wavenumbers, which were used as provided. When the absolute wavenumbers were of interest, the reference wavenumber was added to the shifts, which was the wavenumber of deuterated NMA in the gas phase (1717 cm−1)66 in the original publication.
∂μ/∂q = ∂μ0/∂q + Σgi![]() | (10) |
Before plotting the spectra, the diagonal elements of the DFT F-matrix were reduced to achieve a downshift of the local wavenumbers by 28 cm−1. Such a shift brings our calculated NMA gas phase wavenumber of 1735.0 cm−1 down to the wavenumber of trans-NMA in a N2 matrix (1706–1707 cm−1)67,68 and close to those in H2 (1710 cm−1)69 and Ar (1708 cm−1)68 matrices. We judge the NMA wavenumber in apolar media to be a reasonable starting wavenumber from which to apply the electrostatic shifts. Accordingly, the diagonal elements for the electrostatic or hydrogen bonding models were obtained after adding ṽ0 = 1707 cm−1 to the wavenumber shift derived from the models. The 28 cm−1 shift of the local wavenumbers for the DFT F-matrix and the choice of an unperturbed local wavenumber of 1707 cm−1 aim to bring the spectra in the experimentally observed spectral range. Using instead an unperturbed local wavenumber of 1735 cm−1, the electrostatic models simulate the original DFT calculations.
The intensity of a normal mode was calculated by first adding the dipole derivatives of individual amide oscillators weighted by their amplitude in the normal mode and then squaring the result. The intensity of a normal mode divided by its wavenumber is proportional to its dipole strength. If not mentioned otherwise, we used the dipole derivative properties optimized in our previous work:32 a dipole derivative position 1.043 Å away from the C-atom along the CO bond and 0.513 Å away from the C-atom along the CN bond, a dipole derivative magnitude ∂μ0/∂q of 2.2 D Å−1 u−1/2 for an isolated amide group, a dipole derivative angle of 22° from the CO bond towards the N-atom, and an A parameter of 0.01 to describe hydrogen bonding effects on the dipole derivative magnitude.
The spectra were generated from Lorentzian lines with 16 cm−1 full width at half maximum at the normal mode wavenumbers. The integral of each component band was equal to the intensity of the particular normal mode normalized to the number of amide groups in the structure.
The model parameters were optimized for each charge set and the performance of the models assessed using two quality parameters SSDiff and SSDev and the related RMSDiff and RMSDev as described in Methods. The performance depended on the charge set used to describe the partial charges on the amide atoms outside the considered group. Therefore, several charge sets were tested. The charge set with the smallest RMSDev after parameter optimization was considered to be the best for the respective model. In general this was also the charge set that resulted in the lowest or one of the lowest RMSDiff values. An exception is the 4P4F6 model, for which a compromise charge set is listed that performs relatively well on both quality measures.
For most models, the charge set is scalable, which means that a multiplication of all charges by a particular number can be compensated for by dividing the d and l parameters by the same number. We chose therefore to use charges that were close to the Gromos charges. The only non-scalable model is the 4P4F6 model because its parameters are related to a component of the dipole derivative.
For the most relevant models, the best charge sets are listed in Table 1 together with the quality parameters RMSDiff and RMSDev. The ESI† contains performance information for all tested models in Table S2 (ESI†), describes the performance of selected models for the inner amides, discusses the model parameters, and lists them in Table S3 (ESI†). Further information is contained in the supplementary data set available on figshare (https://doi.org/10.17045/sthlmuni.24324886).
The best electrostatic model was 4P4F8 with eight adjustable parameters, followed by 4P4F6 with six. However, the number of adjustable parameters can be reduced drastically to only three in 3FONH without much loss of performance. The main improvement that the larger 4P4F8 model provides seems to be a better prediction of the average wavenumber shift for each structure – included in RMSDiff – whereas the local wavenumber variations within each structure are relatively less improved – as indicated by the moderate decrease of RMSDev. Inspite of the slightly lower performance, the 3FONH model provides an excellent compromise between performance, number of parameters, and computational effort and we suggest that this model should be used for calculations of the amide I spectrum. A further simplification of the description in the 2F models reduced the performance. The best 2F models considered the electric fields on the O and N atoms.
All models performed better than a reference calculation without applying local wavenumber shifts. In this calculation all amide groups had the NMA wavenumber, which was compared to the local DFT wavenumbers for the calculation of RMSDiff and RMSDev. These results are listed in Table 1 under the model name Zero.
Experiments with dipeptides at different pH values and thus with different electrostatic environments of the amide group established a strong correlation of the amide I wavenumber with the electric field component along the CO bond at the oxygen, followed by a good correlation for that field component at the carbon.72 The latter might seem surprising in the view of our results, where the 4F model and the 3F models that include the carbon atom result in small dC parameters (Table S3, ESI†) and a model not considering carbon – 3FONH – performs nearly equally well as the 4F model. This indicates little importance of the electric field on this atom for modeling the local amide I wavenumber. However, this does not need to contradict the correlation found by Reppert & Tokmakoff: the mentioned 4F and 3F models contained also the electric field on the oxygen and the small dC parameter may therefore simply reflect an only small benefit of including additionally also the field on the carbon in the model. That the field on the oxygen is more important for modeling is shown by the better performance of the 2FON model as compared to the 2FCN model (Table S2, ESI†) and agrees with the better correlation for the field on the oxygen found by Reppert and Tokmakoff.
Many models performed well for RMSDiff and RMSDev with the same charge set. These models are 4P4F8, 4P, 4F, 3FONH, 3FCON, and 2FOH. In contrast, 4P4F6, 3FCOH and most of the 2F models did not perform well on both quality parameters with the same charge set although both were minimized together in the optimization. Therefore these models need different charge sets to describe a secondary structure's average local wavenumber and its variation over the individual amide groups.
In the best charge sets, the charge on the oxygen is negative, that on the carbon and hydrogen positive and that on the nitrogen ranges from slightly negative to slightly positive. The CO group is negative for most models and the NH group positive. A few models have different charge requirements and perform best or relatively well with neutral CO and NH groups: 4P, 3FCOH, 2FOH, and 2FCN, of which 4P has the best performance. Further charge requirements for particular models are discussed in the ESI.† The results clearly demonstrate that the best charge set depends on the electrostatic model.
For the 4P model, a charge optimization has been carried out previously with parameters derived from NMA–water systems in order to model the local wavenumber shift of structures with two NMA molecules. The optimized charge set was 49/84/53/1835 which surprisingly has negative charge on the hydrogen atom. In our hands however, the best charge sets for the 4P model are electroneutral on the CO group and on the NH group and have relatively little charge on N and H. Moving positive charge from any amide atom to N decreased the performance of the model, but we never tested a charge set with negative H charge.
DSSP64 and Gromos73 charge sets do not satisfy the preference of most models for charged CO and NH groups. Instead they meet the requirement of the 4P model for electroneutral CO and NH groups. Nevertheless, they did not perform well for this model either because of their relatively large charges on the N and H atom. Other well-known charge sets, like those of Amber and Charmm, were not tested here because they have a net charge on the amide group. They have relatively large partial charges on the N (negative) and H (positive) atoms, which deteriorates the performance of the 3FONH and the more comprehensive models. In line with a previous suggestion,21,51,72 we conclude therefore that charge sets from common force fields are not necessarily well-suited to describe electrostatic effects on the amide I wavenumber of individual amide groups.
![]() | ||
Fig. 1 Local wavenumber shifts of individual amide groups calculated by DFT (golden bars) and the 4P4F8 (blue bars) and the 3FONH model (red bars) using Optimization method 1 (Table 1). Top: Parallel β-strand and β-sheets (P), middle: antiparallel β-strand and β-sheets (A), bottom: helices (H). The horizontal axis displays the number of the amide groups, where 1 indicates the N-terminal amide group. Small, medium, and large structures are arranged from left to right. A simple indication of the hydrogen bonding type is shown by gray bars (0, −10, −20, and −30 cm−1 shift for hydrogen bonds to neither H nor O, to H, to O, and to H and O, respectively). |
The top and middle panels of Fig. 1 show the local wavenumber shifts for the β-strands and sheets. We focus first on the local DFT wavenumber shifts (golden bars) of the isolated strands on the left hand side of the top and middle panel, where the N- and C-terminal amide groups have less downshifted local wavenumbers than the inner amides. This indicates that end effects are important for the local wavenumber shifts and holds also for the larger sheets, where in particular the C-terminal amides have higher wavenumbers than inner amide groups with the same type of hydrogen bonding. The end effect can be explained by an interaction between neighboring amide groups in a β-strand, which are arranged such that their dipole–dipole interaction is attractive.60 The consequence of such an arrangement is that each neighbor of an amide group generates an electric field component that points from the oxygen to the carbon atom and thus moves the two atoms further apart. This elongates and weakens the CO bond which lowers the amide I wavenumber. The wavenumber decrease is stronger for the inner amide groups than for the terminal groups because the former are subjected to the added fields of two neighboring groups.60
In addition, the local DFT wavenumber shifts for the larger β-sheets are affected by hydrogen bonding, which is well-known from experiments2,75 and calculations:1,2,68,74 hydrogen bonding leads to downshifts of the amide I wavenumber in the range of several tenths of cm−1 with a stronger effect for hydrogen bonding to the amide oxygen than that to the hydrogen. This generates an alternating pattern of local wavenumber shifts for the two-stranded sheets, observed in Fig. 1, depending on whether the amide group is hydrogen bonded at the oxygen or the hydrogen atom. The same behavior is observed for the outer strands of the four-stranded sheets, whereas their two inner strands show a more uniform local wavenumber shift. These observations are in line with previous quantum chemical calculations of β-sheets.76 Two more observations are of interest: (i) an amide group with a single hydrogen bond to the oxygen can have an equally downshifted local wavenumber as one that has hydrogen bonds to both the oxygen and the hydrogen atoms, and (ii) the local wavenumber shifts do not seem to depend on the number of strands in a sheet as the shifts of the outer strand amides of the four-stranded sheet are comparable to those of the two-stranded sheets.
The local DFT wavenumber shifts of β-strands and β-sheets are well reproduced by the electrostatic models (blue and red bars in Fig. 1), including the alternating hydrogen bonding pattern of the two-stranded sheets and of the outer strands of the four-stranded sheets. The shifts for the antiparallel sheets are somewhat underestimated (not negative enough), while those of the parallel sheets are overestimated (too negative) by both models.
The bottom panel of Fig. 1 shows the results obtained for the three helix structures and we focus again first on the local DFT wavenumber shifts (golden bars). The small helix with three amide groups (bottom, left in Fig. 1) exhibits the highest wavenumber for the middle amide group – higher than the wavenumber of NMA. In contrast to β-sheets, the interaction between neighboring amide dipoles in a helix is repulsive.60 Here, an electric field component of neighboring amides points from the carbon to the oxygen of the considered amide group, which shortens the CO bond and increases the vibrational frequency. The wavenumber upshift is less for the terminal helix amide groups than for the inner amides because the former experience only the field of one neighbor. This results in a lower wavenumber of the terminal amides than of their nearest sequence neighbors, which can be seen in all helix structures. However, the effect is very small for the C-terminal group of the large helix.
The N-terminal amide exhibits the strongest downshift in all structures. This has been noted before29,77,78 and explained by particularly strong hydrogen bonding of its carbonyl group.78 In our helix structures, the N-terminal carbonyl (amide group 1) forms a hydrogen bond with the NH of amide group 4 (O⋯H distance 2.2 Å), which corresponds to the ordinary hydrogen bonding in α-helices between residues i and i + 4. Additionally, it forms a non-linear hydrogen bond with amide group 3 (O⋯H distance 2.4 Å). The respective distance for other amide carbonyls is considerably larger (2.9 Å for amide groups 2 and 3, 2.8 Å for the remaining groups). This additional hydrogen bond is found for all our helix structures, even for the smallest one with three amide groups and explains the low wavenumber of the N-terminal amide group (see Fig. S1d, ESI†).
The local DFT wavenumber shifts in the center of the larger helices show that hydrogen bonding to the oxygen (amides 2 and 3 at the N-terminus) causes stronger downshifts than hydrogen bonding to the hydrogen (last three amides at the C-terminus). Groups in the helix center with hydrogen bonds to both the oxygen and hydrogen atoms have lower wavenumbers than those with hydrogen bonds only to the oxygen. This contrasts with the β-sheets where such groups often have similar wavenumbers. Interestingly, there seems to be a helix length dependency on the local wavenumbers with longer helices having lower local wavenumbers. This can be explained by the interaction between hydrogen bonded amide groups being attractive (in contrast to that between sequence neighbors) leading to a wavenumber downshift. Lower local wavenumbers for longer helices are in line with the low wavenumber of the infrared absorption band of long helices that were experimentally observed for poly-L-lysine, poly-L-glutamic acid, and tropomyosin in D2O79,80 and H2O.81
Both electrostatic models (blue and red bars in Fig. 1) reproduce the DFT trends well, in particular the large downshift of the first amide group in the intermediate and large structures, the increase of the downshift from residue 2 to the center of these helices and its decrease toward the C-terminus, as well as the larger downshifts in the center of the long helix than in the center of the intermediate helix. Both electrostatic models calculate local wavenumber shifts that are slightly too negative for the central amide groups of the large helix.
In order to single out the local conformation effect, we subtracted the wavenumber of NMA (1735.0 cm−1) from the local DFT wavenumbers of our dipeptides. The obtained local wavenumber shifts are listed in Table 2 and compared to literature values in Table S5 (ESI†). We tested then how well these shifts are described by our two main electrostatic models (Table 2). The shifts predicted from the electrostatic models have the same sign as the DFT shifts, except for the small negative value of the N-terminal group in helical conformation, which has the wrong sign with the 3FONH model. They also predict correctly, which of the two groups has the larger shift in each structure. Therefore they capture aspects of the local conformation effect, which seems to be partly due to electrostatic interactions. This has been recognized before using a 4P model.65
Structure | Group | Shift in cm−1![]() |
|||
---|---|---|---|---|---|
DFT | 3FONH (set-3FONH)b | 4P4F8 (set-4P4F8)b | 4P4F8 (40/50/00/10)b | ||
a The wavenumber shifts were calculated by subtracting the NMA wavenumber (1735.0 cm−1) from the local wavenumbers of the diamides. b Charge set for the calculation. c N-terminal amide group. The shift of an N-terminal amide group describes how the local wavenumber of a particular amide group is influenced by the amide group that follows in the sequence. d C-terminal amide group. The shift of the C-terminal group reflects the effect of the preceding group. e The sum of the two shifts is the expected local wavenumber shift of an inner amide group in a protein, i.e. one that has a preceding and a following amide group. Discrepancies between the individual values for the N- and C-terminal amide group and their sums are due to rounding. f ABS: antiparallel β-sheet. g PBS: parallel β-sheet. h RMSDev is the root mean square deviation between the DFT shifts and the shifts obtained with the electrostatic models. RMSDev (N and C) refers to the shift deviations for the N- and the C-terminal amides. i RMSDev(Sum) refers to the deviations of the summed N- and C-terminal shifts. | |||||
Helix | Nc | −2.0 | 2.1 | −0.7 | −1.9 |
Cd | 10.6 | 10.9 | 12.8 | 10.4 | |
Sume | 8.7 | 13.0 | 12.1 | 8.5 | |
ABSf | Nc | −17.1 | −22.9 | −25.7 | −22.3 |
Cd | −13.8 | −6.9 | −6.9 | −7.1 | |
Sume | −30.8 | −29.9 | −32.6 | −29.4 | |
PBSg | Nc | −14.4 | −23.7 | −24.5 | −21.4 |
Cd | −11.5 | −7.2 | −6.8 | −6.9 | |
Sume | −26.0 | −30.8 | −31.3 | −28.3 | |
RMSDev (N and C)h | 5.8 | 6.5 | 4.9 | ||
RMSDev (Sum)i | 3.8 | 3.8 | 1.6 |
The shifts from the electrostatic models differ from the DFT values with root mean square deviations of 5.8 cm−1 for the 3FONH with the best charge set Set-FONH and of 6.5 cm−1 for the 4P4F8 model with its charge set Set-4P4F8. Since the shift of the N-terminal group is due to the influence of the following C-terminal group and vice versa, the sum of the two shifts is the expected shift for an inner amide group that has both a preceding and a following amide group. The sums are better predicted by our models than the individual shifts with root mean square deviations of 3.8 cm−1 for both the 3FONH and the 4P4F8 model with their standard charge sets.
We explored a few other charge sets with the 4P4F8 model and found that the agreement with the DFT shifts can be improved to a root mean square deviation of 4.9 cm−1 for the shifts of the individual groups. The found charge set is particularly good in reproducing the shifts for the helical peptide and the summed shifts for all structures. This limited exploration might indicate that the best electrostatic model and its optimum charge set for describing the local conformation effect might be different from that for hydrogen bonding and long-range electrostatic effects.
The performance of both models improved regarding RMSDiff for all charge sets tested (3FONH: 5.3 to 4.7 cm−1, 4P4F8:
4.4 to 2.4 cm−1), which is better than a previous combination of a 4P model with the local conformation effect (RMSDiff: 5.9 cm−1).65 Regarding RMSDev, the 3FONH performance deteriorated for the better charge sets, but the 4P4F8 model improved considerably for all tested charge sets. However, this improvement of the 4P4F8 model came with a drastic change of its parameters when compared to the optimization without local conformation compensation as further discussed in the ESI.† The parameter changes result in large and opposite wavenumber shifts of the potential and field terms in a typical hydrogen bonding situation, which raises doubts regarding the robustness of this model (see ESI†). In contrast to the 4P4F8 model, the parameters of the 3FONH model change much less and do not change sign when the local conformation effect is explicitly considered. Thus this model seems to be less sensitive to different implementations of the amide I band calculations.
Fig. S3 in ESI† shows the local wavenumber shifts calculated for the 3FONH model with (light blue bars) and without (blue bars) explicit consideration of the local conformation effect. When the local conformation effect is considered, the agreement with the DFT shifts improves for nearly all inner amide groups of all structures, but not for the N-terminal helix groups and the C-terminal groups of the small helix and of the medium and large sheets. The improvement is moderate and the “pure” 3FONH model without an explicit consideration of the local conformation effect is still a good option. This avoids the need to choose a level of theory for calculating the local conformation map and to choose an appropriate reference wavenumber to calculate the shift induced by the local conformation effect.
![]() | ||
Fig. 2 Spectra calculated from the local wavenumbers only. Top: Parallel β-sheets, middle: antiparallel β-sheets, bottom: helices. The local wavenumbers were obtained from the DFT F-matrix (DFT), from combining the local conformation effect from our diamide calculations with a local hydrogen bonding model based on the Kabsch–Sander energy (LCE + KS), either with standard parameters (std) or with optimized parameters (opt) according to Table S6 (ESI†), or from the electrostatic models 4P4F8 (with Set-4P4F8), 4P4F6 (with Set 70/90/00/20), 4F (with Set-3FONH), and 3FONH (with Set-3FONH) according to Table 1 and Table S3 (ESI†). Parameter optimization was done with Optimization method 1. The gray vertical lines indicate band positions in the spectrum obtained with the DFT F-matrix. |
The electrostatic models of this work perform better and produce similar spectral shapes. The deviations between them and the DFT spectra are largest for the small and medium helix spectra where the shapes are best calculated with the 4P4F8 model (dark blue lines), which nevertheless does not reproduce the spectral position of the component bands in the small helix spectrum. The spectra in Fig. 2 reproduce the deviations between the local wavenumbers calculated by DFT and by the 3F and 4P4F8 models (Fig. 1): for the medium and large parallel β-sheets as well as for the large α-helix, the shifts for most amide groups calculated by the electrostatic models (red and blue bars in Fig. 1) are too large when compared with the DFT shifts (golden bars in Fig. 1) and thus the spectra are downshifted too much (red and blue spectra in Fig. 2) with respect to those calculated with the DFT local wavenumbers (golden spectra in Fig. 2). The opposite is the case for the antiparallel β-sheet.
Fig. 3 reports “complete” spectra calculated from an F-matrix with the diagonal elements from DFT calculations or from the different models and the non-diagonal elements from the DFT F-matrix. Note that we do not use the original DFT spectra for the comparison because their normal mode intensities are calculated differently from our normal mode analysis. Thus they would not serve well to compare and test our calculations.
Generally, the models rank in the same way as discussed above for the local wavenumbers. However, it is interesting to observe that the electrostatic models perform more similar and that the simple 3FONH model reproduces the spectral shape at least as good as the 4P4F8 model (compare e.g. the small helix spectra). From a practical point of view therefore, there seems to be little benefit from using the larger 4P4F8 model.
In contrast to Fig. 2, the spectra of Fig. 3 include the effects of vibrational coupling between amide groups. For all spectra in Fig. 3 these are described by the non-diagonal elements of the DFT F-matrix. Comparing the spectra in Fig. 2 without vibrational coupling with those in Fig. 3 with coupling, it becomes clear that the β-sheet spectra have different shapes, whereas the α-helix spectra are similar. The peak positions of the large β-sheets are strongly downshifted by coupling, whereas only slight upshifts are seen for the small and medium helix. Thus, coupling has a strong effect on the β-sheet spectra, but not on the α-helix spectra.7,82–84 Therefore, an accurate calculation of local wavenumbers is particularly important for α-helical structures.
![]() | ||
Fig. 3 Spectra calculated from complete F-matrices using local wavenumbers calculated by the models specified in Fig. 2. Top: Parallel β-sheets, middle: antiparallel β-sheets, bottom: helices. The diagonal elements were either from the DFT F-matrix or from the different models. The non-diagonal elements were always from the DFT F-matrix. The gray vertical lines indicate band positions in the spectrum obtained with the DFT F-matrix. |
The charge set has only a modest influence on the shape of the spectra when an electrostatic model is used with optimized parameters for each charge set. But deviations from the spectra calculated with the DFT F-matrix can be seen when the quality parameters RMSDiff and RMSDev are significantly worse than those of the better charge sets. This is shown in Fig. S5 and S6 (ESI†) for the 3FONH and the 4P4F8 model, respectively.
∂μ/∂q = 2.1 − 30.8 (EO + EC) | (11) |
Reference | Position along COabc (Å) | Position along CNabc (Å) | ∂μ0/∂q (∂μmax/∂q)ad (D Å−1 u−1/2) | Angleac (degrees) | R (10−4 mdyn2 Å−2 u−2) |
---|---|---|---|---|---|
a The parameters were optimized using intermediate and large structural models. b The position is given by the distances from the C-atom along the specified bonds. c Angle and position of the dipole derivative are equal to those of the transition dipole moment. d ∂μ0/∂q is the dipole derivative magnitude in the absence of an electric field or of hydrogen bonding and ∂μmax/∂q (value in brackets) is the maximum dipole derivative calculated for the ensemble of structural models. | |||||
DD(FCO), this work | 1.085 | 0.608 | 2.10 (3.04) | 22 | 5.9 |
Baronio & Barth 202032 | 1.043 | 0.513 | 2.20 (2.51) | 22 | 7.5 |
In summary, we propose a new model to describe the dependency of the dipole derivative magnitude on the electrostatic environment. This model uses the average field at the C and O atoms and requires only a single field parameter. It indicates that the dipole derivative magnitude is largely determined by the CO group, in line with our previous finding.32
Fig. 4 shows spectra calculated with these models. The golden spectra were calculated with the DFT F-matrix elements and the black spectra with diagonal elements from the 3FONH model and the DFT non-diagonal elements. These spectra are also contained in Fig. 3 and are shown here for comparison. The blue and the brown spectra were calculated using 3FONH for diagonal elements, DFT for nearest neighbor elements, and TDC for all other non-diagonal elements. The dipole derivative magnitude was calculated according to the previous32 (blue lines) and the present model (brown lines) with the parameters listed in Table 3. It can be seen that the spectral shape is very similar in all cases. In particular, the spectra obtained with the same local wavenumbers (black, blue, and brown lines) show that both models for the dipole derivative magnitude generate spectra that are equivalent by visual inspection to those from the DFT non-diagonal elements (black lines), even though the new model describes the non-diagonal F-matrix elements better.
![]() | ||
Fig. 4 Spectra calculated from complete F-matrices. Top: Parallel β-sheets, middle: antiparallel β-sheets, bottom: helices. Gold (DFT-DFT): all F-matrix elements were from the reconstructed DFT F-matrix. Black (3FONH-DFT): diagonal elements from the 3FONH model and non-diagonal elements from the DFT F-matrix. Blue and brown: diagonal elements from the 3FONH model, nearest neighbor elements from the DFT F-matrix, all other elements from TDC calculations, either using an existing model for the dipole derivative magnitude32 (blue, 3FONH-TDCKS) or the present DD(FCO) model (brown, 3FONH-TDCDD(Fco)). The gray vertical lines indicate band positions in the spectrum obtained with the DFT F-matrix. |
The models were constructed by comparison to DFT calculations. The process involved optimization of both the model parameters and the charge set used for the electrostatic calculations. Even with fine-tuned parameters for each charge set, the charge set had a considerable impact on model performance and charge sets from common force fields were shown to underperform or are expected to do so. This highlights the need for developing specialized charge sets that are optimized for the spectroscopic property of interest as suggested earlier.21,51,72
The best compromise between accuracy and stability of the inherent parameters was found to be a model that considers the electric field on the three amide atoms O, N, and H (3FONH). Since it describes both the electrostatic effects and (partly) the local conformation effects, its parameters reflect a compromise between these two influences. Future work may disentangle these by using specialized models for each of them.
We developed also a simple electrostatic model for the dipole derivative magnitude. It is based on the average field at the C and O atoms, improves the description of the TDC coupling constants, and agrees better with literature values than our previous model.32
As the models were optimized by comparison to DFT calculations, their performance is tied to that of the DFT calculations. It can be anticipated that the model parameters eventually need to be fine-tuned for a best match with experimental spectra.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp02018e |
This journal is © the Owner Societies 2024 |