Open Access Article
This Open Access Article is licensed under a
Creative Commons Attribution 3.0 Unported Licence

Determining isoleucine side-chain rotamer-sampling in proteins from 13C chemical shift

Lucas Siemons (a), Boran Uluca-Yazgi (b, c), Ruth B. Pritchard (a), Stephen McCarthy (d), Henrike Heise (b, c) and D. Flemming Hansen* (a)
(a) Institute of Structural and Molecular Biology, Division of Biosciences, University College London, London, UK WC1E 6BT. E-mail:
(b) Institut für Physikalische Biologie, Heinrich-Heine-Universität Düsseldorf, Düsseldorf, Germany
(c) Institute of Complex Systems, ICS-6: Structural Biochemistry and JuStruct: Jülich Center for Structural Biology, Forschungszentrum Jülich, Jülich, Germany
(d) Department of Chemistry, University College London, 20, Gordon Street, London, WC1H 0AJ, UK

Received 21st August 2019, Accepted 14th October 2019

First published on 15th October 2019

Chemical shifts are often the only nuclear magnetic resonance parameter that can be obtained for challenging macromolecular systems. Here we present a framework to derive the conformational sampling of isoleucine side chains from 13C chemical shifts and demonstrate that side-chain conformations in a low-populated folding intermediate can be determined.

Protein side chains can sample several conformations, which is important for many biological processes. Side-chain motions are often de-correlated from the backbone,1–3 making it important to be able to characterise them specifically. The structure and conformational sampling of side chains are often derived from a combination of several NMR measurements, such as nuclear Overhauser effects (NOEs), three-bond scalar couplings,4,5 residual dipolar couplings and spin-relaxation measurements.6 Although these measurements are feasible for most proteins smaller than 20 kDa, an accurate description of side-chain behaviour in larger systems often becomes challenging. Moreover, characterising the conformational sampling of side chains in low-populated states2,7 is still more difficult, because neither NOEs nor scalar couplings can currently be obtained. Relating chemical shifts, the most easily accessible NMR parameter, to structure and motions provides an attractive alternative to characterise side chains in many challenging systems.8–10

Below we show a framework for relating 13C chemical shifts to the conformational sampling of side chains, with a focus on the isoleucine side chain. This side chain is composed of sp3-hybridised carbon atoms, allowing both side-chain dihedral angles, χ1 (N–Cα–Cβ–Cγ1) and χ2 (Cα–Cβ–Cγ1–Cδ1), to sample the three canonical states {∼60°, ∼180°, ∼300°}, referred to as gauche+ (gp), trans (t) and gauche− (gm), respectively, Fig. 1a. Using a combination of density functional theory (DFT) calculations and a comprehensive set of experimental long-range scalar coupling constants for model proteins, chemical shift profiles are created for the most populated side-chain rotameric states. These chemical shift profiles, in turn, are used to provide a near complete description of the side-chain rotamer distribution, i.e. to determine the populations of the {χ1, χ2} rotameric states from experimental chemical shifts alone. The readily available nature of chemical shifts greatly extends the range of systems for which side-chain rotamer distributions can be obtained. To demonstrate this, the framework is applied to characterise isoleucine side chains in an ‘invisible’, low-populated protein folding intermediate.11
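The naming of the canonical states can be illustrated with a minimal sketch that assigns a {χ1, χ2} pair to the nearest canonical rotamer. The function names and the nearest-centre rule are illustrative assumptions and are not taken from the software accompanying this work.

```python
# Illustrative sketch: map side-chain dihedral angles to the canonical
# state labels used in the text (gp ~60 deg, t ~180 deg, gm ~300 deg).
# Function names and the nearest-centre rule are assumptions.

CANONICAL_CENTRES = {"gp": 60.0, "t": 180.0, "gm": 300.0}


def circular_distance(a_deg, b_deg):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def canonical_state(angle_deg):
    """Return the label of the canonical state closest to angle_deg."""
    angle = angle_deg % 360.0  # accepts angles given as -180..180 or 0..360
    return min(
        CANONICAL_CENTRES,
        key=lambda name: circular_distance(angle, CANONICAL_CENTRES[name]),
    )


def rotamer_label(chi1_deg, chi2_deg):
    """Build a label such as '{gm,t}' from chi1 and chi2 in degrees."""
    return "{" + canonical_state(chi1_deg) + "," + canonical_state(chi2_deg) + "}"
```

For example, `rotamer_label(-65.0, 175.0)` yields the {gm,t} label, since −65° is equivalent to 295°.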

Fig. 1 (a) The isoleucine side chain is shown with the dihedral angles and side-chain carbons labelled. (b)–(d) Theoretical chemical shift surfaces for 13Cα, 13Cγ2 and 13Cδ1, respectively, in a β-sheet backbone conformation. The shielding constants were calculated using the GIAO method with the B3LYP functional and the EPR-III basis set and referenced as described in the ESI. The white contours indicate the regions that together comprise 90% of the total populations observed in high-resolution crystal structures.12

The potential energy surfaces derived from the DFT calculations show that the rotameric states {gp,gm}, {t,gm}, and {gp,gp} are all populated to less than 1% and {gm,gp} is only populated around 3%, which is in agreement with an analysis of a large set of high-resolution crystal structures,12,13 Table S1 and Fig. S1 (ESI). The top five states, in decreasing order of overall population: {gm,t}, {gm,gm}, {gp,t}, {t,t}, {t,gp}, account for >97%, while the top four states account for >95%. Below, mainly the top four states will be considered.
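As a rough illustration of how relative energies translate into populations, the sketch below applies discrete Boltzmann weighting to one energy per rotameric state. This is a simplification under stated assumptions: the temperature, the use of a single energy per state rather than an integral over the full {χ1, χ2} surface, and the function name are all hypothetical, and the exact procedure used to obtain the populations is described in the ESI.

```python
import math

# Gas constant in kJ mol^-1 K^-1.
R_KJ_PER_MOL_K = 8.314462618e-3


def boltzmann_populations(energies_kj_mol, temperature_k=298.0):
    """Convert relative energies (kJ/mol) per state into populations.

    energies_kj_mol: dict mapping a state label to its energy.
    temperature_k:   assumed temperature (hypothetical default).
    """
    rt = R_KJ_PER_MOL_K * temperature_k
    # Shift by the minimum energy for numerical stability.
    e_min = min(energies_kj_mol.values())
    weights = {
        state: math.exp(-(energy - e_min) / rt)
        for state, energy in energies_kj_mol.items()
    }
    partition = sum(weights.values())
    return {state: w / partition for state, w in weights.items()}
```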

Chemical shift (δ) surfaces vs. {χ1, χ2} were obtained from DFT calculations (ESI) on a model peptide representing the isoleucine side chain, Fig. 1a. These chemical shift surfaces, Fig. 1b–d and Fig. S2 (ESI), show a strong dependence of the 13C chemical shift on the side-chain conformation, with shifts frequently varying by up to 5 ppm between the four allowed rotameric states. For 13Cδ1 and 13Cγ2 the change in chemical shift between states approximately follows a γ-gauche effect,9,14 whereas the surfaces for 13Cα, 13Cβ and 13Cγ1 are more complex. For example, the surface of δ(13Cβ) vs. {χ1, χ2} shows a 4 ppm change between the rotameric states even though the dihedral angles between 13Cβ and its γ-substituents do not change. 13Cα and 13Cβ chemical shifts are frequently used for backbone secondary structure determination.15,16 It is interesting to note that the 13Cα surface shows a variation of about 5 ppm between the commonly populated states, {gm,t} and {gm,gm}, similar to the difference in chemical shift between α-helical and β-sheet conformations.

The five aliphatic 13C chemical shifts for isoleucine, δ = {δCα, δCβ, δCγ1, δCγ2, δCδ1}, can be calculated if the populations, p, of the most populated rotameric states are known. For four states, p = {pmt, pmm, ppt, ptt},

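$$\boldsymbol{\delta}_{\mathrm{calc}}(\mathbf{p}) = \left(p_{\alpha}\mathbf{D}_{\alpha} + (1 - p_{\alpha})\mathbf{D}_{\beta}\right)\mathbf{p} \tag{1}$$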
where pα is the probability of an α-helix backbone conformation and Dα and Dβ are matrices representing the five 13C chemical shifts in each rotameric state for α-helix and β-sheet conformations, respectively. Eqn (1) holds here because the exchanges between the rotameric states are in the fast-exchange regime9,10 and observed chemical shifts represent a population-weighted average. Equally, the populations of the rotameric states, p, can be determined from experimentally observed 13C chemical shifts when the secondary structure is known, since (1) the number of significantly populated states is less than the number of 13C chemical shifts available and (2) the chemical shifts of the rotameric states are linearly independent, meaning that the square matrices (Dα)ᵀDα and (Dβ)ᵀDβ are non-singular. Thus, the populations, p, can be determined by minimising the target function,
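$$\chi^{2}(\mathbf{p}) = \left\| (\boldsymbol{\delta}_{\mathrm{calc}} - \boldsymbol{\delta}_{\mathrm{obs}}) \circ \mathbf{W} \right\|_{2}^{2} = \left\| (\mathbf{D}_{\mathrm{calc}}\mathbf{p} - \boldsymbol{\delta}_{\mathrm{obs}}) \circ \mathbf{W} \right\|_{2}^{2} = \left\| \left(-(\boldsymbol{\sigma}_{\mathrm{calc}} - \boldsymbol{\sigma}_{\mathrm{ref}})\mathbf{p} - \boldsymbol{\delta}_{\mathrm{obs}}\right) \circ \mathbf{W} \right\|_{2}^{2} \quad \text{subject to } \|\mathbf{p}\|_{1} = 1 \text{ and } 0 \le p_{i} \le 1 \tag{2}$$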
where Dcalc = pαDα + (1 − pα)Dβ; σcalc = −Dcalc + σref = pασα + (1 − pα)σβ is a matrix with the isotropic shielding constants for the five 13C chemical shifts in the rotameric states and σref is the reference shielding constant. Finally, W is a vector with the weights of the five 13C chemical shifts, {1/2.7 ppm, 1/2.0 ppm, 1/1.7 ppm, 1/1.3 ppm, 1/1.7 ppm} determined from the standard deviation of previously assigned 13C chemical shifts in the BMRB database. The DFT calculations, Fig. 1, provide the shielding constants, σcalc, for the aliphatic 13C in each rotameric state and in the two backbone conformations, α-helix and β-sheet (ESI).
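Eqn (1) and (2) can be sketched in a few lines of standard-library Python. The sketch below is an assumption-laden illustration and not the released software: the solver (projected gradient descent with a Euclidean projection onto the probability simplex) is a swapped-in technique chosen for self-containment, all function names are hypothetical, and the chemical shift matrices Dα and Dβ must be supplied by the user (they are given in the ESI and are not reproduced here). Only the weights are taken from the text.

```python
# Weights from the text, in 1/ppm, ordered Calpha, Cbeta, Cgamma1, Cgamma2, Cdelta1.
WEIGHTS = [1 / 2.7, 1 / 2.0, 1 / 1.7, 1 / 1.3, 1 / 1.7]


def mix_backbone(d_alpha, d_beta, p_alpha):
    """D_calc = p_alpha * D_alpha + (1 - p_alpha) * D_beta (rows: nuclei, columns: states)."""
    return [
        [p_alpha * a + (1.0 - p_alpha) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(d_alpha, d_beta)
    ]


def predict_shifts(d_calc, populations):
    """Eqn (1): population-weighted chemical shift for each nucleus."""
    return [sum(d * p for d, p in zip(row, populations)) for row in d_calc]


def project_to_simplex(v):
    """Euclidean projection of v onto {p : sum(p) = 1, p_i >= 0}."""
    cumulative, theta = 0.0, 0.0
    for j, u_j in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u_j
        candidate = (cumulative - 1.0) / j
        if u_j - candidate > 0.0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def fit_populations(d_calc, delta_obs, weights=WEIGHTS, n_iter=5000):
    """Minimise eqn (2) by projected gradient descent (illustrative solver choice)."""
    # Because sum(p) = 1, subtracting a per-nucleus constant from both the
    # shift matrix and the observed shift leaves the residual unchanged and
    # improves the conditioning of the problem.
    a, y = [], []
    for row, obs, w in zip(d_calc, delta_obs, weights):
        mean = sum(row) / len(row)
        a.append([w * (d - mean) for d in row])
        y.append(w * (obs - mean))
    n_states = len(d_calc[0])
    # 2 * squared Frobenius norm bounds the Lipschitz constant of the gradient.
    frob2 = sum(x * x for row in a for x in row)
    step = 1.0 / (2.0 * frob2) if frob2 > 0.0 else 0.0
    p = [1.0 / n_states] * n_states
    for _ in range(n_iter):
        resid = [sum(x * q for x, q in zip(row, p)) - t for row, t in zip(a, y)]
        grad = [
            2.0 * sum(a[i][j] * resid[i] for i in range(len(a)))
            for j in range(n_states)
        ]
        p = project_to_simplex([q - step * g for q, g in zip(p, grad)])
    return p
```

Given user-supplied matrices, `fit_populations(mix_backbone(d_alpha, d_beta, p_alpha), delta_obs)` returns the populations in the same state order as the matrix columns. A general-purpose constrained least-squares routine would serve equally well; the gradient-based solver is used here only to avoid external dependencies.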

Calculated chemical shifts, δcalc, are related to the shielding constants by δcalc = −(σcalc − σref), and an accurate determination of the reference shielding constants is therefore essential. The reference shielding constants were initially estimated by requiring that random coil populations derived from a large set of crystal structures12 yield random-coil chemical shifts. In a subsequent optimisation, nuclei-specific reference shielding constants were obtained using a comprehensive set of experimentally derived long-range scalar coupling constants, Tables S2–S4 (ESI). Specifically, long-range 3JCγ1–N, 3JCγ1–CO, 3JCγ2–N, 3JCγ2–CO, 3JCγ2–Cδ1 and 3JCα–Cδ1 couplings were measured experimentally for 17 isoleucine residues in two model proteins, T4 lysozyme (T4L) and ubiquitin. The nuclei-specific reference shielding constants, σref,i, i = {Cα, Cβ, Cγ1, Cγ2, Cδ1}, were optimised by minimising the RMSD between experimentally measured scalar couplings and scalar couplings back-calculated from chemical-shift-derived populations, eqn (2), using standard Karplus curves,17 Fig. S3a (ESI). In these optimisations the backbone conformations from crystal structures were used. The correlation between experimental 3J-coupling constants and 3J-coupling constants calculated using the optimised constants, Fig. 2a, strongly indicates that rotamer distributions, Tables S5 and S6 (ESI), can be determined from 13C chemical shifts. The obtained rotamer distributions also agree well with those present in crystal structures, Tables S5 and S6 (ESI), although a substantially broader distribution of populated states is observed in solution from 13C chemical shifts and 3J-couplings.
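The two relations used in this step, the conversion from shielding constant to chemical shift and the back-calculation of a population-averaged 3J coupling, can be sketched as follows. The Karplus form J(θ) = A cos²θ + B cos θ + C is the standard one, but the coefficients must be supplied by the user: no values from ref. 17 are reproduced here, and the function names are hypothetical. The dihedral angle passed for each state is assumed to be the angle relevant to the specific coupling in that rotamer.

```python
import math


def shift_from_shielding(sigma_calc, sigma_ref):
    """delta_calc = -(sigma_calc - sigma_ref), both in ppm."""
    return -(sigma_calc - sigma_ref)


def karplus(theta_deg, a, b, c):
    """Standard Karplus relation; a, b, c in Hz are user-supplied coefficients."""
    cos_theta = math.cos(math.radians(theta_deg))
    return a * cos_theta ** 2 + b * cos_theta + c


def averaged_coupling(populations, dihedrals_deg, a, b, c):
    """Population-weighted 3J coupling over rotameric states (fast exchange).

    populations:   population of each state, summing to 1.
    dihedrals_deg: dihedral angle governing this coupling in each state.
    """
    return sum(
        p * karplus(theta, a, b, c)
        for p, theta in zip(populations, dihedrals_deg)
    )
```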

Fig. 2 Comparison of long-range experimentally measured 3J scalar coupling constants with those derived from 13C chemical shifts. (a) Couplings from the two proteins T4L and ubiquitin used in the optimisation of σref,i. (b) Couplings from the four other proteins: GB3, C-SH2 PLC-γ, HIV protease and protein L published previously. The black line is y = x and the grey lines are y = x ± RMSD.

To assess the accuracy of the method, rotamer populations were determined from 13C chemical shifts for four additional proteins, HIV protease,18 GB3,18 C-SH2 PLC-γ1,19 and protein L,20 for which long-range scalar couplings have been obtained previously. An RMSD of 0.43 Hz was obtained for this cross-validation set when the back-calculated 3J-couplings were compared to published experimental 3J-couplings; Tables S7–S10 (ESI). This corresponds to an RMSD in populations of the rotameric states between 0.16 and 0.19, Fig. S4 and S5 (ESI). A cross-validation akin to Fig. 2, but including the {t,gp} state and thus using five rotameric states, gives an RMSD between measured and back-calculated 3J-couplings of 0.43 Hz. From our data it is therefore not statistically justified (p-value = 0.55) to include the additional {t,gp} state in the analysis. While the RMSDs of the obtained populations provide an upper bound for the error in determining rotamer populations from chemical shifts, it is expected that the true standard error is smaller due to uncertainties associated with the Karplus parametrisations (RMSD ≈ 0.04 Hz; Fig. S6, ESI) and the assumption that 3J couplings depend only on the intervening dihedral.16,17
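The agreement metric quoted above is a root-mean-square deviation between paired values. A minimal sketch is given below; the function name is hypothetical, and the same function applies equally to couplings (in Hz) and to populations.

```python
import math


def rmsd(calculated, observed):
    """Root-mean-square deviation between two equally long sequences."""
    if len(calculated) != len(observed) or len(calculated) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((c - o) ** 2 for c, o in zip(calculated, observed))
    return math.sqrt(squared / len(calculated))
```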

Until this point only folded proteins were considered and the ϕ, ψ angles of available structures were used to select the set of DFT calculations, Dα or Dβ, used for calculating the rotamer populations from 13C chemical shifts. The probability of α-helix, pα, can alternatively be predicted from backbone chemical shifts using TALOS-N.16 Doing so gives an RMSD of 0.42 Hz, similar to the values in Fig. 2 (Tables S5–S10, ESI). This means that if an accurate backbone structure is not available, but a backbone chemical shift assignment is, then the backbone conformations can be derived from TALOS-N and still provide a robust description of the side-chain conformations in ca. 95% of cases.

To explore the intrinsic conformational preference of isoleucine side chains, the rotamer distributions were calculated from experimental chemical shifts and scalar couplings for two peptides, Ace-Ile-NMe (AIN) and Gly-Ile-Gly (GIG). For these ‘random coil representations’ the backbone was assumed to be 50% β-sheet and 50% α-helix; changing this by up to 10% affected the derived populations, p, by less than 0.02. As shown in Fig. 3a the chemical-shift-derived 3J scalar couplings agree very well with experimental scalar couplings (RMSD = 0.22 Hz). The AIN and GIG peptides show very similar rotamer distributions, and a broad similarity between the distributions determined from scalar couplings and from chemical shifts is also seen, Table S11 (ESI). In these random coil models {gp,t}, {gm,gm}, and {gm,t} are the three major populated states, in agreement with deposited crystal structures. However, the distributions are substantially different from the statistical potential derived from the high-resolution crystal structures.12 The populations derived here are ppt ≈ 42%, pmm ≈ 35%, and pmt ≈ 22%, whereas ppt ≈ 14%, pmm ≈ 17%, and pmt ≈ 57% are obtained from the statistical potential.

Fig. 3 (a) Comparison of long-range 3J scalar coupling constants derived from 13C chemical shifts and experimentally measured couplings for the AIN and GIG peptides. (b) A DNP-enhanced solid-state double-quantum-single-quantum (DQSQ) NMR spectrum of the AIN peptide in frozen solution (100 K). See also Fig. S3 (ESI).

To explore these random coil models further, a dynamic nuclear polarization (DNP) enhanced solid-state NMR 13C–13C DQSQ spectrum was recorded on a frozen sample of AIN at 100 K. Under these conditions, exchanges between the side-chain rotameric states are so slow that separate NMR signals are observed for each of the major states, Fig. 3b and Fig. S7 (ESI).21 Calculating the peak positions using the chemical shift profiles also readily identifies {gm,t}, {gm,gm} and {gp,t} as the major states. Moreover, the populations of the three major states determined from peak volumes in the DQSQ spectrum agree well with those determined in solution from 13C chemical shifts, in particular a substantial population of {gp,t}. Importantly, as demonstrated both with solution-state measurements at room temperature and solid-state NMR measurements at 100 K, the random coil sampling of AIN and GIG is significantly different from that obtained from the average of all isoleucine side chains in a large set of high-resolution crystal structures.12 In particular, the population of {gp,t} found here is substantially larger than what is predicted from the statistical potential derived from crystal structures.
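Converting peak volumes into populations amounts to a normalisation, sketched below. The sketch assumes that every rotameric state is detected with the same efficiency, so that volumes are directly proportional to populations; this assumption and the function name are illustrative and are not stated in the text.

```python
def populations_from_volumes(volumes):
    """Normalise peak volumes per state into fractional populations.

    volumes: dict mapping a state label to its integrated peak volume.
    Assumes equal detection efficiency for all states (an assumption).
    """
    if any(v < 0 for v in volumes.values()):
        raise ValueError("peak volumes must be non-negative")
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total peak volume must be positive")
    return {state: v / total for state, v in volumes.items()}
```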

Backbone chemical shifts frequently play an important role in structure determination, providing information on the backbone dihedral angles, yet side-chain chemical shifts are rarely used for structural characterisations. The excellent agreement between back-calculated long-range scalar couplings and those measured experimentally, Fig. 2 and 3, suggests that side-chain chemical shifts can also be readily used in protein structure determination protocols. Structural characterisations of low-populated and excited states have emerged over the last decade2,22 and these characterisations largely hinge on chemical-shift-derived constraints. Recently, side-chain 13C chemical shifts11 have become available for low-populated states, which allows one to obtain rotamer distributions for side chains in these states using the approach described above. One example is the L24A FF domain, which exchanges between a folded ground state and a protein folding intermediate with an exchange rate of 130 s−1 (ref. 11). Side-chain chemical shifts were recently obtained for the folding intermediate using CEST NMR experiments.11 Chemical-shift-derived side-chain rotamer populations for the isoleucine residues were obtained using backbone conformations derived from TALOS-N, Fig. 4. In the ground state I43 predominantly samples {gp,t} with 60 ± 6% and {t,t} with 26 ± 3%, while I44 predominantly samples {gm,t} with 68 ± 10% and {gm,gm} with 23 ± 2%. This agrees with TALOS-N, where the I43 χ1 is predicted to be in a gp conformation and the I44 χ1 is predicted to be in a gm conformation. In the folding intermediate both I43 and I44 adopt a much broader distribution with ppt = 24 ± 6%, pmm = 33 ± 1%, and pmt = 42 ± 6%, similar to the random coil distribution.

Fig. 4 (a) Exchange between the ground state of the L24A FF domain and a folding intermediate. Tyr49, which changes substantially between the two structures, is shown in magenta and the isoleucine residues are shown in blue. (b) and (c) The rotamer distributions for Ile43 and Ile44 in each of the two states.

In conclusion, a close relationship between aliphatic 13C chemical shifts and the conformational sampling of the isoleucine side chain was shown. Using this relationship, side-chain rotamer distributions can be determined directly from the side-chain 13C chemical shifts. The method presented here allows for a determination of full rotameric states represented by both χ angles as opposed to projections along each χ angle independently, as is the case for previous methods4,5 based on 3J long-range scalar couplings. Since chemical shifts can be obtained from a wide variety of sources, such as relaxation dispersion experiments, chemical exchange saturation transfer and solid state NMR, this method greatly increases the situations where side-chain conformational samplings can be obtained. Although the method for determining side-chain conformations from chemical shifts here is shown for isoleucine, it is anticipated that similar approaches will allow for characterisations of other side chains in proteins.

We thank Dr Micha B. A. Kunze for helpful discussions. This work was supported by the Francis Crick Institute through provision of access to the MRC Biomedical NMR Centre. The Francis Crick Institute receives its core funding from Cancer Research UK (FC001029), the UK Medical Research Council (FC001029), and the Wellcome Trust (FC001029). The Jülich-Düsseldorf Biomolecular NMR centre is acknowledged for access to high-field NMR spectrometers. L. S. and R. P. acknowledge the Wellcome Trust for PhD studentships (102404/Z/13/Z and 109160/Z/15/Z). This research was supported by the DFG (HE3243/4-1), The Wellcome Trust (101569/z/13/z), the BBSRC (BB/R000255/1), and the Leverhulme Trust (RPG-2016-268). Software to determine isoleucine side-chain rotamer-sampling from 13C chemical shifts is available from

Conflicts of interest

There are no conflicts to declare.

Notes and references

  1. K. M. Anderson, A. Esadze, M. Manoharan, R. Brüschweiler, D. G. Gorenstein and J. Iwahara, J. Am. Chem. Soc., 2013, 135, 3613–3619.
  2. G. Bouvignies, P. Vallurupalli, D. F. Hansen, B. E. Correia, O. Lange, A. Bah, R. M. Vernon, F. W. Dahlquist, D. Baker and L. E. Kay, Nature, 2011, 477, 111–117.
  3. C. Zeymer, N. D. Werbeck, S. Zimmermann, J. Reinstein and D. F. Hansen, Angew. Chem., Int. Ed., 2016, 55, 11533–11537.
  4. A. Bax, D. Max and D. Zax, J. Am. Chem. Soc., 1992, 114, 6923–6925.
  5. S. Grzesiek, G. W. Vuister and A. Bax, J. Biomol. NMR, 1993, 3, 487–493.
  6. S. F. Cousin, P. Kadeřávek, N. Bolik-Coulon, Y. Gu, C. Charlier, L. Carlier, L. Bruschweiler-Li, T. Marquardsen, J.-M. Tyburn, R. Brüschweiler and F. Ferrage, J. Am. Chem. Soc., 2018, 140, 13456–13465.
  7. D. F. Hansen, P. Vallurupalli and L. E. Kay, J. Am. Chem. Soc., 2009, 131, 12745–12754.
  8. R. E. London, B. D. Wingad and G. A. Mueller, J. Am. Chem. Soc., 2008, 130, 11097–11105.
  9. D. F. Hansen, P. Neudecker and L. E. Kay, J. Am. Chem. Soc., 2010, 132, 7589–7591.
  10. D. F. Hansen and L. E. Kay, J. Am. Chem. Soc., 2011, 133, 8272–8281.
  11. G. Bouvignies, P. Vallurupalli and L. E. Kay, J. Mol. Biol., 2014, 426, 763–774.
  12. V. B. Chen, W. B. Arendall, J. J. Headd, D. A. Keedy, R. M. Immormino, G. J. Kapral, L. W. Murray, J. S. Richardson and D. C. Richardson, Acta Crystallogr., Sect. D: Biol. Crystallogr., 2010, 66, 12–21.
  13. S. C. Lovell, J. M. Word, J. S. Richardson and D. C. Richardson, Proteins, 2000, 40, 389–408.
  14. A. E. Tonelli and F. C. Schilling, Macromolecules, 1981, 14, 74–76.
  15. D. S. Wishart and B. D. Sykes, J. Biomol. NMR, 1994, 4, 171–180.
  16. Y. Shen and A. Bax, J. Biomol. NMR, 2013, 56, 227–241.
  17. J. M. Schmidt, J. Biomol. NMR, 2007, 37, 287–301.
  18. J. J. Chou, D. A. Case and A. Bax, J. Am. Chem. Soc., 2003, 125, 8959–8966.
  19. L. E. Kay, D. R. Muhandiram, N. A. Farrow, Y. Aubin and J. D. Forman-Kay, Biochemistry, 1996, 35, 361–368.
  20. O. Millet, A. Mittermaier, D. Baker and L. E. Kay, J. Mol. Biol., 2003, 329, 551–563.
  21. A. König, D. Schölzel, B. Uluca, T. Viennet, Ü. Akbey and H. Heise, Solid State Nucl. Magn. Reson., 2019, 98, 1–11.
  22. A. J. Baldwin and L. E. Kay, Nat. Chem. Biol., 2009, 5, 808–814.


Electronic supplementary information (ESI) available. See DOI: 10.1039/c9cc06496f

This journal is © The Royal Society of Chemistry 2019