Determining isoleucine side-chain rotamer-sampling in proteins from 13C chemical shift

A framework is presented to derive the conformational sampling of isoleucine side chains from nuclear magnetic resonance 13C chemical shifts.

Fmoc deprotection steps were performed by agitating the resin in a solution of 40% piperidine in DMF for 3 min, followed by agitation in 20% piperidine in DMF for 10 min. The resin was then washed (×6) with DMF. Coupling reagents were preactivated by dissolving the Fmoc-amino acid (4 eq.), HBTU (4 eq.), and DIPEA (8 eq.) in 1.5 mL DMF prior to addition to the resin. The suspension was agitated for 40 min before the resin was filtered and washed (×4) with DMF. Coupling steps were performed twice to ensure completion of the reaction. Labelling was achieved using Fmoc-[13C,15N]-Ile-OH (Sigma-Aldrich); labelled atoms are marked with an asterisk. Following the last Fmoc deprotection, the resin was washed with DMF (×6), DCM (×4), MeOH (×4) and Et2O (×4) and dried overnight in vacuo. Cleavage from the resin was carried out by the addition of 1.5 mL TFA/TIPS/H2O (95:2.5:2.5) for 45 min, followed by addition of 1.5 mL fresh cleavage solution for a further 45 min. Eluents were combined and the TFA was removed under a stream of air to leave a pale residue. The residue was dissolved in 6 mL water and lyophilised to a pale yellow solid. The crude product was dissolved in 2 mL 10% acetic acid and washed three times with 2 mL chloroform. The aqueous layer was isolated and lyophilised to produce 1 as a white solid. No further purification was performed.
Analytical HPLC: The peptide was analysed using a Reprosil Gold 200 C8 250 x 4.6 mm (5 μm particle size) column (Dr. Maisch GmbH) attached to an Agilent 1260 Infinity HPLC. The method was a 5-75% gradient of Buffer B in Buffer A over 60 minutes at 1 mL min⁻¹.
To synthesise 2, L-[13C6,15N1]-isoleucine (3) (98 mg, 0.56 mmol, Sigma-Aldrich) was added to methanol (2.5 mL, 62 mmol). The solution was cooled to 4 °C and acetyl chloride (0.8 mL, 10 mmol) was added dropwise. The reaction was allowed to warm to room temperature. When the reaction was complete (determined by NMR, approximately 1 week), it was concentrated under reduced pressure. The excess methanol was removed from the colourless gum by mixing it with diethyl ether and re-concentrating it twice to give 4 as a white residue.
The residue was suspended in dichloromethane (2.5 mL) and cooled to 4 °C. Triethylamine (0.3 mL, 2 mmol) and acetyl chloride (48 μL, 0.675 mmol) were then added to the mixture. After 30 minutes the reaction was diluted with a 10% ammonium chloride solution (5 mL) and stirred vigorously. The biphasic mixture was diluted with DCM (5 mL) and separated using a hydrophobic frit. The organic phase was concentrated under reduced pressure to give 5 as a yellow oil, which crystallised overnight. Then 33% methylamine in ethanol (20 mL) was added and the mixture was stirred at room temperature for 48 hours. The mixture was concentrated under reduced pressure to give crude 2 as a white solid. The crude product was purified by flash chromatography (Biotage 10 g Ultra, dry load) using a rapid gradient of 0-100% ethyl acetate in isohexane followed by a 0-20% gradient of methanol in ethyl acetate. The fractions were analysed by thin layer chromatography (~10% methanol in ethyl acetate, visualised using a ninhydrin dip). The appropriate fractions were collected and concentrated to give 2 as a white solid.

Measuring three-bond scalar coupling constants
Spin-echo difference constant-time (CT) 13C-1H HSQC experiments were used to obtain the 3J(13Cγ1,2,CO) and 3J(13Cγ1,2,15N) coupling constants 7,8. The constant-time delay was set to T = 28 ms and two spectra were recorded: a reference spectrum giving the intensity Iref, and a spectrum in which the coupling of interest evolves, giving the intensity Icoup. The coupling constant is calculated from Iref and Icoup.
The HN(CO)C experiment 9 provides an alternative way to obtain 3J(13Cγ1,2,CO) scalar coupling constants by utilising the resolution of the 15N-1H correlation spectrum. In the HN(CO)C experiment, transverse 13CO magnetisation is evolved for a time T, during which the long-range couplings evolve; this spectrum gives the cross-peak intensities Icoup. A reference spectrum, giving Iref, is recorded by evolving the 13CO magnetisation for a time T' = T - 0.5/1J(13CO,13Cα), such that magnetisation is transferred to the directly bonded 13Cα. The long-range coupling constants are determined from the intensity in the reference spectrum, Iref, and the intensities of the cross-peaks in the coupling spectrum, Icoup. The relation between them involves the atoms Cp corresponding to the observed cross-peaks, where the index p runs over all 13C nuclei for which a cross-peak is observed, and R2,CO, the transverse relaxation rate of 13CO. The coupling delay was set to T = 56 ms.
Three-bond scalar coupling constants involving aliphatic 13C were measured using the pulse scheme published previously by Bax and co-workers 10, with the constant-time delay for coupling evolution set to T = 58 ms. In these spectra, 3J(Cα,Cδ1) couplings were calculated from the ratio of the intensities of the diagonal peak, Iref = (13Cδ1, 13Cδ1, 1Hδ1), and the cross peak, Icoup = (13Cδ1, 13Cα, 1Hδ1). Couplings between 13Cδ1 and 13Cγ2 were calculated in a similar manner.

Calculating populations from three-bond scalar coupling constants
The populations of the isoleucine side-chain rotameric states were determined by a constrained least-squares fit. The target function to be minimised, χ2, was defined as
$$\chi^2(\mathbf{p}) = \sum_i \left( \frac{{}^3J_{i,\mathrm{exp}} - {}^3J_{i,\mathrm{calc}}(\mathbf{p})}{\varepsilon_i} \right)^2$$
where p is a vector containing the unknown populations to be determined, ε is the error in the experimental J-coupling, and the sum is over the available experimental coupling constants, typically a subset of 3J(13CO,13Cγ1), 3J(13CO,13Cγ2), 3J(15N,13Cγ1), 3J(15N,13Cγ2), 3J(13Cα,13Cδ1), and 3J(13Cγ2,13Cδ1). The calculated scalar couplings, 3Ji,calc, were derived from previously published Karplus parametrisations 11,12 and the populations of the states, p. The angular dependence of 3J(13Cγ2,13Cδ1) was assumed to be the same as that of 3J(13Cα,13Cδ1), with values of 3.7 Hz in a trans conformation and 1.5 Hz in a gauche conformation 12.
The error related to the Karplus parameterisation was estimated by determining the populations, p, using another Karplus parameterisation 13 and comparing the two sets of populations, Fig. S6.
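The constrained fit described above can be illustrated with a minimal sketch. It assumes a simplified model that is not taken from the text: Cδ1 occupies one of three ideal staggered positions (anti to Cα, anti to Cγ2, or anti to Hβ), and only the two couplings 3J(13Cα,13Cδ1) and 3J(13Cγ2,13Cδ1) are fitted, using the 3.7 Hz (trans) and 1.5 Hz (gauche) limiting values quoted above. The function names, the synthetic input couplings and the exhaustive grid search over the population simplex (used here in place of a dedicated constrained solver) are all illustrative choices.

```python
from itertools import product

# Limiting couplings (Hz) for each staggered position of Cd1, in the order
# (anti to Ca, anti to Cg2, anti to Hb). The 3.7 Hz / 1.5 Hz values are the
# ones quoted in the text; the three-state model itself is an assumption.
J_STATE = {
    "Ca-Cd1": (3.7, 1.5, 1.5),
    "Cg2-Cd1": (1.5, 3.7, 1.5),
}


def chi2(pops, j_exp, eps):
    """Sum over couplings of ((J_exp - J_calc(p)) / eps)^2."""
    total = 0.0
    for name, j_obs in j_exp.items():
        # Population-weighted average of the limiting couplings.
        j_calc = sum(p * j for p, j in zip(pops, J_STATE[name]))
        total += ((j_obs - j_calc) / eps[name]) ** 2
    return total


def fit_populations(j_exp, eps, steps=100):
    """Grid search over populations with p >= 0 and sum(p) == 1."""
    best_score, best_pops = None, None
    for a, b in product(range(steps + 1), repeat=2):
        if a + b > steps:
            continue  # would make the third population negative
        pops = (a / steps, b / steps, (steps - a - b) / steps)
        score = chi2(pops, j_exp, eps)
        if best_score is None or score < best_score:
            best_score, best_pops = score, pops
    return best_pops, best_score


# Synthetic couplings (Hz) generated from populations (0.6, 0.3, 0.1);
# these are not experimental values.
j_exp = {"Ca-Cd1": 2.82, "Cg2-Cd1": 2.16}
eps = {"Ca-Cd1": 0.1, "Cg2-Cd1": 0.1}
pops, score = fit_populations(j_exp, eps)
print(pops, score)
```

With two couplings and the normalisation constraint, the three populations in this toy model are uniquely determined, so the search recovers the populations used to generate the synthetic couplings. With more states than independent observables the minimum would no longer be unique.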

Solid-State NMR experiments
DNP-enhanced double-quantum single-quantum (DQSQ) correlation spectra of the Gly-[13C6,15N1]Ile-Gly peptide (0.5 mg) and Ace-[13C6,15N1]Ile-NMe (0.25 mg) were recorded using SPC-5 recoupling for excitation and reconversion of double-quantum coherence 14. The peptides were dissolved in 34 ml d8-glycerol/D2O/H2O (60:30:10 volume ratio) and 2.5 mM AMUPol 15 in a final buffer containing 15 mM NaCl and 10 mM sodium phosphate. The samples were packed into 3.2 mm sapphire rotors with zirconia caps, and experiments were carried out on a Bruker Avance III 800 MHz spectrometer connected to a 527 GHz gyrotron providing a continuous source of microwaves for DNP enhancement. The spectra were recorded at a temperature of 100 K and a magic angle spinning rate of 8.2 kHz. The number of t1 increments was 50 for AIN and 158 for GIG, corresponding to maximum evolution times of 1.5 ms and 4.5 ms, respectively. 1H decoupling using SPINAL64 16 with a decoupling field of 100 kHz was employed during the evolution and detection periods.
The DQSQ spectra show slanted peak shapes, Fig 3b, and the peaks were therefore modelled with a shape given by a product of two functions rotated by the angle θ in the (ω1, ω2) plane and centred at the peak position. The peak shape, S, was thus modelled by a function built from normalised Gaussian and Lorentzian (ℒ) one-dimensional line shapes and scaled by the intensity, a. Peak volumes were used to assign the relative population of each of the rotameric states, Figs 3 and S7.

Density Functional Theory Calculations
Density Functional Theory (DFT) calculations were carried out on the AIN construct shown in Fig. 1a using Gaussian09 (g09) 17. Initially, a free structure optimisation was carried out in vacuum using the B3LYP functional and the 6-31G* basis set 18,19. The dielectric constant in vacuum is 1, whereas it has been estimated to be around 6-7 inside proteins and is 80 in water at room temperature. The hydrophobic isoleucine side chain is most often observed within the hydrophobic core of proteins. The DFT calculations were therefore carried out in vacuum, as opposed to implicit water, because the dielectric constant in vacuum is closer to that of a protein environment than the dielectric constant in water is.
Following the initial optimisation, three backbone conformations (φ,ψ) were selected based on the maxima of the population density in the isoleucine Ramachandran plot of the 'top8000' and 'top500' data sets 1,20. The α-helical region of the plot showed a single tight distribution at (φα, ψα) = (-61.5°, -43.9°), while the β-sheet region showed a more extended distribution, and two points were therefore selected to represent this backbone conformation: (φβ1, ψβ1) = (-110.6°, 127.6°) and (φβ2, ψβ2) = (-127.0°, 127.6°); these two conformations are referred to as β1 and β2. A full side-chain {χ1,χ2} grid with 5° resolution was made for each of the three backbone conformations. A second structure optimisation was carried out for each rotamer with φ, ψ, χ1, and χ2 constrained to the selected values on the grid. Finally, the chemical shieldings were calculated for each optimised structure using a gauge-independent atomic orbital (GIAO) approach, as implemented in g09, and the EPR-III basis set 21. Population distributions over the {χ1,χ2} grid were obtained for each of the three backbone conformations, using the DFT energies (Figs 1 and S2), as follows
$$P^{\mu}_{\chi_1,\chi_2} = \frac{1}{Z^{\mu}} \exp\left(-\frac{E^{\mu}_{\chi_1,\chi_2}}{RT}\right), \qquad Z^{\mu} = \sum_{\chi_1,\chi_2} \exp\left(-\frac{E^{\mu}_{\chi_1,\chi_2}}{RT}\right) \qquad \text{(S10)}$$
where E^μ_{χ1,χ2} is the DFT energy at the grid-point {χ1, χ2} for the secondary structure μ, R is the gas constant and T is the absolute temperature. A temperature of 300 K was used here. It should be noted that varying the temperature T over a range from 100 K to 310 K only affects the calculated chemical shift by 0.1 ppm.
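The Boltzmann weighting of the grid energies described above can be sketched as follows. This is a minimal illustration, assuming the DFT energies have already been converted to kJ mol⁻¹; the function name, the dictionary layout keyed by (χ1, χ2) and the three example energies are hypothetical and are not values from the calculations.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_populations(energies, temperature=300.0):
    """Return P = exp(-E/RT) / Z for a dict {(chi1, chi2): E in kJ/mol}."""
    # Subtracting the minimum energy leaves the populations unchanged
    # (it cancels in the ratio) but avoids overflow in exp().
    e_min = min(energies.values())
    weights = {
        point: math.exp(-(e - e_min) / (R_KJ * temperature))
        for point, e in energies.items()
    }
    z = sum(weights.values())  # partition function over the grid
    return {point: w / z for point, w in weights.items()}


# Hypothetical energies (kJ/mol) at three grid points, for illustration only.
grid = {(-60, 170): 0.0, (-60, -60): 2.0, (180, 170): 5.0}
populations = boltzmann_populations(grid)
for point, p in populations.items():
    print(point, round(p, 3))
```

In practice, grid points where the DFT calculation did not converge would simply be left out of the dictionary, which excludes them from the partition function.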

Determining Reference Shielding
The DFT calculations yield the isotropic chemical shielding value, σ, for each atom in each conformation, {φ, ψ, χ1, χ2}. Atom-specific random-coil chemical shifts, δRC,i, i ∈ {13Cα, 13Cβ, 13Cγ1, 13Cγ2, 13Cδ1}, were obtained experimentally from random-coil models (see below) and used to obtain an initial value for the reference shielding, σref,i, and thus to convert the shielding values to observed chemical shifts (Eq. S11). The atom-specific random-coil shielding constant, σRC,i, was obtained from the DFT calculations using the rotamer populations from the top8000 database, with the three backbone conformations weighted by Pμ = {Pα, Pβ1, Pβ2} = {0.5, 0.25, 0.25}. The {χ1, χ2}-dependent term for backbone conformation μ is calculated as described in Eq. S1, and the rotamer populations are obtained from the top8000 database as described above.
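A minimal sketch of the referencing step, assuming the conventional relation between shift and shielding, δ = σref − σ, so that the initial reference follows from the random-coil values as σref = δRC + σRC. This sign convention, the function names and all numerical values are assumptions made for illustration; only the backbone weights {0.5, 0.25, 0.25} are taken from the text.

```python
# Backbone weights quoted in the text for the alpha, beta1 and beta2 regions.
BACKBONE_WEIGHTS = {"alpha": 0.5, "beta1": 0.25, "beta2": 0.25}


def random_coil_shielding(sigma_by_backbone, weights=BACKBONE_WEIGHTS):
    """Weighted average of per-backbone (rotamer-averaged) shieldings."""
    return sum(weights[mu] * sigma_by_backbone[mu] for mu in weights)


def initial_reference(delta_rc, sigma_rc):
    """Reference shielding that reproduces the random-coil shift (assumed convention)."""
    return delta_rc + sigma_rc


def shift_from_shielding(sigma, sigma_ref):
    """Chemical shift from a shielding, delta = sigma_ref - sigma (assumed convention)."""
    return sigma_ref - sigma


# Hypothetical rotamer-averaged shieldings (ppm) for one carbon, per backbone.
sigma_mu = {"alpha": 170.0, "beta1": 172.0, "beta2": 171.0}
sigma_rc = random_coil_shielding(sigma_mu)
sigma_ref = initial_reference(13.4, sigma_rc)  # 13.4 ppm is a placeholder shift
delta = shift_from_shielding(172.0, sigma_ref)
print(sigma_rc, sigma_ref, delta)
```

By construction, feeding σRC back through the conversion returns the random-coil shift, which is a quick consistency check on any initial reference obtained this way.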
A subsequent optimisation of the isotropic reference shielding, σref,i, was performed by minimising the χ2 between experimental and back-calculated coupling constants. The residual is given by
$$\chi^2(\sigma_{\mathrm{ref}}) = \sum_{r} \sum_{i} \left( \frac{{}^3J_{i,r,\mathrm{exp}} - {}^3J_{i,r,\mathrm{calc}}(\sigma_{\mathrm{ref}})}{\varepsilon_{i,r}} \right)^2$$
where the sum is over all available three-bond scalar couplings, i, and residues, r, and ε is the error associated with each scalar coupling measurement. A minimum value of 0.10 Hz or 0.05 Hz was used for ε for scalar couplings measured in T4L L99A and ubiquitin, respectively. The calculated scalar couplings, 3Ji,r,calc(σref), were determined using the populations, p, determined from chemical shifts. The optimisation was repeated 136 times using a particle swarm algorithm (available at https://pythonhosted.org/pyswarm/), each time omitting two residues from the combined T4L L99A and ubiquitin dataset. The final references are shown in Fig. S3.
Experimentally measured long-range scalar couplings.
Side-chain rotamer populations derived from chemical shifts.
Figure S1: Ace-Ile-NMe (AIN) potential energy surface obtained from DFT with the B3LYP functional and a 6-31G* basis set. The φ, ψ, χ1 and χ2 angles were fixed at each grid-point, giving a 5° {χ1,χ2} grid for each of the three backbone conformations, α, β1, and β2. The calculations were carried out as described in Supporting Methods. The minima (red dots) were derived by fitting a two-dimensional second-order polynomial to the energies within ±10 degrees of the minimum. White areas correspond to grid-points where the DFT calculation did not converge.
Figure S2: Chemical shift surfaces for Ace-Ile-NMe aliphatic 13C nuclei, calculated using the shielding tensor from DFT and the un-optimised atom-specific reference tensors, σref,i. The white contours indicate the regions that together comprise 90% of the total populations observed in high-resolution crystal structures. White areas correspond to grid-points where the DFT calculation did not converge.