Protein stabilization by tuning the steric restraint at the reverse turn

The incorporation of pseudoallylic strain by N-methylation at the solvent-exposed loop in proteins leads to a stark increase in their thermodynamic stability that can be tuned by altering the amino acid composition.

Pin1 WW domain analogs NG and 6 to 11 were cleaved from the resin and globally deprotected using the cleavage cocktail TFA:DCM:TIPS:H2O (62.5:32.5:2.5:2.5) for 30 minutes at room temperature. The cleaved peptide solution was then precipitated in chilled diethyl ether, centrifuged twice, and dissolved in 10% acetonitrile for purification by RP-HPLC.

Purification By RP-HPLC
A suitably adjusted 20-minute gradient of 10% B to 50% B was used for purification of compounds 1 to 4, 3a-3n, 4a-4g, 5 and pG and their respective unfolded and fully folded controls, where solvent A was 0.1% TFA in H2O and solvent B was 0.1% TFA in acetonitrile. Pin1 WW domain analogs NG and 6 to 11 were purified using a 40-minute gradient of 15% B to 55% B with the same solvents A and B.

Determination of Cis-Trans Equilibrium of N-methylated Tetrapeptides:
To evaluate the influence of pseudoallylic strain on the cis-trans equilibrium in unrestrained N-methylated linear peptides, tetrapeptides L1-L4 (Ac-V-Xaa-NMeYaa-F-CONH2) were synthesized on Rink amide AM resin using standard Fmoc-based peptide synthesis as described above. N-methylation of these peptides was also carried out by the protocol described above. The final peptides were cleaved from the resin using the cleavage cocktail TFA:DCM:TIPS:H2O (62.5:32.5:2.5:2.5) and precipitated in chilled water. The crude peptides were purified using a suitably adjusted 20-minute gradient ranging from 15% B to 60% B, where B was acetonitrile with 0.1% TFA. Finally, the conformational dynamics of these unrestrained peptides were investigated by NMR (1H, TOCSY and ROESY) in 100 mM sodium phosphate buffer (pH=3.8). Two conformations were observed for peptides L1-L4. The populations belonging to the trans and cis conformations were identified from the ROESY spectrum. Ktrans/cis was then calculated by integrating the peaks belonging to the trans and cis conformations in the 1H spectrum of each tetrapeptide.
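As a rough illustration of the final step, Ktrans/cis is simply the ratio of the integrated trans and cis peak areas, and the same two integrals give the percent populations. The following is a minimal Python sketch; the function names and the example integrals are hypothetical and are not taken from the measured spectra.

```python
def k_trans_cis(trans_integral: float, cis_integral: float) -> float:
    """Return Ktrans/cis as the ratio of integrated 1H peak areas."""
    if trans_integral <= 0 or cis_integral <= 0:
        raise ValueError("peak integrals must be positive")
    return trans_integral / cis_integral


def percent_populations(trans_integral: float, cis_integral: float):
    """Return (percent trans, percent cis) from the two integrals."""
    if trans_integral < 0 or cis_integral < 0:
        raise ValueError("peak integrals must be non-negative")
    total = trans_integral + cis_integral
    if total == 0:
        raise ValueError("at least one integral must be non-zero")
    return 100.0 * trans_integral / total, 100.0 * cis_integral / total


if __name__ == "__main__":
    # Hypothetical integrals for one well-resolved resonance of a tetrapeptide.
    trans_area, cis_area = 3.0, 1.0
    print("Ktrans/cis =", k_trans_cis(trans_area, cis_area))
    print("populations (%) =", percent_populations(trans_area, cis_area))
```

In practice the same resonance (for example, a resolved amide or N-methyl signal) would be integrated in both conformers so that the two areas are directly comparable.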

Generation of Ramachandran Maps:
The sterically allowed conformational space for the i+1 and i+2 residues in the tetrapeptides was explored by considering two-linked peptide unit systems within the tetrapeptides. A given two-linked peptide system starts from the Cα atom of the ith residue and ends with the Cα atom of the (i+2)th residue, including all the backbone atoms and either the Cβ and Hα atoms attached to the middle Cα atom in the case of alanine or the two Hα atoms attached to the middle Cα atom in the case of glycine. The conformational flexibility of these systems was obtained by computing their corresponding Ramachandran maps. Starting from the initial model structure, coordinates for every integral value of the φ and ψ torsion angles, ranging from -180° to 180°, were computed by performing rotations about the N-Cα and Cα-C bonds. For every conformation, non-bonded interatomic distances between atoms at least three bonds apart were calculated. These distances were compared against the set of contact criteria given by Ramachandran et al. 2 For a given value of φ and ψ, if all the interatomic distances in the conformer are greater than the normal limit, the conformation is fully allowed; if one or more distances lie between the normal and extreme limits, it is partially allowed; if one or more distances are lower than the extreme limit, it is disallowed.

For Pin1 WW domain analogs NG and 6 to 11, far-UV CD spectra were recorded at 10 μM concentration in 20 mM sodium phosphate buffer (pH=7) over a wavelength range of 190-260 nm with a scan rate of 100 nm per minute and a data pitch of 0.5 nm.
The thermal stability of the Pin1 WW variants (10 μM each) was assessed by monitoring the CD signal at 227 nm over a temperature range of 2-110°C at 0.5°C intervals in 20 mM sodium phosphate buffer (pH=7). The samples were equilibrated for 2 minutes at each temperature, and the data reported are the average of three independent replicates. To monitor aggregation, thermal unfolding of the compounds was also recorded at two other concentrations, 5 μM and 50 μM. TM values were extracted by a nonlinear least-squares fit of the thermal denaturation curve, assuming a two-state model as previously described. 4

Fluorescence
The stability of the Pin1 WW variants was assessed by chaotrope denaturation, in which different concentrations of guanidinium hydrochloride (0-8 M) were titrated into 20 mM sodium phosphate buffer (pH=7.4) at a constant protein concentration of 2.5 μM. The samples were equilibrated for 1.5 hours at 4°C and the fluorescence intensity was recorded. Emission spectra were recorded from 310 nm to 410 nm in 1 nm steps with excitation at 284 nm. Chemical denaturation curves were obtained by monitoring the maximum tryptophan fluorescence intensity at 338 nm as a function of guanidinium hydrochloride concentration. The data obtained were fit to a two-state model as previously described. 5

Thermodynamic Determination of Compounds pG, 2, 3, 3a, 3g, 3n, 4a and 5:
The percentage of β-sheet population and ΔGfold were calculated from the Hα chemical shifts of each peptide at the reporter residues (Val3, Val5, Lys8 and Ile10). These reporter residues are located at hydrogen-bonded positions, which allows accurate determination of the β-sheet folded fraction. The fraction folded at each residue was calculated by the following equation:

Percentage of Fraction Folded = [(δobs-δU)/(δF-δU)]*100 (Equation 1)

where δobs is the Hα chemical shift of the peptide of interest, δU is the Hα chemical shift of its unfolded control, in which the D-amino acid is replaced with the L-amino acid, and δF is the Hα chemical shift of its fully folded control, obtained by cyclizing the peptide through disulfide bond formation. The error in chemical shift assignment was taken as 0.01 ppm and was incorporated in the calculation by the error propagation method. 6 The equilibrium constant was calculated using Equation 2:

K = fF/(1-fF) (Equation 2)

where fF is the fraction folded (Percentage of Fraction Folded/100), as follows for a two-state equilibrium. Finally, the ΔGfold was calculated using Equation 3:

ΔGfold = -RT ln K (Equation 3)
ΔGfold was calculated at each reporter position, and the average of the four values is reported.
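The chain from chemical shifts to ΔGfold, including the 0.01 ppm assignment error, can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' calculation: it assumes the two-state relation K = f/(1-f) between the fraction folded f and the equilibrium constant, applies first-order (uncorrelated) error propagation, and reports energies in kcal/mol at 298.15 K. The function names and the example shifts are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is an assumption)
SIGMA_PPM = 0.01      # assignment error per chemical shift, as stated in the text


def fraction_folded(d_obs, d_unf, d_fold, sigma=SIGMA_PPM):
    """Equation 1 as a fraction (0-1), with first-order propagated uncertainty."""
    span = d_fold - d_unf
    if span == 0:
        raise ValueError("folded and unfolded control shifts must differ")
    f = (d_obs - d_unf) / span
    # Partial derivatives of f with respect to each of the three shifts.
    df_dobs = 1.0 / span
    df_dunf = (d_obs - d_fold) / span**2
    df_dfold = -(d_obs - d_unf) / span**2
    sigma_f = sigma * math.sqrt(df_dobs**2 + df_dunf**2 + df_dfold**2)
    return f, sigma_f


def delta_g_fold(f, sigma_f, temperature=298.15):
    """Assumed two-state K = f/(1-f), then dG = -RT ln K with propagated error."""
    if not 0.0 < f < 1.0:
        raise ValueError("fraction folded must lie strictly between 0 and 1")
    k = f / (1.0 - f)
    sigma_k = sigma_f / (1.0 - f) ** 2
    dg = -R_KCAL * temperature * math.log(k)
    sigma_dg = R_KCAL * temperature * sigma_k / k
    return dg, sigma_dg


if __name__ == "__main__":
    # Hypothetical H-alpha shifts (ppm) at the four reporter residues:
    # (observed, unfolded control, folded control).
    reporters = {
        "Val3": (4.80, 4.40, 5.20),
        "Val5": (4.90, 4.45, 5.25),
        "Lys8": (4.85, 4.50, 5.15),
        "Ile10": (4.75, 4.35, 5.10),
    }
    values = []
    for name, shifts in reporters.items():
        f, sf = fraction_folded(*shifts)
        dg, sdg = delta_g_fold(f, sf)
        values.append(dg)
        print(f"{name}: f = {f:.2f} +/- {sf:.2f}, dG = {dg:.2f} +/- {sdg:.2f} kcal/mol")
    print(f"mean dG over reporters = {sum(values) / len(values):.2f} kcal/mol")
```

Note that the propagated uncertainty grows quickly as f approaches 0 or 1, which is why the separation between the folded and unfolded control shifts matters for the precision of ΔGfold.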

NMR Acquisition
Compounds 1 to 4, 3a-3n, 4a-4g, 5 and pG were dissolved in 100 mM sodium acetate buffer (pH 3.8) in H2O:D2O (9:1). Pin1 WW analogs 6 to 11 were dissolved in 20 mM sodium phosphate buffer (pH=7.4) containing 10% D2O. For all compounds, 0.1% TMSP was used as an internal standard (δ = 0 ppm). Standard Bruker pulse sequences zgesgp for 1H, mlevesgpph/dipsi2rcesgpph (60 ms mixing time) for TOCSY, roesyesgpph (100 ms mixing time) for ROESY and noesyesgpph (200 ms mixing time) for NOESY were used to acquire the NMR data. Two-dimensional data were obtained using 2048 data points in the direct dimension and 512 data points in the indirect dimension. The NMR spectra of compounds 1 to 4, 3a-3n, 4a-4g, 5 and pG were obtained at a concentration of 1-3 mM at room temperature (25°C). For Pin1 WW analogs NG and 6 to 11, the NMR spectra were obtained at a concentration of 40-60 μM at 15°C.
All NMR data were processed using iNMR (www.inmr.net), and the 2D NMR data were analyzed with SPARKY. 7 The chemical shift tables were generated from the TOCSY and 1H spectra. The sequential assignments and inter- and intra-residue NOEs were determined through ROESY. The NOEs were then integrated and the integration values were converted to distances using the formula V = K·d^-6, where V is the integrated peak volume, K is a constant (determined using resolved diastereotopic CH2 groups from Tyr2 or in some cases Leu11), and d is the distance between the protons.
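The volume-to-distance conversion can be illustrated with a short Python sketch: K is fixed from a reference cross-peak of known distance, and every other volume is then converted through d = (K/V)^(1/6). The geminal reference distance of 1.78 Å, the function names and the example volumes are assumptions made for illustration only; they are not values reported here.

```python
REF_DISTANCE_ANGSTROM = 1.78  # assumed geminal CH2 H-H distance; illustrative only


def calibration_constant(ref_volume: float,
                         ref_distance: float = REF_DISTANCE_ANGSTROM) -> float:
    """Solve V = K * d**-6 for K using a reference peak of known distance."""
    if ref_volume <= 0 or ref_distance <= 0:
        raise ValueError("reference volume and distance must be positive")
    return ref_volume * ref_distance ** 6


def distance_from_volume(volume: float, k: float) -> float:
    """Invert V = K * d**-6 to obtain the interproton distance in angstroms."""
    if volume <= 0 or k <= 0:
        raise ValueError("volume and calibration constant must be positive")
    return (k / volume) ** (1.0 / 6.0)


if __name__ == "__main__":
    # Hypothetical integrated volumes in arbitrary units.
    k = calibration_constant(ref_volume=640.0)
    for label, volume in [("strong", 200.0), ("medium", 40.0), ("weak", 10.0)]:
        print(f"{label}: d = {distance_from_volume(volume, k):.2f} angstrom")
```

Because of the sixth-root dependence, a 64-fold drop in volume only doubles the derived distance, so moderate integration errors translate into small distance errors.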
Amide temperature coefficients of compounds 2, 3, 3a, 3n, 4, 4a and 5 were determined by acquiring 1H NMR spectra at temperatures ranging from 298 K to 328 K and then calculating the change in the Lys8 HN chemical shift per degree change in temperature.
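One common way to obtain the change in chemical shift per degree is the slope of a linear least-squares fit of δ(HN) against temperature, usually quoted in ppb/K. The sketch below assumes that approach; the fitting choice, the function name and the example data are illustrative and are not taken from the measurements.

```python
def temperature_coefficient(temps_k, shifts_ppm):
    """Least-squares slope of chemical shift vs temperature, in ppb/K."""
    n = len(temps_k)
    if n != len(shifts_ppm) or n < 2:
        raise ValueError("need at least two matched (T, shift) points")
    mean_t = sum(temps_k) / n
    mean_d = sum(shifts_ppm) / n
    sxx = sum((t - mean_t) ** 2 for t in temps_k)
    if sxx == 0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(temps_k, shifts_ppm))
    return 1000.0 * sxy / sxx  # convert ppm/K to ppb/K


if __name__ == "__main__":
    # Hypothetical amide shifts recorded every 5 K between 298 K and 328 K.
    temps = [298, 303, 308, 313, 318, 323, 328]
    shifts = [8.500, 8.476, 8.451, 8.424, 8.401, 8.374, 8.350]
    print(f"d(delta)/dT = {temperature_coefficient(temps, shifts):.2f} ppb/K")
```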
To monitor aggregation of compounds 1 to 4, 3a-3n, 4a-4g and 5, dilution experiments were performed. In these experiments, 1H NMR spectra were recorded at 1:4 and 1:12 dilutions of each compound to detect any appreciable change in the proton chemical shifts as a consequence of aggregation.
Amide temperature coefficients of Pin1 compounds 8 and 11 were determined by acquiring a series of TOCSY spectra from 288 K to 308 K in 5 K increments in 20 mM sodium phosphate buffer (pH=7.4).

Structure Calculation
Structures were calculated using the CHARMM force field, 8 via the Discovery Studio interface, for the entire process.
The distance restraints were converted into a CHARMM restraint file using a custom Perl script. The resulting file was then used to define NOE restraints in CHARMM syntax. The upper and lower limits were defined by adding and subtracting 10% of the distance, respectively. If methyl protons were involved in a restraint, an additional 0.4 Å per methyl group (pseudoatom correction) was added to the upper limit to compensate for the errors involved. 9
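The bound calculation described above is simple arithmetic and can be sketched in Python as follows. This is not the Perl script used for the conversion and it does not emit CHARMM syntax; it only shows how the lower and upper limits follow from a NOE-derived distance. The function and parameter names are hypothetical.

```python
def noe_bounds(distance: float, n_methyl_groups: int = 0,
               tolerance: float = 0.10, methyl_correction: float = 0.4):
    """Return (lower, upper) restraint limits in angstroms.

    lower = d - 10% of d
    upper = d + 10% of d + 0.4 angstrom per methyl group in the restraint
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    if n_methyl_groups < 0:
        raise ValueError("number of methyl groups cannot be negative")
    lower = distance * (1.0 - tolerance)
    upper = distance * (1.0 + tolerance) + methyl_correction * n_methyl_groups
    return lower, upper


if __name__ == "__main__":
    # Hypothetical restraints: (label, NOE-derived distance, methyl groups involved).
    restraints = [("HN-HA", 2.8, 0), ("HA-methyl", 3.0, 1), ("methyl-methyl", 3.5, 2)]
    for label, d, n_me in restraints:
        lo, up = noe_bounds(d, n_me)
        print(f"{label}: {lo:.2f} to {up:.2f} angstrom")
```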

For compounds 3n, 4g and 5, an explicit water box with a water boundary of 16 Å was used. The linear structure was solvated in the water box. The solvated structure was refined with distance restraints by a simulated annealing protocol to obtain an initial structure. It was further refined with dihedral angle constraints derived from the 1H NMR spectra using the Bystrov equation, 10 followed by a 10 ns restrained molecular dynamics run. The average over the dynamics run was taken as the final structure, and the 10 lowest-energy structures were used to generate the ensemble.

Figure S1. Linear tetrapeptides displaying varying Ktrans/cis determined by 1H NMR in acetate buffer (pH 3.

Note: '*' indicates a chemical shift of the respective residue that is missing because of water suppression or peak overlap.

Table S12. Characteristic NOEs of 2, 3, 3a-3n (columns: characteristic NOEs, turn NOEs, strand NOEs).
Figure S45. Front view and side view of average solution structure of compound 3n.
Figure S46. Front view and side view of average solution structure of compound 4g.
Figure S47. Front view and side view of average solution structure of compound 5.
Figure S48. Ramachandran plot of i+1 and i+2 residues of 100 minimum energy conformations of compounds 5, 4g and 3n.
Figure S49. Overlayed Circular Dichroism spectra of 6-11.
Figure S50. Thermal, concentration dependent denaturation CD and fluorescence monitored chemical denaturation of Compound 11.
Figure S51. Thermal, concentration dependent denaturation CD and fluorescence monitored chemical denaturation of Compound 10.
Figure S52. Thermal, concentration dependent denaturation CD and fluorescence monitored chemical denaturation of Compound 9.
Figure S53. Thermal, concentration dependent denaturation CD and fluorescence monitored chemical denaturation of Compound 7.
Figure S54. Thermal, concentration dependent denaturation CD and fluorescence monitored chemical denaturation of Compound 6.