Oliver R. Maguire‡, Jiayun Zhu, William D. G. Brittain, Alexander S. Hudson, Steven L. Cobb and AnnMarie C. O’Donoghue*
Department of Chemistry, Durham University, University Science Laboratories, South Road, Durham DH1 3LE, UK. E-mail: annmarie.odonoghue@durham.ac.uk
First published on 23rd April 2020
Native chemical ligation (NCL) enables the chemical synthesis of peptides via reactions between N-terminal thiolates and C-terminal thioesters under mild, aqueous conditions at pH 7–8. Here we demonstrate quantitatively how thiol speciation at N-terminal cysteines and analogues varies significantly depending upon structure at typical pH values used in NCL.
Over the past two decades, native chemical ligation (NCL, Fig. 1) has revolutionized peptide science through its ability to couple peptide fragments under mild conditions without additional coupling agents or side chain protection.2 Protein total syntheses have been achieved chiefly through the combination of solid phase peptide synthesis (SPPS)3 and NCL. SPPS allows for the synthesis of peptide fragments of up to 50 residues in length and NCL allows subsequent bioconjugation of these fragments to give target proteins.4 Early examples of NCL required the presence of a cysteine residue at the N-terminus of one peptide fragment; however, its scope has been expanded substantially through the use of thiol analogues of natural amino acids5 and selenocysteine derivatives.6 These systems can then be transformed into the natural residue via desulfurisation7 or deselenisation8 reactions. Other major advances have included auxiliary-mediated NCL,9 kinetically controlled ligation (KCL)10 and templated NCL.11
The mechanism of NCL involves transthioesterification between the cysteine residue and the thioester followed by an intramolecular S-to-N acyl shift to form the native amide bond (Fig. 1).2c The addition of aryl thiol additives usually accelerates NCL through an initial thioester exchange step to a thioaryl nucleofuge.12 Typically, as drawn in Fig. 1, only a single thiolate species is considered as the active nucleophile in the NCL literature, based on the higher nucleophilicity of the anionic thiolate relative to the neutral thiol.
However, the cysteine residue at the N-terminus of a peptide has two possible ionization sites: the thiol and the ammonium group. The pKas of an alkyl thiol (∼9–11) and an amino acid primary ammonium (∼9–10) are sufficiently close that up to four species may be present in solution depending upon the pH: i cationic, ii formally neutral zwitterionic, iii neutral and iv anionic species (Fig. 2). The concentration of each species is controlled by the acid dissociation constants, Ka(A)–Ka(D). Importantly, this means that NCL has the option of two different thiolate species in solution at a given pH (ii and iv), and it is not strictly correct to quote one pKa for the cysteine thiol as is commonly done. The use of modified cysteine analogues in NCL will further alter Ka(A)–Ka(D) and the species distribution.
Fig. 2 The four possible cysteine species i–iv in solution and the acid dissociation constants that define the interrelationship between each species.
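The relationships between the four species can be written out explicitly. The following is a sketch rather than a reproduction of Fig. 2: it assumes that Ka(A) and Ka(D) are the thiol ionisations (the text refers to them as the thiol constants) and that Ka(B) and Ka(C) are the ammonium ionisations. Because both routes from i to iv must give the same overall equilibrium, only three of the four constants are independent.

```latex
% Microscopic acid dissociation constants (assignment assumed, see lead-in)
K_{\mathrm{a}}(\mathrm{A}) = \frac{[\mathrm{ii}][\mathrm{H^+}]}{[\mathrm{i}]}, \qquad
K_{\mathrm{a}}(\mathrm{B}) = \frac{[\mathrm{iii}][\mathrm{H^+}]}{[\mathrm{i}]}, \qquad
K_{\mathrm{a}}(\mathrm{C}) = \frac{[\mathrm{iv}][\mathrm{H^+}]}{[\mathrm{ii}]}, \qquad
K_{\mathrm{a}}(\mathrm{D}) = \frac{[\mathrm{iv}][\mathrm{H^+}]}{[\mathrm{iii}]}

% Thermodynamic cycle: i -> ii -> iv and i -> iii -> iv are equivalent overall
K_{\mathrm{a}}(\mathrm{A})\,K_{\mathrm{a}}(\mathrm{C}) = K_{\mathrm{a}}(\mathrm{B})\,K_{\mathrm{a}}(\mathrm{D})
\quad\Longleftrightarrow\quad
\mathrm{p}K_{\mathrm{a}}(\mathrm{A}) + \mathrm{p}K_{\mathrm{a}}(\mathrm{C}) = \mathrm{p}K_{\mathrm{a}}(\mathrm{B}) + \mathrm{p}K_{\mathrm{a}}(\mathrm{D})
```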
Herein, we report the pKa values for a series of cysteine and thiolated analogues of amino acid methyl esters and peptides (Fig. 3, 1–10). We evaluate the speciation (%i–iv) over the whole pH range, including at typical NCL pH values. N-terminal acid dissociation constants will be most influenced by substituents in close proximity; thus, monomeric amino acid derivatives and short peptides are appropriate models to assess N-terminal speciation. The dissociation constants Ka(A)–Ka(D) were determined by UV-Vis spectrophotometry using an adapted form of the procedure reported by Benesch13 for evaluation of the acid dissociation constants of cysteine 1. The changes in absorbance from thiolate (ARS-, λmax = 237 nm) were determined across the pH range 1.4–12.5 (Fig. S1–S47, ESI†). A small blue shift of the absorbance to λmax = 235 nm is observed at lower pH, which is attributed to +H3N-R-S− (ii) absorbing at shorter wavelengths than H2N-R-S− (iv) (ESI,† Section S3). For the range of thiolated substrates employed, it was necessary to conduct measurements in the presence of 2 mM TCEP to prevent interference from thiol(ate) oxidation. The fraction of thiolate species, fRS-, in solution was calculated from the ratio of ARS- at a given pH to ARS-(max) at pH 12.5, where all thiols are in thiolate form (eqn (1)). The dissociation constants Ka(A), Ka(B) and Ka(D) could then be obtained by fitting eqn (2) to the data for fRS- versus pH (e.g. Fig. S30 for 6, ESI†, Section S2) and Ka(C) using eqn (S1) (ESI†). Attempts to fit the data to an alternative model with two non-overlapping pKas and three species (H2A, HA− and A2−) did not converge upon a solution (ESI,† Section S5).
$$f_{\mathrm{RS^-}} = \frac{A_{\mathrm{RS^-}}}{A_{\mathrm{RS^-}(\max)}} \qquad (1)$$
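$$f_{\mathrm{RS^-}} = \frac{K_{\mathrm{a}}(\mathrm{A})[\mathrm{H^+}] + K_{\mathrm{a}}(\mathrm{B})K_{\mathrm{a}}(\mathrm{D})}{[\mathrm{H^+}]^2 + \left(K_{\mathrm{a}}(\mathrm{A}) + K_{\mathrm{a}}(\mathrm{B})\right)[\mathrm{H^+}] + K_{\mathrm{a}}(\mathrm{B})K_{\mathrm{a}}(\mathrm{D})} \qquad (2)$$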
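The two-step analysis described above (absorbance ratio, then a fit of thiolate fraction against pH) can be sketched in a few lines. This is a minimal illustration, not the authors' fitting code: it assumes that eqn (2) takes the form that follows from the four-species scheme with Ka(A)Ka(C) = Ka(B)Ka(D), and the function names are hypothetical. The last function is only the objective that a least-squares routine would minimise; no particular optimiser is implied.

```python
def thiolate_fraction_from_absorbance(a_rs, a_rs_max):
    """Eqn (1): thiolate absorbance at a given pH divided by its value at pH 12.5."""
    return a_rs / a_rs_max


def model_thiolate_fraction(pH, pKa_A, pKa_B, pKa_D):
    """Assumed form of eqn (2): ([ii] + [iv]) / ([i] + [ii] + [iii] + [iv])."""
    h = 10.0 ** (-pH)
    Ka_A, Ka_B, Ka_D = (10.0 ** (-p) for p in (pKa_A, pKa_B, pKa_D))
    # Terms proportional to [ii] and [iv]; Ka_B * Ka_D stands in for Ka_A * Ka_C.
    thiolate = Ka_A * h + Ka_B * Ka_D
    # Terms proportional to [i], [ii] + [iii] and [iv].
    total = h * h + (Ka_A + Ka_B) * h + Ka_B * Ka_D
    return thiolate / total


def sum_squared_residuals(params, pH_values, f_observed):
    """Objective to minimise over (pKa_A, pKa_B, pKa_D) when fitting f versus pH."""
    pKa_A, pKa_B, pKa_D = params
    return sum(
        (model_thiolate_fraction(pH, pKa_A, pKa_B, pKa_D) - f) ** 2
        for pH, f in zip(pH_values, f_observed)
    )
```

Only three constants appear in the model because the fourth is fixed by the thermodynamic cycle, which is consistent with Ka(C) being obtained separately after the fit.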
The acid dissociation constants pKa(A)–pKa(D) determined for 1–10 are shown in Table 1. With the exception of cysteine zwitterion 1, the pKas are in the order:
pKa(B) < pKa(A) < pKa(C) < pKa(D)
Table 1 Acid dissociation constants pKa(A)–pKa(D) determined for 1–10

Compound | pKa(A)a | pKa(B)a | pKa(C)a | pKa(D)a |
---|---|---|---|---|
1 Cysteine | 8.41 (8.53)b | 8.47 (8.86)b | 9.88 (10.36)b | 9.83 (10.03)b |
2 Cysteine methyl ester | 7.31 | 6.63 | 8.29 | 8.98 |
2 (without TCEP) | 7.35c | 6.99c | 8.60c | 8.95c |
3 Penicillamine methyl ester | 7.38 | 6.61 | 8.60 | 9.38 |
3 (without TCEP) | 7.67c | 7.07c | 8.71c | 9.31c |
4 (4S)-Mercaptoproline methyl ester | 7.12d | 6.89d | 8.52d | 8.74d |
5 H-Cys-Gly-OH | 7.97 | 7.01 | 8.34 | 9.30 |
6 H-Cys-Gly-Phe-NH2 | 7.13 | 6.50 | 8.42 | 9.04 |
7 H-Pen-Gly-Phe-NH2 | 7.44 | 6.35 | 8.39 | 9.49 |
8 (4S)-Mcp-Gly-Phe-NH2 | 7.36 | 7.18 | 8.49 | 8.89 |
9 H-Cys-Ser-Phe-NH2 | 7.31 | 6.67 | 8.43 | 9.06 |
10 H-Cys-Val-Phe-NH2 | 7.39 | 6.74 | 8.39 | 9.03 |

a pKa(A), pKa(B) and pKa(D) values were obtained from a fit of the percentage of thiol in thiolate form (fRS-, eqn (1)) to eqn (2) and pKa(C) was determined using eqn (S1) (ESI). b Value from Benesch determined in the absence of TCEP.13 c Determined in the absence of 2 mM TCEP. d Determined in the absence of 2 mM TCEP only, as minimal oxidation was observed on the timescale of the UV-Vis spectrophotometric experiments.
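The way the fourth constant follows from the other three can be checked against a few rows of the table. The sketch below assumes that eqn (S1), which is not reproduced here, expresses closure of the thermodynamic cycle, pKa(A) + pKa(C) = pKa(B) + pKa(D); the helper names are hypothetical and the values are copied from the entries for 2, 6 and 7 measured with TCEP.

```python
# pKa values in the order (A, B, C, D), copied from the table for three entries.
TABLE_ROWS = {
    "2 Cysteine methyl ester": (7.31, 6.63, 8.29, 8.98),
    "6 H-Cys-Gly-Phe-NH2": (7.13, 6.50, 8.42, 9.04),
    "7 H-Pen-Gly-Phe-NH2": (7.44, 6.35, 8.39, 9.49),
}


def pKa_C_from_cycle(pKa_A, pKa_B, pKa_D):
    """Assumed closure relation: pKa(C) = pKa(B) + pKa(D) - pKa(A)."""
    return pKa_B + pKa_D - pKa_A


def cycle_deviations(rows):
    """Calculated minus tabulated pKa(C) for each row."""
    return {
        name: pKa_C_from_cycle(a, b, d) - c
        for name, (a, b, c, d) in rows.items()
    }


for name, deviation in cycle_deviations(TABLE_ROWS).items():
    print(f"{name}: calculated - tabulated pKa(C) = {deviation:+.2f}")
```

For these three rows the calculated and tabulated pKa(C) values agree to within the rounding of the reported constants.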
Thiol pKa(A) and pKa(D) values vary between cysteine and derivatives 1–10. The pKa(A) and pKa(D) values for penicillamine derivatives 3 and 7 are higher than for cysteine analogues 2 and 6. The decreased acidity of the penicillamine thiols can be attributed to the two additional electron donating methyl groups in 3 and 7, which will inductively destabilise the thiolate and/or reduce solvation of the penicillamine thiolate due to the adjacent methyl groups. Unexpectedly, the (4S)-mercaptoproline methyl ester 4 and peptide 8 have lower thiol pKa(D) values than the cysteine methyl ester 2 and peptide 6. It might be predicted that the secondary thiol in 4 and 8 would have a higher pKa(D) value due to greater inductive destabilisation of thiolate than for the primary thiol in 2 and 6. To account for the observed decrease, we propose that the conformation of the pyrrolidine coupled with stereoelectronic effects alters speciation in this case. The pyrrolidine ring in proline and proline derivatives has two major conformations, the Cγ-endo and Cγ-exo puckers (Fig. 4a). The preferred conformation is dependent upon a combination of stereoelectronic effects, minimisation of unfavourable dipole–dipole interactions and whether the Cγ substituent has the R or S configuration. Moroder has shown that (4S)-mercaptoproline favours the Cγ-exo ring pucker.14 The pyrrolidine ring adopts a conformation that places the thiol group in the sterically more favourable equatorial position of the ring pucker. In the favoured conformations the C–N and C–S bonds lie in an anti-conformation (Fig. 4b). We postulate that this anti-conformation may promote the formation of the thiolate via stereoelectronic stabilisation by the more electronegative N counteracting an unfavourable inductive effect (Fig. 4c).
Fig. 4 Proposed stereoelectronic justification for the lower pKa(D) values for (4S)-mercaptoproline methyl ester 4 and peptide 8 compared to cysteine analogues 2 and 6: (a) the pyrrolidine ring in (4S)-mercaptoproline 4 prefers to adopt the Cγ-exo pucker in solution;14 (b) in the Cγ-exo pucker the C–N and C–S bonds are in an anti-conformation (purple coloured bonds); (c) the anti-conformation allows for stereoelectronic stabilisation of the thiolate by hyperconjugation.
We also examined the effect of the identity of the adjacent amino acid (Gly, Ser or Val) upon the pKa values of the N-terminal Cys residue in peptides 6, 9 and 10. Ser and Val were chosen to represent adjacent residues with hydrogen bonding capabilities and steric bulk, respectively. Compared to Gly, the Ser and Val residues led to increases in both pKa(A) and pKa(B); however, no effects upon pKa(C) and pKa(D) values were observed. Hydrogen bonding of the Ser of 9 to the terminal ammonium will stabilize cationic species i, thereby increasing both pKa(A) and pKa(B). The hydrophobic and inductively donating Val of 10 would also be predicted to favour cationic i relative to formally neutral ii and iii by allowing for increased aqueous solvation.
To assess the impact that our results could have for NCL we evaluated the relative concentrations of the four species i–iv across the whole pH range using the pKa(A)–pKa(D) values. The variation in the concentration of i–iv for 1–10 at pH 0–14 is shown in Fig. S57–S66 (ESI†). As NCL is typically performed at pH 7–8, the populations (%) of species i–iv at pH 7 and 8 are given in Fig. 3 and Table S20 (ESI†). Importantly, our results show that the major thiolate species is zwitterion ii rather than anion iv as commonly represented in NCL literature. For all substrates 1–10, ≤ 1% of anion species iv is present at pH 7. For the cysteine peptide 6 at pH 8, the percentage of the concentration of ii (17%) is over twice that of iv (7%). A similar 2-fold higher concentration of ii vs. iv exists for the penicillamine peptide 7 at pH 8, with 10% total thiolate. For (4S)-Mcp peptide 8 at pH 8 the effect is more pronounced with a 5-fold higher concentration of ii vs. iv, and 41% in thiolate form.
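The populations quoted above can be recomputed from the tabulated constants. The following is a minimal sketch, assuming the four-species scheme in which each species' concentration is proportional to [H+]² (i), Ka(A)[H+] (ii), Ka(B)[H+] (iii) and Ka(B)Ka(D) (iv); the function name is hypothetical and the pKa inputs are those reported for peptides 6 and 7.

```python
def speciation_percent(pH, pKa_A, pKa_B, pKa_D):
    """Percentage of species i-iv at a given pH from three microscopic pKa values."""
    h = 10.0 ** (-pH)
    Ka_A, Ka_B, Ka_D = (10.0 ** (-p) for p in (pKa_A, pKa_B, pKa_D))
    # Relative weights of each species (all scaled by the same unknown factor).
    weights = {
        "i": h * h,           # cationic, thiol and ammonium
        "ii": Ka_A * h,       # zwitterionic, thiolate and ammonium
        "iii": Ka_B * h,      # neutral, thiol and amine
        "iv": Ka_B * Ka_D,    # anionic, thiolate and amine
    }
    total = sum(weights.values())
    return {species: 100.0 * w / total for species, w in weights.items()}


peptide_6 = speciation_percent(8.0, 7.13, 6.50, 9.04)
peptide_7 = speciation_percent(8.0, 7.44, 6.35, 9.49)

for label, result in (("6", peptide_6), ("7", peptide_7)):
    summary = ", ".join(f"{s}: {p:.1f}%" for s, p in result.items())
    print(f"peptide {label} at pH 8 -> {summary}")
```

With these inputs the sketch gives roughly 17% ii and 7% iv for peptide 6 at pH 8, and about 10% total thiolate for peptide 7, in line with the values quoted in the text.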
We have established that four thiol(ate) species i–iv must be considered in NCL. Our speciation diagrams (ESI,† Fig. S57–S66) illustrate that structural modifications to the cysteine scaffold significantly change the concentrations of i–iv as a function of pH. While our study provides details of structure-population properties of cysteine species i–iv, their relative reactivities (i.e. nucleophilicities), and thus individual contributions to NCL, remain unknown. However, our study delivers an essential step towards delineating the complex, parallel processes that contribute to NCL, where all species i–iv and their associated rate constants, which are quantitative measures of nucleophilicity, must be considered (see kNCL-i–kNCL-iv, eqn (S8) and Scheme S3, Section S8, ESI†).
These speciation differences between the various cysteine and thiolated analogues could potentially be exploited to perform N-terminal kinetically controlled ligations. C-terminal kinetically controlled ligations have been demonstrated based upon reactivity differences between thioesters.10 N-terminal kinetically controlled ligations could function by adjusting the rate of ligation based upon differences in thiolate concentrations and nucleophilicities using pH control. Used in conjunction with C-terminal kinetically controlled ligations, this could permit even finer control over peptide ligation.
In conclusion, we have demonstrated that four different species are present in solution for N-terminal cysteines and thiolated amino acid analogues, and have determined their pKa(A)–pKa(D) values. Our data highlight that two thiolate species, ii and iv, are present under NCL conditions, and must be considered. The anionic form iv will only be the most abundant thiolate species at higher pH values (pH > 8.3). Notably, (4S)-mercaptoproline methyl ester 4 and peptide 8 are unusually acidic, with lower thiol pKa(D) values than cysteine analogues 2 and 6, and thus higher thiolate iv populations at pH 7–8. We propose that additional stereoelectronic stabilisation of the thiolate in the favoured Cγ-exo ring puckered conformation of (4S)-Mcp favours acid dissociation. The effects of an amino acid adjacent to Cys were confined to cationic species i, and hence only the pKa(A) and pKa(B) values. These data permit the evaluation of the percentage of active thiolate species at typical NCL pH values for a range of widely used N-terminal cysteine derivatives offering (bio)chemists quantitative insight into the structural factors that influence NCL.
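The pH at which anion iv overtakes zwitterion ii follows directly from the definition of Ka(C). The short derivation below is a sketch that assumes Ka(C) is the ammonium ionisation linking ii and iv; the numerical example uses the pKa(C) reported for peptide 6.

```latex
% Ratio of the two thiolate species from the definition of Ka(C)
\frac{[\mathrm{iv}]}{[\mathrm{ii}]} = \frac{K_{\mathrm{a}}(\mathrm{C})}{[\mathrm{H^+}]}
  = 10^{\,\mathrm{pH} - \mathrm{p}K_{\mathrm{a}}(\mathrm{C})}

% Hence iv is the more abundant thiolate only when
\mathrm{pH} > \mathrm{p}K_{\mathrm{a}}(\mathrm{C})

% Example: peptide 6 at pH 8, with pKa(C) = 8.42
\frac{[\mathrm{iv}]}{[\mathrm{ii}]} = 10^{\,8 - 8.42} \approx 0.38
```

For derivatives 2–10 the tabulated pKa(C) values lie between 8.29 and 8.60, so the crossover falls at or just above pH 8.3, and at pH 8 the zwitterion ii of peptide 6 is a little more than twice as abundant as iv.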
The authors thank Dr Jingyi Zong for help with peptide synthesis, Dr David Hodgson for helpful discussions, EPSRC (ORM, EPSRC DTA 1213247) and Cambridge Research Biochemicals Ltd. (ASH) for studentships.
Footnotes
† Electronic supplementary information (ESI) available: UV-Visible spectrophotometric methods, data and fitting analysis; general synthetic procedures, materials and instrumentation. See DOI: 10.1039/d0cc01604g
‡ Current address: Radboud University Nijmegen, Institute for Molecules and Materials, Heyendaalseweg 135, 6525 AJ Nijmegen, The Netherlands.
§ C-terminal COOH groups are deprotonated at pH > 3 (pKa = 2.2 ± 0.4).15
This journal is © The Royal Society of Chemistry 2020