N-Terminal speciation for native chemical ligation.

Native chemical ligation (NCL) enables the chemical synthesis of peptides via reactions between N-terminal thiolates and C-terminal thioesters under mild, aqueous conditions at pH 7-8. Here we demonstrate quantitatively how thiol speciation at N-terminal cysteines and analogues varies significantly depending upon structure at typical pH values used in NCL.

Proteins are versatile biological macromolecules, and their total chemical synthesis allows chemists to access important targets that can be difficult to obtain from traditional biological sources. 1 Furthermore, the incorporation of non-natural amino acid residues is more readily achievable, allowing the structural and functional properties of proteins to be probed.
Over the past two decades, native chemical ligation (NCL, Fig. 1) has revolutionized peptide science through its ability to couple peptide fragments under mild conditions without additional coupling agents or side chain protection. 2 Protein total syntheses have been achieved chiefly through the combination of solid phase peptide synthesis (SPPS) 3 and NCL. SPPS allows for the synthesis of peptide fragments of up to 50 residues in length and NCL allows subsequent bioconjugation of these fragments to give target proteins. 4 Early examples of NCL required the presence of a cysteine residue at the N-terminus of one peptide fragment; however, its scope has been expanded substantially through the use of thiol analogues of natural amino acids 5 and selenocysteine derivatives. 6 These systems can then be transformed into the natural residue via desulfurisation 7 or deselenisation 8 reactions. Other major advances have included auxiliary-mediated NCL, 9 kinetically controlled ligation (KCL) 10 and templated NCL. 11
The mechanism of NCL involves transthioesterification between the cysteine residue and the thioester followed by an intramolecular S-to-N acyl shift to form the native amide bond (Fig. 1). 2c The addition of aryl thiol additives usually accelerates NCL through an initial thioester exchange step to a thioaryl nucleofuge. 12 Typically, as drawn in Fig. 1, only a single thiolate species is considered as the active nucleophile in the NCL literature, based on the higher nucleophilicity of the anionic thiolate relative to the neutral thiol.
However, the cysteine residue at the N-terminus of a peptide has two possible ionization sites: the thiol and the ammonium group. The pKa values of an alkyl thiol (~9-11) and an amino acid primary ammonium (~9-10) are sufficiently close that up to four species may be present in solution depending upon the pH: cationic i, formally neutral zwitterionic ii, neutral iii and anionic iv (Fig. 2). The concentration of each species is controlled by the acid dissociation constants, Ka(A)-Ka(D). Importantly, this means that two different thiolate species (ii and iv) are available for NCL in solution at a given pH, and it is not strictly correct to quote one pKa for the cysteine thiol, as is commonly done. The use of modified cysteine analogues in NCL will further alter Ka(A)-Ka(D) and the species distribution.
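As an illustration of how four microscopic constants translate into a species distribution, the following minimal Python sketch computes the fractions of i-iv at a given pH from mass-action expressions. It assumes the assignment Ka(A): i to ii, Ka(B): i to iii, Ka(D): iii to iv, with Ka(C) (ii to iv) fixed by the thermodynamic cycle; this assignment is inferred from the text rather than copied from Fig. 2. The function names and the pKa values are hypothetical placeholders, not measured data.

```python
def speciation(pH, pKa_A, pKa_B, pKa_D):
    """Return the fractions of species i-iv at a given pH.

    Assumed assignment (inferred from the text, not taken from Fig. 2 itself):
      Ka(A): i   -> ii   thiol deprotonation of the cation
      Ka(B): i   -> iii  ammonium deprotonation of the cation
      Ka(D): iii -> iv   thiol deprotonation of the neutral amine
    Ka(C) (ii -> iv) follows from the thermodynamic cycle and is not an input.
    """
    h = 10.0 ** (-pH)
    ka_a = 10.0 ** (-pKa_A)
    ka_b = 10.0 ** (-pKa_B)
    ka_d = 10.0 ** (-pKa_D)

    # Concentrations relative to [i] = 1, from the mass-action expressions.
    relative = {
        "i": 1.0,
        "ii": ka_a / h,
        "iii": ka_b / h,
        "iv": ka_b * ka_d / h ** 2,
    }
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


def pKa_C_from_cycle(pKa_A, pKa_B, pKa_D):
    """pKa(C) implied by the cycle Ka(A) * Ka(C) = Ka(B) * Ka(D)."""
    return pKa_B + pKa_D - pKa_A


# Hypothetical placeholder values, NOT the measured constants of this work.
EXAMPLE = {"pKa_A": 7.5, "pKa_B": 7.0, "pKa_D": 9.0}

for pH in (7.0, 8.0):
    fractions = speciation(pH, **EXAMPLE)
    thiolate = fractions["ii"] + fractions["iv"]  # both thiolate-bearing species
    percentages = {name: round(100 * value, 1) for name, value in fractions.items()}
    print(f"pH {pH}: {percentages}, total thiolate {100 * thiolate:.1f}%")
```

Working with concentrations relative to [i] avoids needing an absolute concentration: only ratios of the constants to [H+] enter, so the fractions always sum to one.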
Herein, we report the pKa values for a series of cysteine and thiolated analogues of amino acid methyl esters and peptides (Fig. 3, 1-10). We evaluate the speciation (% i-iv) over the whole pH range, including at typical NCL pH values. N-terminal acid dissociation constants will be most influenced by substituents in close proximity; thus monomeric amino acid derivatives and short peptides are appropriate models to assess N-terminal speciation. The dissociation constants Ka(A)-Ka(D) were determined by UV-Vis spectrophotometry using an adapted form of the procedure reported by Benesch 13 for evaluation of the acid dissociation constants of cysteine 1. The changes in absorbance from thiolate (A_RS⁻, λmax = 237 nm) were determined across the pH range 1.4-12.5 (Fig. S1-S47, ESI †). A small blue shift of the absorbance to λmax = 235 nm is observed at lower pH values, which is attributed to ⁺H₃N-R-S⁻ ii absorbing at shorter wavelengths than H₂N-R-S⁻ iv (ESI, † Section S3). For the range of thiolated substrates employed, it was necessary to conduct measurements in the presence of 2 mM TCEP to prevent interference from thiol(ate) oxidation. The fraction of thiolate species, f_RS⁻, in solution was calculated from the ratio of A_RS⁻ at a given pH to A_RS⁻(max) at pH 12.5, where all thiols are in thiolate form (eqn (1)). The dissociation constants Ka(A), Ka(B) and Ka(D) could then be obtained by fitting eqn (2) to the data for f_RS⁻ versus pH (e.g. Fig. S30 for 6, ESI †, Section S2) and Ka(C) using eqn (S1) (ESI †). Attempts to fit the data to an alternative model with two non-overlapping pKa values and three species (H₂A, HA⁻ and A²⁻) did not converge upon a solution (ESI, † Section S5).
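For orientation, the relationships described in this paragraph can be written out as follows. The first line restates the absorbance ratio given in the text. The remaining lines are a sketch derived from mass action for the four-species scheme, assuming Ka(A): i to ii, Ka(B): i to iii, Ka(C): ii to iv and Ka(D): iii to iv. They may differ in form from the authors' eqn (2) and eqn (S1), which are not reproduced here.

```latex
% Thiolate fraction from absorbance, as described in the text
f_{\mathrm{RS^-}} = \frac{A_{\mathrm{RS^-}}}{A_{\mathrm{RS^-}(\max)}}

% Thiolate fraction from the four-species scheme (assumed assignment of A-D)
f_{\mathrm{RS^-}}
  = \frac{[\mathrm{ii}] + [\mathrm{iv}]}{[\mathrm{i}] + [\mathrm{ii}] + [\mathrm{iii}] + [\mathrm{iv}]}
  = \frac{K_a(\mathrm{A})\,[\mathrm{H^+}] + K_a(\mathrm{B})\,K_a(\mathrm{D})}
         {[\mathrm{H^+}]^2 + \bigl(K_a(\mathrm{A}) + K_a(\mathrm{B})\bigr)[\mathrm{H^+}] + K_a(\mathrm{B})\,K_a(\mathrm{D})}

% Thermodynamic cycle linking the four constants
K_a(\mathrm{A})\,K_a(\mathrm{C}) = K_a(\mathrm{B})\,K_a(\mathrm{D})
\quad\Longrightarrow\quad
\mathrm{p}K_a(\mathrm{C}) = \mathrm{p}K_a(\mathrm{B}) + \mathrm{p}K_a(\mathrm{D}) - \mathrm{p}K_a(\mathrm{A})
```

Only three of the four constants are independent under this scheme, which is consistent with Ka(A), Ka(B) and Ka(D) being fitted and Ka(C) being obtained separately.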
The acid dissociation constants pKa(A)-pKa(D) determined for 1-10 are shown in Table 1. With the exception of cysteine zwitterion 1, the pKa values are in the order: pKa(B) < pKa(A) < pKa(C) < pKa(D). pKa(A) and pKa(D) both refer to deprotonation at the thiol, with i more acidic than iii (pKa(A) < pKa(D)). This may be attributed to the greater stability of the zwitterionic conjugate base ii relative to iv due to internal electrostatic stabilisation of the thiolate anion by the ammonium cation. For acid dissociation of the two ammonium species, the electrostatic stabilisation of ii decreases acidity relative to i (pKa(C) > pKa(B)). N-terminal pKa(A)-pKa(D) values for cysteine zwitterion 1 are all higher than for 2-10 owing to the proximity of the anionic carboxylate, § which destabilises all conjugate base species ii-iv. This effect decreases with distance or upon conversion to ester (1 > 5 > 2).
Thiol pKa(A) and pKa(D) values vary between cysteine and derivatives 1-10. The pKa(A) and pKa(D) values for penicillamine derivatives 3 and 7 are higher than for cysteine analogues 2 and 6.

Fig. 3 The cysteine and thiolated analogues of amino acid methyl esters and peptides studied. The percentage abundance of each of the four species i-iv in solution at pH 7.0 and pH 8.0, 25 °C and ionic strength I = 0.3 M (NaCl) is calculated using the acid dissociation constants, Ka(A)-Ka(D), reported herein. Percentages are accurate to ±6%.
This journal is © The Royal Society of Chemistry 2020

The decreased acidity of the penicillamine thiols can be attributed to the two additional electron donating methyl groups in 3 and 7, which will inductively destabilise the thiolate and/or reduce solvation of the penicillamine thiolate due to the adjacent methyl groups. Unexpectedly, the (4S)-mercaptoproline methyl ester 4 and peptide 8 have lower thiol pKa(D) values than the cysteine methyl ester 2 and peptide 6. It might be predicted that the secondary thiol in 4 and 8 would have a higher pKa(D) value due to greater inductive destabilisation of the thiolate than for the primary thiol in 2 and 6. To account for the observed decrease, we propose that the conformation of the pyrrolidine coupled with stereoelectronic effects alters speciation in this case. The pyrrolidine ring in proline and proline derivatives has two major pucker conformations, Cγ-endo and Cγ-exo (Fig. 4a). The preferred conformation is dependent upon a combination of stereoelectronic effects, minimisation of unfavourable dipole-dipole interactions and whether the stereocentre at Cγ has the R or S configuration. Moroder has shown that (4S)-mercaptoproline favours the Cγ-exo ring pucker. 14 The pyrrolidine ring adopts a conformation that places the thiol group in the sterically more favourable equatorial position of the ring pucker. In the favoured conformations the C-N and C-S bonds lie in an anti-conformation (Fig. 4b). We postulate that this anti-conformation may promote the formation of the thiolate via stereoelectronic stabilisation by the more electronegative N, counteracting an unfavourable inductive effect (Fig. 4c).
We also examined the effect of the identity of the adjacent amino acid (Gly, Ser or Val) upon the pKa values of the N-terminal Cys residue in peptides 6, 9 and 10. Ser and Val were chosen to represent adjacent residues with hydrogen bonding capabilities and steric bulk, respectively. Compared to Gly, the Ser and Val residues led to increases in both pKa(A) and pKa(B); however, no effects upon pKa(C) and pKa(D) values were observed. Hydrogen bonding of the Ser of 9 to the terminal ammonium will stabilize cationic species i, thereby increasing both pKa(A) and pKa(B).
The hydrophobic and inductively donating Val of 10 would also be predicted to favour cationic i relative to formally neutral ii and iii by allowing for increased aqueous solvation.
To assess the impact that our results could have for NCL we evaluated the relative concentrations of the four species i-iv across the whole pH range using the pKa(A)-pKa(D) values. The variation in the concentration of i-iv for 1-10 at pH 0-14 is shown in Fig. S57-S66 (ESI †). As NCL is typically performed at pH 7-8, the populations (%) of species i-iv at pH 7 and 8 are given in Fig. 3 and Table S20 (ESI †). Importantly, our results show that the major thiolate species is zwitterion ii rather than anion iv as commonly represented in NCL literature. For all substrates 1-10, ≤1% of anion species iv is present at pH 7. For the cysteine peptide 6 at pH 8, the percentage of the

Fig. 2 The four possible cysteine species i-iv in solution and the acid dissociation constants that define the interrelationship between each species.

ᵃ pKa(A), pKa(B) and pKa(D) values were obtained from a fit of the percentage of thiol in thiolate form (f_RS⁻, eqn (1)) to eqn (2) and pKa(C) was determined using eqn (S1) (ESI).
ᵇ Value from Benesch determined in the absence of TCEP. 13
ᶜ Determined in the absence of 2 mM TCEP.
ᵈ Determined in the absence of 2 mM TCEP only as minimal oxidation was observed on the timescale of the UV-Vis spectrophotometric experiments.

Fig. 4 Proposed stereoelectronic justification for the lower pKa(D) values for (4S)-mercaptoproline methyl ester 4 and peptide 8 compared to cysteine analogues 2 and 6: (a) the pyrrolidine ring in (4S)-mercaptoproline 4 prefers to adopt the Cγ-exo pucker in solution; 14 (b) in the Cγ-exo pucker the C-N and C-S bonds are in an anti-conformation (purple coloured bonds); (c) the anti-conformation allows for stereoelectronic stabilisation of the thiolate by hyperconjugation.

Table 1
Summary of pKa(A)-pKa(D) values for a range of cysteine derivatives at 25 °C, ionic strength I = 0.3 M (NaCl) with 2 mM TCEP.ᵃ pKa values are accurate to ±0.07 units.