Open Access Article
Yu-De Chuang†ᵃ, Tsung-Jung Yang†ᵃ, Yu-Chi Chihᵃ, Ya-Rong Chenᵃ, Po-Cheng Kuoᵃ, Cheng-Chih Hsuᵃ, Shao-Lun Chiou*ᵃ and John Chu*ᵃᵇ
ᵃ Department of Chemistry, National Taiwan University, Taipei 106319, Taiwan. E-mail: d11223104@ntu.edu.tw; johnchu@ntu.edu.tw
ᵇ Center for Emerging Material and Advanced Devices, National Taiwan University, Taipei 10617, Taiwan
First published on 28th May 2026
Mass spectrometry (MS)-based protein analysis is an indispensable tool in modern biomedical research. A key step in sample preparation is proteolytic digestion using enzymes with well-defined amino acid specificity, such as trypsin, chymotrypsin, and StaphV8 protease, which cleave at basic, aromatic, and acidic residues, respectively. The absence of cysteine (Cys)-specific cleavage methods is a gap in the current protein analysis toolbox. Herein, we report a chemical reagent (1) that selectively cleaves the N-terminal amide bond of Cys residues in proteins. Using glutathione as a model peptide, we investigated the reaction kinetics in detail and identified optimized conditions for protein cleavage. Using thioesterase as a model protein, we further demonstrated that 1 is fully compatible with modern MS-based proteomics workflows, including in-gel digestion, where it can be used in combination with existing proteases. This reaction proceeds rapidly and selectively in aqueous buffers, affording high yields while converting the reactive Cys side-chain thiol into a chemically inert five-membered heterocyclic moiety. This transformation eliminates the need for the commonly employed iodoacetamide capping step and introduces a distinct mass tag that facilitates downstream data analysis. Overall, these features establish 1 as a robust and practical new tool for protein analysis.
Methods that are used to cleave a protein only at selected type(s) of AA are particularly useful, and for this reason, proteases with high AA specificity have been central to protein analysis.5 Trypsin, for example, catalyzes amide bond hydrolysis at the C-terminal side of arginine (Arg) and lysine (Lys) residues and is the most widely used enzyme for this purpose. Chymotrypsin and V8 protease cleave at aromatic and acidic AAs, respectively. Furthermore, digesting the same protein using proteases with distinct specificities in separate reactions generates peptide fragments with overlapping sequences. This strategy facilitates de novo protein sequencing, middle-down analysis, or distinguishing homologous proteins.6
However, very few proteases exhibit strict AA specificity. Several chemical reagents have been developed to bridge this gap (Fig. 1b and Fig. S1). For example, cyanogen bromide (CNBr), which cleaves proteins at methionine residues, was once widely used.7 Its high toxicity and sensitivity to oxidation are obvious drawbacks, and it is almost completely absent from routine protein analysis today. More recently, Raj and Elashal reported a serine (Ser)-specific cleavage reagent, but the reaction conditions were harsher than most biological samples can tolerate.8 While no known protease cleaves exclusively at cysteine (Cys), several chemical reagents target Cys by first modifying the side-chain thiol, such as through cyanylation and acetylation using reagents 2 and 3, respectively (Fig. 1b), followed by backbone cleavage at these positions.9 Unfortunately, the moderate cleavage efficiency of these reagents has limited their widespread adoption in protein analysis. Reported herein is reagent 1, a chemical tool that enables highly efficient and selective cleavage at Cys residues (Fig. 1c).
Using glutathione (GSH) as a model peptide, we showed that 1 cleaves quantitatively at the N-terminal amide bond of Cys in aqueous buffer at pH 11. We also used macolacin thioesterase (mTE) as a model protein to show that 1 is compatible with the standard protein analysis workflow. Specifically, 1 can be used alone or in combination with other proteases for in-gel digestion of proteins, and the resulting peptide mixtures are directly amenable to LCMS analysis. An additional advantage of 1 is that it converts the free Cys side-chain thiol into a chemically inert five-membered heterocyclic moiety, thereby obviating the commonly used iodoacetamide capping step that prevents unwanted thiol side reactions.10 This heterocyclic moiety also serves as a distinct mass tag that facilitates fragment identification during MS analysis. Collectively, these findings establish 1 as a valuable new tool for protein analysis.
We screened for conditions that enhance amide bond hydrolysis (Fig. 2a, Fig. S1–S5). Our first series of model reactions was set up in aqueous buffers adjusted to pH 6, 7, 8, 9, 10, and 11, wherein one equivalent of 1 was mixed with GSH and incubated at room temperature (Table S1 and Fig. S8). After 24 hours, each reaction was quenched by adding formic acid to a final concentration of 2% (v/v). Reagent 1, the heterocyclic intermediate 4, N-fragment 5, and C-fragment 6 were all readily separable by HPLC. The reaction yields were determined by integrating the peak corresponding to the C-fragment 6. We monitored the course of this reaction in detail at pH 10 and 11 by removing small aliquots from the reaction mixture at various time points. These aliquots were immediately quenched and analyzed by HPLC. The results showed that GSH was approximately 50% hydrolyzed after 24 hours at pH 10 (Fig. 2b, Table S2, and Fig. S9) and was completely hydrolyzed within 2 hours at pH 11 (Fig. 2c, Table S3, and Fig. S10). These data showed that basic conditions promote GSH cleavage by 1, resulting in both faster rates and higher yields (Fig. 2d and e).
A series of kinetic models was constructed to describe the course of this reaction (Fig. S11–S16). Data fitting did not noticeably improve when the reactions were allowed to proceed in reverse. Therefore, we chose a simplified model that describes the addition–elimination that leads up to intermediate 4, as well as the hydrolysis that gives rise to fragments 5 and 6, both as (nominally) irreversible steps (Scheme 1). Then, based on a global fit that allows the reaction order of each component to vary, we concluded that the first step of this reaction is zeroth order in methanethiol (MeSH) and half order in hydroxide (OH⁻), which likely stems from the fact that hydroxide shifts the thiol–thiolate equilibrium of the GSH side chain toward the thiolate.12 The second step is a normal hydrolysis reaction. With these considerations in mind, the rate constants were obtained by fitting the production (and consumption) of each component measured by HPLC at various time points (Fig. S17). The rate constants k₁ and k₂ are 1.9 M⁻¹·⁵ s⁻¹ and 0.4 M⁻¹ s⁻¹ at pH 10, and 14.0 M⁻¹·⁵ s⁻¹ and 2.0 M⁻¹ s⁻¹ at pH 11, respectively.13 This model provides a quantitative description of this reaction and helps to explain its strong pH dependence.
Scheme 1 The two-step reaction sequence that cleaves GSH into the N-fragment 5 and the C-fragment 6.
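To make the two-step scheme more concrete, the following is a minimal numerical sketch, not the authors' fitting procedure. It assumes rate laws that are consistent with the reported units of the rate constants: step 1 is taken as first order in GSH and in reagent 1 and half order in hydroxide, and step 2 as first order in intermediate 4 and in hydroxide. The starting concentration of 1 mM, the use of pKw = 14 to convert pH into a hydroxide concentration, and the function name `simulate` are all illustrative assumptions.

```python
import math


def simulate(k1, k2, ph, c0=1e-3, t_end=24 * 3600.0, dt=10.0):
    """Integrate the irreversible two-step scheme with fixed-step RK4.

    k1 is in M^-1.5 s^-1 and k2 in M^-1 s^-1. c0 (M) is a hypothetical
    starting concentration for both GSH and reagent 1 (one equivalent).
    Returns the concentrations (M) at t_end.
    """
    # Buffered solution: hydroxide is treated as constant (assumes pKw = 14).
    oh = 10.0 ** (ph - 14.0)

    def rates(state):
        gsh, reagent, inter, frag = state
        r1 = k1 * gsh * reagent * math.sqrt(oh)  # assumed rate law, step 1
        r2 = k2 * inter * oh                     # assumed rate law, step 2
        return (-r1, -r1, r1 - r2, r2)

    def shifted(state, slope, h):
        return tuple(s + h * d for s, d in zip(state, slope))

    state = (c0, c0, 0.0, 0.0)
    for _ in range(int(round(t_end / dt))):
        s1 = rates(state)
        s2 = rates(shifted(state, s1, dt / 2))
        s3 = rates(shifted(state, s2, dt / 2))
        s4 = rates(shifted(state, s3, dt))
        state = tuple(
            y + dt / 6 * (a + 2 * b + 2 * c + d)
            for y, a, b, c, d in zip(state, s1, s2, s3, s4)
        )
    names = ("GSH", "reagent_1", "intermediate_4", "fragment_6")
    return dict(zip(names, state))


if __name__ == "__main__":
    for ph, k1, k2 in [(10.0, 1.9, 0.4), (11.0, 14.0, 2.0)]:
        result = simulate(k1, k2, ph)
        print(ph, {name: f"{value:.2e}" for name, value in result.items()})
```

Because the actual concentrations are given only in the supporting tables, the absolute yields printed by this sketch should not be compared with the HPLC data. It is only meant to show how the half-order hydroxide term and the larger rate constants both accelerate the reaction at pH 11.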
We then tested 1 for protein digestion using macolacin thioesterase (mTE) as a model (Fig. 3a).14 It contains two free Cys residues at positions 142 and 204. Complete hydrolysis of mTE by 1 would therefore produce three fragments of 15.3 (A), 7.0 (B), and 19.3 (C) kDa. Alternatively, partial digestion of mTE, with hydrolysis occurring at only one of the two sites, would result in two pairs of fragments of 22.2 (D)/19.3 (C) and 26.1 (E)/15.3 (A) kDa. Note that the molecular weights of fragments B, C, and E reported herein include that of the heterocyclic moiety resulting from the reaction of 1 with Cys. We incubated mTE with 1 in the presence and absence of urea at pH 8, 9, 10, and 11 for 24 hours and then analyzed the reaction mixtures by SDS-PAGE (Fig. 3b). mTE remained intact when no urea was added, suggesting that denaturation is crucial for protein digestion using 1. In the presence of urea (8 M), fragments A, B, D, and E were readily observed by SDS-PAGE, and the smallest fragment B was identified by MS (Fig. S18). Different denaturation conditions were then evaluated at pH 10 (8 M urea, 10% (w/v) SDS, and incubation at 95 °C for 10 min), and urea proved to be the most effective (Fig. 3c).
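The complete and partial digestion patterns described here can be enumerated in a few lines. The sketch below is only an illustration: it uses a short made-up sequence with two internal Cys residues rather than the mTE sequence, and the function names are hypothetical. It cuts on the N-terminal side of each Cys and lists the fragments for every non-empty subset of cleavage sites, which mirrors the A/B/C (complete) and D/C, A/E (partial) fragment sets.

```python
from itertools import combinations


def cys_sites(seq):
    """0-based indices of Cys residues that have a residue on their N-terminal side."""
    return [i for i, aa in enumerate(seq) if aa == "C" and i > 0]


def digestion_products(seq):
    """Map each non-empty subset of cleaved sites to the resulting fragments.

    Cleavage occurs at the N-terminal amide bond of Cys, so each Cys ends up
    as the first residue of the fragment that follows the cut.
    """
    products = {}
    sites = cys_sites(seq)
    for n in range(1, len(sites) + 1):
        for subset in combinations(sites, n):
            bounds = [0, *subset, len(seq)]
            products[subset] = [seq[a:b] for a, b in zip(bounds, bounds[1:])]
    return products


if __name__ == "__main__":
    toy = "MKTAYCGGSLCEEK"  # hypothetical sequence, not mTE
    for cut_sites, fragments in digestion_products(toy).items():
        print(cut_sites, fragments)
```

For a protein with two cleavable Cys residues this yields three entries: one for each single cut and one for the double cut, the same bookkeeping used above to predict the bands expected on the gel.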
In LCMS-based protein sequencing, generating fragments at more than one type of AA is a useful strategy to increase coverage and facilitate data analysis.3 We therefore applied our reagent in combination with trypsin and, at the same time, evaluated its compatibility with modern proteomics workflows. mTE was analyzed by SDS-PAGE and in-gel digestion according to established protocols (see the SI). The band corresponding to mTE was excised, washed, dehydrated using acetonitrile, and digested using trypsin supplemented with dithiothreitol (DTT) for 16 h. Reagent 1 (2 mM) in CABS buffer (pH 11, 5% DMSO) was then added directly to the mixture and incubated for another 16 h. While iodoacetamide is typically added to cap free Cys thiols and suppress unwanted side reactions, it is unnecessary in our procedure because thiols are converted into an inert five-membered heterocyclic moiety upon reaction with 1. The resulting peptide fragments were extracted from the excised gel band and injected directly into the LCMS system. Data analysis was performed using MaxQuant v.2.7 with the following settings: S-oxidation at Met (+16 Da), deamidation at Gln and Asn (+1 Da), and heterocycle formation at Cys (+154 Da) (Table S5).15 As expected, all Cys-containing peptide fragments were identified and contained the five-membered heterocyclic moiety; no deamidation or N-terminally modified products were detected (Fig. 4c, d, and Fig. S19–S21).
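As a rough illustration of how the three modification settings translate into candidate masses, the sketch below lists the nominal total mass shifts a peptide could carry. It is not a description of how MaxQuant works internally. It assumes, for simplicity, that every listed residue is independently either modified or unmodified, and it uses only the nominal shifts quoted above; the names `SHIFTS` and `candidate_shifts` are hypothetical.

```python
# Nominal mass shifts (Da) quoted in the text for the search settings.
SHIFTS = {
    "M": 16,   # S-oxidation at Met
    "N": 1,    # deamidation at Asn
    "Q": 1,    # deamidation at Gln
    "C": 154,  # heterocycle formation at Cys after reaction with 1
}


def candidate_shifts(peptide):
    """Sorted nominal total mass shifts, treating each site as modified or not."""
    totals = {0}
    for aa in peptide:
        if aa in SHIFTS:
            # Each modifiable residue either keeps the running totals or adds its shift.
            totals |= {t + SHIFTS[aa] for t in totals}
    return sorted(totals)


if __name__ == "__main__":
    for pep in ["CGGSL", "MC", "NQ"]:  # hypothetical peptides
        print(pep, candidate_shifts(pep))
```

Because the Cys tag is much heavier than the other two shifts, a Cys-containing fragment produced by 1 stands apart from oxidized or deamidated variants in this list, which is the sense in which the heterocycle acts as a distinct mass tag.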
Two additional experiments were performed to assess the stability and specificity of this reagent. In the stability test, 1 was dissolved in an aqueous DMSO solution (5% v/v) and left at 20 °C for one week; it showed no detectable degradation (Fig. S22). In the specificity test, 10 equiv. of 1 was incubated with ubiquitin at pH 11 for 24 hours and then analyzed by MS. Ubiquitin contains seven Lys and no Cys residues. No Lys modification was detected and only minor N-terminal modification was observed (Fig. S23).
Lastly, we estimated the size distribution of peptide fragments generated under various protein cleavage conditions (Table 1 and Fig. S24). Two virtual libraries were compiled based on the UniProt database.16 One contains 3000 proteins selected randomly and the other contains 3000 proteins with at least one Cys residue. Modern MS instruments can readily detect and sequence peptide fragments 7–20 residues in length.17 Our analysis shows that the distribution of Cys residues across proteins is highly uneven, such that those that do contain Cys would often be cleaved multiple times by 1 and yield 27.1% of fragments in the desirable size window described above. It should be emphasized that our objective is not to replace proteases with 1 in sample preparation, but rather to provide a complementary new tool. As such, although this value is somewhat lower than that observed for trypsin (35.3%) and chymotrypsin (39.7%), the difference is modest and is well within the practically useful range. Furthermore, 1 can also be combined with known proteases to generate orthogonal fragment libraries, thereby improving sequence coverage.3
| | | Random proteins | Fragment distribution, proteins with ≥1 Cys | | | | |
|---|---|---|---|---|---|---|---|
| | | Cut sitesᵃ (%) | Fragmentᵇ | <7 aa (%) | >20 aa (%) | 7–20 aa (%) | Coverageᶜ |
| 1 | Reagent 1 | 2.4 ± 2.3 | 42.3 ± 83.9 | 22.8 | 50.1 | 27.1 | 7.9 |
| 2 | Trypsin (T) | 10.8 ± 3.2 | 9.4 ± 11.2 | 53.5 | 11.2 | 35.3 | 43.6 |
| 3 | Chymotrypsin (C) | 7.5 ± 2.9 | 13.9 ± 18.7 | 40.5 | 19.8 | 39.7 | 34.5 |
| 4 | T + C | 18.3 ± 3.8 | 5.6 ± 6.5 | 70.9 | 2.8 | 26.2 | 49.7 |
| 5 | T + 1 | 12.9 ± 3.4 | 7.9 ± 9.3 | 58.8 | 7.3 | 33.9 | 48.4 |
| 6 | C + 1 | 9.7 ± 3.7 | 10.8 ± 14.8 | 49.2 | 13.2 | 37.6 | 40.6 |

ᵃ,ᵇ Numbers represent average ± standard deviation. ᶜ Coverage is defined as the proportion of amino acids that end up in fragments 7–20 aa in length after cleavage (= fragment length × frequency/protein length).
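The size bins and the coverage value in Table 1 can be computed for any single sequence along the following lines. This is a simplified sketch rather than the authors' analysis pipeline. The cleavage rules are deliberately reduced: trypsin cuts after every Lys/Arg, chymotrypsin after every Phe/Trp/Tyr, and reagent 1 before every Cys. The rule table, function names, and example sequence are all hypothetical.

```python
# (residues, side): "C" cuts after the residue, "N" cuts before it.
RULES = {
    "trypsin": ("KR", "C"),
    "chymotrypsin": ("FWY", "C"),
    "reagent_1": ("C", "N"),
}


def digest(seq, rule_names):
    """Split seq at every site matched by any of the named rules."""
    cuts = set()
    for name in rule_names:
        residues, side = RULES[name]
        for i, aa in enumerate(seq):
            if aa in residues:
                cuts.add(i + 1 if side == "C" else i)
    # Cuts at the very ends of the sequence do not produce new fragments.
    bounds = [0] + sorted(c for c in cuts if 0 < c < len(seq)) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_profile(fragments, lo=7, hi=20):
    """Percent of fragments in each size bin, plus residue-level coverage."""
    n = len(fragments)
    total = sum(len(f) for f in fragments)
    in_window = [f for f in fragments if lo <= len(f) <= hi]
    return {
        "short_pct": 100 * sum(len(f) < lo for f in fragments) / n,
        "window_pct": 100 * len(in_window) / n,
        "long_pct": 100 * sum(len(f) > hi for f in fragments) / n,
        # Share of residues that end up in fragments of lo..hi residues.
        "coverage_pct": 100 * sum(len(f) for f in in_window) / total,
    }


if __name__ == "__main__":
    toy = "MKTAYCGGSLCEEKAAGGSSLLTTVVRDDEEQQNNCPPGGAAK"  # hypothetical
    for combo in (["reagent_1"], ["trypsin"], ["trypsin", "reagent_1"]):
        print(combo, size_profile(digest(toy, combo)))
```

Averaging such per-protein profiles over a library of sequences would give numbers of the kind reported in the table; combining rule sets, as in the last line of the usage example, corresponds to the T + 1 and C + 1 entries.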
Another advantage is that 1 simplifies the protein analysis workflow by converting reactive Cys thiols into an inert heterocyclic moiety, thereby eliminating the otherwise necessary iodoacetamide capping step. As a small molecule reagent, 1 offers several additional advantages over enzymatic methods and fills an important gap in the current protein analysis toolbox. First, it is readily produced from commercially available starting materials. Second, it is highly stable and can be stored either as a DMSO solution for up to one week or as a solid for extended periods. Third, whereas fragments of the protease itself can interfere with downstream data analysis, the use of 1 simplifies post-cleavage processing as excess reagent can be readily removed using a size exclusion filter.18 Taken together, these results demonstrate the utility of 1 as a valuable new tool for protein analysis and suggest that it may find broad applications in proteomics.
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2026