Chemoselective cysteine or disulfide modification via single atom substitution in chloromethyl acryl reagents

The development of bioconjugation chemistry has enabled the attachment of various synthetic functionalities to proteins, giving rise to new classes of protein conjugates with functions well beyond what Nature can provide. Despite this progress, no reagents have been developed to date whose reactivity can be tuned in a user-defined fashion to address different amino acid residues in proteins. Here, we report that 2-chloromethyl acryl reagents can serve as a simple yet versatile platform for selective protein modification at cysteine or disulfide sites by tuning their inherent electronic properties through the amide or ester linkage. Specifically, the 2-chloromethyl acryl derivatives (acrylamide or acrylate) can be obtained via a simple and easily implemented one-pot reaction based on the coupling of commercially available starting materials with different end-group functionalities (amino group or hydroxyl group). 2-Chloromethyl acrylamide reagents with an amide linkage favor selective modification at the cysteine site with fast reaction kinetics and near quantitative conversions. In contrast, 2-chloromethyl acrylate reagents bearing an ester linkage can undergo two successive Michael reactions, allowing the selective modification of disulfide bonds with high labeling efficiency and good conjugate stability.


Introduction
Proteins are an emerging class of biotherapeutics with high target affinity and specificity. [1][2][3] Site-selective modification of proteins enables the incorporation of desired synthetic functionalities into proteins at distinct sites, which combines the advantages of both the synthetic world and Nature for the construction of protein bioconjugates with novel functional characteristics. [4][5][6][7][8][9][10] Chemical approaches for protein modification allow the straightforward attachment of the desired functionalities at natural amino acid residues on the protein surface, thereby eliminating the need for tedious genetic engineering. 11 Among these residues, unpaired cysteines are considered the most sought-after targets owing to the high nucleophilicity and versatile chemistry of thiol groups. [12][13][14][15] In addition, disulfide bonds have emerged as attractive modification sites for incorporating tailored functionalities, as many therapeutically relevant proteins or peptides, e.g. antibodies or their antigen-binding fragments, contain at least one solvent-accessible disulfide bond. 16,17 Maleimides constitute a group of widely used cysteine bioconjugation reagents due to their fast and efficient reactions with thiols. 12 Besides that, a variety of structurally diverse reagents have been reported for cysteine modification in order to improve the stability of the resultant bioconjugates while retaining similar reaction kinetics. 18,19 However, strategies for disulfide modification are much less explored, and the current toolset is limited to five to six conjugation methods available in the literature. [20][21][22][23][24][25][26] Moreover, the reagents developed to date mainly target a single type of site, for example a cysteine residue or a disulfide bond.
Besides (bromo)maleimides, 23 3-bromo-5-methylene pyrrolones 27 and diethynyl phosphinates, 28 there are only a few bioconjugation reagents that provide a broad-spectrum scaffold to address both cysteines and disulfides with high labeling efficiency. Such a strategy is more advantageous than reinventing a novel scaffold for every single purpose. Therefore, the development of a bioconjugation approach that enables selective modification at target amino acid residues in a user-defined fashion with great ease would be highly advantageous to enrich the existing toolbox and to enable nonexperts to conduct such protein labeling reactions. This prompted us to rethink the strategies and enormous possibilities offered by synthetic chemistry. In fact, modern synthetic technologies provide immense flexibility and potential to access structurally diverse reagents, which allows for the customization of their reactivities at the atomic level. From this perspective, we envisioned that multifunctional bioconjugation reagents, which are capable of targeting specific residues on demand, can be designed by finely tuning their chemoselectivities with the aid of synthetic chemistry. Inspired by the inherent ability of electron-deficient systems to serve as good Michael acceptors for reactions with nucleophiles on the protein surface, 12 we proposed 2-halomethyl acryl derivatives (acrylamide or acrylate) as an appropriate option for reactions with thiol groups to accomplish the chemoselective modification of cysteine residues. In addition, considering the different electron-withdrawing properties of the ester and amide bond, we further speculated that a single atom substitution in the acryl position of chloromethyl acryl reagents would influence their reactivity profiles as electrophiles for the second Michael reaction. This, in turn, would allow the customization of their properties to achieve selective modification at either cysteine or disulfide sites.
Herein, we report the convenient, one-pot synthesis of 2-chloromethyl acryl derivatives (acrylamide and acrylate) via coupling reactions between commercially available 2-(bromomethyl)acrylic acid and reagents bearing different end-group functionalities (amino group or hydroxyl group) (Fig. 1). The inherent chemoselectivity of 2-chloromethyl acrylamide and acrylate is influenced by the different electron-withdrawing properties of the amide and ester linkage, which render them suitable for protein modification at either cysteine or disulfide sites. Specifically, we show that 2-chloromethyl acrylamide compounds containing an amide bond in the scaffold can react with proteins containing a free thiol group via a single Michael reaction with near quantitative conversions. By replacing the amide with an ester linkage, yielding the respective 2-chloromethyl acrylate reagents, site-selective disulfide modification can be achieved, as exemplified by the successful modification of three disulfide-containing substrates. In addition, the bioconjugation reagents reported herein are characterized by facile linker synthesis, high water solubility and good labeling efficiency.

Results and discussion
Synthesis of 2-chloromethyl acrylamide and acrylate compounds
We initiated our study by using the commercially available compound 2-(bromomethyl)acrylic acid as starting material to synthesize both 2-halomethyl acrylamide and acrylate bioconjugation reagents (Scheme 1 and S1†). First, 2-(bromomethyl)acrylic acid was reacted with oxalyl chloride to convert the carboxylic acid group to the acid chloride in situ. Thereafter, nucleophiles with different end groups, i.e. amino or alcohol groups (usually 1.5 to 2 equiv.), were added under basic conditions for further reaction. The respective 2-chloromethyl acrylamide and acrylate were subsequently purified and isolated in moderate yields (Scheme 1). Mass spectrometry (MS) data demonstrated that the bromine atom was completely replaced by a chlorine atom, affording the 2-chloromethyl acryl compounds (Fig. S69-S74†). A toolbox containing different functionalities, e.g. dye or bioorthogonal groups, was obtained as shown in Scheme 1, underlining the broad applicability of this method. Compared to other disulfide- and cysteine-modification reagents, which require multiple-step synthesis (e.g., allyl sulfones require a four-step synthesis 20 ), the 2-chloromethyl acryl derivatives are readily available through a straightforward one-pot synthesis from the commercially available 2-(bromomethyl)acrylic acid precursor. The simplicity of the synthesis provides fast and efficient access to a broad spectrum of functionalities that are of great interest for bioconjugation. In addition, in contrast to reported cysteine and disulfide modification reagents that contain a hydrophobic phenyl group, e.g. the carbonylacrylic reagent 29 or allyl sulfone reagents, 20 the 2-chloromethyl acryl derivatives do not contain any aromatic group in the scaffold. They are therefore estimated to have lower partition coefficients (n-octanol to water, log Po/w) (Fig. S2 and S3†), indicating improved water solubility.
Furthermore, the stability of the bioconjugation reagents in different aqueous environments represents an important consideration for their subsequent usage. The stability of the 2-chloromethyl acrylamide and acrylate compounds was evaluated by incubating compound 3 and compound 4 at three different pH values (pH 6, 7 and 8), and the HPLC data indicated that they remained stable over a time course of 36 hours without any degradation (Fig. S4-S9†). In contrast, maleimide reagents, which are the most commonly used bioconjugation reagents for cysteine functionalization, easily hydrolyze to nonreactive maleic amides, especially at basic pH (t1/2 < 2 h) (Fig. S10-S12†).

Chemoselectivity of 2-chloromethyl acrylamide and acrylate towards thiol groups
The reactivity and selectivity of 2-chloromethyl acrylamides and acrylates towards two model amino acids, Boc-Cys-OMe and Boc-Lys-OH, were evaluated (Fig. 2). For 2-chloromethyl acrylamide, compound 1 was incubated with both Boc-Cys-OMe and Boc-Lys-OH in acetonitrile (ACN)/phosphate buffer (PB, pH 7) for four hours (Fig. 2a). Liquid chromatography (LC) data indicated quantitative conversion to the cysteine-modified compound 10, while the lysine-modified compound 9 was not observed, which clearly demonstrated the excellent chemoselectivity of compound 1 towards cysteine over lysine residues (Fig. 2c). In addition, the absence of a side reaction with benzylamine during the synthesis of compound 1 also indicated that there were no cross-reactions with amino groups (page 5 in ESI†). When the amount of Boc-Cys-OMe was increased to 8 equiv., only compound 10 was found in the LC trace, and no further addition products were observed (Fig. 2d). Further studies demonstrated that compound 10 does not react with other nucleophiles, such as hydroxyl or amino groups, even when these are used in 30 equiv. excess at 37 °C (Scheme S3 in ESI†). However, if a second thiol functionality was supplied in very large excess (for example 30 equiv. of compound 18, Scheme S3 in ESI†), the first thiol functionality was eliminated, affording compound 15, presumably due to an addition-elimination reaction (Fig. S14 in ESI†).
Scheme 1 Synthesis route of 2-chloromethyl acrylamide and acrylate derivatives containing different functionalities.
For 2-chloromethyl acrylate, compound 4 was incubated with both Boc-Cys-OMe and Boc-Lys-OH under the same conditions used for compound 1 (Fig. 2b). The LC trace again revealed excellent chemoselectivity towards thiol groups, as the lysine-modified compound 11 was not observed in the mixture (Fig. 2e). However, in contrast to the reaction with acrylamides, the peak for compound 12 decreased while the signal for compound 13 increased as increasing amounts of Boc-Cys-OMe were used (Fig. 2f). After adding eight equivalents of Boc-Cys-OMe, compound 4 was fully converted to compound 13 with negligible side product formation (Fig. 2f).
These model reactions clearly indicated a pronounced difference in the reactivity of the 2-chloromethyl acrylamide versus the acrylate reagents, presumably originating from the amide or ester linkage. The observed reactivity of 2-chloromethyl acrylamide is consistent with the literature, where it was reported that catalysts and high temperatures are required for thiol addition to α,β-unsaturated amides as Michael acceptors. 30,31 Therefore, we speculate that the second Michael reaction of the 2-chloromethyl acrylamide did not proceed due to the relatively weak electron-withdrawing property of the amide bond, which renders the α,β-unsaturated amide a poor Michael acceptor. 27 Taken together, these results demonstrated that 2-chloromethyl acrylamides allow straightforward modification of free cysteines with high efficiency and excellent chemoselectivity. On the other hand, 2-chloromethyl acrylates can undergo two successive Michael reactions, which makes them suitable candidates for protein modification at disulfide sites.

2-Chloromethyl acrylamide reagents for cysteine modication
Next, the reaction kinetics were rst studied using a model reaction between compound 3 and Boc-Cys-OMe (Fig. 3a). Compound 3 (1 mM) and Boc-Cys-OMe (1 mM) were incubated in ACN/PB (pH 7) mixture (volume ratio: 1 : 10) using Fmoc-Phe-OH (Scheme S4 †) as internal standard. At different time intervals, the reaction was monitored by high-performance liquid chromatography (HPLC) (Fig. S17 †), and quantication of compounds 3 and 14 overtime was plotted with reference to the internal standard (Fig. 3b). HPLC analysis indicated that 80% of compound 3 was converted to the cysteine-modied compound 14 in less than two hours and near quantitative conversion was achieved in less than six hours (Fig. S17 †). The second-order rate constant was determined to be 1.17 M À1 s À1 with the concentration of compound 3 at 1 mM (Fig. 3c). Although this reaction is slower than the maleimide conjugation (10-1000 M À1 s À1 ), 32 it is still comparable to or even faster than some of the recently reported bioconjugation reagents, e.g. ethynylphosphonamidates (0.62 M À1 s À1 ), 33 diethynyl phosphinate (0.47 M À1 s À1 ) (Table S1 in ESI †), and some other conventional bioconjugation methods such as oxime ligation (0.001 M À1 s À1 ), 34,35 Pictet-Spengler ligation (0.015 M À1 s À1 ) 32 and strainpromoted azide-alkyne reaction (0.9 M À1 s À1 ). 36 Thereaer, 2-chloromethyl acrylamide derivatives were applied for cysteine modication on peptide substrates using compound 3 (Fig. 4a). First, the known WSCO2 peptide (sequence: IVRWSKKVCQVS), an endogenous peptide inhibitor of the chemokine CXCR4 receptor that is highly relevant for anti-infectivity in viral infection and anti-migratory effect in cancer, 37 was selected as bioactive substrate (Fig. 4b). In ACN/PB mixture (1 : 10), one equivalent WSCO2 peptide was incubated with 1.1 equivalents of compound 3 for four hours. 
HPLC analysis of the crude reaction mixture indicated that more than 95% conversion to the desired modied product (WSCO2-PEG4-Tz) was achieved (Fig. 4c). As a control, the thiol-reactive reagent 4,4 0 -dithiodipyridine (4-DPS), which is oen used for free thiol quantication on proteins via a thiol-disulde exchange reaction (the reaction mechanism is shown in Scheme S6 †), 38 was used to mask the cysteine residue. In this case, no further reaction was observed in the HPLC chromatogram in the presence of compound 3 under the same reaction conditions (Fig. 4c). Taken together, these data clearly indicated that the 2chloromethyl acrylamide compounds exhibit excellent chemoselectivity in combination with excellent modication efficiency. In addition to WSCO2, ve other peptides, including RGDC, CEIE, PC-8, Tet, and EK-1 peptides (sequences and MS of  Fig. 4e and S24-S28 †), have also been successfully modied with compound 3. The broad range of substrates used here clearly demonstrates the general applicability of 2-chloromethyl acrylamide compounds for chemoselective modication at cysteine residues.
Aer demonstrating the successful modication of the model peptides, we proceeded to functionalize the more complex substrates, i.e. proteins. The protein ubiquitin that plays an important role in protein degradation by the proteasome, which contains a cysteine mutation at its K63 position, was selected (Fig. 5a). Aer incubation of one equivalent ubiquitin with ten equivalents of two different 2-chloromethyl acrylamide derivatives respectively, the desired bioconjugates were obtained. The successful modication was conrmed with the expected m/z in the MS shown in Fig. 5b. Similarly, if 4-DPS was used to mask the accessible cysteine residue on the protein surface, no reaction was observed even in the presence of ten equivalents of the 2-chloromethyl acrylamides (Fig. S29 †). Besides ubiquitin, a single-chain V H H antibody domain with specic binding activity against the green uorescent protein (anti-GFP nanobody) has also been successfully modied with 2-chloromethyl acrylamide derivatives (Fig. 5c). The MALDI-Tof-MS characterization clearly indicated the successful modication with the expected m/z shown in Fig. 5d.

2-Chloromethyl acrylate reagents for disulde modication
Next, the feasibility of 2-chloromethyl acrylate for disulfide bond modification was evaluated on both peptide and protein substrates. The cyclic peptide hormone somatostatin (SST), which plays a key role in regulating the endocrine system and contains an accessible disulfide bond in its sequence, 39 was selected as a model peptide (Fig. 6a). The disulfide bond in SST was first reduced with two equivalents of tris(2-carboxyethyl)phosphine (TCEP) in an ACN/PB mixture (1 : 10) at pH 7 to generate the two free thiol groups, followed by incubation with 1.1 equivalents of compound 4 overnight in one pot. HPLC of the crude reaction mixture revealed good modification efficiency (91% based on HPLC quantification) (Fig. 6b). The isolated SST-Ph conjugate was also characterized by MALDI-Tof-MS, which showed successful functionalization (Fig. 6c). SST-Ph was further incubated with TCEP before the subsequent addition of the thiol-reactive reagent 4,4'-dithiodipyridine (4-DPS). HPLC analysis showed that SST-Ph remained intact, and no side reactions with 4-DPS were observed. Native SST was used as a control, and the LC trace showed that a new peak was formed when 4-DPS and TCEP were used (Fig. S32†). Taken together, these results indicated complete modification at the disulfide site (Fig. 6b). Since the 2-chloromethyl acrylate compounds do not contain aromatic groups in their scaffold, they have a rather low log Po/w and thus provide better water solubility than reagents that contain phenyl groups, such as allyl sulfone reagents. This is particularly advantageous when modifying therapeutically relevant proteins that suffer from aggregation if a large amount of organic solvent is needed during the modification process.
Therefore, the disulde modication efficiency of SST with 2-chloromethyl acrylate and allyl sulfone reagents was evaluated and compared with using compound 8 and an allyl sulfone reagent (denoted as "IC-tetrazine", which was developed by our group before 22 ) (Fig. 6a). Disulde modication of SST with IC-tetrazine required 40% ACN for solubilization, whereas less than 10% of ACN was needed to dissolve compound 8. More importantly, the modication efficiency of compound 8 was considerably higher (83% based on the quantication of HPLC peak for the reaction mixture) compared to IC-tetrazine (67%) (Fig. 6e). For denitive conrmation of the modication site, SST-PEG-Tz was selected for LC-MS/MS analysis. Aer trypsin digestion, a fragment with m/z at 659.7814 [M + 2H] 2+ was observed corresponding to fragment 1 (Fig. 6e). The expected modied sequence was detected by MS/MS, thus conrming that the modication occurred at the disulde site (Tables S2-S5 and Fig. S40-S44 †). Besides SST, another therapeutic relevant cyclic peptide octreotide, an analog of somatostatin with a longer biological half-life that is oen applied in cancer diagnostics, 40 was also successfully functionalized with a coumarin motif under the similar reaction conditions mentioned above (Scheme S11 †). The MALDI-Tof-MS data conrmed the successful functionalization with a peak at 1364 corresponding to [M + H] + (Fig. S46 †).
Subsequently, this new disulde modication strategy was also evaluated on a more complex substrate, the protein enzyme lysozyme (from hen egg white), in which the disulde at C6-C127 is predicted to be solvent-accessible among the four available disulde bonds. 20,41 To test the applicability of the 2chloromethyl acrylate compounds for disulde modication, different functionalities were incorporated into lysozyme, such as a phenyl group, a uorescent dye (coumarin), or a bioorthogonal tag (tetrazine group) (Fig. 7a). Aer adding 1.2 equivalents of TCEP, the 2-chloromethyl acrylate derivatives were also added in one pot, and the reaction mixture was incubated at 50 mM PB (pH 7) overnight. Some precipitates were observed aer incubation overnight, presumably due to the aggregation of reduced lysozyme despite the mild conditions employed. 42 Thereaer, the modied lysozyme derivatives (Ly-Ph, Ly-PEG-Cou, and Ly-PEG-Tz) were puried by using Hi Trap hydrophobic interaction column with the isolated yields of 28%, 22%, and 24%, respectively. MALDI-Tof-MS data of the three modied lysozymes derivatives conrmed their successful functionalization (Fig. 7b). The yields are higher than our previous report where lysozyme was modied with allyl sulfone  Table S12 †) and comparable to that where cysteine in human serum albumin was modied with maleimide ($30%). 20,43,44 Notably, around 25-30% of native lysozyme was recovered aer the purication, which can be recycled for modication. In order to identify the modication site, Ly-PEG-Tz was analyzed by LC-MS/MS. Aer trypsin digestion, only the fragment containing C6-C127 disulde bonds was observed with an addition of PEG-Tz functionality (m/z 1553.7021 [M + H] + , Fig. S49-S54 †). The fragments showing modication at other disulde bonds were not observed in the analysis (Table S7 in ESI †). Further MS/MS analysis conrms the expected sequence and demonstrates the site-selective modication at the disulde site (Fig. 7c). 
The expected y and b ions and the zoom-in spectra of the respective fragment ions are shown in Section 8.2 in ESI. † Lysozyme is an antimicrobial enzyme that is capable of hydrolyzing the 1,4-beta-linkages in the peptidoglycan of Grampositive bacterial cell walls, thus leading to the lysis of bacteria (Fig. 8a). Therefore, the catalytic activity of modied lysozyme was assessed by investigation of the absorbance change at 450 nm of Micrococcus lysodeikticus lyophilized cell suspensions over time, where the activity of the modied lysozyme is proportional to their capability to hydrolyze the bacterial cell walls. 45 In comparison to native lysozyme, the disulde-modied lysozyme Ly-PEG-Tz retained 86% of its activity (Fig. 8b, calculation details shown in Section 8.3 in ESI †). In contrast, statistical modication of lysine residues of lysozyme using tetrazine N-hydroxysuccinimide compounds (Scheme S13 †), which gave a heterogeneous mixture according to the MS data (Fig. S67 †), resulted in total loss of its catalytic activity (Fig. 8c, calculation details shown in Section 8.3 in ESI †). Hence, disulde modication of proteins with 2-chloromethyl acrylate compounds represents an attractive approach to functionalize enzymatic proteins at distinct sites to preserve their catalytic activity.
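A retained activity such as the one quoted above is commonly estimated in turbidimetric lysozyme assays by comparing the rates at which the absorbance at 450 nm decreases for the modified and the native enzyme. The sketch below illustrates one such calculation; it is an assumption about the general approach and not the procedure of Section 8.3 in the ESI. It assumes that the readings lie in the linear initial phase of the assay and that both enzymes are used at the same concentration. The function names are hypothetical and the absorbance values are synthetic placeholders, not data from this work.

```python
# Sketch: relative enzymatic activity from the slopes of A450 versus time.
# Assumption: activity is proportional to the initial rate of absorbance decrease.
# All readings below are synthetic placeholders, not measured data.

def slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    den = sum((t - mean_t) ** 2 for t in times)
    return num / den


def relative_activity(times, a450_sample, a450_reference):
    """Ratio of the absorbance-decrease rates (sample / reference)."""
    return slope(times, a450_sample) / slope(times, a450_reference)


minutes = [0, 1, 2, 3, 4]
reference = [1.00, 0.95, 0.90, 0.85, 0.80]  # synthetic native-enzyme readings
sample = [1.00, 0.96, 0.92, 0.88, 0.84]     # synthetic modified-enzyme readings

print(f"relative activity: {relative_activity(minutes, sample, reference):.0%}")
```

In practice, a blank without enzyme would usually be subtracted first to correct for settling of the cell suspension.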

Conclusion
In conclusion, we report that a single atom substitution in 2-chloromethyl acryl reagents enables selective protein modification at cysteine or disulfide sites on demand. The reactivity profile of the prepared bioconjugation reagents can be customized by simply selecting different end-group functionalities (either amino or hydroxyl groups) to obtain the respective 2-chloromethyl acrylamide and acrylate compounds. Notably, the synthesis of the reported 2-chloromethyl acrylamide and acrylate compounds proceeds via a simple and easily implemented one-pot reaction based on easily accessible starting materials. We anticipate that the synthetic approach presented herein can be easily adopted in any laboratory and thus serve a broader scientific community.
Excellent labeling efficiency and high chemoselectivity of the 2-chloromethyl acrylamide compounds were demonstrated by the modification of cysteine residues in several model peptides as well as proteins. In contrast, 2-chloromethyl acrylate reagents allow the modification of disulfide-containing peptides and proteins, such as SST, octreotide, and lysozyme. In addition, our new approach could offer the possibility of dual modification of proteins by capitalizing on the reactivity difference between the 2-chloromethyl acrylamide and acrylate compounds. In this way, one could envision that dual functionalization of proteins at cysteine residues and disulfide bonds can be achieved in a stepwise fashion within one system. We believe that the strategy presented herein offers chemists and biologists a new chemical approach that enriches the currently available toolbox for cysteine and disulfide modification. Such technologies will give the broader scientific community easy access to the design and preparation of advanced protein conjugates for various biological, biophysical, and medicinal applications.

Conflicts of interest
The authors declare no conict of interest.