Reem Mousa, Dana Shkolnik†, Yam Alalouf† and Ashraf Brik*
Schulich Faculty of Chemistry, Technion-Israel Institute of Technology, Haifa 3200008, Israel. E-mail: abrik@technion.ac.il
First published on 12th February 2025
Chemical protein synthesis has emerged as a powerful approach for producing ubiquitin (Ub) and ubiquitin-like modifiers (Ubls) in both their free and conjugated forms, particularly when recombinant or enzymatic strategies are challenging. By providing precise control over the assembly of Ub and Ubls, chemical synthesis enables the generation of complex constructs with site-specific modifications that facilitate detailed functional and structural studies. Ub and Ubls are central regulators of protein homeostasis, regulating a wide range of cellular processes such as cell cycle progression, transcription, DNA repair, and apoptosis. Ubls share an evolutionary link with Ub, resembling its structure and following a parallel conjugation pathway that results in a covalent isopeptide bond with their cellular substrates. Despite their structural similarities and sequence homology, Ub and Ubls exhibit distinct functional differences. Understanding Ubl biology is essential for unraveling how cells maintain their regulatory networks and how disruptions in these pathways contribute to various diseases. In this review, we highlight the chemical methodologies and strategies available for studying Ubls and advancing our comprehensive understanding of the Ubl system in health and disease.
NCL has been further broadened by introducing desulfurization8,9 and deselenization10–12 reactions, which expand its applicability to the preparation of proteins lacking native cysteine residues (Fig. 1c). Additionally, other ligation strategies, including serine/threonine13 and α-ketoacid-hydroxylamine (KAHA)14 ligation, have expanded the scope of reactions available to access diverse complex proteins (Fig. 1b).
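As a rough illustration of why desulfurization broadens NCL, the sketch below lists the positions in a sequence that could serve as ligation junctions: native Cys residues, plus Ala residues that could be temporarily replaced by Cys and converted back after ligation. The function name and output labels are hypothetical, and the sketch ignores practical factors such as fragment length, solubility, and the identity of the residue preceding the junction.

```python
def candidate_ligation_sites(sequence):
    """List 1-based positions that could start the C-terminal fragment in NCL.

    Assumption (illustrative only): a junction is possible at a native Cys,
    or at an Ala that is temporarily mutated to Cys and desulfurized afterwards.
    """
    sites = []
    for position, residue in enumerate(sequence.upper(), start=1):
        if position == 1:
            continue  # the first residue cannot follow a ligation junction
        if residue == "C":
            sites.append((position, "native Cys"))
        elif residue == "A":
            sites.append((position, "Ala via Cys + desulfurization"))
    return sites


# Usage on a made-up 10-residue peptide (not a real protein sequence)
for position, route in candidate_ligation_sites("MKTAYCGGLA"):
    print(position, route)
```

A sequence with no Cys and no Ala returns an empty list, which corresponds to the situation where another strategy (for example KAHA or serine/threonine ligation) would have to be considered.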
Semi-synthesis further extends the capabilities of these methods by combining chemically synthesized fragments with recombinantly expressed protein domains.15,16 This hybrid approach enables the preparation of large proteins with different modifications, including non-canonical amino acids, isotopic labels, and post-translational modifications (PTMs). Chemical synthesis of proteins has enabled researchers to investigate their biochemical, structural, and functional properties in ways that are challenging to achieve using traditional molecular biology and enzymatic approaches. It has also allowed the incorporation of specific and unique modifications to facilitate various studies, such as the generation of activity-based probes (ABPs) designed to unravel a protein's interactome, expression level, and cellular localization.1–3
Chemical and semi-synthetic methods have been extensively applied to study ubiquitination and deubiquitination,17–19 key post-translational modifications that maintain protein homeostasis and regulate cellular processes.20 Ubiquitination involves attaching ubiquitin (Ub), a small, conserved protein with a β-grasp fold and a flexible C-terminal diglycine motif, to substrates21 through an isopeptide bond with a lysine residue, facilitated by a cascade of E1, E2, and E3 enzymes.22 This process is counteracted by deubiquitinating enzymes (DUBs) that remove Ub or Ub chains, modulating cellular signaling.23–25
Ubiquitin-like proteins (Ubls), which are structurally similar to Ub, can also be conjugated to protein substrates via a similar mechanism, affecting various cellular processes.26 Humans have eighteen conjugatable Ubls, including five SUMO paralogs, NEDD8, UFM1, URM1, ISG15, ATG12, and FAT10, and seven ATG8 paralogs (Fig. 2).27 Research on Ubls focuses on their conjugation mechanisms, substrate recognition, specificity, interactions with other PTMs, and their roles in health and disease.28 Relying solely on enzymatic methods to prepare homogeneous Ubl-based conjugates presents challenges, such as limited availability of enzymatic machinery and difficulties in achieving site-specific modifications in sufficient quantities and homogeneity for biochemical and functional analyses. Chemical synthesis offers solutions to these challenges, enabling deeper exploration of Ubl biology and their involvement in various diseases.
Fig. 2 Representative conjugatable Ubl members. (a) Sequences of Ub and Ubls highlighting the diglycine motif and their length. (b) The tertiary structures of Ub and Ubls are also presented, highlighting several crucial structural elements (hydrophobic patches, Gly–Gly motif). The PDB code for: Ub-1UBQ, NEDD8-1NDD, SUMO-1-4WJQ, UFM1-5IA7, ISG15-1Z2M, ATG8-2KQ7, FAT10-6GF1, and URM1-2QJL.
In this review, we focus exclusively on Ubls that have been studied using chemical or semi-synthetic methods, except for FAT10, which has not been synthetically prepared. We highlight how the different approaches have contributed to our understanding of the various biochemical, structural, and functional aspects of Ubls. With this review, we offer a resource for researchers and encourage them to use these methods to explore Ubl biology and understand the role of Ubls in health and disease, potentially leading to new therapeutic applications.
| Family name | Enzymatic machinery: E1 | Enzymatic machinery: E2 | Enzymatic machinery: E3 | Proteases | Chemical synthesis approaches and applications |
|---|---|---|---|---|---|
| SUMO family | SAE1 | UBC9 | RanBP2 | SENP 1–3 | NCL (SEA thioester): SUMO-1, SUMO-1-P53 peptide conjugate, SUMO-2, SUMO-3, SUMO-2 dimer, SUMO-3 dimer |
| SUMO-1 | SAE2 | | PIAS 1–4 | SENP 5–7 | KAHA ligation: SUMO-2, SUMO-3 |
| SUMO-2 | | | ZNF451 | DeSI-1,2 | Direct SPPS (aggregation breaker): SUMO-2, SUMO-3 |
| SUMO-3 | | | Others | USPL1 | NCL (Phacm linker): di-Ub(K63)-Lys11-SUMO-2, di-Ub(K63)-Lys33-SUMO-2 and di-Ub(K63)-Lys42-SUMO-2 |
| SUMO-4 | | | | | Click chemistry: SUMO-1-RanGAP-1, SUMO-1-Ubc9, SUMO-2-PML peptide conjugates |
| SUMO-5 | | | | | |
| NEDD8 | NAE1 | UBC12 | RBX1/2, DCN1 | NEDP1 | NCL ([Pd(allyl)Cl]2): NEDD8-cullin peptide |
| | | | Others | DEN1 | KAHA ligation |
| | | | | SENP8 | Direct SPPS (backbone amide propargylation) |
| UFM1 | UBA5 | UFC1 | UFL1 | UFSP1 | KAHA ligation |
| | | | | UFSP2 | |
| ISG15 | UBE1L | UBCH8 | HERC5 | USP18 | NCL |
| | UBA7 | | HERC6 | | NCL (Acm-NMe2): ISGylated-Ub |
| | | | EFP | | |
| ATG8 family | ATG7 | ATG3 | ATG5 | ATG4 | EPL: lipidated-LC3 |
| LC3A | | ATG10 | ATG12 | | |
| LC3B | | | ATG16 | | |
| LC3B2 | | | Complex | | |
| GABARAP | | | | | |
| GABARAPL1 | | | | | |
| GABARAPL2 | | | | | |
| ATG12 | | | | | |
| FAT10 | Uba6 | USE1 | Parkin | Not reported | Not reported |
| URM1 | MOCS | ? | ? | Not reported | NCL (Cys alkylation to mimic Gln (Ψ-Gln)) |
SUMO is conjugated to its substrates through an isopeptide bond between its C-terminal Gly and a substrate's Lys residue, leading to a single SUMO or a poly-SUMO chain that is internally linked via isopeptide bond(s) (Table 1 and Fig. 3).37 These modifications can alter protein stability, sub-cellular localization, and protein interactions. For instance, SUMOylation of RanGTPase-activating protein 1 (RanGAP1) targets it to the nuclear pore complex,38 while SUMOylation of promyelocytic leukemia (PML) assists in the assembly and stabilization of PML nuclear bodies (NBs), which are involved in DNA damage repair and antiviral responses.39
The Lys residue in substrates is typically located within a distinctive motif known as the SUMO consensus motif, featuring the sequence ΨKX(E/D), where Ψ represents a hydrophobic amino acid, K is the modified lysine, X is any amino acid, and E/D represents a negatively charged amino acid (either glutamate or aspartate).40 In addition to the covalent linkage, SUMO can interact with other proteins non-covalently through a SUMO-interacting motif (SIM) of a substrate.41
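The ΨKX(E/D) pattern can be expressed as a simple sequence search. The following is a minimal sketch, assuming that Ψ is drawn from the set A, V, I, L, M, F, W (this residue set is an illustrative choice, not a definition taken from the literature cited here), and using a made-up peptide rather than a real substrate sequence. A match only flags a candidate site; it does not show that the lysine is actually SUMOylated.

```python
import re

# Assumed set of hydrophobic residues for the Psi position (illustrative choice)
HYDROPHOBIC = "AVILMFW"

# Lookahead so that overlapping motifs are all reported
CONSENSUS = re.compile(rf"(?=([{HYDROPHOBIC}]K[A-Z][ED]))")


def find_sumo_consensus(sequence):
    """Return (1-based position of the Lys, matched tetrapeptide) for each hit."""
    hits = []
    for match in CONSENSUS.finditer(sequence.upper()):
        lys_position = match.start() + 2  # Psi is at start + 1, Lys follows it
        hits.append((lys_position, match.group(1)))
    return hits


# Usage on a hypothetical peptide
print(find_sumo_consensus("GSLKSEDAA"))
```

Lysines outside the consensus (for example "GKSE", where the preceding residue is not in the assumed hydrophobic set) are not reported by this sketch.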
SUMOylation is a reversible process in which SUMO is cleaved from its substrate by SUMO-specific proteases (SENPs) (Table 1 and Fig. 3). SENPs are a family of cysteine proteases comprising six members: SENP1–3 and SENP5–7.42 They are primarily localized in the nucleus, with certain members present in sub-nuclear compartments like the nucleolus and PML-NBs. SENPs play an additional role in the maturation of SUMO, where they cleave its tail, an extension of amino acids at the C-terminus, thus producing the mature SUMO with its di-Gly motif exposed and ready to initiate the cycle (Fig. 3).42 Since SUMO is involved in numerous cellular processes, aberrations in SUMOylation can contribute to various diseases including cancer, neurological disorders, infections, diabetes, and others.43
NEDDylation begins with activation of NEDD8 by the E1 (NAE1: APPBP1-Uba3 dimer), followed by transfer to the E2 (Ubc12). This step is known to be highly specific: the presence of Ala72 in NEDD8 allows the specific activation by the E1 and the consequent interaction with the N-terminus of the E2 (Ubc12). The final step is covalent conjugation to the substrate by an E3 ligase (RBX1, RBX2, and others).45
Like the majority of Ubls, NEDD8 is synthesized with a C-terminal tail that is cleaved by specific C-terminal hydrolases known as NEDP1, DEN1, SENP8, and the Ub hydrolase UCH-L3, to expose the diglycine motif, through which NEDD8 is covalently linked to different substrates.46,47 Like ubiquitination, NEDDylation is a reversible process where the proteases NEDP1/DEN1/SENP8 promote deconjugation of NEDD8 from its target (Table 1).46,47
The best-characterized substrates of NEDD8 are members of the cullin protein family, which serve as scaffolds for Ub ligase complexes and promote ubiquitination and proteasomal degradation. The cullin family regulates proteins involved in the cell cycle, transcription, signal transduction, regulation of O2, centrosomes, and the cytoskeleton.48–50 NEDDylation can also modulate P53's stability by modifying its E3 Ub ligase Mdm2, leading to increased ubiquitination and degradation. The direct NEDDylation of P53 inhibits its transcriptional activity by altering its ability to activate or repress target genes.51
DDRGK1, an adapter protein of the UFMylation system, facilitates the recruitment of the UFMylation machinery to the ER surface for conjugation to various proteins embedded in the ER membrane such as RPN1, RPL26, and CYB5R3.54,55 UFMylation of these substrates induces recruitment of the ATG8 family, initiating autophagy of the UFMylated ER. Downregulation of UFM1-mediated ER-phagy leads to ER stress and accumulation of misfolded proteins.55
As with other Ubl modifications, UFMylation is a reversible PTM, and UFSP2 is known to be involved in de-UFMylation. Loss of function of the UFM1 pathway is implicated in various diseases such as cancer, diabetes, schizophrenia, and ischemic heart disease. Additionally, it plays a crucial role in embryonic development and hematopoiesis due to its tight relationship with the ER stress response.56
ISG15 serves a dual function: intracellularly as a protein modifier and extracellularly as a cytokine that is highly expressed and secreted upon IFN stimulation.62 As a protein modifier, ISG15 modulates various biological processes through ISGylation and displays an intricate interplay with ubiquitination. While in some cases upregulation of ISGylation inhibits Ub-mediated proteasomal degradation due to competition over Ub binding domains, in other cases ISGylation can lead to Ub-mediated proteasomal degradation.63,64 These conflicting results regarding the connection between ubiquitination and ISGylation suggest context-dependent outcomes. Moreover, while Ub-ISG15 hybrid chains are known to exist, their recognition by other proteins remains poorly understood (Table 1).
Notably, the expression of ISG15's E1, E2, and E3 is also highly induced by type I interferon (IFN), influenza B virus, lipopolysaccharide, and genotoxic stress. Aberrant ISGylation is associated with cancer, neurodegenerative diseases, and impaired responses to pathogen infections, while normal ISGylation is crucial in embryonic development.65
ATG8 plays a crucial role in autophagosomal membrane formation, where its C-terminus is covalently linked to the phospholipid, phosphatidylethanolamine (PE), through the enzymatic cascade described before, forming a lipidated ATG8–PE.70 This lipidated form serves as a scaffold to recruit other autophagy-related proteins that are necessary for autophagosome formation. ATG8–PE also ensures the specificity and selectivity of proteins, organelles, and cellular components that are targeted for degradation.
This interaction is generally mediated by LC3-interacting regions (LIRs), which are located in the unstructured region of the ATG8-interacting proteins and are composed of negatively charged amino acids followed by two hydrophobic amino acids spaced by two random residues.70 The attachment of ATG8 proteins to PE is reversed by the ATG4 proteases, which regulate its turnover on the autophagosomal membrane and the autophagy dynamics.67
Dysregulation of the ATG8–PE interaction is associated with various diseases such as cancer, infections, inflammation, and neurodegenerative disorders.71
The constitutive expression of FAT10 is restricted to immune system tissues, but its presence in other tissues can be induced by pro-inflammatory cytokines like interferon (IFN)-γ and tumor necrosis factor (TNF).73 Covalent attachment of FAT10 to substrates is termed FATylation and occurs through an enzymatic cascade similar to that of Ub, including activation by an E1 (UBA6), transfer to an E2 (USE1), and ligation to its substrate(s) by an E3 (suggested to be PARKIN) (Table 1).74–76 In addition to its involvement in immune responses and inflammation regulation, FAT10 directly targets its substrates for degradation by the 26S proteasome, making it unique amongst the Ubls.77 FAT10 is upregulated in various cancer types, such as gastrointestinal cancer, hepatocellular carcinoma (HCC), pancreatic ductal adenocarcinoma, and human glioma.78
Recently, it was discovered that URMylation promotes the stress-dependent phase separation of target proteins, aiding in stress resilience and cell survival.87 Since deURMylases have not been identified yet, it remains unclear whether URMylation is reversible and, if so, how deURMylation might occur.
In this study, NEDD8 was prepared in its conjugated form with the 26-mer derived from cullin1 (703–728), a known substrate for NEDDylation. NEDD8 was prepared from two segments employing one ligation step at position 57, where Ala was mutated to Cys. First, the C-terminal segment of NEDD8 was prepared using Fmoc-SPPS, where it was directly attached to Lys720 in the C-terminal fragment of cullin1 on resin (Fig. 4a). This was achieved by using the alloc protecting group on Lys720, allowing its selective on-resin removal and peptide elongation to generate peptide 1 (Fig. 4a),91 which was ligated with peptide 2 to form the full-length NEDD8.
After assembly of NEDD8, the NEDDylated peptide was treated with [Pd(allyl)Cl]2 for thiazolidine (Thz) deprotection to form 3. This complex was demonstrated to be an excellent reagent for effective unmasking of Thz, enabling its removal within 15 min under NCL conditions.90 This intermediate was either ligated with the N-terminal cullin peptide 4, followed by desulfurization to give the NEDDylated cullin conjugate 5 (Fig. 4b, path A), or directly deprotected and desulfurized to yield NEDDylated cullin conjugate 6 (Fig. 4b, path B). Conjugate 6 exhibited the secondary structure known for NEDD8 and was cleaved by the known ubiquitin C-terminal hydrolase isozyme 3 (UCH-L3).92 It should be noted that UCH-L3 is not the endogenous enzyme responsible for removing NEDD8 from cullins and was used solely as a model to provide evidence for the integrity of our synthetic conjugates.
Melnyk's group reported a synthetic approach for the preparation of SUMO-1, using the bis(2-sulfanylethyl)amido (SEA) thioester surrogate (Fig. 5a).93 SUMO-1(2–50) 7 and SUMO-1(51–97) 8 bearing a SEAoff (the cyclic disulfide form) were used for the assembly of full-length SUMO-1 9 (Fig. 5a). N- to C-sequential ligation was initiated by thioesterification of the SUMO-1(2–50)-SEAoff 7 by mercaptopropionic acid at pH 4 in the presence of TCEP, followed by NCL with SUMO-1(51–97)-SEAoff 8 to give 9. The full-length SUMO-1-SEAoff 9 was further activated to SEAon, allowing further attachment to a model peptide featuring the SUMO consensus motif ΨKX(E/D). The Lys residue in this peptide was modified with Cys to facilitate the SEA ligation with SUMO-1 (Fig. 5a). The folded domain of SUMO-1 in the synthetic conjugate 10 was confirmed by CD spectroscopy and a cleavage assay using Ulp1, a known Cys protease.
Using this strategy, the group prepared a SUMO-1-P53 protein–peptide conjugate employing a one-pot approach based on three segments (7, 8, and P53 peptide). The SUMO-1-SEAoff 9 thioester was assembled using a single NCL reaction and further conjugated to the P53 peptide through a δ-mercaptolysine residue (Fig. 5b).94 To study the effect of Cys52 on SUMO-1 properties, the wild-type and Cys52Ala SUMO-1 analogs were prepared by selective and non-selective desulfurization, controlled by the absence or presence of denaturants in the reaction. These synthetic conjugates helped elucidate the important role of Cys52 in maintaining SUMO-1's structure, thermal stability, and functionality.
SEA-mediated ligation was also employed for the preparation of SUMO-2 and SUMO-3.95 The sequence homology and the inability to distinguish between SUMO-2 and -3 encouraged Melnyk and coworkers to investigate the role of the conserved Cys residue in the stability and properties of the SUMO-2 and SUMO-3 domains. The Cys to Ala mutation was achieved by radical desulfurization under denaturing conditions. Secondary structure and thermal stability analyses, together with conjugation and deconjugation studies, revealed that mutating the conserved Cys47 in SUMO-3 must be considered with caution, as the fold of SUMO-3 is significantly affected. Notably, this mutation perturbed the cleavage rate of the SUMO-3 conjugate by SENP1 and SENP2. This study highlighted that SUMO-2 and -3 are distinct proteins and should not be considered identical.
Following these studies, Ovaa and coworkers employed NCL to prepare another Ubl, ISG15. ISG15 was considered as a linear dimer of two Ub-like modules and therefore it was divided into two domains at the native Cys76 which were ligated to give the full-length ISG15.96
Recently, our group was able to access URM1 for the first time via a single NCL.97 Since URM1 lacks a Cys residue and its Ala residues are not suitably positioned for ligation and desulfurization, an alternative method was used. Glutamine at position 32 was substituted with Cys, which was then alkylated with bromoacetamide to produce pseudo-glutamine (Ψ-Gln), a mimic of Gln with a single atom difference (Fig. 6a). Using this approach, three tetramethylrhodamine (TAMRA)-labeled URM1 analogs (Fig. 15c) were prepared containing different C-terminal modifications: a carboxylic acid (URM1-COOH), a hydrazide moiety (URM1-CONHNH2), and a deletion of the glycine at position 101 (URM1-ΔG101-COOH). All URM1 analogs were delivered into cells using a newly developed method termed suspension bead loading (SBL), which requires only small quantities of protein compared to other delivery methods (Fig. 6b). As URM1 is a synthetically challenging protein, SBL provided an efficient, economical, minimally cytotoxic delivery platform.
We found that URM1 localizes mainly in the nucleolus under normal conditions and diffuses out in response to oxidative stress. Additionally, we have demonstrated that regardless of URM1's C-terminus, its localization and degree of conjugation are oxidative stress dependent.
KAHA ligation was used in the preparation of several medium-sized proteins, including Ubls. For example, UFM1 was the first Ubl to be synthesized using this ligation, where three analogs with different C-terminal modifications were prepared: carboxylic acid, amide, and thioester (Fig. 7a).100 These analogs were prepared from three peptides in two ligation steps, where the ligation sites were selected to be Phe29–Thr30 and Ala60–Gln61. 5-Oxaproline (Opr) was introduced for sequential KAHA ligation in peptides 11 and 13, where Fmoc protection was needed for the middle peptide 11, which was also equipped with an α-ketoacid. The N-terminal peptide 12 was also synthesized with the α-ketoacid functionality. In addition, peptide 13 was functionalized differently at its C-terminus. Employing KAHA ligation between these fragments gave the desired protein 14 (Fig. 7a). CD analysis confirmed its secondary structure featuring multiple β-strands and α-helices. This synthetic method for UFM1, specifically the thioester analog, could potentially enable labeling with tags and site-specific conjugation to protein substrates.
Notably, the KAHA ligation process at its initial development required a key step where cyanosulfurylide had to be oxidized by oxone to form α-ketoacid. However, this step was incompatible with residues such as Cys, Met, and Trp, leading to undesired oxidation.101 Therefore, the group introduced a protected form of AA suitable for obtaining enantiopure peptide α-ketoacid 15 directly upon cleavage from resin (Fig. 7b). This method, which is compatible with all amino acids, was successfully applied to the preparation of SUMO-2/3 from three segments.102
Biochemical studies were performed to verify the structure and function of SUMO-2/3 that have homoserine residues due to the ligation requirement. Using SENP2, as a SUMO protease, successful cleavage of SUMO-2's tail to expose the di-Gly motif was observed. Additionally, the SUMOylation reaction on the substrate RanGAP1 demonstrated SUMO-2/3 activity. Both experiments illustrated that the homoserine residue does not affect the in vitro recognition and processing by the SUMOylation machinery.
NEDD8 has also been accessed using KAHA ligation. Investigating the NEDDylation process and identifying new substrates has always been challenging due to difficulties in its expression. NEDD8 synthesis involved a newly developed photolabile protecting group incorporated in the α-ketoacid to facilitate one-pot multiple KAHA ligation (Fig. 8a).103 The photo-protected α-ketoacid in the desired peptide was unmasked under mild conditions through irradiation at 365 nm. Three distinct strategies were applied to prepare NEDD8, with two involving a three-segment process. The first strategy proceeded from the N-to-C-direction (Fig. 8b) using the photo-protected tyrosine α-ketoacid in peptide 16, which reacted with 17. The ligation product was directly irradiated to unmask the α-ketoacid. Peptide 18 was subsequently ligated with the C-terminal peptide 19, yielding NEDD8 (3–76) (20) (Fig. 8b).
The second approach operated from the C-to-N-direction using photo-protected oxaproline. It is worth noting that in both strategies the segments were involved in sequential one-pot KAHA ligations, including photocleavage of the appropriate protecting group under the ligation conditions. The third strategy is based on four segments, applying three ligation steps and only one HPLC purification step. The synthetic NEDD8 was obtained in good purity and acceptable yield without intermediate handling or isolation steps.
In vitro activity of the semisynthetic LC3-PE was assessed using an ATG4 cleavage assay, demonstrating cleavage within 1 h. Furthermore, the function of LC3-PE in membrane tethering and fusion was also examined to underscore the significance of LC3 lipidation for membrane association and fusion promotion. Additionally, mutants at the C-terminus of LC3 assisted in understanding the structure–function relationship of the deconjugation specificities of ATG4 and RavZ proteases.107
Fig. 10 Schematic presentation of Fmoc-SPPS used for the preparation of SUMO-2 and SUMO-3 and for NEDD8 where aggregation breakers and backbone propargylation were introduced, respectively.
Using two different strategies, we were able to synthesize four different SUMO-2-Lys63 linked-di-Ub conjugates. After failed attempts at conjugating the N-terminus of SUMO2 to Lys63-di-Ub, a polyArg tag was installed on 3,4-diaminobenzoic acid (Dbz) at SUMO2's C-terminus to increase the solubility and improve handling during the preparation and purification steps (Fig. 12a). Following two sequential ligation steps with the branched di-Ub and desulfurization, the tag was removed in three sequential steps (NaNO2 treatment, thiolysis, and hydrolysis) to give the desired product 27. The second strategy, which turned out to be more effective, was based on attaching the polyArg tag via the phenyl-acetamidomethyl (Phacm) linker, which was removed using PdCl2 upon synthesis completion (Fig. 12b).
Fig. 12 The synthesis of the hybrid chain SUMO-2-Lys63 linked-di-Ub introduced a polyArg solubilizing tag which was installed via (a) 3,4-diaminobenzoic acid (Dbz) or (b) the Phacm linker.
Our strategy involved the synthesis of Thz-SUMO(2–45)-COSR (21′) and Cys-SUMO-(47–93) (22′) where two solubilizing tags were installed at Ala 23 and Ala 74 via the Phacm linker. The full SUMO (24′) was ligated to the first Ub-COSR (23), which was further reacted with another Ub-COSR (25) unit through δ-mercaptolysine.
The Di-Ub-(K63)-Cys-SUMO (26′) was subjected to PdCl2 to remove the solubilizing tag, followed by a desulfurization reaction to furnish the native Ala residues (Fig. 12b). Using this approach, di-Ub(K63)-Lys11-SUMO-2, di-Ub(K63)-Lys33-SUMO-2 and di-Ub(K63)-Lys42-SUMO-2 were also prepared.
The Melnyk group also reported a rapid and robust synthesis for all SUMO-2/3 dimers to investigate how the composition of these chains impacts their properties.94 According to their strategy, SUMO-2/3 dimers were assembled through one-pot ligation between three segments. First, SUMO-2/3 was produced with a C-terminal SEA group, then ligated to additional SUMO-2/3 through Lys(Cys) at position 11. The secondary structure of the SUMO-2/3 dimers was verified, as well as their behavior with different SENPs, which confirmed their structural integrity.
Additionally, our group prepared an ISGylated-Ub hybrid chain, where a new solubilizing tag, Acm-NMe2, was introduced during SPPS (Fig. 13).117 To overcome the hydrophobicity of the N-terminal region of ISG15, Ala residues at both positions 11 and 41 were mutated to Cys protected with an Acm-NMe2 tag. This modification was introduced to interfere with aggregation and facilitate solubility due to its charged tertiary amine. ISG15(2–60)-NHNH2 (28) and ISG15(61–157)-NHNH2 (29) were both synthesized by Fmoc-SPPS, where Cys78 at the hinge region was mutated to Ser to prevent dimerization, and Ala61 was mutated to Cys to enable NCL. Following ligation, the C-terminus of ISG15 30 was converted to a thioester using acetylacetone (acac) and MPAA, and was further ligated with Ub 31 via δ-mercaptolysine at position 29. The N-terminus of Ub was equipped with biotin and 6 Arg residues. The ISGylated Ub was then subjected to PdCl2 for Acm-NMe2 removal to give product 32, which was extremely hydrophobic and difficult to handle (Fig. 13). Despite significant challenges in preparation and a low purification yield, we were able to successfully characterize the natively folded conjugate by trypsin digestion, SDS-gel, western blotting, and CD measurement.
The reactive warhead propargylamide (PA) was a useful probe for profiling SUMO-specific proteases (Fig. 14a and b). Employing linear synthesis of SUMO, including N,N′-Boc-protected 5-carboxyrhodamine (Rho) dye together with PA, also allowed visualization of the SENP protease activity in cells.108 Ovaa and his colleagues successfully validated the reactivity and specificity of SUMO-based probes both in vitro and in whole-cell lysate. All SUMO isoforms exhibited reactivity toward SENP1 and SENP2, while SENP6 showed a slight preference for SUMO-2/3, and SENP3 and SENP7 demonstrated a distinct preference toward SUMO-2/3 probes. Cell experiments demonstrated that in response to ectopically expressed SENP enzymes, the cellular distribution of SUMO-2 and SUMO-1 was altered.
PA was further employed with Rho-labeled ISG15 to trap the known deISGylase USP18 in cell lysates. In lysates where FLAG-USP18 or catalytically dead USP18 was overexpressed, only the active USP18 was labeled through its catalytic Cys.96 ABPs were further utilized to investigate the proteolytic cleavage of SUMO by synthesizing di-SUMO bearing vinyl amide (VA), which forms a covalent crosslink with the target enzyme (Fig. 14a).108 Linear Fmoc-SPPS employing a diaminobutyric acid residue (Dab) and a ligation handle (4-((tert-butoxycarbonyl)amino)-3-(tert-butyl disulfaneyl)butanoic acid) at position K11 of the proximal SUMO-2 and a thioester in the distal SUMO-2-ΔG, followed by NCL, afforded the K11 diSUMO-2 precursor. Final thiol elimination with 2,5-dibromohexanediamide119 yielded the diSUMO2-VA (Fig. 14c). This probe was also used both in vitro and in cell lysate to demonstrate its reactivity toward all SENPs, including the endogenous and ectopically expressed forms, except for SENP8 (NEDD8 specific protease). Additionally, introducing this probe revealed for the first time that SENP3 prefers diSUMO-2 over SUMO-1/2.108
The electrophilic group dehydroalanine (Dha) has been employed to study the activity of Cys proteases, extending beyond the Ub system120–122 to include Ubls (Fig. 14a and b). Fluorescent UFM1 molecules, equipped with either Dha or PA reactive groups, were chemically synthesized by SPPS and NCL to capture conjugating enzymes and target cysteine proteases, respectively.123 Dha was introduced into the protected peptide by equipping the C-terminus with Cys(Bn)-OMe, which was later transformed by oxidative elimination with O-mesitylenesulfonylhydroxylamine to generate the UFM1-Dha probe (Fig. 14d). Rho-UFM1-Dha and Rho-UFM1-PA were applied in vitro and in cell experiments, and their sub-cellular localization was visualized using confocal microscopy. UFM1-Dha showed reactivity toward the E1 conjugating enzyme UBA5, but no cross-reactivity with the Ub conjugating enzyme UBE1. UFM1-PA was recognized only by UFM1-specific proteases, trapping UFSP1 faster than UFSP2. Introducing Rho-UFM1-PA by electroporation into either unmodified HeLa cells or those transfected with catalytically dead FLAG-UFSP1 revealed colocalization exclusively with catalytically active UFSP1 and distribution throughout the cell and nucleus.
The Dha probe has also been employed to study the semisynthetic ISG15, where it was expressed with a Cys to Ser mutation and with an additional C-terminal Cys.96 The probe effectively captured the established E1, E2, and E3, and the specific proteases USP18, USP5, and USP14, in various in vitro studies.
The production of Ubls with reactive groups was also achieved using intein-mediated recombinant peptide hydrazide. This hydrazide C-terminus serves as a handle for subsequent activation and aminolysis. The initial Ubl to be tested was UFM1-NHNH2124 which was oxidized with sodium nitrate, then subjected to aminolysis with propargylamide or thioesterification with MESNA, followed by aminolysis with glycine 7-amido-4-methylcoumarin to form the fluorescent probe UFM1-AMC (Fig. 15c).
This approach made ABPs accessible not only to chemistry laboratories but also to biological ones. The use of acyl hydrazide functionality was expanded by Bode and coworkers, who produced it through a single site-specific acetylation of recombinant UFM1-NHNH2 with different anhydrides at pH 3. This allowed access to the electrophilic warheads methyl-fumarate, glycidic acid, pentynoic acid, and acetic acid anhydrides derivatives (Fig. 14a).125
This method preserved the integrity of the folded protein and prevented reactions with unprotected amino acid side chains. These ABPs were used for in vitro and in-cell experiments. The chloroacetyl probe demonstrated remarkable selectivity towards the de-UFMylase UFPS2 in cell lysates and live cells (Fig. 14a). Following immunoprecipitation and proteomics analysis, only a small quantity of unspecific labeling was observed.
This method was further expanded to facilitate the direct preparation of NEDD8 (ΔG76, ΔGG) and SUMO2 (ΔG93, ΔGG) ABPs with different electrophiles, without extensive preparation or purification steps.126 The specificity and cross-reactivity of the NEDD8- and SUMO2-based probes with three known specific proteases, USP21, SENP1, and SENP8, were investigated. These probes showed excellent specificity for their respective proteases (SENP8 for NEDD8 and SENP1 for SUMO2), with minor cross-reactivity observed for NEDD8.
Recently, the same group introduced on-demand photoaffinity SUMO-ABPs, which were designed to capture SENPs in vitro and in live cells upon photo-irradiation.129 A bulky nitrobenzyl (Nbg) photocaging group (Fig. 15a) was introduced during the synthesis of SUMO-2-PA at Gly93 and at Gly64 to interfere with SENP binding and to prevent aspartimide formation, respectively. UV irradiation at 365 nm cleaved the Nbg groups, enabling SENP2 trapping via the propargylamine functionality. For in-cell SENP profiling, SUMO-2 was labeled with a cell-permeable cR10 and D-biotin and delivered to HeLa cells, where SENP3 was successfully captured upon UV activation.
The TAMRA fluorophore was also included in the development of a Ubl fluorescence polarization reagent to investigate deconjugating activity toward various Ubls (Fig. 15d). Among these, a NEDD8–peptide conjugate was prepared by NCL between a synthetic NEDD8-thioester and TAMRA-labelled 5-thioLys-Gly using the E1 enzyme.132 The fluorescence polarization assay was used to monitor deconjugation of the NEDD8 substrate by the known proteases UCH-L3 and USP21. UCH-L3 exhibited deNEDDylation activity, while USP21 showed no cross-reactivity toward NEDD8.
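As a point of reference for the readout used in such assays, the short sketch below applies the standard definition of fluorescence polarization, P = (I∥ − G·I⊥)/(I∥ + G·I⊥), reported in millipolarization units (mP). The function name, the default G-factor, and the intensity values are illustrative assumptions and are not taken from the cited studies. Cleavage of a small TAMRA-labelled peptide from the Ubl is expected to lower the polarization, because the released fragment tumbles faster than the intact conjugate.

```python
def polarization_mp(i_parallel: float, i_perpendicular: float,
                    g_factor: float = 1.0) -> float:
    """Return fluorescence polarization in mP.

    i_parallel and i_perpendicular are emission intensities measured
    parallel and perpendicular to the excitation polarization.
    g_factor is the instrument correction factor (assumed 1.0 here).
    """
    corrected_perp = g_factor * i_perpendicular
    total = i_parallel + corrected_perp
    if total <= 0:
        raise ValueError("summed intensity must be positive")
    return 1000.0 * (i_parallel - corrected_perp) / total


# Hypothetical plate-reader intensities (arbitrary units), for illustration only.
intact = polarization_mp(1500.0, 1000.0)   # Ubl-peptide conjugate before cleavage
cleaved = polarization_mp(1100.0, 1000.0)  # free TAMRA-peptide after cleavage

print(f"intact: {intact:.1f} mP, cleaved: {cleaved:.1f} mP")
```

In practice the G-factor is calibrated for each instrument, and the decrease in mP over time after protease addition is what reports on deconjugation.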
Fluorescence polarization was further applied to investigate the specificity of several SENPs. This was done using a high-throughput one-pot ligation–desulfurization strategy for the synthesis of isopeptide-linked SUMO-3 in a 96-well plate, where each well contained a TAMRA-labeled peptide sequence of the active site of the most abundant SUMOylated proteins.133 The deconjugation assay was initiated by treatment with five different SENPs (SENP1, SENP2, SENP5, SENP6, and SENP7), resulting in a comprehensive dataset of SENP preferences for each SUMO substrate.
Recently, the acyl hydrazide functionality was applied to generate fully folded fluorescence polarization substrates for Ub/Ubls. These substrates were activated from their C-terminal hydrazide to the acyl azide, followed by functionalization to isopeptides.134 Fully cleavable substrates (Ubl-KG-TAMRA) for SUMO1, SUMO2, NEDD8, and ISG15 were prepared with this procedure to investigate the substrate specificities of human UCHL3, USPL1, USP2, USP7, USP16, and USP36. The specificity of USPL1 toward the SUMO paralogs was re-investigated by fluorescence-polarization-based cleavage assays and rationalized by crystal structure analysis. For structural analysis, SUMO-2/3 were prepared with either a 2-bromoethyl warhead or as ΔN-SUMO2/3-PA (where the N-terminal domain was excluded) and covalently linked to USPL1. Extensive biochemical analysis, including mutations at key residues of SUMO-2/3 at the interface with USPL1, revealed that the specificity of USPL1 toward SUMO-2/3 over SUMO-1 is attributed to its ability to recognize the Gly27 loop.
The applicability of the method was extended to NEDD8 and ISG15, which led to the discovery that USP16 and USP36 are active toward ISG15. Consequently, USP16 and USP36 became the first human DUBs known to be active toward three distinct modifiers. Furthermore, the cross-reactivity of UCHL3 with NEDD8 and Ub was reaffirmed in the same study.
The field still requires further expansion of the chemical toolbox to effectively access Ubl conjugates that remain difficult to prepare, particularly those that are highly hydrophobic or that form longer chains on a particular substrate. Given the limited examples available in the literature of Ubls linked to their substrates or in their hybrid forms, our ability to replicate these complex biological systems and fully understand the critical aspects of Ubl chain regulatory mechanisms remains restricted.
Alongside more straightforward access to Ubls, we should also emphasize the need for advanced chemical techniques to study PTMs of Ubls,135–138 which are crucial for understanding how these modifications influence Ubl function and regulation, an aspect that remains to be fully explored. Moreover, we still need methods that allow us to follow the fate of proteins upon Ubl modification. The existing literature offers few studies of synthetic modified proteins introduced into their natural cellular environments, resulting in a limited understanding of these conjugates. With recent advances in the delivery of synthetic proteins to cells, it is now possible to take advantage of these precious synthetic conjugates to study their cellular behavior, in particular their localization under different cellular conditions, their interactome, and their stability.
Additionally, delivery of external proteins could compensate for a lack of endogenous protein, thus rescuing the knockdown phenotype.
Towards these goals, we have recently synthesized SUMO-2 tail analogs, using the established synthetic methods and delivery approaches mentioned above, to examine their cleavage rate and cellular localization. The study revealed that the native SUMO-2 tail undergoes rapid processing and plays a critical role in nuclear localization and integration into PML-NBs.139
Innovations in probe design aimed at increasing specificity and sensitivity are also required to enhance our ability to study Ubls in diverse biological contexts, thus enabling examination of their regulatory machinery and the aberrations related to it. Finally, taking advantage of these conjugates and probes in therapeutic applications shows great promise, particularly in targeting Ubls and their related machineries in various diseases, similar to what has been done in the Ub system.25,140 Our recent work, in which we identified de novo cyclic peptides that modulate the Ub system through a combination of chemical protein synthesis and the RaPID system, is one example of this avenue.141–143 These peptides exhibited strong selectivity for K48- or K63-linked Ub chains, affecting their cellular behavior, including interactions with DUBs, the proteasome, and DNA repair machinery, positioning them as novel therapeutic candidates. Similarly, one could use any of the synthetic Ubl conjugates to find selective cyclic peptides to modulate biological processes.
Footnote
† These authors contributed equally.
This journal is © The Royal Society of Chemistry 2025