Total chemical synthesis of SUMO-2-Lys63-linked diubiquitin hybrid chains assisted by removable solubilizing tags†

†Electronic supplementary information (ESI) available. See DOI: 10.1039/c7sc00488e

We report the first total chemical synthesis of four different SUMO-2-Lys63-linked di-ubiquitin hybrid chains, in which the di-ubiquitin is linked to different lysines in SUMO.


Introduction
The covalent attachment of ubiquitin (Ub) and ubiquitin-like proteins to lysines in substrate proteins regulates a vast array of cellular processes. 1,2 Ubiquitin is an 8.5 kDa, 76-amino acid protein, whereas the small ubiquitin-like modifier (SUMO) proteins have a molecular weight of ~10 kDa and exist as four isoforms known as SUMO-1/2/3/4. SUMOylation regulates essential cellular processes such as transcription, mitochondrial dynamics and the DNA damage response. 1,2 Analogous to ubiquitination, SUMOylation is mediated by E1, E2 and E3 enzymes, which catalyze formation of an isopeptide bond between the SUMO C-terminal Gly and the ε-amino group of a Lys residue in a target protein. Like poly-ubiquitin chains, poly-SUMO-2/3 chains are also found in cells, where one SUMO is connected via its C-terminus to a Lys that is present in the consensus motif ψKXD/E (where ψ represents a large hydrophobic residue and X any amino acid) of the preceding SUMO.
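The consensus motif described above (a large hydrophobic residue, then Lys, then any residue, then Asp or Glu) can be illustrated with a short pattern search. The following is only a minimal sketch: the set of residues counted as "large hydrophobic" (V, I, L, M, F) is an assumption made here, and the input string is a made-up toy sequence, not the SUMO-2 sequence.

```python
import re

# Assumption: "large hydrophobic residue" is approximated by V, I, L, M or F.
# A lookahead is used so that overlapping motifs would also be reported.
CONSENSUS = re.compile(r"(?=([VILMF]K.[DE]))")


def find_consensus_lysines(sequence):
    """Return 1-based positions of Lys residues inside a hydrophobic-K-x-D/E motif."""
    # m.start() is the 0-based index of the hydrophobic residue;
    # the Lys follows it, so its 1-based position is m.start() + 2.
    return [m.start() + 2 for m in CONSENSUS.finditer(sequence)]


# Hypothetical toy sequence, used only for illustration.
toy_sequence = "GSVKTEGGAKPKAG"
print(find_consensus_lysines(toy_sequence))
```

In the toy sequence, only the lysine in the `VKTE` stretch is reported; the other two lysines lack the flanking residues required by the motif.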
It was initially believed that SUMO and Ub were independent in their homotypic signaling. However, mass spectrometry studies revealed the presence of ubiquitinated SUMO in cells, 3 indicating the presence of mixed SUMO-ubiquitin polymers. More recently, hybrid chains consisting of Lys63-linked polyubiquitin conjugated to SUMO-2 chains were shown to play a role in DNA double strand break (DSB) repair by recruiting the BRCA1-A complex 4 subunit RAP80. This subunit of the BRCA1-A complex was previously known to bind specifically to Lys63-linked di-Ub, 5,6 and it was shown to also contain a SUMO-interacting motif, suggesting that RAP80 recognizes both ubiquitin and SUMO. Indeed, a hybrid chain containing Lys63-di-Ub linked to SUMO-2 binds 80-fold more tightly to RAP80 than SUMO-2 or Lys63-di-Ub alone. 4 The enhanced affinity of the hybrid chain for RAP80 is mediated by the tandem SUMO-interacting motif and the tandem ubiquitin-interacting motifs present in RAP80. 7 Despite some progress in understanding the involvement of the hybrid chains in DNA DSB repair, many fundamental questions remain to be answered. For example, there are eight possible lysines in SUMO-2 to which ubiquitin can be conjugated (Lys5, Lys7, Lys11, Lys21, Lys33, Lys35, Lys42 and Lys45), in addition to the SUMO N-terminus (Met1), 8 yet it is not known which SUMO-ubiquitin linkage might be optimal for efficient binding to RAP80. Furthermore, the three-dimensional structural topologies of the different types of hybrid chains are uncharacterized. Since additional proteins containing SUMO-interacting motifs in proximity to ubiquitin-binding motifs have been identified, 4 it is possible that proteins other than RAP80 may also recognize hybrid chains in DNA damage or in other cellular events.
At least one deubiquitinating enzyme has been identified that specifically removes ubiquitin from SUMO, 9 although it is not known whether there are additional enzymes specific for the hybrid chains that could also potentially discriminate among the different chain types.
In order to investigate these important questions, homogeneous material consisting of hybrid chains in workable quantities is needed. While hybrid chains containing ubiquitin fused to the N-terminus of SUMO can be prepared enzymatically, 4,8 until this work the other hybrid chains remained inaccessible. Moreover, probes and unique analogues based on these hybrid chains are often difficult or even impossible to prepare by enzymatic approaches. Chemical synthesis should in principle overcome these challenges and make it possible to prepare a desired hybrid chain and its analogues. While the synthesis of native SUMO-1/2/3 using ligation approaches 10–12 and of SUMO linked to Ubc9 via a triazole bond have been achieved, 13 the synthesis of hybrid chains has remained inaccessible. We report here for the first time the preparation of four different SUMO-2-Lys63-linked-di-Ub analogues by employing current state-of-the-art chemical methods for protein synthesis. 14 In these syntheses, we encountered major problems in the handling and purification of synthetic peptide intermediates, which gave rise to inefficiencies. To overcome these problems, we examined various synthetic strategies, mainly the use of two methods for attaching solubilizing tags at different positions in SUMO to facilitate synthesis. This approach enabled us to compare the reliability of these methods and select the most efficient one for future syntheses.

Results and discussion
We previously demonstrated the total chemical synthesis and semisynthesis of different ubiquitin conjugates of increasing complexity, such as free tetra-Ub chains, 15 doubly modified H2B histone with glycan and Ub units, 16 and tetra-ubiquitinated proteins. 17–19 This has facilitated biochemical, biophysical and functional analyses of these conjugates. 18–20 Equipped with various synthetic tools for protein synthesis of such complex conjugates, we decided to examine the total chemical synthesis of SUMO-2 linked to Lys63-di-Ub via its N-terminus, di-Ub(K63)-SUMO-2 (1, Scheme 1), using a convergent approach. Towards this goal, we strategically divided the 93 amino acids of SUMO-2 into two fragments, Thz-SUMO(2-45)-SR, 2, (Thz: thiazolidine) and Cys-SUMO(47-93)C49S, 3, (Scheme 1). The N-terminal Met was mutated to Thz to enable ligation with Ub, and Ala46 was mutated to Cys to facilitate native chemical ligation (NCL). 21 As shown in Scheme 1, this Cys residue is converted to alanine during the desulfurization reaction, thus restoring the native sequence. The native Cys49 was mutated to Ser, which is structurally close to Cys, to prevent its conversion to Ala during desulfurization in the later stage of the synthesis. Fragment 2 was synthesized using Fmoc-SPPS and an N-acylurea approach followed by thiolysis after peptide deprotection and cleavage from the resin. 22 Fragment 3 was also synthesized employing Fmoc-SPPS. This fragment exhibited anomalous behavior in the HPLC purification step, but despite this we were able to obtain the fragment in 8% isolated yield. Having both fragments in hand, we then carried out the ligation reaction, followed by Thz conversion to Cys to obtain Cys-SUMO(2-93), 4. 23,24 On the other hand, di-Ub(K63*)-COSR was synthesized employing isopeptide chemical ligation between the two Ub building blocks, namely Ub(K63*)-N-methyl-Cys 25 5a, bearing δ-mercaptolysine (K*) 26 at position 63, and Ub-COSR 7.
This was followed by switching the C-terminal N-methyl-Cys to the corresponding thioester of 3-mercaptopropionic acid (MPA), 6a, as we previously reported for the synthesis of Lys48-linked di-Ub. 15 We then attempted the final ligation between Cys-SUMO(2-93) 4 and di-Ub(K63*)-COSR 6a. However, we did not observe product formation even after 24 h incubation, likely due to the poor solubility of di-Ub(K63*)-COSR in the ligation buffer.
It is important to mention that, in this synthesis, the first ligation between fragments 4 and 5 and the subsequent mercaptolysine deprotection proceeded smoothly, as is clearly evident from analytical HPLC and mass analysis of these reactions. However, the HPLC purification of the resultant ligation product 6 was very challenging: we observed dragging of the product (eluting over ~10 min), which led to overlap with the impurities and to the desired product being obtained in a very low yield (ESI S20 †). In addition, we encountered the same purification problem again while isolating the di-Ub(K63)-Ala1-SUMO 1 conjugate. Although this approach was successful in obtaining the final product, the low yield of the final product (<2%) and the difficulties in the purification steps make it very difficult to apply to the synthesis of this and other analogues based on the hybrid chains.
We speculated that the behavior of the peptides in HPLC analysis and purification might be due to the hydrophobic nature of the synthetic intermediates and decided to install a removable solubilizing tag in fragment 3 during SPPS. This could in principle increase the solubility, reduce the dragging of the peptide fragments, and ease the handling and purification of the intermediates and final conjugate. A variety of methods have been developed to aid the synthesis of hydrophobic proteins or those whose synthetic intermediates are difficult to purify and use in ligation. 28–33 In considering this problem, we were inspired by the recent work of Liu and coworkers, in which the 3,4-diaminobenzoic acid (Dbz) linker 22 was employed to attach a poly-Arg tag to the C-terminus of a hydrophobic peptide fragment to increase its solubility in the ligation buffer and assist in its purification. 34 Subsequently, the Dbz linked to the (Arg)6 tag was switched to the thioester by treatment with sodium nitrite followed by thiolysis with exogenous thiol to enable the next ligation. We also chose to utilize the Dbz linker to install an (Arg)6 tag at the C-terminus of Cys-SUMO(47-93), hoping to control the anomalous behavior of this fragment during the preparative HPLC purification and improve the overall yield. We also hoped this would ease the handling of the SUMO-Ub conjugates in the purification steps. In our case, the tag could be removed by converting the Dbz into the more reactive triazole intermediate followed by hydrolysis.
We rst attempted the synthesis of a model peptide bearing a Dbz linked (Arg) 6 tag at the SUMO C-terminus, SUMO(49-93)-Dbz-(Arg) 6 10, using allyloxycarbonyl (Alloc) protection on the Dbz during SPPS. Aer peptide elongation, the alloc protection was removed and the peptide was cleaved from the resin to obtain 10. The mass analysis of the reaction mixture showed two masses of 6262.2 Da and 6244.1 Da in a ratio of 78 : 22 (ESI S25 †). We presumed that under strong acidic conditions, the free amine of the Dbz might have condensed with the carbonyl of the less sterically hindered C-terminal Gly to give a SUMO-(49-92)-NHCH 2 -benzimidazole side product 11 with a mass of 6244.1 Da (Scheme 2). This assumption was further supported by analyzing the resulting mixture of the removal step of the Dbz-(Arg) 6 tag. Upon treatment with 60 equiv. of NaNO 2 at À15 C for 10 minutes we observed the formation of a benzotriazole intermediate 12.
Extending the reaction for an extra hour at the same temperature provided a mixture containing SUMO(49-91)-piperazine-2,5-dione 13, the desired hydrolyzed product SUMO(49-93)-COOH 14 and the unreacted 11 (ESI S27 †). We reasoned that, since the peptide benzotriazole 12 is highly reactive, it spontaneously undergoes both hydrolysis and intramolecular nucleophilic attack by the amide of the subsequent Gly on the activated C-terminus, the latter leading to the formation of piperazine-2,5-dione 13.
To control the formation of the benzimidazole side product 11, we synthesized SUMO(49-93)-Dbz-(Arg)6 while protecting the free amine with a propargyloxycarbonyl (Proc) group. The peptide was cleaved from the resin, and the Proc group was subsequently removed using PdCl2 24 on the crude peptide in Gn·HCl to obtain SUMO(49-93)-Dbz-(Arg)6 10 without the formation of any side products. The Proc group was chosen instead of Alloc for protection because its removal in aqueous solution using palladium is much more efficient. In addition, to avoid formation of piperazine-2,5-dione 13, we chose to use mercaptoethanol in this reaction, which could rapidly form a thioester by intercepting the benzotriazole intermediate 12.
To our delight, adding an excess of mercaptoethanol to the benzotriazole 12, and continuing the reaction for an additional hour, led to the quantitative formation of the thioester, SUMO(49-93)-COSCH2CH2OH. Subsequently raising the reaction pH to 9 35 and keeping it for an additional hour at room temperature provided the desired SUMO(49-93)-COOH (ESI, S30 †). Encouraged by these results with the model peptide, we then implemented this approach for the synthesis of di-Ub(K63)-Ala-SUMO(2-93). We first synthesized Cys-SUMO(47-93)-Dbz-(Arg)6 15 and ligated it with fragment 2, which was followed by Thz removal to obtain Cys-SUMO(2-93)-Dbz-(Arg)6 16 (Scheme 3). At this stage of the work, and in a parallel project, we discovered the efficient removal of Thz by palladium complexes, 24 hence we began to employ [Pd(allyl)Cl]2 for Thz removal instead of the traditional conditions of methoxylamine at pH 4. Polypeptide 16 was then subjected to ligation with Ub(K63*)-COSR 5, bearing protected δ-mercaptolysine at position 63, followed by Thz removal to furnish Ub-(K63*)-Cys-SUMO(2-93)-Dbz-(Arg)6 17. It should be mentioned that, although analytical HPLC with and without the tag gave a similar pattern, 17 exhibited much less dragging in preparative HPLC than the conjugate without a tag, and the product was isolated in 35% yield. The next ligation with 7 also proceeded smoothly, and subsequent desulfurization afforded 8, which, upon removal of the solubilizing tag under the conditions established above, gave the desired product 9 in 2-3% overall yield (Fig. 1) starting from fragment 15.
Although the incorporation of a Dbz-based solubilizing tag assisted in isolating the SUMO and SUMO-Ub conjugates, the key fragment Cys-SUMO(47-93)-Dbz-(Arg)6 15 was obtained in very low yield (~6%). In addition, the requirement for multiple steps to remove the tag rendered this approach low yielding and operationally difficult, in particular when dealing with small-scale synthesis. This prompted us to search for alternative approaches for enhancing the solubility of the peptide fragments. While we were validating the applicability of the Dbz-based solubilizing tag, our laboratory developed a phenylacetamidomethyl (Phacm) based removable solubilizing tag applicable to protein synthesis. 28 This newly developed linker is stable under SPPS and NCL conditions and can be cleaved quantitatively using PdCl2 after completing the synthesis of the target protein. Therefore, we decided to examine the effect of the Phacm-based solubilizing tag in the synthesis of di-Ub(K63)-Ala-SUMO(2-93) 1. One obvious advantage of this approach is the opportunity to distribute the solubilizing tags along the polypeptide sequence, in contrast with the previous method, which is mainly limited to the C-terminus. We therefore wondered whether installing two Phacm-linked solubilizing tags across the SUMO chain would improve the behavior of the SUMO-Ub conjugates during purification and the overall yield.
Subsequently, the two Phacm-linked solubilizing tags were removed quantitatively by treatment with PdCl2 in Gn·HCl buffer at pH 7 for 2 h (Fig. 2). The reaction was quenched with an excess of DTT, dialyzed against Gn·HCl at pH 7, and subjected to desulfurization to produce the desired product 1 in 39% yield and 6% overall yield starting from fragment 19 (ESI, S51 †), which is at least twofold higher than the overall yield of the synthesis based on the Dbz tag. We successfully applied a similar strategy to synthesize di-Ub(K63)-Lys11-SUMO-2, di-Ub(K63)-Lys33-SUMO-2 and di-Ub(K63)-Lys42-SUMO-2. These analogues were then folded in Tris buffer at pH ~7.3 and their secondary structures were examined by circular dichroism, which suggested correct folding of these chains (ESI, S54 †). We are currently building on our optimized synthetic approach to prepare workable quantities of these chains and use them in a variety of biochemical, proteomic and structural studies, which will be reported in due course.
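The yield comparison between the two tag strategies rests on simple arithmetic: an overall yield is the product of the individual step yields, so the fold improvement follows directly from the two overall figures. The sketch below uses only the percentages quoted in the text (6% overall for the Phacm route, 2-3% overall for the Dbz route, 39% for the final stage); the function names are illustrative, and reading the 39% figure as the yield of the final stage alone is an assumption.

```python
def overall_yield(step_yields):
    """Overall fractional yield of a linear sequence: product of step yields."""
    total = 1.0
    for step in step_yields:
        total *= step
    return total


def fold_improvement(new_overall, old_overall):
    """How many times higher the new overall yield is than the old one."""
    return new_overall / old_overall


phacm_overall = 0.06          # 6% overall, Phacm-tag route
dbz_overall_range = (0.02, 0.03)  # 2-3% overall, Dbz-tag route

# Fold improvement against the upper and lower ends of the Dbz range.
print(round(fold_improvement(phacm_overall, dbz_overall_range[1]), 1))
print(round(fold_improvement(phacm_overall, dbz_overall_range[0]), 1))

# Assumption: 39% is the yield of the final stage only, so the combined
# yield of all earlier steps is the overall yield divided by 0.39.
implied_earlier_steps = phacm_overall / 0.39
print(round(implied_earlier_steps, 3))
```

Under these assumptions the Phacm route is between two and three times higher yielding overall, and the steps preceding the final stage would account for a combined yield of roughly 15%.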
In summary, we have accomplished the total chemical synthesis of four different di-Ub(K63)-SUMO-2 analogues. During our attempts at synthesis, major handling and purification problems were encountered, which forced us to attach poly-Arg tag(s) to SUMO to assist in overcoming these obstacles. Two strategies were examined. In the first approach, a poly-Arg tag was attached to the C-terminus of SUMO via a Dbz-cleavable linker. In the second approach, we attached the poly-Arg tags via the newly developed Phacm linker, which is cleaved by PdCl2 upon completion of the synthesis. Although both were successful in overcoming the above-described challenges, the approach based on the Phacm linker turned out to be more efficient and reliable, and it required fewer steps. The current study and the developed synthetic approaches open new horizons for studying the role of the hybrid chains in DNA repair and should enable studies that are otherwise difficult or impossible to conduct. Moreover, the synthetic lessons learned here should assist in the synthesis of other challenging proteins.