Wen-Hao Wu†a, Jianwen Guo†bc, Longshuai Zhangbc, Wen-Bin Zhang*a and Weiping Gao*bc
aBeijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, P. R. China. E-mail: wenbin@pku.edu.cn
bDepartment of Geriatric Dentistry, Beijing Laboratory of Biomedical Materials, Peking University School and Hospital of Stomatology, Beijing, 100081, P. R. China. E-mail: gaoweiping@hsc.pku.edu.cn
cBiomedical Engineering Department, Peking University, Beijing, 100191, P. R. China
First published on 9th June 2022
Living organisms have evolved cyclic or multicyclic peptides and proteins whose stability and bioactivity are superior to those of their linear counterparts for diverse purposes. Herein, we review recent progress in applying this concept to artificial peptides and proteins to exploit the functional benefits of these macrocycles. Not only have simple cyclic forms been prepared, but numerous macrocycle variants, such as knots and links, have also been developed. The chemical tools and synthetic strategies for the biological synthesis of these macrocycles are summarized, demonstrating that biosynthesis is a powerful alternative to chemical synthesis. Its further application to therapeutic peptides/proteins has led to biomedicines with profoundly improved pharmaceutical performance. Finally, we present our perspectives on the field and its future developments.
Fig. 1 (A) The molecular graphs of three macrocycles with different chemical topologies. The line represents a polymer chain with a red vertex being a branch point. (B) General illustration and typical examples of the four categories of the naturally occurring peptide/protein-based macrocycles with distinct topologies. Type I: sunflower trypsin inhibitor 1 (SFTI-1) with planar θ-curve topology, rhesus-θ defensin-1 (RTD-1) with multicyclic topology,51 and bacteriocin AS-48 with pure cyclic topology; Type II: lasso peptide and Spy0128 domain with cyclic-branch topologies; Type III: cyclotide varv F with K3,3 graph topology, tick-derived protease inhibitor (TdPI) with interchain linked topology,52 and trefoil knot; Type IV: Pyrobaculum aerophilum citrate synthase (PaCS) with [2]catenane or Hopf link topology and the capsid of HK97 with “chain-mail” topology.
Peptide/protein macrocycles are not uncommon in nature. Although nascent proteins are strictly linear due to the template polymerization mechanism of ribosomal synthesis,31,32 living organisms have evolved diverse cyclic peptides/proteins via post-translational processing to combat diseases or as part of their defense mechanisms. The most common form is the simple macrocycle formed by seamlessly linking the N- and C-termini after expression.33 To date, more than 1400 sequences of naturally occurring simple cyclic peptides/proteins have been documented.34,35 The largest group of cyclic peptides, the cyclotides, was discovered in plants.36–38 Lasso peptides, in which a cyclic peptide is threaded by its own tail, are perhaps the second largest group (Fig. 1B).39,40 Disulfide bond formation leads to even more complex topologies, such as cysteine knots and ladders (embedded rings that are threaded by the disulfide bonds)33,41 and mechanically interlocked, multicyclic protein links (Fig. 1B).20 Occasionally, the cyclic structures can also be fixed by other covalent linkages, such as thioester, ester, or isopeptide bonds. The beautiful molecular ‘chain-mail’ structure of the bacteriophage HK97 capsid is composed of mechanically interlocked protein rings that are covalently closed by isopeptide bonds (Fig. 1B).42–44 While the cyclization mechanism differs from case to case, folding and assembly appear to be critical in pre-organizing the protein precursors for ring closure.35,40,43,45,46 Cyclization confines the conformational space of peptides/proteins and thus imparts diverse functions while enhancing their stability.33,35 For example, cyclotide Kalata B1 exhibits cytotoxic, antimicrobial, and antiviral activities, as well as excellent proteolytic resistance.33,47,48 A larger cyclic bacteriocin, AS-48, isolated from Enterococcus faecalis S-48, has a remarkably high melting temperature (Tm) of 93 °C (Fig. 1B).49,50 The protein catenane Pyrobaculum aerophilum citrate synthase (PaCS) displays a Tm about 10 °C higher than the linear control without disulfide bonds (Fig. 1B).20 The HK97 capsid can even tolerate up to 5 M guanidinium hydrochloride.42,43
Inspired by the functional benefits observed in these naturally occurring peptide/protein-based macrocycles, much effort has been directed toward synthesizing artificial peptide/protein-based macrocycles and their variants. While nature evolves these compounds for optimal fitness, artificial systems can fully unleash the power of cyclization in engineering their properties. To date, there has been considerable success in their synthesis, supported by a rapidly expanding toolbox of efficient technologies for chemical cyclization.53–60 In this review, we focus on their biological synthesis and biomedical applications. Unlike organic synthesis, biosynthesis relies on the cellular machinery and is often genetically encoded, which allows the structures of the products to be programmed. It is usually highly efficient, specific, and capable of generating considerable complexity with high bioactivity both in vitro and in vivo.59,60 By relating genotype to phenotype, biosynthesis also provides a facile approach for the rapid discovery of bioactive leads.61–63 The chemical space can be expanded even further by incorporating non-canonical amino acids.64–66 The biosynthesis of diverse peptide/protein-based macrocycles and their variants thus provides a rich resource for exploring their biomedical applications.
Expressed protein ligation is a strategy for protein ligation or cyclization based on an intein-mediated splicing reaction.72 It was first reported independently by the Muir73 and Xu74 groups in 1998. The intein-bearing protein precursor generates a C-terminal α-thioester group via S/O acyl transfer, which then undergoes reversible trans(thio)esterification with the N-terminal residue (Cys, Ser, or Thr) of another protein precursor to form a branched oxy(thio)ester intermediate. After succinimide formation and a subsequent S/O to N shift, the intermediate converts to a conjugate ligated through a native peptide bond.75 The process is similar to native chemical ligation,76 but both precursors are produced by recombinant techniques rather than chemical synthesis. It has been successfully employed for protein backbone cyclization, in which the reactive Cys and the intein are appended at the N- and C-termini of the target protein, respectively (Fig. 2A). The first case was reported by Camarero and Muir,77 who cyclized the Src homology 3 domain from the c-Crk adaptor protein and thereby increased its ligand binding affinity 6-fold. Afterward, a series of peptides (such as a brain-binding peptide, complementarity determining region H3/C2)78 and proteins (such as β-lactamase (BLA),79 thioredoxin, maltose binding protein74) were cyclized by the same method. More interestingly, it can also be applied in living cells.80
Fig. 2 Schematic illustration of protein head-to-tail cyclization by expressed protein ligation (A), and split-intein mediated reaction (B).
Split-intein-mediated ligation takes advantage of the fact that intein domains may be split into two parts, named N-intein (IN) and C-intein (IC), which can spontaneously reconstitute to form a native peptide bond linkage between their fusion proteins while excising themselves from the fusion.81,82 This protein splicing event can be used for protein cyclization (Fig. 2B). Benkovic and co-workers83 first used split Ssp inteins to produce cyclic peptides and proteins by fusing IN and IC at the C- and N-termini of the target protein, respectively. The resulting circular dihydrofolate reductase (DHFR) exhibited improved thermal and proteolytic stability over its linear counterpart. The method was later applied to other proteins as well, including green fluorescent protein (GFP),84 xylanase,85 and a peptide HIV inhibitor.86 Notably, this strategy is also applicable in cultured cells. For example, Camarero and co-workers87 synthesized cyclotide MCoTI-I by both expressed protein ligation and split-intein-mediated ligation, the latter of which outperformed the former in in vivo expression. Besides Ssp inteins, many split inteins with high activity and robust extein tolerance have been developed by genomic sequencing and rational engineering (e.g., gp41-1, gp41-8, NrdJ-1, and IMPDH-1).72 Among them, a split intein derived from Nostoc punctiforme PCC73102 (Npu), which possesses robust trans-splicing activity with an efficiency of 98%, has been widely used.88 By varying the position of the split site in the Npu intein, two mutually orthogonal split intein pairs (102 residues for IN1 and 36 residues for IC1; 15 residues for IN2 and 123 residues for IC2) have been developed, which greatly enhances our capability to synthesize complex protein macrocycle variants.89,90
Sortase A is a transpeptidase that anchors proteins onto the cell wall surface. It was first reported by Schneewind and co-workers91 in 1999 and has been widely applied since then. It usually recognizes the pentapeptide sequence LPXTG (where X represents any amino acid) at the C-terminus and an oligoglycine sequence at the N-terminus of the substrate. It catalyzes the cleavage of the amide bond between T and G to generate an acyl-enzyme thioester intermediate, which is subsequently attacked by the amine group of the N-terminal oligoglycine for ligation or cyclization (Fig. 3A).92 Boder and co-workers first used Sortase A to cyclize a bifunctional precursor protein, Gly3-GFP-LPETG-His6.93 The final products were characterized as a mixture of monomer and dimer. More specific cyclization of GFP was then realized by the Ploegh group94 and the Gao group23 and further extended to biomedical applications. Since then, a series of cyclic proteins, such as histatin-195 and interferon-α (IFN-α),22,24,25 have been produced by this method. These cyclic peptides and proteins showed enhanced stability and improved biological activity compared with their linear counterparts. Sortase A has since been engineered to be independent of Ca²⁺ and capable of recognizing different pentapeptide sequences, greatly enriching this toolbox.75 A Ca²⁺-independent Sortase A discovered from Streptococcus pyogenes has been successfully applied to cyclize GFP in vivo.96,97 Compared with intein-mediated cyclization, Sortase A requires fewer recognition sequences in the precursor protein. However, its application in vivo is quite limited due to interference from intracellular nucleophiles such as the ε-amino group of lysine.98 Sortase A also suffers from low turnover rates (the molar ratio of the enzyme to protein substrate is usually from 0.1 to 1) and an undesirable reverse reaction.94 It thus requires a large amount of enzyme and a large excess of one substrate to drive the reaction, which is not ideal for protein cyclization.
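The recognition rules described above lend themselves to a simple sequence check. The following Python sketch is illustrative only: the function names, the regular expression, and the placeholder sequence are assumptions introduced here and do not correspond to any published tool. It tests whether a precursor carries N-terminal glycines and an LPXTG motif, and returns the residues that would remain in the ring if cleavage occurs between T and G. Structural accessibility of the termini, which matters in practice, is not considered.

```python
import re

# One-letter codes of the 20 canonical amino acids; X in LPXTG may be any of them.
CANONICAL = "ACDEFGHIKLMNPQRSTVWY"
SORTASE_MOTIF = re.compile(f"LP[{CANONICAL}]TG")


def sortase_cyclization_sites(sequence, min_glycines=1):
    """Return (number of N-terminal Gly, start index of the last LPXTG), or None."""
    seq = sequence.upper()
    # Count the run of glycines at the N-terminus.
    n_gly = len(seq) - len(seq.lstrip("G"))
    matches = list(SORTASE_MOTIF.finditer(seq))
    if n_gly < min_glycines or not matches:
        return None
    # The motif is expected near the C-terminus, so the last occurrence is used.
    return n_gly, matches[-1].start()


def ring_residues(sequence):
    """Residues retained after cleavage between T and G of the LPXTG motif."""
    seq = sequence.upper()
    sites = sortase_cyclization_sites(seq)
    if sites is None:
        return None
    _, motif_start = sites
    # Keep LPXT (4 residues); the G and everything after it leave with the tag.
    return seq[: motif_start + 4]


# Hypothetical precursor mimicking the Gly3-protein-LPETG-His6 layout;
# "MSKGEEL" is a short placeholder, not a real protein sequence.
precursor = "GGG" + "MSKGEEL" + "LPETG" + "HHHHHH"
print(sortase_cyclization_sites(precursor))
print(ring_residues(precursor))
```

In this layout the affinity tag placed after LPETG is removed on cyclization, which is why such a tag is commonly positioned downstream of the motif.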
Fig. 3 Schematic illustration of protein head-to-tail cyclization by Sortase A mediated reaction (A), and Butelase 1 mediated reaction (B).
Apart from Sortase A, other enzymes have also been developed to produce peptide/protein-based macrocycles. The most representative case is a class of peptide ligases in the asparaginyl endoprotease (AEP) family, which require very short recognition motifs to ligate a range of targets.99 Butelase 1 is the first AEP peptide ligase purified from the cyclotide-producing plant Clitoria ternatea.100,101 This enzyme recognizes the C-terminal D/N-HV sequence of the polypeptide precursor and catalyzes the cleavage of the amide bond between the D/N and H to generate an acyl-enzyme intermediate. Cyclization then proceeds via ligation of the D/N residue to the N-terminal amino acid residue (Fig. 3B).102,103 Tam and co-workers102 have constructed cyclic GFP, human growth hormone (hGH), and interleukin-1 receptor antagonist via Butelase 1 mediated ligation. Compared with Sortase A, Butelase 1 shows exceptionally high efficiency (>90% yield, a low enzyme to substrate molar ratio of 0.001 to 0.01, and short reaction times) and broader tolerance of N-terminal residues (almost all amino acids are accepted at the N-terminus).102,104 However, Butelase 1 still cannot be recombinantly expressed in an active form on a large scale, which limits its availability and wide use.104
OaAEP1b is another AEP peptide ligase, isolated from the plant Oldenlandia affinis, with a catalytic mechanism similar to that of Butelase 1 but different recognition sequences at the C-terminus of the protein precursor (NGL, NAL, or NCL).105,106 Although its catalytic efficiency is usually lower than that of Butelase 1, OaAEP1b can be expressed in E. coli and activated under acidic conditions.107 Using structure-based mutagenesis, Wu and co-workers107 engineered an OaAEP1b variant with catalytic kinetics hundreds of times faster than those of the wild type. This variant was demonstrated to be highly efficient for the ligation and cyclization of peptides and proteins. To date, many peptide/protein-based macrocycles have been synthesized by OaAEP1b, including GFP and its variants,108 the intrinsically disordered malarial vaccine candidate Plasmodium falciparum merozoite surface protein 2,109 and several cyclotides (MCoTI-II, Kalata B1, a kallikrein-related peptidase 5 inhibitor based on the SFTI-1 scaffold and a potent α-conotoxin from Conus victoriae).105,110,111
The natural translation process in the ribosome is limited to 20 canonical amino acid building blocks. Genetic code reprogramming (GCR), however, provides an opportunity to incorporate hundreds of non-canonical amino acids, especially side chain-substituted amino acids, into the sequence of peptides/proteins. In general, GCR is based on the “mis-acylation” of tRNA molecules with non-canonical amino acids for subsequent incorporation into polypeptide chains during translation.112 Suga and co-workers113 have shown that GCR can generate backbone-cyclized polypeptides during ribosomal synthesis using recombinant elements for peptide synthesis (Fig. 4). They introduced a Cys-Pro-glycolic acid (C-P-HOG) sequence into the peptide chain via GCR, which can self-rearrange to form a diketopiperazine–thioester intermediate and then react with an N-terminal Cys residue in the same way as in native chemical ligation. This elegant strategy not only enables the synthesis of several naturally occurring backbone-cyclized peptides, such as eptidemnamide, scleramide, RTD-1, and SFTI-1, but also provides another option for generating cyclic peptide libraries and screening them in vitro to identify potential lead drugs. The involvement of non-canonical amino acids often leads to lower production yields. Once a target is identified, however, the short cyclic peptide may also be synthesized by solid-phase peptide synthesis, obviating the problem of scale-up in recombinant ribosomal synthesis. It should be noted that since GCR typically requires flexizyme-mediated tRNA acylation,114 it is challenging to use this system in cells to make large protein molecules.
Fig. 4 Schematic illustration of the head-to-tail cyclization of peptides by genetic code reprogramming.
Howarth and co-workers125 first dissected Spy0128 into two types of peptide/protein reactive pairs, termed pilin-C (residues 18–229)/isopeptag (16 amino acids at the C-terminus) and pilin-N/isopeptag-N. The two parts of each reactive pair can reconstitute and spontaneously and irreversibly form an isopeptide bond between Lys and Asn via an autocatalytic process. However, the inefficient reaction (60% yield) and relatively large molecular weight (∼35 kDa) limit their application. Howarth and co-workers126 then divided the CnaB2 domain (the second immunoglobulin-like collagen adhesin domain)119 derived from FbaB into SpyTag (the C-terminal β-strand composed of 13 amino acid residues) and SpyCatcher (138 residues at the N-terminus), giving rise to another peptide/protein reactive pair. The SpyTag/SpyCatcher chemistry shows high reactivity with a second-order rate constant (k) of ∼1.4 × 10³ M⁻¹ s⁻¹. It was further optimized into second and third generations with exceptional reactivity via rational engineering and directed evolution (k ∼ 2.0 × 10⁵ M⁻¹ s⁻¹ for SpyCatcher002 and k ∼ 5.5 × 10⁵ M⁻¹ s⁻¹ for SpyCatcher003).127,128 Moreover, the reaction is broadly applicable both in vitro and in cellulo. In addition, the SnoopTag/SnoopCatcher pair was created by splitting the D4 domain of the RrgA adhesin from Streptococcus pneumoniae; it shows comparable efficiency and is mutually orthogonal to SpyTag/SpyCatcher ligation.129 To date, many peptide/protein reactive pairs have been developed, constituting a rich toolbox for side-chain ligation.118
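To give a feel for what these rate constants mean in practice, the following Python sketch converts them into half-lives using the standard second-order rate law for two partners at equal initial concentration c0, for which the half-life is 1/(k·c0) and the fraction reacted at time t is k·c0·t/(1 + k·c0·t). The 1 µM concentration and all names in the code are assumptions chosen for illustration. The estimate applies to intermolecular ligation between separate tag and catcher partners; intramolecular cyclization of a single chain is not bimolecular and would not follow this expression.

```python
# Second-order rate constants quoted in the text, in M^-1 s^-1.
RATE_CONSTANTS = {
    "SpyCatcher": 1.4e3,
    "SpyCatcher002": 2.0e5,
    "SpyCatcher003": 5.5e5,
}


def half_life_seconds(k, c0):
    """Half-life of A + B -> P when both partners start at concentration c0 (M)."""
    return 1.0 / (k * c0)


def fraction_reacted(k, c0, t):
    """Fraction converted after t seconds under the same equimolar assumption."""
    x = k * c0 * t
    return x / (1.0 + x)


# Assumed concentration of each partner (1 µM); not a value from the source.
ASSUMED_CONCENTRATION = 1e-6

for name, k in RATE_CONSTANTS.items():
    t_half = half_life_seconds(k, ASSUMED_CONCENTRATION)
    print(f"{name}: half-life of about {t_half:.1f} s at 1 µM")
```

Under this assumption the first-generation pair needs roughly twelve minutes to reach half conversion, whereas the engineered generations reach it within seconds.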
Compared with main-chain ligation, genetically encoded peptide/protein reactive pairs, such as SpyTag/SpyCatcher chemistry, are advantageous in producing cyclic proteins and their variants. First, the reaction is highly efficient and robust. Second, since tag and catcher react not only at the termini but also inside the protein chain, they enable domain-selective cyclization that leaves the N- and C-termini intact for further modification.130 For example, by genetically fusing SpyTag and SpyCatcher to the N- and C-termini of the target protein, Howarth and co-workers131,132 successfully realized the side-chain cyclization of a series of enzymes, including BLA, DHFR, and phytase, leading to the cyclic-branched structures categorized here as Type II (Fig. 5). These enzymes exhibit remarkably enhanced thermal stability in two respects: (1) improved resistance to aggregation; (2) excellent thermal resilience, with a greater tendency to refold and recover catalytic activity. These functional benefits are presumably attributable to the cyclic-branched topology as well as the extraordinary stability of the SpyTag/SpyCatcher complex. Other proteins, such as firefly luciferase133 and lichenase,134 have also been cyclized by SpyTag/SpyCatcher chemistry to improve their bioactivity and stability.
Although SpyTag/SpyCatcher chemistry has been successful for the biologically inspired synthesis of protein macrocycles, especially those with complex topological structures,130 it is not traceless. The large SpyTag/SpyCatcher complex (∼15 kDa) that remains after cyclization might unexpectedly alter the structure and function of the original protein due to increased steric hindrance and potential immunogenicity. To reduce the size of the ligation scar, Howarth and co-workers135,136 have developed the “SpyTag/KTag/SpyLigase” and “SnoopTagJr/DogTag/SnoopLigase” systems by further splitting the SpyTag/SpyCatcher and SnoopTag/SnoopCatcher pairs into three parts. SpyLigase and SnoopLigase catalyze the reaction of SpyTag/KTag and SnoopTagJr/DogTag, respectively, forming an isopeptide bond with a scar of only 5–6 kDa molecular weight. Recently, Zhang and co-workers137 found an alternative way to split SpyCatcher, leading to BDTag (25 amino acids) and SpyStapler (∼8 kDa). SpyStapler can efficiently catalyze isopeptide bond formation between SpyTag and BDTag. This tool was later adapted to develop an active-template synthesis of protein catenanes.138 The small size of the fused tags and the higher coupling efficiency make these tools attractive for researchers in the fields of chemical biology, synthetic biology, protein engineering, and biomaterials.
Recently, Iwaï and co-workers142 reported the production of a mathematical protein trefoil knot with closed ends. The trefoil knot is the simplest knot that cannot be embedded in the plane. A deeply trefoil-knotted YibK from Pseudomonas aeruginosa was cyclized by split intein and Sortase A, respectively, to generate the trefoil knot, which exhibited increased thermal stability and reduced aggregation propensity (Fig. 6). In comparison, protein catenanes contain two or more mechanically interlocked rings and have higher structural complexity. Based on the “assembly-reaction” synergy, Zhang and co-workers143,144 have synthesized a series of protein catenanes using the entwined p53dim homodimerization domain57 and SpyTag/SpyCatcher chemistry in cellulo. The expressed nascent protein precursors first pre-organize into an entangled intermediate assembly guided by p53dim, followed by ring closure through the SpyTag/SpyCatcher reaction to lock in the catenane topology (Fig. 7A).143
Many proteins, including unstructured ELP and well-folded GFP, BLA, or DHFR, have been concatenated by this modular approach.143,144 Owing to the domain-selective feature and intact termini of SpyTag/SpyCatcher, this strategy can be extended to synthesize star-like protein catenanes and protein pretzelanes.145,146 Moreover, lasso proteins, though not included in the nonplanar molecular graphs, can also be produced by a similar strategy. The precursor protein SpyCatcher-p53dim-SpyTag-p53dim forms the cyclic-branched structure with the tail threaded through the ring due to the “assembly-reaction” synergy.147 By replacing the peptide/protein reactive pair with orthogonal split intein chemistry, Zhang and co-workers148 recently developed an autonomous streamlined synthesis of traceless protein heterocatenanes with two distinct rings. Through a sequence of post-translational processes, including p53dim-guided chain interlacing and orthogonal intein-mediated ring closure, the nascent linear chain can be transformed into a main-chain heterocatenane directly in cells (Fig. 7B). The topological diversity has been further expanded by using an engineered p53dim heterodimer. Protein [3]catenane and [4]catenane, composed of three or four interlocked rings, were successfully synthesized by combining the p53dim heterodimer for controlled multi-chain intertwining with mutually orthogonal split-intein-mediated ligation and the SpyCatcher/SpyTag reaction for closing the primary and secondary rings, respectively (Fig. 7C).149 Recently, by rewiring the connectivity of the SpyTag/BDTag/SpyStapler complex in 3D space, an active template strategy was developed to enable selective protein heterocatenane synthesis. The chain entanglement was introduced by folding, and the reconstituted complex served to catalyze isopeptide bond formation for ring closure, leading to the heterocatenane topology (Fig. 7D).138,150 Although still in its infancy, the biosynthesis of non-planar protein macrocycles, such as protein knots and links, is being advanced by the development of genetically encoded protein entangling motifs and ligation tools. These macrocycles with nonplanar graphs often exhibit extra stability and other functional benefits, which hold great promise for protein therapeutics.
SICLOPPS is a method for the high-throughput screening of macrocyclic peptide variants produced by intracellular expression and split intein mediated cyclization (Fig. 8A).160,162 It facilitates the construction of libraries comprising up to about 10⁸ sequences, and the screening process can be performed in both prokaryotes and eukaryotes.163 For example, in order to evolve peptide macrocycles that suppress the toxicity of human α-synuclein, a presynaptic neuronal protein relevant to Parkinson's disease,164,165 Lindquist and co-workers26 constructed a library containing more than 30 million cyclic peptide octamers via intein-mediated protein trans-splicing. After selection through a series of filtering assays in a yeast synucleinopathy model, two peptide macrocycles were picked and confirmed to possess reproducible and specific activities with suppressed toxicity. These selected peptide macrocycles were introduced into a C. elegans Parkinson's disease model and found to significantly reduce dopaminergic neurodegeneration. Another example aimed to engineer peptide macrocycles to regulate the interaction between the heterodimeric hypoxia-inducible factor (HIF)-1α and -1β, which is crucial for cancer therapy.166,167 Tavassoli and co-workers27 constructed a library containing more than 3.2 million cyclic hexapeptides by SICLOPPS and screened out active peptide macrocycles for the inhibition of HIF-1 heterodimerization. One of the selected peptide macrocycles, named cyclo-CLLFVY, was found to be an effective inhibitor that binds to the PAS-B domain (amino acid residues 235–350) of HIF-1α. These examples demonstrate the potential of SICLOPPS in developing highly active macrocyclic peptide drugs for disease treatment.
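The library sizes quoted above can be related to simple combinatorics. The Python sketch below computes the theoretical sequence diversity of a randomized peptide, assuming each variable position is drawn from the 20 canonical amino acids. The function name and the idea of holding one position fixed are assumptions made here for illustration (for instance, if a nucleophilic residue such as Cys were required at the splice junction); the source does not specify the library designs.

```python
def theoretical_diversity(length, fixed_positions=0, alphabet_size=20):
    """Number of distinct sequences for a peptide with some positions held fixed."""
    if not 0 <= fixed_positions <= length:
        raise ValueError("fixed_positions must lie between 0 and length")
    return alphabet_size ** (length - fixed_positions)


# Hexapeptide with one fixed residue and five randomized positions.
print(theoretical_diversity(6, fixed_positions=1))

# Fully randomized octamer, for comparison with the library sizes in the text.
print(theoretical_diversity(8))
```

Under these assumptions, a hexapeptide with one fixed residue gives 20⁵ = 3.2 million sequences, consistent with the hexapeptide library size quoted above, whereas a fully randomized octamer has far more theoretical members than an experimentally accessible library of about 10⁸ sequences can contain.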
Unlike in phage display or SICLOPPS, in mRNA display the in vitro transcription and translation process and the ability to conveniently incorporate noncanonical amino acids notably enlarge the library size to up to 10¹²–10¹⁴.63,157,168 Moreover, with flexizymes,66,169 a class of artificially evolved ribozymes capable of charging tRNA with various non-standard amino acids, Suga and co-workers170,171 have developed an attractive mRNA display methodology, termed the random nonstandard peptide integrated discovery (RaPID) platform. As shown in Fig. 8B, the RaPID system allows the synthesis of cyclized peptide libraries by genetic code reprogramming and spontaneous posttranslational cyclization. Iterative rounds of affinity-based selection and gradual enrichment of the active species give rise to the desired peptide macrocycles for targeted proteins. Using this technology, they successfully identified a series of peptide macrocycle-based inhibitors of factor XIIa (an initiator of the contact system),172–174 calcium and integrin-binding protein 1 (an intracellular protein implicated in the survival and proliferation of triple-negative breast cancer),175 influenza viral envelope protein hemagglutinin,176 and human epidermal growth factor receptor (EGFR),177,178 which provide potent drug candidates for antithrombotics, pneumonia prevention, and cancer therapy. Through effective cyclization and the incorporation of noncanonical amino acids, such as β-amino acids, D-amino acids, and L-carboranylalanine, these peptide macrocycles showed not only high inhibitory activity but also protease resistance and/or good cell permeability. Other exciting examples include the discovery of low nanomolar inhibitors of prolyl hydroxylase isoform 2,179 Akt kinase (isoform-selective),180 and NAD-dependent deacetylase sirtuin 2,181 and the evolution of cyclic peptides with high binding affinity for the interleukin-6 receptor.182 The RaPID platform has also shown commercial promise in the development of potential macrocyclic inhibitors of programmed death 1/programmed death ligand 1 (PD-1/PD-L1).183
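One way to appreciate the difference in library capacity between the display methods is to ask how long a fully randomized peptide can be before its theoretical sequence space exceeds the library size. The short Python sketch below does this for the approximate capacities discussed in this section (about 10⁸ for SICLOPPS and 10¹² to 10¹⁴ for mRNA display). It assumes randomization over only the 20 canonical amino acids, so it understates the space accessible when noncanonical residues are included; the function name is hypothetical.

```python
def max_fully_covered_length(library_size, alphabet_size=20):
    """Longest randomized length whose full sequence space fits in the library."""
    length = 0
    # Extend the peptide while every possible sequence could still be present once.
    while alphabet_size ** (length + 1) <= library_size:
        length += 1
    return length


for label, size in [
    ("SICLOPPS (about 1e8)", 10**8),
    ("mRNA display, lower bound (1e12)", 10**12),
    ("mRNA display, upper bound (1e14)", 10**14),
]:
    print(f"{label}: up to {max_fully_covered_length(size)} randomized residues")
```

This is a counting argument only; actual coverage also depends on sampling statistics and on how evenly the library is represented.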
With these advances, the identification of peptide macrocycles with target biological functions is no longer considered a major obstacle in drug discovery and development. More significant challenges now may lie in the poor membrane permeability, low oral bioavailability, and metabolic instability of these drug candidates.157 Cyclization is expected to play vital roles in optimizing, at least partially, these crucial pharmacological properties for clinical translation.
Cytokines are a class of small proteins that regulate cell growth, differentiation, and immune responses.188 As an important class of protein-based therapeutics, many cytokines, such as IFN, granulocyte colony stimulating factor (GCSF), interleukins, and hGH, have been approved by the FDA for the treatment of various diseases. As an elegant example, Ploegh and co-workers22 constructed a cyclic-branched IFN-α coupled with 10 kDa PEG through two sequential transacylation procedures catalyzed by two Sortase A variants189 recognizing different sequences (LPETA for SrtAstrep and LPETG for SrtAStaph) (Fig. 9A). The engineered cyclic-branched IFN-α showed not only enhanced in vivo stability and thermal stability over its linear counterpart (Fig. 9B and C), but also improved pharmacokinetics over its non-PEGylated counterpart (Fig. 9D). However, owing to the multistep synthesis and the low efficiency of PEGylation, this strategy suffers from a low overall yield (∼10%). Alternatively, the Lu24 and Gao25 groups each reported a method to construct long-circulating cyclic IFN-α. Compared with cyclic-branched IFN-α, these cyclic IFN-α variants exhibited better ex vivo and in vivo tumor penetration, which may be explained by their compact structures. These results suggest that cyclization may be combined with other strategies, such as PEGylation22,190 and ABD fusion,25 to further improve performance.
Fig. 9 Synthesis of cyclic-branched IFN-α and PEG conjugate via two sequential transacylation procedures mediated by Sortase A (A). The cyclic-branched IFN-α and PEG conjugate shows optimized bioactivity (B), improved thermal stability (C), and prolonged circulation time (D).22
Apart from IFN-α, GCSF was also cyclized by Sortase A22 or split-intein191-mediated ligation. Honda and co-workers191 constructed a series of connecting loops to develop different cyclic variants, one of which showed better thermal resistance with a 13 °C increase in Tm. hGH192 and interleukin-1 receptor antagonist102 were also cyclized and showed enhanced stability against thermal denaturation with comparable biological activities. However, it is difficult to evaluate their therapeutic potential due to the lack of in vivo data.
Enzymes are another class of potential therapeutics. Nine recombinant enzymes were approved for clinical use from January 2014 to July 2018.194 However, unlike the cyclization of cytokines, studies on the cyclization of enzymes have mainly focused on model proteins. As a classical model, BLA has been cyclized either by intein-mediated peptide ligation79 or SpyTag/SpyCatcher chemistry132 as mentioned above. All of these cyclic and cyclic-branched BLA variants showed better stability against thermal denaturation than their linear counterparts.
Monoclonal antibodies appear to be the most important class of protein-based therapeutics, considering their dominance in terms of product approvals and market value.194 They usually show high affinity and specificity but need to be produced in mammalian cells at high cost. Thus, scFv antibodies, whose heavy and light variable fragments are usually connected with a short peptide linker, have emerged as an alternative option. However, scFv antibodies suffer from instability, since inter-domain interactions lead to aggregate formation. To suppress scFv oligomerization, Morioka and co-workers193 constructed a cyclic scFv antibody by Sortase A-mediated peptide ligation (Fig. 10A). By doing so, aggregation was markedly suppressed without disturbing the binding activity and thermal stability of the scFv antibody (Fig. 10B). This work is a useful attempt to produce cyclic scFv antibodies, holding great promise for further use in biomedical fields.
Fig. 10 (A) Scheme for Sortase A-mediated scFv cyclization and (B) analysis of molecular size distributions by dynamic light scattering: linear scFv (left) and cyclic scFv (right).193
Besides simple cyclization, a more complicated, higher-order protein [n]catenane-based artificial antibody has been developed by Zhang and co-workers.149 Based on the synthetic strategy mentioned above, a human epidermal growth factor receptor 2 (HER2)-specific affibody (AffiHER2, which targets the HER2 receptor specifically) has been genetically encoded into the scaffolds of protein [3]catenane or [4]catenane, giving rise to bivalent and trivalent artificial antibodies termed [3]catbody and [4]catbody. Due to the multivalent effect and stability enhancement of the [n]catenane scaffold, both [3]catbody and [4]catbody exhibit increased binding affinities to HER2 receptors, significantly prolonged circulation time, and optimized tumor accumulation. This indicates that protein macrocycles and their variants can not only increase the stability of target protein drugs but also bring in other functional benefits, such as multivalent effects.
Despite this progress, much work is still urgently needed. The extensive presence of peptide/protein-based macrocycles in nature should be understood in detail, particularly their formation mechanisms and structural diversity. Detailed study of the structure–property relationship requires a reasonable diversity of artificial peptide/protein-based macrocycles. Although the “assembly-reaction” synergy is powerful, the toolkit of genetically encoded protein entangling and ligation motifs is still limited. This toolbox should be substantially expanded and optimized in terms of reactivity and specificity to facilitate the synthesis of diverse macrocycles. Computer-aided methodology and directed evolution are considered powerful approaches for optimizing these protein toolkits. In addition, it is also important to consider the scale-up synthesis of these macrocycles in industry. With the rapid development of the field, we believe that more and more peptide/protein-based macrocycles will be successfully developed with increasing structural complexity, evolving from simple cycles to multicycles (Type I), cyclic-branch structures (Type II), and further to knots and links (Type III and IV). Functional benefits will be gained in this endeavor, which should open a new avenue for peptide/protein-based drug discovery and enable advanced therapeutics for biomedical applications.
Footnote
† These authors contributed equally.
This journal is © The Royal Society of Chemistry 2022