Buried treasure: biosynthesis, structures and applications of cyclic peptides hidden in seed storage albumins

B. Franke a, J. S. Mylne b and K. J. Rosengren *a
aThe University of Queensland, Faculty of Medicine, School of Biomedical Sciences, Brisbane, QLD 4072, Australia. E-mail: j.rosengren@uq.edu.au
bThe University of Western Australia, School of Molecular Sciences, Crawley, WA 6009, Australia

Received 18th December 2017

First published on 30th January 2018

Covering: 1999 up to the end of 2017

The small cyclic peptide SunFlower Trypsin Inhibitor-1 (SFTI-1) from sunflower seeds is the prototypic member of a novel family of natural products. The biosynthesis of these peptides is intriguing as their gene-encoded peptide backbone emerges from a precursor protein that also contains a seed storage albumin. The peptide sequence is cleaved out from the precursor and cyclised by the albumin-maturing enzymatic machinery. Three-dimensional solution NMR structures of a number of these peptides, and of the intact precursor protein preproalbumin with SFTI-1, have now been elucidated. Furthermore, the evolution of the family has been described and a detailed understanding of the biosynthetic steps, which are necessary to produce cyclic SFTI-1, is emerging. Macrocyclisation provides peptide stability and thus represents a key strategy in peptide drug development. Consequently the constrained structure of SFTI-1 has been explored as a template for protein engineering, for tuning selectivity towards clinically relevant proteases and for grafting in sequences with completely novel functions. Here we review the discovery of the SFTI-1 peptide family, their evolution, biosynthetic origin, and structural features, as well as highlight the potential applications of this unique class of natural products.


B. Franke

Bastian Franke obtained his PhD in analytical biochemistry and structural biology from the School of Biomedical Sciences, University of Queensland, Australia (2017). During his PhD he employed mass spectrometry and NMR spectroscopy to study the biosynthesis and structures of plant peptides. Subsequently, he moved to Switzerland to join the Biozentrum, University of Basel, as a Postdoctoral Fellow. He is currently working on developing new protein isotope labeling techniques in eukaryotic cells for NMR studies.


J. S. Mylne

Joshua S. Mylne is a plant biochemist who has worked broadly in genetic engineering (PhD Botany, University of Queensland, 2002), developmental genetics and epigenetics at the UK's John Innes Centre (2001–2005) and biochemistry at the Institute for Molecular Bioscience, UQ (2006–2012). He held successive Australian Research Council QEII and Future Fellowships (2008–2016), is the 2012 Goldacre medal winner, a 2014 Feinberg Foundation Visiting Fellow to the Weizmann and 2018 Fulbright Professional Scholar. His lab, founded in 2013, studies protein evolution, biosynthesis and a new program in herbicide discovery. He is tenured in the School of Molecular Sciences, University of Western Australia.


K. J. Rosengren

K. Johan Rosengren is a structural biologist with a particular interest in disulfide-rich peptides. He obtained his PhD from the Institute for Molecular Bioscience, University of Queensland (2003). After research positions at Linnaeus and Uppsala Universities in Sweden he returned to UQ, and established his own laboratory at the School of Biomedical Sciences in 2011. He has held a National Health and Medical Research Council CDA Fellowship (2010–2013), an Australian Research Council Future Fellowship (2014–2017), and is the 2011 Sir Paul Callaghan Medal winner. The focus of his lab is discovery, structural studies and therapeutic applications of bioactive peptides.

1 SFTI-1 – a bicyclic inhibitor from sunflower seeds

1.1 Discovery and structural features of SFTI-1

SunFlower Trypsin Inhibitor-1 (SFTI-1) was discovered in the late 1990s when Russian scientist Alexander Konarev extracted and purified a proteinase inhibitor from the seeds of the common sunflower (Helianthus annuus). It was named based on the strong complex it formed with its target proteinase – namely trypsin. Sue Luckett et al. used X-ray crystallography to describe the remarkable structure of SFTI-1 in complex with bovine β-trypsin.1 SFTI-1 is a 14-amino acid, head-to-tail bicyclic peptide (Fig. 1). It contains a double stranded antiparallel β-sheet, which is bridged by a single disulfide bond between Cys3 and Cys11 creating an extended 7-residue bioactive loop (active site: Lys5–Ser6) and a 5-residue cyclisation loop. The active site loop contains two consecutive proline residues (Pro8 and Pro9), of which the first adopts a cis- and the second a trans-conformation, creating a type VIb β-turn.1 Later investigations by Korsinczky et al. using NMR spectroscopy demonstrated that the solution structure of SFTI-1 alone did not differ significantly from its structure in the crystal complex (Fig. 1).2 The highly optimised and rigid structure is the key to its potency,3 and it is stabilised both by the covalent cyclisation of the backbone and cysteine residues and by an extensive hydrogen bond network. The presence of the hydrogen bonds identified in the crystal structure has also been confirmed using hydrogen–deuterium exchange experiments on SFTI-1 in isolation in solution.2 The importance of the key structural elements has been investigated using modified versions. 
An analogue linearised between Gly1 and Asp14 in the cyclisation loop was structurally very similar to native SFTI-1, except that it lacked the hydrogen bonds involving Gly1 and Arg2, which are involved in the type I β-turn in cyclic SFTI-1.2 Removal of the disulfide bond in cyclic SFTI-1 led to a slightly more disordered structure, but an extensive hydrogen bond network and the conformational restraints caused by three proline residues were sufficient to maintain an overall fold similar to native SFTI-1, and both the acyclic and disulfide deficient analogues retained a high level of potency against trypsin.4 Removal of both the disulfide and cyclic backbone does, however, completely disrupt both structure and activity.4
image file: c7np00066a-f1.tif
Fig. 1 Sunflower trypsin inhibitor-1. Crystal structure of SFTI-1 (cyan, with key binding residues in magenta) in complex with bovine β-trypsin (green, PDB code: 1SFI). The NMR solution structure of SFTI-1 (navy blue) is superimposed (PDB code: 1JBL). The active site Lys and Cys residues (yellow) are labelled with residue numbers. The insert panel shows a schematic presentation of the bicyclic structure of SFTI-1 and its amino acid sequence.

1.2 SFTI-1 – one of the most potent serine-protease inhibitors known

SFTI-1 belongs to the serine protease inhibitor family and is one of the most potent naturally occurring trypsin inhibitors known (Ki = 0.1 nM). The positively charged Lys5 is crucial for the trypsin inhibitory activity of SFTI-1 and interacts with the negatively charged carboxyl group of Asp189 at the base of the binding pocket S1 of bovine trypsin.1,5 The scissile bond between Lys5 and Ser6 is within hydrogen-bonding distance of Ser195 and His57 of the trypsin catalytic triad. An extensive network of hydrogen bonds and ion pairs is formed between the primary binding region of the inhibitor SFTI-1, Arg2–Cys3–Thr4–Lys5–Ser6–Ile7, and the P3–P2′ positions in the active site of trypsin.1

The bioactive loop of SFTI-1 shows striking homology in sequence and structure to the bioactive loop from the trypsin-binding domain of Bowman–Birk Inhibitors (BBIs).1 BBIs are the best studied group of plant protease inhibitors and were long thought to be confined to the grasses (Poaceae) and the legumes (Fabaceae), but an ancestral form was revealed in the spikemoss Selaginella moellendorffii and BBIs were subsequently found more broadly through the plant kingdom.6 A string of seven highly conserved consecutive residues as well as a ninth residue “CTKSIPPxC” are shared by SFTI-1 and BBIs, and they adopt very similar 3D structures.1,7 NMR studies have also shown that the conformation of the bioactive loop by itself is similar to that of native SFTI-1.8 Thus, to investigate the significance of the cyclisation loop, Descours et al. transplanted a 7-residue (TKSIPPI) and an 11-residue (RCTKSIPPICF) version of the bioactive loop onto a turn inducing D-Pro-L-Pro template. NMR studies confirmed a well-defined conformation for both mimetics, with a very similar fold to the reactive loop of native SFTI-1. The 11-residue loop mimetic showed a similar inhibitory activity to SFTI-1, whereas the 7-residue loop mimetic had a ten-fold reduction in potency against trypsin, relative to SFTI-1.9 This reduction in binding affinity might be a result of a slight conformational change within the binding loop when the disulfide bond was replaced by a D-Pro-L-Pro template linker. In another study it was shown that significant structural changes of the cyclisation loop occurred when Asp14 in the secondary loop was mutated to an Ala, which resulted in two distinct conformations with a cis- or trans-peptide bond preceding Pro13. 
The hydrogen bonding between Arg2–Asp14, which is missing in the Asp14Ala mutant, seems to have had stabilising effects on the backbone in native SFTI-1 and in maintaining the highly restrained structure of SFTI-1.10 These studies confirmed that the fine-tuning of the conformation by the cyclisation loop was responsible for the unique potency of SFTI-1 compared to other BBIs.
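The shared bioactive-loop motif described above can be located programmatically. The following is a minimal sketch, assuming the 14-residue SFTI-1 sequence GRCTKSIPPICFPD (consistent with the residue numbering used in this section); the function name `find_bbi_loop` is hypothetical, and the variable position "x" in “CTKSIPPxC” is treated as any residue.

```python
import re

# SFTI-1 in one-letter code; an assumption consistent with the numbering in the
# text (Gly1, Cys3, Lys5, Ser6, Pro8, Pro9, Cys11, Asp14).
SFTI1 = "GRCTKSIPPICFPD"

# "CTKSIPPxC": seven conserved residues, one variable position, then Cys.
BBI_LOOP = re.compile(r"CTKSIPP.C")


def find_bbi_loop(sequence):
    """Return (start, end, matched_text) using 1-based inclusive residue
    numbers, or None if the motif is absent. Hypothetical helper."""
    match = BBI_LOOP.search(sequence)
    if match is None:
        return None
    return match.start() + 1, match.end(), match.group()


print(find_bbi_loop(SFTI1))
```

For SFTI-1 the match spans the two disulfide-forming cysteines, so the motif covers the whole bioactive loop. A strict pattern like this would miss BBI loops carrying substitutions at the conserved positions, so it is only a starting point for sequence screening.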

1.3 Genetic origin of SFTI-1 – buried in a seed storage albumin

Given its unusual structure, the origin of SFTI-1 was an enigma. It was not until more than a decade after the discovery of SFTI-1 that its true genetic and biosynthetic origin was revealed, and SFTI-1 was shown to be embedded in a 151-residue napin-type seed storage albumin precursor protein, which was consequently named Preproalbumin with SFTI-1 (PawS1).11 PawS1 remarkably gives rise to two protein products after post-translational processing: SFTI-1 and a heterodimeric, napin-type 2S seed storage albumin. An unprocessed 116-residue version of the precursor PawS1 was recently produced recombinantly in E. coli and studied by NMR spectroscopy (Fig. 2). The structure of the dual-fated PawS1 showed that SFTI-1 and the albumin, which comprises a small and a large subunit (SSU and LSU), are conformationally well-defined but independent entities held together by a linker.12 The SFTI-1 domain already adopts its typical bioactive structure, whereas the albumin domain comprises four helical segments in a compact bundle. NOE relaxation experiments confirmed that both the “GLDN” linker (between SFTI-1 and SSU) and the “LRMAVEN” linker (between SSU and LSU) have a high degree of flexibility.12 This flexibility is pivotal for the maturation of the seed storage albumin and for the cleavage and cyclisation of SFTI-1, which require access by proteases.11,13
image file: c7np00066a-f2.tif
Fig. 2 Solution NMR structure of recombinant PawS1. (A) Sequence of a 116-residue portion of proalbumin PawS1 after removal of the signal sequence and sequence preceding SFTI-1. The N-terminal SFTI-1 domain is shown in navy blue, the small subunit (SSU) of the albumin domain in green and the large subunit (LSU) in orange. The linker “GLDN” separates SFTI-1 and the SSU and the linker “LRMAVEN” separates the SSU and the LSU of the albumin. (B) NMR structure of PawS1 determined by triple resonance NMR spectroscopy. (C) The N-terminal SFTI-1 domain of the proalbumin PawS1 (navy blue) superimposed on the NMR structure of SFTI-1 (cyan) showing an identical structural conformation. (PDB codes: SFTI-1 1JBL; PawS1 5U87).

2 Biosynthesis of SFTI-1

2.1 Albumin processing by asparaginyl endopeptidase

Seed storage proteins are typically proteolytically matured in transport vesicles on their way to their final destination of the protein storage vacuole. Hara-Nishimura et al. showed that asparaginyl endopeptidase (AEP, also known as vacuolar processing enzyme) recognises and cleaves precursor 2S albumins from pumpkin seeds on the C-terminal side of Asn residues.14 Proteolytic cleavage occurs only at Asn residues located in the hydrophilic part of the precursor pumpkin protein at positions Asn35 and Asn74. These Asn residues are conserved at equivalent positions among 2S albumin precursors of various plant species. Asn residues at other positions are not cleaved, as their surrounding residues and structure might prevent AEP from accessing these peptide bonds,14 a suggestion supported by the NMR structure of PawS1.12

2.2 SFTI-1 hijacking the albumin processing machinery

In 2011 Mylne et al. proposed that SFTI-1 has hijacked the albumin processing machinery.11 AEP was shown to be needed to liberate SFTI-1 by cleaving at Asn and Asp residues that flank its sequence, in addition to the normal Asn cleavage sites around the albumin in PawS1. Furthermore, AEP was suggested to act as a transpeptidase during the final cleavage, ligating the N-terminal Gly to the C-terminal Asp and thereby forming cyclic SFTI-1.11 Bernath-Levin et al. subsequently provided direct evidence that SFTI-1 can be simultaneously excised and macrocyclised from its linear precursor PawS1 through an in situ assay using 18O-water. Transpeptidation of the N-terminal Gly to the C-terminal Asp of SFTI-1 was attributed to a nucleophilic attack at the carbonyl center of the acyl thioester by the N-terminal Gly, with no 18O from 18O-water being present in cyclic SFTI-1 confirming the lack of an intermediate hydrolysis event.13
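The logic of the 18O-water experiment can be illustrated with a simple mass calculation. The sketch below is not the authors' analysis; it assumes the SFTI-1 sequence GRCTKSIPPICFPD and uses standard monoisotopic residue masses, and the helper `peptide_mass` is hypothetical. It compares the expected mass of the cyclic, disulfide-bonded product with that of the acyclic form (one water heavier) and with the shift that a single retained 18O atom would cause.

```python
# Monoisotopic residue masses (Da) for the amino acids present in SFTI-1.
RESIDUE_MASS = {
    "G": 57.02146, "R": 156.10111, "C": 103.00919, "T": 101.04768,
    "K": 128.09496, "S": 87.03203, "I": 113.08406, "P": 97.05276,
    "F": 147.06841, "D": 115.02694,
}
WATER = 18.01056      # H2O containing 16O
HYDROGEN = 1.00783
O18_SHIFT = 2.00425   # mass difference between 18O and 16O

SFTI1 = "GRCTKSIPPICFPD"  # assumed sequence, consistent with the text


def peptide_mass(sequence, cyclic=False, disulfides=0):
    """Monoisotopic mass of a peptide (hypothetical helper).

    A head-to-tail cyclic peptide has no free termini, so no water is added;
    each disulfide bond removes two hydrogen atoms.
    """
    mass = sum(RESIDUE_MASS[aa] for aa in sequence)
    if not cyclic:
        mass += WATER
    mass -= 2 * HYDROGEN * disulfides
    return mass


cyclic = peptide_mass(SFTI1, cyclic=True, disulfides=1)
acyclic = peptide_mass(SFTI1, cyclic=False, disulfides=1)

print(f"cyclic SFTI-1:            {cyclic:.2f} Da")
print(f"acyclic SFTI-1:           {acyclic:.2f} Da")
print(f"cyclic with one 18O atom: {cyclic + O18_SHIFT:.2f} Da")
```

A product that had passed through a hydrolysed intermediate in 18O-water could carry the roughly 2 Da shift, whereas direct transpeptidation leaves the cyclic mass unchanged, which is the outcome reported.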

AEP proteases from different plants have been recombinantly expressed in E. coli and tested for their ability to cleave and cyclise SFTI-1 variants.13 In vitro reactions with SFTI–GLDN and SFTI(D14N)–GLDN peptide substrates showed that the sunflower HaAEP1 could only cleave the substrate SFTI(D14N)–GLDN and not perform macrocyclisation, whereas CeAEP1 from jack bean (Canavalia ensiformis) had the ability to cleave the substrate SFTI(D14N)–GLDN and perform macrocyclisation. RcAEP1 from castor bean (Ricinus communis) preferred the native Asp14 substrate SFTI–GLDN, but could only cleave and not perform macrocyclisation. Thus multiple AEPs might act in concert in vivo to allow cleavage and macrocyclisation. Bernath-Levin et al. proposed that the most abundant AEP in sunflower seeds, HaAEP1, cleaves and releases SFTI–GLDN from its precursor albumin PawS1. Once released, a sunflower AEP with the ability to recognise Asp and macrocyclise performs the transpeptidation.13 Once SFTI-1 is cyclic it cannot be re-cleaved by the enzyme that made it, illustrating how macrocyclisation protects SFTI-1 from protease recognition. No acyclic-SFTI was found in sunflower seeds in vivo, which is likely because acyclic-SFTI enters a pathway of degradation.13

These studies and general knowledge of seed storage protein processing now allow us to summarise the sequence of events that leads to the dual fate for PawS1 (ref. 15; Fig. 3). The PawS1 gene is transcribed into PawS1 mRNA, which exits the nucleus through the nuclear pores and is recognised and translated by ribosomes. PawS1 is synthesised as a preproalbumin polypeptide on the rough ER where it enters the secretory pathway through the endomembrane system into the ER lumen (Fig. 3A), during which the N-terminal signal peptide is removed and the preproalbumin is processed into proalbumin by signal peptidases. Disulfide formation occurs in the ER lumen due to its high oxidative redox potential, and is likely catalysed by protein disulfide isomerases16 (Fig. 3B).

image file: c7np00066a-f3.tif
Fig. 3 Processing of the precursor PawS1. (A) Schematic presentation of preproalbumin PawS1 with the signal sequence (pink), SFTI-1 domain (cyan), SSU (green) and LSU (orange). (B) The signal peptide is removed on entry into the ER and conserved disulfide bonds are formed in the ER lumen. (C) HaAEP1 removes the pro-region and cleaves after residues preceding the SSU and LSU. (D) A cyclising AEP removes the GLDN tail and cyclises SFTI-1. An aspartic exoprotease is thought to remove the albumin linker LRMAVEN.17

The PawS1 proalbumin is transported from the endoplasmic reticulum to the Golgi apparatus.18 Proalbumin PawS1 and AEPs are packaged in separate dense vesicles. These fuse together at what are termed multi-vesicular bodies that are trafficked to protein storage vacuoles. The proteolytic processing of proalbumin PawS1 to PawS1 albumin occurs within the multi-vesicular bodies and is mediated by AEPs and aspartic proteases.19 AEP removes the N-terminal pro region of the proalbumin and cleaves at the residue preceding what will become the mature albumin SSU and LSU (Fig. 3C). An AEP ligase that recognises Asp joins Gly to Asp to make bicyclic SFTI-1, and the linker peptide “LRMAVEN” between the small and the large subunit of the albumin domain is thought to be trimmed back by an aspartic protease, generating the mature heterodimeric albumin PawS1 (Fig. 3D).17 The complete processing leading from the precursor to the fully matured albumin has also been confirmed by in vitro processing of recombinant PawS1 monitored by mass spectrometry and NMR spectroscopy.12

2.3 Cyclisation of other peptides by bifunctional proteases

Intriguingly, SFTI-1 is not the only peptide that gets liberated from a precursor protein and head-to-tail cyclised. A number of other cyclic peptides have been identified from various sources over the last two decades, including plants, bacteria, fungi and mammals (Fig. 4).20 The largest and most studied family is the plant cyclotide family.21 Despite having a completely unrelated genetic origin, evidence suggests that there are similarities in the mechanism of macrocyclisation of different classes of cyclic plant peptides. Saska et al. first showed that AEP activity is required to produce the prototypic cyclotide kalata B1 (kB1) from its precursor protein Oak1 in N. benthamiana.22 Mylne et al. confirmed that the maturation of the cyclic trypsin inhibitor (TI) knottins from the precursor protein named TIPTOP (Two Inhibitor Peptide TOPologies) from Momordica cochinchinensis was also dependent upon AEP activity.23 The repeated independent recruitment of AEP to macrocyclic plant peptide production by evolution was termed “biosynthetic parallelism” and implied that AEP performed the macrocyclisation.23 Recombinantly expressed AEP from the plant Oldenlandia affinis efficiently cyclised the cyclotide kB1, but did not mediate the N-terminal processing, which must occur first for efficient cyclisation to take place.24
image file: c7np00066a-f4.tif
Fig. 4 Selected structures of circular proteins from plants, bacteria, mammals and fungi. (A) SFTI-1 from sunflower seeds (PDB code: 1JBL), (B) kalata B1 isolated from the leaves of the tropical plant Oldenlandia affinis (1NB1), (C) bacteriocin AS-48 produced by the bacterium Enterococcus faecalis (1E68), (D) θ-defensin, found in the leukocytes of rhesus macaques (2LYF), (E) α-amanitin from the fungus Amanita virosa (3CQZ), (F) omphalotin A isolated from the fungus Omphalotus olearius.

Although the precursor proteins PawS1, Oak1 and TIPTOP differ in their sequences, they all share commonalities within their cyclic peptide domain: typically an N-terminal Gly and a C-terminal Asp (for SFTI-1 and TIPTOP) or Asn (for kB1) that are linked to form the cyclic peptide backbone. The residues trailing the peptide domains are also conserved, with the P1′ residue usually small (Gly, Ser or Ala) and the P2′ residue often Leu. These findings of cyclic peptides from unrelated precursor proteins in phylogenetically distant plant families support the hypothesis that AEP performs peptide cyclisation in vivo and that different AEPs are involved and can work in tandem in cleavage and/or cyclisation processing events.23 Although cyclic peptides from sources outside the plant kingdom have very different precursors, there is mounting evidence that bifunctional proteolytic enzymes are also involved in the cyclisation of these peptides. A particularly interesting case is omphalotin A, a heavily N-methylated 12-residue cycle, until recently believed to be the result of non-ribosomal synthesis. Omphalotin A is, however, gene-encoded as a large protein with an N-terminal 399-residue methyltransferase domain that installs the N-methylations before the C-terminal peptide is released and cyclised, likely by a prolyl oligopeptidase.25,26
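The sequence features shared by the plant precursors (Asp or Asn at the ligation point, a small P1′ residue and Leu at P2′) can be expressed as a simple scan. This is a heuristic sketch only: the text describes these preferences as "usually" and "often", so a rule this strict will miss genuine sites and flag false ones. The function name `candidate_ligation_sites` is hypothetical, and the test substrate assumes the SFTI-1 sequence GRCTKSIPPICFPD followed by the GLDN tail.

```python
# Residues described in the text as typical at P1' (the position immediately
# after the Asp/Asn at which AEP cleaves and ligates).
SMALL_P1_PRIME = {"G", "S", "A"}


def candidate_ligation_sites(precursor):
    """Return 1-based positions of Asp/Asn residues that are followed by a
    small P1' residue (Gly, Ser or Ala) and by Leu at P2'. Hypothetical helper."""
    sites = []
    for i in range(len(precursor) - 2):
        if (precursor[i] in "DN"
                and precursor[i + 1] in SMALL_P1_PRIME
                and precursor[i + 2] == "L"):
            sites.append(i + 1)
    return sites


# SFTI-GLDN: the assumed SFTI-1 sequence plus the GLDN tail named in the text.
substrate = "GRCTKSIPPICFPD" + "GLDN"
print(candidate_ligation_sites(substrate))
```

On this substrate the scan reports only the Asp that precedes the Gly–Leu of the tail; the Asp and Asn inside GLDN are not reported because they lack the trailing residues.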

3 SFTI-1 is the prototypic PawS-derived peptide

3.1 Discovery of PawS-Derived Peptides (PDPs)

When describing the genetic origin of SFTI-1, it was realised that PawS1 was not the only sunflower albumin precursor containing an insert resulting in a bicyclic peptide. A second protein, PawS2, gives rise to SFTI-1 Like peptide-1 (SFT-L1).11 To understand how widespread these peptides could be in nature, species related to sunflower were probed for the presence of PawS1 genes and gene products. Screening 267 species from the Asteraceae family with liquid chromatography-mass spectrometry and gene-based approaches discovered sequences encoding a whole new family of peptides similar to SFTI-1.7 De novo transcriptomics was later used as an alternative screening approach to overcome the limitation that a heterologous PCR approach cannot amplify PawS1 genes from species that are distantly related to the common sunflower.27 These novel peptides were termed PawS-Derived Peptides (PDPs) and are widely distributed throughout the Heliantheae and Millerieae tribes of the sunflower family Asteraceae. Like SFTI-1, all PDPs are encoded within a precursor protein that also encodes a seed storage albumin, and thus are only found in the seeds of the plant, not in roots, stems or other plant tissues. To date 36 sequences of the peptide family have been described, of which 22 have been confirmed in planta7,27 (Fig. 5). All peptides share the conserved disulfide bond, but some variants have an Asn rather than Asp residue at the C-terminus and are Cys-stapled hairpins rather than bicyclic.
image file: c7np00066a-f5.tif
Fig. 5 Sequences of selected PawS-derived peptides. Alignment of PDPs discovered in vivo and gene-predicted PDPs (the two groups are distinguished by symbols in the original figure). All sequences were aligned using ClustalW and manually refined and shaded using BoxShade. PDPs are highly conserved at the proto-N-terminal Gly, the two Cys residues forming a disulfide bond and the proto-C-terminal Asp or Asn. PDP-10, -11 and -19 are linear peptides (also marked by a symbol in the original figure).

3.2 Evolution of the PDP family

Due to its BBI mimicry, SFTI-1 was initially considered the smallest member of the Bowman–Birk family, but because of its different genetic and biosynthetic origin SFTI-1 is no longer considered a member of the BBI family.11 Instead SFTI-1 has been shown to have evolved independently through a stepwise process over 45 million years.28 Transcriptomes from 110 sunflower relatives were assembled and analysed by Jayasena et al., allowing the dating of the key events related to the appearance of the PDPs. This analysis revealed that an insertion event that appeared ∼45 Ma resulted in the addition of two AEP processing sites, creating the duality of an albumin precursor. These ancestral PawS-Like precursors are referred to as PawL proteins. Through a further expansion of the PawL1 insert ∼34 Ma the PawS proteins arose and the peptide sequence came to include two Cys residues that through oxidation make the excised peptide a stable bicyclic product. This allowed the further evolution of a specialised function as a protease inhibitor, which occurred ∼23 Ma.28 Intriguingly the PawL precursors have now also been shown to produce different smaller cyclic peptides without a disulfide bond, further expanding the breadth of natural products arising from seed storage albumin processing.28

3.3 Structural diversity of PDPs

Solution NMR spectroscopy has been used to characterise the structures of a wide range of PDPs, showing significant structural diversity as suggested by their sequences.2,7,29 All PDPs are less than 20 amino acids in length and contain a single conserved disulfide bond, which in all peptides appears to adopt a short right-handed hook (or staple) conformation. Common to most PDPs is the central Pro–Pro motif and an absolutely conserved proto-N-terminal Gly, two Cys and a proto-C-terminal Asp or Asn. The Asp–Gly cyclisation motif is found in all cyclic PDPs. PDP-10, PDP-11 and PDP-19, which have a C-terminal Asn, are all acyclic. Although various PDPs share a similar basic scaffold they are quite different in their characteristics in terms of projections of side-chains, flexibility and physicochemical properties. PDP-5, -6, -7, -9, -10, -11, -20 and -22 all contain a central anti-parallel β-sheet bridged by the disulfide bond. In contrast, SFT-L1, PDP-4 and PDP-21 possess a more irregular backbone structure lacking the β-sheet (Fig. 6). All structures have tight turns but their conformations differ, with PDP-7 containing a type I β-turn like SFTI-1, PDP-9 and PDP-11 type VIII β-turns, while the others have less defined type IV β-turns.
image file: c7np00066a-f6.tif
Fig. 6 Solution NMR structures of PDPs. Ribbon representations illustrating the secondary structural elements of the PDP family. The backbone (cyan) is cross-braced by a single disulfide bond (yellow). All PDPs adopt an anti-parallel β-sheet except SFT-L1, PDP-4 and PDP-21, which only consist of turns. The single disulfide bond adopts a right-handed hook conformation.

It is not uncommon for peptides to adopt multiple conformations in solution and during the NMR analyses evidence for conformational inhomogeneity was seen in the form of multiple sets of resonances for some peptides. Mostly the alternative conformations were minor populations with weak intensity, but a structure for PDP-8 could not be determined due to multiple main conformations.30 The majority do however appear to adopt rather well defined conformations, as evident from the significant deviation from random coil observed for many residues.29,30 All peptides have amide protons that are slowly exchanging with the solvent, confirming an extensive network of stable hydrogen bonds. Hydrogen bonds are predominantly found between backbone amides and carbonyls in the elements of secondary structure; however, side-chain polar groups also contribute, in particular in SFTI-1 where Thr4 and Ser6 are involved in both side-chain to side-chain and side-chain to main-chain hydrogen bonds.1 In other PDPs key side-chain to side-chain interactions are predominantly non-polar. This is particularly evident in the PDP-10 and PDP-11 structures, where Val5, Pro6, Tyr7, Pro8, Pro9, Phe10 and Phe11 cluster to create a bend in the otherwise generally flat structures.30 Extensive NMR relaxation studies of the flexibility of PDPs have not yet been undertaken, but they may provide further insights into their folds.

3.4 Bioactivities and physiological significance of PDPs

Like many other cyclic plant peptides, the functional role of SFTI-1 is still unknown, but its ability to inhibit proteases and the fact that plants do not have trypsin suggests that SFTI-1 could have primarily evolved as a defence protein against feeding by pests, parasites and pathogens. Elliott et al. showed that SFTI-1 was able to inhibit trypsins from the gut of the bollworm H. armigera,7 confirming its inhibitory activity is not specific to bovine β-trypsin. Trypsin inhibitory properties have only been described for PDPs that show high sequence similarity with SFTI-1, i.e. PDP-3 and PDP-12. PDP-3 and PDP-12 are subtle variants of SFTI-1, with only a single amino acid change (Phe12 to Tyr12 for PDP-3 and Ile10 to Val10 for PDP-12), and inhibit trypsin at a similar concentration.7 Not all PDPs have evolved an exposed positively charged residue at the end of an extended β-sheet, and the question of the functional implication remains. Given their different structures a common mode of action is difficult to envisage, however multiple targets related to defence might be possible.

4 SFTI-1 in drug design and protein engineering

4.1 SFTI-1 as a model system for protease inhibitor design

The potent activity of SFTI-1 against trypsin suggested it was an ideal framework for protease inhibition that could be tuned for other enzymes. Jaulent and Leatherbarrow designed a bifunctional inhibitor by replacing the cyclisation loop of SFTI-1 with another bioactive loop, creating a double-headed inhibitor with a potent trypsin inhibitory activity (Fig. 7). Substitution of one of the Lys residues by Phe resulted in a bifunctional trypsin/chymotrypsin inhibitor.31,32
image file: c7np00066a-f7.tif
Fig. 7 SFTI-1 in drug design and protein engineering. Examples highlight different regions of SFTI-1 that have been excised and replaced with different epitopes. These include replacement of the bioactive loop with angiogenic sequences from laminin, osteopontin and VEGF (blue), replacement of the cyclisation loop with a second bioactive loop (red) and modification of the active site β-strand to target proteases like KLK4 or binding to and preventing fibril formation of tau-protein.

Modifying SFTI-1 using non-natural amino acids in the substrate-specific P1 position allowed the generation of potent and specific protease inhibitors targeting bovine α-chymotrypsin and cathepsin G.33,34 Positions outside the P1 position have also been explored to optimise substrate–enzyme binding affinity to different serine proteases.35

Particularly promising from a pharmaceutical point of view, SFTI-1 was found to inhibit matriptase, a type II transmembrane serine protease found on the surface of epithelial cells and certain cancer cells, at comparable potency to trypsin with a Ki = 0.92 nM.36 Based on these findings SFTI-1 analogues with improved inhibitory selectivity and metabolic stability were designed and synthesised.37–41 SFTI-1 has also been re-designed as a potent and selective inhibitor of kallikrein-related peptidase 4 (KLK4),42 a peptidase overexpressed in malignant prostate tumors. Swedberg et al. showed that KLK4 preferred the sequence FVQR as a substrate for cleavage, which was re-engineered in the backbone of SFTI-1. The re-engineered SFTI-FCQR inhibited KLK4 with a Ki = 3.59 nM. In a different study the effect of amino acid substitutions on the internal hydrogen bonding network of SFTI-1 in complex with KLK4 and trypsin was investigated, resulting in the production of a second generation inhibitor SFTI-FCQR Asn14 with a 125-fold increased potency against KLK4 (Ki = 0.039 nM),40 and the structures of these complexes have been studied.43 SFTI-1 variants were used to explore binding kinetics and functions,3 and to investigate specificity and improve selectivity of engineered inhibitor variants44 (Fig. 7). Furthermore, SFTI-1 variants have been designed that potently and specifically target the proteolytic activity of the neutrophil serine protease cathepsin G45 and the human and yeast 20S and 26S proteasome.46

The therapeutic potential of SFTI-1-like peptides for targeting clinically relevant proteases is further illustrated by a peptide named peptide leucine arginine (pLR), which is isolated from skin secretions of the North American leopard frog L. pipiens. pLR targets the serine protease tryptase, which plays a crucial role in the development of allergic airway inflammation. A conformational comparison revealed that pLR showed striking similarities with the binding loop of SFTI-1.47 The application of pLR reduced the elevated total lung collagen and the deposition of extracellular matrix collagen and fibrin proteins in a long-lasting chronic asthma model, as well as the acute asthma phenotype in murine asthma models.

4.2 SFTI-1 as a scaffold for grafting applications

Cyclisation has long been considered a valuable improvement for peptide drug development processes by enhancing stability and reducing sensitivity to proteolytic breakdown. Although peptides exhibit high potency and specificity towards molecular targets as well as low toxicity, their use has been limited by their poor stability, limited oral availability and low membrane permeability. Studies by Cascales et al. showed that native SFTI-1 can penetrate cells and was internalised into MCF-7 cells; however, it did not interact with any phospholipid models tested.48 Further studies have taken advantage of this ability to enter cells and shown that by introducing a FRET fluorescence acceptor intracellular proteolytic activity can be studied.49 The advantages that a cyclic backbone provides have been highlighted by studies where peptides that are linear in their native form have been backbone cyclised by a linker. Cyclisation of the conotoxin peptide Vc1.1 made it more stable in simulated gastric fluid, simulated intestinal fluid and human serum than its linear counterpart. Cyclic Vc1.1 also showed improved inhibition of the N-type Ca2+ channel currents.50 Native chemical ligation in solid-phase peptide synthesis is still the strategy of choice for producing synthetic cyclic peptides, but studies of their biosynthesis are opening a new field of research. Cyclic peptides can now be produced by bacterial recombinant strategies,51,52 as well as by chemical or chemoenzymatic methods using AEP or other macrocyclases.13,24,53–60

Rather than optimising the inherent protease inhibitory activity, sequences with completely novel function can be grafted into the SFTI-1 framework. Re-engineering the backbone has been applied to stabilise bioactive epitopes and to create pro-angiogenic agents and potent anti-angiogenic peptides to inhibit tumor growth.61,62 Wang et al. explored the effect of transplanting a tau-derived hexapeptide into SFTI-1 templates to create inhibitors that target fibril ends and stop fibril growth. SFTI-1 was re-engineered to have two ‘faces’: a complement face that recognises the end of fibrils through hydrogen bonds and other non-covalent interactions, and an opposing face that contains the SFTI-1 prolines. The prolines inhibit hydrogen bond formation and therefore inhibit fibril growth63 (Fig. 7). In another application, a short anti-inflammatory sequence identified from the annexin A1 protein was grafted into the SFTI-1 framework and found to be effective in reducing inflammation in a mouse model of acute colitis.64

4.3 SFTI-1 variants in the fight against resistant pathogens

In addition, there is growing interest in exploring the potential of antimicrobial peptides as antibiotics to overcome resistance acquired to conventional antibiotics. Li et al. discovered that the short peptide ORB, isolated from skin secretions of the frog O. grahami, shows striking similarities with the binding loop of SFTI-1 and possesses strong antimicrobial activity against S. aureus, E. coli, B. dysenteria and C. albicans.65 Malik et al. also showed that cyclising these peptides to create bicyclic SFTI-1 analogues benefitted activity and stability, and that this antimicrobial peptide strongly inhibited the virulent pathogen S. aureus in an in vivo mouse model of wound infection.66

5 Conclusions and outlook

SFTI-1 is a peptidic natural product with an intriguing structure and biosynthetic origin. Its unique properties have generated widespread interest in its use in therapeutic and biochemical applications. The examples reviewed here show that the naturally occurring peptide SFTI-1 has potential not only as a model system to provide insights into protease inhibitor mechanism and design, but also as a starting point for the development of new therapeutic agents to target a wide range of diseases from cancer to Alzheimer's.42,61–63 Although SFTI-1 was the focus of all these therapeutic applications, the emerging sunflower PDP family of peptides promises to be a useful “toolbox” for scaffold selection as its members differ in sequence, structural fold and physicochemical properties, a feature that also implies the family has diversity in biological function. The full physiological roles of PDPs are yet to be understood, but with continued studies additional bioactivities might be discovered, adding further value to these buried treasures.

6 Conflicts of interest

There are no conflicts to declare.

7 Acknowledgements

This work and B. F. were supported by Australian Research Council grant DP120103369, awarded to K. J. R. and J. S. M. In addition, K. J. R. and J. S. M. were supported by Australian Research Council Future Fellowships FT130100890 and FT120100013, respectively.

8 References

  1. S. Luckett, R. S. Garcia, J. J. Barker, A. V. Konarev, P. R. Shewry, A. R. Clarke and R. L. Brady, J. Mol. Biol., 1999, 290, 525–533.
  2. M. L. J. Korsinczky, H. J. Schirra, K. J. Rosengren, J. West, B. A. Condie, L. Otvos, M. A. Anderson and D. J. Craik, J. Mol. Biol., 2001, 311, 579–591.
  3. S. J. de Veer, J. E. Swedberg, M. Akcan, K. J. Rosengren, M. Brattsand, D. J. Craik and J. M. Harris, Biochem. J., 2015, 469, 243–253.
  4. M. L. J. Korsinczky, R. J. Clark and D. J. Craik, Biochemistry, 2005, 44, 1145–1153.
  5. E. Zabłotna, A. Jaśkiewicz, A. Łęgowska, H. Miecznikowska, A. Lesner and K. Rolka, J. Pept. Sci., 2007, 13, 749–755.
  6. A. M. James, A. S. Jayasena, J. Zhang, O. Berkowitz, D. Secco, G. J. Knott, J. Whelan, C. S. Bond and J. S. Mylne, Plant Cell, 2017, 29, 461–473.
  7. A. G. Elliott, C. Delay, H. Liu, Z. Phua, K. J. Rosengren, A. H. Benfield, J. L. Panero, M. L. Colgrave, A. S. Jayasena, K. M. Dunse, M. A. Anderson, E. E. Schilling, D. Ortiz-Barrientos, D. J. Craik and J. S. Mylne, Plant Cell, 2014, 26, 981–995.
  8. A. B. E. Brauer, G. Kelly, S. J. Matthews and R. J. Leatherbarrow, J. Biomol. Struct. Dyn., 2002, 20, 59–70.
  9. A. Descours, K. Moehle, A. Renard and J. A. Robinson, ChemBioChem, 2002, 3, 318–323.
  10. N. L. Daly, Y. K. Chen, F. M. Foley, P. S. Bansal, R. Bharathi, R. J. Clark, C. P. Sommerhoff and D. J. Craik, J. Biol. Chem., 2006, 281, 23668–23675.
  11. J. S. Mylne, M. L. Colgrave, N. L. Daly, A. H. Chanson, A. G. Elliott, E. J. McCallum, A. Jones and D. J. Craik, Nat. Chem. Biol., 2011, 7, 257–259.
  12. B. Franke, A. M. James, M. Mobli, M. L. Colgrave, J. S. Mylne and K. J. Rosengren, J. Biol. Chem., 2017, 292, 12398–12411.
  13. K. Bernath-Levin, C. Nelson, A. G. Elliott, A. S. Jayasena, A. H. Millar, D. J. Craik and J. S. Mylne, Chem. Biol., 2015, 22, 571–582.
  14. I. Hara-Nishimura, Y. Takeuchi, K. Inoue and M. Nishimura, Plant J., 1993, 4, 793–800.
  15. J. S. Mylne, I. Hara-Nishimura and K. J. Rosengren, Funct. Plant Biol., 2014, 41, 671–677.
  16. Y. Onda, A. Nagamine, M. Sakurai, T. Kumamaru, M. Ogawa and Y. Kawagoe, Plant Cell, 2011, 23, 210–223.
  17. N. Hiraiwa, M. Kondo, M. Nishimura and I. Hara-Nishimura, Eur. J. Biochem., 1997, 246, 133–141.
  18. L. Li, T. Shimada, H. Takahashi, H. Euda, Y. Fukao, M. Kondo, M. Nishimura and H. Hara-Nishimura, Plant Cell, 2006, 18, 3535–3547.
  19. M. S. Otegui, R. Herder, J. Schulze, R. Jung and L. A. Staehelin, Plant Cell, 2006, 18, 2567–2581.
  20. L. Cascales and D. J. Craik, Org. Biomol. Chem., 2010, 8, 5035–5047.
  21. D. J. Craik, N. L. Daly, T. Bond and C. Waine, J. Mol. Biol., 1999, 294, 1327–1336.
  22. I. Saska, A. D. Gillon, N. Hatsugai, R. G. Dietzgen, I. Hara-Nishimura, M. A. Anderson and D. J. Craik, J. Biol. Chem., 2007, 282, 29721–29728.
  23. J. S. Mylne, L. Y. Chan, A. H. Chanson, N. L. Daly, H. Schaefer, T. L. Bailey, P. Nguyencong, L. Cascales and D. J. Craik, Plant Cell, 2012, 24, 2765–2778.
  24. K. S. Harris, T. Durek, Q. Kaas, A. G. Poth, E. K. Gilding, B. F. Conlan, I. Saska, N. L. Daly, N. L. van der Weerden, D. J. Craik and M. A. Anderson, Nat. Commun., 2015, 6, 10199.
  25. N. S. van der Velden, N. Kalin, M. J. Helf, J. Piel, M. F. Freeman and M. Kunzler, Nat. Chem. Biol., 2017, 13, 833–835.
  26. S. Ramm, B. Krawczyk, A. Mühlenweg, A. Poch, E. Mösker and R. D. Süssmuth, Angew. Chem., Int. Ed. Engl., 2017, 56, 9994–9997.
  27. A. S. Jayasena, D. Secco, K. Bernath-Levin, O. Berkowitz, J. Whelan and J. S. Mylne, Plant Methods, 2014, 10, 34.
  28. A. S. Jayasena, M. F. Fisher, J. L. Panero, D. Secco, K. Bernath-Levin, O. Berkowitz, N. L. Taylor, E. E. Schilling, J. Whelan and J. S. Mylne, Mol. Biol. Evol., 2017, 34, 1505–1516.
  29. B. Franke, A. S. Jayasena, M. F. Fisher, J. E. Swedberg, N. L. Taylor, J. S. Mylne and K. J. Rosengren, Biopolymers, 2016, 106, 806–817.
  30. A. G. Elliott, B. Franke, D. A. Armstrong, D. J. Craik, J. S. Mylne and K. J. Rosengren, Amino Acids, 2016, 49, 103–116.
  31. A. Jaulent and R. Leatherbarrow, Protein Eng., Des. Sel., 2004, 17, 681–687.
  32. A. Jaulent, A. Brauer, S. Matthews and R. Leatherbarrow, J. Biomol. NMR, 2005, 33, 57–62.
  33. M. Stawikowski, R. Stawikowska, A. Jaśkiewicz, E. Zabłotna and K. Rolka, ChemBioChem, 2005, 6, 1057–1061.
  34. A. Łegowska, D. Debowski, A. Lesner, M. Wysocka and K. Rolka, Bioorg. Med. Chem., 2009, 17, 3302–3307.
  35. K. Hilpert, G. Hansen, H. Wessner, R. Volkmer-Engert and W. Hohne, J. Biochem., 2005, 138, 383–390.
  36. Y. Q. Long, S. L. Lee, C. Y. Lin, I. J. Enyedy, S. Wang, P. Li, R. B. Dickson and P. P. Roller, Bioorg. Med. Chem. Lett., 2001, 11, 2515–2519.
  37. H. Fittler, O. Avrutina, B. Glotzbach, M. Empting and H. Kolmar, Org. Biomol. Chem., 2013, 11, 1848–1857.
  38. P. Quimbar, U. Malik, C. P. Sommerhoff, Q. Kaas, L. Y. Chan, Y.-H. Huang, M. Grundhuber, K. Dunse, D. J. Craik, M. A. Anderson and N. L. Daly, J. Biol. Chem., 2013, 288, 13885–13896.
  39. A. Gitlin, D. Dębowski, N. Karna, A. Łęgowska, M. Stirnberg, M. Gütschow and K. Rolka, ChemBioChem, 2015, 16, 1601–1607.
  40. J. E. Swedberg, S. J. de Veer, K. C. Sit, C. F. Reboul, A. M. Buckle and J. M. Harris, PLoS One, 2011, 6, e19302.
  41. A. Gitlin-Domagalska, D. Debowski, A. Legowska, M. Stirnberg, J. Okonska, M. Gutschow and K. Rolka, Biopolymers, 2017, 108, e23031.
  42. J. E. Swedberg, L. V. Nigon, J. C. Reid, S. J. de Veer, C. M. Walpole, C. R. Stephens, T. P. Walsh, T. K. Takayama, J. D. Hooper, J. A. Clements, A. M. Buckle and J. M. Harris, Chem. Biol., 2009, 16, 633–643.
  43. B. T. Riley, O. Ilyichova, M. G. Costa, B. T. Porebski, S. J. de Veer, J. E. Swedberg, I. Kass, J. M. Harris, D. E. Hoke and A. M. Buckle, Sci. Rep., 2016, 6, 35385.
  44. S. J. de Veer, C. K. Wang, J. M. Harris, D. J. Craik and J. E. Swedberg, J. Med. Chem., 2015, 58, 8257–8268.
  45. J. E. Swedberg, C. Y. Li, S. J. de Veer, C. K. Wang and D. J. Craik, J. Med. Chem., 2017, 60, 658–667.
  46. D. Debowski, M. Cichorek, M. Lubos, S. Wojcik, A. Legowska and K. Rolka, Biopolymers, 2016, 106, 685–696.
  47. S. Rothemund, F. D. Sönnichsen and T. Polte, J. Med. Chem., 2013, 56, 6732–6744.
  48. L. Cascales, S. T. Henriques, M. C. Kerr, Y.-H. Huang, M. J. Sweet, N. L. Daly and D. J. Craik, J. Biol. Chem., 2011, 286, 36932–36943.
  49. M. Filipowicz, N. Ptaszynska, K. Olkiewicz, D. Debowski, K. Cwiklowska, T. Burster, M. Pikula, A. Krzystyniak, A. Legowska and K. Rolka, Biopolymers, 2017, 108, e22988.
  50. R. J. Clark, J. Jensen, S. T. Nevin, B. P. Callaghan, D. J. Adams and D. J. Craik, Angew. Chem., Int. Ed. Engl., 2010, 49, 6545–6548.
  51. J. Austin, R. Kimura, Y.-H. Woo and J. Camarero, Amino Acids, 2010, 38, 1313–1322.
  52. Y. Li, T. Aboye, L. Breindel, A. Shekhtman and J. A. Camarero, Biopolymers, 2016, 106, 818–824.
  53. G. K. Nguyen, S. Wang, Y. Qiu, X. Hemu, Y. Lian and J. Tam, Nat. Chem. Biol., 2014, 10, 732–738.
  54. C. N. Alexandru-Crivac, C. Umeobika, N. Leikoski, J. Jokela, K. A. Rickaby, A. M. Grilo, P. Sjo, A. T. Plowright, M. Idress, E. Siebs, A. Nneoyi-Egbe, M. Wahlsten, K. Sivonen, M. Jaspars, L. Trembleau, D. P. Fewer and W. E. Houssen, Chem. Commun., 2017, 53, 10656–10659.
  55. C. J. White and A. K. Yudin, Nat. Chem., 2011, 3, 509–524.
  56. T. Katoh, Y. Goto, M. S. Reza and H. Suga, Chem. Commun., 2011, 47, 9946–9958.
  57. J. E. Townend and A. Tavassoli, ACS Chem. Biol., 2016, 11, 1624–1630.
  58. H. Luo, S. Y. Hong, R. M. Sgambelluri, E. Angelos, X. Li and J. D. Walton, Chem. Biol., 2014, 21, 1610–1617.
  59. C. J. Barber, P. T. Pujara, D. W. Reed, S. Chiwocha, H. Zhang and P. S. Covello, J. Biol. Chem., 2013, 288, 12500–12510.
  60. W. E. Houssen, A. F. Bent, A. R. McEwan, N. Pieiller, J. Tabudravu, J. Koehnke, G. Mann, R. I. Adaba, L. Thomas, U. W. Hawas, H. Liu, U. Schwarz-Linek, M. C. Smith, J. H. Naismith and M. Jaspars, Angew. Chem., Int. Ed. Engl., 2014, 53, 14171–14174.
  61. L. Y. Chan, S. Gunasekera, S. T. Henriques, N. F. Worth, S. J. Le, R. J. Clark, J. H. Campbell, D. J. Craik and N. L. Daly, Blood, 2011, 118, 6709–6717.
  62. L. Y. Chan, D. J. Craik and N. L. Daly, Biosci. Rep., 2015, 35, e00270.
  63. C. K. Wang, S. E. Northfield, Y.-H. Huang, M. C. Ramos and D. J. Craik, Eur. J. Med. Chem., 2016, 109, 342–349.
  64. C. Cobos Caceres, P. S. Bansal, S. Navarro, D. Wilson, L. Don, P. Giacomin, A. Loukas and N. L. Daly, J. Biol. Chem., 2017, 292, 10288–10294.
  65. J. Li, C. Zhang, E. Xu, J. Wang, H. Yu, R. Lai and W. Gong, FASEB J., 2007, 21, 2466–2473.
  66. U. Malik, O. N. Silva, I. C. M. Fensterseifer, L. Y. Chan, R. J. Clark, O. L. Franco, N. L. Daly and D. J. Craik, Antimicrob. Agents Chemother., 2015, 59, 2113–2121.


Current address: Focal Area Structural Biology and Biophysics, Biozentrum University of Basel, 4056 Basel, Switzerland.

This journal is © The Royal Society of Chemistry 2018