Hamada Saad‡ *ab, Thomas Majer‡ a, Keshab Bhattarai‡ a, Sarah Lampe a, Dinh T. Nguyen b, Markus Kramer c, Jan Straetener d, Heike Brötz-Oesterhelt de, Douglas A. Mitchell b and Harald Gross *ae
aDepartment of Pharmaceutical Biology, Institute of Pharmaceutical Sciences, University of Tübingen, Auf der Morgenstelle 8, 72076 Tübingen, Germany. E-mail: hamada.saad@pharm.uni-tuebingen.de; harald.gross@uni-tuebingen.de
bDepartment of Chemistry and the Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana–Champaign, Urbana, Illinois 61801, USA
cInstitute of Organic Chemistry, University of Tübingen, Auf der Morgenstelle 18, 72076 Tübingen, Germany
dDepartment of Microbial Bioactive Compounds, Interfaculty Institute of Microbiology and Infection Medicine, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany
eCluster of Excellence: EXC 2124: Controlling Microbes to Fight Infection, University of Tübingen, Tübingen, Germany
First published on 30th October 2023
Lasso peptides are a class of ribosomally synthesized and post-translationally modified peptides (RiPPs) that feature an isopeptide bond and a distinct lariat fold. A growing number of secondary modifications have been described that further decorate lasso peptide scaffolds. Using genome mining, we have discovered a pair of lasso peptide biosynthetic gene clusters (BGCs) that include cytochrome P450 genes. Using mass spectrometry, stable isotope incorporation, and extensive 2D-NMR spectroscopy, we report the structural characterization of two unique examples of (C–N) biaryl-linked lasso peptides. Nocapeptin A, from Nocardia terpenica, is tailored with a Trp–Tyr crosslink, while longipeptin A, from Longimycelium tulufanense, features a Trp–Trp linkage. Besides the unusual bicyclic frame, a Met of longipeptin A undergoes S-methylation to yield a trivalent sulfonium, a heretofore unprecedented RiPP modification. A bioinformatic survey revealed additional lasso peptide BGCs containing P450 enzymes which await future characterization. Lastly, nocapeptin A bioactivity was assessed against a panel of human and bacterial cell lines with modest growth-suppression activity detected towards Micrococcus luteus.
Lasso peptides are one of nearly 50 described classes of RiPPs and display a structurally unique lariat conformation as the class-defining feature. The N-terminus of the core peptide and a side chain carboxylate of an Asp/Glu residing at position 7–9 form a macrolactam ring that is threaded by the C-terminal “tail” residues.2 Large, steric-locking residues adjacent to the plane of the ring and/or disulfide bridge(s) result in a kinetically trapped rotaxane conformation that endows most lasso peptides with extraordinary thermal and proteolytic stability.3
Lasso peptide biosynthesis starts with the ribosomal synthesis of a bipartite precursor peptide (A) that contains N-terminal leader and C-terminal core regions. The leader portion harbors the recognition sequence, which directs the enzymatic processing events via interaction with the RiPP precursor recognition element (RRE).4 Meanwhile, the core region receives all of the post-translational modifications (PTMs).5 Upon RRE binding, leader peptidolysis is initiated by the co-occurring (B) protein, releasing the core peptide as a substrate for the ATP-dependent lasso cyclase (C).3a,6 In some lasso peptide biosynthetic pathways, secondary modifications include disulfide bonds, C-terminal methylesterification, N-acetylation, citrullination, O-phosphorylation, glycosylation, epimerization, β-hydroxylation, and aspartimidation.2b,7 The combination of genome sequencing and enhanced genome-mining algorithms has revealed a large number of RiPP-associated PTMs, including those found in lasso peptides.7c,8
Bacteria of the genus Nocardia were first studied mainly to understand their pathogenicity. However, metabolic and genomic studies later demonstrated the extraordinary potential of Nocardia to produce bioactive and structurally diverse secondary metabolites distinct from other Actinomycetota.9 In contrast to the well-established and prolific biosynthetic potential of Streptomyces, only a fraction of the predicted natural products for Nocardia have been discovered to date. Known RiPPs from Nocardia are currently restricted to the lipolanthines nocavionin,10a nocaviogua A and B,10b the thiopeptides nocathiacins10c and nocardithiocin,10d and the chimeric lanthipeptide nocathioamide.8d
Based on this promising potential,8d,9c,11 we mined the genomes of two highly similar Nocardia terpenica strains (i.e., IFM 0406 and 0706T) for RiPPs using RODEO.8a The Nocardia-specific bioinformatics analyses identified three putative lasso peptide biosynthetic gene clusters (BGCs) which were not linked to any reported natural product. One Nocardia BGC contained a member of protein family PF00067,12 a predicted cytochrome P450 protein. While P450-encoding genes are relatively rare in bacterial RiPP BGCs,13 they have yet to be reported in a lasso peptide BGC. Since RiPP-associated P450 proteins perform versatile and complex oxidative transformations,13,14 we envisioned such BGCs would yield a new type of lasso peptide. Thus, we characterized the products of this pathway, termed the nocapeptins, given the origin was Nocardia terpenica (taxonomic order: Corynebacteriales). Bioinformatic expansion outside of Nocardia led to the discovery and characterization of the longipeptins, from the phylogenetically distant actinomycete, Longimycelium tulufanense (taxonomic order: Pseudonocardiales).
A BLAST-P search of NopB/C followed by RODEO analysis unveiled a small number of similar BGCs (Fig. S2†). One such result was from Longimycelium tulufanense CGMCC 4.5737 and was termed the lop BGC. The lop and nop BGCs exhibit considerable protein similarity and genetic synteny (Fig. 1 and Table S1†). However, L. tulufanense contains two additional genes, lopG and lopH, encoding a more divergent second cytochrome P450 protein and a protein of unknown function, respectively, suggesting that the resulting lasso peptide will receive additional modifications relative to the nop BGC product.
Based on the presence of a canonical Thr(-2), RODEO predicted that leader peptidolysis would occur at Gln/Gly for NopA and LopA, yielding 15-residue core peptides (Fig. 1). Macrocyclization was predicted between Gly1 and Asp8 with Arg14 being the most suitable candidate for a steric-locking residue. The P450 proteins encoded by the nop/lop BGCs suggested additional oxidations would be installed; however, bioinformatics cannot predict the specific modification(s) installed by these enzymes. Driven by the new core sequences and the unprecedented inclusion of P450 enzymes in the BGC, we initiated a media screening campaign to isolate and characterize the products of the nop/lop BGCs.
The collision-induced dissociation (CID) spectra of 1 and 2 showed a series of y and b ions confirming the NopA sequence (Fig. S4†). Despite the lower probability of observing internal fragments arising from two amide dissociation events, low-intensity y and b fragments were observed that corroborated the amino acid composition of the macrocycle and the residues forming the macrolactam (Tables S2 and S3†). A consistent −2.0157 Da deviation was found in most daughter ions, allowing localization of the secondary modification to the Gly1-Trp2-Tyr3 region.
Using a similar mass-targeted growth media screen,17 L. tulufanense yielded candidate products with [M + 2H]2+ values of 851.3472 (major), 843.3501 (minor), and 844.3395 (trace), designated as longipeptins A–C (3–5), respectively (Fig. S7†). We hypothesized that the post-translational modification status of the longipeptins would deviate from that of 1 and 2 owing to the presence of lopG and lopH in the BGC (Fig. 1). Using high-resolution tandem MS (HR-MS/MS), we confirmed the LopA sequence (Fig. S9–S11 and Tables S4–S6†). Besides the anticipated –2H loss, the neutral molecular formula of longipeptin A (3, C76H96N22O22S) was consistent with net addition of oxygen and methylation (+CH2) (Fig. S12 and S13†). While longipeptin B (4) retains methylation, it lacks hydroxylation, suggesting it may be a biosynthetic intermediate. On the other hand, longipeptin C (5) lacks methylation but retains net addition of oxygen (Fig. S12 and S13†).
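The mass relationships among longipeptins A–C can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis pipeline: it converts the reported [M + 2H]2+ values to neutral masses, compares the pairwise differences with the monoisotopic masses of O and CH2, and compares the value for 3 with the reported formula C76H96N22O22S. The function names, the monoisotopic mass table, and the proton mass constant are assumptions introduced here for illustration.

```python
# Monoisotopic atomic masses (Da) and the proton mass; standard reference values.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146, "S": 31.9720707}
PROTON = 1.007276


def neutral_mass(mz: float, charge: int) -> float:
    """Neutral mass from an [M + zH]z+ ion, following the source's ion notation."""
    return (mz - PROTON) * charge


def formula_mass(counts: dict) -> float:
    """Monoisotopic mass of an elemental formula given as {element: count}."""
    return sum(MONO[element] * n for element, n in counts.items())


# [M + 2H]2+ values reported in the text.
observed_mz = {"A": 851.3472, "B": 843.3501, "C": 844.3395}
masses = {name: neutral_mass(mz, 2) for name, mz in observed_mz.items()}

oxygen = formula_mass({"O": 1})
methylene = formula_mass({"C": 1, "H": 2})
longipeptin_a = formula_mass({"C": 76, "H": 96, "N": 22, "O": 22, "S": 1})

print(f"A - B = {masses['A'] - masses['B']:.4f} Da (O   = {oxygen:.4f} Da)")
print(f"A - C = {masses['A'] - masses['C']:.4f} Da (CH2 = {methylene:.4f} Da)")
print(f"A observed = {masses['A']:.4f} Da, formula = {longipeptin_a:.4f} Da")
```

Under these assumptions, the A/B difference corresponds to one oxygen and the A/C difference to one methylene unit, in line with the assignments given in the text.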
Despite their structural similarity, the MS2 spectra of 3 and 4 differed significantly from 5 (Fig. S8†). To explore the discrepancy, and to determine if the net addition of oxygen was from hydroxylation, Met oxidation, or another modification, we began with the analysis of the MS2 spectrum of 5. The resulting y/b ions agreed with the expected core sequence of LopA, a Gly1-Asp8 macrolactam, and loss of 2.0157 Da within the Gly1-Trp2-Trp3 region. Further MS2 inspection unveiled a new series of (yn-64 Da) ions consistent with ejection of methanesulfenic acid (MeSOH) and thus the presence of Met S-oxide (Fig. 2, S11 and Table S6†). Based on the trace amounts isolated, we conclude that 5 was likely formed by spontaneous air oxidation.
The b ion series of 3 and 4 supported the sequences and location of the previously noted modifications. In 3, the 2H loss and the exclusive hydroxylation were localized by the key b3 ion. Considering the spectral similarity of 3 and 4 (Fig. S8†) and the annotated b ions, methylation was localized to the Met13-Arg14-Asp15 C-terminal region. Of the several possibilities considered, S-methylation of Met13 was the most plausible, and this assignment accounted for the full MS2 spectra of 3 and 4. The parent ions of 3 and 4 supported Met13 S-methylation by affording a characteristic M-62 ion, arising from a presumed β-elimination of dimethylsulfide (Fig. 2). In addition, the complete annotation of a new series of y fragments (yn-62), including the most intense ions y4*–7*, corroborated the putative modification in 3 and 4 (Fig. S9–S10, Tables S4–S5†).
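The nominal 62 and 64 Da neutral losses used to distinguish S-methylated Met from Met S-oxide can be related to elemental formulas. The sketch below assumes that the 62 Da loss is dimethyl sulfide (C2H6S) and that the 64 Da loss is methanesulfenic acid (CH4OS); the helper names and the mass table are illustrative and are not taken from the original work.

```python
# Monoisotopic atomic masses (Da); standard reference values.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.9949146, "S": 31.9720707}


def formula_mass(counts: dict) -> float:
    """Monoisotopic mass of an elemental formula given as {element: count}."""
    return sum(MONO[element] * n for element, n in counts.items())


def shifted_fragment(mz: float, loss_mass: float, charge: int = 1) -> float:
    """m/z of a fragment ion after a neutral loss (the charge state is unchanged)."""
    return mz - loss_mass / charge


# Assumed neutral-loss assignments (see lead-in).
NEUTRAL_LOSSES = {
    "dimethyl sulfide (C2H6S), from S-methyl-Met": {"C": 2, "H": 6, "S": 1},
    "methanesulfenic acid (CH4OS), from Met S-oxide": {"C": 1, "H": 4, "O": 1, "S": 1},
}

for name, formula in NEUTRAL_LOSSES.items():
    print(f"{name}: {formula_mass(formula):.4f} Da")
```

A y ion carrying the modified Met would then be expected at `shifted_fragment(mz, loss_mass, charge)` in the corresponding satellite series.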
For nocapeptin A (1), the LC-MS profiles of IFM 0406 cultures supplemented with [2H7] L-Tyr and [2H8] L-Trp (all C–H protons replaced with deuterium) separately showed incorporation of +6 and +16 Da, respectively (Fig. 3, S5 and S6†). The [2H7] L-Tyr feeding data unambiguously demonstrated that the modification was not dehydrogenation, rather a single C–H of Tyr3 was involved in the mass deviation. The [2H8] L-Trp data showed the expected shift for Trp and allowed the conclusion that the C–H bonds of Trp2 and Trp7 were unaltered.
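The logic of the labeling experiment reduces to counting deuterium atoms. As a minimal sketch (nominal masses only, and assuming a single Tyr and two Trp residues in the core peptide, as implied by the text), the number of C–H bonds lost per labeled amino acid follows from the observed shift:

```python
def nominal_label_shift(n_residues: int, d_per_residue: int, ch_lost: int = 0) -> int:
    """Expected nominal mass shift (Da) when every copy of a residue is fully
    deuterated and ch_lost of the labeled C-D positions are removed."""
    return n_residues * d_per_residue - ch_lost


def ch_bonds_lost(observed_shift: int, n_residues: int, d_per_residue: int) -> int:
    """Number of labeled C-H(D) positions removed, inferred from the observed shift."""
    return n_residues * d_per_residue - observed_shift


# [2H7] L-Tyr feeding: one Tyr assumed, +6 Da observed.
tyr_lost = ch_bonds_lost(observed_shift=6, n_residues=1, d_per_residue=7)
# [2H8] L-Trp feeding: two Trp, +16 Da observed.
trp_lost = ch_bonds_lost(observed_shift=16, n_residues=2, d_per_residue=8)

print(f"Tyr C-H positions lost: {tyr_lost}")
print(f"Trp C-H positions lost: {trp_lost}")

# Shifts expected for alternative scenarios at the Tyr residue.
for lost in (0, 1, 2):
    print(f"Tyr with {lost} C-H lost -> +{nominal_label_shift(1, 7, lost)} Da")
```

Removal of two Tyr C–H positions would give +5 Da rather than the observed +6 Da, which is why the data point to a single Tyr C–H, with the second hydrogen coming from a position that does not carry the label.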
Notably, 1H-NMR analysis highlighted a discriminative resonance of a single aromatic NH of one Trp residue (δH 10.47, broad singlet), despite two Trp present in the peptide (Fig. S14†). 1H–13C HMBC and 1H–1H NOESY correlations assigned the unmodified Trp as Trp7 (Fig. S32†), and hence, Trp2 was expected to be modified at the indolic N1 position. The observable couplings obtained from a 1H–15N HMBC spectrum, between H2-Trp2 and H2-Tyr3 with N1-Trp2 (δN 120.80) in addition to 1H–13C LR-HSQMBC experiment,19 1H–1H NOESY relationships, and the isotope labeling experiments permitted the assignment of a biaryl connection between the N1 of Trp2 and C3 of Tyr3 (Fig. 4, S21, S23 and S29†). The polypeptide backbone of 1 was assigned via the sequential connectivity of the delineated fragments using 1H–13C HMBC cross-peaks from α-H resonances to the amide carbonyls of the neighboring amino acid. In addition, the detected Hα,β(i) → HN(i + 1) NOESY correlations supported the complete sequence. The macrolactam between Gly1 and Asp8 was validated similarly (Fig. 4D and S33†). Ultimately, a threaded conformation of 1 was evident with Gln13 and Arg14 assigned as upper and lower plugs, respectively, via NOESY correlations (Fig. S23†). The latter experimental findings were in good agreement with the calculated conformation model of the backbone structure, generated with the software LassoHTP.20
Using our stable isotope labeling results, we defined the absolute configuration of three amino acids. The interpretation of the MS-shift patterns showed that both Trp residues and the Tyr unit possess the L-configuration. In addition, the L-configuration of Pro10 was deduced similarly, and the remaining chiral residues of 1 were presumed to be L-configured. In analogy to other biarylic RiPPs,21 we expect that the extra crosslink gives rise to atropisomerism in the nocapeptins. However, since neither broad nor doubled resonances were observed in the 1H or 13C NMR spectra, we assume that only one atropisomer is biosynthesized.
Unlike nocapeptin A (1), the longipeptin A (3) titer was too low for a complete NMR-based structural elucidation. However, keeping in mind the anticipated modification of 3, the available NMR data permitted partial structural elucidation, including substructures containing the modifications. The annotated spin systems from 1H–1H COSY, 1H–1H TOCSY, and 1H–13C HSQC-TOCSY enabled the assignment of 1xAla, 1xPro, 1xSer, 1xArg, 1xAsn, 2xAsp, and 2xGly (Fig. S45–S47†).
Expectedly, the NMR data showed a significant downfield shift of Met γ-CH2 (δC 43.13 instead of the typical ∼30 ppm). Thus, in agreement with the MS data, we proposed this signal originated from S-methylation of Met13, forming a deshielded sulfonium (Fig. 5 and S47†). In addition, three candidate spin systems including the backbone amidic NH, α-H, and β-Ha+b were assembled in tandem with three aromatic systems to constitute the 3xTrp units (Fig. S47†).
Unfortunately, we could not obtain an adequate 1H–1H NOESY spectrum, which prevented a confident connection of the separate Trp substructures, even though weaker allylic 4J2H, βHa+b couplings observed in the 1H–1H TOCSY spectrum suggested possible connectivities (Fig. S45 and S47†). The characteristic indolic NHs of 2xTrp as a pair of singlets (δH 10.18 and 10.55) (Fig. S41A†) and the 1H–13C HMBC correlations identified high-confidence locations of a crosslink and hydroxylation event in the biaryl-containing substructure (Fig. 5, S46 and S47†). Building on the MS data and comparable NMR shifts of 1, two possible isomers were proposed in which the (N1–C5) biaryl crosslink was adopted alternatively between Trp2 and Trp3 (Fig. 5 and S48†).
Considering their sequence similarity, we predict the P450 proteins NopF/LopF (76% identity, 87% similarity) form the biaryl linkages in 1 and 3, respectively. Given that Trp hydroxylation is unique to longipeptin, we anticipate LopG, a second P450 protein encoded by the lop BGC, performs this reaction. The greater sequence divergence of LopG relative to LopF/NopF (35% and 37% identity, respectively) supports this assignment.
Biarylic crosslinks are present in several distinct RiPP classes. Crocagin A22 is a tripeptide RiPP, in which two C-terminal Tyr–Trp residues undergo indole-backbone (C–N) cyclization by a dioxygenase. Atropeptides21b represent a P450-modified RiPP class that contains C–C and C–N linkages between Trp and Tyr residues. Biarylitides23 also contain P450-dependent C–C or C–N linkages between Tyr and His residues. While biarylitides display crosslinks between the first and third residues of the core region, 1 and 3 possess biaryl crosslinks at adjacent aromatic residues.
To shed light on the sequence-function relatedness of NopF/LopF versus known P450-modified RiPPs, a sequence similarity network (SSN) was constructed using the top 1000 non-redundant BLAST-P hits of NopF, predicted atropeptide- and biarylitide-associated cytochrome P450 proteins, and 883 P450 proteins encoded within 10 ORFs of an RRE domain.5c,21b,24 The SSN and similarity/identity analysis (Fig. S51 and S52†) suggested that NopF/LopF are rare examples of P450 proteins within lasso peptide BGCs. A comparative analysis of sequence space also demonstrates that NopF/LopF have significantly diverged from other RiPP-associated cytochrome P450 proteins.
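The general idea behind a sequence similarity network can be sketched in a few lines: proteins are nodes, an edge is drawn when a pairwise score meets a threshold, and clusters are the connected components. The example below is a simplified illustration and not the workflow used in this study. It uses percent identity as the edge score (an assumption; SSN tools typically threshold on an alignment score), takes the NopF/LopF/LopG identities from the text, and adds one hypothetical protein, `hyp_P450`, with an invented score.

```python
from collections import defaultdict


def ssn_clusters(pairwise_scores: dict, threshold: float) -> list:
    """Return clusters (connected components) of the graph whose edges are
    the pairs scoring at or above the threshold."""
    adjacency = defaultdict(set)
    nodes = set()
    for (a, b), score in pairwise_scores.items():
        nodes.update((a, b))
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    clusters, seen = [], set()
    for start in sorted(nodes):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Percent identities: the first three come from the text, the last is hypothetical.
scores = {
    ("NopF", "LopF"): 76.0,
    ("LopG", "LopF"): 35.0,
    ("LopG", "NopF"): 37.0,
    ("hyp_P450", "LopG"): 32.0,  # hypothetical protein and score
}

print(ssn_clusters(scores, threshold=40.0))
print(ssn_clusters(scores, threshold=30.0))
```

Raising the threshold splits the network into finer clusters, and lowering it merges them, which is why the choice of cutoff shapes how distinct a group such as NopF/LopF appears.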
Perhaps the most unusual PTM described in this study is the rare S-methylation of Met that affords a trivalent sulfonium. Given the other functional assignments, LopH is the most probable candidate Met S-methyltransferase. LopH shares modest sequence similarity (45%) with a hypothetical methyltransferase (WP_051757134.1, PF00145, Fig. S53†). To support or refute a tentative assignment of LopH as a methyltransferase, we obtained an AlphaFold-predicted structure and used DALI25 to identify structurally homologous matches from the Protein Data Bank.26 The top hit was a methyltransferase domain-containing subunit of human 5,10-methylenetetrahydrofolate reductase (PDB code: 6fcx). This structure was crystallized with S-adenosylhomocysteine (SAH) bound (Fig. S54, S55 and S57†).27 A comparison of 6fcx and LopH found considerable similarity in the S-adenosylmethionine (SAM)-binding sites (specifically, 6fcx residues Glu463, Thr464, Thr481, Ser484, Thr560, and Thr573, which are equivalent to LopH Glu45, Thr46, Thr63, Ser66, Thr126, and Thr134) (Fig. S56 and S58†). As SAM is a common methyl donor for methyltransferases, the predicted structures and the sequence comparison support a tentative assignment of LopH in forming the sulfonium moiety in 3.28 Future experimental work will be necessary to confirm this prediction.
Sulfonium groups will readily react with nucleophiles if dealkylation or substitution is possible.29 The S-methylated Met residue of longipeptin A was stable to extraction and purification. The enhanced stability may be attributed to the position of Met within the lasso peptide structure as this position is equivalent to a steric plug residue established in nocapeptin A.
S-Methylated RiPPs are quite rare, with only a few cases among the thiopeptides: Sch40832 (ref. 30) and the structurally related thioxamycin31 and thioactin.32 S-Methylation has also been described for a proteusin that is proposed to contain a single S-methylated Cys.33 Thiol methylation has likewise been reported in only a few NRPS cases. Echinomycin and thiocoraline represent NRPS-derived products where S-methylation is catalyzed by a SAM-dependent methyltransferase (Ecm18) or a bifunctional enzyme (TioN) with S-methyltransferase and amino acid adenylation domains.34 Maremycin A/B/G and FR900452 provide further examples of S-methylation from BGCs containing homologs of the methyltransferase MarQ.35
These examples highlight single S-methylation of Cys, thereby differing from the S-methylated Met sulfonium reported here. Rare examples of charged sulfonium units are found in bleomycin A2 (ref. 36) and dimethylsulfoniopropionate (DMSP).37 Bleomycin A2 carries a 3-aminopropyl-dimethylsulfonium moiety as the terminal amine at the Eastern side chain of the molecule. While the freestanding C domain BlmII is postulated to mediate the fusion of the charged side chain with the bleomycin aglycon, the actual biosynthetic origin of the 3-aminopropyl-dimethylsulfonium moiety remains enigmatic. DMSP is a well-known metabolite found in marine environments. Related compounds have been described, such as gonyol and gonydiol, which are DMSP-derived biosynthetic intermediates of malleicyprols.37 Generally, DMSP-related metabolites are part of the sulfur cycle and, in some marine microorganisms, they protect against osmotic, oxidative, and thermal stresses. In this context, it is notable that the longipeptin producer, L. tulufanense, was isolated from a high-salinity lake,38 which may connect the molecular structure of 3 with DMSP ecology.
The BGCs of 1 and 3 encode unique structural features, and the bioinformatic efforts in this study expand the range of PTMs associated with lasso peptide biosynthesis, suggesting that additional oxidative tailoring steps may yet be uncovered (Fig. S59†). To our surprise, the co-occurrence of cytochrome P450 proteins with further maturation enzymes in some retrieved candidate BGCs presents a roadmap to discover additional lasso peptides (Fig. S59†). Given the structural constraint installed by NopF/LopF, future work is warranted to reconstitute the enzymatic activity and substrate scope.39 Lastly, the prediction that LopH is a founding member of a new Met S-methyltransferase family suggests that biochemical characterization of LopH may result in diversifying the existing methylation panel with a sulfonium PTM that can be harnessed in different bioconjugation contexts.40
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3sc02380j
‡ These authors contributed equally.
This journal is © The Royal Society of Chemistry 2023