Caitlin A. McCadden,a Diana P. Łomowska-Keehner,a Tracy Qu,a Jordan Nafie,b Tyler A. Alsup a and Jeffrey D. Rudolf *a
aDepartment of Chemistry, University of Florida, Gainesville, Florida 32611-7011, USA. E-mail: jrudolf@chem.ufl.edu
bBiotools, Inc., West Palm Beach, FL 33407, USA
First published on 9th June 2025
Bacteria have long been proposed to harbor ancestral forms of the bifunctional terpene synthases found in plants. Recent studies described the first identification of these fused bacterial diterpene cyclases/synthases (DCSs). Using genome mining, we found candidate bacterial proteins that were bioinformatically predicted to possess both classes of terpene synthase domains. Here, we report the discovery of a plant-like tridomain bifunctional DCS from Streptomyces albulus. A diterpene overproduction system in E. coli enabled the isolation and structural elucidation of syn-abieta-7,13-diene by NMR, GC-MS, and VCD.
The skeletal complexities of terpenoids are produced by terpene synthases (TSs) and/or cyclases. These enzymes employ carbocation chemistry to create (poly)cyclic hydrocarbon frameworks, often with numerous stereocenters, through intricate carbon–carbon bond formations, rearrangements, and hydride, alkyl, or proton shifts.8–10 TSs are classified into two types based on how they generate the initial cationic intermediate. Type I TSs, which have a conserved α-helical isoprenoid fold, abstract the diphosphate moiety of the polyprenyl diphosphate substrate through coordination of a trinuclear Mg2+ complex with conserved DDxxD and NSE/DTE motifs.9 Type II TSs, often referred to as terpene cyclases (TCs), utilize an acidic motif, often DxDD, to protonate an alkene or epoxide to trigger cyclization; TCs have an α-helical didomain (γβ) architecture.11 Due to their distinct methods of ionization, the reactions of TSs and TCs can be sequentially coupled as the allylic diphosphate is left intact by the type II ionization of TCs.
The prototypical plant TS, commonly referred to as a “TPS” and exemplified by the ent-copalyl diphosphate synthase (CPS)–kaurene synthase (KS) involved in gibberellin biosynthesis, is a tridomain (γβα) bifunctional enzyme (Fig. 1). In some evolutionary lineages, TPSs retain the tridomain architecture but have lost either the type I (e.g., AtCPS) or type II (e.g., TbTS) function, or completely lost the γ domain and thus type II activity (e.g., DsKSL).9,12–15 Conversely, prototypical bacterial TSs are monofunctional, consisting of either an α domain or γβ didomain.9,12–15 This even holds true in bacterial biosynthetic pathways that make terpene skeletons or natural products identical to those of plants, such as the distinct ent-CPS and KS enzymes in gibberellin biosynthesis in Bradyrhizobium japonicum (Fig. 1).16 Given distinct type I TSs and type II TCs in bacteria and bifunctional TPSs in plants, it was hypothesized that the original fusion event required to form the tridomain bifunctional TPS may have occurred in bacteria.14
We recently genome mined for unique TSs in bacteria to expand the chemical space of diterpenes (terpene skeletons originating from the C20 precursor geranylgeranyl diphosphate; GGPP).17 During that effort, we identified a putative bifunctional diterpene synthase (diTS) from Chitinophaga japonensis that produced sandaracopimaradiene (1; Fig. 1). The production of 1, a known plant and fungal diterpene skeleton, indicated that both type II and type I reactions were needed to convert GGPP into 1 via an n-copalyl diphosphate intermediate. As we were preparing that study for publication, Chen, Peters, and colleagues reported five bifunctional diterpene cyclases/synthases (DCSs; this term is used here for consistency) from bacteria, including the sandaracopimaradiene synthase, which was named ChjDCS (UniProt A0A562SP41).18 Here we build on our initial discovery of a bacterial tridomain bifunctional TS via genome mining and bioinformatics, resulting in the functional characterization of a syn-abieta-7,13-diene synthase from Streptomyces albulus NPDC012681 (also known as S. noursei).
After our initial characterization of ChjDCS, we were interested in how many other tridomain bifunctional DCSs were identifiable in genomic databases. A default BLAST search of the UniProt database using full-length ChjDCS as the query returned almost exclusively plant TPSs; only two of the top 250 hits were from bacteria, one of which had a sequence length (>600 amino acids) indicative of a tridomain architecture. UniProt ID I4EKS1 from Nitrolancea hollandica Lb (507 amino acids, 63% coverage, 31% ID, 47% similarity) was partially characterized as a type II clerodane diphosphate synthase.19 UniProt ID A0A6I6S2R0 from Streptomyces sp. GS7 (820 amino acids, 97% coverage, 25% ID, 39% similarity) was identified and named StrDCS in the previously mentioned study.18 At first glance, this suggested that DCSs are rare in bacteria but are present in more than one phylum, e.g., Bacteroidota and Actinomycetota. When the BLAST search was restricted to bacteria in UniProt, seven additional hits with sequence lengths >600 were returned, but of these, only StrDCS and AcrDCS (A0A4R5B427, previously identified but inactive18) appeared to possess true γβα structures based on AlphaFold predictions (Fig. S1†).
We continued a manual database mining effort using the following criteria: bacterial origin, sequence length >700 amino acids, PF designations of PF19086 (“terpene synthase family 2, C-terminal metal binding domain”) or PF13243 (“squalene-hopene cyclase C-terminal domain”), and AlphaFold prediction of γβα structures. We found an additional six putative DCSs (Table S1†): CseDCS (A0A924DZ81, ent-kaurene) and SpDCS (A0A366LV81, inactive), which were both previously identified,18 along with A0A944KY65 from Streptomyces sp. ISL-98, A0A1X7HFA2 from Streptomyces sp. Amel2xC10, A0A5Q0GZZ2 from Saccharothrix syringae, and A0A4U0NJT4 from Streptomyces piniterrae.
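The four mining criteria above can be read as a conjunctive filter. The following is a minimal sketch of that logic only, not the workflow actually used: the `Candidate` record, its field names, and the three example entries are hypothetical, and the structural criterion is reduced to a manually assigned flag standing in for inspection of the AlphaFold model.

```python
from dataclasses import dataclass

# Pfam families named in the mining criteria above.
TARGET_PFAMS = frozenset({"PF19086", "PF13243"})
MIN_LENGTH = 700  # "sequence length >700 amino acids"


@dataclass(frozen=True)
class Candidate:
    accession: str                 # hypothetical identifiers in this sketch
    is_bacterial: bool
    length: int                    # amino acids
    pfams: frozenset               # Pfam accessions annotated on the protein
    three_domains_predicted: bool  # manual call from the AlphaFold model


def passes_criteria(c: Candidate) -> bool:
    """Return True only if all four mining criteria are met."""
    return (
        c.is_bacterial
        and c.length > MIN_LENGTH
        and bool(TARGET_PFAMS & c.pfams)
        and c.three_domains_predicted
    )


# Invented records, for illustration only.
candidates = [
    Candidate("HYPO_1", True, 820, frozenset({"PF19086"}), True),
    Candidate("HYPO_2", True, 507, frozenset({"PF13243"}), False),  # too short
    Candidate("HYPO_3", False, 850, frozenset({"PF19086"}), True),  # not bacterial
]
shortlist = [c.accession for c in candidates if passes_criteria(c)]
print(shortlist)
```

Because the criteria are combined with a logical AND, a sequence that carries the right Pfam annotation but falls below the length cutoff is excluded, which is why shorter type II-only enzymes would not pass this filter.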
We then used ChjDCS and, at the time, the yet to be characterized StrDCS to search for homologs in the NCBI database and the Natural Product Discovery Center (NPDC) genomic library of actinobacteria.20 No additional hits were found in NCBI (three were identified later; see below), but one hypothetical protein from Streptomyces albulus NPDC012681, here named bacterial abieta-7,13-diene synthase (bAbS), was found in the NPDC.
A sequence alignment of these 10 putative DCSs showed significant levels of conservation among all 10 sequences, particularly in the γβ region of the sequences (Fig. S2 and S3†). The DxDD motif in the type II active site is nearly strictly conserved, with only DVEG motifs in A0A4U0NJT4 and A0A944KY65. The metal-binding motifs in the type I active site, DDxxD and NDxxS/TxxxE, are also well conserved. This initial alignment also revealed that the α domains of A0A4U0NJT4 and A0A944KY65 were not in the traditional γβα architecture. Upon closer inspection of their sequences and predicted structures (Fig. S1–S4†), it was clear that the α domains of A0A4U0NJT4 and A0A944KY65 were N-terminal of the γβ didomain and thus in a novel αγβ architecture. Even more unusually, A0A4U0NJT4 appears to have two N-terminal α domains, one that is complete (1–327 with DDHFD and NDIVSYNRE motifs) and one truncated form (354–561 with only a DDYFS motif). Overall, these alignments supported that these enzymes are bifunctional DCSs.
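The motif comparisons in the paragraph above can be checked mechanically. The sketch below is an illustration under a simplifying assumption: each consensus is read literally as a regular expression ("x" is any residue, "S/T" is serine or threonine). The function name is hypothetical, and real annotation relies on alignments and profile models rather than on regular expressions alone.

```python
import re

# Consensus motifs from the text, written as regular expressions.
CONSENSUS = {
    "DxDD": r"D.DD",                 # type II acidic motif
    "DDxxD": r"DD..D",               # type I metal-binding motif
    "NDxxS/TxxxE": r"ND..[ST]...E",  # type I NSE/DTE motif
}


def matches_consensus(name: str, segment: str) -> bool:
    """Return True if the whole segment fits the named consensus."""
    return re.fullmatch(CONSENSUS[name], segment) is not None


# Motif segments quoted in the text for A0A4U0NJT4 and A0A944KY65.
checks = {
    ("DDxxD", "DDHFD"): matches_consensus("DDxxD", "DDHFD"),
    ("NDxxS/TxxxE", "NDIVSYNRE"): matches_consensus("NDxxS/TxxxE", "NDIVSYNRE"),
    ("DDxxD", "DDYFS"): matches_consensus("DDxxD", "DDYFS"),
    ("DxDD", "DVEG"): matches_consensus("DxDD", "DVEG"),
}
for (name, segment), ok in checks.items():
    print(f"{segment} vs {name}: {ok}")
```

Under this literal reading, DDHFD and NDIVSYNRE fit their consensus motifs, whereas DDYFS (final Asp replaced by Ser) and DVEG do not, consistent with the description of the truncated α domain and of the two sequences lacking a strict DxDD motif.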
To confirm the bifunctional activity of the putative DCS from S. albulus, we synthesized a codon-optimized gene of babs and heterologously expressed it in our engineered E. coli GGPP overproduction system.21 Initially using thin-layer chromatography and high-performance liquid chromatography (HPLC), we identified a single major product, 2, in the organic extract. Parallel analysis by gas chromatography-mass spectrometry (GC-MS) revealed two peaks, 2 and an additional minor product, with M+ m/z values of 272 (Fig. 2). We then performed a large-scale (12 L) fermentation to isolate 2 for structural elucidation. Using 1D and 2D nuclear magnetic resonance (NMR) spectroscopy (Table S2 and Fig. S5–S11†), we unambiguously determined that the planar structure of 2 was abieta-7,13-diene. The NMR spectra revealed two double bonds with two olefinic protons, five methyl groups, six methylenes, three methines, and two sp3 quaternary carbons; thus, 2 was tricyclic. The overall 1H spectrum resembled that of a 6/6/6-tricyclic abietadiene skeleton with an isopropyl moiety confirmed by its two doublet methyls at C-16 (δH 1.01 ppm; J = 6.9 Hz) and C-17 (δH 1.02 ppm; J = 6.9 Hz) and the heptet methine at C-15 (δH 2.18 ppm; J = 6.9 Hz). 1H–1H COSY analysis revealed four spin systems connecting fragments of the A (C-1–C-2–C-3), B (C-5–C-6–C-7), and C (C-9–C-11–C-12) rings as well as the isopropyl moiety. Key 1H–13C HMBC correlations, including those seen from H-7 to C-5, C-6, and C-9, H-14 to C-7, C-8, C-9, and C-15, and H-15 to C-12 and C-13, placed the conjugated diene in the B and C rings with the isopropyl moiety attached at C-13. 2D NOESY correlations between CH3-20 and H-9 supported that they were on the same face, suggesting a syn or syn-ent configuration. Vibrational circular dichroism (VCD)22,23 of 2, in comparison with the DFT-calculated spectra of all eight possible stereoisomers, confirmed that the absolute configuration of 2 is 5S,9S,10S, and thus syn-abieta-7,13-diene (Fig. S12, ESI†).
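The step from the NMR carbon inventory to "tricyclic" is a short bookkeeping exercise, sketched below. The carbon-type counts are taken from the description above; the two non-protonated olefinic carbons are an inference (two C=C bonds give four olefinic carbons, two of which carry the olefinic protons), and the variable names are illustrative.

```python
# (count, attached hydrogens) for each carbon type reported from the NMR data.
# "olefinic C" is inferred: four olefinic carbons minus the two bearing protons.
carbon_types = {
    "CH3": (5, 3),
    "CH2": (6, 2),
    "sp3 CH": (3, 1),
    "sp3 quaternary C": (2, 0),
    "olefinic CH": (2, 1),
    "olefinic C": (2, 0),
}

n_carbon = sum(count for count, _ in carbon_types.values())
n_hydrogen = sum(count * h for count, h in carbon_types.values())

# Nominal mass of a hydrocarbon CxHy, using C = 12 and H = 1.
nominal_mass = 12 * n_carbon + n_hydrogen

# Degrees of unsaturation for a hydrocarbon: (2C + 2 - H) / 2.
degrees_of_unsaturation = (2 * n_carbon + 2 - n_hydrogen) // 2

n_double_bonds = 2
n_rings = degrees_of_unsaturation - n_double_bonds

print(f"C{n_carbon}H{n_hydrogen}, nominal mass {nominal_mass}")
print(f"degrees of unsaturation: {degrees_of_unsaturation}, rings: {n_rings}")
```

The inventory sums to C20H32 with a nominal mass of 272, in line with the observed M+ value, and the five degrees of unsaturation minus two double bonds leave three rings.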
We did not isolate the minor product due to low yield, but during our study, the highly homologous StrDCS (100% coverage, 99% ID, 98% similarity) was reported to mainly produce syn-abieta-11,13(15)-diene (3) along with several minor diterpenes including palustradiene (4), the 8,13-diene isomer of 2.18 The EI fragmentation and retention index of our minor peak suggested that the minor peak may be 4 (Fig. S13†),24 but a comparison of the NMR spectra of 2 and 3 suggests they may be the same isomer (note: a correction to ref. 18 revising the structure of 3 to 2 is being prepared – personal communication with Prof. Reuben Peters). There are no significant active site differences between StrDCS and bAbS that would help explain the differences in product formation (Fig. S14†).
The biosynthesis of syn-abietadiene requires a type II domain that forms syn-copalyl diphosphate from GGPP and a type I domain that then yields the tricyclic skeleton. To confirm the bifunctional nature of bAbS, we performed site-directed mutagenesis to knock out the function of each domain (Fig. 2). The bAbS D282A variant, removing the catalytic Asp in the γβ active site, produced a new major product, 5, and only a trace amount of 2. Diterpene 5 was identified as geranylgeraniol (GGOH) based on EI fragmentation and retention index. The analogous mutation in ChjDCS, D299A, also yielded 5 while abolishing the production of 1 (Fig. S15†). Given that this Asp to Ala mutation (D282A in bAbS or D299A in ChjDCS) abolishes type II activity, 5 is most likely the result of endogenous E. coli phosphatases acting on excess GGPP. The bAbS D589A variant, removing the final Asp in the DDxxD motif of the α domain, did not produce 2 or 5; instead, a new product, 6, displaying an M+ m/z value of 290 appeared; this was identified as syn-copalol and matched the identification of 6 from StrDCS D585A.18 These data support that bAbS, like plant TPSs, possesses two TS active sites that independently act to transform GGPP into 2.
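The logic of the knockout experiments above can be summarized as a small decision table. This is a deliberately simplified model of the major products only: it ignores trace products, assumes that any accumulating diphosphate is converted to the corresponding alcohol by endogenous phosphatases (stated in the text for GGPP, assumed here for syn-copalyl diphosphate), and uses a hypothetical function name.

```python
def expected_major_product(type_ii_active: bool, type_i_active: bool) -> str:
    """Major product expected from bAbS in the GGPP-overproducing host."""
    if not type_ii_active:
        # GGPP is never cyclized; it is dephosphorylated to the alcohol.
        return "geranylgeraniol (5)"
    if not type_i_active:
        # syn-copalyl diphosphate forms but is not ionized by the type I site.
        return "syn-copalol (6)"
    return "syn-abieta-7,13-diene (2)"


# Variants described in the text: (type II active, type I active).
variants = {
    "wild type": (True, True),
    "D282A": (False, True),
    "D589A": (True, False),
}
for name, (type_ii, type_i) in variants.items():
    print(f"{name}: {expected_major_product(type_ii, type_i)}")
```

The order of the checks mirrors the order of the reactions: without type II activity there is no bicyclic intermediate for the type I site to act on, so the state of the α domain does not change the predicted outcome in that branch.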
The cyclization mechanisms of the abietane and pimaradiene diterpene skeletons both go through an isopimara-15-en-8-yl cationic intermediate (Fig. S16†). In an elegant study with abietadiene synthase from Abies grandis (AgAS), a single amino acid change, A723S in AgAS, was found to switch AgAS into a pimaradiene synthase (Fig. 3A).25 Intriguingly, sequence alignment of bAbS and ChjDCS with AgAS, abietadiene synthase from Picea abies (PaAS), and isopimaradiene synthase from Picea abies (PaPS) revealed that ChjDCS retained the Ser found in PaPS while bAbS had a Gly (Fig. 3B). As such, we were presented with a unique opportunity to investigate whether this facile switch present in the plant descendants originated in the ancestral, bacterial progenitors involved in abietadiene and isopimaradiene biosynthesis. We speculated that a similar single residue switch may be possible in these bacterial DCSs. We generated the bAbS G692S and ChjDCS S688A variants and screened them for activity in E. coli (Fig. 3C). However, no such single residue switch was observed, i.e., bAbS G692S did not produce syn-isopimaradiene and ChjDCS S688A did not produce n-abietadiene. The activity of ChjDCS was only slightly altered, as 1 remained the major product with two new minor products, 5 and n-copalol (7), appearing. The mutation in bAbS significantly disrupted the overall activity, resulting in the loss of 2 and the production of GGOH (5) and syn-copalol (6). More studies are needed to understand the sequence–structure–function relationships of the bacterial DCSs; however, it is evident that the rules that guide product formation in plant TPSs do not seamlessly translate to DCSs.
With plant-like tridomain bifunctional DCSs now known in bacteria, we were interested in what biosynthetic pathways these enzymes may be involved in. There are only a handful of plant-like diterpenoid natural products known to be produced by bacteria, e.g., gibberellins, gifhornenolones;26,27 however, these biosynthetic pathways encode, or have been proposed to encode, two distinct TSs. Biosyntheses leading to bacterial-specific natural products may begin with plant-like diterpenes but shunt away into different pathways, such as (thio)platensimycin and (thio)platencin,28,29 or those recently found in cyanobacteria.30 We examined the predicted biosynthetic gene clusters (BGCs) of each of the bacterial DCSs (Fig. 4). Overall, the BGCs did not share much organizational homology, even between DCSs that were highly similar in sequence. Many of these BGCs also do not resemble typical terpenoid clusters, which normally include a plethora of oxidative enzymes: only two clusters encoded cytochrome P450 enzymes and no other clear oxygenases were present. These observations may suggest that these bacteria are not making complex diterpenoids based on the products of these DCSs or that downstream biosynthetic genes are not clustered.
Fig. 4 Genetic neighborhoods of bacterial DCSs. Gene colors were assigned based on predicted functions. Connections with a threshold of 0.3 (corresponding to 30% identity) are shown. The GenBank accession numbers are provided under the DCS name or protein accession number. Figure was made using Clinker and CAGECAT.32,33
The characterization of several tridomain bifunctional DCSs from multiple phyla has led to important insights into diterpene biosynthesis and enzyme evolution. Chen, Peters, and colleagues provided a detailed analysis of DCS and TPS evolution, which will not be discussed further here.18 However, the presence of distinct DCS architectures, αγβ (A0A944KY65) or ααγβ (A0A4U0NJT4) vs. the typical γβα organization, implicates multiple evolutionary paths to bifunctional terpene synthases in bacteria.
When preparing this manuscript, we re-BLASTed ChjDCS and StrDCS against the NCBI database, restricting to bacterial proteins, and found three additional hits with γβα structures, supporting a growing population of these bacterial enzymes (Table S1, Fig. S1–S4† and Fig. 4): WP_325534023 from Chitinophaga sp. (99% coverage, 35% ID, 52% similarity with ChjDCS), WP_376174804 from Streptomyces noursei (100% coverage, 98% ID, 98% similarity with StrDCS), and WP_327596760 from Streptomyces chartreusis (100% coverage, 73% ID, 81% similarity with StrDCS). In addition, while mining the protein databases, we found many other predicted proteins that harbor type I or II TS domains along with other unexpected domains (e.g., PF01593 amino oxidase in A0A4U8Z138 from Methylocella tundrae or PF01048 purine nucleoside phosphorylase in A0A6G2N0A4 from Streptomyces sp. SID4946). One such characterized example is the drimenol synthase from Aquimarina spongiae, which has a haloacid dehalogenase (HAD)-like hydrolase domain N-terminal to a type II TC β domain.31 Interestingly, the putative DCS A0A5Q0GZZ2 is found two genes away from a putative type II γβ didomain fused to an N-terminal HAD-like hydrolase domain (A0A5Q0H0C6; Fig. 4). These preliminary bioinformatics data support that genetic fusion of terpene-related genes yielding multi-functional enzymes may be a common occurrence in bacteria and provide opportunities for future research.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5ob00724k
This journal is © The Royal Society of Chemistry 2025