Open Access Article
Tohru Abe a, Haruna Shiratori a, Kosuke Kashiwazaki b, Kazuma Hiasa c, Daijiro Ueda a, Tohru Taniguchi d, Hajime Sato *ce, Takashi Abe *b and Tsutomu Sato *a
a Department of Life and Food Sciences, Graduate School of Science and Technology, Niigata University, Ikarashi 2-8050, Nishi-ku, Niigata, 950-2181, Japan. E-mail: satot@agr.niigata-u.ac.jp
b Department of Electrical and Information Engineering, Graduate School of Science and Technology, Niigata University, Ikarashi 2-8050, Nishi-ku, Niigata, 950-2181, Japan. E-mail: takaabe@ie.niigata-u.ac.jp
c Interdisciplinary Graduate School of Medicine and Engineering, University of Yamanashi, 4-4-37 Takeda, Kofu, Yamanashi 400-8510, Japan. E-mail: hsato@yamanashi.ac.jp
d Frontier Research Center for Advanced Material and Life Science, Faculty of Advanced Life Science, Hokkaido University, North 21 West 11, Sapporo 001-0021, Japan
e PRESTO, Japan Science and Technology Agency, Kawaguchi, Saitama 332-0012, Japan
First published on 5th June 2024
Non-canonical terpene synthases (TPSs) with primary sequences that are unrecognizable as canonical TPSs have evaded detection by conventional genome mining. This study aimed to prove that novel non-canonical TPSs can be efficiently discovered from proteins, hidden in genome databases, predicted to have 3D structures similar to those of class I TPSs. Six types of non-canonical TPS candidates were detected using this search strategy from 268 genome sequences from actinomycetes. Functional analyses of these candidates revealed that at least three types were novel non-canonical TPSs. We propose classifying the non-canonical TPSs as classes ID, IE, and IF. A hypothetical protein MBB6373681 from Pseudonocardia eucalypti (PeuTPS) was selected as a representative example of class ID TPSs and characterized. PeuTPS was identified as a diterpene synthase that forms a 6/6/6-fused tricyclic gersemiane skeleton. Analyses of PeuTPS variants revealed that amino acid residues within new motifs [D(N/D), ND, and RXXKD] located close to the class I active site in the 3D structure were essential for enzymatic activity. The homologs of non-canonical TPSs found in this study exist in bacteria as well as in fungi, protists, and plants, and the PeuTPS gene is not located near terpene biosynthetic genes in the genome. Therefore, structural-model-based genome mining is an efficient strategy to search for novel non-canonical TPSs that are independent of biological species and biosynthetic gene clusters and will contribute to expanding the structural diversity of terpenoids.
000 compounds.1 Terpene synthases (TPSs; also referred to as terpene cyclases) are responsible for generating the structural diversity of cyclic skeletons and are classified into classes I and II based on differences in their cyclization initiation mechanisms (elimination of diphosphate or protonation of double bonds/epoxides, respectively) and active site motifs (DDXXD + NSE/DTE and DXDD).2,3 The active site motifs in class I TPSs, DDXXD and NSE/DTE, bind the diphosphate group of the substrate via Mg²⁺ to eliminate the diphosphate group.2,3 The effector triad, composed of Arg, Asp, and Gly residues, is involved in substrate recognition and carbocation stabilization and is conserved in class I TPSs.4,5 Genome mining using sequence-homology tools such as BLAST and hidden Markov models, which are based on the overall and motif sequences of these enzymes, has led to the discovery of many novel terpenes from various organisms such as bacteria, fungi, and plants.5–9 In contrast, several non-canonical (or unconventional) TPSs with primary sequences that are unrecognizable as canonical classes I and II and with no known active site motifs have recently been discovered.10–16 Notably, the 3D structures of two types of non-canonical TPSs (BalTS from Bacillus alcalophilus and AsR6 from Acremonium strictum) that catalyze class I type reactions are similar to those of class I TPSs (Table S1†).13,16 We have proposed classifying the BalTS homolog family into class IB.13,14 In the class IB TPS (BalTS), the six Asp residues in the DYLDNLXD and DY(F,L,W)IDXXED motifs occupy positions sterically close to those in class I, and these have been proven to be catalytic residues.13,14 Furthermore, in AsR6 (named class IC in this study), Asp and Lys residues are reported to be located in positions similar to those in classes I and IB and to be involved in diphosphate group recognition.16 The class IB and IC TPSs were discovered through comprehensive analyses of a gene-disruption-strain library and analysis of a related biosynthetic gene cluster, respectively.12,15 Further discoveries of non-canonical TPSs should contribute to expanding terpenoid diversity; however, the number of non-canonical TPSs that can be found by these methods is limited, and efficient approaches that target a wide range of genes have not yet been developed. This study aimed to prove that novel non-canonical TPSs can be efficiently discovered from proteins, hidden in genome databases, predicted to have 3D structures similar to those of class I TPSs.
MMAR2565 from Mycobacterium marinum, discovered using the above search, was selected as the representative for analyzing type 1. For types 2–6, because the source organisms of the detected genes were difficult to obtain, homologs from easily available bacterial strains or genomes were analyzed (Fig. 1e, Table S3, Fig. S1†). In addition, the functions of type-1 (MMAR2565) homologs were analyzed to confirm that the homologs distributed in various bacterial species (E value < 10⁻⁵: currently 120 species; Fig. 2a) are non-canonical TPSs (Fig. 1e). Seven type-1 homologs (PeuTPS, NioTPS, CmeTPS, HtsTPS, SceTPS, SpaTPS, and HauTPS) from various phylogenetic groups, including non-actinomycetes (Fig. 2a, Table S3†), were selected for functional analysis.
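The E-value cutoff used above to define homologs can be applied programmatically. The following is a minimal sketch, assuming the BLAST hits were exported in the standard 12-column tabular format (`-outfmt 6`), in which column 11 holds the E-value. The function name `filter_hits` and all hit rows are invented for illustration and are not data from this study.

```python
# Sketch: keep BLAST subjects whose best E-value is below a cutoff.
# The cutoff (1e-5) comes from the text; the example rows are hypothetical.

E_VALUE_CUTOFF = 1e-5


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return sorted subject IDs whose best E-value is below the cutoff."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]        # column 2: subject sequence ID
        e_value = float(fields[10])   # column 11: E-value
        if subject_id not in best or e_value < best[subject_id]:
            best[subject_id] = e_value
    return sorted(s for s, e in best.items() if e < cutoff)


# Hypothetical hits: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore
example = "\n".join([
    "query1\thitA\t45.2\t300\t150\t5\t1\t300\t1\t298\t1e-40\t160",
    "query1\thitB\t28.0\t120\t80\t4\t10\t130\t5\t124\t2e-3\t38",
    "query1\thitC\t31.5\t250\t160\t6\t3\t252\t2\t249\t3e-7\t55",
    "query1\thitD\t25.0\t60\t45\t0\t20\t80\t15\t74\t0.5\t27",
])

print(filter_hits(example))
```

Counting distinct species, as in the text, would additionally require a mapping from subject IDs to taxa, which is not part of the default tabular columns.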
:1 mixture of all-E-geranylfarnesyl diphosphate (GFPP, C25)/all-E-hexaprenyl diphosphate (HexPP, C30), enzymatically synthesized by E-IDSs, and with the exception of three non-canonical TPS candidates (types 2–4), all proteins demonstrated activity toward prenyl diphosphate substrates to form dephosphorylated products, similar to class I type reactions (Fig. 2b and S3†). These proteins, with the exception of the less active MMAR2565 (Fig. 2b), were confirmed by gas chromatography-mass spectrometry (GC-MS) analysis to produce unknown terpenes (Fig. S4 and S5†), demonstrating their function as TPSs. In addition to diterpene synthases (type 1: PeuTPS, NioTPS, CmeTPS, HtsTPS, SceTPS, SpaTPS, and HauTPS; type 5: SalTPS), a sesquiterpene synthase (type 6: NpsTPS) was found (Fig. 2b, S3–S5†). Because these findings come from a single test reaction performed under a single condition, further experiments under various reaction conditions are needed to determine whether the type 2–4 proteins exhibit enzymatic activity. Here, we propose naming the three types of non-canonical TPSs (types 1, 5, and 6) classes ID, IE, and IF, respectively. BLAST searches found 120, 823, and 3 homologs of class ID, IE, and IF TPSs, respectively (Fig. 2a). Notably, the homologs of non-canonical TPSs discovered in this study exist in bacteria as well as in fungi, protists, and plants (Fig. 2a).
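The link between substrate chain length and the terpene class named above (diterpene, sesquiterpene) follows from counting C5 isoprene units. A small sketch of that bookkeeping is given below. The carbon counts for GFPP (C25) and HexPP (C30) are from the text, while those for GPP, FPP, and GGPP are the standard values and are an addition here. The function name `terpene_class` is hypothetical.

```python
# Sketch: derive the terpene class of a product from its prenyl diphosphate substrate.

ISOPRENE_UNIT_CARBONS = 5

# Carbon count of the prenyl chain of each substrate.
SUBSTRATE_CARBONS = {
    "GPP": 10,    # geranyl diphosphate (standard value, not stated in this section)
    "FPP": 15,    # farnesyl diphosphate (standard value, not stated in this section)
    "GGPP": 20,   # geranylgeranyl diphosphate (standard value)
    "GFPP": 25,   # geranylfarnesyl diphosphate (from the text)
    "HexPP": 30,  # hexaprenyl diphosphate (from the text)
}

# Conventional class names by number of isoprene units.
CLASS_BY_UNITS = {
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",
    5: "sesterterpene",
    6: "triterpene",
}


def terpene_class(substrate):
    """Return the terpene class expected for products formed from the substrate."""
    units = SUBSTRATE_CARBONS[substrate] // ISOPRENE_UNIT_CARBONS
    return CLASS_BY_UNITS[units]


for name in SUBSTRATE_CARBONS:
    print(name, terpene_class(name))
```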
To demonstrate that non-canonical TPSs absent from biosynthetic gene clusters can also be discovered with this strategy, the class ID homolog PeuTPS, which is not located near terpene biosynthetic gene clusters including E-IDS (Fig. S6†), was chosen as a representative example and characterized in detail using GGPP, the most reactive substrate. GC-MS analysis revealed that PeuTPS produced two products (1:2 = ca. 85:15) from GGPP both in vitro and in vivo (Fig. 3a, S4, and S5†). Products 1 and 2 were isolated from E. coli cultures co-expressing PeuTPS, GGPP synthase,18 and mevalonate pathway enzymes.19 The chemical structures of 1 and 2 were determined using mass spectrometry and nuclear magnetic resonance (NMR) (Fig. 3b, S7–S23†). Analyses of the HMBC and ¹H,¹H-COSY spectra showed that both 1 and 2 have 6/6/6-fused tricyclic skeletons and that the structural difference between 1 and 2 is the position of the double bond in the C ring (Fig. 3b, S7, and S16†). The relative stereochemistry of the stereogenic centers at positions 1, 2, 7, and 14 in 1 was determined using NOE correlations (H1–H15, H1–H16, H1–H19, H8equatorial–H19, and H2–H8axial) (Fig. 3b and S7†). Since no H1–H14 correlation was observed in ¹H,¹H-COSY, the H1–H14 dihedral angle was inferred to be close to 90°. The NMR data and 3D structures calculated by the conformational search supported the relative stereochemistry at position 14 in 1 (Fig. 3b, S7, and S15†). The relative stereochemistry of the stereogenic centers at positions 1, 2, 7, 10, and 14 in 2 was determined using NOE correlations (H1–H16, H1–H19, H10–H19, H8equatorial–H19, H2–H8axial, and H2–H14) (Fig. 3b and S16†). The absolute configuration of 1 was determined as 1S,2R,7S,14S using vibrational circular dichroism (VCD) spectroscopy (Fig. S24†). Owing to the low isolated amount, the stereochemistry of 2 could not be analyzed using VCD spectroscopy. However, since both 1 and 2 are produced by the same enzyme, PeuTPS, the absolute configuration of 2 should be similar to that of 1, as shown in Fig. 3b. Compounds 1 and 2 have not been reported as natural products. To the best of our knowledge, the only known natural products with the same cyclic skeleton (gersemiane skeleton) as 1 and 2 are 11 compounds found in corals, seagrass, liverwort, and marine cyanobacteria (Fig. S25†).20–25 We propose the names peugersemienes A and B for compounds 1 and 2, respectively. This is the first report of a TPS forming the gersemiane skeleton of 1 and 2 (Fig. 3c). We propose the reaction pathways of 1 and 2 based on DFT calculations (Fig. S26 and S27†); however, further analysis may be necessary to elucidate this mechanism. The production of 1 and 2 by P. eucalypti culture was not observed in this study, suggesting either that the PeuTPS gene is not expressed under normal laboratory culture conditions or that both 1 and 2 are further converted to unknown compounds. Several homologs of class ID, IE, and IF TPSs are clustered with genes of tailoring enzymes such as P450 in the genomes (Fig. S6†), suggesting that these TPSs may serve as core enzymes in undiscovered natural product biosynthesis. We are currently analyzing the structures of products synthesized by the class ID, IE, and IF TPSs other than PeuTPS and will report the diverse structures of these terpenes in the near future.
Fig. 4 Structure of PeuTPS. (a) 3D structure of PeuTPS predicted by AlphaFold2. (b) Superposition of the 3D structure of PeuTPS (pink and gray) with the class I TPS domain of AgBIS (green), BalTS (class IB; blue), and AsR6 (class IC; yellow). The model of PeuTPS has a sufficient confidence level in the region superposed with other TPSs (Fig. S1†). Additionally, the RMSD of PeuTPS and the class I TPS domain of AgBIS (3.1) is comparable to that of the crystal structures, AgBIS–BalTS and AgBIS–AsR6 (both 3.7). (c) Proposed catalytic residues of PeuTPS (pink) superposed with those of AgBIS (green), BalTS (blue), and AsR6 (yellow). A side chain of farnesyl thiopyrophosphate and Mg²⁺ ions co-crystallized with AgBIS are represented as gray sticks and gray spheres, respectively. (d) Conservation of proposed catalytic residues in class ID TPSs. The height of the symbols in the sequence logos indicates the sequence conservation in 120 homologs at that position.
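The RMSD values quoted in the caption summarize how far matched atoms lie from each other after superposition. The following is a minimal sketch of that calculation, assuming the two structures are already superposed and their atoms are paired one-to-one. The coordinates are made up, and the atom selection and alignment software used in the study are not specified here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared Euclidean distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical paired atoms: every atom in `reference` is shifted by 3 along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0), (2.0, 0.0, 3.0)]

print(rmsd(model, reference))  # uniform 3-unit displacement gives 3.0
```

In practice a superposition step (for example, a least-squares fit) precedes this calculation; it is omitted here to keep the sketch dependency-free.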
The catalytic residues in the motifs of class I (DDXXD and NSE/DTE) and class IB [DYLDNLXD and DY(F,L,W)IDXXED] and the Asp/Lys residues of class IC occupy similar positions in the 3D structures (Fig. 4c). Seven residues (D95, N96, N289, D290, R338, K341, and D342) of PeuTPS are also located sterically close to the active site of known TPSs (Fig. 4c), and the same or similar residues are conserved in class ID TPSs (Fig. 4d and S29†). The new motifs [D(N/D), ND, and RXXKD] are clearly different from those of class I, IB, and IC TPSs (Fig. 4d). Therefore, we analyzed the enzymatic activities of PeuTPS variants in which these seven residues were individually replaced with Ala. All variants were inactivated or exhibited a significant reduction in activity (Fig. 5a), and the results were similar to previous analyses of BalTS (class IB) Ala variants targeting the six catalytic Asp residues.13 In addition, docking simulation of the GGPP substrate into the PeuTPS model structure suggested that the cavity around the residues of the motifs [D(N/D), ND, and RXXKD] is the only space in which the substrate can bind (Fig. S30†). Therefore, the position of the active site in PeuTPS, as well as its overall 3D structure, may be similar to that of classes I, IB, and IC. In addition, Mg²⁺ was essential for PeuTPS activity (Fig. 5b), suggesting that Mg²⁺ may contribute to the binding of the diphosphate group of the substrate, similar to most class I, IB, and IC TPSs. We propose that PeuTPS catalyzes not only diphosphate elimination but also protonation in the second step (Fig. S26†). However, its catalytic mechanism remains unclear. Furthermore, the effector triad strictly conserved in class I TPSs was not found in the 3D structural model of PeuTPS (Fig. S31†). To elucidate the detailed catalytic mechanism of PeuTPS, the actual 3D structure of PeuTPS or its homolog should be determined in the future. In addition, we are currently analyzing the products of PeuTPS homologs in detail, and a comparison of their active site structures will help us understand the catalytic mechanism.
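The three motifs described above, D(N/D), ND, and RXXKD, can be written as simple patterns and located in a protein sequence. The sketch below is illustrative only: the function name `find_motifs` and the toy sequence are invented, and because the motifs are very short, matches on sequence alone will occur by chance. In the study the residues were identified by their position near the active site in the 3D model, not by pattern matching.

```python
import re

# Motif names from the text mapped to regular expressions (X = any residue).
MOTIFS = {
    "D(N/D)": r"D[ND]",
    "ND": r"ND",
    "RXXKD": r"R..KD",
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]}, including overlapping matches."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead makes each match zero-width so overlapping sites are all reported.
        lookahead = re.compile(f"(?=({pattern}))")
        hits[name] = [m.start() + 1 for m in lookahead.finditer(sequence)]
    return hits


# Toy sequence (not PeuTPS): DN at position 3, ND at position 9, RTTKD at position 14.
toy_sequence = "MADNGSLLNDAAARTTKDE"

print(find_motifs(toy_sequence))
```

A more faithful check for a real homolog would align it to PeuTPS and inspect the columns corresponding to D95/N96, N289/D290, and R338/K341/D342.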
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc01381f
This journal is © The Royal Society of Chemistry 2024