Structural-model-based genome mining can efficiently discover novel non-canonical terpene synthases hidden in genomes of diverse species†
Abstract
Non-canonical terpene synthases (TPSs) with primary sequences that are unrecognizable as canonical TPSs have evaded detection by conventional genome mining. This study aimed to prove that novel non-canonical TPSs can be efficiently discovered by searching genome databases for hidden proteins predicted to have 3D structures similar to those of class I TPSs. Using this search strategy, six types of non-canonical TPS candidates were detected in 268 actinomycete genome sequences. Functional analyses of these candidates revealed that at least three types were novel non-canonical TPSs. We propose classifying these non-canonical TPSs as classes ID, IE, and IF. A hypothetical protein, MBB6373681 from Pseudonocardia eucalypti (PeuTPS), was selected as a representative example of class ID TPSs and characterized. PeuTPS was identified as a diterpene synthase that forms a 6/6/6-fused tricyclic gersemiane skeleton. Analyses of PeuTPS variants revealed that amino acid residues within new motifs [D(N/D), ND, and RXXKD] located close to the class I active site in the 3D structure were essential for enzymatic activity. Homologs of the non-canonical TPSs found in this study exist not only in bacteria but also in fungi, protists, and plants, and the PeuTPS gene is not located near terpene biosynthetic genes in the genome. Therefore, structural-model-based genome mining is an efficient strategy for finding novel non-canonical TPSs that does not depend on biological species or biosynthetic gene clusters, and it will contribute to expanding the structural diversity of terpenoids.
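As a rough illustration of the motif notation above, the following minimal Python sketch scans a protein sequence for the three motifs. It is not the search method used in the study, which relies on predicted 3D structures rather than sequence patterns. The regular-expression reading of the notation (D(N/D) as D followed by N or D, and X as any residue), the function name `find_motifs`, and the toy sequence are all assumptions made for illustration; a sequence hit alone says nothing about whether a motif lies near the class I active site.

```python
import re

# Assumed regex reading of the motif notation in the abstract:
# "D(N/D)" -> D followed by N or D; "X" -> any single residue.
MOTIF_PATTERNS = {
    "D(N/D)": "D[ND]",
    "ND": "ND",
    "RXXKD": "R[A-Z]{2}KD",
}


def find_motifs(sequence):
    """Return {motif name: [1-based start positions]}, overlaps included."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIF_PATTERNS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not the PeuTPS sequence.
    toy_sequence = "MADNGRAAKDLLNDDW"
    for motif, positions in find_motifs(toy_sequence).items():
        print(motif, positions)
```

Because these motifs are very short, they will occur by chance in many unrelated proteins, so a scan like this is at most a quick sanity check on candidates already selected by structural similarity.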
- This article is part of the themed collection: 2024 Chemical Science HOT Article Collection