Genome Mining for Enzyme Discovery
The emergence of high-throughput sequencing (next-generation sequencing, NGS) in the mid-2000s has generated an enormous number of gene and protein sequences, now deposited in sequence databases, and that number continues to grow. Nucleic acid and protein databases are therefore gold mines for discovering novel enzymes. The first step in discovering a novel enzyme for a target reaction is to find a gene encoding an enzyme that catalyzes that chemical transformation. There are many ways to achieve this goal. This chapter describes a panel of methods based on different approaches: text mining, sequence identity, the 3D structure of the active site, or a combination thereof.