Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods


The zinc (Zn²⁺) cofactor is involved in numerous biological mechanisms, and zinc binding is recognized as one of the most important post-translational modifications of proteins. Accurate knowledge of zinc ions in protein structures can therefore provide clues for elucidating protein folding and function. However, determining zinc-binding residues experimentally is usually labor-intensive and costly. In this context, computational tools for identifying zinc-binding sites are highly desirable, especially in the current post-genomic era. In this work, we developed a novel zinc-binding site prediction method by combining several intensively trained machine learning models. To establish an accurate and generalizable method, we downloaded all zinc-binding proteins from the Protein Data Bank and prepared a non-redundant dataset. A well-prepared dataset from other groups was also used. Effective and complementary features were then extracted from the sequences and three-dimensional structures of these proteins, and several well-designed machine learning models were intensively trained on them. To assess performance, the resulting predictors were stringently benchmarked on diverse zinc-binding sites. Several state-of-the-art in silico methods developed specifically for zinc-binding sites were also evaluated and compared. The results confirmed that our method is very competitive in real-world applications and can serve as a complementary tool to wet-lab experiments. To serve the community, a web server and a stand-alone program implementing our method were constructed and are publicly available at http://bioinformatics.fzu.edu.cn/znMachine.html. The downloadable program can easily be used for high-throughput screening of potential zinc-binding sites across proteomes.
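The abstract does not give implementation details, so the following is only a minimal sketch of how residue-level prediction from a sequence profile with several combined models might look. It is not the authors' implementation. The candidate residue set (Cys, His, Asp, Glu), the window size, the zero padding, the score averaging, and all function names are assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: only Cys, His, Asp and Glu are scored as candidate zinc ligands.
CANDIDATES = frozenset("CHDE")

Model = Callable[[List[float]], float]  # hypothetical: maps a feature vector to a score


def window_features(profile: Sequence[Sequence[float]],
                    center: int, half_width: int) -> List[float]:
    """Concatenate profile rows in a window around `center`.

    Positions that fall outside the sequence are padded with zeros.
    """
    n_cols = len(profile[0])
    feats: List[float] = []
    for i in range(center - half_width, center + half_width + 1):
        if 0 <= i < len(profile):
            feats.extend(profile[i])
        else:
            feats.extend([0.0] * n_cols)
    return feats


def ensemble_scores(sequence: str,
                    profile: Sequence[Sequence[float]],
                    models: Sequence[Model],
                    half_width: int = 2) -> List[Tuple[int, str, float]]:
    """Return (position, residue, mean model score) for each candidate residue."""
    if len(sequence) != len(profile):
        raise ValueError("profile must have one row per residue")
    if not models:
        raise ValueError("at least one model is required")
    results: List[Tuple[int, str, float]] = []
    for pos, res in enumerate(sequence):
        if res not in CANDIDATES:
            continue
        feats = window_features(profile, pos, half_width)
        # Assumption: models are combined by simple averaging of their scores.
        mean_score = sum(m(feats) for m in models) / len(models)
        results.append((pos, res, mean_score))
    return results
```

In practice each entry in `models` would wrap a trained classifier and each profile row would hold per-residue values such as position-specific scores; here both are placeholders.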

Publication details

The article was received on 12 Mar 2019, accepted on 15 Apr 2019, and first published on 16 Apr 2019.

Article type: Research Article
DOI: 10.1039/C9MO00043G
Citation: Mol. Omics, 2019, Accepted Manuscript

    Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods

    R. Yan, X. Wang, Y. Tian, J. Xu, X. Xu and J. Lin, Mol. Omics, 2019, Accepted Manuscript, DOI: 10.1039/C9MO00043G
