Issue 10, 2014

An integrative computational model for large-scale identification of metalloproteins in microbial genomes: a focus on iron–sulfur cluster proteins

Abstract

Metalloproteins represent a ubiquitous group of molecules that are crucial to the survival of all living organisms. While several metal-binding motifs have been defined, it remains challenging to confidently identify metalloproteins from primary protein sequences using computational approaches alone. Here, we describe a comprehensive strategy based on a machine learning approach to design and assess a penalized generalized linear model. We used this strategy to detect members of the iron–sulfur cluster protein family. A new category of descriptors, built from profile hidden Markov models and encoding structural information, was combined with public descriptors in a linear model. The model was trained and tested on distinct datasets composed of well-characterized iron–sulfur protein sequences, and it achieved higher sensitivity than a motif-based approach while maintaining a good level of specificity. Analysis of this linear model allows us to detect and quantify the contribution of each descriptor, providing a better understanding of this complex protein family along with valuable indications for further experimental characterization. Two newly identified proteins, YhcC and YdiJ, were functionally validated as genuine iron–sulfur proteins, confirming the prediction. The computational model was then applied to over 550 prokaryotic genomes to screen for iron–sulfur proteomes; the results are publicly available at: http://biodev.extra.cea.fr/isph. This study represents a proof-of-concept for the application of a penalized linear model to identify metalloprotein superfamilies on a large scale. The application employed here, screening for iron–sulfur proteomes, provides new candidates for further biochemical and structural analysis as well as new resources for an extensive exploration of iron-sulfuromes in the microbial world.
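
To make the idea of a penalized generalized linear model concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the penalty, the link function, or the descriptor set, so everything here is an assumption: an L1 (lasso) penalty, a logistic link, proximal gradient descent as the solver, and four hypothetical descriptors (`motif_score`, `hmm_score`, and two noise columns) drawn from synthetic data. It illustrates how a penalty can shrink uninformative descriptors while the fitted weights quantify each descriptor's contribution, and how sensitivity and specificity can be measured on a separate test set.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_threshold(value, threshold):
    # Proximal operator of the L1 penalty: shrinks a weight toward zero.
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def predict_proba(w, b, x):
    # Probability that a sequence with descriptors x is a positive.
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))

def fit_l1_logistic(X, y, lam=0.05, step=0.5, n_iter=300):
    # Proximal gradient descent for L1-penalized logistic regression.
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b  # the intercept is left unpenalized
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b

def make_toy_data(n, seed):
    # Synthetic stand-in for descriptor vectors (assumption, not real data):
    # columns are [motif_score, hmm_score, noise_1, noise_2].
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        positive = rng.random() < 0.5
        shift = 1.5 if positive else -1.5
        X.append([rng.gauss(shift, 1.0), rng.gauss(shift, 1.0),
                  rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
        y.append(1 if positive else 0)
    return X, y

def sensitivity_specificity(w, b, X, y, cutoff=0.5):
    # Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    tp = fn = tn = fp = 0
    for xi, yi in zip(X, y):
        pred = 1 if predict_proba(w, b, xi) >= cutoff else 0
        if yi == 1:
            tp, fn = tp + pred, fn + (1 - pred)
        else:
            fp, tn = fp + pred, tn + (1 - pred)
    return tp / (tp + fn), tn / (tn + fp)

# Train and evaluate on distinct synthetic datasets.
X_train, y_train = make_toy_data(200, seed=0)
X_test, y_test = make_toy_data(200, seed=1)
weights, intercept = fit_l1_logistic(X_train, y_train)
sens, spec = sensitivity_specificity(weights, intercept, X_test, y_test)

names = ["motif_score", "hmm_score", "noise_1", "noise_2"]
for name, weight in zip(names, weights):
    print(f"{name}: {weight:+.3f}")
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

In this toy setting the two informative descriptors should receive clearly larger weights than the noise columns, which is the sense in which a penalized linear model lets one read off descriptor contributions. Raising `lam` strengthens the shrinkage; the value used here is arbitrary.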

Publication details

The article was received on 06 Jun 2014, accepted on 07 Aug 2014 and first published on 07 Aug 2014


Article type: Paper
DOI: 10.1039/C4MT00156G
Citation: Metallomics, 2014, 6, 1913–1930

    J. Estellon, S. Ollagnier de Choudens, M. Smadja, M. Fontecave and Y. Vandenbrouck, Metallomics, 2014, 6, 1913
    DOI: 10.1039/C4MT00156G