An integrative C. elegans protein–protein interaction network with reliability assessment based on a probabilistic graphical model†
In Caenorhabditis elegans, a large number of protein–protein interactions (PPIs) have been identified by different experiments. However, a comprehensive weighted PPI network, which is essential for signaling pathway inference, is not yet available for this model organism. We therefore first construct an integrative C. elegans PPI network with 12 951 interactions involving 5039 proteins from seven molecular interaction databases. We then propose a reliability score based on a probabilistic graphical model (RSPGM) to assess PPIs. It models the number of interactions between two proteins as a Bernoulli random variable, which rules out multi-links. The main parameter of the RSPGM score involves a few latent variables, which can be interpreted as properties shared by the two proteins. Validations on high-confidence yeast datasets show that RSPGM provides more accurate evaluation than other approaches, and that the PPIs in the reconstructed network have higher biological relevance than those in the original network in terms of gene ontology, gene expression, essentiality and the prediction of known protein complexes. Furthermore, the weighted integrative C. elegans PPI network is used to infer the interaction path of the canonical Wnt/β-catenin pathway. Most genes on the inferred interaction path have been validated as Wnt pathway components. Thus, RSPGM is effective for evaluating PPIs and inferring interaction paths. Finally, the PPI network with RSPGM scores can be queried and visualized on an interactive website, which is freely available at http://rspgm.bionetworks.tk/.
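To illustrate how a reliability-weighted PPI network can support interaction path inference, the following minimal Python sketch finds the path between two proteins that maximizes the product of edge reliability scores. It is an illustration under stated assumptions, not the procedure used in this work: the function name `most_reliable_path`, the toy protein labels and the scores are all hypothetical, and treating edge scores as independent probabilities in (0, 1] is an assumption made only for this example.

```python
import heapq
import math


def most_reliable_path(edges, source, target):
    """Return (path, reliability) maximizing the product of edge scores.

    Assumption: each score lies in (0, 1] and scores are treated as
    independent, so maximizing their product is equivalent to minimizing
    the sum of -log(score). Dijkstra's algorithm applies because every
    cost is non-negative.
    """
    # Build an undirected adjacency list with costs -log(score).
    graph = {}
    for u, v, score in edges:
        if not 0.0 < score <= 1.0:
            raise ValueError(f"score for ({u}, {v}) must be in (0, 1]")
        cost = -math.log(score)
        graph.setdefault(u, []).append((v, cost))
        graph.setdefault(v, []).append((u, cost))

    best = {source: 0.0}   # lowest known cost to each node
    prev = {}              # predecessor on the best known path
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, math.inf):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))

    if target not in best:
        return None, 0.0  # no path connects source and target

    # Walk predecessors back from the target to recover the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-best[target])


# Hypothetical toy network: (protein, protein, reliability score).
toy_edges = [
    ("A", "B", 0.9),
    ("B", "D", 0.8),
    ("A", "C", 0.5),
    ("C", "D", 0.99),
    ("A", "D", 0.3),
]
path, reliability = most_reliable_path(toy_edges, "A", "D")
```

In this toy network the two-step route through B is preferred over the direct A to D edge, because the product of its two high scores exceeds the single low score of the direct interaction.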