The integrated disease network†
Abstract
The growing body of transcriptomic, proteomic, metabolomic and genomic data generated from disease states provides a great opportunity to improve our understanding of the molecular mechanisms that drive individual diseases and those that are shared between diseases. Using both clinical and molecular phenotypes will lead to better disease understanding and classification. In this study, we set out to gain novel insights into diseases and their relationships by utilising knowledge gained from system-level molecular data. We integrated different types of biological data, including genome-wide association study data, disease–chemical associations, biological pathways and Gene Ontology annotations, into an Integrated Disease Network (IDN), a heterogeneous network in which nodes are bio-entities and edges between nodes represent their associations. We also introduced a novel disease similarity measure to infer disease–disease associations from the IDN. Our predicted associations were systematically evaluated against the Medical Subject Headings (MeSH) classification and a statistical measure of disease co-occurrence in PubMed. The strong correlation between our predictions and the co-occurrence associations indicated that our approach can recover known disease associations. Furthermore, we presented a case study of Crohn's disease. We demonstrated that our approach not only identified well-established connections between Crohn's disease and other diseases, but also revealed new, interesting connections consistent with emerging literature. Our approach also gave ready access to the knowledge supporting these new connections, making it a powerful means of exploring connections between diseases.
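To make the idea of a heterogeneous network concrete, the following is a minimal sketch of how disease–disease similarity might be read off shared neighbours in such a network. It is not the similarity measure introduced in this study, which the abstract does not specify; a plain Jaccard index over shared neighbours is used as a stand-in. The node names, the toy edge list, and the functions `build_network` and `jaccard_similarity` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy associations (illustrative only, not data from the study).
# Each edge links a disease node to another bio-entity node.
edges = [
    ("disease:A", "gene:G1"),
    ("disease:A", "pathway:P1"),
    ("disease:B", "gene:G1"),
    ("disease:B", "chemical:C1"),
    ("disease:C", "chemical:C2"),
]


def build_network(edge_list):
    """Build an undirected adjacency map from (node, node) pairs."""
    neighbours = defaultdict(set)
    for u, v in edge_list:
        neighbours[u].add(v)
        neighbours[v].add(u)
    return neighbours


def jaccard_similarity(network, d1, d2):
    """Jaccard index of the neighbour sets of two nodes.

    This is a stand-in measure, assumed here for illustration; it is
    not the measure proposed by the authors.
    """
    n1 = network.get(d1, set())
    n2 = network.get(d2, set())
    union = n1 | n2
    if not union:
        return 0.0
    return len(n1 & n2) / len(union)


network = build_network(edges)
```

In this toy network, `disease:A` and `disease:B` share one neighbour (`gene:G1`) and so receive a non-zero score, whereas `disease:A` and `disease:C` share none and score zero.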
- This article is part of the themed collection: Computational Integrative biology (IB)