Computational Integrative Biology – on the joint analysis of diverse biological data sets

Jan Baumbach *a and Richard Röttger b
a Computational Biology group, Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark. E-mail: jan.baumbach@imada.sdu.dk
b Practical Computer Science in BioMedicine, Department of Mathematics and Computer Science, University of Southern Denmark, 5230 Odense, Denmark

Received 25th September 2014, Accepted 25th September 2014

Jan Baumbach

Jan Baumbach studied Applied Computer Science in the Natural Sciences at Bielefeld University in Germany. His research career started at Rothamsted Research in Harpenden (UK), where he worked on computational methods for the integration of molecular biology data. He returned to the Center for Biotechnology in Bielefeld for his PhD studies on the reconstruction of bacterial transcriptional regulatory networks. He developed CoryneRegNet, the reference database and analysis platform for corynebacterial gene regulation. Afterwards, at the University of California at Berkeley, he worked in the Algorithms group of Richard Karp on protein homology detection. In Berkeley, he also developed Transitivity Clustering, a novel clustering framework for large-scale biomedical data sets. Since March 2010, Jan has been the head of the Computational Systems Biology group at the Max Planck Institute for Informatics and the Cluster of Excellence for Multimodal Computing and Interaction at Saarland University in Saarbrücken, Germany. Recently, he moved with parts of his group to the University of Southern Denmark as the head of the Computational Biology group. His current research concentrates on the combined analysis of biological networks together with OMICs data and the modeling of genetic expression pathways. He is currently establishing Computational Breath Analysis as a bioinformatics discipline dedicated to metabolic biomarker discovery in human exhaled air.


Richard Röttger

Richard Röttger studied Computer Science at the Technical University of Munich and Technology Management at the Center for Digital Technology and Management, a joint venture of the Ludwig-Maximilian-University and the Technical University of Munich. He wrote his thesis about estimating the size and completeness of gene regulatory networks at the University of California at Berkeley. He pursued his PhD on clustering of large-scale biomedical datasets at the Max-Planck-Institute for Informatics at Saarland University in Saarbrücken, Germany. He developed a Transitivity Clustering strategy with the ability to cope with missing values, and he conducted several evolutionary studies on bacterial functional genomics. Recently, he was appointed Assistant Professor for Practical Computer Science in BioMedicine at the University of Southern Denmark. He currently develops methods for modeling complex biological networks in conjunction with multiple OMICs data sets.


This themed issue of Integrative Biology is dedicated to several research areas of computational biology, commonly called bioinformatics. Bioinformatics is a cross-disciplinary computer science research area driven by the necessity of integrating and analyzing huge, often noisy, biomedical OMICs data sets. Nowadays, computer-aided studies for unraveling molecular mechanisms with physiological relevance are no longer the exception. More and more medical questions are being addressed by processing large-scale OMICs data. Typically, drawing conclusions from wet lab data with integrated software approaches alternates with predicting new wet lab targets and designing new experiments. In order to understand how cells survive, reproduce, grow, form organs and alter their behavior in accordance with changing environmental conditions, biological data must be combined appropriately to form a comprehensive picture. This is integrative computational biology. Applied in a well-structured way, it will shape the future of modern biomedicine, as it will influence treatment strategy selection and help to identify new molecular targets.

The following small but diverse selection of articles from different bioinformatics areas describes integrated computational studies that we believe to be interesting, informative and educational for the reader.

An ongoing trend in computational biology is the enrichment of biological networks and/or models with additional biological datasets and a priori knowledge. This enhances the quality of the underlying models and allows available biological datasets to be investigated together in a broader and more meaningful context.

Bender et al. (DOI: 10.1039/C4IB00175C) combine sequence and structure data in order to determine the relationship and the bioactivity of protease inhibitors against the Serine Protease family, using proteochemometric modelling. Sinha et al. (DOI: 10.1039/C4IB00124A) present a method for modeling the important Wnt signaling pathway by employing static Bayesian networks, which allow for the integration of prior biological knowledge, such as epigenetic information and inter/intra-cellular factors. In the work of Sarajlić et al. (DOI: 10.1039/C4IB00125G), the disease–disease interaction between diabetes and aneurysm is investigated by combining different disease-related pathways and genetic information, in order to identify the most relevant proteins in these pathways. Another proficient way of combining different biological datasets is presented in the work of Pauling et al. (DOI: 10.1039/C4IB00137K). The authors integrate multiple OMICs datasets with hybrid interactome networks and identify dysregulated functional network modules with high breast cancer specificity. An even broader approach is taken by Pržulj et al. (DOI: 10.1039/C4IB00122B); the authors combine a multitude of different datasets into an “integrated disease network”. Through a combined analysis of transcriptomic, proteomic, metabolomic and genomic data, this approach yields knowledge on the molecular mechanisms that drive diseases and that are shared between diseases.

All presented analyses ultimately depend on the quality of the underlying networks and of the enrichment datasets. Azevedo et al. (DOI: 10.1039/C4IB00136B) analyze the ability of different BLAST+ metrics to predict protein–protein interactions. The work of Poongavanam et al. (DOI: 10.1039/C4IB00111G) examines how the selection of the initial ligand structures influences the ranking of the binding affinity of different compounds, using a set of HIV-1 RNase H inhibitors.

Another way of benefiting from computational power and biological data is presented in a second contribution of Azevedo et al. (DOI: 10.1039/C4IB00140K). The authors employed computer simulations in order to predict the efficiency of potential new drugs targeted against the two-signal transduction system in the multi-resistant bacterium C. pseudotuberculosis.

Through a rigorous paper selection and review process, we also ensured that the usual 3-I criteria for publication in the Royal Society of Chemistry's Integrative Biology are met: Insight, Innovation, and Integration. We sincerely believe that you will enjoy reading this themed issue on Computational Integrative Biology.


This journal is © The Royal Society of Chemistry 2014