Analysis of multi-source metabolomic data using joint and individual variation explained (JIVE)†
Abstract
Metabolic profiling is increasingly used to understand biological processes, but no single analytical technique provides a complete quantitative or qualitative profile of the metabolome. Data fusion (i.e. the joint analysis of data from multiple sources) has the potential to circumvent this limitation, facilitating knowledge discovery and reliable biomarker identification. Data fusion can also be applied to the simultaneous analysis of metabolomic changes across several biofluids or tissues. However, metabolomics typically deals with large datasets containing hundreds to thousands of variables, and identifying shared and individual factors or structures across multiple sources is challenging because of the high variable-to-sample ratios and the differences in intensity and noise range. In this work we apply a recent method, Joint and Individual Variation Explained (JIVE), to the integrated unsupervised analysis of metabolomic profiles from multiple data sources. This method separates the patterns shared among data sources (i.e. the joint structure) from the individual structure of each data source, which is unrelated to the joint structure. Two examples are described to show the applicability of JIVE to the simultaneous analysis of multi-source data: (i) plasma samples subjected to different analytical techniques, sample treatments and measurement conditions; and (ii) plasma and urine samples measured by liquid chromatography-mass spectrometry under two ionization conditions.
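As an illustration of the separation into joint and individual structure described above, the following is a minimal sketch of a JIVE-style alternating low-rank decomposition. It is not the implementation used in this work. It assumes that each data block is a matrix with variables in rows and the same samples in columns, that the joint and individual ranks are supplied by the user rather than selected automatically, and that each block is row-centred and scaled to unit Frobenius norm beforehand. The function names (`preprocess`, `low_rank`, `jive_sketch`) are hypothetical.

```python
import numpy as np


def preprocess(blocks):
    """Row-centre each block and scale it to unit Frobenius norm.

    Assumed preprocessing, so that blocks with different intensity
    ranges contribute comparably to the joint structure.
    """
    out = []
    for b in blocks:
        b = np.asarray(b, dtype=float)
        b = b - b.mean(axis=1, keepdims=True)
        out.append(b / np.linalg.norm(b))
    return out


def low_rank(M, r):
    """Best rank-r approximation of M and its top-r right singular vectors."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r], Vt[:r]


def jive_sketch(blocks, joint_rank, indiv_ranks, n_iter=200, tol=1e-10):
    """Alternate between estimating joint (J) and individual (A) structure.

    blocks: list of (variables x samples) arrays sharing the same samples.
    Returns two lists, [J_1, ..., J_k] and [A_1, ..., A_k].
    """
    splits = np.cumsum([b.shape[0] for b in blocks])[:-1]
    X = np.vstack(blocks)
    A = np.zeros_like(X, dtype=float)
    prev = np.inf
    for _ in range(n_iter):
        # Joint structure: low-rank fit to the stacked data minus individual parts.
        J, Vt = low_rank(X - A, joint_rank)
        # Projector onto the orthogonal complement of the joint row space.
        P = np.eye(X.shape[1]) - Vt.T @ Vt
        # Individual structure: per-block low-rank fit, orthogonal to the joint part.
        parts = []
        for Xi, Ji, ri in zip(blocks, np.split(J, splits), indiv_ranks):
            Ai, _ = low_rank((Xi - Ji) @ P, ri)
            parts.append(Ai)
        A = np.vstack(parts)
        resid = np.linalg.norm(X - J - A) ** 2
        if abs(prev - resid) < tol:
            break
        prev = resid
    return np.split(J, splits), np.split(A, splits)


# Synthetic usage example: two blocks sharing a rank-2 sample pattern.
rng = np.random.default_rng(0)
n_samples = 30
shared = rng.normal(size=(2, n_samples))
X1 = (rng.normal(size=(40, 2)) @ shared
      + rng.normal(size=(40, 1)) @ rng.normal(size=(1, n_samples))
      + 0.01 * rng.normal(size=(40, n_samples)))
X2 = (rng.normal(size=(60, 2)) @ shared
      + rng.normal(size=(60, 1)) @ rng.normal(size=(1, n_samples))
      + 0.01 * rng.normal(size=(60, n_samples)))
joint, indiv = jive_sketch(preprocess([X1, X2]), joint_rank=2, indiv_ranks=[1, 1])
```

In this sketch the projection step is what makes each individual component unrelated to the joint structure: every estimated individual matrix has rows orthogonal to the joint row space. In practice the ranks would have to be chosen by a separate procedure, which is outside the scope of this example.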