Analysis of High-dimensional Data from Designed Metabolomics Studies
In most systems biology studies, the effects of experimental factors on the system are assessed using functional genomics tools such as metabolomics or proteomics. Datasets resulting from metabolomics or metabolic profiling experiments are becoming increasingly complex because of the underlying experimental factors, such as time (time-resolved or longitudinal measurements), different treatments, or combinations thereof, which give rise to between-factor interactions. For the analysis of such complex data, combinations of Analysis of Variance (ANOVA) models and high-dimensional analysis methods such as Principal Component Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA) have been developed. The linear model familiar from ANOVA separates the data into orthogonal effect matrices, which allows the construction of an independent model for each effect. The high-dimensional analysis methods, in turn, explore these effect matrices for correlations and underlying relationships between the metabolites. Together, these methods facilitate a relatively simple interpretation of the variation induced by each factor in the experimental design. Here, two applications are presented: the first focuses on different treatments of plants, whereas in the second the differences between human individuals in a polyphenolic intervention study represent the factor of major importance.
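The decomposition described above can be illustrated with a minimal sketch. It assumes a balanced two-factor design (treatment and time) and uses synthetic data; the function names, the factor labels and the data are hypothetical and are not taken from the studies presented here. The data matrix is split into ANOVA effect matrices of level means, and PCA (via a singular value decomposition) is then applied to one effect matrix. This is the general ANOVA-then-PCA idea, not necessarily the exact procedure used in the two applications.

```python
import numpy as np


def effect_matrix(X, labels):
    """Replace each row of X by the mean of all rows sharing its label."""
    E = np.zeros_like(X, dtype=float)
    for level in np.unique(labels):
        mask = labels == level
        E[mask] = X[mask].mean(axis=0)
    return E


def anova_decompose(X, factor_a, factor_b):
    """Split column-centred data into A, B, A x B and residual matrices."""
    Xc = X - X.mean(axis=0)
    Xa = effect_matrix(Xc, factor_a)
    Xb = effect_matrix(Xc, factor_b)
    # Interaction: cell means minus the two main effects.
    cells = np.array([f"{a}|{b}" for a, b in zip(factor_a, factor_b)])
    Xab = effect_matrix(Xc, cells) - Xa - Xb
    residual = Xc - Xa - Xb - Xab
    return {"A": Xa, "B": Xb, "AB": Xab, "residual": residual}


def pca_scores(E, n_components=2):
    """PCA of an effect matrix via SVD; returns scores and loadings."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    return U[:, :n_components] * s[:n_components], Vt[:n_components].T


# Synthetic, balanced design (assumption): 2 treatments x 3 time points
# x 4 replicates = 24 samples, 50 "metabolites".
rng = np.random.default_rng(0)
treatment = np.repeat(["control", "treated"], 12)
time = np.tile(np.repeat(["t1", "t2", "t3"], 4), 2)
X = rng.normal(size=(24, 50))
X[treatment == "treated", :5] += 2.0  # simulated treatment effect

parts = anova_decompose(X, treatment, time)
scores, loadings = pca_scores(parts["A"])

# In a balanced design the sums of squares of the parts add up to the total.
total_ss = np.sum((X - X.mean(axis=0)) ** 2)
parts_ss = sum(np.sum(P ** 2) for P in parts.values())
print(np.isclose(total_ss, parts_ss))
```

Because each effect matrix contains only the variation attributable to one factor or interaction, the loadings of its PCA model indicate which metabolites respond to that factor. The orthogonality that makes the sums of squares additive holds for balanced designs; unbalanced designs need a different estimation of the effects.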