Issue 1, 2015

PLS/OPLS models in metabolomics: the impact of permutation of dataset rows on the K-fold cross-validation quality parameters

Abstract

Among the software packages available for discriminant analysis based on projection to latent structures (PLS-DA) or orthogonal projection to latent structures (OPLS-DA), SIMCA (Umetrics, Umeå, Sweden) is the most widely used in the metabolomics field. SIMCA offers several parameters and tests to assess the quality of the computed model (the number of significant components, R2, Q2, pCV-ANOVA, and the permutation test). Significance thresholds for these parameters are strongly application-dependent. For the Q2 parameter, a significance threshold of 0.5 is generally accepted. However, during the last few years, many PLS-DA/OPLS-DA models built using SIMCA have been published with Q2 values lower than 0.5. The purpose of this opinion note is to point out that, in some circumstances frequently encountered in metabolomics, the values of these parameters depend strongly on the individuals that constitute the validation subsets. Because of the way in which the software selects members of the calibration and validation subsets, a simple permutation of dataset rows can, in several cases, lead to contradictory conclusions about the significance of the models when a K-fold cross-validation is used. We believe that, when Q2 values lower than 0.5 are obtained, SIMCA users should at least verify that the quality parameters are stable under permutation of the rows in their dataset.
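The stability check recommended above can be illustrated with a small, dependency-free Python sketch. It is not SIMCA's implementation: it assumes a one-component PLS1 model on a 0/1 class label, the common definition Q2 = 1 - PRESS/SS, and a deterministic fold assignment in which row i goes to fold i mod K (an assumption made for illustration, so that row order determines fold membership). All function names (`pls1_fit`, `q2_kfold`, `q2_under_row_permutations`, `make_demo_data`) and the synthetic data are hypothetical.

```python
import random

def pls1_fit(X, y):
    """Fit a one-component PLS1 model on mean-centered data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector w is proportional to X^T y, scaled to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5 or 1.0
    w = [v / norm for v in w]
    # Scores t = X w, then regress y on t to get the coefficient q.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    tt = sum(v * v for v in t) or 1.0
    q = sum(t[i] * yc[i] for i in range(n)) / tt
    return x_mean, y_mean, w, q

def pls1_predict(model, X):
    """Predict y for new rows with a fitted one-component model."""
    x_mean, y_mean, w, q = model
    return [y_mean + q * sum((row[j] - x_mean[j]) * w[j]
                             for j in range(len(w))) for row in X]

def q2_kfold(X, y, k=7):
    """Q2 = 1 - PRESS/SS with row i assigned to fold i % k (assumed scheme)."""
    n = len(X)
    press = 0.0
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        model = pls1_fit([X[i] for i in train], [y[i] for i in train])
        pred = pls1_predict(model, [X[i] for i in test])
        press += sum((y[i] - yh) ** 2 for i, yh in zip(test, pred))
    y_mean = sum(y) / n
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_tot

def q2_under_row_permutations(X, y, k=7, n_perm=50, seed=0):
    """Recompute Q2 after shuffling the row order n_perm times."""
    rng = random.Random(seed)
    idx = list(range(len(X)))
    values = []
    for _ in range(n_perm):
        rng.shuffle(idx)  # rows and labels are permuted together
        values.append(q2_kfold([X[i] for i in idx], [y[i] for i in idx], k))
    return values

def make_demo_data(n=28, p=10, effect=0.5, seed=1):
    """Synthetic two-class data sorted by class, with a weak class effect."""
    rng = random.Random(seed)
    y = [0.0 if i < n // 2 else 1.0 for i in range(n)]
    X = [[rng.gauss(0.0, 1.0) + (effect * y[i] if j < 3 else 0.0)
          for j in range(p)] for i in range(n)]
    return X, y

if __name__ == "__main__":
    X, y = make_demo_data()
    q2_values = q2_under_row_permutations(X, y)
    print("Q2 in original row order:", round(q2_kfold(X, y), 3))
    print("Q2 range over permutations:",
          round(min(q2_values), 3), "to", round(max(q2_values), 3))
```

If the range of Q2 across permutations straddles the threshold one intends to use, the conclusion about model significance depends on row order, which is the situation the note warns about.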

Article information

Article type: Opinion
Submitted: 16 Jul 2014
Accepted: 23 Oct 2014
First published: 23 Oct 2014

Mol. BioSyst., 2015, 11, 13-19

M. N. Triba, L. Le Moyec, R. Amathieu, C. Goossens, N. Bouchemal, P. Nahon, D. N. Rutledge and P. Savarin, Mol. BioSyst., 2015, 11, 13 DOI: 10.1039/C4MB00414K
