Issue 11, 2010

Detecting influential observations by cluster analysis and Monte Carlo cross-validation

Abstract

Detecting influential observations is an essential step in building high-performance models and is recognized as an important and challenging task in many industrial and laboratory applications. A new approach for detecting influential observations is developed based on their effect on partial least squares (PLS) modeling. In this method, a large number of PLS models are built using Monte Carlo cross-validation (MCCV), and principal component analysis (PCA) is then performed on the regression coefficients of these models. Because a model built with influential observations differs from one built without them, the PLS models cluster into different groups in principal component (PC) space according to the number of influential observations they contain. The influential observations can therefore be identified from the frequency with which each sample appears in each group. Three examples of quantitative modeling of near-infrared (NIR) and Raman spectra show that the method detects influential observations intuitively and reliably.
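The workflow described in the abstract can be sketched on synthetic data. This is a minimal illustration under stated assumptions, not the authors' implementation: the data set, the single planted influential sample (index 0), the NIPALS-based helper `pls1_coefficients`, the number of PLS components, and the use of a simple two-group split on the first PC score (`two_means_1d`) are all hypothetical choices made for the example. The abstract does not specify the clustering procedure or any of these settings.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """PLS1 regression coefficients via NIPALS on mean-centered data."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)          # weight vector
        t = E @ w                       # scores
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        c = (f @ t) / tt                # y loading
        E = E - np.outer(t, p)          # deflate X
        f = f - c * t                   # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def two_means_1d(scores, n_iter=50):
    """Split 1-D scores into two groups (k-means with k=2, min/max init)."""
    centers = np.array([scores.min(), scores.max()])
    for _ in range(n_iter):
        labels = (np.abs(scores - centers[1]) < np.abs(scores - centers[0])).astype(int)
        centers = np.array([scores[labels == g].mean() for g in (0, 1)])
    return labels

# Synthetic data (assumption): 40 samples, 4 variables, sample 0 made influential.
rng = np.random.default_rng(0)
n_samples, n_vars = 40, 4
X = rng.normal(size=(n_samples, n_vars))
beta = np.array([1.0, -1.0, 0.5, 2.0])
y = X @ beta + 0.01 * rng.normal(size=n_samples)
X[0] = 4.0                              # high leverage
y[0] = X[0] @ beta + 20.0               # large error in the response

# MCCV: many random training subsets, one PLS model per subset.
n_models, n_train = 300, 28
in_train = np.zeros((n_models, n_samples), dtype=bool)
coefs = np.empty((n_models, n_vars))
for m in range(n_models):
    idx = rng.permutation(n_samples)[:n_train]
    in_train[m, idx] = True
    coefs[m] = pls1_coefficients(X[idx], y[idx], n_components=3)

# PCA on the regression coefficients, then group the models on the PC1 score.
centered = coefs - coefs.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
pc1 = centered @ Vt[0]
labels = two_means_1d(pc1)

# Frequency of each sample in the training sets of each group of models.
freq = np.array([in_train[labels == g].mean(axis=0) for g in (0, 1)])
diff = np.abs(freq[0] - freq[1])
flagged = int(np.argmax(diff))
print("flagged sample:", flagged)
```

In this sketch an ordinary sample appears in roughly the same fraction of training sets in both groups, while the planted sample is present in nearly every model of one group and absent from nearly every model of the other, so its frequency difference stands out. With several influential observations, more than two groups would be expected, and a different clustering step would be needed.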



Article information

Article type: Paper
Submitted: 22 May 2010
Accepted: 13 Aug 2010
First published: 10 Sep 2010

Analyst, 2010, 135, 2841-2847


X. Bian, W. Cai, X. Shao, D. Chen and E. R. Grant, Analyst, 2010, 135, 2841, DOI: 10.1039/C0AN00345J

