Elastic net wavelength interval selection based on iterative rank PLS regression coefficient screening
Abstract
In recent years, near-infrared (NIR) spectroscopy has been extensively applied as an analytical tool in various fields. However, the spectral data obtained from modern spectroscopic instruments usually contain a large number of highly collinear variables, which makes the prediction of a response variable unreliable. To address this problem, a novel wavelength interval selection method, called elastic net variable selection using iterative rank PLS regression coefficient screening (EN-IRRCS), is proposed. The EN-IRRCS method combines the grouping effect of the elastic net with the core idea of sure independence screening (SIS), namely ranking the predictor variables by the strength of their correlation with the response variable, so that it can automatically select runs of consecutive, strongly correlated spectral variables that are related to the response. Three real NIR datasets were employed to investigate the performance of the proposed method. The results indicate that EN-IRRCS is an effective wavelength interval selection strategy.
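The screening idea mentioned above can be illustrated with a minimal sketch. This is not the EN-IRRCS algorithm: it assumes plain marginal Pearson correlations in place of ranked PLS regression coefficients, omits the elastic net step and the iteration, and uses a made-up toy matrix. The names `pearson`, `screen_top_k`, and `to_intervals` are hypothetical helpers introduced only for this illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_top_k(X, y, k):
    """SIS-style screening: keep the k columns with the largest |correlation| with y.

    Assumption: marginal correlation stands in for the ranking criterion.
    Returns the kept column indices in ascending order.
    """
    n_vars = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_vars)]
    ranked = sorted(range(n_vars), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def to_intervals(indices):
    """Merge ascending indices into inclusive (start, end) runs of consecutive variables."""
    intervals = []
    for j in indices:
        if intervals and j == intervals[-1][1] + 1:
            intervals[-1] = (intervals[-1][0], j)  # extend the current run
        else:
            intervals.append((j, j))  # start a new run
    return intervals


# Toy data (invented for illustration): 5 samples, 6 "wavelengths".
# Columns 2 and 3 track the response closely; the others are uncorrelated with it.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [
    [1.0, 0.0, 1.0, 2.0, 1.0, 0.0],
    [-1.0, 1.0, 2.1, 4.1, 0.0, -1.0],
    [0.0, -2.0, 2.9, 6.0, -2.0, 2.0],
    [-1.0, 1.0, 4.2, 7.9, 0.0, -1.0],
    [1.0, 0.0, 5.0, 10.0, 1.0, 0.0],
]

selected = screen_top_k(X, y, k=2)
intervals = to_intervals(selected)
print(selected, intervals)
```

On this toy matrix, the two retained variables are adjacent and collapse into a single interval, which is the kind of contiguous selection a wavelength interval method aims for. In the proposed method, the grouping of correlated neighbours is attributed to the elastic net rather than to a post hoc merge like `to_intervals`.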