A combination algorithm for variable selection to determine soluble solid content and firmness of pears
Abstract
Informative variable (or wavelength) selection plays an important role in quantitative analysis by visible and near-infrared (Vis/NIR) spectroscopy. In this study, a new combination of Monte Carlo-uninformative variable elimination (MC-UVE) and the successive projections algorithm (SPA) was proposed to select the most effective variables. The selected variables were used as inputs to a least squares-support vector machine (LS-SVM) to build MC-UVE-SPA-LS-SVM models for determining the soluble solid content (SSC) and firmness of pears. Conventional partial least squares (PLS) models were also developed for comparison. The results indicated that, when model accuracy is balanced against model complexity, the MC-UVE-SPA-LS-SVM calibration models built on 14 and 17 effective variables gave the best performance for the two internal quality indices, compared with the full-spectrum PLS, MC-UVE-PLS, MC-UVE-LS-SVM and MC-UVE-SPA-PLS models. For the prediction set, the correlation coefficient (r), root mean square error of prediction (RMSEP) and residual predictive deviation (RPD) were 0.9486, 0.3244 and 3.1598 for SSC, and 0.8955, 1.1077 and 2.2469 for firmness. Overall, the results indicated that Vis/NIR spectroscopy combined with MC-UVE-SPA-LS-SVM could serve as a fast and accurate alternative method for the nondestructive determination of the SSC and firmness of pears. The selected effective variables may also be useful for developing portable instruments and for online monitoring of pear quality.
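As an illustration of the three figures of merit reported for the prediction set, the following is a minimal sketch of how r, RMSEP and RPD are commonly computed from reference and predicted values. It assumes the widely used definitions (Pearson correlation, root mean squared prediction error, and RPD as the sample standard deviation of the reference values divided by RMSEP); the abstract does not state the exact formulas, so these are assumptions. The function name `prediction_metrics` and the sample values are hypothetical and are not data from the study.

```python
import numpy as np


def prediction_metrics(y_true, y_pred):
    """Return (r, RMSEP, RPD) for a prediction set.

    Assumed definitions (not confirmed by the source):
      r     = Pearson correlation between reference and predicted values
      RMSEP = sqrt(mean((y_true - y_pred)^2))
      RPD   = sample standard deviation of y_true / RMSEP
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Pearson correlation coefficient between reference and prediction
    r = np.corrcoef(y_true, y_pred)[0, 1]

    # Root mean square error of prediction
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))

    # Residual predictive deviation: spread of the reference values
    # relative to the prediction error (ddof=1 gives the sample SD)
    rpd = np.std(y_true, ddof=1) / rmsep

    return r, rmsep, rpd


# Hypothetical SSC-like values, for illustration only
y_ref = [11.2, 12.5, 10.8, 13.1, 12.0]
y_hat = [11.0, 12.9, 10.9, 12.8, 12.2]

r, rmsep, rpd = prediction_metrics(y_ref, y_hat)
print(f"r = {r:.4f}, RMSEP = {rmsep:.4f}, RPD = {rpd:.4f}")
```

Under these definitions a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside r and RMSEP when comparing models.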