Issue 44, 2025

Construction of prediction models for phenolic compounds in Cabernet Sauvignon grapes based on visible/near-infrared spectroscopy

Abstract

This study aims to construct prediction models for the concentrations of phenolic compounds, specifically tannins and anthocyanins, in Cabernet Sauvignon wine grapes and to identify the key characteristic wavelengths associated with these contents. Diffuse reflectance spectra of the grapes were collected using a portable fiber-optic spectrometer. Principal component analysis (PCA) was used to eliminate outliers, and the SPXY algorithm was applied to divide the dataset into a calibration set (n = 145) and a prediction set (m = 49). Several preprocessing methods and their combinations, including Savitzky–Golay convolution smoothing (SG), multiplicative scatter correction (MSC), standard normal variate (SNV), and standardization (SS), were compared to determine the optimal preprocessing strategy for each phenolic compound. Characteristic wavelengths were then extracted using competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), and uninformative variable elimination (UVE). Comparative analysis identified the most effective wavelength selection method and the optimal number of characteristic wavelengths, and partial least squares regression (PLSR) models for tannin and anthocyanin contents were established. For tannins, the model combining SG–SNV–SS preprocessing with the CARS algorithm performed best (calibration R² = 0.9964, prediction R² = 0.9939, RPD = 3.7653), with seven optimal characteristic wavelengths identified at 422.64 nm, 828.86 nm, 948.92 nm, 993.22 nm, 1003.17 nm, 1122.10 nm, and 1122.94 nm.
For anthocyanins, the model based on raw spectral data combined with the CARS algorithm performed best (calibration R² = 0.9899, prediction R² = 0.9768, RPD = 6.5591), with six optimal characteristic wavelengths identified at 440.35 nm, 580.76 nm, 632.38 nm, 777.21 nm, 898.61 nm, and 1013.96 nm. The constructed models effectively screened the key characteristic wavelengths associated with tannin and anthocyanin contents and enabled accurate prediction of phenolic compounds in wine grapes. This research provides a theoretical basis and technical support for the development of portable instruments and the selection of light source devices.
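
As a rough illustration of two steps named in the abstract, the sketch below shows one common way to apply standard normal variate (SNV) scaling to a single spectrum and to compute R², RMSE, and RPD for a set of predictions. It is not the authors' code. The function names, the example values, and the choice of the sample standard deviation and RMSE in the RPD ratio are assumptions made for illustration; the abstract does not state which definitions were used.

```python
import math
import statistics


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation (assumed convention)."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def prediction_metrics(y_true, y_pred):
    """Return (R2, RMSE, RPD) for reference versus predicted values.

    R2   = 1 - SSE / SST
    RMSE = sqrt(SSE / n)
    RPD  = sample SD of the reference values / RMSE (assumed definition)
    """
    n = len(y_true)
    mean_true = statistics.fmean(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - sse / sst
    rmse = math.sqrt(sse / n)
    rpd = statistics.stdev(y_true) / rmse
    return r2, rmse, rpd


# Hypothetical values, not taken from the paper.
demo_spectrum = [0.42, 0.45, 0.51, 0.48, 0.40]
demo_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
demo_predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

scaled = snv(demo_spectrum)
r2, rmse, rpd = prediction_metrics(demo_reference, demo_predicted)
print(scaled, r2, rmse, rpd)
```

SNV is applied to each spectrum independently, so it needs no reference spectrum, unlike MSC. RPD depends on which standard deviation and which error term are used, so values computed this way may not be directly comparable with those reported under a different convention.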

Article information

Article type: Paper
Submitted: 14 Sep 2025
Accepted: 23 Oct 2025
First published: 31 Oct 2025

Anal. Methods, 2025, 17, 9038-9050

S. Wang, B. Li, X. Zhang, Y. Li, P. Liu, K. Zhu and Y. Zhang, Anal. Methods, 2025, 17, 9038 DOI: 10.1039/D5AY01538C
