Issue 27, 2025

Deciphering the differences in aroma components of tobacco from different origins based on HS-GC-IMS and multivariate statistical analysis

Abstract

This study combined headspace gas chromatography-ion mobility spectrometry (HS-GC-IMS) with multivariate statistical analysis to characterize the flavor compounds in flue-cured tobacco from five regions of China: Henan, Hunan, Yunnan, Chongqing, and Fujian. HS-GC-IMS analysis identified a total of 98 volatile aroma compounds, including esters, ketones, aldehydes, acids, alcohols, heterocyclic compounds, sulfur-containing compounds, other types of compounds, and 8 uncharacterized compounds. Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were used for dimensionality reduction and sample discrimination. Tobacco leaf samples from different regions separated distinctly on the PCA score plot; OPLS-DA further confirmed these differences, and permutation testing supported the validity of the OPLS-DA model. A random forest (RF) classification model was also constructed, and its reliability was validated by receiver operating characteristic (ROC) analysis, which gave an area under the curve (AUC) of 0.980, indicating strong predictive performance. Twenty key aroma compounds with variable importance in projection (VIP) > 1.0 were screened by integrating OPLS-DA with the RF classification model. The contents of these compounds differed significantly among samples, suggesting their potential as chemical markers for distinguishing the origin of flue-cured tobacco. The study provides a new method for identifying volatile compounds in tobacco and offers new insight into the geographical identification of flue-cured tobacco.

Article information

Article type: Paper
Submitted: 31 Mar 2025
Accepted: 12 Jun 2025
First published: 01 Jul 2025

Anal. Methods, 2025, 17, 5736-5748

S. Li, N. Mao, C. Chen, H. Zhao, X. Chen, L. Wang, F. Cui, W. Feng and Z. Wu, Anal. Methods, 2025, 17, 5736 DOI: 10.1039/D5AY00531K
