Issue 30, 2026, Issue in Progress

Prediction and visual analysis of flue-cured tobacco aroma types based on machine learning and feature derivation

Abstract

This study investigated the relationship between the aroma types and chemical properties of flue-cured tobacco (FCT) and explored the applicability of machine learning (ML) combined with feature derivation in the FCT industry. A total of 619 Sichuan FCT samples representing three aroma types (fresh-sweet, honey-sweet, and mellow-sweet) were used. Features were derived from 51 raw chemical indices, and key indicators were then selected through a three-tier process comprising separability analysis, random forest (RF) importance ranking, and elimination of redundant features via correlation analysis. Multiple ML models were compared to identify the one best suited to the Sichuan FCT dataset. Model parameters were optimized with a genetic algorithm (GA), and the model's decision-making mechanism was interpreted visually using SHAP values. After the three-tier screening, 9 key indices were identified, including rutin-malonic acid, rutin, and chlorogenic acid. RF was the optimal model for this dataset; after parameter optimization, it achieved an F1-score of 88.3% and an accuracy of 93.5%, reducing the detection cost and improving the model's discrimination performance. In addition, the SHAP interpretation framework revealed the intrinsic correlations between chemical characteristics and aroma types. This study not only improves the efficiency of aroma type classification for Sichuan FCT but also clarifies the key chemical indicators associated with aroma traits, and it provides quantitative support for optimizing FCT quality through targeted regulation of key component contents.
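The three-tier indicator selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes scikit-learn and NumPy, uses a one-way ANOVA F-test as a stand-in for the separability analysis, and runs on synthetic data that only mirrors the reported sample and index counts (619 and 51). The function name `three_tier_select` and the thresholds `p_max`, `top_k`, and `r_max` are hypothetical choices for illustration. Feature derivation, GA tuning, and SHAP interpretation are left out.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif


def three_tier_select(X, y, p_max=0.05, top_k=15, r_max=0.9, seed=0):
    """Return column indices kept after three screening tiers (illustrative).

    All thresholds are assumed values, not taken from the article.
    """
    # Tier 1: separability, approximated here by a one-way ANOVA F-test.
    _, p_values = f_classif(X, y)
    tier1 = np.flatnonzero(p_values < p_max)
    if tier1.size == 0:
        return []

    # Tier 2: rank the survivors by Random Forest impurity importance
    # and keep at most top_k of them, most important first.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X[:, tier1], y)
    order = np.argsort(rf.feature_importances_)[::-1][:top_k]
    tier2 = tier1[order]

    # Tier 3: walk down the ranking and skip any feature whose absolute
    # Pearson correlation with an already kept feature exceeds r_max.
    corr = np.abs(np.corrcoef(X[:, tier2], rowvar=False))
    kept = []
    for i in range(len(tier2)):
        if all(corr[i, j] <= r_max for j in kept):
            kept.append(i)
    return [int(tier2[i]) for i in kept]


# Synthetic stand-in data: 619 samples, 51 indices, 3 classes.
# These values are generated, not measured tobacco chemistry.
X, y = make_classification(
    n_samples=619,
    n_features=51,
    n_informative=8,
    n_redundant=10,
    n_classes=3,
    random_state=0,
)
selected = three_tier_select(X, y)
print(f"{X.shape[1]} candidate features -> {len(selected)} kept: {selected}")
```

Pruning in importance order means that, within each highly correlated group, the feature the forest ranked highest is the one that survives. That is one plausible reading of "redundant feature elimination via correlation analysis"; the article may use a different rule.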


Article information

Article type: Paper
Submitted: 21 Jan 2026
Accepted: 04 May 2026
First published: 20 May 2026
This article is Open Access (Creative Commons BY license)

RSC Adv., 2026, 16, 27308-27318


Z. Wang, S. Yang, X. Wang, J. Qiu, J. Cao and X. Hao, RSC Adv., 2026, 16, 27308 DOI: 10.1039/D6RA00543H

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
