Open Access Article
Ling Leng,†a Ruihan Zhang,†ac Yuxia Shan,a Rong Chen,a Minghui Yang,*a Zhenze Cui*b and Yuan Liang*c
aDalian University of Technology Affiliated Women and Children's Hospital, Dalian, 116024, China. E-mail: myang@dlut.edu.cn
bDalian Medical University, Dalian, 116024, China. E-mail: cuizhenze64412@163.com
cDepartment of Thoracic Oncology(1), Cancer Hospital of Dalian University of Technology, Liaoning Cancer Hospital & Institute, No. 44 Xiaoheyan Road, Dadong District, Shenyang 110042, Liaoning Province, P R China. E-mail: liangyuan@cancer-hosp-ln-cmu.com
First published on 4th February 2026
Objectifying and standardizing diagnostic methods are essential steps toward the modernization and global recognition of traditional Chinese medicine (TCM). The “four diagnostic methods”, namely, inspection, auscultation and olfaction, inquiry, and palpation, constitute the fundamental diagnostic framework of TCM, among which olfactory diagnosis plays a vital role. This method relies on identifying characteristic odors from the breath or secretions of a patient to guide syndrome differentiation (Bian Zheng). However, conventional olfactory diagnosis remains highly subjective depending on the practitioner's experience, which results in inconsistent outcomes and challenges in reproducibility and quantification. The integration of medical science with modern sensing and analytical technologies provides a transformative pathway to overcome these limitations. Recent studies have shown that exhaled breath contains a complex spectrum of volatile organic compounds (VOCs) that collectively form a personalized “breathprint”, reflecting physiological and pathological states. Objective analysis of these VOCs enables quantifiable, evidence-based characterization of disease-related odors, thereby providing a scientific foundation for olfactory diagnostics in TCM. This review summarizes advancements in VOC detection methodologies, including gas chromatography-mass spectrometry (GC-MS) and electronic nose (e-nose) systems paired with data-driven analytical frameworks, to advance the transformation of traditional olfactory diagnosis in TCM into a standardized, evidence-based diagnostic paradigm.
Recent advances in analytical chemistry and sensing technologies have opened new opportunities for transforming this ancient diagnostic art into a scientific and quantifiable process.2 Exhaled human breath is now recognized as a rich and non-invasive source of physiological information, containing hundreds of volatile organic compounds (VOCs) derived from endogenous metabolic activities, diet, and microbiome interactions.3 These VOCs serve as biomarkers for various diseases, including diabetes, asthma, liver dysfunction, and certain cancers.4 The VOC profile generated by an individual constitutes a unique metabolic signature, providing real-time insights into biochemical pathways and pathophysiological developments. The identification and analysis of VOC patterns make it possible to capture objective molecular correlates of what was historically described in TCM as “foul”, “sweet”, or “fishy” pathological odors. This convergence between VOC analytics and the TCM olfactory theory provides a scientific foundation for standardizing olfactory diagnosis (Fig. 1).
Contemporary detection technologies such as gas chromatography-mass spectrometry (GC-MS) and electronic nose (e-nose) systems have markedly advanced VOC characterization. GC-MS enables precise compound identification and quantification,5 whereas e-nose arrays enable rapid, pattern-based odor recognition, mimicking the human olfactory system.6 Combined with emerging data-driven approaches, including machine learning, multivariate statistical analysis, and pattern recognition algorithms, these technologies facilitate efficient extraction of diagnostic features from complex VOC datasets. Consequently, they provide a powerful platform for mapping traditional TCM odor categories to measurable chemical signatures and disease-specific biomarkers. This review systematically discusses state-of-the-art VOC detection techniques, their integration with intelligent data processing, and their application in correlating VOC profiles with TCM pattern identification. By bridging traditional diagnostic practice with modern sensing and analytical sciences, we aim to establish a comprehensive framework for the objectification, standardization, and modernization of TCM olfactory diagnosis, and thereby to support the development of intelligent, evidence-based diagnostic systems.
In olfactory diagnosis, ancient Chinese physicians relied on “oral odor” to identify pathological conditions, as TCM believes that such odors stem from imbalances in the zang-fu organs. For example, an acidic taste in the mouth can be associated with a perceptible sour odor. The Zhu Bing Yuan Hou Lun noted that “An acrid taste indicates phlegm in the upper burner and chronic cold in the spleen and stomach, preventing proper digestion. Undigested food leads to distension and fullness, causing qi to reverse. This results in belching and sour breath”. This theory posits that a sour taste in the mouth results from stomach coldness hindering food transformation. Regarding halitosis, Danxi Shoujing (Hand Mirror of Danxi), a foundational text in TCM authored by Zhu Zhenheng, recorded: “Appetite impairment (liver dysfunction affecting spleen qi) manifests as a distinctive foul odor resembling decaying fish, detectable prior to disease manifestation”. These passages describe liver-related odors and their differential diagnosis in various liver conditions. Some patients with severe liver damage emit a distinctive odor reminiscent of mouse urine in their breath.7 Many classical Chinese medical books describe body odors associated with diseases affecting different organs.
The principle of applying exhaled VOCs to disease diagnosis shares profound conceptual similarities with the role of olfactory diagnosis in TCM. Although rooted in different theoretical paradigms, both treat “odor” as an indicator of the state of the body. To investigate the modern scientific implications of this ancient wisdom linking the five zang organs to five odors, this review attempts to construct an integrative conceptual framework. It explores correlations between the organ theory of TCM and certain disease-associated VOCs currently identified (as shown in Table 1). It is crucial to emphasize that this framework does not aim to establish absolute one-to-one diagnostic relationships. Rather, it serves as an inspirational integrative model designed to provide new perspectives and directions for interdisciplinary research.
| Five zang organs and five odors | Modern systemic correlation | Generation and verification of VOCs: core VOC | Generation and verification of VOCs: generation process | Generation and verification of VOCs: potential diagnostic value | Ref. |
|---|---|---|---|---|---|
| Liver-rancid | Neuroendocrine system | Methanethiol | Liver failure results in impaired methionine metabolism, leading to the production of methyl mercaptan by intestinal flora, which is exhaled through the lungs | Non-invasive screening for hepatic encephalopathy and assessment of cirrhosis severity | 22–25 |
| Heart-scorched | Circulatory, nervous system | Ketones (e.g., acetone); aldehydes (e.g., hexanal) | Myocardial ischemia or heart failure, on the one hand, causes energy metabolism disorders and incomplete fatty acid oxidation, producing ketones; on the other hand, oxidative stress leads to lipid peroxidation, generating volatile alkanes | Auxiliary diagnosis of myocardial ischemia and monitoring of heart failure progression | 26–29 |
| Spleen-fragrant | Digestive, immune, endocrine system | Short-chain fatty acids (e.g., acetic acid); amines (e.g., cadaverine) | Under normal physiological conditions, the spleen functions properly, and the mild “grain aroma” exhaled may be composed of short-chain fatty acids (e.g., acetic acid) produced by normal intestinal flora metabolism and volatile components from grains themselves. Pathologically (spleen deficiency), impaired transportation leads to food stagnation, and abnormal bacterial fermentation produces cadaverine (an amine) and excessive acetic acid, mixing to form a putrid odor | Non-invasive evaluation of gastrointestinal function and auxiliary diagnosis of functional dyspepsia | 30–32 |
| Lung-fishy | Respiratory, immune system | Hydrogen sulfide; ammonia, volatile alkanes | In cases of lung infection, pathogens decompose proteins to produce hydrogen sulfide and amines; during tumor or inflammatory states, lipid peroxidation generates volatile alkanes | Auxiliary differentiation of pulmonary infection types and non-invasive screening for lung cancer | 33–35 |
| Kidney-putrid | Genitourinary, endocrine, reproductive system | Ammonia, trimethylamine | In renal failure, urea diffuses into the respiratory tract and skin via the bloodstream and is decomposed into ammonia by bacteria; trimethylamine originates from intestinal flora metabolism | Non-invasive monitoring of renal function and auxiliary diagnosis of uremia | 36–40 |
The above framework provides a modern biological perspective for understanding TCM theories; however, its application and interpretation require careful consideration due to the inherent complexity and limitations of TCM itself. First, the scientific evidence strength for the correlations between different elements in the framework significantly varies. For example, the links between “kidney-deficiency” and “putrid odor” (via ammonia) or “liver-qi stagnation” and “sulfurous odor” (via methyl mercaptan) are clinically recognized physical signs with well-established pathophysiological mechanisms, thereby providing a solid evidence base. However, the association between “heart-qi deficiency” and ketones/aldehydes is more of a highly speculative hypothesis derived from theories of myocardial energy metabolism and oxidative stress, with unclear olfactory correspondence. Moreover, these VOCs are more commonly found in other diseases, such as diabetes, exhibiting low specificity. Second, the non-specificity of VOCs is a universal challenge. Many compounds listed in the table, such as aldehydes and volatile alkanes, are general products of systemic oxidative stress or inflammation rather than exclusive biomarkers for specific zang-fu organ diseases. Therefore, future research should shift from searching for single biomarkers to constructing “VOC fingerprints” for specific TCM syndromes (e.g., “spleen-insufficiency”) or diseases, utilizing multivariate analysis patterns to enhance diagnostic accuracy. Finally, it is essential to recognize the fundamental paradigm differences between TCM and modern medicine. The “five zang organs” in TCM are functional aggregates, whereas VOCs are specific chemical substances. Mapping the two is a beneficial simplification aimed at promoting interdisciplinary understanding; however, it cannot fully encompass the systematic and holistic connotations of TCM theories.
In conclusion, although the direct deterministic mechanism between “odor type” and “specific chemical components” has not been fully elucidated, investigation of the relationship between traditional medicine and modern odor chemistry can facilitate the transformation of traditional empirical knowledge into standardized, quantifiable modern medical systems, thereby providing innovative solutions for disease prevention, treatment, and health management.
Fig. 2 The process of non-invasive respiratory detection via the electronic nose system. Figure reproduced from ref. 49 with permission from Springer Nature, copyright 2023.
Electronic nose technology shows potential for the modernization of TCM olfactory diagnosis; however, its inherent technical limitations, including sensor drift, background noise, and the non-linear interference of humidity with sensor responses, pose challenges to precise clinical VOC analysis and to the interpretation of pathological mechanisms. A primary bottleneck is that individual components cannot be resolved: mainstream sensors (e.g., MOS, CP, QCM) rely on an “overall response pattern”, where signals are superpositions of interactions from multiple gases, making it difficult to isolate the contribution of a single component from mixed responses.50 In addition, the complexity of real-world application scenarios may introduce interference: environmental factors such as humidity, temperature, and exhaled background gases affect the stability and accuracy of discrimination.51 Although improvements in electronic nose technology in recent years have gradually addressed issues such as humidity drift,52 the limited component resolution remains. The electronic nose is therefore better suited to rapid screening and auxiliary classification in TCM modernization research, whereas in-depth biomarker discovery and pathological mechanism studies remain reliant on quantitative analysis techniques, such as high-resolution mass spectrometry.
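The superposition problem described above can be illustrated with a toy model. The sketch below assumes a purely linear sensor array (real MOS, CP, and QCM sensors respond non-linearly) and uses a hypothetical sensitivity matrix `S` with made-up values; none of these numbers come from the cited studies. It shows that when there are more gases than sensors, two different mixtures can produce the same array response.

```python
import numpy as np

# Hypothetical sensitivity matrix: 3 sensors (rows) x 4 gases (columns).
# Values are illustrative only, not measured sensor characteristics.
S = np.array([
    [1.0, 0.5, 0.2, 0.8],
    [0.3, 1.0, 0.6, 0.4],
    [0.7, 0.2, 1.0, 0.5],
])

# A hypothetical gas mixture (arbitrary concentration units).
c1 = np.array([2.0, 1.0, 3.0, 1.5])

# With fewer sensors than gases, S has a non-trivial null space:
# directions in concentration space that leave every sensor reading unchanged.
_, _, vt = np.linalg.svd(S)
null_dir = vt[-1]  # unit vector with S @ null_dir approximately 0

# Scale the null-space step so the second mixture stays non-negative.
step = 0.5 * c1.min() / np.abs(null_dir).max()
c2 = c1 + step * null_dir

r1 = S @ c1  # array response to mixture 1
r2 = S @ c2  # array response to mixture 2

print("mixture 1:", np.round(c1, 3))
print("mixture 2:", np.round(c2, 3))
print("responses identical:", np.allclose(r1, r2))
```

Under this simplified model the array output alone cannot distinguish the two mixtures, which is one way to read the statement that e-nose data support pattern classification rather than component-level quantification.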
Recent progress in gas analysis technologies has markedly enhanced the applicability of VOC detection in TCM olfactory diagnosis. Gas chromatography-mass spectrometry (GC-MS), while remaining the gold standard owing to its high sensitivity (detection limits down to ppt levels) and comprehensive standardized databases, faces practical limitations in clinical implementation.53 These constraints primarily stem from its dependence on laboratory infrastructure, time-consuming sample protocols (routine analysis can take up to 1 hour),54 and bulky equipment (weighing >50 kg), which collectively impede its deployment in point-of-care TCM diagnostic settings.55
Direct mass spectrometry techniques, exemplified by proton-transfer reaction mass spectrometry (PTR-MS) and extractive electrospray ionization mass spectrometry (EESI-MS), provide innovative solutions by circumventing traditional sample separation and derivatization steps.56 However, technical challenges persist: PTR-MS exhibits susceptibility to water vapor interference and struggles with isomer differentiation,57–61 whereas EESI-MS demonstrates limited sensitivity to nonpolar compounds and potential fragmentation of terpenoid VOCs in drift tubes, thereby complicating quantitative assessments.62,63
While spectroscopic techniques excel in real-time, non-invasive detection, they primarily target inorganic gases and demonstrate limited specificity for the complex VOC mixtures that define TCM olfactory signatures. This gap highlights the need for integrated sensor systems that synergistically combine electronic noses, optical spectroscopy (e.g., Fourier transform infrared spectroscopy [FTIR]), and machine learning algorithms.64–67 Recent progress in FTIR technology has enabled high-throughput detection of exhaled VOCs with high spectral resolution and rapid acquisition times, expanding its utility in clinical diagnostics and environmental monitoring. Separately, TCM diagnostic frameworks highlight the diagnostic value of VOC cluster patterns, driving demand for advanced multivariate analysis algorithms, such as partial least-squares discriminant analysis (PLS-DA), to interpret these patterns.68–70 In sum, algorithmic optimizations, sensor miniaturization, and machine learning convergence are propelling next-generation TCM olfactory diagnostic systems from concept to practice.
| Data type | Data content | Characteristic form | Example | References |
|---|---|---|---|---|
| Original spectrum data | Raw GC-MS/MS and TD-GC-MS^a signals | Continuous signal curve | TIC,^c EIC^d | 71, 72 |
| Peak identification data | Retention time, area, height, and other semi-quantitative indicators | Numeric features | Peak area, peak height | 72, 73 |
| Compound identification data | Compound name, structure, CAS^b | High-dimensional data matrix | Esters, aldehydes, ketones, alkanes, etc. | 74, 75 |
| Multi-sample data matrix | Two-dimensional table: samples × VOCs | Classification/structural indicators | N^e × P^f matrix | 75 |

^a TD-GC-MS, thermal desorption gas chromatography-mass spectrometry. ^b CAS, chemical abstracts service registry number. ^c TIC, total ion chromatogram. ^d EIC, extracted ion chromatogram. ^e N, sample number. ^f P, VOC type.
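
As an illustration of the last row of the table, the following minimal sketch pivots per-sample peak records into an N × P matrix (samples × VOCs). The record layout, the helper name `build_matrix`, and all sample identifiers and peak areas are hypothetical and chosen only for demonstration; undetected compounds are assumed to be filled with zero, which is one of several possible conventions.

```python
import numpy as np

# Hypothetical peak-identification records: (sample_id, compound, peak_area).
# Compound names and areas are illustrative only.
peaks = [
    ("S1", "acetone", 1200.0),
    ("S1", "hexanal", 300.0),
    ("S2", "acetone", 900.0),
    ("S2", "ammonia", 450.0),
    ("S3", "hexanal", 150.0),
]

def build_matrix(records):
    """Pivot long-format peak records into an N x P array (samples x VOCs)."""
    samples = sorted({s for s, _, _ in records})
    vocs = sorted({v for _, v, _ in records})
    row = {s: i for i, s in enumerate(samples)}
    col = {v: j for j, v in enumerate(vocs)}
    X = np.zeros((len(samples), len(vocs)))  # undetected peaks stay at 0
    for s, v, area in records:
        X[row[s], col[v]] += area            # sum repeated peaks of one VOC
    return samples, vocs, X

samples, vocs, X = build_matrix(peaks)
print("rows (samples):", samples)
print("columns (VOCs):", vocs)
print(X)
```

The resulting matrix is typically sparse, because most compounds are detected in only some samples, which is consistent with the high-dimensional, sparse character of VOC datasets.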
TCM syndrome patterns (Bian Zheng) are complex clinical phenotype entities that exhibit multidimensional, nonlinear, and multilevel characteristics. VOCs in exhaled breath can serve as biomarkers reflecting the internal metabolic states of the body and have been shown to be associated with specific TCM syndromes. To reveal and validate this association, analysis of VOC data typically requires sophisticated feature extraction, statistical modeling, and predictive validation. VOC data primarily originate from analytical techniques, such as GC-MS, PTR-MS, and selected-ion flow-tube mass spectrometry (SIFT-MS). The datasets generated by these techniques are typically high-dimensional, sparse, and noisy. Therefore, comprehensive data preprocessing and normalization are required before modeling to ensure analytical quality and robustness. The data processing procedure is as follows: (1) data acquisition and preprocessing: the data collected by the instrument undergo a series of preprocessing steps, including noise reduction, baseline correction, area normalization, smoothing filtering, and peak extraction, to ensure the accuracy and reliability of subsequent analyses.76 (2) Feature extraction and selection: dimensionality reduction, structuring, denoising, and enhancement of discriminability are crucial steps before modeling. Using feature extraction methods such as principal component analysis (PCA),77 linear discriminant analysis (LDA),78 orthogonal partial least-squares discriminant analysis (OPLS-DA),79 t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP), representative features are extracted from raw data to reduce dimensionality and enhance the accuracy of subsequent classification or quantitative analysis.
(3) Data modelling and classification: the extracted VOC feature variables are used to construct classification models for predicting or distinguishing TCM syndromes. Common modelling approaches include K-nearest neighbor (KNN),80 support vector machine (SVM),81 logistic regression (LR), random forest (RF), XGBoost, and artificial neural networks (ANN),82 as well as deep learning (DL) architectures such as convolutional neural networks (CNN),83 recurrent neural networks (RNN),84 and deep neural networks (DNN). The deep learning models are suitable for processing complex patterns or spectral structures but require a large sample size. In small-sample studies, they should be used with caution due to the risk of overfitting or performance degradation caused by data heterogeneity. Key optimization methods for small-sample scenarios include L1/L2 regularization (L2 is also known as ridge regularization), dropout, sparse network architectures, and data augmentation strategies such as spectral peak perturbation. With these safeguards, correlation models can be established between exhaled-breath data and specific diseases or physiological states. Subsequently, exhaled-breath data are classified, for instance, to distinguish between healthy individuals and patients or to differentiate between various diseases,85 or subjected to quantitative analysis, including disease severity assessment.86 (4) Model evaluation and optimization: to ensure the clinical utility and robustness of the model, multidimensional assessment of the classification model is essential. For instance, model performance metrics, such as accuracy, sensitivity, specificity, F1 score, and receiver operating characteristic (ROC) curves, should be selected based on specific research objectives.
Model optimization through methods such as cross-validation and external validation set assessment enhances classification accuracy and generalization ability.87 When models are intended for clinical use, decision curve analysis (DCA) can be employed to evaluate their clinical utility. Moreover, VOC data in TCM syndrome research often originate from diverse hospitals, devices, and populations, exhibiting significant data heterogeneity with notable distribution differences across centers. Therefore, heterogeneous federated learning (HFL), as a critical approach to address sample scarcity and heterogeneity, enables cross-center collaborative modeling without sharing raw data. This not only expands the “virtual sample size” but also enhances the model's generalization ability in multi-region and multi-device scenarios, making it one of the key technologies for constructing high-trust VOC-based TCM syndrome models in the future. As illustrated in Fig. 3, the integrated analysis process of TCM olfactory diagnosis and exhaled-breath VOCs involves several specific steps.
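Steps (1) to (4) above can be sketched end to end on synthetic data. The example below is a minimal illustration, not the workflow of any cited study: it assumes a samples × VOCs matrix of peak areas, applies area normalization, extracts features with PCA (computed via SVD), classifies with KNN, and estimates accuracy with leave-one-out cross-validation. The function names, the two synthetic "syndrome" classes, and all parameter values are hypothetical.

```python
import numpy as np

def area_normalize(X):
    """Scale each sample (row) so that its peak areas sum to 1."""
    return X / X.sum(axis=1, keepdims=True)

def fit_pca(X, n_components):
    """Return the training mean and the top principal axes (via SVD)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def knn_predict(train_Z, train_y, z, k):
    """Majority vote among the k nearest training points (Euclidean)."""
    dist = np.linalg.norm(train_Z - z, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return int(np.bincount(nearest).argmax())

def loocv_accuracy(X, y, n_components=2, k=3):
    """Leave-one-out CV; PCA is refit on each training fold to avoid leakage."""
    Xn = area_normalize(X)  # per-sample operation, so safe before splitting
    hits = 0
    for i in range(len(y)):
        train = np.arange(len(y)) != i
        mean, axes = fit_pca(Xn[train], n_components)
        Z_train = (Xn[train] - mean) @ axes.T
        z_test = (Xn[i] - mean) @ axes.T
        hits += int(knn_predict(Z_train, y[train], z_test, k) == y[i])
    return hits / len(y)

# Synthetic stand-in for a samples x VOCs peak-area matrix with two classes
# that differ in the first three VOCs (all values are made up).
rng = np.random.default_rng(0)
n_per_class, n_vocs = 15, 20
base = rng.uniform(50, 100, size=n_vocs)
shift = np.zeros(n_vocs)
shift[:3] = 60.0
X0 = base + rng.normal(0, 5, size=(n_per_class, n_vocs))
X1 = base + shift + rng.normal(0, 5, size=(n_per_class, n_vocs))
X = np.vstack([X0, X1])
y = np.array([0] * n_per_class + [1] * n_per_class)

acc = loocv_accuracy(X, y)
print(f"LOOCV accuracy: {acc:.2f}")
```

Refitting the PCA inside every fold is a deliberate design choice: fitting it once on all samples would let information from the held-out sample leak into the features and inflate the estimated accuracy, which matters most in the small-sample settings discussed above.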
In summary, using suitable detection methods, conducting systematic feature selection, selecting appropriate classification modeling approaches, and implementing rigorous model evaluation mechanisms enable the establishment of a robust correspondence between exhaled-breath VOCs and TCM syndrome patterns. This objective, quantitative approach offers a novel pathway for the differentiation of TCM syndromes.
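The evaluation metrics named in step (4), together with the net benefit used in decision curve analysis, can be computed as in the following minimal sketch. The labels and predicted probabilities are invented for illustration, the 0.5 decision threshold is an assumption, and the net benefit follows the standard definition TP/N − (FP/N) × p_t/(1 − p_t) at threshold probability p_t.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)  # ties count as half a win
    return wins / (len(pos) * len(neg))

def net_benefit(y_true, scores, threshold):
    """Decision-curve net benefit at one threshold probability."""
    n = len(y_true)
    tp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 1)
    fp = sum(1 for t, s in zip(y_true, scores) if s >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

# Hypothetical labels (1 = syndrome present) and predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)
auc = roc_auc(y_true, scores)
nb = net_benefit(y_true, scores, threshold=0.5)

print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
print(f"F1={f1:.3f} AUC={auc:.3f} net benefit at 0.5={nb:.3f}")
```

A full decision curve would repeat `net_benefit` over a range of thresholds and compare the model against the "treat all" and "treat none" strategies.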
Current research on VOCs for disease screening, diagnosis, and metabolic abnormality monitoring has made considerable progress in the international breathomics field. Teams from the UK, the Netherlands, Israel, and other countries have formed multi-center, large-sample research systems for lung cancer, infectious diseases, metabolic syndrome, and inflammatory diseases, accumulating extensive experience in sampling standardization, validation of the generalization ability of machine learning models, and cross-platform consistency. In contrast, VOC research conducted under the context of TCM, particularly studies aimed at syndrome differentiation, is primarily driven by a few domestic research teams. There are still significant differences from international breathomics research in terms of sample size, research design, model selection, and syndrome annotation systems. Most domestic studies focus on the VOC fingerprint characteristics of specific syndromes (e.g., phlegm–dampness, qi deficiency, phlegm–blood stasis); however, due to limitations such as single-center data, small sample sizes, and subjective syndrome annotation, the cross-population reproducibility and universality of the results still need further verification. The following content will focus on elaborating the application of VOCs in TCM syndrome differentiation.
The research team of Professor Lin Xuejuan used a custom-built TCM e-nose to analyze the oral breath odor profiles of patients, revealing substantial differences in odor characteristics between those diagnosed with exterior heat syndrome and those diagnosed with exterior cold syndrome.43,88 Patients suffering from chronic gastritis also demonstrated distinct odor profiles based on their cold or heat pattern.42 Subsequent research has shown that patients diagnosed with heat syndromes generally exhibit more pronounced olfactory signals. This finding led to the hypothesis that odor intensity correlates with cold–heat and deficiency–excess patterns. This hypothesis provides a significant foundation for the objective validation of TCM olfactory diagnosis.
The e-nose is aligned with the TCM diagnostic principle of “inferring the internal from the external” by capturing “overall odor patterns” rather than individual compounds. Its core advantage is translating experiential aspects of traditional olfactory diagnosis, such as “concentration, intensity, coldness, and heat”, into quantifiable sensor parameters. This provides an objective tool for the eight principles diagnosis. Despite the need for further optimization to enhance the identification of complex syndromes, the efficacy of this approach in identifying heat syndromes and distinguishing between excess and deficiency syndromes has shown considerable clinical potential. This development indicates a pivotal shift in the transformation of TCM diagnosis, moving from an empirical, descriptive model to a data-driven paradigm.
The research team of Professor Ren Yifeng has categorized the pathological syndrome factors of pulmonary nodules into yin deficiency, phlegm, dampness, qi impediment, and blood deficiency, among others. Combining the Cyranose 320 electronic nose with a suitable model enabled these syndrome factors to be identified with high specificity and sensitivity.48 The research team of Professor Zhu Jie employed the Cyranose 320 electronic nose to explore the distribution characteristics of TCM pattern elements in colorectal cancer and pulmonary nodules, conducting odor spectrum identification46,47 to effectively distinguish disease patterns. Li Yuan et al.89 employed gas chromatography coupled with surface acoustic wave (SAW) sensor technology to analyze exhaled breath from patients with spleen–stomach dysfunction. This method effectively identified patterns such as spleen qi deficiency, damp heat in the spleen and stomach, yang deficiency of the spleen and stomach, stomach yin deficiency, and stomach fire, among others.
Research on electronic noses and odor spectrum analysis has provided substantial evidence for the objectification of TCM pattern classifications, such as cold, heat, deficiency, and excess. The conventional diagnostic paradigm, predicated on the notion that “the intensity and character of odors reflect cold, heat, deficiency, and excess”, has been translated into quantifiable pattern features through the application of modern sensor technology. This development has yielded high levels of accuracy and clinical relevance across a range of diseases and TCM patterns. This advancement not only promotes the modernization and digitization of TCM olfactory diagnosis but also establishes a methodological foundation for the objectification of eight-principles syndrome differentiation. However, further optimization for discriminating complex patterns and large-scale validation studies are still warranted.
The research team of professor Lin Xuejuan has completed a study on the identification of exhalation patterns associated with various disease locations in conditions including community-acquired pneumonia, type 2 diabetes, and chronic gastritis.41,42,44,45 The findings indicated that the primary disease locations in patients with chronic gastritis involve the stomach, spleen, and liver, each exhibiting distinct odor characteristics.93 Patients with chronic gastritis and qi stagnation syndrome showed a distinct oral breath odor spectrum compared with those with disease locations identified as stomach, spleen–stomach, or liver–spleen–stomach. In patients with heat syndrome, prevalent disease locations included the stomach, lung, spleen, and liver. The olfactory spectrum varied among patients sharing the same disease location, and the electronic nose effectively differentiated odor profiles across these locations.43 Furthermore, the team developed advanced models for odor spectrum recognition in diabetes research. These models were used to stage type 2 diabetes mellitus (T2DM), distinguishing between pre-diabetic and diabetic states, and to identify common disease locations, such as the liver, kidneys, and spleen. The study showed T2DM disease location patterns: the pre-diabetic stage primarily involved the liver and kidney (liver 19.67%), whereas the diabetic stage predominantly involved the kidney and liver (kidney 66.67%). These findings indicate that odor changes are associated with organ function, providing biological relevance to the observed odor profiles. The research employed multi-algorithm comparisons (DT/RF/SVM/DNN) and ROC curve analysis, which rigorously enhanced the accuracy of the deep neural network (DNN) models. 
In a related study, the research team of Professor Ren Yifeng used the Cyranose 320 electronic nose combined with a specific model to identify common TCM syndromes in pulmonary nodules.48 This approach enabled precise identification of single-site lesions in the liver, lungs, or kidneys, demonstrating high specificity and sensitivity. The research team of Professor Zhu Jie used an electronic nose to identify odor profiles associated with the pathological locations of colorectal cancer and pulmonary nodules.46,47 The study focused on the TCM pathological locations for both diseases. The pathological location patterns for colorectal cancer patients primarily included the large intestine as well as the spleen, liver, stomach, and kidneys. In contrast, the patterns for pulmonary nodules involved the liver, lungs, kidneys, and spleen. The research showed distinct differences in exhalation profile characteristics across different pathological locations. Sensor data processing using five algorithms (RF, KNN, LR, SVM, and XGBoost) achieved accuracies exceeding 80% for all methods. Among these, random forest demonstrated the highest recognition accuracy. However, due to the limitations of small sample sizes and high-dimensional data characteristics in TCM clinical research, the application of DNN and CNN models often leads to overfitting (excellent performance during model training but failure in external validation), resulting in diagnostic errors in clinical practice. In future related research, it is necessary to introduce standardized data processing workflows, feature engineering strategies, model optimization approaches, heterogeneous federated learning (HFL) architectures, and explainable artificial intelligence (XAI) tools to ensure the usability, generalizability, and reliability of AI in TCM olfaction research.
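The idea behind federated learning, training across centers without sharing raw data, can be sketched with plain federated averaging (FedAvg). This is a simplification: it is not the heterogeneous variant (HFL) named above, and the three synthetic "centers", the logistic regression model, and every hyperparameter are assumptions made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(w, X, y, lr=0.1, epochs=20, l2=0.01):
    """Gradient steps of L2-regularized logistic regression on one center."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y) + l2 * w
        w -= lr * grad
    return w

def fed_avg(centers, n_features, rounds=30):
    """Server loop: broadcast weights, collect updates, average by sample count."""
    w = np.zeros(n_features)
    total = sum(len(y) for _, y in centers)
    for _ in range(rounds):
        # Only model weights leave each center; X and y stay local.
        updates = [local_update(w, X, y) for X, y in centers]
        w = sum(len(y) / total * u for u, (_, y) in zip(updates, centers))
    return w

def make_center(rng, n, offset):
    """Synthetic center: shared class signal plus a center-specific offset."""
    y = rng.integers(0, 2, size=n)
    X = rng.normal(0, 1, size=(n, 5)) + offset  # offset mimics device/site shift
    X[:, 0] += 2.0 * y                          # feature 0 carries the class signal
    X = np.hstack([X, np.ones((n, 1))])         # bias column
    return X, y.astype(float)

rng = np.random.default_rng(1)
centers = [
    make_center(rng, 40, 0.0),
    make_center(rng, 25, 0.3),
    make_center(rng, 60, -0.2),
]
w = fed_avg(centers, n_features=6)

# Sanity check only: data are pooled here for illustration, which a real
# federated deployment would not permit, and this is not external validation.
X_all = np.vstack([X for X, _ in centers])
y_all = np.concatenate([y for _, y in centers])
acc = np.mean((sigmoid(X_all @ w) >= 0.5) == (y_all == 1))
print(f"pooled accuracy of the federated model: {acc:.2f}")
```

Weighting each center's update by its sample count keeps small centers from dominating the shared model; handling strongly different feature distributions or device types would require the heterogeneity-aware methods that the text refers to.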
Research on VOCs in exhaled breath provides a novel objective method for identifying disease locations. The electronic nose technology and associated spectral analysis can effectively reveal odor differences associated with distinct organ systems, whereas multi-algorithm modeling markedly enhances the accuracy of disease localization. These findings not only validate the TCM theory that ‘disease locations can be reflected by odors’ but also provide biological interpretability for odor spectra.
In summary, research on exhaled VOCs provides an objective method for identifying disease patterns and locations in TCM, which fully demonstrates the modern application potential of “olfactory diagnosis”. By capturing holistic VOC profiles, electronic noses transform traditional empirical concepts, such as “cold, heat, deficiency, excess, and pattern differentiation of zang fu organs”, into quantifiable parameters and have shown high accuracy and clinical value across multiple diseases. Furthermore, multi-algorithm modeling enhances identification efficacy and provides clear biological interpretations for odor changes. However, most current studies remain constrained to the qualitative analysis provided by electronic noses, which hinders the separation and identification of individual VOC components and constrains the in-depth exploration of specific biomarkers. The future incorporation of precision detection technologies, such as chromatography and mass spectrometry, has the potential to enhance the differentiation and treatment of TCM syndromes. Such an integration is expected to facilitate a shift from macro-level experiential practice to micro-level mechanistic understanding, thereby achieving greater precision and standardization in TCM therapy. Despite the preliminary research foundation established by domestic teams in TCM syndrome-VOC correlation, future efforts still rely on more standardized sampling protocols, stricter syndrome annotation systems, and multicenter collaborative studies. These measures are aimed at constructing a high-quality evidence system capable of engaging in dialogue with international breathomics research and enhancing the reliability, interpretability, and international recognition of TCM syndrome-related VOC biomarkers.
The AI-based identification and analysis of auscultation and palpation is a crucial step in the modernization of the four diagnostic methods of TCM. Future integrated, intelligent diagnostic systems for the four diagnostic methods will represent a major application of AI in TCM syndrome differentiation. These systems combine multi-label classification and deep learning technologies to automate the complex task of TCM pattern differentiation. Treating information gathered from the four examinations as a unified dataset for analysis through multi-label and deep learning models enables these systems to more effectively handle the intricate relationships between diverse symptom characteristics and syndromes, thereby enhancing diagnostic accuracy.96,97
Data processing technology is a crucial part of exhaled-breath detection. Owing to the complex composition of exhaled breath, untargeted full-spectrum analysis often detects massive metabolite information. To effectively identify these potential biomarkers, researchers must employ a series of data analysis methods for in-depth exploration. The combined application of basic statistical methods and machine learning (ML) has become the mainstream data analysis method in the field of exhaled-breath analysis in recent years. It integrates the rigor of statistics and the intelligence of algorithms, thereby enabling more precise revelation of biochemical information in exhaled breath and enhancing the accuracy of data analysis. The data processing of exhaled-breath signals mainly includes data acquisition, preprocessing, feature extraction, selection, modeling, classification, as well as model evaluation and optimization. In these steps, improper preprocessing may remove valid spectra and introduce additional errors and uncertainties. The quality of data preprocessing and feature extraction directly affects model performance. If the data is not comprehensive or biased, the accuracy of model predictions will be affected. Existing ML models and algorithms may not be applicable to all types of exhaled-breath samples, particularly those with significant individual differences or complex matrices. To address these problems, the amount of high-quality training data can be increased to ensure data diversity and representativeness, reduce bias, and improve model generalization ability. In summary, with the rapid development of artificial intelligence, continuous research and optimization of existing algorithms, as well as continuous introduction of the latest ML and DL (deep learning) technologies, is the best choice to promote the innovative development of data analysis in current digital-intelligent TCM olfaction diagnosis.
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2026