Open Access Article
Zhiheng Yu, Yongyan Ji, Zijun Meng and Xiang Li*
Shanghai Key Laboratory of Air Quality and Environmental Health, Department of Environmental Science & Engineering, Fudan University, Shanghai 200438, P.R. China. E-mail: lixiang@fudan.edu.cn
First published on 28th April 2026
The lack of universal cross-species sampling and quality control methods has limited the potential of breath metabolomics to advance from clinical discovery to mechanistic validation. To address this challenge, this study developed and systematically validated an integrated cross-species breath analysis platform, with its core components comprising a high-sensitivity mouse breath sampling system (FaunaScope) and a quality control (QC) strategy incorporating behavioral monitoring. By identifying ethyl acetate and dimethyl sulfide as characteristic interference markers, the platform exhibited high detection capability and good analytical reproducibility in the analysis of 33 core volatile organic compounds (VOCs) in mouse breath, with detection rates exceeding 88.2% for 30 VOCs and coefficients of variation below 30% for more than 70% of the compounds. Using inflammatory bowel disease (IBD) as a demonstration model, the platform enabled a full-chain study from validation of breath fingerprints in clinical patients to longitudinal monitoring in mouse models, successfully capturing dynamic metabolic changes in short-chain fatty acids (SCFAs) and responses to exclusive enteral nutrition (EEN) intervention, while revealing metabolic feature divergence between humans and mice attributable to differences in pathological mechanisms. Overall, this platform exhibits high sensitivity and strong resistance to interference, providing an effective translational medicine tool for linking clinical findings with fundamental mechanistic research.
However, to advance breath analysis from clinical observation to pathological mechanism elucidation, translational research still faces severe technical barriers. First, in terms of sampling strategies, although sampling bags (e.g., Tedlar bags) are widely used in clinical and animal studies due to their convenience,3,6–10 their limitations in sensitivity, temporal resolution, and background contamination control are becoming apparent. Studies have shown that polymer bags can compromise sample integrity through analyte adsorption loss or material background outgassing, which is particularly detrimental to the analysis of trace VOCs.11,12 Second, regarding detection techniques, while online mass spectrometry (e.g., PTR-MS and SIFT-MS) has been used in numerous studies for large-scale disease screening and characterization due to its real-time monitoring advantage,13,14 it has limitations. As highlighted in authoritative reviews, online mass spectrometry techniques lack separation steps like chromatography. This results in significant limitations in resolving isomers and in the structural identification of new compounds, thereby limiting their potential application in the in-depth tracing of complex metabolic pathways.15 In contrast, offline analysis based on gas chromatography-mass spectrometry (GC-MS) provides higher structural resolution. Numerous studies have confirmed its gold-standard status in the discovery of novel biomarkers and in full-spectrum metabolomic analysis.16–18 However, this technique demands extremely high enrichment efficiency and cleanliness from the sampling front-end.19 Critically, existing breath sampling devices designed for laboratory animals still have design flaws. 
Intubation-based ventilation systems can reduce ambient air contamination, but the procedure is invasive and introduces system background from complex tubing.20 In contrast, non-invasive whole-body or nose-only exposure chambers, while avoiding anesthesia, are highly susceptible to contamination from non-respiratory sources (mainly excreta, i.e., urine and feces).21 Overall, these devices generally lack effective quality control (QC) mechanisms for the excretory behavior of rodents. The matrix effect from high-concentration VOCs released by excreta can easily mask trace endogenous breath signals. This leads to a significant decrease in the data's signal-to-noise ratio and reproducibility. This has become a key bottleneck that prevents current animal breath research from achieving reliable, high-fidelity correlation with high-quality clinical data.
To address the core analytical challenges of ensuring sampling reliability and controlling matrix interference, this study developed and systematically validated an integrated cross-species breath metabolomics research platform. At the methodological level, we first addressed the sampling bottleneck by developing the FaunaScope, a high-performance breath collection system for small laboratory animals. The system uses a compact geometry matched to the size of a mouse to minimize dead volume, and it is coupled to high-sensitivity thermal desorption-gas chromatography-tandem mass spectrometry (TD-GC-MS/MS). Critically, to address interference from complex biological matrices at the source, we established a QC strategy that combines real-time behavioral monitoring with the identification of interference markers characteristic of excretion, so that only breath samples free of excretion contamination are retained. To test the platform under complex pathophysiological conditions, we selected inflammatory bowel disease (IBD) as a demanding demonstration model. In the “gut–lung axis” of IBD, metabolites such as short-chain fatty acids (SCFAs) produced by the gut microbiota must cross a compromised intestinal barrier and reach the breath via the blood circulation;22 the resulting signal is weak and complex, which places high demands on the sensitivity and anti-interference capability of the analytical platform. 
In this study, the platform was not only successfully applied to the longitudinal monitoring of a dextran sulfate sodium (DSS)-induced colitis mouse model and its response to exclusive enteral nutrition (EEN) intervention, but more importantly, it accurately captured the phenomenon of metabolic feature divergence between humans and mice caused by the pathological differences between “acute gut leakage” and “chronic dysregulation”. This result highlights the excellent capability of this platform in detecting subtle metabolic fluctuations within complex biological matrices. Therefore, this study establishes a reliable analytical tool for translational research, capable of providing a solid bridge to connect clinical phenotypes with their underlying pathobiological mechanisms.
As shown in Fig. 1A, the platform architecture covers two parallel routes: (1) a clinical discovery route, which supports non-invasive breath sampling and high-throughput biomarker screening based on large-scale human cohorts; (2) a basic validation route, which utilizes the FaunaScope breath collection system for long-term, non-invasive breath monitoring and mechanism elucidation in various small laboratory animal models. Both routes rely on a unified high-sensitivity detection core—the thermal desorption-gas chromatography-tandem mass spectrometry (TD-GC-MS/MS) system. This is combined with automated data preprocessing and multivariate statistical analysis to ensure high fidelity and comparability of cross-species data.
Breath sample collection was performed according to a Standard Operating Procedure (SOP). All participants were required to fast overnight (>8 h) and underwent sampling between 07:00 and 09:00 in the morning. Prior to sampling, subjects were instructed to rinse their mouths and avoid strenuous physical activity to reduce exogenous interference. Breath sampling was conducted using a ReCIVA™ breath sampler (Owlstone Medical, UK), coupled with a CASPER clean air supply unit, and preferential collection of the alveolar fraction was achieved under temperature-controlled conditions. Exhaled breath was enriched onto sorbent tubes (Tenax TA/Carbograph 5TD) at a flow rate of 200 mL min−1, with a total collection volume of 2.0 L per sampling session. Urine samples collected synchronously with breath sampling and their corresponding processing procedures are described in the SI (Text S2).
At the gas supply end, high-purity synthetic air (air, 99.999%) serves as the carrier gas. Its flow rate is precisely controlled by a mass flow controller before it passes through an activated charcoal adsorption filter. This step is designed to remove trace organic impurities from the gas source and tubing. This establishes a low background sampling baseline, ensuring that detected VOCs originate from the experimental animal and not from environmental interference.
The core sampling component is a custom-made transparent polymethyl methacrylate (PMMA) restrainer. Its transparent material allows for real-time observation of the mouse's state. Its compact internal structure is designed to reduce dead volume, thereby effectively minimizing the dilution of the breath sample by the carrier gas. To ensure the fidelity of sample transfer, all connecting tubes and fittings within the device are made of polytetrafluoroethylene (PTFE). This material has an extremely low chemical background and adsorptivity, which minimizes the contamination or loss of VOCs in the tubing.
The device employs a dual-channel parallel design, allowing for the independent sampling of two small laboratory animals simultaneously. Driven by the negative pressure of a mini vacuum pump, the purified carrier gas continuously flows through the restrainer. It carries the metabolic gases exhaled by the animal through a thermal desorption tube (TD tube) at a constant flow rate, achieving online enrichment and collection of breath VOCs.
To minimize potential interference of stress induced by restraint and handling on breath metabolic profiles, a 3-day acclimation period (Days 1–3) was implemented prior to the formal experiments. During this period, all mice underwent standardized device acclimation training to gradually familiarize them with the breath sampling environment (detailed procedures are described in SI, Text S3). From Day 4 onward, mice were randomly assigned to a DSS-induced group or an EEN intervention group (n = 6 per group). In both groups, acute colitis was induced by providing ad libitum access to a 2% (w/v) DSS solution (molecular weight 36 000–50 000; MP Biomedicals, LLC., USA) for 8 consecutive days. During the modeling period, mice in the EEN group received exclusive enteral nutrition formula (4.5 g/18 mL) in place of standard chow to mimic a clinical nutritional intervention regimen.
Throughout the modeling period, clinical symptoms were monitored daily and the disease activity index (DAI) was calculated. The DAI score was determined based on a composite assessment of body weight change, stool consistency, and the presence of fecal blood, with detailed scoring criteria provided in the SI (Table S1).
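The composite DAI described above can be sketched in a few lines. The actual scoring criteria are given in SI Table S1 and are not reproduced here, so this sketch assumes that each of the three components has already been graded on a 0–4 scale and that the DAI is their mean. Both the scale and the averaging rule are assumptions, as is the function name.

```python
def disease_activity_index(weight_loss_score, stool_score, blood_score):
    """Combine three component grades into one DAI value.

    Assumption: each component is graded 0-4 and the DAI is the mean
    of the three grades (the real criteria are in SI Table S1).
    """
    components = (weight_loss_score, stool_score, blood_score)
    for score in components:
        if not 0 <= score <= 4:
            raise ValueError("component scores are assumed to lie in 0-4")
    return sum(components) / len(components)


# Hypothetical component grades for one mouse on one day.
print(disease_activity_index(4, 4, 3))
```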
Quality control was maintained throughout the entire sampling process. The core QC strategy involved real-time monitoring of excretion interference: any sample with a recorded defecation or urination event during sampling was labeled as a “contaminated sample (WE)” and was excluded from subsequent data analysis (see Section 3.4 for details). Furthermore, for each sampling batch, a gas sample from the unloaded device was collected concurrently as an environmental blank for background subtraction. All procedures were performed by the same trained operator to minimize systematic and human errors.
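The two QC steps in this paragraph (excluding WE samples and subtracting the batch blank) can be illustrated with a minimal sketch. The record layout, the names `BreathSample` and `apply_qc`, and the choice to clip negative blank-corrected values at zero are all assumptions made for illustration and are not taken from the study's pipeline.

```python
from dataclasses import dataclass


@dataclass
class BreathSample:
    sample_id: str
    batch: str
    excretion_observed: bool  # True marks a "WE" (contaminated) sample
    signals: dict             # compound name -> concentration (ng/L)


def apply_qc(samples, blanks):
    """Drop WE samples, then subtract the matching batch blank.

    `blanks` maps batch -> {compound: blank concentration}.
    Negative results are clipped to zero (an assumption).
    """
    cleaned = []
    for s in samples:
        if s.excretion_observed:
            continue  # excluded from all downstream analysis
        blank = blanks.get(s.batch, {})
        corrected = {
            compound: max(value - blank.get(compound, 0.0), 0.0)
            for compound, value in s.signals.items()
        }
        cleaned.append(BreathSample(s.sample_id, s.batch, False, corrected))
    return cleaned


# Hypothetical batch with one clean and one contaminated sample.
batch = [
    BreathSample("m1", "b1", False, {"acetone": 2.5}),
    BreathSample("m2", "b1", True, {"acetone": 9.0}),
]
print(apply_qc(batch, {"b1": {"acetone": 0.5}}))
```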
Compound identification was based on matching full-scan data (m/z 30–350) against the NIST20 library. For quantitative analysis, a combined approach of Selected Ion Monitoring (SIM) and Multiple Reaction Monitoring (MRM) was employed, with quantification achieved by constructing standard curves using the external standard method. Specifically, to mimic breath sampling, calibration curves were constructed by injecting 1 µL of standard mixture solutions and internal standards into conditioned sorbent tubes, with each concentration level measured in triplicate. The calibration curves were established from the peak area ratios (target analyte to internal standard) versus mass concentrations. The concentration ranges, linear dynamic ranges, and correlation coefficients (R²) for the targeted compounds are summarized in SI Table S2. Given the inherent differences in breath metabolite abundance and sampling volume between humans and mice (2.0 L vs. 0.5 L), species-specific targeted quantitative libraries were developed. For human samples, a target list comprising 71 VOCs was established (see SI Table S3 for details). For mouse samples, 33 core VOCs with high detection rates were selected for focused monitoring (see Section 3.2, Table 1). The abbreviations and full names of these compounds are provided in SI Table S4. Detailed parameters for the TD, the GC oven temperature program, and MS conditions can be found in the SI (Text S4).
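As an illustration of the calibration step, the sketch below fits a straight line to peak-area ratios and converts a measured ratio into a breath concentration. It assumes the calibration axis is the analyte mass loaded on the tube (ng) and that the result is divided by the sampled gas volume (0.5 L for mice, 2.0 L for humans). The function names and the toy calibration points are hypothetical.

```python
def fit_calibration(mass_ng, area_ratio):
    """Ordinary least-squares line: area_ratio = slope * mass + intercept."""
    n = len(mass_ng)
    mean_x = sum(mass_ng) / n
    mean_y = sum(area_ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in mass_ng)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mass_ng, area_ratio))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2
                 for x, y in zip(mass_ng, area_ratio))
    ss_tot = sum((y - mean_y) ** 2 for y in area_ratio)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


def breath_concentration(area_ratio, slope, intercept, volume_l):
    """Invert the calibration line, then normalize by sampled volume (ng/L)."""
    mass_ng = (area_ratio - intercept) / slope
    return mass_ng / volume_l


# Toy calibration points (hypothetical, perfectly linear).
slope, intercept, r2 = fit_calibration([1, 2, 4, 8], [0.75, 1.25, 2.25, 4.25])
print(breath_concentration(2.25, slope, intercept, volume_l=0.5))
```

Normalizing by volume is what makes the 0.5 L mouse samples and the 2.0 L human samples comparable on a common ng L⁻¹ scale.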
| No. | Compound name | RT (min) | Mean ± SD (ng L⁻¹) | CV (%) | LOD (ng L⁻¹) | LOQ (ng L⁻¹) | Detection rate (%) |
|---|---|---|---|---|---|---|---|
| 1 | Acetic acid | 12.76 | (2.04 ± 0.73) × 10³ | 35.73 | 40.080 | 94.295 | 100.0 |
| 2 | Propionic acid | 16.98 | 7.36 ± 3.14 | 22.70 | 0.068 | 0.159 | 100.0 |
| 3 | Dimethyl sulfide | 16.76 | (6.00 ± 1.00) × 10⁻² | 13.47 | 0.054 | 0.065 | 88.2 |
| 4 | Isobutyric acid | 18.88 | 1.09 ± 0.58 | 29.20 | 0.124 | 0.353 | 100.0 |
| 5 | Butyric acid | 19.92 | 2.66 ± 1.19 | 24.59 | 0.004 | 0.012 | 100.0 |
| 6 | Valeric acid | 23.88 | 0.94 ± 0.42 | 45.11 | 0.006 | 0.018 | 100.0 |
| 7 | D-Limonene | 27.28 | 2.40 ± 1.23 | 51.38 | 0.410 | 0.930 | 100.0 |
| 8 | Benzonitrile | 27.87 | 0.15 ± 0.03 | 18.13 | 0.107 | 0.219 | 100.0 |
| 9 | Phenol | 28.76 | 0.72 ± 0.25 | 34.45 | 0.201 | 0.342 | 100.0 |
| 10 | Geranyl acetone | 37.87 | 2.34 ± 0.84 | 35.78 | 0.087 | 0.225 | 100.0 |
| 11 | Propionaldehyde | 7.21 | 0.50 ± 0.12 | 23.30 | 0.713 | 1.365 | 11.8 |
| 12 | Acetone | 7.38 | 34.23 ± 11.58 | 33.82 | 2.858 | 8.373 | 100.0 |
| 13 | Isopropanol | 7.63 | 95.27 ± 66.53 | 69.83 | 0.561 | 1.324 | 100.0 |
| 14 | Methacrolein | 9.65 | 0.44 ± 0.33 | 73.52 | 0.460 | 0.814 | 47.1 |
| 15 | Ethyl acetate | 11.09 | 0.93 ± 0.56 | 59.91 | 0.332 | 0.732 | 100.0 |
| 16 | tert-Pentanol | 12.90 | 0.26 ± 0.16 | 60.75 | 0.014 | 0.033 | 100.0 |
| 17 | Butanol | 14.04 | 1.22 ± 0.27 | 22.44 | 0.781 | 1.735 | 100.0 |
| 18 | 2-Pentanone | 14.98 | 0.19 ± 0.02 | 12.48 | 0.222 | 0.414 | 35.3 |
| 19 | Acetoin | 16.76 | 5.51 ± 2.03 | 36.84 | 0.397 | 1.141 | 100.0 |
| 20 | Hexanal | 19.24 | 0.75 ± 0.12 | 15.34 | 0.246 | 0.317 | 100.0 |
| 21 | 1,2-Propanediol | 19.28 | 0.40 ± 0.06 | 15.60 | 0.202 | 0.233 | 100.0 |
| 22 | 2-Heptanone | 23.02 | 0.25 ± 0.06 | 23.88 | 0.156 | 0.161 | 100.0 |
| 23 | Heptanal | 23.32 | 0.65 ± 0.18 | 27.28 | 0.117 | 0.223 | 100.0 |
| 24 | Benzaldehyde | 26.53 | 1.04 ± 0.29 | 28.41 | 0.542 | 0.871 | 100.0 |
| 25 | Octanal | 27.09 | 0.73 ± 0.19 | 25.82 | 0.167 | 0.330 | 100.0 |
| 26 | Nonanal | 30.20 | 1.69 ± 0.63 | 37.35 | 0.614 | 1.534 | 100.0 |
| 27 | Decanal | 32.64 | 0.49 ± 0.20 | 30.83 | 0.338 | 0.476 | 76.5 |
| 28 | Decane | 25.40 | 0.18 ± 0.07 | 41.79 | 0.151 | 0.304 | 58.8 |
| 29 | Undecane | 28.79 | 0.39 ± 0.23 | 60.53 | 1.392 | 3.830 | 5.9 |
| 30 | Dodecane | 31.42 | 0.45 ± 0.17 | 37.28 | 0.656 | 1.598 | 5.9 |
| 31 | Tridecane | 33.54 | 0.65 ± 0.19 | 29.55 | 0.243 | 0.482 | 100.0 |
| 32 | Tetradecane | 35.44 | 2.90 ± 0.93 | 32.03 | 0.556 | 1.088 | 100.0 |
| 33 | Pentadecane | 37.50 | 5.02 ± 0.87 | 17.38 | 0.723 | 1.590 | 88.2 |
Histopathological evaluation was conducted using a comprehensive scoring system (total score range: 0–15), with detailed criteria provided in the SI (Table S5). To eliminate subjective bias, the scoring was performed using a strict double-blind strategy. Two pathologists, unaware of the group assignments, independently evaluated the slides under 200× magnification. If the discrepancy between their scores was greater than 1 point, a third pathologist was brought in for a review. The final histological score for each sample, used in statistical analysis, was the average of the scores.
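The adjudication rule for the histological score can be expressed as a short function. This is a sketch under one stated assumption: when a third pathologist is consulted, the final score is the mean of all three scores (the text says only that the final score is "the average of the scores"). The function name and the callback interface are hypothetical.

```python
def final_histology_score(score_a, score_b, third_review=None):
    """Average two blinded scores; call in a third reviewer if they differ by > 1.

    `third_review` is a zero-argument callable returning the third score.
    Assumption: with three scores available, all three are averaged.
    """
    for score in (score_a, score_b):
        if not 0 <= score <= 15:
            raise ValueError("scores must lie in the 0-15 range")
    scores = [score_a, score_b]
    if abs(score_a - score_b) > 1:
        if third_review is None:
            raise ValueError("discrepancy > 1 point requires a third review")
        scores.append(third_review())
    return sum(scores) / len(scores)


print(final_histology_score(6, 7))               # reviewers agree within 1 point
print(final_histology_score(4, 8, lambda: 6))    # third reviewer consulted
```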
…:2 ratio using stratified sampling. To strictly prevent data leakage, feature standardization (Z-score) parameters were fitted exclusively on the training set. Inter-group differences were assessed using the Mann–Whitney U test combined with False Discovery Rate (FDR) correction (p < 0.05; FDR < 0.1). Multiple models, including random forest and XGBoost, were deployed, and their diagnostic performance was evaluated using the ROC-AUC.
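Two of the safeguards named here, training-set-only standardization and FDR correction, can be sketched with the standard library alone. The sketch assumes the sample standard deviation for the Z-score and the Benjamini–Hochberg procedure for FDR; the text does not state which FDR method was used, so that choice is an assumption, as are the function names.

```python
import statistics


def fit_zscore(train_values):
    """Estimate mean and SD from the training split only."""
    return statistics.fmean(train_values), statistics.stdev(train_values)


def apply_zscore(values, mean, sd):
    """Scale any split (train or test) with the training-set parameters."""
    return [(v - mean) / sd for v in values]


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (q-values) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


mean, sd = fit_zscore([1.0, 2.0, 3.0])       # training split only
print(apply_zscore([4.0], mean, sd))         # test split reuses the parameters
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

Fitting the scaler on the training split and reusing its parameters on the test split keeps information about the test distribution out of model fitting.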
The mouse data underwent log₁₀ transformation and standardization before being subjected to Principal Component Analysis (PCA). Differences among multiple groups were analyzed using a one-way analysis of variance (ANOVA) followed by Tukey's HSD post-hoc test, with FDR correction (FDR < 0.1) also applied to control for false positives.
Further methodological validation assessed the sensitivity and precision of the analytical workflow (Table 1). The method exhibited extremely low limits of detection for various key metabolites, for example, propanoic acid (LOD: 0.068 ng L−1) and butanoic acid (LOD: 0.004 ng L−1), ensuring the accurate quantification of trace signal molecules. The low detection rate of some long-chain alkanes (e.g., undecane and dodecane) (approx. 5.9%) can be attributed to their concentrations under healthy baseline conditions being far below their respective limits of detection. This precisely reflects the method's ability to objectively distinguish background noise from true biological signals. In terms of precision, the coefficients of variation (CVs) for all target analytes ranged from 12.48% to 73.52%. Notably, the CV values for over 70% (24/33) of the compounds, including various core metabolites such as propanoic acid (29.08%), butanoic acid (22.18%), and 1,2-propanediol (15.60%), were all below 30%. This indicates that the entire collection and analysis workflow possesses high robustness. For the few compounds with higher CV values, their variation is more likely to originate from the unavoidable inter-individual biological differences in living animals, rather than from the technical variation of the method itself.
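The two performance metrics discussed above can be computed as follows. This is a minimal sketch that assumes a sample counts as "detected" when its concentration is at or above the LOD, and that the CV uses the sample standard deviation. Neither convention is stated explicitly in the text, and the example data are hypothetical.

```python
import statistics


def detection_rate(concentrations, lod):
    """Percentage of samples at or above the LOD (None counts as non-detect)."""
    detected = [c for c in concentrations if c is not None and c >= lod]
    return 100.0 * len(detected) / len(concentrations)


def cv_percent(concentrations):
    """Coefficient of variation: 100 * SD / mean (sample SD)."""
    return 100.0 * statistics.stdev(concentrations) / statistics.fmean(concentrations)


# Hypothetical series: 15 of 17 samples above an LOD of 0.054 ng/L.
series = [0.06] * 15 + [0.01, None]
print(round(detection_rate(series, lod=0.054), 1))
print(cv_percent([2.0, 4.0, 6.0]))
```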
To further demonstrate the analytical superiority of the FaunaScope, a comprehensive comparison with conventional sampling methods (e.g., Tedlar bags and conventional animal exposure chambers) was conducted (see SI Table S6 for details). Conventional exposure chambers often involve relatively large internal volumes, which can introduce dilution effects and reduce the effective concentration of trace VOCs, thereby potentially decreasing signal-to-noise ratios.21 Furthermore, Tedlar bags are known to emit interfering background VOCs such as phenol and N,N-dimethylacetamide.8,25 The FaunaScope overcomes these limitations through its inert material composition and minimized dead volume, thereby significantly improving the signal-to-noise ratio for in vivo breath monitoring.
In summary, the FaunaScope system combined with TD-GC-MS/MS analysis constitutes a highly sensitive and precise analytical workflow. This workflow lays a solid analytical methodological foundation for reliably capturing and quantifying faint metabolic changes in complex disease models.
Based on the quantitative data of 71 core VOCs established by the methodology, this study further employed seven mainstream machine learning algorithms to build diagnostic models. Among them, the random forest (RF) model exhibited the optimal classification performance in test set validation (Fig. 2C). Its area under the ROC curve (AUC) reached 0.815 (Fig. 2D), significantly outperforming other models. The robustness of the RF model to class imbalance and feature collinearity in high-dimensional data processing is the key to its excellent diagnostic efficacy, and this advantage has also been confirmed in studies on complex disease biomarker screening.28 To precisely identify the most biologically significant core biomarkers, we adopted a strict multi-step screening strategy: a compound had to simultaneously satisfy being in the top 15 for feature importance in the random forest model, showing significant inter-group differences (Mann–Whitney U test, p < 0.05, and FDR < 0.1) and having a clear gut microbial metabolic origin. Through this strategy, we ultimately identified 4 key biomarkers: propanoic acid, butanoic acid, isobutanoic acid, and 1,2-propanediol.
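The multi-step screening rule reads naturally as a conjunction of filters. The sketch below assumes per-compound dictionaries of feature importance, p-values and FDR-adjusted q-values, plus a curated set of compounds with a known gut microbial origin. All names and the toy values are hypothetical.

```python
def screen_biomarkers(importance, p_values, q_values, microbial_origin,
                      top_n=15, p_cut=0.05, q_cut=0.1):
    """Keep compounds that pass every criterion at once."""
    # Step 1: top-N compounds by model feature importance.
    ranked = sorted(importance, key=importance.get, reverse=True)[:top_n]
    # Steps 2-3: significance thresholds and microbial origin.
    return [
        compound for compound in ranked
        if p_values[compound] < p_cut
        and q_values[compound] < q_cut
        and compound in microbial_origin
    ]


# Toy inputs (hypothetical values, top_n reduced for brevity).
importance = {"propanoic acid": 0.5, "acetone": 0.3, "phenol": 0.2}
p_values = {"propanoic acid": 0.01, "acetone": 0.20, "phenol": 0.001}
q_values = {"propanoic acid": 0.05, "acetone": 0.30, "phenol": 0.01}
print(screen_biomarkers(importance, p_values, q_values,
                        microbial_origin={"propanoic acid", "phenol"}, top_n=2))
```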
Compared to the healthy control group, the abundances of these four biomarkers in the breath of IBD patients were all drastically down-regulated (Fig. 2E). Specifically, the mean concentration of the core biomarker propanoic acid decreased from approximately 8.0 ng L−1 in the healthy group to about 4.0 ng L−1 in the disease group, a drop of nearly 50%. The decrease in isobutanoic acid was even more significant, with its concentration level reduced by about 55% (from ∼1.5 ng L−1 to <0.7 ng L−1). This systemic deficiency of SCFAs and their derivatives is generally attributed to severe dysbiosis of the gut microbiota, particularly the reduced abundance of acid-producing bacteria (e.g., Faecalibacterium and Eubacterium). Multiple clinical studies and meta-analyses have also confirmed that SCFA levels in the gut and various biological samples (feces and blood) of IBD patients are significantly reduced. Furthermore, their concentrations are negatively correlated with intestinal inflammation activity.29–31 Notably, our parallel analysis of concurrent urine samples also revealed a consistent downward trend for these biomarkers (SI Fig. S1). This further confirms their reliability as indicators of systemic metabolic disorder, rather than originating merely as local products from the oral cavity. This cross-matrix commonality of metabolic features also provides a solid basis for using breath as a non-invasive window for monitoring systemic metabolic status.32,33 Metabolic pathway enrichment analysis suggested (Fig. 2F) that these substances primarily originate from the anaerobic fermentation of carbohydrates and dietary fibers by gut microbiota.34,35 Their down-regulation in breath may reflect the reduced abundance and impaired function of acid-producing gut bacteria in the IBD state. 
Additionally, microbial fermentation derivatives like 1,2-propanediol have also been shown to be closely related to changes in gut microbial metabolism,36 supporting our interpretation of it as a candidate biomarker for microbial metabolic alterations.
However, although the clinical study established a clear correlation of the down-regulation of SCFAs and their derivatives in IBD, it also exposed key methodological challenges. In a clinical environment with confounding factors such as complex diet and medication, it is difficult to precisely resolve the dynamic response of these biomarkers to therapeutic interventions. It is also impossible to directly validate their “gut–lung axis” transport mechanism at the in vivo level (Fig. 2G). Therefore, to deeply investigate their generation, transport, and exhalation processes, it is necessary to rely on animal models with highly controlled environments. This limitation highlights the necessity of building a high-fidelity animal experiment platform. It also provides a clear research direction and strategy for using the FaunaScope system for full-cycle longitudinal monitoring to solve complex translational medicine problems.
First, longitudinal tracking of systemic clinical indicators revealed significant inter-group differences. Body weight monitoring showed (Fig. 3B) that mice in the DSS model group experienced a sharp weight loss (down to 75% of baseline) in the late modeling stage (Days 9–11), presenting a typical cachectic state. This wasting phenotype is consistent with severe diarrhea, dehydration, and enhanced systemic protein catabolism caused by acute colitis.38 In contrast, the body weight of the EEN intervention group significantly rebounded and approached the baseline (>98%) at the experimental endpoint. This confirmed the effectiveness of exclusive enteral nutrition in counteracting inflammation-related energy deficit.39,40 The DAI score further quantified this phenotypic difference (Fig. 3C): the DAI score of the DSS group progressively worsened over time, reaching a peak on Day 11 (score near 4.0). In contrast, the DAI curve of the EEN group remained at a low level throughout (<2.0, p < 0.05), visually reflecting the excellent efficacy of EEN in alleviating clinical symptoms.
In the anatomical and histopathological evaluations at the experimental endpoint, we observed macroscopic and microscopic pathological changes that were highly consistent with the clinical phenotype. Gross morphology showed that the colons of mice in the DSS group were markedly shortened (approx. 30% shorter than those of the EEN group) (Fig. 3D and SI Fig. S2A), consistent with smooth muscle spasm and fibrosis caused by transmural inflammation.41 At the microscopic level, the H&E staining results (Fig. 3D) further revealed the injury mechanism: sections from the model group showed extensive disintegration of the crypt structure, depletion of goblet cells, and massive inflammatory cell infiltration in the mucosa and submucosa. These are all typical pathological features of DSS-induced colon injury.42 Conversely, in the EEN intervention group, these pathological changes were significantly suppressed. The colon length remained normal, and relatively intact crypt architecture and epithelial continuity were visible under the microscope, indicating that EEN can effectively promote mucosal healing.39 Quantitative analysis of the histological score also confirmed that EEN intervention reduced the microscopic intestinal damage score by 45% (SI Fig. S2B).
In summary, the systemic clinical, anatomical, and histopathological data all confirmed that this study successfully constructed a severe IBD mouse model and an effective EEN intervention model. This laid a reliable biological foundation for subsequent in-depth breath metabolomics analysis based on the FaunaScope system.
PCA intuitively revealed the drastic perturbation of excretion behavior on breath fingerprints (Fig. 4A). Under the three physiological backgrounds of healthy (HC), model (DSS), and intervention (EEN), the WE group samples all significantly deviated from their corresponding NE group cluster centers. Their intra-group dispersion also increased substantially. This indicates that VOCs released from excreta constitute strong background noise sufficient to mask endogenous metabolic features. Bubble plot analysis further confirmed the broad-spectrum nature of this interference. In the WE group, over 80% of the detected compounds showed a “large circle/dark red” feature, representing high concentrations. This displayed an explosive signal enhancement across the full spectrum (Fig. 4B). This is attributed to the release of large amounts of highly volatile compounds from fresh excreta under the effects of body temperature and airflow.
To identify universal “excretion event indicators,” we performed differential screening between WE and NE within all three experimental groups (p < 0.05). A network Venn diagram was used to extract the intersection (Fig. 4C). The results showed that although there were subtle differences in the excretion metabolite profiles at different physiological states, ethyl acetate and dimethyl sulfide were consistently identified as core intersecting substances. This finding is of great significance. It indicates that these two substances are stable excretion markers unaffected by the disease state. The violin plot (Fig. 4D) further revealed their astonishing interference intensity. Compared to the NE group, the mean concentration of dimethyl sulfide in WE group samples increased by 2–3 times, while ethyl acetate surged by 5–10 times. From a biochemical mechanism perspective, dimethyl sulfide is a typical metabolic end-product from the degradation of sulfur-containing amino acids (e.g., methionine) by gut anaerobic bacteria. Its high abundance in fecal headspace has been confirmed by research to be a direct indicator of gut microbial fermentation activity.44,45 Ethyl acetate, as a downstream product of ethanol and acetic acid under the action of microbial esterases, is also widely present in biological excreta.46,47 Therefore, these two compounds can serve as reliable “characteristic interfering substances” to identify potential sample contamination or background interference in animal experimental breath analysis. While ethyl acetate and dimethyl sulfide serve as robust markers for excretion interference in our current model, their universality requires further investigation. Because the production of these volatile metabolites is closely linked to gut microbiota and host metabolism, their baseline levels may vary across different mouse strains, diverse dietary interventions (beyond EEN), or specific disease models with severe dysbiosis. 
Therefore, validating the stability of these markers—or identifying context-specific alternatives—across broader biological conditions represents an important direction for future research to standardize this QC strategy.
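The marker search described above amounts to intersecting the WE-vs-NE differential lists of the three groups and then inspecting the fold change of the shared compounds. A minimal sketch follows; the compound lists and concentration values are hypothetical and serve only to show the mechanics.

```python
import statistics


def shared_markers(differential_by_group):
    """Compounds that differ between WE and NE in every experimental group."""
    sets = [set(compounds) for compounds in differential_by_group.values()]
    return set.intersection(*sets)


def fold_change(we_values, ne_values):
    """Ratio of mean WE concentration to mean NE concentration."""
    return statistics.fmean(we_values) / statistics.fmean(ne_values)


# Hypothetical differential lists for the three groups.
differential = {
    "HC": {"ethyl acetate", "dimethyl sulfide", "acetone"},
    "DSS": {"ethyl acetate", "dimethyl sulfide", "phenol"},
    "EEN": {"ethyl acetate", "dimethyl sulfide"},
}
print(sorted(shared_markers(differential)))
print(fold_change([6.0, 8.0], [1.0, 1.0]))
```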
Based on the above findings, this study established a quality control strategy based on monitoring excretion behavior. That is, during the data filtering stage, if an excretion event was observed and recorded for a sample during sampling, or if an abnormal synergistic increase in the concentrations of ethyl acetate and dimethyl sulfide was found during data post-processing, then that sample is defined as a contaminated sample and is excluded from subsequent analysis. This quality control strategy fills a methodological blind spot in existing rodent breath research, which often overlooks matrix background interference. Although this strategy sacrifices some sample size, it effectively eliminates exogenous interference. This ensures that the final retained data can truly reflect the endogenous metabolic changes exhaled from the lungs of mice. This is not only an important component of the SOP for the cross-species breath analysis platform but also lays a solid data quality foundation for the subsequent precise analysis of faint endogenous metabolic signals and cross-species heterogeneity under an interference-free background.
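The exclusion rule can be written as a single predicate. The text does not define what counts as an "abnormal synergistic increase", so this sketch assumes a simple criterion: both markers exceed a fold threshold relative to a clean (NE) reference level. The threshold value of 2.0, the reference levels and the function name are all assumptions.

```python
MARKERS = ("ethyl acetate", "dimethyl sulfide")


def is_contaminated(excretion_observed, sample, ne_reference, fold_threshold=2.0):
    """Flag a sample if excretion was recorded or both markers are elevated.

    `sample` and `ne_reference` map compound name -> concentration (ng/L).
    Assumption: "synergistic increase" means every marker exceeds
    fold_threshold times its NE reference level.
    """
    if excretion_observed:
        return True
    return all(
        sample[marker] > fold_threshold * ne_reference[marker]
        for marker in MARKERS
    )


# Hypothetical reference levels and one unobserved but suspicious sample.
reference = {"ethyl acetate": 1.0, "dimethyl sulfide": 0.05}
suspicious = {"ethyl acetate": 6.0, "dimethyl sulfide": 0.15}
print(is_contaminated(False, suspicious, reference))
```

Requiring both markers to rise together, rather than either one alone, is a conservative reading of the rule that limits the number of samples discarded because of a single noisy measurement.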
The expression profile of key biomarkers (Fig. 5A) intuitively displayed significant inter-group differences. The DSS model group showed a consistent high-expression feature compared to the HC. The quantitative violin plot (Fig. 5B) further revealed the drastic extent of this change. Specifically, except for isobutanoic acid, the mean signal intensities of propanoic acid, butanoic acid, and 1,2-propanediol in the DSS group were significantly elevated by about 2–3 times compared to the HC group (p < 0.05).
Notably, we observed an interesting phenomenon. The levels of all four biomarkers in the EEN intervention group not only significantly decreased (p < 0.05 vs. DSS) but were also generally lower in value than the HC group baseline. The decrease in isobutanoic acid was particularly significant (approx. 50% of the baseline level). This phenomenon can be attributed to the specific remodeling effect of EEN intervention on the gut microbiota. Previous studies have pointed out that exclusive enteral nutrition, as a monotonous diet, can not only effectively control inflammation but also lead to a decrease in microbial diversity and total biomass due to substrate uniformity.48 Therefore, the overall decline of SCFA levels in the breath of the EEN group is likely a direct reflection of the absolute reduction in fermentation capacity of the gut's acid-producing bacteria under special dietary pressure.
To capture the dynamic trajectory of these metabolic changes, we plotted a time-series bubble heatmap for the modeling period (Fig. 5C). The bubble size of the DSS group increased linearly over time (Days 5, 8, and 11), peaking in the late modeling stage (Day 11), which indicates that metabolite exhalation was strongly and positively correlated with inflammation progression. In contrast, the bubbles of the EEN group remained consistently small, confirming that the nutritional intervention continuously suppressed the metabolic disorder throughout the treatment window.
However, a cross-species comparison of the mouse data with the clinical cohort results revealed a marked divergence between human and mouse breath metabolic features. Under the same IBD pathological background, breath SCFAs in human patients were generally lower than in healthy humans, whereas breath SCFAs in DSS mice were consistently higher than in healthy mice (Fig. 5D).
To explain this seemingly contradictory phenomenon, this study combined the pathological characteristics of the DSS model with previous mechanistic research and, for the first time, constructed a differential mechanism model of “acute leakage vs. chronic dysbiosis” (Fig. 5E). On the one hand, the core pathology of DSS-induced colitis is acute inflammation characterized by chemical epithelial denudation and barrier collapse.49 We infer that under this extreme condition, although the gut microbiota may already show initial dysbiosis, the complete failure of the physical barrier becomes the dominant factor. The high concentrations of SCFAs accumulated in the intestinal lumen are no longer retained by the mucosal barrier and can leak passively, unimpeded, through the severely damaged epithelium into the blood circulation, ultimately producing an abnormally increased signal in the breath. On the other hand, clinical IBD is usually a long-term chronic inflammatory process. Multiple metagenomic studies have confirmed that its core feature is the depletion and functional loss of acid-producing bacteria (e.g., Faecalibacterium prausnitzii).50,51 In this setting, although intestinal permeability is also increased in patients, the deficit in SCFA production capacity becomes dominant, and SCFAs therefore appear decreased in the breath.
This mechanistic insight demonstrates the core value of the breath analysis platform constructed in this study. The platform not only has the sensitivity to capture subtle metabolic fluctuations but, more importantly, through systematic longitudinal monitoring in controlled animal experiments, helped to elucidate complex pathophysiological mechanisms that are inaccessible to clinical correlation studies alone. The platform is therefore not just a high-precision analytical tool but also a means of mechanism elucidation that can provide key evidence for connecting clinical phenotypes with basic pathology and for revealing the interactions between them.
Finally, a limitation of this study is the relatively small sample size (n = 6 per group), which was chosen in adherence to the ‘reduction’ principle of animal ethics for this exploratory research. Given the inherently high inter-individual biological variability in breath VOC profiles, this sample size limits the statistical power for robust biomarker validation. Future studies involving larger animal cohorts are warranted to comprehensively validate these initial findings.
Supplementary information (SI): additional experimental details, supplementary texts, tables, and figures. See DOI: https://doi.org/10.1039/d6ay00164e.
This journal is © The Royal Society of Chemistry 2026