Open Access Article
Peiqin Shi†(a,d), Xing Huang†(a), Qinfei Ke(a), Xingran Kou*(a) and Dachuan Zhang*(b,c)
(a) Collaborative Innovation Center of Fragrance Flavour and Cosmetics, Faculty of Flavour Fragrance and Cosmetics, Shanghai Institute of Technology, Shanghai 201418, China. E-mail: kouxr@sit.edu.cn
(b) Department of Food Science and Technology, Faculty of Science, National University of Singapore, 117542, Singapore. E-mail: dachuan.zhang@nus.edu.sg
(c) National University of Singapore (Suzhou) Research Institute, 377 Lin Quan Street, Suzhou Industrial Park, Jiangsu 215123, China
(d) School of Food Science and Technology, Jiangnan University, Wuxi, Jiangsu 214122, China
First published on 10th February 2026
Sleep disturbances affect up to one-third of the global population, yet current pharmacological therapies for insomnia carry notable risks and side effects. Aromatic plants have long been valued for their capacity to ease stress and promote sleep; however, the bioactive volatiles driving these benefits remain poorly understood. This study presents a comprehensive survey of 2391 volatiles across 991 aromatic plants, integrated with an ensemble machine-learning approach to predict their potential sleep-promoting activity. To evaluate the predictive accuracy of our approach, five candidate volatiles were computationally prioritized for in vivo testing. Four of these (an 80% success rate) robustly induced sleep-promoting effects, as evidenced by electroencephalogram analysis and modulation of γ-aminobutyric acid (GABA) receptor expression. In parallel, this work identified plant families such as Asteraceae, Lamiaceae, and Lauraceae as particularly enriched in high-potential volatiles and highlighted individual species, including Lavandula angustifolia and Perilla frutescens, as promising candidates for further pharmacological investigation. By combining large-scale data mining, computational prediction, and in vivo experimentation, this work provides a first comprehensive view of the landscape of sleep-promoting volatiles and aromatic plants and offers a reusable workflow to accelerate the discovery of bioactive compounds with potential applications in medicine, functional foods, and natural therapeutics.
Aromatic plants contain a wide array of bioactive compounds, many of which have potential neuropharmacological effects.10 For instance, compounds such as 3,5-dimethoxytoluene exhibit central nervous system modulation, influencing neurotransmitter pathways.11–13 Traditional methods for identifying bioactive molecules rely on bioassay-guided fractionation and in vivo studies, which are labor-intensive, time-consuming, and restricted by the availability of chemical standards. Recent advances in machine learning (ML) have enabled the prediction of molecular bioactivity using computational models, facilitating high-throughput screening of natural products.14–16 For example, Erlina et al. and Wang et al. utilized ML to identify phytochemicals with inhibitory activity against SARS-CoV-2 from a wide range of plant-derived chemicals.17,18 Likewise, Srisongkram et al. applied ML to screen for potential α-amylase and α-glucosidase inhibitors in indigenous Thai plants,19 and Brown et al. employed ML strategies to discover anti-inflammatory compounds from hops.20 Additionally, Zhang et al. employed ML to predict the biological activity of genes and enzymes for food and agricultural applications.21,22 However, existing studies predominantly focus on non-volatile compounds, leaving a substantial gap in the systematic identification of bioactive molecules that exert their effects through olfactory stimulation.16,22–29 The structure–activity relationships governing the sleep-promoting effects of volatile organic compounds (VOCs) remain poorly understood. Furthermore, unlike non-volatile bioactives, which interact with the body through digestion and systemic circulation, VOCs primarily act via olfactory receptors and the olfactory bulb, directly influencing brain activity and neurotransmitter release.
Despite increasing evidence supporting the neuroactive properties of VOCs, the molecular determinants of their activity, such as functional groups, element compositions, and lipophilicity, are still not well characterized. Moreover, previous studies have largely concentrated on a narrow subset of molecules, neglecting to explore the bioactivity of structurally complex compounds present in plant extracts and essential oils.30
Despite growing interest in alternative sleep aids, identifying sleep-promoting VOCs in aromatic plants remains challenging due to the chemical complexity of plant extracts and the low efficiency of conventional bioassay-guided fractionation and in vivo methods. To address this limitation, this work presents a data-driven approach combining ML with big data on aromatic plant composition (Fig. 1). It systematically reveals the sleep-promoting potential of over 2300 VOCs and approximately 1000 types of aromatic plants. In addition to evaluating individual compounds, we further integrated predicted sleep-promoting potential scores of VOCs with occurrence and abundance data across plant species to prioritize botanical sources with the highest sleep-promoting potential. This prioritization helps to identify promising leads for further pharmacological studies and supports the selection of candidate plants for the development of natural sleep aids. The implications of this research extend beyond sleep disorders, contributing to the broader fields of natural product-based drug discovery and the development of functional perfumes, cosmetics, and foods.
To examine their sleep-promoting activity, we compiled a dataset for ML modeling consisting of 244 sleep-promoting compounds (positive samples, e.g., GABA agonists) and 244 non-sleep-promoting compounds (negative samples, e.g., GABA inhibitors) from the literature, public databases, and patents (Fig. 2a and b; see Supporting Dataset 2). A molecular scaffold analysis was conducted to explore the chemical diversity within the dataset. The positive compounds exhibited 78 distinct molecular scaffolds (Fig. 2c), while the negative compounds displayed 108 unique scaffolds (Fig. 2d). Positive compounds tend to share common structural characteristics, likely because they are designed for specific targets, such as GABA_A, or optimized for improved pharmacophore features, resulting in a higher degree of structural similarity. Further analysis using t-distributed stochastic neighbor embedding mapping showed that positive and negative samples are separated into distinct groups in most cases, indicating distinct structural characteristics. A similar trend is observed in their physicochemical properties: negative samples generally have higher molecular weight, more rotatable bonds, a larger topological polar surface area, more rings, and more hydrogen bond acceptors, whereas positive samples tend to have a lower number of hydrogen bond donors (Fig. S1). These findings collectively highlight distinct structural characteristics differentiating sleep-promoting compounds from other compounds, underscoring the feasibility of developing a classification model capable of effectively identifying novel sleep-promoting VOCs (Fig. 2e).
Among all the models, the RF model built on RDKit descriptors (RF-RDKit) exhibited strong predictive power, achieving an AUC-ROC of 0.957 ± 0.02 on the independent test sets. To further improve predictive accuracy, we developed a stacking ensemble model by combining the four best-performing classifiers: RF model built on Molecular ACCess System keys (RF-MACCS), RF-RDKit, XGBoost-MACCS, and SVM-MACCS (Fig. S2). The stacking model significantly surpassed the individual models, achieving an AUC-ROC of 0.994 ± 0.08, an accuracy of 0.961 ± 0.024, a precision of 0.957 ± 0.033, and a recall of 0.967 ± 0.024 (Table S1). Its false positive and false negative rates are 4.4% and 3.2%, respectively (Fig. 3a). These results highlight the effectiveness of model stacking, which leverages the strengths of multiple classifiers to improve overall predictive performance.
Another strategy to mitigate the challenge of limited training data is few-shot learning with pre-trained models. To compare this strategy with the model stacking approach, we tested the performance of two pre-trained models, namely CHEM-BERT33 and knowledge graph-enhanced molecular contrastive learning with functional prompts (KANO)34 fine-tuned on the dataset. CHEM-BERT leverages self-supervised learning on large-scale molecular datasets to capture chemical semantics. KANO incorporates domain-specific molecular graphs and functional group information to improve molecular representation learning. Both methods utilized pre-trained models and chemical knowledge to enhance the model's predictive ability on small datasets. However, we found that both models demonstrated lower performance than the stacking model in this task, with AUC-ROC values of 0.482 ± 0.045 and 0.901 ± 0.037, respectively. The lower performance of CHEM-BERT might be attributed to the substantial distribution shift between its pre-training datasets and our specific dataset. Although it has captured extensive chemical semantics on large molecular datasets, its representations appear to lack adequate specificity to accurately distinguish bioactive sleep-promoting volatiles, resulting in near-random predictive performance. Although KANO, which incorporates domain-specific molecular graphs and functional prompts, achieved better results than CHEM-BERT, it still underperformed when compared to the simple stacking ensemble approach. One possible reason for this limited performance is that, despite leveraging domain knowledge, the representational power of graph-based pre-training may still fail to adequately capture subtle pharmacological characteristics critical for predicting biological activities in highly specialized datasets. 
Additionally, fine-tuning on limited-instance data could lead these relatively complex pre-trained models to overfit due to the large number of parameters being adjusted, resulting in instability and reduced generalizability on our dataset. Collectively, these results underscore that, in this scenario of limited bioactive data, the simple and straightforward ensemble stacking strategy can be a better solution to address the data limitation compared to the pre-train/fine-tune strategy.
When a molecule contains at least one C–N bond or features a five-membered ring, the corresponding SHAP values are negative (highlighted in red), indicating a negative impact on the model's prediction of GABA activity (Fig. 3b, d, and e). Conversely, when these features are absent, the SHAP values are positive (highlighted in blue), suggesting a positive effect on the model's classification. In the RF-RDKit model, the most influential descriptor was SlogP_VSA8, which indicates the logarithm of the hydrophobic contribution surface area, ranging from 12.0 to 13.5 Å². A higher SlogP_VSA8 value correlated with a decreased probability of sleep-promoting activity (Fig. 3c).
For the decision tree-based models (e.g., RF-MACCS and XGBoost-MACCS), we analyzed SHAP interaction values to explore how feature interactions influence model predictions (Fig. S4). Strong interaction effects were observed involving the C–N bond and the absence of a five-membered ring, indicating that both their individual contributions and interactions with other features play a significant role in the model's output. In addition, the interaction between the N–H group and aromatic rings > 1 consistently showed a negative effect for certain samples, suggesting that the co-occurrence of N–H groups and multiple aromatic rings reduces the sleep-promoting activity of VOCs.
These insights enhance interpretability by illuminating the “black box” of ML models, providing valuable guidance for identifying new sleep-promoting molecules and understanding their mechanisms.
During sleep, electroencephalogram (EEG) signals can be classified into distinct stages, including non-rapid eye movement (NREM) sleep, REM sleep, and wakefulness (Fig. 4a). NREM sleep is marked by higher EEG amplitude, indicating synchronized brain activity, and lower electromyography (EMG) amplitude, reflecting muscle relaxation. In contrast, REM sleep exhibits lower EEG amplitude, accompanied by continued low EMG activity, indicating a state of reduced brain activity and muscle atonia. Wakefulness, however, is associated with low EEG amplitude and increased EMG activity, signifying an alert and active state of both the brain and muscles. According to EEG analysis, mice exposed to carvacrol, safranal, vanillin, and methyl eugenol exhibited a significant reduction in wakefulness duration compared to the normal control group (NOR) (Fig. 4b; p < 0.01, p < 0.05, p < 0.01, and p < 0.05, respectively). Additionally, total sleep duration was significantly extended in these groups (Fig. 4e; p < 0.01, p < 0.05, p < 0.01, and p < 0.05, respectively). The observed increase in total sleep duration was primarily driven by an increase in NREM sleep (Fig. 4c; p < 0.01, p < 0.01, p < 0.05, and p < 0.05, respectively), rather than REM sleep. These findings suggest that carvacrol, safranal, vanillin, and methyl eugenol effectively prolong NREM sleep duration and enhance overall sleep time in mice.
To investigate the sleep-promoting mechanisms of aroma compounds, we analyzed their effects on GABA_A receptor protein expression using Western blot analysis. The results indicated that carvacrol, safranal, vanillin, and methyl eugenol increased the expression of one or more GABA_A subunits (Fig. 5). Compared to the NOR group, the expression of GABA_Aα1 was significantly elevated in the vanillin and methyl eugenol groups, increasing by 1.55-fold and 1.63-fold, respectively (Fig. 5a; p < 0.05 and p < 0.05, respectively). Similarly, GABA_Aβ2 expression was significantly upregulated in the carvacrol, vanillin, and methyl eugenol groups, with increases of 1.91-fold, 1.93-fold, and 1.75-fold, respectively (Fig. 5b; p < 0.05, p < 0.05, and p < 0.05, respectively). Additionally, methyl eugenol significantly increased the expression of GABA_Aγ2 by 1.69-fold (Fig. 5c; p < 0.05). These findings suggest that these VOCs enhance GABAergic signaling by upregulating GABA_A receptor subunits, further supporting their potential role in promoting sleep.
We applied SHAP analysis to four validated sleep-promoting VOCs (carvacrol, safranal, vanillin, and methyl eugenol) using our trained models. Across all compounds, the models consistently highlighted structural features that contribute positively to the predicted sleep-promoting activity. These include the presence of no more than one aromatic ring, absence of heterocycles, lack of C–N bonds, absence of five-membered rings, and no nitrogen atom bonded to two aromatic atoms. Additionally, all compounds exhibited SlogP_VSA8 = 0, indicating no atomic hydrophobic contributions within the 12.0–13.5 Å² surface area range. These findings reinforce our previous SHAP-based conclusions and demonstrate that the machine-learning models have captured meaningful and relevant patterns linked to sleep-promoting properties (Fig. S5–S8).
Through this process, 991 unique species with recorded aromatic properties were identified, and they were classified into three major plant classes: Magnoliopsida (dicotyledons), Pinopsida (conifers), and Equisetopsida (ferns). Among these, the Magnoliopsida class emerged as the most abundant and extensively studied, comprising 128 families and 914 species. In contrast, Pinopsida included seven families and 76 species, and Equisetopsida only comprised one family and one species (see Supporting Dataset 3).
Fig. 6 illustrates the number of VOCs with high sleep-promoting potential (prediction scores exceeding 0.95) detected in plant extracts from various families, while Fig. S9 displays the combined score of the number of sleep-promoting VOCs and their content. Within Magnoliopsida, three families stood out for their high diversity of sleep-promoting VOCs: Asteraceae, Lamiaceae, and Lauraceae (Fig. 6 and S10). Aromatic plants from these three families are well-documented in traditional medicine for their calming, sedative, anxiolytic, and central nervous system-suppressing properties, which aligns with our data-driven findings.36–40 When examining specific species, we identified several candidates that not only contain a broad spectrum of putative sleep-promoting VOCs but also exhibit relatively high concentrations of these compounds. Examples include Silybum marianum, Sphagneticola trilobata, and Petasites japonicus from the Asteraceae family; Lavandula angustifolia, Ocimum basilicum, Perilla frutescens, and Vitex negundo from the Lamiaceae family; and Litsea monopetala and Litsea cubeba from the Lauraceae family (Fig. S11). These findings suggest that these particular species may be valuable for further research and potential therapeutic applications related to sleep disorders. The synergy of multiple VOCs, including terpenes and phenolic compounds, may contribute to the overall sleep-promoting effect, an area that warrants detailed pharmacological and mechanistic studies.
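The family-level count described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the record layout, the placeholder rows and scores, and the function name `count_high_potential_vocs` are all hypothetical, and the choice to count each VOC once per family is an assumption. Only the 0.95 cut-off comes from the text.

```python
from collections import defaultdict

THRESHOLD = 0.95  # cut-off for "high sleep-promoting potential" stated in the text


def count_high_potential_vocs(records, threshold=THRESHOLD):
    """Count distinct high-scoring VOCs per plant family.

    records: iterable of (family, species, compound, score) tuples.
    Returns (family, count) pairs sorted from most to fewest VOCs.
    """
    per_family = defaultdict(set)
    for family, _species, compound, score in records:
        if score > threshold:  # "exceeding 0.95" is read as a strict inequality
            per_family[family].add(compound)
    counts = {family: len(compounds) for family, compounds in per_family.items()}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Placeholder rows (illustrative values only, not data from the study).
records = [
    ("FamilyA", "species_1", "voc_x", 0.99),
    ("FamilyA", "species_2", "voc_x", 0.99),  # same VOC, counted once per family
    ("FamilyA", "species_2", "voc_y", 0.97),
    ("FamilyB", "species_3", "voc_z", 0.96),
    ("FamilyB", "species_3", "voc_w", 0.80),  # below the threshold, ignored
]
ranking = count_high_potential_vocs(records)
print(ranking)
```

A combined score that also weights VOC content, as in Fig. S9, would need the abundance values and a weighting rule that the text does not specify, so it is left out of this sketch.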
Beyond Magnoliopsida, notable sleep-promoting potential was also observed in Pinopsida (Fig. 6, S12, and S13). Although the families in the Pinopsida class exhibited relatively lower scores overall, two species, Pinus sylvestris (Pinaceae) and Taxodium distichum (Cupressaceae), demonstrated considerable potential. This suggests that Pinopsida plants may serve as an alternative direction and provide promising leads for further biochemical investigations. In particular, Cupressaceae and Pinaceae demonstrated elevated levels of terpenes (e.g., cedrol and alpha-fenchone), which are linked to sedative or sleep-regulating pathways. A summary of high-potential aromatic plants is presented in Fig. S14 and S15. In addition, some species within lower-scoring families exhibit high bioactive potential. For instance, Aconitum carmichaelii in the Ranunculaceae family, Artocarpus heterophyllus in the Moraceae family, and Linum usitatissimum in the Linaceae family (see SI) show concentrations of potent VOCs that exceed expectations based on a family-wide assessment.
To ensure data consistency and reliability, we performed rigorous data cleaning steps: (1) We restricted the dataset to compounds that fell within the model's applicability domain and had valid concentration values and clearly defined plant sources. (2) To prevent inflated estimates of compound frequency and concentration due to multiple extracts from different plant parts (e.g., roots and leaves), we retained only unique instances of compounds, removing duplicate entries from various plant tissues. When the same compound was reported in different parts of a plant, we retained only the concentration value from its first recorded occurrence in that plant. (3) Given the variation in compound naming conventions across different studies, we standardized all compound names according to the PubChem41 database to facilitate accurate comparisons. (4) Compounds for which the literature described presence qualitatively (e.g., “detected,” “major constituent,” or “trace amount”) without numerical values were excluded from these analyses.
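The four cleaning steps can be illustrated with a short sketch. It assumes a simple list-of-dicts layout; the field names, the `synonyms` lookup (a stand-in for a real PubChem name resolution), and the placeholder rows are hypothetical and are not taken from the study's datasets.

```python
def clean_records(records, synonyms):
    """Apply cleaning steps (1)-(4) to raw plant/compound records.

    records: list of dicts with keys 'plant', 'part', 'compound',
             'concentration', and 'in_domain' (hypothetical layout).
    synonyms: dict mapping lower-cased name variants to one standard name.
    """
    cleaned = []
    seen = set()
    for rec in records:
        value = rec["concentration"]
        # (4) Drop qualitative reports such as "trace amount".
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        # (1) Keep only in-domain compounds with a defined plant source.
        if not rec["in_domain"] or not rec["plant"]:
            continue
        # (3) Map naming variants onto a single standard name.
        raw_name = rec["compound"].strip().lower()
        name = synonyms.get(raw_name, raw_name)
        # (2) Keep only the first occurrence of a compound within a plant.
        key = (rec["plant"], name)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(
            {"plant": rec["plant"], "compound": name, "concentration": float(value)}
        )
    return cleaned


# Placeholder rows (illustrative only).
synonyms = {"cmpd-a alt": "cmpd-a"}
raw = [
    {"plant": "plant_1", "part": "leaf", "compound": "Cmpd-A", "concentration": 1.2, "in_domain": True},
    {"plant": "plant_1", "part": "root", "compound": "cmpd-a alt", "concentration": 3.4, "in_domain": True},
    {"plant": "plant_1", "part": "leaf", "compound": "Cmpd-B", "concentration": "trace amount", "in_domain": True},
    {"plant": "plant_2", "part": "leaf", "compound": "Cmpd-A", "concentration": 0.5, "in_domain": True},
    {"plant": "plant_2", "part": "leaf", "compound": "Cmpd-C", "concentration": 2.0, "in_domain": False},
]
cleaned = clean_records(raw, synonyms)
print(cleaned)
```

Standardizing names before de-duplicating matters here: the root entry for plant_1 is recognized as a repeat of the leaf entry only after both names resolve to the same standard form.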
To visualize the diversity of the dataset, we used MoleculeCloud,45 which generates molecular scaffolds by representing the core molecular structure with all non-carbon atoms replaced by carbon, thereby showcasing the structural variability within the dataset. Additionally, we applied t-distributed stochastic neighbor embedding to map the chemical space of the dataset and evaluate the clustering patterns between positive and negative samples using three types of molecular descriptors, including Extended Connectivity Fingerprints with a bond diameter of 4 (ECFP4), MACCS, and RDKit fingerprints.
We developed multiple models using conventional ML algorithms, including RF, KNN, SVM, XGBoost, and GBDT,48 as well as deep learning frameworks, including message-passing neural networks,31 Attentive FP,32 CHEM-BERT,33 and KANO.34 These models were trained using various molecular graph representations mentioned earlier. The conventional ML algorithms were implemented with scikit-learn (version 1.0.2) and XGBoost (version 1.7.6) in Python (version 3.8.16).
To optimize the ML models' performance, hyperparameter tuning was conducted using a grid search approach combined with five-fold cross-validation. The hyperparameters used are presented in the SI. To further enhance predictive accuracy, we employed a stacking ensemble strategy49 combining multiple ML models to capture different aspects of the data. Specifically, we selected the four best-performing individual models and integrated them into a stacking framework to leverage their complementary strengths (Fig. S2).
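A stacking workflow of this kind might look roughly like the following scikit-learn sketch. It is an illustration under several assumptions rather than the authors' implementation: the data are synthetic (488 samples, mirroring only the dataset size), two base learners are used instead of four, all learners share one feature matrix instead of separate MACCS and RDKit inputs, the hyperparameter grids are arbitrary, and the logistic-regression meta-learner is a common default that the text does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for a descriptor matrix (not the study's data).
X, y = make_classification(
    n_samples=488, n_features=30, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)


def tune(estimator, grid):
    """Grid search with five-fold cross-validation, scored by AUC-ROC."""
    search = GridSearchCV(estimator, grid, cv=5, scoring="roc_auc")
    search.fit(X_train, y_train)
    return search.best_estimator_


# Illustrative grids; the actual hyperparameters are listed in the SI.
rf = tune(
    RandomForestClassifier(random_state=0),
    {"n_estimators": [50, 100], "max_depth": [None, 5]},
)
svm = tune(SVC(probability=True, random_state=0), {"C": [0.1, 1.0, 10.0]})

# The meta-learner is trained on out-of-fold probabilities of the base models.
stack = StackingClassifier(
    estimators=[("rf", rf), ("svm", svm)],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

auc = roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1])
print(f"stacked AUC-ROC on the held-out split: {auc:.3f}")
```

Training the meta-learner on out-of-fold predictions (the `cv=5` argument) keeps it from simply memorizing base-model outputs on the training set, which is especially relevant for a dataset of a few hundred compounds.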
The models were evaluated using AUC-ROC, accuracy, precision, and recall (eqn (1)–(3)).
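$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1}$$
$$\text{Precision} = \frac{TP}{TP + FP} \tag{2}$$
$$\text{Recall} = \frac{TP}{TP + FN} \tag{3}$$
Here TP, TN, FP, and FN denote the numbers of true positives, true negatives, false positives, and false negatives, respectively.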
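The threshold-based metrics named above can be computed directly from confusion-matrix counts. The sketch below assumes the standard definitions of accuracy, precision, and recall; the function name and the example counts are hypothetical and are not results from the study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, and recall from confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Guard against empty denominators (e.g., no predicted positives).
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Hypothetical counts for a 100-compound test split.
metrics = classification_metrics(tp=45, tn=44, fp=5, fn=6)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

AUC-ROC is not included because it depends on the ranking of predicted probabilities rather than on a single confusion matrix.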
To ensure reliable predictions, we employed a previously reported method that quantifies the model's applicability domain using Euclidean distance and KNN.24 First, the threshold T for determining whether a compound falls within the applicability domain was calculated from the training set using the following equation:
$$T = Z\sigma + Y \tag{4}$$
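One way such a distance-based applicability domain could be implemented is sketched below. The surviving text does not define the symbols in eqn (4), so the sketch assumes that Y is the mean and σ the standard deviation of each training compound's average Euclidean distance to its k nearest training neighbours, and that Z is an adjustable factor. The values k = 5 and Z = 0.5, the function names, and the random descriptor matrix are all hypothetical.

```python
import numpy as np


def knn_mean_distance(queries, reference, k, exclude_self=False):
    """Mean Euclidean distance from each query to its k nearest reference points."""
    diffs = queries[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))  # shape (n_queries, n_reference)
    dists.sort(axis=1)
    start = 1 if exclude_self else 0  # column 0 is the zero self-distance
    return dists[:, start:start + k].mean(axis=1)


def fit_threshold(train, k=5, z=0.5):
    """Compute T = Z * sigma + Y from the training set (assumed symbol meanings)."""
    d = knn_mean_distance(train, train, k, exclude_self=True)
    return z * d.std() + d.mean()


def in_domain(queries, train, threshold, k=5):
    """Flag queries whose mean k-NN distance to the training set is within T."""
    return knn_mean_distance(queries, train, k) <= threshold


# Synthetic descriptors standing in for real molecular features.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 8))
T = fit_threshold(train)

central = np.zeros((1, 8))       # lies in the densest region of the training data
distant = np.full((1, 8), 25.0)  # far from every training point
flags = in_domain(np.vstack([central, distant]), train, T)
print(T, flags)
```

Compounds flagged as out of domain would be excluded before prediction, since the model has seen nothing similar during training.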
We employed the SHAP algorithm to interpret the ML model's predictions. SHAP calculates feature importance by assigning Shapley values derived from cooperative game theory to quantify the individual contribution of each feature to the model's predictions.
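To make the game-theoretic idea concrete, the sketch below computes exact Shapley values by enumerating every feature coalition for a tiny toy model. It is not the SHAP library and not one of the study's models: the three binary features, the scoring function and its coefficients, and the all-zero baseline are hypothetical, and "missing" features are simply replaced by baseline values, which is one common simplification.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values by brute-force enumeration of coalitions."""
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance value, others the baseline.
        x = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    """Hypothetical score; x = [has_CN_bond, has_5_ring, aromatic_rings_gt_1]."""
    return 0.9 - 0.4 * x[0] - 0.3 * x[1] - 0.2 * x[0] * x[2]


instance = [1, 0, 1]
baseline = [0, 0, 0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The contributions add up to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features involved. Enumeration scales exponentially with the number of features, which is why practical tools rely on model-specific or sampling-based approximations.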
Male C57BL/6 mice (20–25 g, aged 4–8 weeks) were sourced from Shanghai SLAC Laboratory Animal Co., Ltd. The animals were acclimated for a minimum of one week under controlled conditions: temperature (20 °C ± 3 °C), humidity (50% ± 5%), and a 12-hour light/dark cycle, with ad libitum access to food and water. All animal experiments received approval from the Experimental Animal Ethics Committee of Wuchuang Biotechnology (Shanghai) Co., Ltd (approval no. WTPZ20230812001, Shanghai, China). All procedures were conducted in accordance with the ARRIVE 2.0 guidelines for reporting animal research.
The mice were divided into experimental and control groups based on whether they were exposed to the aroma or not. In the experimental group, five different VOCs were tested, with six mice assigned to each compound. The sleep conditions of the mice were monitored to evaluate the effects of each VOC. In the control group, six mice were used to assess sleep conditions without the presence of aromatic plant VOCs. For aroma exposure, 600 grams of pure water were added to the fragrance lamp, followed by 0.20 to 0.25 grams of the VOCs. The mice were exposed to the aroma from 8:00 a.m. to 4:00 p.m. to investigate its effect on sleep.
Additionally, we validated our approach with EEG measurements and GABA_A receptor expression analysis in mouse brain tissue and found that four out of five (an 80% success rate) predicted sleep-promoting VOCs—carvacrol, safranal, vanillin, and methyl eugenol—demonstrated the expected sleep-promoting activity. Our findings confirmed that the abovementioned VOCs significantly prolonged the duration of NREM sleep, indicating their potential to enhance sleep quality. The observation that these VOCs effectively modulate NREM sleep aligns with previous evidence highlighting the importance of NREM in physical restoration and energy replenishment. Detailed EEG data revealed that the increase in NREM was accompanied by heightened theta-wave activity, a hallmark of deeper, restorative sleep stages. These results support the hypothesis that specific VOCs can exert sedative or sleep-regulating effects by influencing particular pathways.
Interestingly, although linalool was predicted to have sleep-promoting properties, our experiments did not reveal significant increases in NREM sleep or total sleep time for this VOC, nor did it influence GABA_A receptor levels. These findings suggest that linalool may exert weaker or context-dependent effects on sleep modulation, and several factors could explain the result. First, the concentration of linalool used in the in vivo tests may not have been optimal: it may have been either too low to exert a measurable effect or too high, potentially leading to paradoxical excitation or adverse effects. Second, rapid metabolism or clearance of linalool in vivo may limit its bioavailability and central nervous system penetration, thereby attenuating its hypnotic effects. Third, species-specific differences in receptor binding or physiological response could also contribute, as computational predictions are often based on data from human or model organism targets that may not fully translate to the experimental model used. Indeed, linalool may have more subtle or context-specific modulatory effects on neural pathways, necessitating further research under various conditions to fully elucidate its role in sleep regulation.
In our comprehensive survey, we identified plant families with high potential for sleep-promoting activity, as well as individual species with strong bioactive potential even within lower-scoring families. This discrepancy highlights the chemotypic variability of plant secondary metabolite production: even families with minimal representation of sleep-promoting chemicals at the aggregate level can harbor “outliers” that produce specific compounds in higher concentrations or with greater synergy. Several factors may drive this phenomenon. First, plants often evolve specialized metabolic pathways in response to localized ecological pressures—such as herbivory, pathogen defense, or pollinator attraction—which can lead to the accumulation of distinct, potent VOCs in a handful of species. Consequently, while an entire family might be characterized by a moderate or minimal presence of sleep-promoting molecules, a few species could stand out due to their unique evolutionary trajectories. Second, these species may have been traditionally valued for medicinal properties unrelated to sleep (e.g., analgesia, cardioprotection, or antimicrobial activity), yet some of the same compounds responsible for these effects might also induce sedation or modulate sleep pathways. As research techniques advance and more sophisticated analytical tools become commonplace, it is increasingly clear that pharmacologically relevant bioactivity can stem from a broad range of overlapping or multifunctional plant metabolites.
Several aspects of this study could be expanded further. First, while our in vivo assays provide compelling evidence for the sleep-promoting effects of certain VOCs, long-term safety and efficacy studies are still required before these molecules can be considered for clinical use. Second, our focus on GABA_A receptor mechanisms does not rule out the possibility that other neurotransmitter systems (e.g., serotonin and melatonin) may also contribute to the observed sleep-promoting effects. Third, synergy among multiple VOCs within a single plant—and how these interactions might enhance or diminish sedative outcomes—is still poorly understood, highlighting the need for more sophisticated analytical and modeling approaches.52,53 Fourth, it is possible that yet-to-be-identified biochemical pathways54 in these species contribute to sleep-promoting activity. For example, some VOCs in lower-scoring families might be incompletely characterized, either because the plants are less commonly studied or because advanced analytical techniques (e.g., metabolomics, multi-omics approaches)55 have not been applied extensively to these taxa. Looking ahead, incorporating additional omics data (e.g., proteomics, metabolomics, and genomics) and conducting synergistic studies that evaluate multiple receptor pathways could further enhance our understanding of how these plant-derived VOCs influence sleep and other physiological processes. Finally, it is important to recognize that not all plant-derived essential oils or VOCs are universally safe. For instance, lavender oil (derived from Lavandula angustifolia) is usually regarded as safe for most adults when used appropriately for aromatherapy, topical application, or as a food flavoring. 
Nevertheless, concerns have been raised about its use in young boys before puberty due to potential hormonal disturbances.56 Such considerations highlight the need for safety assessments alongside efficacy studies in future research to ensure that sleep-promoting plant-derived VOCs can be applied both effectively and responsibly.
Nevertheless, we are confident that the current study provides an effective approach for estimating the sleep-promoting activity of plant-derived VOCs and prioritizes key plant families and species with high sleep-promoting potential for further investigation. Moreover, when high-quality training data are available, the general framework proposed in this study can be readily applied to prediction tasks other than sleep-promoting VOCs, such as the prediction of natural antioxidants, antidiabetic agents, or other health-promoting molecules.
Supplementary information (SI): detailed descriptions of machine learning models and hyperparameters, additional performance evaluations, extended figures and tables. See DOI: https://doi.org/10.1039/d5dd00173k.
Footnote
† Co-first authors.
This journal is © The Royal Society of Chemistry 2026