HS–SPME–GC–MS combined with machine learning methods for screening volatile quality indicators in Hypericum perforatum L.†
Abstract
Hypericum perforatum L. (HPL), a natural product with high medicinal value, exhibits diverse bioactivities. Its efficacy differs among plant parts, and the content of active ingredients also varies significantly among them. Elucidating the volatile compounds in the different medicinal parts of HPL and screening suitable active compounds as quality-control indicators are therefore essential for improving its quality. In this study, HS–SPME–GC–MS was used to characterize the volatile compounds in H. perforatum collected from Xinjiang, China. Subsequently, an OPLS-DA model was established to visualize differences among three distinct parts of the plant. Network pharmacology was then used to analyze the pharmacological activity of the identified differential metabolites. Finally, three classifiers (support vector machine, random forest, and K-nearest neighbor) were used to evaluate the suitability of the candidate quality markers. A total of 159 volatile compounds were identified by combining MS database matching with retention indices for rigorous evaluation and removal of redundant information, and 67 differential metabolites were screened by the OPLS-DA model. Furthermore, network pharmacological analysis revealed that 48 compounds were associated with 1159 target genes. Among these, 18 highly active compounds were selected as potential markers. All three classifiers demonstrated good performance across different variables. Ultimately, eight compounds were selected as markers for laboratory quality control of H. perforatum.
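The classifier comparison described above can be illustrated with a minimal sketch. It is not the authors' workflow: the data are synthetic (generated with scikit-learn's `make_classification`), the three classes stand in for three plant parts, the eight features stand in for eight marker compounds, and the hyperparameters, sample count, and five-fold cross-validation scheme are all assumptions chosen for illustration.

```python
# Illustrative sketch only: synthetic data, assumed hyperparameters.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical stand-in for a peak-area table:
# 90 samples, 8 candidate marker compounds, 3 plant parts.
X, y = make_classification(
    n_samples=90,
    n_features=8,
    n_informative=6,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=2.0,
    random_state=0,
)

# SVM and KNN are distance-based, so features are standardized first;
# random forest is scale-invariant and is used without scaling.
models = {
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Stratified folds keep the three classes balanced in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Mean cross-validated accuracy per classifier.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, acc in scores.items():
    print(f"{name}: mean CV accuracy = {acc:.3f}")
```

Repeating the same loop on different feature subsets (for example, all differential metabolites versus a reduced marker set) is one way to check whether a smaller panel of compounds still separates the plant parts.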