Issue 36, 2025

A machine learning-based strategy for screening bioactive compounds in natural products: a case study on Hypericum perforatum L.

Abstract

As a medicinal plant, Hypericum perforatum L. (HPL) has a rich material basis, with multiple components jointly exerting biological activity. For quality control, it is crucial to screen for suitable quality markers based on its specific biological activities. This study explores the dose–effect relationship between multiple components in HPL and antioxidant activity, and integrates it with machine learning algorithms to construct a virtual screening model for natural antioxidants. High-resolution mass spectrometry was used to collect high-precision semi-quantitative data on HPL samples, and the in vitro antioxidant activity of the samples was determined. Taking the semi-quantitative data as the X values and the in vitro antioxidant activity data as the Y values, nine individual machine learning models and two ensemble learning models were established. Key antioxidant components were identified from feature importance scores across all models, combined with the dose–effect analysis. Molecular docking and molecular dynamics simulations of endogenous antioxidant mechanisms were then conducted for the screened components with high feature importance. The machine learning models established in this study can accurately predict the antioxidant activity of HPL samples. The multilayer perceptron regression (MLPR) model combined with the Bagging ensemble learning strategy showed the best performance, with a coefficient of determination (R²) of 0.9688 on the training set and 0.8761 on the prediction set. The root mean square error (RMSEp) and mean absolute error (MAEp) of the prediction set were 4.27% and 3.47%, respectively, against a mean DPPH scavenging activity of 55.59% for the prediction set. Subsequent molecular docking results confirmed that the 26 screened compounds have good in vitro antioxidant activity.
For the 26 screened potential bioactive substances, the Keap1/Nrf2/ARE pathway was selected to validate their potential endogenous antioxidant mechanism. Molecular docking and molecular dynamics simulations showed that hyperoside, isohyperoside, kaempferol-3-O-rutinoside, ligustroside, and rutin have excellent binding affinity for the Keap1 protein. This study developed a screening strategy for natural antioxidant components of HPL based on machine learning and non-targeted metabolomics, which can provide new insights for the discovery of key natural products in medicinal plants and for drug development.
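
The modelling and evaluation steps described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' code: the data are synthetic, a one-variable least-squares line stands in for the multilayer perceptron regressor, and every name (`fit_line`, `bagging_fit`, `bagging_predict`, `r2`, `rmse`, `mae`) is hypothetical. It shows only the general idea of Bagging (fit a base learner on bootstrap resamples and average the predictions) and how R², RMSE and MAE might be computed on a held-out prediction set.

```python
import math
import random


def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; a simple stand-in for the MLP regressor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # degenerate resample: fall back to predicting the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Fit one base learner per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, xs):
    """Average the predictions of all base learners."""
    return [sum(a + b * x for a, b in models) / len(models) for x in xs]


def r2(y_true, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_hat))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_hat)) / len(y_true))


def mae(y_true, y_hat):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_hat)) / len(y_true)


# Synthetic stand-in data (an assumption, not values from the paper):
# x mimics a semi-quantitative peak intensity, y a DPPH scavenging activity in %.
rng = random.Random(42)
x_all = [rng.uniform(0.0, 100.0) for _ in range(80)]
y_all = [20.0 + 0.5 * x + rng.gauss(0.0, 2.0) for x in x_all]

# Simple split into a training set and a held-out prediction set.
x_train, y_train = x_all[:60], y_all[:60]
x_pred, y_pred_true = x_all[60:], y_all[60:]

models = bagging_fit(x_train, y_train)
y_hat = bagging_predict(models, x_pred)

r2_p = r2(y_pred_true, y_hat)
rmse_p = rmse(y_pred_true, y_hat)
mae_p = mae(y_pred_true, y_hat)
mean_p = sum(y_pred_true) / len(y_pred_true)

print(f"R2p={r2_p:.3f}  RMSEp={rmse_p:.2f}  MAEp={mae_p:.2f}")
print(f"RMSEp relative to prediction-set mean: {100 * rmse_p / mean_p:.1f}%")
```

Applying the same ratio to the figures reported in the abstract gives a relative RMSEp of about 4.27/55.59 ≈ 7.7% and a relative MAEp of about 3.47/55.59 ≈ 6.2% of the mean DPPH scavenging activity of the prediction set.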

Article information

Article type: Paper
Submitted: 28 May 2025
Accepted: 06 Aug 2025
First published: 07 Aug 2025

New J. Chem., 2025, 49, 15709-15722

J. Qian, W. Nie, Z. Zhang, G. Fang, C. Li and W. Li, New J. Chem., 2025, 49, 15709, DOI: 10.1039/D5NJ02230D
