Issue 11, 2025

AgreementPred: a cheminformatic framework for drug and natural product category recommendation based on multi-representation structural similarity data fusion

Abstract

Natural products offer a vast reservoir of bioactive compounds and play a crucial role in drug discovery. In the era of big data, annotating their pharmacological categories holds great potential for accelerating drug discovery and advancing mechanistic studies of herbal medicines. However, the vast majority of natural products remain unannotated. Existing recommendation frameworks for pharmacological categories are predominantly tailored to conventional drugs and frequently require extensive experimental data, which are typically lacking for natural products. Traditional cheminformatic approaches based on structural similarity, while widely adopted, often struggle to balance prediction recall and precision, which limits their overall effectiveness. This study proposes AgreementPred, a simple and explainable category recommendation framework for drugs and natural products based on multi-representation structural similarity data fusion. The framework uses PubChem compound annotations drawn from two compound classification systems, Anatomical Therapeutic Chemical (ATC) classification and Medical Subject Headings (MeSH), as category labels, extending its scope of application beyond conventional drugs. The similarity search results from 22 molecular representations were combined to improve prediction recall. The predicted annotations were then filtered by agreement score to enhance prediction precision. Compared with existing equivalent approaches, AgreementPred achieved a superior recall-precision balance in both ATC and category prediction tasks. With an agreement score threshold of 0.1, AgreementPred achieved a recall of 0.74 and a precision of 0.55 in category prediction for 1000 compounds from a pool of 1520 categories. Finally, AgreementPred was applied to 321 605 unannotated drugs and natural products. The resulting predictions are expected to contribute to drug discovery and to mechanistic studies.
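The fuse-then-filter idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the agreement score, so the code below assumes it is the fraction of molecular representations whose similarity search proposed a given category; that definition, the function names, the representation names and the toy data are all hypothetical and are not taken from the article.

```python
from collections import defaultdict


def agreement_scores(hits_by_representation):
    """Score each category by the fraction of representations proposing it.

    ASSUMPTION: this definition of "agreement score" is illustrative only;
    the article's exact formula is not given in the abstract.
    """
    n_representations = len(hits_by_representation)
    if n_representations == 0:
        return {}
    counts = defaultdict(int)
    for categories in hits_by_representation.values():
        for category in set(categories):
            counts[category] += 1
    return {cat: c / n_representations for cat, c in counts.items()}


def recommend(hits_by_representation, threshold=0.1):
    """Fuse all proposals, then keep categories scoring at or above threshold."""
    scores = agreement_scores(hits_by_representation)
    return {cat for cat, score in scores.items() if score >= threshold}


def recall_precision(predicted, true):
    """Set-based recall and precision for a single compound."""
    true_positives = len(predicted & true)
    recall = true_positives / len(true) if true else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical categories proposed for one query compound by four
# hypothetical representations (the article uses 22).
hits = {
    "rep_a": {"Anti-Inflammatory Agents", "Antioxidants"},
    "rep_b": {"Anti-Inflammatory Agents"},
    "rep_c": {"Anti-Inflammatory Agents", "Antineoplastic Agents"},
    "rep_d": set(),
}
true = {"Anti-Inflammatory Agents", "Antioxidants"}

loose = recommend(hits, threshold=0.1)   # keeps every proposed category
strict = recommend(hits, threshold=0.5)  # keeps only widely agreed categories

print(recall_precision(loose, true))
print(recall_precision(strict, true))
```

On this toy input the low threshold keeps all three proposed categories (recall 1.0, precision 2/3), while the higher threshold keeps only the category proposed by three of the four representations (recall 0.5, precision 1.0). This mirrors the trade-off the abstract describes: combining representations raises recall, and filtering by agreement raises precision.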

Article information

Article type: Paper
Submitted: 27 Jul 2025
Accepted: 29 Sep 2025
First published: 30 Sep 2025
This article is Open Access
Creative Commons BY-NC license

Digital Discovery, 2025, 4, 3304-3319

C. Sutcharitchan, B. Wang, D. Zhang, Q. Liu, T. Zhang, P. Zhang and S. Li, Digital Discovery, 2025, 4, 3304 DOI: 10.1039/D5DD00329F

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

