Semantic repurposing model for traditional Chinese ancient formulas based on a knowledge graph

Abstract

Drug repurposing can substantially reduce the cost and risk of drug discovery and is a useful way to recommend candidate drugs. However, because traditional Chinese medicine (TCM) formulas are multi-component, repurposing methods developed for western medicine are usually not applicable to them. In this study, we propose a strategy for multi-component formula (recipe) discovery that combines network structure with semantics. Following this strategy, we establish a semantic formula-repurposing model for TCM based on a link-prediction algorithm and a knowledge graph (KG). By integrating semantic embeddings with the KG network, the model supports effective repurposing of TCM formulas. First, we construct a KG covering more than 46 600 ancient formulas, with over 120 000 entities, 415 900 triples and 12 relations extracted from unstructured text by deep-learning techniques. Then, a link-prediction model is trained on the KG triples to learn semantic vectors for entities and edges. The formula-repurposing task is treated as computing the similarity between the semantic vectors of KG entities and those of the query. In the current version of the model, two modes of repurposing are tested: one searches for formulas similar to a query formula, and the other seeks a possible formula for rare or emerging diseases or epidemics. The former is based on the name of a formula; the latter is carried out through symptom entities. The experiments use an existing formula, Fufang Danshen Tablets, and the symptoms of COVID-19 as examples. The results agree well with existing clinical practice. This suggests that our model offers a comprehensive approach to constructing a KG of TCM formulas together with a formula-repurposing strategy, which can assist compound formula development and facilitate further research in multi-compound drug and prescription discovery.
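To make the two repurposing modes concrete, the following is a minimal sketch of similarity-based retrieval over entity vectors. It is not the authors' implementation: the entity names, the hand-picked 3-dimensional vectors, the use of cosine similarity, and the averaging of symptom vectors into a single query are all assumptions made for illustration. In the model described above, the vectors would instead be learned by a link-prediction model trained on the KG triples.

```python
import math

# Hypothetical toy vectors standing in for learned KG embeddings.
# Real entity vectors would come from a trained link-prediction model.
EMBEDDINGS = {
    "formula:A": [0.9, 0.1, 0.0],
    "formula:B": [0.8, 0.2, 0.1],
    "formula:C": [0.0, 0.9, 0.4],
    "symptom:fever": [0.1, 0.8, 0.3],
    "symptom:cough": [0.0, 0.7, 0.6],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(names):
    """Average the vectors of several entities (an assumed way to pool symptoms)."""
    vecs = [EMBEDDINGS[n] for n in names]
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def rank_formulas(query_vec, exclude=()):
    """Rank all formula entities by similarity to the query vector, best first."""
    scored = [
        (name, cosine(query_vec, vec))
        for name, vec in EMBEDDINGS.items()
        if name.startswith("formula:") and name not in exclude
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Mode 1: formulas similar to a query formula, looked up by name.
similar = rank_formulas(EMBEDDINGS["formula:A"], exclude={"formula:A"})

# Mode 2: candidate formulas for a set of symptom entities.
by_symptoms = rank_formulas(mean_vector(["symptom:fever", "symptom:cough"]))

print(similar[0][0], by_symptoms[0][0])
```

With these toy vectors, the nearest formula to `formula:A` is `formula:B`, and the symptom query ranks `formula:C` first, because its vector points in nearly the same direction as the pooled symptom vector.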

Article information

Article type: Paper
Submitted: 05 Aug 2025
Accepted: 28 Oct 2025
First published: 29 Oct 2025
This article is Open Access
Creative Commons BY-NC license

Digital Discovery, 2026, Advance Article

X. Dong, W. Zhao, F. Li, L. Hu, H. Li and G. Li, Digital Discovery, 2026, Advance Article, DOI: 10.1039/D5DD00344J

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

