Issue 16, 2025, Issue in Progress

An approach of molecular-fingerprint prediction implementing a GAT

Abstract

In metabolomics, accurate compound identification is essential, but the vast number of metabolites makes it a significant challenge. This study proposes a molecular-fingerprint prediction method for compound identification based on the graph attention network (GAT) model. Fragmentation trees are computed from tandem mass spectrometry (MS/MS) data, and the resulting fragmentation-tree graph data are then processed with a technique inspired by natural language processing. The model, which combines a 3-layer GAT with two linear layers, is trained on these data. The results show that molecular fingerprints can be predicted from MS/MS spectra with a high degree of accuracy. First, the model achieves excellent performance on receiver operating characteristic (ROC) and precision–recall curves. Experiments with different training parameters identify edge features as the factor with the greatest influence on performance, and the model achieves better accuracy and F1 score than MetFID. Second, model performance was validated by querying molecular libraries with methods commonly used in related studies. With precursor-mass querying, the proposed model performs comparably to CFM-ID; with molecular-formula querying, it outperforms MetFID. This study demonstrates the potential of the GAT model for compound identification tasks and provides directions for further research.
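
The library-querying validation described in the abstract can be illustrated with a minimal sketch. It assumes that candidates are first filtered by precursor mass within a ppm tolerance and then ranked by Tanimoto similarity between the predicted fingerprint and each candidate's fingerprint. The abstract does not name the similarity measure or the tolerance, so both are assumptions here, and the function names, candidate names, masses, and 8-bit fingerprints below are hypothetical toy values, not data from the paper.

```python
from typing import Dict, List, Tuple

# A binary molecular fingerprint: one 0/1 entry per bit.
Fingerprint = Tuple[int, ...]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity of two equal-length binary fingerprints."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def rank_candidates(
    predicted: Fingerprint,
    precursor_mass: float,
    library: Dict[str, Tuple[float, Fingerprint]],
    ppm: float = 10.0,  # assumed tolerance, not taken from the paper
) -> List[Tuple[str, float]]:
    """Keep library entries within the mass tolerance, then sort by similarity."""
    tolerance = precursor_mass * ppm * 1e-6
    hits = [
        (name, tanimoto(predicted, fingerprint))
        for name, (mass, fingerprint) in library.items()
        if abs(mass - precursor_mass) <= tolerance
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy library: name -> (monoisotopic mass, fingerprint).
toy_library = {
    "cand_A": (180.0634, (1, 1, 0, 1, 0, 0, 1, 0)),
    "cand_B": (180.0640, (1, 0, 0, 1, 0, 1, 1, 0)),
    "cand_C": (194.0790, (1, 1, 0, 1, 0, 0, 1, 0)),
}
# Stand-in for a model output after thresholding each bit to 0 or 1.
toy_prediction = (1, 1, 0, 1, 0, 0, 0, 0)

if __name__ == "__main__":
    for name, score in rank_candidates(toy_prediction, 180.0634, toy_library):
        print(f"{name}: {score:.2f}")
```

In this toy case, cand_C is excluded by the mass filter even though its fingerprint matches cand_A, which shows why the precursor-mass step narrows the candidate set before any fingerprint comparison takes place.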

Article information

Article type: Paper
Submitted: 10 Feb 2025
Accepted: 02 Apr 2025
First published: 22 Apr 2025
This article is Open Access (Creative Commons BY license)

RSC Adv., 2025, 15, 12757-12764

C. Deng, C. Zhou, L. Shi and B. Wang, RSC Adv., 2025, 15, 12757, DOI: 10.1039/D5RA00973A

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
