Graph neural networks for identifying protein-reactive compounds

Victor Hugo Cano Gil; Christopher N. Rowley

doi:10.1039/D4DD00038B

Victor Hugo Cano Gil^a and Christopher N. Rowley*^a

Author affiliations

* Corresponding authors

^a Department of Chemistry, Carleton University, 1125 Colonel By Dr, Ottawa, ON K1S 5B6, Canada
E-mail: christopherrowley@cunet.carleton.ca
Tel: +1(613) 520-2600 x 1647

Abstract

The identification of protein-reactive electrophilic compounds is critical to the design of new covalent modifier drugs, screening for toxic compounds, and the exclusion of reactive compounds from high-throughput screening. In this work, we employ traditional and graph machine learning (ML) algorithms to classify molecules as either reactive towards proteins or nonreactive. For training data, we built a new dataset, ProteinReactiveDB, composed primarily of covalent and noncovalent inhibitors from the DrugBank, BindingDB, and CovalentInDB databases. To assess the transferability of the trained models, we constructed a custom set of covalent and noncovalent inhibitors from the recent literature. Baseline models were developed using Morgan fingerprints as training inputs, but they performed poorly when applied to compounds outside the training set. We then trained various Graph Neural Networks (GNNs), with the best GNN model achieving an area under the receiver operating characteristic curve (AUROC) of 0.80, a precision of 0.89, and a recall of 0.72. We also explore the interpretability of these GNNs using Gradient-weighted Class Activation Mapping (Grad-CAM), which highlights the regions of a molecule that the GNN deems most relevant when making a prediction. These maps indicate that our trained models can identify electrophilic functional groups in a molecule and classify molecules as protein-reactive based on their presence. We demonstrate the use of these models by comparing their performance against common chemical filters, identifying covalent modifiers in the ChEMBL database, and generating a putative covalent inhibitor based on an established noncovalent inhibitor.
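The three metrics quoted in the abstract (AUROC, precision, and recall) can be illustrated with a minimal, dependency-free sketch. This is not the authors' evaluation code: the labels, scores, and the 0.5 decision threshold below are hypothetical values chosen only to show how each metric is computed for a binary reactive/nonreactive classifier.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for binary labels (1 = reactive, 0 = nonreactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when nothing is predicted or labelled positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and model scores, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

# Assumed decision threshold of 0.5; the paper's threshold is not stated here.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
area = auroc(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} AUROC={area:.4f}")
```

Precision and recall depend on the chosen threshold, whereas AUROC depends only on how the scores rank reactive compounds relative to nonreactive ones, which is why the three numbers are usually reported together.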

Article information

DOI: https://doi.org/10.1039/D4DD00038B
Article type: Paper
Submitted: 01 Feb 2024
Accepted: 23 Jul 2024
First published: 25 Jul 2024
This article is Open Access

Digital Discovery, 2024, 3, 1776-1792

V. H. Cano Gil and C. N. Rowley, Digital Discovery, 2024, 3, 1776 DOI: 10.1039/D4DD00038B

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
