Explainable active learning framework for ligand binding affinity prediction

Satya Pratik Srivastava; Rohan Gorantla; Sharath Krishna Chundru; Claire J. R. Winkelman; Antonia S. J. S. Mey; Rajeev Kumar Singh

doi:10.1039/D5DD00436E

Satya Pratik Srivastava,^a Rohan Gorantla,^bc Sharath Krishna Chundru,^a Claire J. R. Winkelman,^c Antonia S. J. S. Mey*^c and Rajeev Kumar Singh*^a

Author affiliations

* Corresponding authors

^a Shiv Nadar University, Delhi-NCR, India
E-mail: rajeev.kumar@snu.edu.in

^b School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK

^c EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh EH9 3FJ, UK
E-mail: Antonia.mey@ed.ac.uk

Abstract

Active learning (AL) prioritises which compounds to measure next for protein–ligand affinity when assay or simulation budgets are limited. We present an explainable AL framework built on Gaussian process regression and assess how molecular representations, covariance kernels, and acquisition policies affect enrichment across four drug-relevant targets. Using recall of the top active compound, we find that dataset identity, that is, a target's chemical landscape, sets the performance ceiling, while method choices modulate outcomes rather than overturn them. Fingerprints with simple Gaussian process kernels provide robust, low-variance enrichment, whereas learned embeddings with non-linear kernels can reach higher peaks but with greater variability. Uncertainty-guided acquisition consistently outperforms random selection, yet no single policy is universally optimal; the best choice follows structure–activity relationship (SAR) complexity. To enhance interpretability beyond black-box selection, we integrate SHapley Additive exPlanations (SHAP) to link high-impact fingerprint bits to chemically meaningful fragments across AL cycles, illustrating how the model's attention progressively concentrates on SAR-relevant motifs. We additionally provide an interactive active learning analysis platform featuring SHAP traces to support reproducibility and target-specific decision-making.
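The loop described in the abstract (fit a Gaussian process, score the unlabelled pool with an acquisition policy, measure the selected batch, track recall of the top compounds) can be illustrated with a small sketch. This is not the authors' implementation: the synthetic binary "fingerprints", the Tanimoto kernel, the upper-confidence-bound policy with `beta = 2.0`, the batch sizes, the 5% cut-off for "top" compounds, and all function names are assumptions chosen for illustration. It also assumes larger `y` means stronger binding.

```python
import numpy as np

def make_synthetic_dataset(n=300, n_bits=64, seed=0):
    """Hypothetical stand-in for a fingerprint/affinity dataset (not real data)."""
    rng = np.random.default_rng(seed)
    X = (rng.random((n, n_bits)) < 0.2).astype(float)  # sparse binary "fingerprints"
    w = rng.normal(size=n_bits)                        # hidden per-bit contributions
    y = X @ w + 0.1 * rng.normal(size=n)               # larger y = stronger binder (assumed)
    return X, y

def tanimoto_kernel(A, B):
    """Tanimoto similarity between rows of two binary fingerprint matrices."""
    inter = A @ B.T
    denom = A.sum(axis=1)[:, None] + B.sum(axis=1)[None, :] - inter
    # Two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(denom > 0, inter / np.maximum(denom, 1e-12), 1.0)

def gp_posterior(X_train, y_train, X_pool, noise=1e-2):
    """Posterior mean and standard deviation of a GP with a Tanimoto kernel."""
    K = tanimoto_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = tanimoto_kernel(X_pool, X_train)
    L = np.linalg.cholesky(K)
    y_mean = y_train.mean()                            # constant mean function
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    mean = Ks @ alpha + y_mean
    v = np.linalg.solve(L, Ks.T)
    var = 1.0 - np.sum(v ** 2, axis=0)                 # k(x, x) = 1 for this kernel
    return mean, np.sqrt(np.clip(var, 0.0, None))

def run_active_learning(X, y, policy="ucb", n_init=10, batch=10,
                        n_cycles=5, beta=2.0, top_frac=0.05, seed=0):
    """Return recall of the top compounds after each acquisition cycle."""
    rng = np.random.default_rng(seed)
    n = len(y)
    k = max(1, int(top_frac * n))
    top = set(np.argsort(y)[-k:].tolist())             # indices of the best compounds
    labelled = rng.choice(n, size=n_init, replace=False).tolist()
    recalls = []
    for _ in range(n_cycles):
        pool = np.setdiff1d(np.arange(n), labelled)
        if policy == "random":
            picks = rng.choice(pool, size=batch, replace=False)
        else:
            mean, std = gp_posterior(X[labelled], y[labelled], X[pool])
            # "ucb" adds an uncertainty bonus; any other value is greedy on the mean.
            score = mean + beta * std if policy == "ucb" else mean
            picks = pool[np.argsort(score)[-batch:]]
        labelled.extend(picks.tolist())                # "measure" the selected batch
        recalls.append(len(top.intersection(labelled)) / len(top))
    return recalls

if __name__ == "__main__":
    X, y = make_synthetic_dataset()
    for policy in ("random", "greedy", "ucb"):
        print(policy, run_active_learning(X, y, policy=policy))
```

Running the script prints one recall trajectory per policy. On real targets the ranking of policies would be expected to depend on the dataset, in line with the abstract's finding that no single policy is universally optimal.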

Article information

DOI: https://doi.org/10.1039/D5DD00436E
Article type: Paper
Submitted: 29 Sep 2025
Accepted: 20 Dec 2025
First published: 23 Dec 2025
This article is Open Access

Digital Discovery, 2026, 5, 769-779

S. P. Srivastava, R. Gorantla, S. K. Chundru, C. J. R. Winkelman, A. S. J. S. Mey and R. K. Singh, Digital Discovery, 2026, 5, 769 DOI: 10.1039/D5DD00436E

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
