Benchmarking explainable AI methods for toxicophore detection and toxicity prediction

Dina Khasanova; Igor V. Tetko

doi:10.1039/D5DD00576K

Dina Khasanova*^abc and Igor V. Tetko*^ad

Author affiliations

* Corresponding authors

^a Helmholtz Munich – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Molecular Targets and Therapeutics Center, 85764 Neuherberg, Germany
E-mail: igor.tetko@helmholtz-munich.de

^b TUM School of Natural Sciences, Technical University of Munich, 85748 Garching, Germany
E-mail: dina.khasanova@tum.de

^c Molecular Networks GmbH (MN-AM), 90411 Nuremberg, Germany

^d BIGCHEM GmbH, Valerystr. 49, 85716 Unterschleißheim, Germany

Abstract

Recent studies have reported inconsistent behavior across explainable AI (XAI) methods in molecular property prediction, raising concerns about their reliability. This work investigates whether such inconsistencies arise from the XAI methods themselves or from the accuracy of the underlying predictive model. A high-accuracy model was first trained on deterministic functional-group labels, where all evaluated XAI methods consistently highlighted the correct atoms corresponding to the true structural motifs. The analysis was extended to mutagenicity prediction, where the methods again identified known toxicophores and chemically meaningful scaffolds. Model performance was then systematically degraded by introducing controlled amounts of label noise. As predictive accuracy decreased, agreement between XAI methods weakened gradually, and the highlighted features became less chemically relevant. When accuracy reached around 0.65, this trend changed, with a much sharper loss of agreement, indicating an explainability cliff. These findings underline the importance of assessing model accuracy before drawing conclusions from XAI outputs.
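The degradation experiment described above can be illustrated with a minimal sketch. The abstract does not say how label noise was injected or how agreement between XAI methods was scored, so both choices here are assumptions: labels are treated as binary and a fixed fraction is flipped at random, and agreement is measured as the Jaccard overlap of the top-k atoms ranked by two attribution methods. The function names `flip_labels`, `top_k_atoms`, and `top_k_agreement` are hypothetical and are not taken from the paper.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Flip a fixed fraction of binary labels (assumed symmetric noise model)."""
    rng = random.Random(seed)
    n_flip = round(noise_rate * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def top_k_atoms(scores, k):
    """Return the indices of the k atoms with the highest attribution scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_k_agreement(scores_a, scores_b, k):
    """Jaccard overlap of the top-k atoms from two methods (assumes k >= 1)."""
    a, b = top_k_atoms(scores_a, k), top_k_atoms(scores_b, k)
    return len(a & b) / len(a | b)


# Illustrative values only, not data from the study.
clean = [0, 1] * 50
noisy = flip_labels(clean, noise_rate=0.2)
method_a = [0.9, 0.8, 0.1, 0.0]  # per-atom attributions from one method
method_b = [0.7, 0.1, 0.9, 0.0]  # per-atom attributions from another method
agreement = top_k_agreement(method_a, method_b, k=2)
```

Repeating this at increasing noise rates, retraining the model each time, and plotting mean agreement against test accuracy is one way such a trend could be examined. Whether the sharp drop near an accuracy of 0.65 appears would depend on the model, the data, and the agreement metric actually used.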

This article is part of the themed collection: AI in Drug Discovery at ICANN2025

Article information

DOI: https://doi.org/10.1039/D5DD00576K
Article type: Paper
Submitted: 22 Dec 2025
Accepted: 05 May 2026
First published: 14 May 2026
This article is Open Access

Digital Discovery, 2026, Advance Article

D. Khasanova and I. V. Tetko, Digital Discovery, 2026, Advance Article, DOI: 10.1039/D5DD00576K

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
