When machine learning models learn chemistry I: quantifying explainability with matched molecular pairs
Abstract
Explainability methods are increasingly used in machine learning-driven research, but it remains challenging to assess their reliability without deeply investigating the specific problem at hand. In this work, we present a Python-based Workflow for Interpretability Scoring using matched molecular Pairs (WISP). This workflow can be applied to assess the performance of explainability methods on any dataset containing SMILES strings and is model-agnostic, making it compatible with any machine learning model. Evaluation on two physics-based datasets demonstrates that the explanations reliably capture the predictions of the respective machine learning models. Furthermore, our workflow reveals that explainability methods can only meaningfully reflect the property of interest when the underlying models achieve high predictive accuracy. The explainability performance on a test set can therefore serve as a quality measure of the underlying model. To ensure compatibility with any model type, we developed an atom attributor, which generates atom-level attributions for any model using any descriptor derivable from SMILES representations. This method can also be applied as a standalone explainability tool, independently of WISP. WISP enables users to interpret a wide range of machine learning models in the chemical domain and to gain insight into how these models operate and the extent to which they capture underlying chemical principles.
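To illustrate the core idea behind scoring explanations with matched molecular pairs, the sketch below shows one plausible agreement test: for two molecules that differ only at a single site, the change in atom-level attributions at that site should track the measured property difference. All names, data, and the scoring rule here are illustrative assumptions, not the actual WISP implementation.

```python
# Hypothetical sketch of the matched-molecular-pair (MMP) scoring idea.
# For a pair of molecules differing only at one site, the attribution
# difference over the changed atoms should have the same direction as
# the observed property difference. This is NOT the WISP code itself.

def pair_score(attr_a, attr_b, changed_a, changed_b, delta_property):
    """Directional agreement between attributions and the property change.

    attr_a, attr_b: per-atom attributions for molecules A and B.
    changed_a, changed_b: indices of the atoms that differ between the pair.
    delta_property: measured property(B) - property(A).
    Returns +1 if the attribution difference has the same sign as the
    property difference, -1 otherwise.
    """
    delta_attr = (sum(attr_b[i] for i in changed_b)
                  - sum(attr_a[i] for i in changed_a))
    return 1 if delta_attr * delta_property > 0 else -1

# Toy pair: molecule B modifies one atom of A; the property increases by 0.8.
attr_a = [0.10, -0.05, 0.20]   # attributions for molecule A
attr_b = [0.10, -0.05, 0.55]   # attributions for molecule B
score = pair_score(attr_a, attr_b, changed_a=[2], changed_b=[2],
                   delta_property=0.8)
print(score)  # 1 -> attribution change agrees with the property change
```

Averaging such pair-level agreement scores over all matched molecular pairs in a test set would yield a single interpretability score for a model-explainer combination, which is the kind of quantity the abstract refers to.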
