An interpretable machine learning framework for prediction of adsorption energies and generative design of active sites on arbitrary catalysts

Matthew S. Johnson; Richard H. West; Judit Zádor

doi:10.1039/D5FD00143A

Matthew S. Johnson,*^a Richard H. West^b and Judit Zádor*^a

Author affiliations

* Corresponding authors

^a Combustion Research Facility, Sandia National Laboratories, Livermore, California 94550, USA
E-mail: jzador@sandia.gov, mjohnson541@gmail.com

^b Department of Chemical Engineering, Northeastern University, Boston, Massachusetts 02115, USA

Abstract

We present a highly interpretable and efficient machine learning framework for predictive and generative modeling of adsorption energies on surfaces using subgraph isomorphic decision trees (SIDTs). We extracted graph representations of 344 756 relaxed geometries and their associated adsorption energies from the OC20 database and used them to train a 24 777-node SIDT that achieves 0.36 eV MDAE, 0.54 eV MAE, and 0.82 eV RMSE. We then developed and implemented novel techniques to use SIDTs as generative models, enabling efficient catalyst optimization for arbitrary objective functions and constraints expressed in terms of the adsorption energies and prediction uncertainties of multiple adsorbates and the catalyst structure itself. In particular, our SIDT provides substructure representations of the subdistributions of adsorption energy, rather than mere samples from those subdistributions, as is common in traditional generative modeling. We show how this can be exploited for efficient and interpretable catalyst active site design in two examples. For the ammonia decomposition reaction sequence, we use our generative techniques to minimize the overall barrier height of the sequence, generating catalyst substructures predicted to decrease the overall barrier from 2.7 eV on Pt(111) to 0.4 eV. We also discuss how the accurate SIDT uncertainties and the interpretability of the SIDT can be exploited to identify regions of chemical space that need improved coverage and could be targeted with active-learning schemes.
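The three error figures quoted in the abstract can be illustrated with a minimal sketch. It assumes that MDAE denotes the median absolute error, MAE the mean absolute error, and RMSE the root-mean-square error; the function name `error_metrics` and the toy energies are hypothetical and are not taken from the article or from OC20.

```python
import math
import statistics


def error_metrics(predicted, reference):
    """Return (MDAE, MAE, RMSE) for paired energy values in eV.

    Assumes MDAE = median absolute error, MAE = mean absolute error,
    RMSE = root-mean-square error.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    abs_err = [abs(p - r) for p, r in zip(predicted, reference)]
    mdae = statistics.median(abs_err)
    mae = statistics.fmean(abs_err)
    rmse = math.sqrt(statistics.fmean([e * e for e in abs_err]))
    return mdae, mae, rmse


# Toy adsorption energies (eV), invented purely for illustration.
predicted = [-1.0, -0.5, 0.2, 1.5]
reference = [-1.1, -0.3, 0.2, 0.5]
mdae, mae, rmse = error_metrics(predicted, reference)
print(f"MDAE={mdae:.3f} eV  MAE={mae:.3f} eV  RMSE={rmse:.3f} eV")
```

In this toy data a single large error raises the MAE and RMSE well above the median, which is the same ordering (MDAE < MAE < RMSE) as the values reported in the abstract.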

This article is part of the themed collection: Bridging the Gap from Surface Science to Heterogeneous Catalysis Faraday Discussion

Article information

DOI: https://doi.org/10.1039/D5FD00143A
Article type: Paper
Submitted: 02 Dec 2025
Accepted: 06 Feb 2026
First published: 06 Feb 2026

Faraday Discuss., 2026, Advance Article

M. S. Johnson, R. H. West and J. Zádor, Faraday Discuss., 2026, Advance Article, DOI: 10.1039/D5FD00143A
