Reaction Center Prediction by Analyzing Attention of a Chemical Language Model

Abstract

Pretrained chemical language models are widely used to predict molecular properties and chemical reactions, yet interpreting their internal attention mechanisms remains difficult. Here, we analyze attention matrices from a chemical language model to extract information relevant to reaction center prediction. We introduce Peak-Activated Binary Attention (PABA), which binarizes attention matrices by retaining only peak values based on a parameter alpha. Using PABA, we identify a top-ranked attention head that effectively predicts reaction centers. We further develop a supervised extension, Supervised PABA (SPABA), which achieves a Matthews correlation coefficient (MCC) of 0.73 and outperforms existing supervised methods for reaction center prediction. SPABA reduces dependence on explicit reaction templates while preserving high accuracy and generalizability, providing a robust framework for reaction center prediction.
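The abstract does not spell out the exact peak criterion behind PABA, so the following is only a minimal sketch under stated assumptions: an entry is treated as a "peak" when it is at least alpha times the maximum of its row, and a token is flagged when it has a peak on a token other than itself. Both rules, the helper names `binarize_peaks`, `flag_tokens`, and `mcc`, and the toy matrix and labels are hypothetical illustrations, not the authors' implementation. The MCC function follows the standard confusion-matrix definition.

```python
import numpy as np

def binarize_peaks(attn: np.ndarray, alpha: float) -> np.ndarray:
    """Assumed peak rule: keep entries >= alpha * row maximum, zero the rest."""
    row_max = attn.max(axis=1, keepdims=True)
    return (attn >= alpha * row_max).astype(int)

def flag_tokens(binary: np.ndarray) -> np.ndarray:
    """Assumed readout: flag token i if row i has a peak off the diagonal."""
    n = binary.shape[0]
    off_diag = binary * (1 - np.eye(n, dtype=int))
    return off_diag.any(axis=1).astype(int)

def mcc(y_true, y_pred) -> float:
    """Matthews correlation coefficient for binary labels."""
    t = np.asarray(y_true, dtype=bool)
    p = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(t & p))
    tn = int(np.sum(~t & ~p))
    fp = int(np.sum(~t & p))
    fn = int(np.sum(t & ~p))
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Toy 4-token attention matrix from a single head (made-up values).
attn = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])

binary = binarize_peaks(attn, alpha=0.9)
pred = flag_tokens(binary)          # tokens 1 and 2 attend strongly to each other
toy_labels = [0, 1, 1, 1]           # made-up reaction-center labels
score = mcc(toy_labels, pred)
```

In this toy case the head misses one labelled token, so the score falls below 1. Scoring each head this way against labelled reaction centers is one plausible way to rank heads, though the paper's actual ranking procedure may differ.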

Article information

Article type: Paper
Submitted: 31 Jan 2026
Accepted: 06 May 2026
First published: 08 May 2026
This article is Open Access under a Creative Commons BY license.

Digital Discovery, 2026, Accepted Manuscript

X. Xiong, R. Jia, Y. Tian, J. Chen and B. Tian, Digital Discovery, 2026, Accepted Manuscript, DOI: 10.1039/D6DD00055J

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.

