Reaction Center Prediction by Analyzing Attention of a Chemical Language Model
Abstract
Pretrained chemical language models are widely used to predict molecular properties and chemical reactions, yet interpreting their internal attention mechanisms remains difficult. Here, we analyze attention matrices from a chemical language model to extract information relevant to reaction center prediction. We introduce Peak-Activated Binary Attention (PABA), which binarizes attention matrices by retaining only peak values based on a parameter alpha. Using PABA, we identify a top-ranked attention head that effectively predicts reaction centers. We further develop a supervised extension, Supervised PABA (SPABA), which achieves a Matthews correlation coefficient (MCC) of 0.73 and outperforms existing supervised methods for reaction center prediction. SPABA reduces dependence on explicit reaction templates while preserving high accuracy and generalizability, providing a robust framework for reaction center prediction.
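The abstract does not give the exact PABA definition, but the idea of binarizing an attention matrix by retaining only peak values relative to a threshold parameter alpha can be sketched as follows. This is a hypothetical illustration: the function name `paba_binarize` and the specific rule (keep entries within a factor alpha of each row's maximum) are assumptions, not the paper's exact formulation.

```python
import numpy as np

def paba_binarize(attn, alpha=0.9):
    """Binarize an attention matrix, keeping only near-peak entries.

    Hypothetical sketch: an entry is set to 1 if it is at least
    alpha times the maximum attention weight in its row, else 0.
    The paper's exact use of alpha may differ.
    """
    attn = np.asarray(attn, dtype=float)
    row_max = attn.max(axis=1, keepdims=True)  # peak value per query token
    return (attn >= alpha * row_max).astype(int)

# Toy 3x3 attention matrix (rows sum to 1, as after softmax)
A = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.3, 0.3, 0.4]])
print(paba_binarize(A, alpha=0.9))
# → [[1 0 0]
#    [0 1 0]
#    [0 0 1]]
```

Under this reading, alpha controls how strictly an attention weight must approach its row's peak to survive binarization; the resulting binary matrix can then be compared against ground-truth reaction-center labels.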