Learning Rates: Predicting Rate Coefficients for Hydrogen Abstraction Reactions
Abstract
Modeling complex chemical systems, from sustainable aviation fuels to atmospheric chemistry, requires rate coefficients for thousands of elementary reactions, a task bottlenecked by traditional, low-throughput transition-state searches. Here we develop a high-throughput digital pipeline and a reaction-aware geometric message-passing framework that predicts the three parameters of the modified Arrhenius equation directly from molecular structure. A dataset of ~1,800 hydrogen-abstraction reactions was generated using automated workflows and high-level electronic-structure calculations. By incorporating reactive-atom distance (RAD) features, a novel data representation that addresses the "spatial blindness" of standard molecular graphs, the model achieves a cross-validated median error of 0.285 dex (a factor of ~1.9) in k(T) across 300--3000 K. Although accuracy is modestly lower in heteroatom-rich environments, the framework captures the underlying structural trends and directly yields the complete Arrhenius parameter triplet, giving a continuous temperature dependence over the entire evaluated range. These results establish reaction-aware representation learning as a scalable strategy for replacing weeks of quantum-chemical computation with near-instantaneous inference, and provide a clear path toward data-driven acceleration of kinetic modeling.
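To make the quantities in the abstract concrete, the following minimal sketch evaluates a modified Arrhenius expression and converts an error in dex into a multiplicative factor. It assumes the common form k(T) = A * T^n * exp(-Ea / (R*T)); the abstract does not state the exact convention or units used. The function names and the parameter triplet are hypothetical illustrations, not values or code from the paper.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def modified_arrhenius(T, A, n, Ea):
    """Rate coefficient from an (A, n, Ea) triplet.

    Assumed form: k(T) = A * T**n * exp(-Ea / (R * T)), with Ea in J/mol.
    """
    return A * T**n * math.exp(-Ea / (R * T))


def dex_error(k_pred, k_ref):
    """Absolute error in dex, i.e. |log10(k_pred / k_ref)|."""
    return abs(math.log10(k_pred / k_ref))


def dex_to_factor(dex):
    """Convert an error in dex to the equivalent multiplicative factor."""
    return 10.0 ** dex


# Hypothetical parameter triplet, for illustration only.
A, n, Ea = 1.0e5, 2.5, 30_000.0

# Evaluate k(T) at a few temperatures inside the 300-3000 K range.
temps = [300.0, 1000.0, 3000.0]
ks = [modified_arrhenius(T, A, n, Ea) for T in temps]

# The reported median error of 0.285 dex expressed as a factor.
factor = dex_to_factor(0.285)

for T, k in zip(temps, ks):
    print(f"T = {T:6.0f} K  k = {k:.3e}")
print(f"0.285 dex corresponds to a factor of about {factor:.2f}")
```

Because one triplet defines k(T) at every temperature, a single prediction of (A, n, Ea) gives a smooth curve over the whole range rather than independent estimates at each temperature.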