Finding the Hidden Catalytic Knowledge from Literature Data
Abstract
Decades of catalysis research have documented a vast amount of knowledge in the literature, yet its scientific value has not been fully explored. Catalytic performance is determined by multiple factors, such as electronic structure and reaction conditions, and exhibits a complex, nonlinear structure-performance relationship. At the same time, differences in experimental conditions and inconsistent data dimensions across studies make it difficult to use existing data directly for rule extraction and catalyst design. This Perspective systematically summarizes three strategies for discovering new catalytic knowledge from literature data: first, leveraging “human intelligence” and statistical analysis to integrate the literature and derive mechanistic insights; second, constructing interpretable descriptors using symbolic regression and machine learning to predict catalytic performance quantitatively; and third, combining large language models and AI agents for multi-source data integration, knowledge extraction, and intelligent catalyst recommendation. Overall, these data-driven methods can transform scattered experience into computable design criteria, enabling a new paradigm for the rational design and efficient screening of catalytic materials. They can also accelerate the development of catalysis research toward a digital materials ecosystem and a closed-loop research model that integrates AI, theoretical calculations, and autonomous experiments.