Finding the hidden catalytic knowledge from literature data
Abstract
Over decades of catalysis research, a vast amount of knowledge has been documented in the literature, yet its potential scientific value has not been fully explored. Catalytic performance is determined by multiple factors, such as electronic structure and reaction conditions, and follows complex nonlinear structure–performance relationships. At the same time, differences in experimental conditions and inconsistent data dimensions across studies make it difficult to use existing data directly for rule extraction and catalyst design. This perspective systematically summarizes three strategies for discovering new catalytic knowledge from literature data: first, leveraging “human intelligence” and statistical analysis to discover new catalytic knowledge through literature integration and mechanistic insights; second, constructing interpretable descriptors using symbolic regression and machine learning to achieve quantitative prediction of catalytic performance; and third, combining large language models and AI agents for multi-source data integration, knowledge extraction, and intelligent catalyst recommendation. Overall, these data-driven methods can transform scattered experience into computable design criteria, enabling a new paradigm for the rational design and efficient screening of catalytic materials and accelerating the development of catalysis research towards a digital materials ecosystem and a closed-loop research model that integrates AI, theoretical calculations, and autonomous experiments.
- This article is part of the themed collection: EES Catalysis Recent HOT articles
