SmartCIF: A Context-Aware Multi-Agent System for Automated Preprocessing and Curation of MOF CIFs
Abstract
Computational screening of metal-organic frameworks (MOFs) relies on crystallographic inputs that are commonly treated as “computation-ready”. In practice, however, conventional crystallographic information file (CIF) preprocessing often applies fixed-parameter treatments, overlooking the structural details described in the original reports. To address this, we introduce SmartCIF, a context-aware, literature-integrated framework that redefines CIF preprocessing as an explicit, assumption-driven procedure. SmartCIF couples topology-based structural analysis with natural-language reasoning over the original publications to make chemically informed decisions about which CIF components to retain or remove, according to the user’s computational objectives. Benchmarking across 321 MOFs against reported BET surface areas and CO2/N2 adsorption data demonstrates that SmartCIF reconciles geometric accessibility with chemical fidelity, avoiding both pore-blocking and over-opened, nonphysical results, based on the original publications. These results establish that CIF preprocessing is inherently application-dependent and that treating preprocessing assumptions as explicit, controllable variables is essential for reproducible and interpretable high-throughput screening. The assumption-aware paradigm embodied by SmartCIF generalizes existing computation-ready resources and provides a flexible foundation for large-scale simulations beyond adsorption.