MOFReasoner: Think Like a Scientist: A Reasoning Large Language Model via Knowledge Distillation

Abstract

Large Language Models (LLMs) have the potential to transform chemical research. Nevertheless, their general-purpose design constrains scientific understanding and reasoning within specialized fields such as chemistry. In this study, we introduce MOFReasoner, a domain model designed to enhance scientific reasoning, using adsorption in Metal-Organic Frameworks (MOFs) as a case study. By employing knowledge distillation from teacher models and Chain-of-Thought (CoT) reasoning extracted from a corpus of 8242 research articles and over 500 reviews, we developed a domain chemical reasoning dataset. The LLMs were then fine-tuned on the domain chemical reasoning dataset together with general chemistry and general reasoning datasets. The model's performance was evaluated across four tasks: experimental studies, chemical mechanisms, application scenarios, and industrialization challenges. MOFReasoner outperformed existing general-purpose models such as GPT-4.5 and DeepSeek-R1. Furthermore, the model achieves prediction accuracy comparable to density functional theory (DFT) calculations, enabling material recommendation. This work underscores the potential of integrating domain-specific knowledge, CoT reasoning, and knowledge distillation to create LLMs that support scientific inquiry and decision-making within the discipline of chemistry.
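To make the distillation pipeline described above concrete, the sketch below shows one plausible way teacher-model outputs could be packaged as CoT fine-tuning records. This is an illustrative assumption, not the paper's actual implementation: the function names (`mock_teacher`, `distill_example`), the `<think>…</think>` trace format, and the example questions are all hypothetical stand-ins.

```python
# Hypothetical sketch of a knowledge-distillation data pipeline: a teacher
# model answers domain questions with chain-of-thought (CoT) traces, which
# are wrapped into instruction/response records for fine-tuning.
# All names and formats here are illustrative, not taken from the paper.
import json

def mock_teacher(question: str) -> dict:
    """Stand-in for a teacher-LLM call (in practice, an API request)."""
    return {
        "reasoning": f"Step 1: recall MOF adsorption principles relevant to: {question}",
        "answer": "Placeholder answer distilled from the teacher model.",
    }

def distill_example(question: str) -> dict:
    """Wrap a teacher response into a CoT fine-tuning record."""
    out = mock_teacher(question)
    return {
        "instruction": question,
        "response": f"<think>{out['reasoning']}</think>\n{out['answer']}",
    }

# Questions would be mined from the literature corpus in practice.
corpus_questions = [
    "Which MOF features govern CO2 uptake at low pressure?",
    "How does open-metal-site density affect water adsorption?",
]
records = [distill_example(q) for q in corpus_questions]
print(json.dumps(records[0], indent=2))
```

Records in this shape can be written out as JSONL and mixed with general chemistry and general reasoning data before supervised fine-tuning, mirroring the dataset composition the abstract describes.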

Article information

Article type
Paper
Submitted
25 Sep 2025
Accepted
08 Jan 2026
First published
09 Jan 2026
This article is Open Access
Creative Commons BY license

Digital Discovery, 2025, Accepted Manuscript

X. Bai, Z. Zheng, X. Zhang, H. Wang, R. Yang and J. Li, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00429B

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
