Large language models for porous materials: from text mining to autonomous laboratory

Seunghee Han; Taeun Bae; Junho Kim; Younghun Kim; Jihan Kim

doi:10.1039/D5DD00578G

Seunghee Han,^a Taeun Bae,^a Junho Kim,^a Younghun Kim^a and Jihan Kim*^a

Author affiliations

* Corresponding authors

^a Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
E-mail: jihankim@kaist.ac.kr

Abstract

Porous materials such as metal–organic frameworks (MOFs), covalent organic frameworks (COFs), zeolites, and porous carbons play central roles in gas storage, separation, catalysis, and environmental technologies. However, their design and discovery remain resource-intensive, relying heavily on expert intuition and fragmented knowledge distributed across the literature. Recent advances in large language models (LLMs) present new opportunities to accelerate these workflows by integrating scientific text mining, domain reasoning, and experimental planning. In this review, we outline the emerging role of LLMs across the porous materials research ecosystem. We first introduce the foundations of LLMs, followed by a discussion of NLP-based text mining for literature analysis. We then examine LLM adaptation including prompt engineering and fine-tuning, and autonomous research systems from human-in-the-loop to self-driving laboratories. For each domain, we summarize how LLM architectures are integrated with research systems, highlighting their applications, advantages, and limitations. Additionally, we discuss the current challenges of applying LLMs to porous materials, trade-offs between prompt engineering and fine-tuning, the influence of generation parameters such as temperature, and safety considerations in autonomous laboratory systems. Finally, we expect LLMs to advance toward multimodal reasoning, tighter integration with structured knowledge bases, and safer autonomous experimental workflows. Together, these developments suggest emerging LLM-driven paradigms that could transform the conceptualization, design, and synthesis of porous materials.

Article information

https://doi.org/10.1039/D5DD00578G

Article type: Review Article

Submitted: 23 Dec 2025

Accepted: 29 Mar 2026

First published: 07 Apr 2026

This article is Open Access

Digital Discovery, 2026, 5, 1470-1500

S. Han, T. Bae, J. Kim, Y. Kim and J. Kim, Digital Discovery, 2026, 5, 1470, DOI: 10.1039/D5DD00578G

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

