Can large language models predict the hydrophobicity of metal–organic frameworks?

Abstract

Recent advances in large language models (LLMs) offer a transformative paradigm for data-driven materials discovery. Herein, we explore the potential of LLMs for predicting the hydrophobicity of metal–organic frameworks (MOFs). By fine-tuning the state-of-the-art Gemini-1.5 model exclusively on the chemical language of MOFs, we demonstrate that it delivers weighted accuracies surpassing those of traditional machine learning approaches based on sophisticated descriptors. To interpret the chemical “understanding” embedded within the Gemini model, we conduct systematic moiety masking experiments, in which the fine-tuned model consistently retains robust predictive performance despite partial information loss. Finally, we show the practical applicability of the Gemini model via a blind test on solvent- and ion-containing MOFs. The results illustrate that Gemini, combined with lightweight fine-tuning on chemically annotated texts, can serve as a powerful tool for rapidly screening MOFs in pursuit of hydrophobic candidates. More broadly, our work underscores the potential of LLMs to offer robust and data-efficient approaches for accelerating the discovery of functional materials.
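
The abstract names two evaluation ideas, weighted accuracy and moiety masking, without giving their exact definitions. The sketch below is only an illustration under stated assumptions: weighted accuracy is taken to mean the average of per-class recalls (so a minority class counts as much as a majority one), and moiety masking is taken to mean replacing a substring of a text description with a placeholder token. The function names, the `[MASK]` token, and the example MOF string are hypothetical and are not taken from the article.

```python
from collections import defaultdict


def weighted_accuracy(y_true, y_pred):
    """Average of per-class recalls (an assumed reading of "weighted accuracy")."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    # Each class contributes equally, regardless of how many samples it has.
    return sum(correct[c] / total[c] for c in total) / len(total)


def mask_moiety(text, moiety, mask_token="[MASK]"):
    """Hide every occurrence of a moiety substring behind a placeholder token."""
    return text.replace(moiety, mask_token)


# Hypothetical labels and predictions for four MOFs.
labels = ["hydrophobic", "hydrophobic", "hydrophobic", "hydrophilic"]
preds = ["hydrophobic", "hydrophobic", "hydrophilic", "hydrophilic"]
score = weighted_accuracy(labels, preds)  # (2/3 + 1/1) / 2

# Hypothetical text description of a MOF with its carboxylate group masked.
masked = mask_moiety("metal: Zn; linker: c1ccc(cc1)C(=O)O", "C(=O)O")
```

Comparing the score on original and masked inputs is one simple way to probe how much a prediction depends on a given moiety; the article's actual protocol may differ.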

Article information

Article type: Paper
Submitted: 12 Feb 2025
Accepted: 04 Apr 2025
First published: 04 Apr 2025
Access: Open Access, Creative Commons BY-NC license

J. Mater. Chem. A, 2025, Advance Article

X. Wu and J. Jiang, J. Mater. Chem. A, 2025, Advance Article, DOI: 10.1039/D5TA01139F

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
