Data efficient molecular image representation learning using foundation models

Abstract

Deep learning (DL) in chemistry has seen significant progress, yet its applicability is limited by the scarcity of large, labeled datasets and the difficulty of extracting meaningful molecular features. Molecular representation learning (MRL) has emerged as a powerful approach to address these challenges by decoupling feature extraction from property prediction. In MRL, a deep learning network is first trained to learn molecular features from large, unlabeled datasets and then finetuned for property prediction on smaller, specialized data. While MRL methods have been widely applied across chemistry, these models are typically trained from scratch. Herein, we propose that foundation models can serve as an advantageous starting point for developing MRL models. Foundation models are large models trained on diverse datasets that can address a variety of downstream tasks. For example, large language models such as OpenAI's GPT-4 can be finetuned with minimal additional data for tasks considerably different from those they were trained on. Based on this premise, we leveraged OpenAI's vision foundation model, CLIP, as the backbone for MoleCLIP, a molecular image representation learning framework. MoleCLIP requires significantly less molecular pretraining data to match the performance of state-of-the-art models on standard benchmarks. Furthermore, MoleCLIP outperformed existing models on homogeneous catalysis datasets, underscoring its robustness to distribution shifts and its ability to adapt to varied tasks and datasets. This successful application of a general foundation model to chemical tasks highlights the potential of innovations in DL research to advance synthetic chemistry and, more broadly, any field where molecular property description is central to discovery.
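The transfer-learning pattern the abstract describes can be sketched in PyTorch: a pretrained vision backbone (in MoleCLIP's case, CLIP's image encoder) is reused as a feature extractor, and a small task-specific head is finetuned on a labeled property dataset. This is a minimal illustrative sketch, not the authors' implementation; `PropertyPredictor` is a hypothetical name, and the tiny randomly initialized encoder below stands in for a real pretrained CLIP image tower.

```python
import torch
import torch.nn as nn

class PropertyPredictor(nn.Module):
    """Pretrained image backbone + small regression head (MRL finetuning pattern)."""

    def __init__(self, backbone: nn.Module, embed_dim: int, n_targets: int = 1):
        super().__init__()
        self.backbone = backbone  # placeholder for a pretrained CLIP image encoder
        self.head = nn.Sequential(  # task-specific head trained on labeled data
            nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, n_targets)
        )

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        feats = self.backbone(images)  # molecular-image embeddings
        return self.head(feats)        # predicted property values

# Stand-in encoder; a real run would load pretrained CLIP weights instead.
toy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 128))
model = PropertyPredictor(toy_encoder, embed_dim=128)
preds = model(torch.randn(4, 3, 32, 32))  # batch of 4 mock "molecular images"
print(preds.shape)  # torch.Size([4, 1])
```

In practice the backbone can be kept frozen (training only the head) when labeled data are very scarce, or finetuned end-to-end at a low learning rate when more data are available.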

Graphical abstract: Data efficient molecular image representation learning using foundation models



Article information

Article type
Edge Article
Submitted
4 February 2025
Accepted
13 May 2025
First published
22 May 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY-NC license

Chem. Sci., 2025, Advance Article

Y. Harnik, H. Shalit Peleg, A. H. Bermano and A. Milo, Data efficient molecular image representation learning using foundation models, Chem. Sci., 2025, Advance Article, DOI: 10.1039/D5SC00907C

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.

To request permission to reproduce material from this article in a commercial publication, please go to the Copyright Clearance Center request page.

If you are an author contributing to an RSC publication, you do not need to request permission provided correct acknowledgement is given.

If you are the author of this article, you do not need to request permission to reproduce figures and diagrams provided correct acknowledgement is given. If you want to reproduce the whole article in a third-party commercial publication (excluding your thesis/dissertation for which permission is not required) please go to the Copyright Clearance Center request page.

Read more about how to correctly acknowledge RSC content.
