Spectra to structure: contrastive learning framework for library ranking and generating molecular structures for infrared spectra

Abstract

Inferring a complete molecular structure from infrared (IR) spectra is a challenging task. In this work, we propose SMEN (Spectra and Molecule Encoder Network), a framework for scoring molecules against a given IR spectrum. The framework uses contrastive optimization to obtain similar embeddings for a molecule and its spectrum. For this study, we consider the QM9 dataset, with molecules consisting of fewer than 9 heavy atoms, and obtain simulated spectra. Using the proposed method, we can rank molecules by embedding similarity and obtain a Top 1 accuracy of ∼81%, a Top 3 accuracy of ∼96%, and a Top 10 accuracy of ∼99% on the evaluation set. We extend SMEN to build a generative transformer for direct molecule prediction from IR spectra. The proposed method can significantly help molecule library ranking tasks and aid the problem of inferring molecular structures from spectra.
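The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings are hand-made toy vectors, the function names (`cosine`, `rank_library`, `top_k_accuracy`) are hypothetical, and cosine similarity is assumed as the similarity measure. It shows only how a library can be ranked against a spectrum embedding and how Top k accuracy is computed from those rankings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_library(spectrum_emb, molecule_embs):
    """Return library indices sorted from most to least similar."""
    scores = [cosine(spectrum_emb, m) for m in molecule_embs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

def top_k_accuracy(spectrum_embs, molecule_embs, k):
    """Fraction of spectra whose true molecule (same index) is in the top k."""
    hits = 0
    for i, s in enumerate(spectrum_embs):
        if i in rank_library(s, molecule_embs)[:k]:
            hits += 1
    return hits / len(spectrum_embs)

# Toy embeddings (assumed for illustration); spectrum i belongs to molecule i.
# In a trained model these would come from the spectra and molecule encoders.
molecule_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
spectrum_embs = [[0.9, 0.1, 0.0], [0.6, 0.5, 0.0], [0.0, 0.2, 0.8]]

top1 = top_k_accuracy(spectrum_embs, molecule_embs, k=1)
top2 = top_k_accuracy(spectrum_embs, molecule_embs, k=2)
print(f"Top 1: {top1:.2f}, Top 2: {top2:.2f}")
```

In this toy library the second spectrum is ranked closest to the wrong molecule, so it counts as a miss at k = 1 but as a hit at k = 2. This is the same effect that makes the reported Top 3 and Top 10 accuracies higher than the Top 1 accuracy.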

Article information

Article type: Communication
Submitted: 17 May 2024
Accepted: 15 Oct 2024
First published: 17 Oct 2024
This article is Open Access (Creative Commons BY-NC license)

Digital Discovery, 2024, Advance Article

G. C. Kanakala, B. Sridharan and U. D. Priyakumar, Digital Discovery, 2024, Advance Article, DOI: 10.1039/D4DD00135D

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
