Deep Set Model for the Automated NMR Fingerprinting of Unknown Mixtures

Abstract

Elucidating unknown mixtures is a critical challenge in chemistry and chemical engineering. Nuclear magnetic resonance (NMR) spectroscopy is a powerful analytical technique that is generally well suited to this purpose. However, component-wise elucidation with NMR is tedious for complex mixtures, requires expert knowledge, and often yields ambiguous results. In contrast, identifying and quantifying structural groups in a mixture from NMR spectra is much more straightforward. In prior work, we introduced 'NMR fingerprinting' for the automated elucidation of structural groups in unknown mixtures, based on standard NMR experiments and support vector classification (SVC), a machine learning (ML) method. In the present work, we present a substantially advanced NMR fingerprinting method that employs a deep set model (DSM), which addresses major shortcomings of the SVC and integrates additional information from 2D NMR experiments. The DSM was trained on experimental NMR spectra of pure components from open-source databases, augmented with synthetic spectral data. It comprises invariant and equivariant network structures, which ensure that predictions are independent of the input order of the NMR signals. On experimental pure-component test data, the DSM performs excellently and significantly outperforms our previous approaches. Furthermore, we demonstrate the applicability of the DSM to unknown mixtures by predicting the structural groups from NMR spectra of test mixtures measured with a benchtop NMR spectrometer. The predictions agree very well with the true mixture compositions, highlighting the method's potential for efficient automated mixture analysis and providing a reliable basis for downstream tasks, such as thermodynamic modeling using group-contribution methods.
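
The role of the invariant and equivariant network structures can be illustrated with a minimal NumPy sketch. This is not the DSM described above: the layer sizes, the three features per NMR signal, the random weights, and the function names (equivariant_layer, invariant_readout, predict_groups) are all hypothetical and chosen only to show why pooling over the set of signals makes the prediction independent of their input order.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 features per NMR signal, 16 hidden units, 5 structural groups.
N_FEATURES, N_HIDDEN, N_GROUPS = 3, 16, 5

# Random weights stand in for trained parameters (assumption for illustration only).
W_self = rng.normal(size=(N_FEATURES, N_HIDDEN))
W_pool = rng.normal(size=(N_FEATURES, N_HIDDEN))
W_out = rng.normal(size=(N_HIDDEN, N_GROUPS))


def equivariant_layer(signals):
    """Permutation-equivariant layer: reordering the input rows reorders the output rows.

    Each signal is transformed on its own and additionally receives the same
    mean-pooled summary of the whole set.
    """
    pooled = signals.mean(axis=0, keepdims=True)  # order-independent context
    return np.tanh(signals @ W_self + pooled @ W_pool)


def invariant_readout(hidden):
    """Permutation-invariant readout: sum over signals, then one score per group."""
    logits = hidden.sum(axis=0) @ W_out
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid, one value per structural group


def predict_groups(signals):
    """Compose the equivariant layer with the invariant readout."""
    return invariant_readout(equivariant_layer(signals))


# A toy "spectrum" of 7 signals and a shuffled copy of the same signals.
signals = rng.normal(size=(7, N_FEATURES))
perm = rng.permutation(len(signals))

same_prediction = np.allclose(predict_groups(signals), predict_groups(signals[perm]))
print("prediction unchanged under reordering:", same_prediction)
```

Because the only operations that mix information across signals are a mean and a sum, shuffling the rows of the input cannot change the final group scores, which is the property the abstract refers to.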

Article information

Article type: Paper
Submitted: 07 Nov 2025
Accepted: 19 Feb 2026
First published: 20 Feb 2026
This article is Open Access
Creative Commons BY license

Digital Discovery, 2025, Accepted Manuscript

J. Wagner, K. Munneman, T. Specht, H. Hasse and F. Jirasek, Digital Discovery, 2025, Accepted Manuscript, DOI: 10.1039/D5DD00490J

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
