Synthetic blood-based infrared molecular fingerprints: artificial cohorts for methodological research

Abstract

Infrared molecular fingerprinting of human blood samples provides a powerful, minimally invasive approach for disease detection and health monitoring. However, ethical and legal constraints often limit the sharing of real patient data collected in clinical studies. In this work, we present a synthetic dataset of blood-based infrared molecular fingerprints, generated using multivariate Gaussian models fitted on real measurements from a large case-control study targeting various cancer types. The synthetic dataset retains the statistical and physical properties of real molecular fingerprints, enabling the development and validation of analytical methodologies without compromising patient privacy. We demonstrate that the provided artificial dataset can serve as a proxy for real data in methodological research, facilitating reproducibility and collaboration in biomedical spectroscopy. This approach offers a practical way to overcome ethical barriers to clinical data sharing in spectroscopic biomarker research.
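The generation step described in the abstract (fitting multivariate Gaussian models to measured spectra and sampling from them) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the per-class fitting, the number of spectral points, and the randomly generated stand-in for real measurements are all assumptions made for illustration only.

```python
import numpy as np


def fit_gaussian(spectra):
    """Estimate the mean vector and covariance matrix from rows of spectra."""
    mean = spectra.mean(axis=0)
    cov = np.cov(spectra, rowvar=False)  # columns are spectral variables
    return mean, cov


def sample_synthetic(mean, cov, n_samples, rng):
    """Draw synthetic spectra from the fitted multivariate Gaussian."""
    return rng.multivariate_normal(mean, cov, size=n_samples)


rng = np.random.default_rng(0)

# Hypothetical stand-in for real measurements (no real data is used here):
# 200 spectra per group, each with 20 spectral points.
n_points = 20
real = {
    "case": rng.normal(loc=1.0, scale=0.1, size=(200, n_points)),
    "control": rng.normal(loc=0.9, scale=0.1, size=(200, n_points)),
}

# Assumption: one Gaussian is fitted per group, then sampled independently.
synthetic = {}
for label, spectra in real.items():
    mean, cov = fit_gaussian(spectra)
    synthetic[label] = sample_synthetic(mean, cov, 500, rng)
```

Under these assumptions, each synthetic group reproduces the first and second moments (means and covariances between spectral points) of its source group, while no individual synthetic spectrum corresponds to a measured sample.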

Article information

Article type: Paper
Submitted: 30 Dec 2025
Accepted: 26 May 2026
First published: 01 Jun 2026
This article is Open Access
Creative Commons BY license

Anal. Methods, 2026, Accepted Manuscript

N. Leopold-Kerschbaumer, N. Feiler and K. Kepesidis, Anal. Methods, 2026, Accepted Manuscript, DOI: 10.1039/D5AY02166A

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
