Raman spectrum matching with contrastive representation learning†
Raman spectroscopy is an important, low-cost, non-intrusive technique often used for chemical identification. Typical approaches identify a spectrum by comparing it with a reference database using supervised machine learning, which usually requires careful preprocessing and multiple spectra per analyte. We propose a new machine learning technique for spectrum identification using contrastive representation learning. Our approach requires no preprocessing and works with as few as one reference spectrum per analyte. It matches or significantly improves on the existing state-of-the-art analyte identification accuracy on two Raman spectral datasets and one surface-enhanced Raman spectroscopy (SERS) dataset, all consisting of single-component spectra. On the SERS dataset, we further demonstrate that identification accuracy can be increased by using conformal prediction to slightly enlarge the candidate set. Based on our findings, we believe contrastive representation learning is a promising alternative to existing methods for Raman spectrum matching.
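To make the matching and candidate-set ideas concrete, the following is a minimal sketch, not the implementation used in this work. It assumes that spectra have already been mapped to embedding vectors by some trained encoder (the toy vectors below stand in for those embeddings), that matching uses cosine similarity against one reference embedding per analyte, and that the candidate set is built with standard split conformal prediction using 1 minus cosine similarity as the nonconformity score. The function names, the score definition, and the calibration values are all illustrative assumptions.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def match_scores(query_emb, ref_embs):
    """Cosine similarity between one query embedding and each reference embedding."""
    return l2_normalize(ref_embs) @ l2_normalize(query_emb)


def conformal_threshold(cal_scores, alpha):
    """Split-conformal quantile of calibration nonconformity scores.

    cal_scores: 1 - similarity to the true analyte, for held-out spectra
                (assumed score definition, chosen for illustration).
    alpha:      target miscoverage rate.
    """
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    if k > n:  # too few calibration points for this alpha: keep every candidate
        return np.inf
    return np.sort(cal_scores)[k - 1]


def candidate_set(query_emb, ref_embs, threshold):
    """Indices of analytes whose nonconformity score is within the threshold."""
    scores = match_scores(query_emb, ref_embs)
    return np.flatnonzero(1.0 - scores <= threshold)


# Toy data: three analytes, one reference embedding each (hypothetical values).
refs = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
query = np.array([0.7, 0.6, 0.1])  # ambiguous between analytes 0 and 1

# Hypothetical calibration nonconformity scores.
cal = np.array([0.01, 0.02, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.90])

top1 = int(np.argmax(match_scores(query, refs)))
threshold = conformal_threshold(cal, alpha=0.25)
candidates = candidate_set(query, refs, threshold)
print(top1, threshold, candidates.tolist())
```

In this toy case the top-1 match alone commits to a single analyte, while the conformal candidate set also retains the second-closest one. This illustrates the trade-off described above: a slightly larger candidate set in exchange for a higher chance of containing the true analyte.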