Using machine learning to map simulated noisy and laser-limited multidimensional spectra to molecular electronic couplings†
Abstract
Two-dimensional electronic spectroscopy (2DES) has enabled significant discoveries in both biological and synthetic energy-transducing systems. Although deriving chemical information from 2DES is a complex task, machine learning (ML) offers exciting opportunities to translate complicated spectroscopic data into physical insight. Recent studies have found that neural networks (NNs) can map simulated multidimensional spectra to molecular-scale properties with high accuracy. However, simulations often do not capture experimental factors that influence real spectra, including noise and suboptimal pulse resonance conditions, bringing into question the experimental utility of NNs trained on simulated data. Here, we show how factors associated with experimental 2D spectral data influence the ability of NNs to map simulated 2DES spectra onto underlying intermolecular electronic couplings. By systematically introducing multisourced noise into a library of 356 000 simulated 2D spectra, we show that noise does not hamper NN performance for spectra exceeding threshold signal-to-noise ratios (SNR) of ca. 12.4, 2.5, and 5.1 if uncorrelated additive, correlated additive, or intensity-dependent noise sources dominate, respectively. In stark contrast to human-based analyses of 2DES data, we find that the NN accuracy improves significantly (ca. 84% → 96%) when the data are constrained by the bandwidth and center frequency of the pump pulses. This result is consistent with the NN learning the optical trends described by Kasha's theory of molecular excitons. Our findings convey positive prospects for adapting simulation-trained NNs to extract molecular properties from inherently imperfect experimental 2DES data. More broadly, we propose that machine-learned perspectives of nonlinear spectroscopic data may produce unique and perhaps counterintuitive guidelines for experimental design.