Removing non-resonant background from broadband CARS using a physics-informed neural network

Broadband coherent anti-Stokes Raman scattering (BCARS) is capable of producing high-quality Raman spectra spanning broad bandwidths, 400-4000 cm⁻¹, with millisecond acquisition times. Raw BCARS spectra, however, are a coherent combination of vibrationally resonant (Raman) and non-resonant (electronic) components that may challenge or degrade chemical analyses. Recently, we demonstrated a deep convolutional autoencoder network, trained on pairs of simulated BCARS-Raman datasets, which could retrieve the Raman signal with high quality under ideal conditions. In this work, we present a new computational system that incorporates experimental measurements of the laser system spectral and temporal properties, combined with simulated susceptibilities. Thus, the neural network learns the mapping between the susceptibility and the measured response for a specific BCARS system. The network is tested on simulated and measured experimental results taken with our BCARS system.

In Figures S1 and S2, the full spectrum of the signal retrieved using VECTOR2 and the Kramers-Kronig (KK) method is shown for glycerol, polymer and PMMA (Fig. S1) and for polystyrene, ethanol and benzonitrile (Fig. S2). The KK method was implemented using the procedure published in Ref. 1. All spectra in these figures were normalised to the maximum value and scaled up in the fingerprint region (< 2252 cm⁻¹) by an integer factor, as shown in each figure. The asymmetric least-squares detrender was used for phase-error correction in the KK results with a smoothing parameter of 1×10⁻³. The asymmetry parameter used was 1×10⁻³ for the CH region and 1×10⁻² for the fingerprint region. Also shown is the corresponding spontaneous Raman spectrum measurement for all six analytes. It is notable that the VECTOR2-retrieved spectra have lower noise than those retrieved using KK, and in many cases the peaks appear sharper. The benefit of the deconvolution, which is inherent in VECTOR2, is also apparent in some cases; for example, for benzonitrile the two peaks in the 950-1050 cm⁻¹ band are clearly resolved by VECTOR2 but not by KK. The accuracy of the VECTOR2-retrieved spectra is significantly superior in the CH region for most cases. It should be noted that KK is capable of providing excellent results in this region when the SNR is higher than that provided by the stimulation profile of our system. This highlights the capacity of VECTOR2 to retrieve spectra from noisy CARS spectra.
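The asymmetric least-squares detrending mentioned above can be illustrated with a minimal sketch. It assumes the widely used iteratively reweighted formulation with a second-difference smoothness penalty and is not the implementation used for the KK results. The function name `als_baseline`, the synthetic signal, and the values of `lam` and `p` are illustrative assumptions; parameter scaling conventions differ between implementations, so these values are not comparable with the smoothing and asymmetry parameters quoted in the text.

```python
import numpy as np

def als_baseline(y, lam=1e4, p=1e-3, n_iter=20):
    """Estimate a slowly varying baseline by asymmetric least squares.

    lam : smoothness penalty weight (illustrative value, assumption)
    p   : asymmetry parameter; points above the baseline get weight p,
          points below get weight 1 - p (illustrative value, assumption)
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator, shape (n - 2, n)
    D = np.diff(np.eye(n), n=2, axis=0)
    penalty = lam * (D.T @ D)
    w = np.ones(n)
    for _ in range(n_iter):
        # Weighted least squares with a smoothness penalty
        z = np.linalg.solve(np.diag(w) + penalty, w * y)
        # Down-weight points lying above the current baseline (peaks)
        w = np.where(y > z, p, 1.0 - p)
    return z

# Synthetic test signal: linear trend plus one narrow peak (hypothetical data)
x = np.linspace(0.0, 1.0, 301)
true_baseline = 0.5 + 0.3 * x
peak = np.exp(-0.5 * ((x - 0.5) / 0.01) ** 2)
y = true_baseline + peak

baseline = als_baseline(y)
detrended = y - baseline
```

In this sketch the dense solve keeps the example short; for long spectra a sparse banded solver would be the more usual choice.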

OBSERVED RETRIEVAL OF RESONANCE IN THE "SILENT REGION" OF BENZONITRILE
It is important to stress that, being a convolutional deep-learning network, VECTOR has translational invariance in resonance recovery. For example, in the benzonitrile spectrum a strong resonance is observed at 2200 cm⁻¹, as shown in Fig. S2, which manifests in the CARS spectrum even in a relatively weak band of the stimulation profile. No peaks in this region were ever included in the training set, yet VECTOR2 retrieves the resonance accurately. Although the recovery of a resonance from a CARS interference line-shape is translationally invariant, the network can also be considered translationally variant in the sense that these resonances are effectively scaled in response to the laser stimulation profile, which varies as a function of wavenumber. It is possible that this scaling operation occurs in the first few layers of the autoencoder.

POSSIBLE INCLUSION OF STOKES PHASE
VECTOR2 has not been trained to retrieve the phase of S; however, a separate autoencoder could be trained to do this. It is assumed that the phase of the stimulation profile (S) is flat in all of our simulations, which is approximately true for our experimental system. The impact of a non-flat phase in the stimulation profile is twofold:
1. It is important to emphasise that a non-flat phase of S must result from a non-flat phase of the Stokes field (E_s). This could significantly affect the amplitude of S in the fingerprint (3-colour) region. For example, a heavily chirped Stokes pulse would result in a negligible stimulation profile in the fingerprint region.
2. For a given S, a slowly varying phase will have little or no impact when compared with a flat phase. This results from the convolution that describes the overall process.
Below is the fundamental equation governing the CARS process.
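A form of this relation that is widely used in the BCARS literature is sketched here. The symbols E_p (pump field), χ^(3) (third-order nonlinear susceptibility), and the cross-correlation and convolution operators ⋆ and * are assumed notation rather than taken from this text, and normalisation constants are omitted, so the expression should be read as an illustration and not necessarily as the authors' exact equation. S(ω) denotes the stimulation profile formed from E_s ⋆ E_p.

```latex
I_{\mathrm{CARS}}(\omega) \propto
\left| \left\{ \left[ E_s(\omega) \star E_p(\omega) \right] \chi^{(3)}(\omega) \right\} * E_{pr}(\omega) \right|^2
\approx \left| S(\omega) \right|^2 \, \left| \chi^{(3)}(\omega) * E_{pr}(\omega) \right|^2
```

The approximation on the right treats the stimulation profile as effectively constant, in both amplitude and phase, over the support of E_pr.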
Electronic Supplementary Material (ESI) for Analytical Methods. This journal is © The Royal Society of Chemistry 2023

Fig. S1. Retrieved Raman spectrum of glycerol, polymer and PMMA using the Kramers-Kronig method [1] and VECTOR2. Also shown is the spontaneous Raman spectrum for each analyte. Spectra were initially normalised using data >2500 cm⁻¹, following which data to the left of the dashed vertical line were scaled for clarity with scale values shown. No post-processing or denoising was performed after phase retrieval in any case.

Fig. S2. Retrieved Raman spectrum of polystyrene, ethanol and benzonitrile using the Kramers-Kronig method and VECTOR2. Also shown is the spontaneous Raman spectrum for each analyte. Spectra were initially normalised using data >2500 cm⁻¹, following which data to the left of the dashed vertical line were scaled for clarity with scale values shown. No post-processing or denoising was performed after phase retrieval in any case.

So long as E_pr can be described as a narrow, Dirac delta-like function, the phase of S over the support of E_pr is approximately constant at every wavenumber position. However, if E_pr has a relatively larger support, such as for a sinc function associated with a flat-top temporal pulse, then the phase of S may vary appreciably over the support of E_pr. In such a case, the phase of S could affect the result of the convolution, and the approximation in the above equation no longer holds. Typically, sinc functions are avoided; however, given VECTOR2's capacity for deconvolution, they are a valid option with the potential for enhanced resolution.
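The dependence on probe width described above can be checked numerically with a small sketch. Everything in it is an assumption made for illustration: the susceptibility is a constant non-resonant term plus three Lorentzian resonances, the stimulation amplitude is a broad Gaussian, the phase of S is a slow quadratic, and the probe is a normalised Gaussian whose width is varied. A broad Gaussian stands in for a wide-support probe here; it is not a model of the sinc case. The sketch compares the intensity obtained with the quadratic phase against the flat-phase intensity for a narrow and a wide probe.

```python
import numpy as np

# Wavenumber axis in cm^-1 with 1 cm^-1 spacing (hypothetical grid)
w = np.arange(500.0, 3500.0, 1.0)

def lorentzian(w, centre, amp=10.0, gamma=10.0):
    """Complex Lorentzian resonance (illustrative parameters)."""
    return amp / (centre - w - 1j * gamma)

# Susceptibility: real non-resonant background plus three resonances
chi = 1.0 + sum(lorentzian(w, c) for c in (1000.0, 1600.0, 2900.0))

# Stimulation profile: broad amplitude and a slowly varying quadratic phase
s_amp = np.exp(-((w - 2000.0) / 1500.0) ** 2)
phase = 10.0 * ((w - 2000.0) / 1000.0) ** 2

def gaussian_probe(sigma, half_width=200):
    """Unit-sum Gaussian probe spectrum sampled on the same 1 cm^-1 grid."""
    u = np.arange(-half_width, half_width + 1, dtype=float)
    k = np.exp(-0.5 * (u / sigma) ** 2)
    return k / k.sum()

def phase_sensitivity(sigma, margin=200):
    """Largest intensity change caused by the phase of S, relative to the
    peak flat-phase intensity, ignoring the zero-padded edges."""
    k = gaussian_probe(sigma)
    i_flat = np.abs(np.convolve(s_amp * chi, k, mode="same")) ** 2
    s_complex = s_amp * np.exp(1j * phase)
    i_chirp = np.abs(np.convolve(s_complex * chi, k, mode="same")) ** 2
    inner = slice(margin, -margin)
    return np.max(np.abs(i_chirp[inner] - i_flat[inner])) / np.max(i_flat[inner])

err_narrow = phase_sensitivity(sigma=2.0)   # delta-like probe
err_wide = phase_sensitivity(sigma=30.0)    # probe with large support
print(f"narrow probe: {err_narrow:.4f}, wide probe: {err_wide:.4f}")
```

Under these assumptions the phase of S changes the result far less for the narrow probe than for the wide one, which is the behaviour the argument above relies on.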