Identifying subpopulations in multicellular systems by quantitative chemical imaging using label-free hyperspectral CARS microscopy

Quantitative hyperspectral coherent Raman scattering microscopy merges imaging with spectroscopy and utilises quantitative data analysis algorithms to extract physically meaningful chemical components, spectrally and spatially-resolved, with sub-cellular resolution. This label-free non-invasive method has the potential to significantly advance our understanding of the complexity of living multicellular systems. Here, we have applied an in-house developed hyperspectral coherent anti-Stokes Raman scattering (CARS) microscope, combined with a quantitative data analysis pipeline, to imaging living mouse liver organoids as well as fixed mouse brain tissue sections xenografted with glioblastoma cells. We show that the method is capable of discriminating different cellular sub-populations, on the basis of their chemical content which is obtained from an unsupervised analysis, i.e. without prior knowledge. Specifically, in the organoids, we identify sub-populations of cells at different phases in the cell cycle, while in the brain tissue, we distinguish normal tissue from cancer cells, and, notably, tumours derived from transplanted cancer stem cells versus non-stem glioblastoma cells. The ability of the method to identify different sub-populations was validated by correlative fluorescence microscopy using fluorescent protein markers. These examples expand the application portfolio of quantitative chemical imaging by hyperspectral CARS microscopy to multicellular systems of significant biomedical relevance, pointing the way to new opportunities in non-invasive disease diagnostics.


S1. FUCCI2AR LIVER BILE DUCT ORGANOIDS
i. TPF z-stack TPF images and hyperspectral CARS z-stacks were acquired at 1 µm and 5 µm intervals, respectively, over a z-range of 30 µm. The pixel size in the xy plane was (86 nm) 2 . A pixel dwell time of 100 µs and 1 µs and PMT gains of 10 6 and 10 4.6 were used for TPF and CARS, respectively. A montage of all the TPF xy planes is presented in Fig. S1, yellow boxes indicate the z planes where corresponding hyperspectral CARS data sets were acquired. A total of 13 nuclei were identified, and numbered as indicated in Fig. S1.

ii. Calculating nucleus volumes
The 3D Objects Counter [1] plugin for ImageJ was used to quantitatively analyse the TPF z-stack to determine the nuclei volumes. A filters size of 500 voxels (minimum number of voxels needed for an object to be counted) and an intensity threshold value equivalent to 1.34 × 10 5 phe/s were used. This individually segmented the nuclei of cells 3, 4, 5, 6, 10, 11 and 12, while nuclei 7, 8, 9, and 13 were joined and required a manual separation as follows. The area around the nucleus was manually traced in all z-planes and all other cells removed, leaving only the nucleus of interest. This was then processed through the 3D Objects Counter with the same intensity threshold value of 1.34×10 5 phe/s. The TPF signal from the nuclei of cells 1 and 2 is too close to the background and could not be reliably segmented by 3D Objects Counter, therefore, cells 1 and 2 are not included in any of the subsequent analysis. An overview of the nuclei contour as a result of this segmentation is shown in Fig. S2. For each of the remaining 11 nuclei, the intensity (in phe/s) was summed over all the voxels within the nucleus, and multiplied by the voxel volume (in µm 3 ), to obtain the integrated intensity. This is plotted against the nucleus volume in the main manuscript  iii. FSC 3 factorisation The hyperspectral CARS dataset for the xy plane z = 12 µm was processed through our HIA/FSC 3 pipeline. A total of seven components were used for the factorisation which was   The FSC 3 components often contain a fraction of water, a result of the water being present across the imaged volume in each voxel. This is seen as a rising (χ) above 3100 cm −1 in the component spectra. To determine the dry volume fraction of the components, the water part is subtracted from the component spectra and the resulting dry spectra are integrated and compared with known integrals for the dry components. In detail, we subtracting a fraction (α) of the water spectrum (χ) W (component c 1 , Fig. S3) from spectrum of the component in question, resulting in its dry spectrum (χ) dry = (χ) − α (χ) W where the fraction α is chosen such that the (χ) dry is zero at 3150 cm −1 , as expected for the organic materials investigated.
Following the method outlined by Karuna et.al. [2], the dry fraction of component c i is where A i is the spectral integral of (χ) dry for component c i , and A Bulk is the corresponding integral of the dry bulk material of the component. The latter is estimated as A Bulk = A OA F i , where A OA is the spectral integral of the susceptibility measured for a known pure dry material (we used oleic acid, OA) and the factor F i is calculated assuming that the integral is proportional to the volume density of -H bonds.
The factor F i can be expressed as et.al. [2] and are given in table S1 alongside the integrals calculated from the dry spectra of S9. The dry vol/vol concentration is then taken as the dry fraction multiplied by the  corresponding integrals are given in table S1.

v. Calculating nucleus area
The concentration map of component c 5 clearly shows the nucleus area, see Fig. S4-S8.
For each xy plane imaged by CARS, the nucleus region was traced by hand in ImageJ.
These regions were then used as templates to calculate γ d i C i averaged over the nucleus area, for the concentration maps of components i = 2 to 6. TPF images were used to identify nuclei positive for the mVenus marker, and were labelled according to the assignment given in Fig. S1. Cell 1 and 2 were excluded, as they had a too low TPF intensity to reliable segment a nucleus volume, as explained in section S1 ii. If a nucleus appeared in more than one z plane,

vi. Analysis of non-fluorescent cells
The dry concentration γ d 3 C 3 + γ d 4 C 4 averaged over the nucleus area was also calculated for all cells that were imaged by hyperspectral CARS and did not exhibit a mVenus TPF localization in the nucleus. Nuclei areas were segmented in the hyperspectral CARS images, using the same procedure discussed in section S1 v. We considered two cases: all the nuclei areas, and a sub-set whereby we excluded nuclei with small areas (< 22 µm 2 ), which could be too close to the top/bottom edges. In this sub-set, we also excluded nuclei with very large areas (> 53 µm 2 ), to select cells in the G1 phase. An overview of γ d 3 C 3 + γ d 4 C 4 for each cell in the two sets, and the resulting mean, standard deviation, and standard error of the mean is shown in Fig. S12. 14 S2. BRAIN TISSUE WITH GBM

i. GSC-derived tumours -susceptibilities and concentration maps
For the mouse brain tissues xenografted with eGFP-labelled GBM stem cells (GSC), six regions of interest (ROIs) containing both cancer and normal cells were initially found, using wide-field epi-fluorescence and DIC imaging. A z-position approximately in the centre of the 30 µm-thick section was located using TPF imaging (Fig. S13). Alongside the TPF images, hyperspectral CARS was acquired over the range 2600-3700 cm −1 , with 5 cm −1 step size. Hyperspectral CARS was processed using our HIA/FSC 3 pipeline. 22

iii. Dry Spectra and integration ranges
Dry spectra were calculated as described in section S1 iv. The dry spectra for components c 2 to c 5 are shown in Fig. S23. Fig.S24 Fig. 7, with the addition here of region colour-coding for clarity.