D. Petersen‡a, L. Mavarani§‡a, D. Niedieker a, E. Freier¶a, A. Tannapfel b, C. Kötting a, K. Gerwert*a and S. F. El-Mashtoly a
aDepartment of Biophysics and Protein Research Unit Europe (PURE), Ruhr University Bochum, ND/04 Nord, 44780 Bochum, Germany. E-mail: gerwert@bph.ruhr-uni-bochum.de; Fax: +49 234 3214238; Tel: +49 234 3224461
bKlinikum Bergmannsheil, Ruhr-University Bochum, 44780 Bochum, Germany
First published on 3rd November 2016
The great capability of vibrational spectroscopy, such as Raman or infrared imaging, for the label-free classification of tissue has been shown in numerous publications (review: Diem et al., J. Biophotonics, 2013, 6, 855–886). Herein, we present a new approach, virtual staining, that improves the Raman spectral histopathology (SHP) images of colorectal cancer tissue by combining the integrated Raman intensity image in the C–H stretching region (2800–3050 cm−1) with the pseudo-colour Raman image. This allows the display of fine structures such as the filamentous composition of muscle tissue. The morphology of the virtually stained images is in agreement with the gold standard in medical diagnosis, haematoxylin–eosin (H&E) staining. The virtual staining image also represents the whole biochemical fingerprint, and several tissue components including carcinoma were identified automatically with high sensitivity and specificity. For fast tissue classification, a similar approach was applied to coherent anti-Stokes Raman scattering (CARS) spectral data, which can be acquired faster and are therefore potentially more suitable for clinical applications.
In the last decade, several studies have shown that spectral histopathology (SHP) is capable of classifying different tissue types and especially diseased tissue such as cancer.4–11 The measured vibrational spectra are integral signals of the proteome, genome, and metabolome. Thus, when vibrational spectra are collected from distinct regions of, for example, tissue sections, variations in the spectral patterns are detected and can be correlated with the tissue types or carcinoma from which the spectra are collected. For instance, colorectal and lung carcinoma are identified in this regard by infrared (IR) imaging.12–16
Several groups have shown the application of Raman and coherent anti-Stokes Raman scattering (CARS) imaging to colon tissue.17–22 In all cases normal and carcinoma tissue were successfully distinguished, but most of these studies lack elaborate automated bioinformatics. We have recently established a workflow that includes Raman microscopy, bioinformatics, histopathology, and immunohistochemistry (IHC) (Fig. S1 in the ESI†) to automatically classify different tissue types and cancer regions.23 The workflow is divided into training and validation stages. In the training stage, Raman spectral imaging of thin sections of colon tissue was performed. Hierarchical cluster analysis (HCA) of the Raman spectroscopic data was performed as an unsupervised segmentation. In this segmentation similar spectra were grouped into clusters, producing a pseudo-colour image. H&E and/or IHC staining was performed on adjacent thin tissue sections. Images of these stained sections were annotated by the pathologist and then compared with the pseudo-colour images to identify the corresponding Raman spectral “fingerprints” of different tissue types, including cancer. These spectral “fingerprints” were used as a database to perform a supervised classification through a classifier such as random forest (RF).24 RF classifiers are accurate and robust against over-fitting. In the validation stage, Raman spectral maps of new thin tissue sections were measured and automatically annotated by the trained RF. By using this workflow, our preliminary results of Raman based RF with 532 nm excitation displayed carcinoma regions and cells such as lymphocytes and erythrocytes, in addition to an autofluorescence specific to p53 active areas in the crypt region of the lamina propria mucosae.23
This means that Raman SHP can resolve small structures like erythrocytes and lymphocytes and visualize detailed chemical and morphological compositions, owing to the higher spatial resolution of Raman imaging in comparison with IR imaging. This advantage allows us to detect borders and transitions between diseased and healthy tissue accurately, which is of importance in clinical diagnosis.25 As a result, not only can the carcinoma be resected precisely, but healthy tissue around the carcinoma is also spared, which can be crucial in some organs, for example, the brain.26
Herein, we present a new method of graphical representation, virtual staining, that adds the morphological information given by the Raman intensity to the RF pseudo-colour images. These label-free images with high spatial resolution enable a direct comparison with H&E stained images, and thus can help pathologists in their diagnosis, especially for questionable areas. Several tissue classes and carcinoma regions of colorectal carcinoma were identified and represented by highly resolved RF images. Furthermore, we extend our method to a fast tissue classification using CARS imaging of colorectal cancer tissues coupled with second harmonic generation (SHG), a combination that is well suited to clinical applications.
The tissue sections were mounted on reflective silver coated microscope slides (low-emissivity slides [Kevley Technologies, Chesterland, OH]) and deparaffinised before measurements. By using formalin-fixed, deparaffinised samples, which were stable over a long period of time, we were able to use the tissue slides for long term measurements and perform several Raman measurements on the same tissue slides. Subsequent H&E staining was performed on the measured tissue sections or adjacent thin tissue sections. For CARS measurements, tissue resections were first frozen in liquid nitrogen and cut with a cryotome. Afterwards, the tissue sections were mounted on glass slides (Menzel Glas, Braunschweig, Germany). These slides were kept in dry air before and during the CARS measurements, which required only a very short acquisition time. The subsequent H&E staining was conducted on the same tissue slide.
CARS imaging of tissue samples was performed on a commercial setup (TCS SP5 II CARS, Leica Microsystems, Heidelberg, Germany) as described previously.27 Briefly, two picosecond-pulsed laser beams were collinearly aligned and focused on the sample through an HCX IRAPO L (25×/0.95 W, Leica Microsystems) objective. Multispectral CARS and SHG datasets were acquired in a region between 2700 cm−1 and 3000 cm−1. The datasets consisted of 61 spectral images and the acquisition time for the whole dataset was ∼2 ms per pixel, which is more than 100 times faster than spontaneous Raman imaging. Areas of roughly 300 μm × 300 μm (1024 × 1024 pixels) were scanned in epi (backward) and forward directions.
Fig. 1 Multistep classification approach used for classification of different tissue components of colorectal carcinoma.
The lymph follicle and carcinoma spectra are difficult for the RF to discriminate, especially at a low signal to noise ratio (SNR). For this reason the carcinoma and lymph follicle classes were selected from the resulting RF. The corresponding spectra were parameterized by spectral curve deconvolution.31 Features at 1240, 1337, 1390 and 1580 cm−1 were used to distinguish between both components by linear discriminant analysis (LDA). Single cell nuclei have the same spectral fingerprint as carcinoma, but they are much smaller in spatial extent. Carcinoma-classified regions of up to 10 μm × 10 μm are therefore recognized as cell nuclei.
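The size-based relabelling of small carcinoma regions as cell nuclei can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `relabel_small_regions`, the use of 4-connected regions, the bounding-box size test, and the `pixel_size_um` parameter are all hypothetical, since the text gives only the 10 μm × 10 μm limit.

```python
from collections import deque


def relabel_small_regions(label_map, source, target, pixel_size_um,
                          max_extent_um=10.0):
    """Relabel connected `source` regions whose bounding box fits within
    max_extent_um x max_extent_um as `target`.

    Assumptions (not from the paper): 4-connectivity, bounding-box test,
    and a caller-supplied pixel size in micrometres.
    """
    height, width = len(label_map), len(label_map[0])
    out = [row[:] for row in label_map]           # leave the input untouched
    seen = [[False] * width for _ in range(height)]

    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or label_map[y0][x0] != source:
                continue
            # Breadth-first search collects one connected region.
            region = []
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx]
                            and label_map[ny][nx] == source):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in region]
            xs = [p[1] for p in region]
            extent_y = (max(ys) - min(ys) + 1) * pixel_size_um
            extent_x = (max(xs) - min(xs) + 1) * pixel_size_um
            if extent_y <= max_extent_um and extent_x <= max_extent_um:
                for y, x in region:
                    out[y][x] = target
    return out


# Toy label map: "C" = carcinoma, "M" = muscle, "N" = cell nucleus.
grid = [
    ["C", "M", "M", "M", "M"],
    ["M", "M", "C", "C", "C"],
    ["M", "M", "C", "C", "C"],
]
relabelled = relabel_small_regions(grid, "C", "N", pixel_size_um=5.0)
```

In the toy map, the isolated pixel becomes a nucleus, while the 15 μm wide block remains carcinoma.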
The concentration of a component is directly proportional to its Raman intensity. Utilizing this information, we created images that combine the integrated intensity of the CH-stretching vibration with the pseudo-colour image of the RF. In this study these images are referred to as virtual staining.
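A minimal sketch of such a combination is given here. It is not the authors' implementation: the blending rule (multiplying each class colour by the min-max normalised C–H intensity), the function names `integrate_band` and `virtual_stain`, and the colour palette are assumptions, and plain Python lists stand in for real hyperspectral arrays.

```python
def integrate_band(wavenumbers, spectrum, lo=2800.0, hi=3050.0):
    """Trapezoidal integral of one spectrum over [lo, hi] cm^-1.

    Only intervals lying fully inside the band are counted (a simplification).
    """
    total = 0.0
    for i in range(len(wavenumbers) - 1):
        x0, x1 = wavenumbers[i], wavenumbers[i + 1]
        if x0 >= lo and x1 <= hi:
            total += 0.5 * (spectrum[i] + spectrum[i + 1]) * (x1 - x0)
    return total


def virtual_stain(class_map, intensity_map, palette):
    """Scale each pixel's class colour by its normalised intensity.

    Assumed blending rule: RGB colour * weight, with weight in [0, 1].
    This yields a dark background where the intensity is low.
    """
    flat = [value for row in intensity_map for value in row]
    lowest, highest = min(flat), max(flat)
    span = (highest - lowest) or 1.0              # avoid division by zero
    image = []
    for class_row, intensity_row in zip(class_map, intensity_map):
        out_row = []
        for label, value in zip(class_row, intensity_row):
            weight = (value - lowest) / span
            out_row.append(tuple(round(c * weight) for c in palette[label]))
        image.append(out_row)
    return image


# Hypothetical two-pixel example: class 0 = carcinoma, class 1 = connective tissue.
palette = {0: (255, 0, 0), 1: (0, 255, 0)}
band_area = integrate_band([2800.0, 2900.0, 3000.0], [1.0, 1.0, 1.0])
stained = virtual_stain([[0, 1]], [[0.0, 2.0]], palette)
```

Multiplying by the intensity keeps the class colour (hue) from the classifier and lets the brightness carry the morphology, which is the idea behind the combined image.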
Multispectral CARS and SHG datasets were acquired in the 2700–3000 cm−1 region. The datasets consisted of 61 spectral images. CARS spectral datasets were normalized between 2700 cm−1 and 3000 cm−1 and k-means clustering29 was applied.
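The normalisation and clustering step can be illustrated with a small sketch. The text does not state which normalisation was used or how k-means was initialised, so vector (L2) normalisation, the deterministic initialisation from the first k spectra, and the names `vector_normalise` and `kmeans` are assumptions made only for this example.

```python
import math


def vector_normalise(spectrum):
    """Scale a spectrum to unit Euclidean length (assumed normalisation)."""
    norm = math.sqrt(sum(v * v for v in spectrum)) or 1.0
    return [v / norm for v in spectrum]


def kmeans(spectra, k, iterations=20):
    """Plain k-means on a list of spectra.

    Centroids start at the first k spectra, a simplification that keeps
    the example deterministic; real analyses usually use random restarts.
    """
    centroids = [list(s) for s in spectra[:k]]
    labels = [0] * len(spectra)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, spectrum in enumerate(spectra):
            distances = [sum((a - b) ** 2 for a, b in zip(spectrum, c))
                         for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [s for s, lab in zip(spectra, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy three-point "spectra" with two clearly different shapes.
raw = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0], [3.0, 0.0, 0.1], [0.0, 0.1, 5.0]]
normalised = [vector_normalise(s) for s in raw]
labels, centroids = kmeans(normalised, k=2)
```

Because the spectra are normalised first, the clusters reflect band shape rather than overall signal strength. Mapping each pixel's cluster label to a colour then gives the pseudo-colour image.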
As with the virtual staining of the RF images, the calculated pseudo-colour images of k-means were weighted by a combination of the CARS and SHG intensities at 2850 cm−1 and 408 nm, respectively.
Herein, by using the molecular information contained within the spectra, even more tissue components or cell types were automatically identified, such as carcinoma tissue, connective tissue, muscle tissue, erythrocytes, lymphocytes, lymph follicle and crypts. The classification of tissue components was enhanced in the present study by developing a new multi-step classification scheme. The scheme is divided into two parts. In the first part, a multi-step RF classifier (Fig. 1) was applied to assign spectra to the classes connective tissue, muscle, erythrocytes, lymphocytes, crypts, carcinoma and lymph follicle. In the second part, parameters from a curve deconvolution were calculated for spectra that were identified as carcinoma or lymph follicle. With these parameters, both classes were successfully reclassified by LDA. Classified carcinoma regions that were less than 10 μm × 10 μm in size were recognized to be cell nuclei. The datasets employed for the training step were excluded from validation. The datasets shown in this study for validation were acquired from an additional thin tissue section from a patient with low grade, stage I colorectal cancer.
An example of the Raman based SHP results from one patient will be presented here in detail. Fig. 2A displays the H&E stained tissue of a colorectal adenocarcinoma. Haematoxylin stains the cell nuclei in blue/purple, while eosin stains the cytosol in different red coloured shades.32 The annotations of tissue components were performed by an expert pathologist.
These annotations were used to determine the regions of interest for Raman micro-spectroscopic measurements in the next step. The left side region of this tissue section shows the carcinoma area, whereas the right side region is a non-cancerous (normal) region. In order to confirm the presence of carcinoma we performed IHC staining of the tissue with p53 and Ki-67 antibodies, which shows accumulation of p53 and Ki-67 proteins, respectively, in the left side region of the tissue (see Fig. S3 in the ESI†).32,33 According to the H&E and IHC staining, regions of interest were selected as displayed in Fig. 2. The selected regions show different tissue types characteristic of colon tissue. Panel B shows a carcinoma region, adjacent muscle and connective tissue. Panel C displays the border between the mucosa containing the colon crypts and the submucosa, separated by a thin muscle layer called lamina muscularis mucosae, which plays an important role in the diagnosis of colon cancer. In panel D a part of a lymph follicle is presented, whereas panel E shows the transition between the tunica muscularis and the tela serosa.
For the diagnosis, it is important to differentiate between cancer, crypts and the membrane muscle layer. Fig. 3A and B show the transition from the mucosa containing the crypts to the submucosa. The two tissue types are separated by the lamina muscularis mucosae. We were able to automatically identify not only the nuclei part of the crypt (dark purple) but also the lamina muscularis mucosae (salmon). Furthermore, connective tissue (green) and even cellular shaped features like lymphocytes (pink), erythrocytes (olive) and undefined cell nuclei (blue) were automatically identified. Although a few pixels were misclassified, Raman SHP in panel B reproduces all information from the H&E staining image (panel A). In Fig. 3C, the p53 IHC staining for the selected carcinoma region in Fig. 2B is shown. Cancer regions (red) were identified by Raman based SHP (Fig. 3D) and these results are in agreement with the IHC stained carcinoma area (Fig. 3C): the p53 active cancer regions (Fig. 3C) appear as the red region in the Raman SHP image (Fig. 3D). Inside the cancer region, remaining goblet cells (dark purple) and infiltrating immunocompetent cells (pink) were observed. The cancer region can be clearly separated from the neighbouring muscle region (salmon). Small morphological differences were observed between IHC and SHP images because adjacent tissue slices were used.
Fig. 3 Comparison of H&E and IHC staining with Raman SHP of selected regions from colorectal carcinoma tissue presented in Fig. 2. (A) H&E staining of a region showing the transition from the tunica mucosa to the tela submucosa as shown also in Fig. 2C. (B) Raman SHP of the same region shown in (A). Salmon: muscle, dark purple: crypts, green: connective tissue, blue: cell nuclei, olive: erythrocytes, pink: lymphocytes. (C) IHC staining by the p53 antibody of a region showing the transition between carcinoma tissue and healthy tissue as also indicated in the H&E image (Fig. 2B). Accumulation of p53 is shown in red. (D) Raman SHP of the same region shown in (C). Red: carcinoma, green: connective tissue, salmon: muscle, dark purple: crypts, blue: cell nuclei.
The mean training spectra of these components are shown in Fig. S4 (see the ESI†). The low standard-deviation of each class of the spectra, shown in grey, confirms the consistency of each class. In our continued approach towards Raman based automated SHP of colon carcinoma, we detected clear differences between carcinoma and connective tissue. Since the clinical use of the method is in focus, this clear differentiation was one of the main goals of our approach, and thus of great importance. The spectral differences between carcinoma and connective tissue are shown in detail in Fig. 4a. These spectra were improved regarding their standard deviation compared to our previous study.23 This is because new spectra were added for the training and spectra with a lower SNR were removed.
Fig. 4 Wavelet-denoised Raman mean spectra of carcinoma (red), connective tissue (green), and crypt (black) used in Raman RF. The spectra are shown in the 725–875 and 1210–1790 cm−1 regions.
Large spectral differences between carcinoma and connective tissue are found. For example, the spectral differences can be found at 1330 and 787 cm−1, indicating higher protein and DNA contents, respectively, in the carcinoma spectra. These results are similar to those reported previously.4 A peak also appears at 1586 cm−1, representing guanine and adenine (ring breathing modes of DNA bases).4 The higher amount of DNA caused by enhanced proliferation is confirmed by the IHC staining for Ki-67 (see Fig. S3†).
The increased content of protein and DNA in carcinoma was also found by a Raman imaging study on gastric cancer34 and in a fiber-optic approach for colon cancer.31 On the other hand, higher lipid contents were detected in the connective tissue spectra through the Raman bands at 860 and 1458 cm−1. Higher lipid contents were also observed in the crypt when comparing with carcinoma through a Raman band at 1458 cm−1 as shown in Fig. 4b. In addition, crypts and lamina muscularis mucosae can be separated more clearly (Fig. 3B) in the present study. The differentiation between the crypts in mucosa and lamina muscularis mucosae is crucial for cancer diagnosis. This is because adenomas are formed in the mucosa, while the penetration of the lamina muscularis mucosae layer by carcinoma is defined as invasive cancer (see details in CARS results).
Erythrocytes and lymphocytes were identified as distinct cell types. The class of the erythrocytes shows the most characteristic spectra, since their hemoglobin is in resonance with the 532 nm excitation laser. Due to normalization, the enhancement caused by the resonance is not seen in the spectrum (Fig. S4 in the ESI†). Nevertheless, the spectra show enhanced characteristic bands for heme.18,35,36
Differences in the spectra of lymph follicles and single lymphocyte cells in the tissue were also detected (see Fig. S5 in the ESI†). The lymphocyte spectra have a characteristic pattern due to the large lipid content, with a strong band around 1443 cm−1 assigned to the CH2 bending mode.37 Brown et al. showed that it was possible to differentiate lymphocytes in different stages.38 Furthermore, they reported small but significant differences in the Raman spectra of activated and non-activated T-lymphocytes. This could also be the reason for the spectral differences observed here between cells in the lymph follicle and the other lymphocytes within the tissue (see Fig. S5 in the ESI†). The characteristic lipid bands are less intense in the spectra of the lymph follicle, where a higher content of protein bands is found. Thus, these results demonstrate the capability of Raman based SHP as a label-free method for recognition of several tissue components simultaneously and in an automated way.
The area shows the outer muscle layer of the colorectum (muscularis propria) at the top right corner, and the adjacent connective tissue (subserosa) at the bottom left. Especially in the muscle area, the Raman based SHP (Fig. 5B) reveals a problem with a lower SNR and the single step RF: many misclassified pixels appear in the pseudo-colour image, giving the impression of a noisy image. By using a Gaussian filter (3 × 3 Gaussian matrix, σ = 1) and a multistep RF (see Data analysis section) the Raman based SHP was further improved as shown in Fig. 5C.
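A 3 × 3 Gaussian kernel with σ = 1 and its application to a single two-dimensional image can be sketched as below. The text does not specify which images were filtered or how borders were treated, so applying the kernel to one intensity image with replicated borders is an assumption, as are the function names `gaussian_kernel_3x3` and `smooth`.

```python
import math


def gaussian_kernel_3x3(sigma=1.0):
    """Return a 3 x 3 Gaussian kernel normalised to sum to 1."""
    raw = [[math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            for dx in (-1, 0, 1)]
           for dy in (-1, 0, 1)]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


def smooth(image, kernel):
    """Convolve a 2-D image (list of rows) with a 3 x 3 kernel.

    Border pixels are replicated, which is an assumed edge treatment.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    yy = min(max(y + ky, 0), height - 1)   # clamp to image
                    xx = min(max(x + kx, 0), width - 1)
                    acc += kernel[ky + 1][kx + 1] * image[yy][xx]
            out[y][x] = acc
    return out


kernel = gaussian_kernel_3x3(sigma=1.0)
# A single noisy pixel in an otherwise flat image is spread over its neighbours.
noisy = [[0.0, 0.0, 0.0], [0.0, 9.0, 0.0], [0.0, 0.0, 0.0]]
smoothed = smooth(noisy, kernel)
flat = smooth([[2.0] * 4 for _ in range(4)], kernel)
```

Spatial smoothing of this kind reduces isolated outliers before classification, at the cost of slightly blurring sharp tissue borders.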
The image displays a large improvement over the Raman based SHP shown in Fig. 5B. The muscle (salmon) and the connective tissue (green) can be separated precisely. Nevertheless, the representation of the tissue appears very flat and homogeneous within one component.
The concentration of molecules within a voxel is directly proportional to the integrated Raman intensities. Therefore, information on the detailed structures of a certain tissue can be obtained, for example, from the integrated signal of the CH stretching vibrations in the 2800–3050 cm−1 region.
A combination of the pseudo-colour map of the classification and the Raman intensities allows the display of fine structures such as the filamentous structure within the muscle tissue and its orientation (Fig. 5D). This increases the information content of the presented data and gives an impression of colouring the tissue structure with our classification, which is comparable to tissue stained with dyes. Thus, instead of using a dye as in H&E staining, this method provides a label-free way to virtually stain a tissue and is therefore a non-invasive approach. An advantage of label-free Raman imaging in comparison with H&E is that the same tissue section can also be used in a non-invasive manner for further analysis, like next generation sequencing, proteomic analysis, or immunohistochemistry. In principle, immunohistochemistry can be performed after H&E de-staining, but it is an invasive method and the probability of losing the tissue sections during this process is relatively high.39 The virtual staining approach was applied to the previously selected regions shown in Fig. 2B–E. Fig. 6 shows the H&E staining of the four selected regions (A, D, G, J) in direct comparison with the Raman SHP (B, E, H, K) and the virtually stained Raman images (C, F, I, L).
Fig. 6 Comparison of H&E staining, Raman SHP and Raman virtual staining. (A, D, G, J) H&E staining of selected regions of interest shown in Fig. 2. (B, E, H, K) Raman SHP of the same regions shown in (A, D, G, J). Red: carcinoma, green: connective tissue, salmon: muscle, dark purple: crypts, pink: lymphocytes, olive: erythrocytes, purple: lymph follicle. (C, F, I, L) Raman virtual staining, constructed from Raman SHP shown in (B, E, H, K) and their corresponding integrated Raman intensities in the 2800–3050 cm−1 region.
The picture clearly shows the enhanced visualization of the muscle fibres (salmon) and connective tissue (green). The resolution of Raman imaging intensities over the CH stretching vibrations is roughly equal to the conventional imaging of the H&E, and sometimes even seems to deliver a more detailed and sharper representation of the sample. The same images can be created with a black background if necessary (see Fig. S6–S8 in the ESI†).
The sensitivity and the specificity for carcinoma recognition are 96% and 98%, respectively, as shown in Table 1. This shows how precisely the carcinoma can be localized. Other approaches using Raman spectroscopy on carcinoma or basal cell carcinoma show similar results for the sensitivity and specificity.40,41
Tissue component | Sensitivity (%) | Specificity (%) |
---|---|---|
Carcinoma | 96 | 98 |
Crypts | 96 | 99 |
Lymph follicle | 86 | 99 |
Connective tissue | 93 | 99 |
Muscle | 98 | 99 |
Lymphocytes | 99 | 99 |
Erythrocytes | 100 | 100 |
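For reference, per-class sensitivity and specificity of the kind listed above follow from the standard one-versus-rest definitions, sensitivity = TP/(TP + FN) and specificity = TN/(TN + FP). The sketch below illustrates the calculation on made-up labels; the function name and the example data are hypothetical and do not reproduce the values in the table.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-versus-rest sensitivity and specificity for class `target`."""
    tp = fn = fp = tn = 0
    for truth, prediction in zip(true_labels, predicted_labels):
        if truth == target:
            if prediction == target:
                tp += 1          # target pixel correctly recognised
            else:
                fn += 1          # target pixel missed
        else:
            if prediction == target:
                fp += 1          # other tissue wrongly called target
            else:
                tn += 1          # other tissue correctly rejected
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up pixel annotations (pathologist) and classifier output.
truth = ["carcinoma", "carcinoma", "muscle", "muscle", "crypt"]
predicted = ["carcinoma", "muscle", "muscle", "carcinoma", "crypt"]
sens, spec = sensitivity_specificity(truth, predicted, "carcinoma")
```

With seven classes, as in the table, the calculation is simply repeated with each class in turn as the target.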
The virtually stained images give comparable results to the stained and labelled tissue sections. This supports Raman based SHP as a label-free supplement to the standard methods of diagnosis, such as H&E and IHC staining. It not only identifies biomarkers in human tissue, as shown here for colorectal cancer, but can also be used as a diagnostic assistant technique.
CARS imaging at a single wavenumber is very common in bio-spectroscopy. For instance, CARS imaging near 2850 cm−1 has been used to monitor the lipid distribution in tissues or lipid droplets in cancer cells.27,45–47 In addition to imaging at a single wavenumber, Potma et al.45 used CARS spectra in the C–H stretching region coupled to principal component analysis for imaging meibomian glands. We have also used a similar approach including CARS spectra in the C–H stretching region and cluster analysis to identify subcellular organelles of cancer cells.27 Generally, clustering of CARS spectra produces a pseudo-colour image, which represents the various spectral distributions over the examined tissue section. Different tissue components shown in the pseudo-colour images can be identified by comparison with the corresponding H&E stained image of the same tissue section or the next adjacent tissue slice, as shown in the workflow of SHP (Fig. S1†). This provides more information than can be obtained from CARS imaging at a single wavenumber. Here, we have used CARS spectra in the C–H stretching region in combination with SHG and cluster analysis to set the stage for automatic identification of different tissue components.
The H&E staining images shown in Fig. 7 display regions with benign (A) and carcinoma (E) morphological features. For instance, panel A shows the transition from the mucosa containing intact crypts to the submucosa, which are separated by the lamina muscularis mucosae. In panel E, cancer with low anaplasia can be seen. The original structure of the crypt regions can still be observed. Fig. 7B and F display the k-means clustering result of CARS spectra in the 2700–3000 cm−1 region of the tissue sections shown in panels A and E, respectively. These panels accurately reproduce the structures that are apparent in the H&E stained images: the intact crypts (cyan and pink), submucosa (dark blue), and lamina muscularis mucosae (purple) are clearly shown in panel B, while carcinoma (pink) is displayed in panel F. Examples of the CARS mean spectra that are obtained from k-means clustering are shown in Fig. S9 (see the ESI†).
Fig. 7C shows a combination of the SHG (408 nm) and CARS intensities at 2850 cm−1. SHG of tissues at 408 nm visualizes mainly the fibrous collagen network, whereas the CARS intensity at 2850 cm−1 depicts the lipid rich regions in the tissue. To obtain more information about the structural details, the pseudo-colour images of k-means (B and F) are combined with the intensity images of both SHG and CARS (C and G) and the results are displayed in panels D and H.
Although the CARS or SHG intensity of a component depends non-linearly on its concentration, such a combination improves the quality of the images as shown in detail in Fig. S10 and S11 (ESI†). These results are comparable to the H&E staining (Fig. 7A and E, see also Fig. S10 and S11†). The images (Fig. 7D and H) display an improvement compared with the pseudo-colour image of k-means clustering (Fig. 7B and F). For instance, the crypt (cyan and pink), the connective tissue (blue and olive), and lamina muscularis (for example purple) can be separated precisely from one another (Fig. 7D). The invasive carcinoma is clearly visible in panel H. Furthermore, the detailed structures of these components are more visible in Fig. 7D and H. As with the Raman virtual staining shown above (Fig. 6), images with high structural detail can be generated with a faster imaging technique such as CARS.
Recent studies focus on fast measurements of tissue using non-linear techniques such as SRS, but lack the bioinformatics.26 Pseudo-HE images of tissues were also created using CARS/TPEF/SHG to be used as a pathological screening tool.44 On the other hand, pseudo-colour images of k-means of CARS spectra in the CH stretching region and SHG at 408 nm were used to produce highly resolved pseudo-colour images that can be used to differentiate between different tissue types. A combination of the presented data evaluation and CARS measurements paves the way for fast clinical label-free diagnostics. Our next step is to perform CARS measurements of native colorectal cancer tissues from several patients to obtain a large dataset that enables us to perform automatic recognition of various tissue components including carcinoma.
Footnotes
† Electronic supplementary information (ESI) available: Fig. S1–S11 display the workflow, images of IHC, H&E staining, Raman SHP, and Raman virtual staining as well as the Raman and CARS mean spectra of different tissue components. See DOI: 10.1039/c6an02072k
‡ These authors contributed equally to this work.
§ Current address: Institute for High-Frequency and Communication Technology, University of Wuppertal, 42119 Wuppertal, Germany.
¶ Current address: Leibniz Institute for Analytical Science (ISAS), 44227 Dortmund, Germany.
This journal is © The Royal Society of Chemistry 2017