Morpho-molecular ex vivo detection and grading of non-muscle-invasive bladder cancer using forward imaging probe based multimodal optical coherence tomography and Raman spectroscopy †

Non-muscle-invasive bladder cancer affects millions of people worldwide, resulting in significant discomfort to the patient and potential death. Today, cystoscopy is the gold standard for bladder cancer assessment, using white light endoscopy to detect suspicious lesion areas, followed by resection of these areas and subsequent histopathological evaluation. Not only does the pathological examination take days, but due to its invasive nature, the performed biopsy can result in significant harm to the patient. Nowadays, optical modalities, such as optical coherence tomography (OCT) and Raman spectroscopy (RS), have proven to detect cancer in real time and can provide more detailed clinical information of a lesion, e.g. its penetration depth (stage) and the differentiation of the cells (grade). In this paper, we present an ex vivo study performed with a combined piezoelectric tube-based OCT-probe and fiber optic RS-probe imaging system that allows large field-of-view imaging of bladder biopsies, using both modalities and co-registered visualization, detection and grading of cancerous bladder lesions. In the present study, 119 examined biopsies were characterized, showing that fiber-optic based OCT provides a sensitivity of 78% and a specificity of 69% for the detection of non-muscle-invasive bladder cancer, while RS, on the other hand, provides a sensitivity of 81% and a specificity of 61% for the grading of low- and high-grade tissues. Moreover, the study shows that a piezoelectric tube-based OCT probe can have significant endurance, suitable for future long-lasting in vivo applications. These results also indicate that combined OCT and RS fiber probe-based characterization offers an exciting possibility for label-free and morpho-chemical optical biopsies for bladder cancer diagnostics.
… bands are typical collagen bands. One can observe that the lipid bands are related to the positive coefficients for tumor and the collagen bands to the negative coefficients for non-tumor biopsies.


Introduction
There are currently 3.4 million people affected by bladder cancer worldwide, 1 with 75% of the newly diagnosed cases being non-muscle-invasive bladder cancer (NMIBC). 2 Among the different types of NMIBC, the five-year recurrence rate is 20-80%. 3 The risk of recurrence and the repeated surveillance cystoscopy in the outpatient department (OPD), which is the standard procedure when suspicious lesions are detected, subsequently jeopardize the health of the patient. 4 The current gold standard in bladder cancer assessment is cystoscopy, which provides a visualization of the bladder mucosa by white light endoscopy (WL). Although the detection rate for muscle-invasive bladder cancer using WL is very high, 97-100%, the detection of NMIBC is still challenging. 5 Significant efforts go into developments to remedy this situation.
Along with the examination by WL, a biopsy is performed and the resected suspicious tissue is inspected by pathologists. 6 This allows access to clinically relevant information on the stage and grade of a lesion. The tumor stage classifies the invasion depth into the bladder wall. The tumor grade, on the other hand, describes the level of differentiation from healthy cells. 7 Histological diagnosis is defined in accordance with UICC's 2017 TNM classification 8 and the 2016 WHO classification. 9 There is a significant need to allow instant diagnosis during surveillance cystoscopy in the OPD, thus avoiding admittance to the operating theatre for diagnostic surgery.
In situ diagnostic procedures of stage and grade during surgery are currently not available, resulting in unnecessary tissue resection, which can cause significant negative effects on the patient's well-being. Photodynamic diagnosis (PDD), using hexylaminolevulinic acid as a photosensitizing compound for fluorescence guidance, is a promising approach, which may assist the presented optical diagnostic method. 6,10 However, PDD does not provide clinically relevant information on stage and grade for an immediate treatment decision. Due to these current shortcomings for the onsite diagnosis of NMIBC, there is a significant need for real-time assessment of the tumor stage and grade during cystoscopy.
Autofluorescence and diffuse reflectance have also been applied in bladder biopsies, demonstrating the capability to differentiate tumor and healthy bladder tissue with high sensitivities and specificities, where significant changes of tryptophan and collagen at 340 nm and 390 nm emission peaks were reported. 11,12 Nevertheless, in order to differentiate tumor grade in the bladder tissue, more biochemical information about the sample that can be associated with the main molecular changes related to tumor grading is needed.
In recent years, there have been extensive developments in optical technologies and their translation to clinical diagnostic applications, e.g. interference-based approaches, such as optical coherence tomography (OCT), and label-free molecular specific spectroscopy-based approaches, such as Raman spectroscopy (RS). OCT is widely used in ophthalmology to indicate anatomical changes in the retina by providing cross-sectional images in depth at micrometer resolution, allowing access to structural tissue information. 13,14 Since the eye is optically easily accessible, the retina can be examined non-invasively using a laser beam. However, to acquire information from the internal organs, OCT has to be extended with an optical endoscope. Such endoscopic OCT-based imaging has been reported for the cardiovascular system, the gastrointestinal tract and the urinary tract. 15-18 These implementations, however, only achieved B-scans; endoscopic volume images of the bladder tissue, which are paramount to characterize the tumor tissue, were not previously presented. A comprehensive overview on OCT methodology and applications is given elsewhere. 19 While OCT can rapidly provide morphological information from different depths, it lacks chemical information on the underlying molecular composition.
Raman spectroscopy, on the other hand, is based on inelastic light scattering between a photon and a molecule, providing label-free information on the molecular composition of a sample. RS provides information on molecular changes at a single cell level, 20-23 and has been extensively used for clinical tissue characterization. 24-26 It has been broadly applied in numerous studies for the diagnosis of cardiovascular diseases, 27 biochemical characterization of human cells 28 and organs, 29 including the discrimination of brain tumors 30,31 and malignant breast tissues, 32-34 and extensive research in cervical cancer, 35-37 lung cancer, 38 and colon, 39 prostate 40 and bladder cancers. 41,42 RS has been readily implemented in fiber optic probes for a variety of pathologies of different organs. 43
While each modality has intriguing capabilities, individually they can cover only a certain but complementary diagnostic aspect. As such, only a combination of both modalities can harness the full diagnostic potential: OCT provides information on the invasiveness of the tumor, i.e. stage, by taking cross-sectional images of the bladder wall at a micrometer scale, and RS provides information about the grade by assessing the biochemical composition of the superficial tissue. 44 Ko et al. reported on the first multimodal approach of combining OCT and RS to detect and characterize dental diseases. 45 Ever since, multimodal optical coherence tomography and Raman spectroscopy (OCT-RS) has been reported for imaging ex vivo human breast tissue samples and in vivo wound healing, 46 ex vivo human retina, 47 in vivo and ex vivo skin, 48-51 ex vivo atherosclerotic plaque deposition 52 and ex vivo rectal mucosa. 53 Furthermore, Ashok et al. combined OCT and RS to increase the sensitivity and specificity of colon cancer detection. 54 In that study, microscopy-based OCT achieved a sensitivity, specificity and accuracy of 78%, 74% and 75%, respectively. RS achieved 89% sensitivity, 77% specificity and 82% accuracy. Recently, Bovenkamp et al. showed the ability of microscopic OCT and RS to provide diagnostic information regarding stage and grade of bladder cancer. For differentiating the pT2 stage from the pTa stage, they reported a sensitivity, specificity and accuracy of 80%, 60% and 71%, respectively. Furthermore, the sensitivity of detecting high grade lesions with RS was 99%, whereas the specificity was 87%, indicating correct detection of low grade lesions. 55
To use these techniques in vivo in the clinics, they need to be integrated into rigid or flexible cystoscopes to allow in situ assessment of the bladder wall. As a first step in this direction, we report on the development of a combined OCT-Raman system for stage and grade examination of bladder biopsies, based on fiber-optical probes. The combined system mimics the optical performance of an endoscopic probe combining OCT and RS. It gives insight into in vivo conditions and the expectable outcome with the first performance benchmarks. Moreover, it allows one to co-register morphological and molecular information of NMIBC, using fiber optical probes, and enables new opportunities to interpret the data, by the assessment of co-localized molecular and morphological signals.
The developed system was used to characterize a total of 119 biopsies in an imaging fashion that mimics in vivo conditions, for the diagnostic evaluation for the detection of NMIBC, and for the grading of low- and high-grade tissues.

Biopsy handling
The study was approved by the Ethical Committee at the Capital Region of Denmark, H-17015549, and a data processor agreement between the universities in Jena and Vienna and the Capital Region of Denmark was made (HGH-2018-038. I-suite nr. 6639). Prior to the operations, the patients gave their written informed consent to have biopsies for the study taken. All operations were performed according to the guidelines of the Urological Department at Herlev Hospital, Capital Region of Denmark and the experiments were performed in accordance with the above approvals obtained. For the experiments, 119 biopsies from 44 patients were obtained during resection of the bladder tumor in the operation theatre. Immediately after the procedure, the biopsies were moistened with a sodium chloride solution and delivered to the imaging laboratory within 15 minutes. In case a sample could not be examined within 20 minutes after removal, it was snap-frozen to −20°C and imaged on the following 1-2 days. For examination on the OCT-RS setup, the biopsies were carefully placed on CaF2 glass slides and positioned on the translational stage (Fig. 1 and 2). After the combined analysis using OCT-RS, the biopsies were fixed and stained. The pathologist then staged and graded the biopsies based on the underlying histology. The histopathological results are summarized in Tables 1 and 2. The quantity of the extracted malignant biopsies reflects NMIBC samples. Since pTa and CIS are confined to the urothelium 57 and pT1a has only invaded the superficial lamina propria, they are considered non-muscle-invasive bladder cancer 58 and thus early stage.

Instrumentation
Swept source OCT (SS-OCT) system. The experiments were performed using a miniaturized imaging device that mimics an in vivo situation where the measurement would be performed using a fiber-optic probe. The OCT system (see Fig. 1) used an akinetic swept source laser (Insight Photonic Solutions, Inc., Lafayette, Colorado) 59 with a central wavelength of 1304 nm, a bandwidth of ∼90 nm (at 0 dB level) and a sweep frequency of ∼173 kHz. The OCT system 60 was based on a Mach-Zehnder interferometer configuration, with an output power of 11 mW at the tip of the endoscope. A system sensitivity of 104 dB was achieved, which is reduced to 99 dB in combination with the fiber-optic probe. The detailed description of the system incorporating the endoscopic probe can be found elsewhere. 61 For OCT imaging, a piezoelectric tube-based forward viewing endoscope with a diameter of 3 mm and a rigid length of 15.6 mm was used. The optical components were arranged in a Fourier-plane configuration to allow telecentric scanning across the tissue plane. 62 The field of view (FOV) was adjusted at stable scanning behavior to a diameter of 1-1.4 mm. The piezoelectric tube was driven at a quasi-resonance frequency of 510 Hz and scanned the attached optical fiber in a spiral pattern. The measured axial and lateral resolution was 12 µm and 28 µm in air, respectively. The confocal parameter was ∼950 µm. The working distance was ∼500 µm.

Fig. 1 Combined multimodal imaging system. The Raman setup consists of a single mode laser, a spectrometer and a PC. Besides a bright field camera (1), the Arduino board is also connected to the Raman PC. The Raman probe (2) receives the excitation light from the laser and guides back the Raman signal to the spectrometer. The swept source OCT (SS-OCT) setup includes an akinetic laser source, the interferometric optical setup including the photodiodes, driving electronics for the endoscope and a PC. The OCT probe (3) is optically connected to the optical setup. Moreover, the OCT probe is electrically connected to the piezo amplifier. The communication is realized via an Arduino board, triggering the acquisition and the translation of the two-axis stage, which is connected to the Raman PC.
OCT data acquisition. For the three-dimensional remapping of the OCT data acquired with the piezoelectric tube-based scanning probe, a positional calibration was performed prior to the measurements. During this step, the scan pattern was imaged onto a position-sensitive device (PSD, SpotOn Analog SPOTANA-9L, Duma Optronics Ltd), the parameters were optimized for an optimal scan pattern, and a look-up table (LUT) was created. To reconstruct the volumetric OCT stacks with a size of 500 × 500 pixels, an algorithm in combination with the LUT was used to remap the spiral-shaped scan pattern onto a square Cartesian grid. Due to the circular scan pattern, the final 3D-volume had a cylindrical shape. The time for an entire spiral, rising and collapsing, was 2 seconds, which can be reduced to less than a second by changing the resonance frequency of the scanned fiber. However, only the data acquired during the rising spiral were recorded for an OCT stack, so the acquisition rate was 0.5 Hz. The OCT volume scan consisted of 510 consecutive circles (B-scans) with 340 A-scans per B-scan. Because the FOV of one single spiral was not large enough to image an entire biopsy with a size between 1 mm² and 15 mm², subsequent scans were performed in a raster scan pattern. These stacks were stitched together in a post-processing step to obtain an image of the entire biopsy (Fig. 2b). The OCT data were acquired in such a way that the round FOVs were overlapping with the neighboring FOVs. To stitch the individual OCT FOVs, the volume scans were aligned by rotating the 3D stacks and cropping, in order to create rectangular stacks. These were combined in the same order the data were acquired.
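The remapping step can be illustrated with a minimal sketch. It assumes an idealized spiral in place of the calibrated LUT (the function `make_spiral_lut` is a hypothetical stand-in for the PSD-based calibration) and uses simple nearest-cell averaging, which is not necessarily the interpolation used in the study. As a side note, 340 A-scans per circle at 510 circles per second correspond to 173.4 kHz, consistent with the stated sweep frequency of ∼173 kHz.

```python
import math


def make_spiral_lut(n_circles, n_per_circle, max_radius):
    """Hypothetical stand-in for the calibrated LUT: ideal (x, y) per A-scan."""
    total = n_circles * n_per_circle
    lut = []
    for i in range(total):
        radius = max_radius * i / (total - 1)  # radius grows linearly in time
        angle = 2.0 * math.pi * (i % n_per_circle) / n_per_circle
        lut.append((radius * math.cos(angle), radius * math.sin(angle)))
    return lut


def remap_to_grid(samples, lut, grid_size, max_radius):
    """Average all samples that fall into each cell of a square Cartesian grid.

    Cells that are never visited (e.g. the corners outside the circular
    scan area) are left at 0.0.
    """
    sums = [[0.0] * grid_size for _ in range(grid_size)]
    counts = [[0] * grid_size for _ in range(grid_size)]
    scale = (grid_size - 1) / (2.0 * max_radius)
    for value, (x, y) in zip(samples, lut):
        col = int(round((x + max_radius) * scale))
        row = int(round((y + max_radius) * scale))
        sums[row][col] += value
        counts[row][col] += 1
    return [
        [sums[r][c] / counts[r][c] if counts[r][c] else 0.0
         for c in range(grid_size)]
        for r in range(grid_size)
    ]


# Small example: 51 circles with 34 samples each, remapped to an 11 x 11 grid.
example_lut = make_spiral_lut(51, 34, 0.5)
example_grid = remap_to_grid([1.0] * len(example_lut), example_lut, 11, 0.5)
```

In this sketch one value per A-scan position is remapped; for a full volume the same index mapping would be applied to every depth sample of each A-scan.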
Raman system. The acquisition of the Raman spectra was performed by collecting the Raman signal with an in-house developed Raman fiber-optic probe, which was connected to a spectrometer (Acton Series LS785, Princeton Instruments) with a spectral resolution of 5 cm⁻¹. The spectrometer was equipped with a back-illuminated deep-depletion charge-coupled device (BI-DD-CCD) (PIXIS 100BR_eXcelon, Princeton Instruments) with a 1340 × 100 imaging array and 20 × 20 µm sized pixels. The excitation fiber of the Raman probe was coupled to a 785 nm single-mode excitation laser (Fergy-Laser, Princeton Instruments) with an output power of 70 mW at the end of the fiber probe. This power could be safely used in in vivo applications. The excitation light from the optical fiber was coupled out from the fiber probe using a lens and, after passing through a narrow-band clean-up filter and a dichroic longpass filter, focused onto a spot size of 100 µm. The generated Raman signal was collected using the same lens and, after passing through the dichroic longpass filter and an additional longpass filter, focused onto a multimode collection fiber with a diameter of 200 µm, which was coupled to the spectrometer. The biological sample was scanned pixel-by-pixel, resulting in a two-dimensional hyperspectral Raman map (Fig. 2c).
Combined OCT and Raman setup. The measurements of morpho-molecular tissue information of a biopsy were performed sequentially. To achieve the data sampling from the same sample locations, the fiber-optical probes were mounted on an in-house designed holder with a known positional offset between the OCT and the Raman probe, and were mounted above a translational stage (MLS203, Thorlabs). Since the device controlling software for OCT acquisition and Raman acquisition was running on two separate computers, and the translational stage was controlled by the Raman unit, a trigger-based communication was established between the systems, using an Arduino UNO board (Board model UNO/R3). This allowed the sending of triggering events between the two systems to initiate an acquisition for both modalities from the same region of interest (ROI). Both systems incorporated self-built LabVIEW software for acquisition and driving of the imaging systems. Fig. 1 shows the combination of the Raman and SS-OCT setups. The automated imaging procedure can be described as given below.
First, a region of interest on the biopsy was selected by using a bright field image acquired with a standard camera (DCC1645C, Thorlabs) (Fig. 2a). Based on the specific FOV covered by the OCT probe, the software calculated the required number of tiles in x and y for the OCT measurement. The number of points for the Raman acquisition was kept constant at 30 × 30 points. After the start of the data acquisition procedure, the biopsy was moved below the OCT probe, and an OCT stack was acquired. After the acquisition of the first tile was performed, the OCT software sent a trigger through the Arduino board, indicating that the OCT stack was acquired, and the Raman computer executed a command to translate the biopsy to the next location. This was repeated until the entire ROI on the biopsy was sampled. After all OCT stacks were acquired, the biopsy was automatically moved below the Raman probe, followed by a scan of the identical ROI as in OCT. The acquisition time for each imaging modality depended on the biopsy size: for example, imaging a biopsy with a size of ∼4 mm² took 1 minute for the acquisition of nine (3 × 3) OCT stacks and 13 minutes for 900 (30 × 30) Raman-point measurements. The goal of the study was to sample the entire biopsy to allow for a comprehensive characterization of the underlying morpho-molecular changes occurring in malignant tissues. For the measurements in an in vivo setting, the acquisition time for OCT imaging and single point Raman spectroscopy is less than 5 s each.
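How the number of OCT tiles might follow from the ROI and the probe FOV can be sketched as below. The rule used here is an assumption, not the documented behavior of the acquisition software: each circular FOV is taken to contribute its inscribed square, and the stage steps by the side length of that square. The names `tiles_per_axis` and `raster_positions` are hypothetical.

```python
import math


def tiles_per_axis(roi_length_mm, fov_diameter_mm):
    """Tiles needed along one axis, assuming each circular FOV is cropped
    to its inscribed square (side = diameter / sqrt(2))."""
    square_side = fov_diameter_mm / math.sqrt(2.0)
    return max(1, math.ceil(roi_length_mm / square_side))


def raster_positions(roi_x_mm, roi_y_mm, fov_diameter_mm):
    """Tile centre positions in raster order (row by row), relative to the
    ROI corner."""
    step = fov_diameter_mm / math.sqrt(2.0)
    nx = tiles_per_axis(roi_x_mm, fov_diameter_mm)
    ny = tiles_per_axis(roi_y_mm, fov_diameter_mm)
    return [((ix + 0.5) * step, (iy + 0.5) * step)
            for iy in range(ny) for ix in range(nx)]


# A 2 mm x 2 mm ROI (about 4 mm^2) with a 1 mm FOV diameter.
positions = raster_positions(2.0, 2.0, 1.0)
n_tiles = len(positions)
```

Under these assumptions a 2 mm × 2 mm ROI with a 1 mm FOV yields a 3 × 3 raster, which matches the nine stacks quoted for a ∼4 mm² biopsy. The quoted 13 minutes for 900 Raman points corresponds to roughly 0.87 s per point including stage movement.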

Data analysis
OCT texture analysis. Texture analysis has been shown to be a suitable approach to perform an automated image classification for OCT images 63 and has been successfully used to differentiate benign and malignant biological tissues. 64 Consequently, this method was chosen to analyze the OCT images. 54,55 The texture analysis approach described by Bovenkamp et al. was used in the present study. 55 In contrast to the reported procedure, the single OCT stacks were divided into 25 × 25 equally sized fragments and only the middle 5 × 5 subframes were used for the texture analysis. This excluded the border area of the single scanned FOVs, which contains only zero pixels due to the circular shape of the scanning pattern (Fig. 2b). Furthermore, the outermost circles might be undersampled at a large FOV or distorted due to slight changes in the resonance frequency of the scanner during operation. Because OCT measurements were performed on biopsies placed on the CaF2 glass slide, some of the OCT stacks contained high glass reflections. Therefore, only a subset of OCT stacks per biopsy was chosen. The stacks containing glass reflections exhibit texture feature artifacts not found in stacks that contain information about the biopsy only. These features were used to exclude the glass stacks, which are not relevant for classifying malignant and benign tissues. As a result, the classification was built on 80 features per observation and 29 006 observations in total. The classification was carried out using a fine Gaussian support vector machine (SVM) with a 20% holdout validation.
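The selection of the central subframes can be sketched as follows. This is an illustration only: it assumes the 25 × 25 division is applied to the two lateral dimensions of a square frame, and the function name `central_fragments` is hypothetical. The texture feature extraction and the SVM are not shown.

```python
def central_fragments(image, n_fragments=25, n_central=5):
    """Split a square 2D image into n_fragments x n_fragments tiles and
    return the central n_central x n_central tiles in row-major order."""
    size = len(image)
    frag = size // n_fragments               # tile edge length in pixels
    first = (n_fragments - n_central) // 2   # index of first central tile
    tiles = []
    for fr in range(first, first + n_central):
        for fc in range(first, first + n_central):
            tile = [row[fc * frag:(fc + 1) * frag]
                    for row in image[fr * frag:(fr + 1) * frag]]
            tiles.append(tile)
    return tiles


# A 500 x 500 frame gives 20 x 20 pixel tiles; the central 5 x 5 tiles
# cover pixels 200..299 in both directions.
frame = [[r * 500 + c for c in range(500)] for r in range(500)]
tiles = central_fragments(frame)
```

Restricting the analysis to this central block discards the zero-valued corners and the outermost circles of the spiral, in line with the motivation given above.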
Spectral analysis of Raman data: pre-processing. Before any analysis of the Raman data was performed, all spectra were identically pre-processed. The spectra were first corrected for cosmic spikes, using an algorithm developed by Ryabchykov et al. 65 The wavenumber calibration was performed based on a reference spectrum of 4-acetamidophenol. Subsequently, the measured spectra were de-noised by using the first 8 PCA components. The spectra were corrected for dark current followed by an intensity calibration, using the spectra of a National Institute of Standards and Technology (NIST)-standardized white light source (Kaiser HCA accessory). The measured intensity lamp spectra were fitted to the reference spectra of the lamp and a transfer function was estimated to calibrate the intensity of the measured spectra. The baseline correction was performed using asymmetric least squares (AsLS) and extended multiplicative signal correction (EMSC). 66 As the last two steps of the pre-treatment workflow, all Raman spectra were further treated using the Savitzky-Golay filter and area normalized within the regions from 600 to 1750 cm⁻¹ and 2800 to 3000 cm⁻¹. These regions were afterwards used to construct the classification models, using partial least squares linear discriminant analysis (PLS-LDA) to classify between tumor and non-tumor. All computations were performed using the statistical programming language R and Python.
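The final normalization step can be illustrated with a short sketch. It assumes ascending wavenumbers and normalizes by the combined trapezoidal area of both regions; whether the two regions were normalized jointly or separately in the study is not stated, so this choice is an assumption. The function name `area_normalize` is hypothetical.

```python
def area_normalize(wavenumbers, intensities,
                   regions=((600.0, 1750.0), (2800.0, 3000.0))):
    """Keep only the points inside the given regions and scale them so that
    the summed trapezoidal area over all regions equals 1.

    Assumes wavenumbers are sorted in ascending order.
    """
    kept_w, kept_i = [], []
    area = 0.0
    for low, high in regions:
        idx = [k for k, w in enumerate(wavenumbers) if low <= w <= high]
        for a, b in zip(idx, idx[1:]):
            width = wavenumbers[b] - wavenumbers[a]
            area += 0.5 * (intensities[a] + intensities[b]) * width
        kept_w.extend(wavenumbers[k] for k in idx)
        kept_i.extend(intensities[k] for k in idx)
    if area <= 0.0:
        raise ValueError("area under the selected regions must be positive")
    return kept_w, [v / area for v in kept_i]


# Flat test spectrum sampled every 50 cm^-1 between 500 and 3100 cm^-1.
axis = [float(w) for w in range(500, 3101, 50)]
flat = [2.0] * len(axis)
norm_axis, norm_flat = area_normalize(axis, flat)
```

Area normalization removes overall intensity differences between spectra, so that the subsequent classification model responds to relative band intensities rather than to signal level.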

Classification model and cross-validation of Raman data
The classification was performed using a PLS-LDA algorithm, which combines partial least squares and linear discriminant analysis, using a 5-fold cross-validation. A hierarchical approach to create and validate each model was adopted, where the first layer model system ML1 focusses on classifying between tumor and non-tumor, and the second layer model system ML2 uses the tumor predictions to differentiate low- and high-grade tumors (Fig. 3).
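The hierarchical decision logic of this section can be sketched as below. The two classifiers are passed in as plain callables and the threshold rules in the example are hypothetical placeholders, not PLS-LDA models; the sketch only shows how a first-layer tumor decision per spectrum feeds a second-layer grade decision on the mean spectrum of the tumor-predicted spectra.

```python
def two_layer_predict(spectra, ml1, ml2):
    """Hierarchical prediction for one biopsy.

    ml1(spectrum) -> True if the spectrum is predicted as tumor.
    ml2(mean_spectrum) -> grade label for the tumor-predicted part.
    Returns "non-tumor" if no spectrum is predicted as tumor.
    """
    tumor_spectra = [s for s in spectra if ml1(s)]
    if not tumor_spectra:
        return "non-tumor"
    n = len(tumor_spectra)
    n_channels = len(tumor_spectra[0])
    mean_spectrum = [sum(s[k] for s in tumor_spectra) / n
                     for k in range(n_channels)]
    return ml2(mean_spectrum)


# Hypothetical placeholder classifiers (simple threshold rules).
def toy_ml1(spectrum):
    return spectrum[0] > 0.5


def toy_ml2(mean_spectrum):
    return "high-grade" if mean_spectrum[1] > 1.0 else "low-grade"


biopsy = [[0.9, 2.0], [0.1, 5.0], [0.8, 1.0]]
label = two_layer_predict(biopsy, toy_ml1, toy_ml2)
```

A consequence of this design is that errors of the first layer propagate: spectra wrongly assigned to the tumor class contribute to the mean spectrum that the second layer grades.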
The created models were validated by applying k-fold partitioning of the data, where each layer has a specific partition according to the existing data for each classification variable. For the first classification layer (ML1), 10 iterations for each partitioning were performed. The generated ML1-models were validated by the test data. In order to test all spectra of each biopsy, 10% of the created models were selected randomly. For each spectral point, a mean of the predictions was calculated, and this prediction map per pixel is displayed in the flowchart (Fig. 3). The mean prediction maps were used to select the spectra predicted as tumor and non-tumor. The second layer classification model system (ML2) used the tumor-defined areas of the biopsies. A mean spectrum per biopsy was calculated for those areas. The resulting data were k-fold partitioned and a set of training and testing biopsies was selected. The ML2 was created with the training data set that results from 16-fold partitions and was validated with the testing biopsies; 10 iterations of 16 different partitions were employed to create and validate ML2 (Fig. 3). The described method has previously been reported in Cordero et al. 67

Multivariate curve resolution alternating least squares (MCR-ALS) for collagen distribution of Raman data
Initially, the pure components for each biopsy were determined by using an orthogonal projection approach (OPA) algorithm. This function extracts the initial 'pure' components of the set of spectra based on spectral dissimilarity of the data set. It is important to note that the pure component spectra can contain contributions from other substances and can deviate to some degree from the pure component spectra of the native substance. The OPA estimates the dispersion matrix of the mean spectrum of the data set: the higher the dissimilarity, the purer the component. The estimated pure components of tumor and non-tumor spectra were correlated to the literature. 41 It was found that the non-tumor biopsies show most of the relevant bands of the collagen 1 spectrum, which can be linked to the dominant presence of collagen in non-tumor spectra. All the non-tumor biopsies were used to find the collagen pure component, which was calculated by employing the OPA function, which provides the first estimation of pure components of the dataset. Afterwards, the mean standard deviation of the OPA pure collagen obtained from each biopsy was used to further calculate the MCR concentrations, where ALS complements the MCR function by fitting the concentrations to improve the estimates of the pure components. The MCR-ALS algorithm used the extracted mean collagen pure component and consequently calculated the relative concentration at each measured location of each biopsy. The concentration matrix indicates how strongly collagen is present in the biopsy and was calculated for each biopsy.
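The concentration estimate can be illustrated in a strongly simplified form. The sketch below is not a full MCR-ALS implementation: it assumes a single, fixed pure component and performs only the least-squares concentration step, with negative values clipped to zero as a stand-in for a non-negativity constraint. The function name `component_concentrations` is hypothetical.

```python
def component_concentrations(spectra, component):
    """Least-squares concentration of one fixed pure component per spectrum.

    For each spectrum d and component s, c = (d . s) / (s . s), clipped at
    zero. This is a single-component simplification of the concentration
    step in alternating least squares.
    """
    norm = sum(v * v for v in component)
    if norm == 0.0:
        raise ValueError("pure component must not be all zeros")
    concentrations = []
    for spectrum in spectra:
        projection = sum(d * s for d, s in zip(spectrum, component)) / norm
        concentrations.append(max(0.0, projection))
    return concentrations


# Three toy spectra against a toy 'collagen' profile.
pure = [1.0, 2.0, 0.0]
measured = [[2.0, 4.0, 0.0], [0.0, 0.0, 3.0], [-1.0, -2.0, 0.0]]
conc = component_concentrations(measured, pure)
```

Arranging the per-spectrum values on the 30 × 30 measurement grid would give a relative collagen map of the kind compared with the prediction maps in Fig. 6.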

Correlation of OCT to histopathological images
To visually compare the morphological information obtained from OCT with the histopathological information, OCT and the corresponding histopathology for two different biopsies are shown in Fig. 4. In the non-tumor case (Fig. 4a), the layered appearance of the bladder wall is visible by OCT. The thin dark top layer in the OCT image is correlated to the mucosa, delimiting the bladder from the inner lumen. The second bright layer corresponds to the connective tissue, also referred to as lamina propria. Besides nerves and blood vessels, the bright appearance in the OCT images indicates a strong scattering tissue constituent, such as collagen fibers. The deepest visible layer in the OCT image is the muscularis propria, the muscle layer. It has a clear demarcation from the lamina propria and appears rather dark. In Fig. 4b, the corresponding OCT and histopathological image of a pTa lesion is shown. Here, the thickened urothelium is clearly visible. Since the pTa staged tumor has not infiltrated the lamina propria, the demarcation between the urothelium and lamina propria is still intact. These correlations show the ability of the presented endoscopic OCT probe to identify early stage lesions due to morphological changes in the mucosa of the bladder wall. The OCT probe design used is thereby suitable to detect NMIBC. Even though some biopsies can be distinctly correlated, in general, the correlation between OCT and histopathological images is difficult and was not feasible for all biopsies. For example, the biopsy-collection procedure relies on forceps, which mechanically stress the biopsy. Additionally, the biopsy can get twisted during transportation from the operation theatre to the OCT-RS setup. Furthermore, the fixation process for histopathological slicing includes formalin, which can cause the biopsy to shrink, inducing discrepancies in dimensions. For example, in renal and intestinal biopsies, shrinkage between 11% and 33% was reported, respectively. 68,69
For quantitative classification of the OCT data, texture analysis was carried out.

Texture analysis of the OCT data
The classification was performed on 116 non-tumor and tumor biopsies. The stacks were labeled with histopathological results obtained for the biopsies and the results of the texture analysis are summarized in Table 3. Sensitivity indicates the correct detection of tumor in accordance with the histopathological […] 54,55 The deviation is primarily due to the use of a fiber-probe-based scanning approach, where compromises with respect to the optical performance are made. On the other hand, there is clear evidence that the optical performance of the probe is sufficient for NMIBC. Furthermore, the remaining glass reflection artifacts from the substrate can additionally reduce the performance. This, however, should be of no concern for in vivo measurements. The performance of the texture analysis was further compromised because the histopathological classification provides only a single label for the entire biopsy, corresponding to the highest pathologic severity observed. The OCT images, on the other hand, are spatially resolved and contain a variety of regions. For instance, a pTa staged biopsy may contain areas of healthy bladder wall, which do not influence the histopathological outcome. In contrast, the classification is sensitive to these transitions and the heterogeneity of the lesion within one particular biopsy. The sensitivity for pTa, for instance, increases to 90% if two blinded experts, familiar with OCT images of the bladder wall, classify the OCT images. If the decision was inconclusive between the two experts, the pathologically more severe statement was taken to label the biopsy. The heterogeneity of the biopsy can be a significant factor that reduces the accuracy of the OCT texture analysis (see Fig. 7).
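For reference, the performance figures discussed here follow the standard definitions, which can be written as a short sketch. The labels and the toy data below are illustrative only and do not reproduce the values of Table 3; the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(true_labels, predicted_labels, positive="tumor"):
    """Sensitivity, specificity and accuracy from paired label lists.

    Sensitivity = TP / (TP + FN), specificity = TN / (TN + FP),
    accuracy = (TP + TN) / total. A metric is None if undefined.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "accuracy": (tp + tn) / total if total else None,
    }


truth = ["tumor"] * 4 + ["non-tumor"] * 4
pred = ["tumor", "tumor", "tumor", "non-tumor",
        "non-tumor", "non-tumor", "tumor", "tumor"]
metrics = diagnostic_metrics(truth, pred)
```

Because the histopathological label applies to the whole biopsy while the OCT observations are spatially resolved, healthy regions inside a tumor-labeled biopsy count against these metrics even when they are classified correctly at the local level.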

Raman analysis
For Raman spectroscopy, a two-layer model system was created and validated to distinguish tumor from non-tumor tissues, followed by grading of tumor regions. The first model level differentiates tumor and non-tumor with an accuracy of 92%. As described in the first section, the two-layer model was created with a different set of spectra, summarized in the flowchart shown in Fig. 3. For the second modeling step ML2, which differentiates low-grade from high-grade, an accuracy of 77% was achieved. Fig. 5a shows the mean spectrum of all non-tumor (black) and low- (blue) and high-grade (red) tumor biopsies, in which the lipid and collagen bands are highlighted. The mean coefficient of the model system layer 1 (ML1) is shown in Fig. 5b. The same bands are also highlighted, illustrating the relation between negative coefficients and the spectral bands for collagen at 856 cm⁻¹, 937 cm⁻¹ and 1265 cm⁻¹, which indicate the C-C vibrations and amide III of collagen, 70 respectively. The collagen bands and the negative LDA coefficients show a clear relationship between the constituent and the non-tumor class. The marked lipid bands at 1300 cm⁻¹, 1656 cm⁻¹ and 2854 cm⁻¹ correspond to the CH2 deformation (twist vibration), C=C and the symmetric stretching of lipids, respectively. 70,71 As can be seen from the comparison, there is a relation between the main lipid bands, positive LDA coefficients, and the tumor class. These observations are consistent with previous findings, 41,72 where non-tumor tissues are mainly characterized by a dominant presence of collagen. The two-layer model (ML1 and ML2) performance is summarized in Table 3: ML2 can identify the true positive LG more easily than the true negative HG, with an achieved sensitivity of 81% to differentiate low-grade tumor from high-grade tumor.
The achieved specificity of 68% to differentiate the grading indicates that the models need more spectral information of the true negative HG in order to better distinguish the main differences between tumor grading. As Fig. 5a shows, low and high grade mean spectra have very little variation and more HG-biopsies are required to allow the models to classify properly between the two classes.
MCR analysis and collagen distribution. The classification models show a distinct relation between the collagen presence in non-tumor and tumor samples; therefore the collagen distribution can be related to the mean predictions of model ML1. As previously described, each biopsy has a set of spectra that has a mean prediction obtained from model ML1, which allows providing information on the heterogeneity or homogeneity of a biopsy. An MCR algorithm is applied to use the extracted pure components of collagen for the non-tumor biopsies to find the constituent distribution in the biopsy by estimating the concentration of the component for each spectrum of the biopsy. Fig. 6a shows the mean and standard deviation spectra of the homogeneous non-tumor (black) and tumor (red) biopsy and the extracted MCR collagen component (green) from all non-tumor biopsies, as described in the previous sections. Fig. 6b displays the mapping of the mean prediction (left) and the collagen distribution (right), showing how the presence of collagen dominates in the biopsy that is predicted as a homogeneous non-tumor tissue. This is consistent with Fig. 6c, where the mapping of a heterogeneous non-tumor biopsy shows that the areas predicted as tumor (red areas) have less presence of collagen (darker green) in comparison with the areas predicted as non-tumor (black areas). In the same way, Fig. 6d and e show the mapping of mean prediction and collagen concentrations in homogeneous and heterogeneous tumor biopsies, respectively.
The homogeneous tumor biopsy differs distinctly from the homogeneous non-tumor biopsy: its dark-colored area indicates a low presence of collagen. Similarly, for the heterogeneous tumor tissue in Fig. 6e, the areas predicted as non-tumor are brighter than the areas predicted as tumor. This contrast is readily seen by comparing the prediction and collagen distribution maps in Fig. 6: the black area of the prediction map (left) is non-tumor and appears lighter in the collagen distribution map (right). The correlation between the prediction maps and the collagen distribution maps provides insight into the relation between one of the main bladder tissue constituents and the tumor regions of the analyzed tissue.
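The per-biopsy maps discussed above rest on two per-spectrum quantities: the ML1 prediction and an estimated collagen concentration. The following minimal sketch illustrates both under simplifying assumptions. The function names, the 0/1 label encoding and the homogeneity tolerance `tol` are hypothetical, and the concentration is obtained from a single-component non-negative least-squares fit, which is a simplified stand-in for the MCR step and not the algorithm used in this study.

```python
def tumor_fraction(predictions):
    """Mean ML1 prediction of one biopsy (assumed labels: 0 = non-tumor, 1 = tumor)."""
    return sum(predictions) / len(predictions)


def is_homogeneous(predictions, tol=0.1):
    """Treat a biopsy as homogeneous if nearly all spectra share one label.

    The tolerance `tol` is an assumed cut-off, not a value from the study.
    """
    fraction = tumor_fraction(predictions)
    return fraction <= tol or fraction >= 1.0 - tol


def collagen_concentration(spectrum, component):
    """Non-negative weight c that minimizes ||spectrum - c * component||^2."""
    numerator = sum(s * k for s, k in zip(spectrum, component))
    denominator = sum(k * k for k in component)
    return max(0.0, numerator / denominator)


def collagen_map(spectra, component):
    """Estimated collagen concentration for every spectrum of a biopsy."""
    return [collagen_concentration(s, component) for s in spectra]


# Toy values for illustration only (not measured spectra).
component = [0.0, 1.0, 2.0, 1.0]
spectra = [[0.0, 2.0, 4.0, 2.0], [1.0, 0.0, 0.0, 0.0]]
print(collagen_map(spectra, component))   # [2.0, 0.0]
print(tumor_fraction([0, 0, 1, 0]))       # 0.25
```

With several pure components, the single projection would have to be replaced by a full non-negative least-squares solve over all components.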

OCT-Raman combination
The combination of RS and OCT provides a new means of understanding the underlying basis of the data. Even though the two modalities have different physical origins, i.e. OCT depends on light scattering due to changes of the refractive index in the tissue, whereas RS relies on the vibrations of molecular bonds in the sample, both origins are inherently coupled. The imaging data of the two complementary modalities were acquired in a co-registered manner, making it possible to correlate molecular and morphological features from the same locations.
The correlation was performed by first stitching the OCT stacks and applying an in-house developed algorithm to compute the surface curvature of the biopsy, which allowed the surface to be flattened. After the curvature correction, the mean of the biopsy in the z-dimension was calculated, and the resulting image was used to compute a mask separating locations belonging to the biopsy from locations outside the biopsy. The RS data were pre-treated as described in the Materials and methods section, and the mean prediction Raman map was employed to correlate the OCT and RS data. The RS mean prediction map was interpolated to account for the size difference between the Raman map and the OCT image. Furthermore, because the OCT scan covered a larger area than the RS map, the RS map was used as a mask for the OCT image. Transparency was applied to the RS map so that both the OCT image and the Raman map remain visible. Fig. 7 shows the transition from tumor to non-tumor area using the combined information of both modalities. To better visualize the data, images indicating a healthy bladder wall structure within a cancerous lesion are displayed for both modalities. An OCT cross-sectional and en-face image of a biopsy that was histopathologically diagnosed as a pTa low-grade tumor is shown in Fig. 7a. The overlaid Raman information of the tumor margin, where the red shade is the predicted tumor fraction and the black shade is the predicted non-tumor fraction as established by the ML1 model, is shown in Fig. 7b. The bright features of the lamina propria that appear in the OCT image indicate a pronounced transition between the urothelium and the lamina propria in the healthy bladder wall and correlate well with the regions that were predicted as non-tumor tissue by RS. To better visualize the information, cross-sectional images of the regions marked by the colored lines are framed in green and yellow.
The cross-sectional images are maximum projections of 10 scans around the indicated position. The combination of both modalities enables a better understanding of the underlying signal origin and allows further localized pathological analysis of the biopsies, with localized diagnoses providing more detailed histopathological labels. This can increase the accuracy of computer-assisted classification of tumor and non-tumor areas and strongly indicates that the combination of optical label-free modalities can provide comprehensive, localized diagnostic value.
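The co-registration steps described above (mean along z, biopsy mask, interpolation of the Raman map, semi-transparent overlay and maximum projection) can be sketched as follows. This is a minimal illustration and not the in-house algorithm: nearest-neighbour interpolation, the intensity threshold, the transparency value `alpha` and all function names are assumptions, and stitching and surface flattening are omitted.

```python
def en_face_mean(volume):
    """Average an OCT volume indexed [z][y][x] along z to an en-face image [y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    return [[sum(volume[z][y][x] for z in range(nz)) / nz for x in range(nx)]
            for y in range(ny)]


def biopsy_mask(image, threshold):
    """True where the mean intensity exceeds an assumed threshold (inside the biopsy)."""
    return [[pixel > threshold for pixel in row] for row in image]


def upsample_nearest(raman_map, ny, nx):
    """Resize a coarse Raman prediction map to the OCT grid (nearest neighbour)."""
    my, mx = len(raman_map), len(raman_map[0])
    return [[raman_map[y * my // ny][x * mx // nx] for x in range(nx)]
            for y in range(ny)]


def overlay(oct_image, raman_up, mask, alpha=0.5):
    """Alpha-blend the Raman map onto the OCT image inside the mask only."""
    return [[(1 - alpha) * o + alpha * r if m else o
             for o, r, m in zip(o_row, r_row, m_row)]
            for o_row, r_row, m_row in zip(oct_image, raman_up, mask)]


def max_projection(bscans, center, width=10):
    """Pixel-wise maximum over `width` neighbouring B-scans [z][x] around `center`."""
    start = max(0, center - width // 2)
    stop = min(len(bscans), start + width)
    window = bscans[start:stop]
    return [[max(b[z][x] for b in window) for x in range(len(window[0][0]))]
            for z in range(len(window[0]))]


# Toy 2 x 2 x 2 volume and a single-pixel Raman map (illustrative values only).
volume = [[[1, 3], [5, 7]], [[3, 5], [7, 9]]]
image = en_face_mean(volume)                    # [[2.0, 4.0], [6.0, 8.0]]
mask = biopsy_mask(image, threshold=3)
raman_up = upsample_nearest([[1.0]], 2, 2)
print(overlay(image, raman_up, mask))           # [[2.0, 2.5], [3.5, 4.5]]
```

Nearest-neighbour resampling keeps the discrete prediction values of the Raman map unchanged; a smoother interpolation would blur the class boundaries.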
Although factors related to biopsy handling and pathological examination, such as biopsy torsion or discrete biopsy labels, influence the performance, the achieved accuracies of 73% and 77% for OCT and RS, respectively, are promising for future in vivo tests, in which stage and grade could be determined simultaneously. There, artefacts from sample handling will become negligible, additionally improving the accuracy.
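The sensitivity, specificity and accuracy values quoted for the classification models follow from the entries of a binary confusion matrix. A minimal sketch is given below. The function name and the counts in the usage example are purely illustrative and are not data from this study; for the grading model, low-grade is taken as the positive class, as in the text.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # true positives among all positives
    specificity = tn / (tn + fp)                  # true negatives among all negatives
    accuracy = (tp + tn) / (tp + fn + tn + fp)    # correct calls among all samples
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only.
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=6, fp=4)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, accuracy={acc:.2f}")
# sensitivity=0.80, specificity=0.60, accuracy=0.70
```

Because accuracy weights the two classes by their sample counts, a model trained on few negative samples can reach a high accuracy while its specificity stays low.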

Conclusions
In summary, we have demonstrated that fiber-probe based OCT and RS are suitable to provide clinically relevant information for the detection and grading of NMIBC biopsies. Here, a forward-looking, piezoelectric tube-based OCT probe is used for a comprehensive characterization of bladder cancer lesions for the first time, providing volumetric morphological information of entire biopsies. The presented OCT probe provides sufficiently high optical performance to resolve small morphological structures in depth. Raman spectroscopy, on the other hand, demonstrates clear spectral differentiation of tumor and non-tumor, and of low- and high-grade lesions in the bladder tissue, based on the biomolecular composition. By developing an imaging platform that combines both modalities using forward-viewing fiber-optical probes, it was possible to acquire morphological volumetric images of biopsies and to create co-localized and co-registered hyperspectral molecular maps of the sample, providing comprehensive diagnostic information on the penetration and grade of bladder cancer at an early stage. By performing the study using fiber-optic probes and a large number of samples, it was possible to evaluate the relevant parameters of the probes for the in vivo application of OCT and RS. For example, it was possible to show that the piezoelectric tube-based OCT probe is mechanically stable enough to conduct reproducible measurements of more than 100 biopsies with more than 1000 individual stack acquisitions over an extended period of time. Probe parameters, such as the outer dimension, meet the restrictions given by surgical instruments for in vivo applications, including additional sheathing for biocompatibility and safety. The parameter evaluation reveals adequate power levels and performance for visualization of the clinically relevant data for both modalities. The presented axial and lateral resolution for OCT and a sensitivity of 99 dB are sufficient.
The FOV should be as large as possible, but a diameter of 1 mm is enough to identify and characterize relevant lesions. The excitation powers of 70 mW and 11 mW used for RS and OCT, respectively, are well within the limit for maximum permissible exposure on skin and suitable for in vivo applications. The required acquisition times of 0.5 s and 2 s for RS and OCT, respectively, are suitable for intraoperative handling by urologists during bladder examination. Comparing OCT and RS, one can see that OCT can acquire information from a larger FOV faster, allowing for the detection of cancerous lesions. RS, on the other hand, provides higher sensitivity, specificity and accuracy for differentiating tumor from non-tumor tissues, and additionally allows the grading of tumors. As such, OCT can be used as a red-flag technology and RS can be used to provide diagnostic information. Moreover, the results constitute a substantial step towards in vivo testing of the OCT-RS combination. The presented findings pave the way for the development of multimodal endoscopic probes that enable OCT and RS to supply clinicians with clinically important, localized information in real time, which until now has only been accessible after histopathological examination.

Conflicts of interest
There are no conflicts to declare.