Satarupa Banerjee*a,
Swarnadip Chatterjeeb,
Anji Anuraa,
Jitamanyu Chakrabartyc,
Mousumi Pald,
Bhaskar Ghoshe,
Ranjan Rashmi Pauld,
Debdoot Sheetf and
Jyotirmoy Chatterjeea
aSchool of Medical Science and Technology, Indian Institute of Technology, Kharagpur, India. E-mail: satarupa@smst.iitkgp.ernet.in
bAdvanced Technology Development Centre, Indian Institute of Technology, Kharagpur, India
cDepartment of Chemistry, National Institute of Technology Durgapur, India
dDepartment of Oral and Maxillofacial Pathology, Guru Nanak Institute of Dental Science and Research, Kolkata, India
eDepartment of ENT & Head Neck Surgery, Medical College, Kolkata, India
fDepartment of Electrical Engineering, Indian Institute of Technology, Kharagpur, India
First published on 11th January 2016
The biopsy-based diagnosis of oral precancers like leukoplakia (OLK) and submucous fibrosis (OSF), as well as squamous cell carcinoma (OSCC), suffers from observer-specific variability. The present work explores the utility of intensity and textural features from optical coherence tomography (OCT) images, after specific feature subset selection, for precise classification of oral lesions using variants of the support vector machine. Fourier transform infrared (FTIR) spectroscopy, for endorsing global biochemical signatures, and histochemistry were applied concomitantly to add value to the OCT findings. Immunohistochemical findings characterizing specific local molecular alterations were also included. Results suggested that OCT features could differentiate the lesions with high sensitivity and specificity. The FTIR results showed glycogen-, keratin- and carbohydrate-related alterations in OSCC, a decrease in collagen-specific amino acids and skeletal muscle-related proteins in OSF, and distinct variation in tissue hydration status across the diseases. There was also an increase in keratin layer thickness in OLK due to overexpression of cytokeratin 10 in the superficial layer, while in OSF skeletal muscle was found to be replaced with dense collagen I. These disease-specific alterations were assumed to be the underlying phenomena associated with intensity and textural variations in OCT images, on the basis of which specific quantitative imaging biomarkers were proposed.
OCT, a non-invasive imaging technique, provides real-time, high-resolution, micro-architectural sub-surface images of up to 2 mm tissue depth.2 Previous studies correlated healing progression and maturation of epithelial and sub-epithelial components considering OCT image attributes and histopathological features.3 ‘Lucidity’ is the optical intensity descriptor used for interpreting OCT images. It tends to vary in different regions of layered body structures such as oral mucosa and skin wounds. Since the operating principle of OCT imaging is governed by backscattering of light and exploits a ‘biological window’ with minimal absorption, changes in tissue refractive index modulate intensity characteristics.4 Such scattering also depends upon tissue structural components, surface roughness,5 hydration and maturation status, nuclei size, presence of collagen fibres, keratin content,6 tissue type7 and membrane lipid density of cells.8 In skin,9 cervix10 and oral mucosa,11 transition zones and architectural changes during disease progression can be identified by OCT. Such demarcations are possible due to the differential thickness and composition of epithelial or sub-epithelial layers.12,13 In this context, Ughi et al. utilized intravascular coronary OCT for differentiating normal and abnormal pathologic conditions by textural image analysis.14 A recent study also implemented automated classification of oral malignancy in a hamster buccal pouch model using OCT textural features.15 However, there is further scope to enhance the diagnostic efficiency of such OCT images for oral mucosal lesions by corroborating them with the chemical and molecular signatures of tissues documented by FTIR, histochemistry (HC) and immunohistochemistry (IHC).
The present study therefore aimed primarily to classify oral lesions on the basis of intensity and textural features of OCT images, besides providing tissue architectural information, and also to combine information obtained from other modalities like FTIR and HC/IHC towards better characterization of oral lesions. FTIR and HC/IHC are considered for global assessment of biochemical variation and for local compositional/gene expression changes, respectively.
FTIR is a widely used, low cost tool for chemical portrayal of materials, yet an underexplored diagnostic modality for spectral characterization of biopsied tissues.6 In this perspective, FTIR in transmission mode was used for functional group analysis and disease-specific chemical characterization of oral lesions at the global level. HC and IHC findings additionally provided information on local, specific compositional alterations. Periodic acid-Schiff (PAS) staining depicted information on polysaccharides as well as keratins, and Van Gieson's (VG) staining illustrated differential staining of collagen and other connective tissue components.16 The IHC study of collagen I (COL-I) and cytokeratin 10 (CK 10) expression endorsed the vital compositional17 and maturational18 information, respectively, and corroborated the tissue architecture.
After analyzing the same tissues under different modalities, viz. OCT, FTIR, HC and IHC, two propositions were considered. Firstly, oral lesions can be segregated on the basis of a specific subset of intensity and textural features extracted from OCT images, which could then be proposed for optimum disease segregation. Quantitative imaging biomarkers (QIBs) have recently been defined as “an imaged characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes or a response to a therapeutic intervention”.17 The concept of QIBs further led to the assumption that, if biochemical characterization of the same tissue sections can be performed, then the selected OCT features could be rechristened as QIBs. A support vector machine (SVM) was therefore used here for disease classification, since it can classify diseases with high predictive accuracy, medium fitting speed, and good prediction speed and memory usage, as shown in previous studies.18,19 Quadratic and cubic kernels were also used to tune the efficiency of the learners, since they are commonly used non-linear kernels besides the linear one.20 QIBs were thus proposed after feature reduction using the minimum Redundancy Maximum Relevance (mRMR) algorithm21 for feature subset identification, followed by the classification task and biochemical characterization of the tissues using the specified modalities.
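The greedy idea behind mRMR can be illustrated with a minimal sketch. It assumes discretized feature values and the difference form of the criterion (relevance to the class minus mean redundancy with already selected features); the function names, the toy feature names and the toy data are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )

def mrmr(features, labels, k):
    """Greedy mRMR, difference form (an assumption; other variants exist).

    features: dict mapping a feature name to its list of discretized values.
    Returns up to k feature names in the order they were selected.
    """
    relevance = {name: mutual_information(vals, labels)
                 for name, vals in features.items()}
    selected, remaining = [], set(features)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(features[name], features[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy data: f_a_copy duplicates f_a, so it is fully redundant.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "f_a": [0, 0, 0, 1, 1, 1, 1, 1],
    "f_a_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "f_c": [0, 1, 0, 0, 1, 1, 0, 1],
}
print(mrmr(toy_features, toy_labels, 2))  # prints ['f_a', 'f_c']
```

In this toy case the duplicated feature is skipped in favour of a less relevant but less redundant one, which is the behaviour that motivates using mRMR before classification.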
Secondly, it was assumed that disease-specific differences in the textural features were due to disease-specific changes in biochemical components at the tissue level. The novelty of the present work lies not only in providing structural information but also in treating OCT as a measurement modality whose features cannot be interpreted directly by a human observer. This may overcome the need for expert-based disease diagnosis. It was also assumed that a multimodal approach may provide complementary information, where disease-specific differences in the intensity and textural features of OCT can be logically correlated with characteristic molecular pathology attributes. Underlying chemical alterations were therefore also sought to validate the notion that differences in the global chemical signatures of different disease conditions may be associated with changes in the intensity and textural features. In previous studies, amalgamation of the morphological information of OCT with biochemical information for disease diagnosis resulted in increased specificity and sensitivity,22 whereas this study was performed in a fragmented manner to highlight the utility of each modality.
Two pre-cancers, viz. oral leukoplakia (OLK) and oral submucous fibrosis (OSF), besides oral squamous cell carcinoma (OSCC), were chosen in this study. OLK presents as white plaques of questionable risk, having excluded other known diseases or disorders that carry no increased risk for cancer,23 whereas OSF, defined as a chronic, premalignant condition, is characterized by progressive sub-epithelial fibrosis.24 These two lesions were selected because of their high malignant potential and because, despite differences in their origin, both culminate in OSCC.19 OSCC is defined as “a malignant epithelial neoplasm exhibiting squamous differentiation as characterized by the formation of keratin and/or the presence of intercellular bridges”.25 It may also be emphasized that consumption of tobacco (smoked or smokeless) and areca nut are the major risk factors associated with OLK and OSF, respectively.23
Finally, on the basis of textural and intensity attribute selection in OCT images, molecular characterization of OLK, OSF and OSCC tissues, and logical integration of the results, QIBs could be proposed, which is the main aim of this study. Multimodal diagnostic evaluation of oral lesions thus not only addressed diagnostic ambiguity, but also emphasized the role of value addition in translational research towards better disease characterization.
After OCT imaging, the fixed tissues were paraffin embedded. Sections of 4 μm thickness were mounted on six albumin coated glass slides and two poly-L-lysine coated glass slides. All the sections were de-paraffinized using xylene. Albumin coated slides were used for FTIR data acquisition, H&E (haematoxylin and eosin) staining based histology, and HC, while the poly-L-lysine coated slides were used for IHC staining.
Feature type | Features considered | Reference |
---|---|---|
Intensity | 1. Mean gray | 26 |
Intensity | 2. Median gray | 26 |
Intensity | 3. Standard deviation gray | 26 |
Intensity | 4. Entropy gray | 26 |
Intensity | 5. Coefficient of variance gray | 26 |
Intensity | 6. Skewness gray | 26 |
Intensity | 7. Kurtosis gray | 26 |
Intensity | 8. Variance gray | 26 |
Texture | 9. Contrast of gray level co-occurrence matrix (GLCM) | 14 and 26 |
Texture | 10. Correlation of GLCM | 14 and 26 |
Texture | 11. Energy of GLCM | 14 and 26 |
Texture | 12. Entropy of GLCM | 14 and 26 |
Texture | 13. Homogeneity of GLCM | 14 and 26 |
Texture | 14. Cluster shade | 14 and 26 |
Texture | 15. Cluster prominence | 14 and 26 |
Texture | 16. Information measures of correlation | 14 and 26 |
Texture | 17. Max probability | 14 and 26 |
Texture | 18. Sum of entropy | 14 and 26 |
Texture | 19. Sum of variance | 14 and 26 |
Texture | 20. Difference entropy | 14 and 26 |
Texture | 21. Local binary pattern (LBP), mean | 14 and 26 |
Texture | 22. LBP, standard deviation | 14 and 26 |
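To make a few of the tabulated features concrete, the following is a minimal pure-Python sketch of some first-order intensity statistics and four GLCM descriptors. It assumes an image given as a 2-D list of integer gray levels, a symmetric normalized GLCM for a single pixel offset, and the 1/(1 + |i − j|) form of homogeneity; the source does not state which offsets, quantization or definitions were used, so these choices and all function names are illustrative only.

```python
from math import log2
from statistics import mean, median, pstdev

def intensity_features(img):
    """First-order statistics of the gray values in a 2-D list of ints."""
    pixels = [p for row in img for p in row]
    mu, sd = mean(pixels), pstdev(pixels)
    return {
        "mean": mu,
        "median": median(pixels),
        "std": sd,
        "variance": sd ** 2,
        "coeff_of_variation": sd / mu if mu else float("nan"),
    }

def glcm(img, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(img), len(img[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = img[r][c], img[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions (symmetric GLCM)
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy, homogeneity and entropy of a normalized GLCM."""
    contrast = energy = homogeneity = entropy = 0.0
    for i, row in enumerate(p):
        for j, v in enumerate(row):
            contrast += (i - j) ** 2 * v
            energy += v * v
            homogeneity += v / (1 + abs(i - j))
            if v > 0:
                entropy -= v * log2(v)
    return {"contrast": contrast, "energy": energy,
            "homogeneity": homogeneity, "entropy": entropy}

# A 2x2 checkerboard: every horizontal neighbour pair differs by one level.
checker = [[0, 1], [1, 0]]
print(glcm_features(glcm(checker, levels=2)))
```

A perfectly uniform patch gives zero contrast and entropy with energy and homogeneity equal to one, while the checkerboard gives the opposite extreme for a two-level image, which shows how these descriptors respond to local gray-level variation.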
Fig. 1 In vivo OCT images of (a) NOM, (b) OLK, (c) OSF and (d) OSCC, and corresponding H&E images (at 5× magnification) of (e) NOM, (f) OLK, (g) OSF and (h) OSCC, depicting structural correlation.
Diseases classified | Feature not used (feature number used in Table 1) |
---|---|
NOM vs. OSF | 2, 3, 4, 8, 14 |
NOM vs. OLK | 14 |
NOM vs. OSCC | 0 |
OLK vs. OSCC | 10, 13 |
OSF vs. OSCC | 8, 19, 20 |
OLK vs. OSF | 1, 8, 19, 20 |
NOM vs. OSF vs. OSCC | 1, 4, 8, 18, 19, 20 |
NOM vs. OLK vs. OSCC | 8, 9, 20 |
NOM vs. OLK vs. OSF vs. OSCC | 4, 9 |
Disease | Kappa score (κ) | Standard error of kappa (SEκ) | 95% Confidence interval | The strength of agreement |
---|---|---|---|---|
NOM | 0.857 | 0.136 | 0.59–1.00 | Very good |
OLK | 1.000 | 0.000 | 1.00–1.00 | Perfect |
OSF | 0.941 | 0.058 | 0.83–1.00 | Very good |
OSCC | 1.000 | 0.000 | 1.00–1.00 | Perfect |
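For readers unfamiliar with the statistics in the table above, the sketch below shows one common way to obtain a kappa score, its standard error and a 95% confidence interval from an agreement table between two sets of labels. The simple large-sample standard error and the clipping of the interval to [−1, 1] are assumptions; the source does not state which formula was used, and the example counts are hypothetical.

```python
from math import sqrt

def cohen_kappa(table):
    """Cohen's kappa, its approximate standard error, and a 95% CI.

    table[i][j] = number of cases labelled category i by the first
    source and category j by the second (square agreement table).
    """
    n = sum(map(sum, table))
    k = len(table)
    po = sum(table[i][i] for i in range(k)) / n            # observed agreement
    row = [sum(table[i]) / n for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    pe = sum(r * c for r, c in zip(row, col))              # chance agreement
    kappa = (po - pe) / (1 - pe)
    se = sqrt(po * (1 - po) / (n * (1 - pe) ** 2))         # simple approximation
    ci = (max(-1.0, kappa - 1.96 * se), min(1.0, kappa + 1.96 * se))
    return kappa, se, ci

# Hypothetical agreement table for a present/absent call on 50 cases.
kappa, se, ci = cohen_kappa([[20, 5], [10, 15]])
print(round(kappa, 3), round(se, 3), tuple(round(x, 2) for x in ci))
```

With complete agreement the function returns a kappa of 1 and a standard error of 0, so the interval collapses to 1.00–1.00, the same pattern as the rows marked "Perfect".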
The present study utilized OCT imaging for specifying structural information, which was then correlated with histological findings (Fig. 1). In Fig. 1a and e, architectural features like distinct keratin layers above the supra-basal layer and prominent grooves of rete pegs were evident in the NOM condition, while in Fig. 1b and f an increase in keratin thickness and hyperplasia, characteristic histological features of OLK, were clearly visible. In OSF cases (Fig. 1c and g), atrophic rete pegs and increased collagen deposition were noticed in the sub-epithelial region. In cases of OSCC (Fig. 1d and h), increased numbers of blood vessels were clearly seen, which might be due to neo-angiogenesis.
Although this study correlated OCT with histology, expert-based interpretation was still needed for diagnosis. The intensity and textural features of OCT were therefore analyzed to add value to the differential diagnosis of oral pre-cancer and cancer, as well as to support computer aided diagnostic (CAD) techniques. Classification of OCT images based on the texture and intensity features (Table 1) using PCA–LDA score plots (Fig. 2) showed significant overlap between NOM and potentially malignant disorder (PMD) conditions. Therefore, SVM based two class disease classification was further performed. The results suggested that selected OCT features could differentiate the lesions with high sensitivity and specificity, with the best-performing kernel reaching about 90% overall accuracy or higher for most class pairs. Table 4 presents the classification performance of SVM with different kernels, with 10 fold cross validation. Cubic and quadratic kernels were found to be more efficient than the linear kernel during classification. When four class classification was performed, all the lesions could be classified using quadratic SVM with 82.6% accuracy after optimization of classifiers. The confusion matrix is presented in Fig. 4. Sequential feature reduction based attribute selection was then performed towards optimal classification of OCT features. Feature subsets not selected during two class disease classification (provided in Table 2) were not used further, and the rest (those not mentioned in Table 2) were therefore considered the optimum selected features from OCT images.
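As a brief illustration of what the quadratic and cubic kernels compute, the sketch below assumes the usual polynomial form (x·y + c)^d with d = 2 or 3; the exact kernel parameters used in the study are not stated here, and the function names are hypothetical. The explicit feature map shows that a quadratic kernel implicitly works with squares and pairwise products of the input features, which is why it can separate classes that a linear kernel cannot.

```python
from math import sqrt

def polynomial_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree; degree 2 = quadratic, 3 = cubic."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree

def quadratic_feature_map(x):
    """Explicit feature map for 2-D input whose inner product equals (x . y + 1) ** 2."""
    x1, x2 = x
    r2 = sqrt(2.0)
    return [1.0, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2]

x, y = (1.0, 2.0), (3.0, -1.0)
k_quadratic = polynomial_kernel(x, y, degree=2)
k_explicit = sum(a * b for a, b in zip(quadratic_feature_map(x),
                                       quadratic_feature_map(y)))
print(k_quadratic, k_explicit)  # the two values agree up to rounding
```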
Fig. 2 LDA score plot of OCT intensity and textural features using 20 principal components after PCA–LDA, with confidence ellipses representing the confidence interval at 80%.
Classification conditions | Classifier used | Sensitivity (%) | Specificity (%) | Accuracy (%) |
---|---|---|---|---|
NOM vs. OLK | Linear SVM | 37.5 | 85.4 | 71.9 |
NOM vs. OLK | Quadratic SVM | 81.3 | 92.7 | 89.5 |
NOM vs. OLK | Cubic SVM | 87.5 | 97.6 | 94.7 |
NOM vs. OSF | Linear SVM | 56.3 | 98 | 88 |
NOM vs. OSF | Quadratic SVM | 50 | 98 | 86.6 |
NOM vs. OSF | Cubic SVM | 75 | 98 | 92.5 |
NOM vs. OSCC | Linear SVM | 81.3 | 93.8 | 91.3 |
NOM vs. OSCC | Quadratic SVM | 68.8 | 95.3 | 90 |
NOM vs. OSCC | Cubic SVM | 81.3 | 96.6 | 93.8 |
OLK vs. OSCC | Linear SVM | 58.5 | 85.9 | 75.2 |
OLK vs. OSCC | Quadratic SVM | 80.5 | 87.5 | 84.8 |
OLK vs. OSCC | Cubic SVM | 80.5 | 92.2 | 87.6 |
OSF vs. OSCC | Linear SVM | 90.2 | 92.2 | 91.3 |
OSF vs. OSCC | Quadratic SVM | 88.2 | 95.3 | 92.2 |
OSF vs. OSCC | Cubic SVM | 88.2 | 92.2 | 90.4 |
OLK vs. OSF | Linear SVM | 78 | 80.4 | 79.3 |
OLK vs. OSF | Quadratic SVM | 92.7 | 86.3 | 90.2 |
OLK vs. OSF | Cubic SVM | 87.8 | 86.3 | 87 |
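The three performance measures in the table above follow from the counts of a two class confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are purely illustrative and are not taken from the study, which does not report the underlying counts here.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from two class confusion counts.

    tp/fn: positive-class cases classified correctly/incorrectly.
    tn/fp: negative-class cases classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Illustrative counts only: 50 positive and 50 negative cases.
sens, spec, acc = binary_metrics(tp=45, fn=5, tn=38, fp=12)
print(f"{100 * sens:.1f} {100 * spec:.1f} {100 * acc:.1f}")  # prints 90.0 76.0 83.0
```

Because accuracy pools both classes, a classifier can show high accuracy with poor sensitivity when the classes are imbalanced, which is why all three measures are reported together.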
When a two-tailed t-test at the 95% confidence level was performed between the disease conditions on the OCT features, the information measure of correlation was significant for differentiating NOM vs. OSCC and NOM vs. OSF. Skewness of gray value was important for NOM vs. OSCC and NOM vs. OLK. Other statistically significant parameters (p < 0.05) for delineating NOM and OSCC were the mean and median of gray values, entropy of GLCM, cluster shade, cluster prominence and sum of variance. Correlation and homogeneity of GLCM, difference entropy, and the mean and standard deviation of LBP were found to be important for distinguishing NOM and OSF. Lower entropy and mean gray value in OSCC indicated increased homogeneity in OCT images, while low cluster prominence also indicated small variation in gray scale, as validated from H&E images (Fig. 2).30
Since biochemical alteration could only be validated by multimodal characterization of oral lesions and selected feature subset obtained from OCT images can only be rechristened to QIBs if significant alterations are present in disease conditions,17 HC and IHC studies were performed and logically corroborated with OCT images towards better disease characterization. The HC and IHC (Fig. 3) were effective to elucidate specific local molecular signatures, corroborative with structural information from OCT.
As per HC and IHC observation, greater PAS positivity was obtained in OLK (Fig. 3b) than in NOM (Fig. 3a), in agreement with a previous result, since hyperkeratosis is a signature of OLK.23 This result is also reflected in OCT (Fig. 1b) and histology (Fig. 1f). Further, the expression of CK 10, a marker of early terminal differentiation and maturation of keratin-producing cells, was indicative for differential diagnosis.31 Results showed moderate expression of CK 10 in NOM (Fig. 3e) and OSF (Fig. 3g), while increased expression of this molecule throughout the epithelium in OLK (Fig. 3f) indicated the presence of immature keratin-producing epithelial cells. However, CK 10 expression was not evident in OSCC (Fig. 3h).
When VG stained sections of OSF were compared to NOM and other conditions, significant increase in collagen deposition was found in lamina propria (Fig. 3i–l), as literature suggests that in OSF muscle fibres are replaced by collagen.32
Since the epithelium of all OLK cases was found to be immunopositive for CK 10, in agreement with previous studies,33 it can be deduced that the increased lucidity of OLK in the OCT image (Fig. 1b) was perhaps due to an increase in epithelial keratinized cells (Fig. 3j) as well as a larger nuclei size than in NOM (Fig. 3a–b).8 Again, in OSF, the distinct lucidity of the sub-epithelium (Fig. 1c) could be due to increased COL-I expression (Fig. 3o). In OSCC, the OCT image (Fig. 1d) was homogeneous in nature, as the distinction between epithelium and sub-epithelium was minimal. The same observation was also supported by H&E staining (Fig. 1h).
As HC and IHC could provide only local information related to some specific molecules of the epithelium and sub-epithelium, FTIR (Fig. 5 and 6) was additionally performed on these oral pathoses to check whether discriminating signatures of unique, disease-specific global chemical alteration could be noted. An attempt was therefore made to segregate the lesions on the basis of global chemical signatures obtained in the ‘fingerprint’ region of FTIR, after spectral pre-processing, by optimized PCA–LDA. The result presented in Fig. 5a suggested that the lesions could be completely segregated when the LDA score plot was drawn with confidence ellipses representing the 95% confidence interval. This suggested significant variations in chemical composition between oral lesions, and thus the notion of a disease-specific chemical signature was validated. Since a recent review suggested degradation of collagen cores in OSF,34 the disappearance of peaks in OSF in the area between 1400 and 1700 cm−1 (Fig. 5b) was possibly due to a decrease in skeletal muscle phospholipids and proteins. Although collagen fibres are rich in proline and/or hydroxyproline,35 the amount of these amino acids, along with glycine, was found to be decreased in OSF.36 Peak picking from second derivative spectra of the same region (1800–900 cm−1), performed to understand minute chemical changes in each disease condition (Fig. 5), showed that only minute alterations existed between NOM and OLK, yet these two could be classified from OCT images with the highest accuracy (Table 4).
Fig. 6 Mean spectra after RBBC of the area between (a) 1600–1800 cm−1, (b) 2400–2000 cm−1 and (c) 3700–3000 cm−1 for depicting tissue hydration status of NOM, OLK, OSF and OSCC.
This result may be attributed to alteration in tissue hydration status, which affects scattering in OCT, as evident from Fig. 6. When mean spectra after RBBC of the areas between 1600–1800 cm−1, 2000–2400 cm−1 and 3700–3000 cm−1 were considered, it was observed that OSCC possessed a higher content of bound water than the normal condition, in agreement with a recent study.7 It was also evident from Fig. 5 that the bound water content in OLK was lower than in OSCC but higher than in NOM. Among all these oral lesions, bound water content was lowest in OSF, a finding not found in any previous report. It was evident from Fig. 5b that, in OSCC, many peaks had disappeared in the carbohydrate region, including glycogen (1200–900 cm−1). This finding was in agreement with a previous study6 and was also validated by the PAS positivity of the tissue sections (Fig. 3a–d). Depletion of glycogen and associated proteins was found to be the major chemical attribute of OSCC, which might also be the underlying cause of homogeneity in the pre-processed FTIR peaks in the area of 900–1500 cm−1 (Fig. 4b).
Since biochemical alteration could be validated by multimodal characterization of oral lesions,17 the optimum selected features from OCT images can finally be proposed as QIBs. The results thus support both proposed hypotheses: that oral lesions can be subjectively distinguished and characterized from the multimodal information obtained from OCT–histology–HC–IHC, and objectively classified with the aid of intensity and texture features of OCT images. It can also be substantiated that differences in disease-specific epithelial and sub-epithelial intensity and texture were due to chemical alteration of the epithelium and sub-epithelium in different lesions. Hence it may be concluded that OCT information can be logically corroborated with FTIR, HC and IHC. The presumptions used to devise the study, the approach for addressing the research questions, and the outcome from meaningful integration of the quantitative and qualitative knowledge obtained in this study are depicted in Fig. 7.
Fig. 7 Multimodal characterization of oral lesions with plausible informational convergence endorsing complementarity of methods.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c5ra24117k
This journal is © The Royal Society of Chemistry 2016