Uraib Sharaha ab, Daniel Hania c, Dima Bykhovsky d, Itshak Lapidot‡ ef, Mahmoud Huleihel‡ a and Ahmad Salman‡ *g
aDepartment of Microbiology, Immunology and Genetics, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
bDepartment of Biology, Science and Technology College, Hebron University, Hebron P760, Palestine
cDepartment of Green Engineering, SCE - Shamoon College of Engineering, Beer-Sheva 84100, Israel
dElectrical and Electronics Engineering Department, SCE-Sami Shamoon College of Engineering, Beer-Sheva 84100, Israel
eDepartment of Electrical and Electronics Engineering, Afeka Tel-Aviv Academic College of Engineering, Tel-Aviv 69107, Israel
fLIA Avignon Université, 339 Chemin des Meinajaries, Avignon 84000, France
gDepartment of Physics, SCE-Sami Shamoon College of Engineering, Beer-Sheva 84100, Israel. E-mail: ahmad@sce.ac.il; Tel: +972-8-6475794
First published on 27th June 2025
Early cancer detection improves patient outcomes, but most Raman spectroscopy research has focused on discriminating between normal and malignant cells, ignoring the essential precancerous stage. This study fills that gap by combining Raman spectroscopy with machine learning methods to characterize and categorize normal (primary fibroblast cells from mouse embryos), precancerous (murine fibroblast cell lines (NIH/3T3)), and malignant mouse fibroblast cells transformed by a murine sarcoma virus (MBM-T) as cancerous cells. Key spectral bands associated with malignancy progression were identified using ANOVA-based feature selection, while Log-likelihood estimation decision logic enhanced classification robustness across multiple measurements per cell. The method was 95.8% accurate in classifying normal from cancerous cells, 91% for normal vs. precancerous cells, and 86% for precancerous vs cancerous cells. These results show that Raman spectroscopy has the potential to be a valuable diagnostic tool for early cancer detection, offering insight into carcinogenesis spectrum indications. This study advances Raman-based diagnostics in oncology by strengthening spectrum analysis and classification algorithms.
752 735 new invasive cancer cases were reported in the United States. For all cancers combined, the incidence rate was 439 per 100 000 standard population overall.2 Furthermore, the GLOBOCAN statistics show that an estimated 19.3 million new cancer cases and about 10.0 million cancer deaths occurred in 2020.3 Early detection is crucial since the cancer stage at diagnosis has a significant impact on patient survival and quality of life.4,5
The prognosis of cancer patients is strongly dependent on early detection, yet many malignancies lack reliable screening methods for precancerous stages.6,7 Conventional imaging techniques such as X-ray, MRI, and PET scans often fail to detect subtle biochemical changes preceding malignancy, limiting their effectiveness in early diagnosis.8–10 This highlights the need for sensitive, label-free techniques to identify early molecular transformations associated with cancer development. Thus, novel methods for cancer detection, diagnosis, and treatment are urgently required.1
Raman spectroscopy has emerged as an effective tool for biochemical characterization of cells and tissues, providing a noninvasive, water-insensitive method for detecting molecular changes.9,11–18
Raman spectroscopy enables label-free, hypothesis-free molecular profiling, circumventing antibody optimization and spectral overlap limitations inherent to fluorescence-based techniques.13,19 Preserving samples in their native state facilitates re-analysis and is particularly suited for dried or archival specimens where labeling is impractical.20 These attributes make Raman a complementary tool to targeted flow cytometry for broad biomolecular studies.21
Several studies have effectively combined Raman spectroscopy with machine learning methods to classify normal and cancerous cells.22–31 However, most of these studies focus on well-established malignancies, overlooking the transitional precancerous stage—a critical window for early intervention.9,11–18 Precancerous cells exhibit progressive molecular changes, including metabolic shifts, nucleic acid modifications, and altered protein structures, yet the spectral biomarkers associated with this transformation remain poorly characterized.
To address this gap, we present a Raman spectroscopy-based approach that explicitly incorporates the precancerous stage, enabling the detection of molecular changes preceding malignancy. Our study examines three cell types: murine primary fibroblasts (normal), NIH/3T3 fibroblasts (precancerous), and MBM-T sarcoma-transformed cells (cancerous). We identify key Raman biomarkers associated with cancer progression by analyzing spectral differences across these states. We note that these cell lines are not meant to represent a direct oncogenic trajectory but instead serve as distinct phenotypic states used to evaluate the sensitivity and discriminatory power of our Raman-based approach.
In addition, to improve classification robustness and account for multiple spectral measurements per cell, we combine powerful machine learning-based feature selection with decision logic techniques. This study advances Raman spectroscopy as a cutting-edge diagnostic tool for early cancer diagnosis. It reveals important spectral wavenumbers that disclose the transition from healthy to precancerous and, eventually, malignant states. Our findings demonstrate the enormous potential of Raman spectroscopy in tandem with machine learning as a sensitive, label-free diagnostic approach, paving the path for more effective screening methods and preventive healthcare interventions.
While measurements for all three classes (normal, precancerous, and malignant) were not always performed on the same day, we minimized potential batch effects by:
• Adhering to strict, consistent preparation protocols,
• Using identical instrument settings for all sessions,
• Using all cells at early passages (passage 2 for primary cells and passages 3–4 for cell lines and transformed cells), with viability and normal morphology confirmed under a light microscope before drying.
The three cell types were maintained in T25 cell culture flasks using RPMI medium supplemented with 10% FBS, 2 mM L-glutamine, and 2% penicillin–streptomycin solution (50–100 μg ml−1), in a 5% CO2 environment at 37 °C. We harvested the cells from the culture flask by trypsin treatment, centrifuged them to form a pellet, washed them three times with 500 μL of 0.9% NaCl, and resuspended them in 50 μL of 0.9% NaCl. Cell concentration was determined using a hemocytometer. The cells were then pelleted and resuspended to achieve a final concentration of 30–50 cells per μL. A 2 μL drop of each sample was applied onto an aluminum-coated slide and left to dry for 10 minutes.
Dried cells were used to ensure sample stability and reproducibility during Raman spectral acquisition. While drying can alter cell morphology and may damage membranes, our analysis targets biochemical composition rather than structural features.
Raman measurements were conducted after carefully adjusting the focal plane to ensure accurate focus. Spectra were recorded with a 60 s integration time for all measurements. The laser was focused onto the sample using a 50× NA objective lens (Olympus MPLAN), generating a diffraction-limited spot size of 1.54 μm. A 600 lines per mm grating optimized signal strength while minimizing background autofluorescence.
The wavenumber calibration was performed every two hours using a silicon reference sample. For each cell, three measurements were taken, one each from the center, cytoplasm, and membrane regions, as illustrated in Fig. 1. Measurement regions were randomized within sessions to reduce systematic bias.
Although 785 nm excitation is standard for live-cell Raman spectroscopy (minimizing fluorescence and photodamage),32–34 we selected 532 nm for its higher scattering efficiency (∼1/λ4) and improved spatial resolution, critical for resolving fine biochemical details in fixed/dried cells. This wavelength has been successfully applied to study lipid–protein dynamics and subcellular structures in fixed systems,35,36 with minimal fluorescence interference in processed samples. A low-power illumination further reduced potential artifacts, aligning with established ex vivo cellular analysis protocols.37–39
Over one year, 222 individual cells were analyzed using the Raman facility: 92 normal, 76 precancerous, and 54 cancerous cells. Measurements of these 222 cells were conducted across multiple biological replicates from different mice, cultures, and batches, capturing experimental variability and enhancing the generalizability of the findings.
First, the spectra were cut to the 1800–600 cm−1 range and smoothed using the Savitzky–Golay algorithm (5-point window) to reduce instrumental noise and enhance spectral clarity. Next, baseline correction was applied to eliminate fluorescence-induced variations and spectral baseline shifts.
For baseline correction, each spectrum was divided into 64 equal-sized segments. The minimum y-value within each segment was identified, and these minima were used to fit a polynomial function representing the baseline. This polynomial was then subtracted from the original spectrum to obtain the baseline-corrected spectrum. The entire procedure was repeated five times to ensure optimal correction.
The final pre-processing step involved vector normalization. Each spectrum was treated as a vector, with the average intensity across all wavenumbers calculated and subtracted from the spectrum. The resulting spectrum was then normalized to a unit vector by computing the sum of the squared intensity values (Y-axis) and dividing by the square root of this sum. Since vector normalization can yield negative intensity values, all spectra were adjusted by shifting the minimum intensity to zero.
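The pre-processing chain described above (range cut, Savitzky–Golay smoothing, segment-minimum baseline removal, vector normalization) can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the Savitzky–Golay polynomial order (2) and the baseline polynomial degree (5) are assumptions, since the text specifies only the 5-point window, the 64 segments, and the five repetitions.

```python
import numpy as np
from scipy.signal import savgol_filter


def baseline_correct(y, n_segments=64, poly_degree=5, n_iter=5):
    """Iterative segment-minimum polynomial baseline removal.

    poly_degree is an assumed value; the text does not state it.
    The spectrum must contain at least n_segments points.
    """
    x = np.linspace(-1.0, 1.0, y.size)  # scaled axis keeps the fit well conditioned
    y = np.asarray(y, dtype=float).copy()
    for _ in range(n_iter):
        # Locate the minimum inside each of the n_segments (near-)equal chunks.
        chunks = np.array_split(np.arange(y.size), n_segments)
        idx = np.array([c[np.argmin(y[c])] for c in chunks])
        # Fit a polynomial through the minima and subtract it as the baseline.
        coeffs = np.polyfit(x[idx], y[idx], poly_degree)
        y = y - np.polyval(coeffs, x)
    return y


def vector_normalize(y):
    """Mean-center, scale to unit Euclidean norm, then shift the minimum to zero."""
    y = y - y.mean()
    y = y / np.sqrt(np.sum(y ** 2))
    return y - y.min()


def preprocess(wavenumbers, intensities):
    """Cut to 600-1800 cm^-1, smooth, remove the baseline, and normalize."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    keep = (wavenumbers >= 600) & (wavenumbers <= 1800)
    wn, y = wavenumbers[keep], intensities[keep]
    y = savgol_filter(y, window_length=5, polyorder=2)  # polyorder is assumed
    y = baseline_correct(y)
    return wn, vector_normalize(y)
```

After the final shift, the spectrum no longer has unit norm; only its mean-centered version does, which is consistent with the order of operations given in the text.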
However, a substantial portion of these wavenumbers contributed little to no valuable information for classification. To refine the dataset and optimize classification performance, the ANOVA F-score was applied to these Raman spectra for feature selection.41–43 This step is crucial for reducing data dimensionality while simultaneously enhancing classification accuracy.
Fig. 2 presents a comprehensive workflow of the machine learning pipeline, illustrating each stage—from feature extraction to model construction.
To further ensure robustness, 5-fold cross-validation was implemented in some of the classification experiments, enabling the estimation of the error as standard deviation. Here, cells—not individual spectra—were partitioned into five folds, ensuring no spectra from the same cell appear in both sets. We kept the same ratio of cells in each fold. In each iteration, four folds trained the model, while the fifth fold's cells were classified via the LLR decision system, which combines predictions from all three spectra of each validation cell to classify it.
By rotating the test fold across all partitions, the method generated an averaged performance metric with standard deviation, quantifying model stability.
This dual-validation strategy—LOGO for exhaustive per-cell assessment and 5-fold CV for error estimation—strengthened generalization while rigorously respecting the data's hierarchical structure.
This approach assumes statistical independence between the features. It evaluates each feature's significance in differentiating between the different pairs,46,47 ranking them from the most significant to lowest based on their F-scores. Higher-ranked features exhibit greater discriminatory power. The most significant 60 spectral features distinguishing between Controls and Precancerous, Controls and Cancerous, as well as Precancerous and Cancerous, are detailed in Table S1a.†
Since feature selection is critical for interpretability (explainable AI),48 we prioritized features that reflect malignancy-driven biological alterations rather than unrelated variability. We also applied relative entropy as an alternative feature selection method to further validate our findings and compared the results with the ANOVA F-score approach (Table S1b†).
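The ANOVA F-score ranking described above can be illustrated with scikit-learn's `f_classif`, which computes a one-way ANOVA F-statistic per feature. The helper `rank_wavenumbers` and its arguments are hypothetical names, and this sketch covers only the ranking step for one pair of classes.

```python
import numpy as np
from sklearn.feature_selection import f_classif


def rank_wavenumbers(spectra, labels, wavenumbers, top_k=20):
    """Return the top_k wavenumbers and their F-scores, highest score first.

    spectra: array of shape (n_spectra, n_wavenumbers)
    labels:  class label per spectrum (two classes for a pairwise comparison)
    """
    f_scores, _p_values = f_classif(spectra, labels)
    f_scores = np.nan_to_num(f_scores, nan=0.0)  # constant features carry no signal
    order = np.argsort(f_scores)[::-1][:top_k]
    return np.asarray(wavenumbers)[order], f_scores[order]
```

Because each feature is scored on its own, neighboring wavenumbers belonging to the same band tend to be selected together, which matches the runs of adjacent wavenumbers reported for the top-ranked features.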
To evaluate classification performance, we repeated each classification procedure 20 times within a 5-fold cross-validation framework. The database was randomly divided into five folds each time, ensuring diverse training and testing sets. Different feature subsets were used, starting with the five most statistically significant wavenumbers and expanding the selection in increments of five, up to 150 features.
The optimal feature subset for each classification task (Control vs. Cancerous, Control vs. Precancerous, and Precancerous vs. Cancerous) was selected through manual evaluation of feature importance rankings (Fig. 3a).
We plotted the accuracy versus the number of features in steps of 5 features. Fig. 3a shows that the curve plateaus at 10 features for Control vs. Cancerous, 70 for Control vs. Precancerous, and 115 for Cancerous vs. Precancerous classification. Thus, we chose these values as the optimal feature numbers for their respective classifications. The standard deviation of accuracy was calculated and plotted in Fig. 3a. Moreover, in Fig. 3b, we compare the accuracy of the Logistic Regression (LR) classifier for the classification between Precancerous vs. Cancerous obtained using two approaches: (1) Log-Likelihood Ratio (LLR)-based decision logic applied across three measurement sites (Fig. 1) and (2) classification performed separately for each site without decision logic.
For comparison, an additional analysis was conducted using average spectra, where each cell was represented by the average Raman spectrum from the three sites shown in Fig. 1. The LR model was used to classify the two categories in each pair, with five-fold cross-validation.
As shown in Fig. 3b, the LR classifier achieved better performance using the first approach when the number of features was relatively small (10 features were chosen).
Fig. S2a and S2b† are similar to Fig. 3b but correspond to the classification of Normal vs. Cancerous and Normal vs. Precancerous, respectively.
To consolidate multiple measurements into a final classification, we explored the LLR,53 which takes into account all the scores from all the measurements. LLR offers a refined approach by weighting predictions based on certainty, making it particularly valuable for high-precision classification tasks.
At the cell level, classification is determined using the LLR as a decision-logic approach, which relies on the classifier's scores for each spectrum within a given cell. This process is repeated for all cells in the dataset, with the final classification of each cell being based on the collective output of the classifier from its spectra. The performances of the binary classifier are summarized in a confusion matrix in Table 1.
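One common way to realize such LLR decision logic is to sum the per-spectrum log-likelihood ratios and compare the total with a threshold. The sketch below assumes that the classifier outputs a positive-class probability p for each spectrum, that the per-spectrum ratio is log(p / (1 − p)), and that the spectra of a cell are treated as independent; the exact formulation used in the study is not given in the text, and the function names are hypothetical.

```python
import numpy as np


def cell_llr(probabilities, eps=1e-12):
    """Sum of per-spectrum log-likelihood ratios log(p / (1 - p)) for one cell."""
    p = np.clip(np.asarray(probabilities, dtype=float), eps, 1.0 - eps)
    return float(np.sum(np.log(p) - np.log1p(-p)))


def classify_cells(spectrum_probs, cell_ids, threshold=0.0):
    """Map each cell id to (llr, predicted_label) using all of its spectra.

    threshold=0.0 corresponds to equal evidence for both classes (assumed default).
    """
    spectrum_probs = np.asarray(spectrum_probs, dtype=float)
    cell_ids = np.asarray(cell_ids)
    results = {}
    for cell in np.unique(cell_ids).tolist():
        llr = cell_llr(spectrum_probs[cell_ids == cell])
        results[cell] = (llr, int(llr > threshold))
    return results
```

For a cell with probabilities 0.95, 0.45, and 0.45, a majority vote would label it negative, whereas the summed LLR is positive because the single confident prediction outweighs the two marginal ones. This is the sense in which the rule weights predictions by certainty.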
| | | Predicted | |
|---|---|---|---|
| | | Positive | Negative |
| True | Positive | True Positive (TP) | False Negative (FN) |
| | Negative | False Positive (FP) | True Negative (TN) |
When the classification was performed between pairs of the three categories, the cancerous (MBM-T) category was defined as the positive state in Normal–Cancerous and Precancerous–Cancerous. When the classification was between Precancerous–Normal, the precancerous (NIH/3T3) category was defined as the positive state.
Accuracy is the percentage of correctly predicted samples, both positive and negative, out of all samples; Sensitivity is the percentage of correctly predicted positive-state samples out of the actual positive-state samples; Specificity is the percentage of correctly predicted negative-state samples out of the actual negative-state samples; PPV is the percentage of correctly predicted positive-state samples out of all the samples predicted as positive by the classifier; and NPV is the percentage of correctly predicted negative-state samples out of all the samples predicted as negative by the classifier.
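These five measures follow directly from the four confusion-matrix counts. A minimal sketch using the standard definitions is given below; the function name is hypothetical, and the counts in the usage note are made-up values rather than results from this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, sensitivity, specificity, PPV, and NPV as percentages.

    No guard against empty classes is included: a zero denominator raises
    ZeroDivisionError.
    """
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": 100.0 * tp / (tp + fn),   # correct positives / actual positives
        "specificity": 100.0 * tn / (tn + fp),   # correct negatives / actual negatives
        "ppv": 100.0 * tp / (tp + fp),           # correct positives / predicted positives
        "npv": 100.0 * tn / (tn + fn),           # correct negatives / predicted negatives
    }
```

For example, with hypothetical counts TP = 45, FP = 5, FN = 15, TN = 35, the function returns an accuracy of 80%, a sensitivity of 75%, a specificity of 87.5%, a PPV of 90%, and an NPV of 70%.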
Fig. 4 (a) Average Raman spectra in the 1800–600 cm−1 region from Normal (primary), Precancerous (NIH/3T3), and Cancerous (MBM-T) cells measured from the three sites: cytoplasm, edge, and center. (b) Difference spectra (Δ) for Normal–Cancerous, Normal–Precancerous, and Precancerous–Cancerous comparisons. The top twenty discriminative features (Table S1†) are marked in blue, orange, and green shading.
However, not all spectral differences between the tested cell types directly reflect compositional and biochemical changes associated with cancer progression. Spectral variations can arise from two main sources: inter-variance, which represents true biological differences between Normal, Precancerous, and Cancerous states due to pathological abnormalities, and intra-variance, which includes both biological variability and technical factors within each group (not relevant to malignant transformation), such as batch-to-batch variations.61 Additionally, due to the spatial resolution of Raman, different organelles and components could be measured depending on the measured site; these variations in the same cell types are also considered as intra-variance.
Therefore, careful feature selection is essential to isolate biomarkers linked to malignant transformation.
By correlating these wavenumbers with their corresponding biomolecules, we aim to identify the underlying biological changes associated with malignancy, relating spectral features to molecular alterations that may drive cancer development and offering insights into key biochemical processes underlying tumorigenesis.48 A key advantage of feature selection over dimensionality reduction methods such as PCA, UMAP, and Diffusion Map is its interpretability, allowing for a more direct biological understanding of the spectral variations.62,63
Our approach enhances diagnostic accuracy and interpretability by selecting the most informative spectral features, ensuring they reflect malignancy-driven alterations rather than unrelated variability. This framework strengthens the reliability of spectral classification, providing deeper insights into key biochemical processes involved in cancer progression.49,64–66 In contrast, other methods do not give specific information regarding the contribution of specific wavenumbers to the classification. The information derived using these methods is spread across the entire spectroscopic range.
To demonstrate the effectiveness of the feature selection techniques, we employed the t-test to compare the averages of predefined metrics between the two groups of each compared pair and evaluate whether they differed significantly.
A t-test assesses whether an observed difference between the means of two groups is likely due to chance or reflects a meaningful difference.54 The p-value is the probability of obtaining outcomes at least as extreme as those observed if the null hypothesis is true. A p-value of less than 0.05 denotes a statistically significant difference.
Before performing t-test analysis, Raman spectra acquired from different subcellular regions (center, cytoplasm, membrane) were rigorously averaged within each individual cell to avoid pseudo-replication. This process generated one representative spectrum per cell, resulting in 222 independent cell-level data points and ensuring that each biological replicate contributed a single, independent measurement to the analysis.
We applied a t-test to compare the two category pairs based on different metrics, as presented in Table 2: Normal–Cancerous, Normal–Precancerous, and Precancerous–Cancerous.
| Pair | Metric | Wavenumbers (cm−1) | Major contributing molecules | p-Value |
|---|---|---|---|---|
| Normal–Cancer | I | 1337, 1336, 1334, 1333, 1332, 1331, 1329, 1328, 1327, 1326, 1324, 1323, 1322, 1321, 1320, 1318, 1317, 1316, 1315, 1313 | Lipid/protein/nucleic acids | 1.2 × 10−36 |
| Normal–Precancer | II | 1625, 1624 | Amide I | 1.6 × 10−22 |
| | III | 1425 | Deoxyribose | 2.9 × 10−22 |
| | IV | 1318, 1317, 1316, 1315, 1313, 1312 | Lipid/protein/nucleic acids | 2.5 × 10−26 |
| | V | 1250, 1249, 1248, 1246, 1245 | Amide III, guanine, cytosine | 2.0 × 10−23 |
| | VI | 1102, 1099, 1098, 1097, 1094, 1093 | PO2−/DNA/lipids | 4.4 × 10−22 |
| Precancer–Cancer | VII | 1748, 1747, 1746 | Lipids/phospholipids | 1.6 × 10−8 |
| | VIII | 1705, 1703, 1702, 1701, 1700, 1698, 1697, 1690 | Amino acids (aspartic and glutamic acid)/Amide I | 1.2 × 10−8 |
| | IX | 1327, 1326, 1324, 1323, 1322, 1321 | Proteins/nucleic acids | 4.1 × 10−9 |
| | X | 1264, 1263, 1261, 1260 | Amide III | 2.2 × 10−8 |
The number of metrics defined for each pair depends on the dispersion of these top 20 wavenumbers across the 1800–600 cm−1 region (Fig. 4b). These predefined metrics are calculated as the sum of Raman intensities at the corresponding wavenumbers within each metric and are associated with their specific vibrational modes and contributing molecules (Table 2).
A significant difference between the two averages suggests a substantial alteration in the contents of the corresponding molecules. For example, for the “Normal–Cancer” pair, we summed the Raman intensities at the wavenumbers corresponding to metric I (Table 2). The resulting p-value indicated a statistically significant difference between the normal and cancer groups. Furthermore, the observed alterations in lipid, protein, and nucleic acid contents are significant in the transformation from the normal to the cancerous state.
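The metric calculation and the group comparison described above can be sketched as follows: spectra are first averaged per cell, the intensities at a metric's wavenumbers are summed, and the per-cell values of two categories are compared with SciPy's `ttest_ind`. The function names are hypothetical, and the use of the default equal-variance Student's t-test is an assumption, since the text does not state which t-test variant was applied.

```python
import numpy as np
from scipy.stats import ttest_ind


def metric_per_cell(spectra, cell_ids, wavenumbers, metric_wavenumbers):
    """Average each cell's spectra, then sum intensities at the metric's wavenumbers."""
    spectra = np.asarray(spectra, dtype=float)
    cell_ids = np.asarray(cell_ids)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    # Use the nearest measured wavenumber for each requested position.
    cols = [int(np.argmin(np.abs(wavenumbers - w))) for w in metric_wavenumbers]
    cells = np.unique(cell_ids)
    values = np.array([spectra[cell_ids == c].mean(axis=0)[cols].sum() for c in cells])
    return cells, values


def compare_groups(values_a, values_b):
    """Two-sample t-test on the per-cell metric values of two categories."""
    result = ttest_ind(values_a, values_b)  # equal_var=True by default (assumed)
    return result.statistic, result.pvalue
```

Averaging within each cell before testing keeps one value per cell, so repeated measurements of the same cell are not counted as independent samples.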
We presented the t-test results as a violin plot based on each metric in Fig. 5. As seen in Table 2, significant differences were observed between the averages of the compared groups in all the metrics, in the specific wavenumber and corresponding biomolecules. The significant differences observed in all the evaluated metrics confirm the effectiveness of the selected feature selection techniques in distinguishing between the compared categories. The identified spectral features and their corresponding biomolecules provide biologically relevant insights into the underlying differences between the groups. These findings reinforce the importance of feature selection in enhancing classification accuracy and improving the interpretability of spectral-based diagnostics.
Fig. 5 t-Test results presented as violin plots for Normal–Cancer (a), Normal–Precancer (b), and Precancer–Cancer (c), based on each metric. The calculation was performed for metrics I–X defined in Table 2. Each violin shows three horizontal lines: the middle line is the average, and the lower and upper lines represent the minimum and maximum of the calculated metric. The shaded area is the kernel density estimate of the metric values.
The ANOVA F-score analysis identified the most informative spectral features, effectively distinguishing between the compared categories in the three pairs of groups: Cancerous vs. Normal, Precancerous vs. Normal, and Cancerous vs. Precancerous (Table S1†). These key features, visually marked in Fig. 4b with blue, orange, and green shading, support early cancer detection and the understanding of tumorigenesis.
The transition from normal to cancerous states involves progressive molecular changes that can be detected using Raman spectroscopy.67 We systematically analyze molecular changes based on the best 20 selected features associated with precancerous transformation and the transition between these states (Table 3).
| Wavenumber (cm−1) | Vibrational mode | Associated biomolecules | Cell type comparison | Biochemical significance |
|---|---|---|---|---|
| Normal–Cancer (*); Normal–Precancerous (**); Precancerous–Cancerous (***). | | | | |
| 1690–1698 | C=C stretch | Lipid | *** | ● Indicates altered lipid metabolism, with increased unsaturated fatty acids supporting cell division, membrane fluidity, and metastasis.68 |
| | Amide I C=O stretch | Protein | | ● Protein structure changes (misfolding, aggregation) are linked to disrupted proteostasis, oxidative stress, and oncogenic activation.69 (see 1687–1680 for more on protein aggregation and structural shifts.) |
| 1705–1700 | C C OH | Amino acids aspartic & glutamic acid | *** | May reflect altered amino acid metabolism, protein modifications, and microenvironmental shifts, supporting tumor growth and progression.70,71 |
| 1748–1746 | C=C stretch | Lipid | *** | Lipid changes in cancer progression, including increased unsaturation, membrane remodeling, enhanced lipid synthesis, oxidative stress, and altered signaling, all supporting tumor growth and metastasis.72 |
| 1624, 1625 | Amide I C=O stretch | Protein | ** | Protein misfolding and post-translational modifications (glycosylation, oxidation) are associated with oxidative stress and the transition from normal to precancerous stages.73 |
| 1337–1313, 1327–1321 | CH2/CH3 deformation and torsion | Lipids | *, *** | Increased intensity in cancer cells suggests altered lipid metabolism and membrane composition, supporting rapid cell division and higher membrane fluidity, critical for tumor growth.68 |
| 1321 | CH2 bending | Lipid–protein interactions | * | May reflect altered lipid–protein interactions in cancer cell membranes, contributing to membrane dynamics essential for tumor growth and metastasis.74 |
| 1318–1312 | CH2/CH3 bending vibrations | Lipids and proteins | ** | Increased intensity in precancerous cells suggests lipid raft formation, which is involved in signaling pathways that promote proliferation, survival, and membrane remodeling.75 |
| 1264–1261 | PO2− stretch | Nucleic acids | *** | ● Nucleic acids: Extensive modifications in DNA, including mutations, chromosomal instability, and increased replication, driving tumor growth.76 |
| | CH2/CH3 deformation | Lipids/proteins | | ● Lipids: Metabolism alterations and membrane remodeling enhance oncogenic signaling and metastasis.77 |
| 1250–1245 | C–C stretching | Proteins/lipids | ** | Alterations in lipid metabolism and raft formation are critical in early cancer progression. In precancerous cells, membrane changes facilitate oncogenic signaling.77 |
| 1100 | C–N stretch | Proteins/nucleic acids | ** | Reflects protein backbone modifications and nucleic acid integrity changes (e.g., DNA methylation, oxidative damage), common in early-stage mutations.73,78 |
| 1102–1093 | PO2− stretch | Nucleic acids | ** | Shifts and intensity changes correspond to genomic instability (epigenetic changes, mutations), marking early DNA disruptions linked to cancer initiation.79 |
We correlated the top 20 selected spectral features (wavenumbers) with the functional groups of the biomolecules that compose the cells, identifying specific biochemical alterations associated with malignant transformation based on literature (Table 3).
These spectral markers reflect key molecular changes, including lipid membrane remodeling, shifts in protein secondary structures, and alterations in nucleic acid composition. The observed spectral variations suggest disruptions in lipid saturation, indicative of altered membrane fluidity and cellular signaling in tumorigenesis. Additionally, protein-related features point to changes in β-sheet and α-helix structures linked to cytoskeletal remodeling and protein folding dynamics. Moreover, the neoplastic cells generate more lactate than healthy cells,80,81 while spectral bands corresponding to nucleic acids highlight transcriptional and epigenetic alterations characteristic of precancerous and cancerous states (reference). The upregulation of specific Raman bands associated with oxidative stress suggests an imbalance in cellular redox homeostasis, a hallmark of cancer progression. Systematically mapping these spectral features to biochemical processes provides deeper insight into the molecular events driving tumorigenesis. This approach enhances our ability to differentiate between normal, precancerous, and cancerous states, reinforcing the potential of Raman spectroscopy for early cancer detection and classification.
Cancerous and precancerous cells express oncoproteins that resemble normal cytoplasmic proteins, disrupting DNA–protein interactions and modifying nuclear proteins involved in cell division and DNA replication, resulting in uncontrolled proliferation.81
Genes associated with cell cycle regulation (e.g., CCNB1, MCM3, MCM4, MCM7) and oxidative phosphorylation (e.g., ATP5B, ATP5G3) are upregulated in precancerous and cancerous cells, leading to increased protein levels.82 These molecular changes correspond to specific Raman spectral shifts, particularly in the Amide II and Amide III regions at 1250 and 1267 cm−1.83
The spectral differences between the abnormal and normal categories are larger than the spectral differences between the cancerous and precancerous categories, as seen in Table S1,† Fig. 2 and 3. This trend is further supported by the p-values in Table 2, where the Normal–Cancer comparison exhibits substantially lower p-values than the Normal–Precancerous and Precancerous–Cancerous comparisons. This is biologically plausible: as mentioned above, the precancerous and cancerous biological systems share many properties and characteristics. The precancerous (NIH/3T3) cells have undergone many changes due to various mutations over multiple transfers. However, they are still not considered cancerous cells because they do not have all the properties of a cancerous cell.84
All classification experiments were based on features selected from the Raman spectra by the ANOVA F-score in the 1800–900 cm−1 region. The classification was performed separately as a binary classification for each of the pairs: Normal–Cancerous, Normal–Precancerous, and Precancerous–Cancerous.
In this analysis, leave-one-group-out cross-validation (LOGOCV) was used to generate the receiver operating characteristic (ROC) curve to optimize the classification operating point (threshold) and to estimate the classifier's performance as the area under the curve (AUC) of the ROC curve. Each group consisted of three measurements taken from the center, cytoplasm, and edge of the same cell (Fig. 1).
For example, when the classification is between the normal and cancerous categories, the ROC curve evaluates the tests’ accuracy quantitatively in terms of correct determination for a certain sample as normal or cancerous by calculating the AUC of the ROC curve.
Fig. 6 presents the LR model's ROC curves and the operating points for the binary classification: Normal vs. Cancerous, Normal vs. Precancerous, and Precancerous vs. Cancerous. The LOGOCV was used at the spectrum level, while the sample category was determined using LLR.
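A leave-one-cell-out evaluation of this kind can be sketched with scikit-learn's `LeaveOneGroupOut`, `LogisticRegression`, and `roc_curve`. The sketch assumes that `X` already contains only the selected features, that the cell-level score is the sum of the per-spectrum log-odds, and that the operating point maximizes Youden's J (sensitivity + specificity − 1); the text does not state the criterion used to choose the threshold, and the function names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import LeaveOneGroupOut


def logo_cell_scores(X, y, cell_ids):
    """Leave one cell out, score its spectra, and sum their log-odds into a cell score."""
    X, y, cell_ids = np.asarray(X), np.asarray(y), np.asarray(cell_ids)
    cell_labels, cell_scores = [], []
    for train, test in LeaveOneGroupOut().split(X, y, groups=cell_ids):
        model = LogisticRegression(max_iter=1000).fit(X[train], y[train])
        # decision_function gives the positive-class log-odds of each held-out spectrum.
        cell_scores.append(model.decision_function(X[test]).sum())
        cell_labels.append(y[test][0])  # all spectra of a cell share one label
    return np.array(cell_labels), np.array(cell_scores)


def roc_operating_point(cell_labels, cell_scores):
    """Return (AUC, threshold), with the threshold chosen by Youden's J (assumed)."""
    fpr, tpr, thresholds = roc_curve(cell_labels, cell_scores)
    best = int(np.argmax(tpr - fpr))
    return roc_auc_score(cell_labels, cell_scores), thresholds[best]
```

If feature selection is also to be validated, it would need to be repeated inside each training split; otherwise the held-out cell influences which wavenumbers are chosen.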
The LR classification's performance for the discrimination between the different categories of each pair: Normal–Cancerous, Normal–Precancerous, and Precancerous–Cancerous is summarized in Table 4.
| | AUC | | | Accuracy (%) | | | Sensitivity (%) | | | Specificity (%) | | | PPV (%) Positive Predictive Value | | | NPV (%) Negative Predictive Value | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cells: Normal (n = 92), Precancerous (NIH/3T3, n = 76), Cancerous (MBM-T, n = 54). Methods: (I) ANOVA F-score (no reg.), (II) ANOVA F-score + class-weighted (to address imbalance), (III) Relative entropy + class-weighted (to address imbalance). The averaged spectrum across the three measurement sites (Av Sp). | ||||||||||||||||||
| (a) Normal–Cancerous classes (10 selected features) | ||||||||||||||||||
| Site | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III |
| Three sites | 0.99 ± 0.00 | 0.99 ± 0.01 | 0.99 ± 0.00 | 95.9 ± 1.0 | 96.3 ± 0.8 | 96.9 ± 0.9 | 93.4 ± 2.5 | 95.8 ± 1.6 | 96.9 ± 1.6 | 97.3 ± 0.5 | 96.6 ± 0.8 | 96.8 ± 0.8 | 95.3 ± 0.9 | 94.4 ± 1.3 | 94.8 ± 1.2 | 96.2 ± 1.4 | 97.5 ± 1.0 | 98.1 ± 0.9 |
| Center | 0.93 ± 0.01 | 0.94 ± 0.01 | 0.96 ± 0.01 | 93.6 ± 1.1 | 94.2 ± 1.2 | 95.8 ± 1.2 | 91.2 ± 2.3 | 94.5 ± 1.9 | 94.8 ± 2.7 | 94.9 ± 1.3 | 94.0 ± 1.6 | 96.3 ± 1.2 | 91.4 ± 1.9 | 90.3 ± 2.4 | 93.8 ± 1.9 | 94.9 ± 1.3 | 96.7 ± 1.1 | 97.0 ± 1.5 |
| Cytoplasm | 0.93 ± 0.01 | 0.94 ± 0.01 | 0.94 ± 0.01 | 93.8 ± 1.2 | 94.3 ± 1.1 | 93.8 ± 1.1 | 91.9 ± 2.2 | 94.6 ± 2.0 | 94.4 ± 1.9 | 94.7 ± 1.0 | 94.2 ± 1.2 | 93.5 ± 1.2 | 91.4 ± 1.7 | 90.6 ± 1.7 | 89.6 ± 1.7 | 95.3 ± 1.3 | 96.8 ± 1.2 | 96.6 ± 1.1 |
| Membrane | 0.92 ± 0.02 | 0.92 ± 0.02 | 0.92 ± 0.01 | 92.3 ± 1.5 | 92.4 ± 1.4 | 91.6 ± 1.1 | 88.8 ± 2.4 | 91.5 ± 2.8 | 91.9 ± 2.0 | 94.4 ± 2.0 | 92.9 ± 1.5 | 91.5 ± 1.7 | 90.4 ± 3.1 | 88.4 ± 2.2 | 86.4 ± 2.3 | 93.5 ± 1.3 | 94.9 ± 1.6 | 95.1 ± 1.1 |
| Av. Sp. | 0.98 ± 0.00 | 0.98 ± 0.01 | 0.98 ± 0.00 | 95.0 ± 0.9 | 94.9 ± 1.1 | 96.5 ± 0.7 | 93.1 ± 2.1 | 94.0 ± 1.8 | 96.2 ± 1.8 | 96.1 ± 1.0 | 95.4 ± 1.0 | 96.7 ± 0.2 | 93.4 ± 1.6 | 92.3 ± 1.6 | 94.5 ± 0.4 | 95.9 ± 1.2 | 96.4 ± 1.1 | 97.8 ± 1.0 |
| (b) Normal–Precancerous (70 selected features) | ||||||||||||||||||
| Site | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III |
| Three sites | 0.95 ± 0.02 | 0.96 ± 0.01 | 0.97 ± 0.01 | 89.8 ± 1.6 | 91.1 ± 1.9 | 91.4 ± 1.3 | 87.6 ± 2.4 | 89.7 ± 2.3 | 89.6 ± 2.1 | 91.6 ± 1.6 | 92.2 ± 2.1 | 92.8 ± 1.5 | 89.7 ± 1.9 | 90.5 ± 2.4 | 91.2 ± 1.7 | 91.0 ± 1.8 | 91.5 ± 1.8 | 91.6 ± 1.6 |
| Center | 0.87 ± 0.02 | 0.87 ± 0.01 | 0.88 ± 0.02 | 86.9 ± 1.5 | 87.3 ± 1.1 | 88.7 ± 1.5 | 83.1 ± 2.9 | 84.9 ± 1.8 | 86.6 ± 2.2 | 90.1 ± 1.6 | 89.2 ± 1.7 | 90.4 ± 2.1 | 87.4 ± 1.8 | 86.7 ± 1.8 | 88.2 ± 2.3 | 87.2 ± 2.0 | 87.7 ± 1.3 | 89.1 ± 1.6 |
| Cytoplasm | 0.87 ± 0.02 | 0.87 ± 0.01 | 0.91 ± 0.02 | 87.6 ± 1.7 | 87.5 ± 1.3 | 90.6 ± 1.7 | 85.1 ± 2.9 | 86.1 ± 2.2 | 90.0 ± 2.5 | 89.7 ± 1.9 | 88.6 ± 1.6 | 91.1 ± 1.8 | 87.2 ± 2.1 | 86.3 ± 1.7 | 89.3 ± 2.0 | 89.0 ± 1.8 | 88.6 ± 1.6 | 91.7 ± 1.9 |
| Membrane | 0.84 ± 0.01 | 0.85 ± 0.02 | 0.89 ± 0.02 | 84.5 ± 1.4 | 85.0 ± 2.0 | 89.0 ± 1.8 | 79.7.6 ± 2.3 | 81.9 ± 2.9 | 87.6 ± 3.6 | 88.5 ± 1.8 | 87.6 ± 1.9 | 90.1 ± 1.6 | 85.1 ± 2.0 | 84.5 ± 2.2 | 88.0 ± 1.7 | 84.1 ± 1.5 | 85.4 ± 2.1 | 89.8 ± 2.6 |
| Av. Sp. | 0.96 ± 0.01 | 0.96 ± 0.01 | 0.97 ± 0.01 | 88.7 ± 1.2 | 88.9 ± 1.5 | 92.7 ± 1.0 | 85.7 ± 2.5 | 87.3 ± 2.1 | 91.3 ± 1.5 | 91.2 ± 1.8 | 90.2 ± 1.7 | 94.0 ± 1.2 | 89.0 ± 1.9 | 88.1 ± 1.9 | 92.6 ± 1.4 | 88.6 ± 1.7 | 89.6 ± 1.6 | 92.9 ± 1.1 |
| (c) Precancerous–Cancerous (115 selected features) | ||||||||||||||||||
| Site | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III | I | II | III |
| Three sites | 0.89 ± 0.03 | 0.89 ± 0.01 | 0.87 ± 0.02 | 83.9 ± 2.9 | 84.4 ± 2.0 | 82.0 ± 2.3 | 82.2 ± 4.2 | 83.1 ± 3.0 | 79.9 ± 3.8 | 85.1 ± 3.3 | 85.3 ± 2.9 | 83.6 ± 3.3 | 79.8 ± 3.9 | 80.2 ± 3.2 | 77.7 ± 3.5 | 87.1 ± 2.8 | 87.7 ± 1.9 | 85.5 ± 2.3 |
| Center | 0.82 ± 0.02 | 0.83 ± 0.02 | 0.80 ± 0.03 | 82.2 ± 2.2 | 82.7 ± 2.0 | 80.0 ± 2.9 | 79.5 ± 3.1 | 81.5 ± 3.8 | 78.1 ± 4.7 | 84.1 ± 2.9 | 83.6 ± 2.1 | 81.3 ± 3.7 | 78.2 ± 3.4 | 78.0 ± 2.4 | 74.9 ± 3.8 | 85.3 ± 1.9 | 86.5 ± 2.4 | 84.0 ± 2.9 |
| Cytoplasm | 0.78 ± 0.02 | 0.78 ± 0.02 | 0.75 ± 0.03 | 78.4 ± 2.3 | 78.2 ± 2.0 | 75.1 ± 2.6 | 76.2 ± 3.6 | 78.1 ± 2.6 | 73.1 ± 4.6 | 79.9 ± 2.8 | 78.3 ± 3.2 | 76.4 ± 3.2 | 73.0 ± 3.1 | 72.0 ± 3.0 | 68.9 ± 3.0 | 82.6 ± 2.3 | 83.4 ± 1.6 | 80.1 ± 2.8 |
| Membrane | 0.79 ± 0.03 | 0.79 ± 0.01 | 0.75 ± 0.02 | 79.0 ± 2.5 | 78.6 ± 1.5 | 75.6 ± 2.3 | 77.6 ± 4.0 | 79.1 ± 2.3 | 73.7 ± 3.6 | 79.9 ± 3.2 | 78.2 ± 3.2 | 77.0 ± 3.2 | 73.4 ± 3.3 | 72.2 ± 2.5 | 69.6 ± 3.1 | 83.4 ± 2.5 | 84.1 ± 1.2 | 80.5 ± 2.2 |
| Av. Sp. | 0.92 ± 0.02 | 0.92 ± 0.02 | 0.89 ± 0.02 | 86.0 ± 1.6 | 84.8 ± 2.5 | 82.3 ± 2.3 | 83.7 ± 2.6 | 84.0 ± 2.6 | 82.6 ± 3.9 | 87.6 ± 2.1 | 85.5 ± 3.2 | 82.2 ± 3.4 | 82.8 ± 2.5 | 80.5 ± 3.7 | 76.8 ± 3.4 | 88.3 ± 1.6 | 88.2 ± 1.9 | 87.0 ± 2.5 |
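
For readers who want to relate the columns of Table 4 to each other, the following sketch computes the five percentage metrics from a binary confusion matrix. The counts in the example are hypothetical and are not taken from the study, and which category is treated as the positive class is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return Table 4 style metrics (in percent) from confusion counts."""
    return {
        "accuracy": 100.0 * (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": 100.0 * tp / (tp + fn),   # true positive rate
        "specificity": 100.0 * tn / (tn + fp),   # true negative rate
        "ppv": 100.0 * tp / (tp + fp),           # positive predictive value
        "npv": 100.0 * tn / (tn + fn),           # negative predictive value
    }

# Hypothetical counts for illustration only (not results from the paper).
example = binary_metrics(tp=50, fn=4, tn=89, fp=3)
for name, value in example.items():
    print(f"{name}: {value:.1f}")
```

Sensitivity and specificity depend only on the classifier, whereas PPV and NPV also depend on the class proportions, which is one reason the table reports all four.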
As shown in Fig. 3, S2a, S2b,† and Table 4, the performance of the LR logic across three measurement sites, incorporating the LLR DL method for classifying samples as Precancerous vs. Cancerous, surpasses the performance obtained when each site is analyzed separately without decision logic.
Therefore, for future studies, acquiring spectra from additional sites and analyzing cells as a group using spectra from all sites is recommended. This approach involves evaluating data at the spectrum level and applying a decision logic method to determine the classification at the cell level.
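A minimal sketch of such a spectrum-to-cell decision step is given below. It assumes that each spectrum's LR output is a posterior probability, that equal priors apply so the log-odds can stand in for the log-likelihood ratio, and that the per-site values are simply summed; the function names and the zero threshold are hypothetical and not taken from the paper.

```python
import math

def spectrum_llr(p_positive, eps=1e-12):
    """Log-likelihood ratio of one spectrum from its posterior (equal priors assumed)."""
    p = min(max(p_positive, eps), 1.0 - eps)   # guard against log(0)
    return math.log(p / (1.0 - p))

def cell_decision(site_probs, threshold=0.0):
    """Sum the per-site LLRs of one cell and compare the total to a threshold."""
    total = sum(spectrum_llr(p) for p in site_probs)
    return total, total > threshold

# Three spectra (center, cytoplasm, edge) of one hypothetical cell:
# two weakly negative sites are outweighed by one confident positive site.
total, is_positive = cell_decision([0.40, 0.45, 0.95])
print(f"summed LLR = {total:.2f}, positive = {is_positive}")
```

In this example a simple majority vote over the three sites would label the cell negative, whereas the summed LLR labels it positive because the fusion weights each site by how confident its evidence is.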
The results presented in Table 4 demonstrate that Raman spectroscopy combined with machine learning differentiates normal cells from precancerous and cancerous cells, based on the changes in the cells’ biomolecules, with 91% and 95.8% accuracy, respectively. In addition, the method differentiates precancerous from cancerous cells with 86% accuracy, even though the spectral differences between these two categories are minute. This finding is significant because detecting precancerous cells before they progress to cancerous cells is critical for the effective prevention and treatment of cancer.
The data are unbalanced; this was treated by training the LR with a weighted loss that emphasizes the smaller class. This is a widely used strategy to overcome the unbalanced data problem.85
When comparing classifier performance using ANOVA F-score selected features with and without class weighting (columns I and II), we observe no significant advantage from the imbalance adjustment (column II). The performance metrics (AUC, accuracy, sensitivity, specificity, PPV, and NPV) remain statistically equivalent within their respective error margins.
This finding aligns with expectations, as our dataset's minority class representation (24%) substantially exceeds the 10–15% threshold where class imbalance typically impairs performance.86,87 Existing literature demonstrates that class-weighted LR with feature selection maintains robustness for minority classes ≥20%,85 which supports our observed metric stability (Table 4).
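The class proportions and one common weighting scheme can be sketched as follows. The sample counts are those listed in the notes to Table 4; the inverse-frequency ("balanced") weighting formula is an assumption, since the text does not state which weights were used.

```python
def balanced_weights(counts):
    """Inverse-frequency class weights: N / (K * n_c) for each class c."""
    total = sum(counts.values())
    k = len(counts)
    return {label: total / (k * n) for label, n in counts.items()}

# Sample counts from the notes to Table 4.
counts = {"normal": 92, "precancerous": 76, "cancerous": 54}
minority_fraction = min(counts.values()) / sum(counts.values())
print(f"minority fraction: {minority_fraction:.1%}")

# Weights for one binary task (Normal vs. Cancerous) under the assumed scheme.
weights = balanced_weights({"normal": 92, "cancerous": 54})
print({label: round(w, 3) for label, w in weights.items()})
```

Under this scheme each class contributes the same total weight to the loss, so the smaller cancerous class is weighted up by roughly 1.35 and the normal class down to roughly 0.79. The smallest class accounts for 54 of 222 samples, which appears consistent with the 24% quoted above.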
The consistent outperformance of average spectra analysis and multi-site LLR fusion over single-site methods across all classification pairs (Normal–Cancerous, Normal–Precancerous, Precancerous–Cancerous) and performance metrics stems from noise reduction, enhanced statistical power, and compensation for site-specific variability. Integrating data across sites improves signal-to-noise ratio, increases generalizability, and mitigates biases, leading to more robust and biologically meaningful classification. This aligns with established data fusion principles and confirms their superiority for diagnostic applications requiring high reliability.
Our comparative analysis reveals that multi-site LLR fusion demonstrates superior performance for Normal–Cancerous and Normal–Precancerous classification by leveraging discriminative weighting of pronounced spectral differences. In contrast, average spectra analysis shows marginally better results for Precancerous–Cancerous discrimination, although the difference lies within the error margins.
Comparing the performance of the logistic regression (LR) classifier using features selected by ANOVA F-score and relative entropy (columns II and III), we find that the key metrics—AUC, accuracy, sensitivity, specificity, PPV, and NPV—are statistically equivalent within their respective error margins. This alignment is expected, as Table S2b† shows that both methods identify nearly the same 60 features, differing primarily in their ranking order.
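The overlap between two feature rankings can be checked with a sketch like the one below, run here on synthetic data. The relative entropy criterion is implemented as a symmetric Kullback-Leibler divergence between per-class Gaussian fits of each feature; this definition, the helper names, and the data dimensions are assumptions and may differ from the criterion used in the study.

```python
import numpy as np

def anova_f(X, y):
    """One-way ANOVA F statistic per feature for two classes."""
    a, b = X[y == 0], X[y == 1]
    n_a, n_b = len(a), len(b)
    grand = X.mean(axis=0)
    # Between-class mean square (1 degree of freedom for two classes).
    between = n_a * (a.mean(0) - grand) ** 2 + n_b * (b.mean(0) - grand) ** 2
    # Within-class mean square (n_a + n_b - 2 degrees of freedom).
    within = (((a - a.mean(0)) ** 2).sum(0)
              + ((b - b.mean(0)) ** 2).sum(0)) / (n_a + n_b - 2)
    return between / within

def symmetric_kl(X, y):
    """Symmetric KL divergence between per-class Gaussian fits of each feature."""
    a, b = X[y == 0], X[y == 1]
    d2 = (a.mean(0) - b.mean(0)) ** 2
    v0, v1 = a.var(0, ddof=1), b.var(0, ddof=1)
    kl01 = 0.5 * (np.log(v1 / v0) + (v0 + d2) / v1 - 1.0)
    kl10 = 0.5 * (np.log(v0 / v1) + (v1 + d2) / v0 - 1.0)
    return kl01 + kl10

def top_k(scores, k):
    """Indices of the k highest-scoring features."""
    return set(np.argsort(scores)[::-1][:k].tolist())

# Synthetic data (assumption): 50 features, the first 10 carry a class shift.
rng = np.random.default_rng(1)
n_per_class, n_features, n_informative = 100, 50, 10
y = np.repeat([0, 1], n_per_class)
X = rng.normal(size=(2 * n_per_class, n_features))
X[y == 1, :n_informative] += 2.0

f_top = top_k(anova_f(X, y), n_informative)
kl_top = top_k(symmetric_kl(X, y), n_informative)
shared = f_top & kl_top
print(f"{len(shared)} of {n_informative} selected features are shared")
```

When the class differences are dominated by mean shifts with similar variances, both criteria grow with the squared mean difference relative to the variance, so they tend to select the same features and differ mainly in ranking order.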
Table 4 and Fig. 6 demonstrate that the selected features enable clear discrimination between the two compared categories in each pair.
For decades, scientists have been looking for distinct features that can help them to discriminate between a cancerous cell and a normal cell. Our findings in this study support the possible use of this spectroscopic method to detect precancerous and cancerous cells early.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d5an00360a
‡ Contributed equally.
This journal is © The Royal Society of Chemistry 2025