An outlier detection algorithm based on segmentation and pruning of competitive network for glioma identification using Raman spectroscopy
Abstract
Raman spectroscopy is a promising diagnostic tool for brain gliomas, owing to its non-invasive nature and high information density. However, distinguishing the spectral patterns of glioma tissue from those of healthy brain tissue is challenging, and outlier spectra resulting from operator error or changes in external conditions can compromise a model's robustness and its generalizability to new data. Because glioma tissue is heterogeneous, the within-group variance of data obtained with a portable Raman spectrometer is relatively high, and inconsistencies in instrument repeatability and experimental conditions can leave non-outlier points loosely distributed, which complicates outlier detection. Overly strict outlier criteria may delete non-outlier points and thereby reduce sample utilization. To address these issues, we propose the SPCN outlier detection algorithm, which segments and prunes a competitive network to extract global outlier features, identifies topological errors, and divides initial outlier domains using the α–β region segmentation method. The algorithm also introduces a two-stage pruning method based on the characteristics of the manifold map and visualizes the outlier measure using a normalized histogram. Unlike traditional methods, SPCN is label-free and does not require estimating an outlier distance threshold or the data distribution density. We compared the accuracy of six outlier detection algorithms using Raman spectra collected from brain glioma tissues of 113 patients and examined the changes in pattern recognition accuracy after removing the outliers, confirming the precision and robustness of SPCN. This method has the potential to enhance the accuracy and reliability of glioma diagnosis via Raman spectroscopy and can also be applied to outlier detection in other spectra, such as near-infrared and mid-infrared.
- This article is part of the themed collection: Analytical Methods HOT Articles 2023