Tao Hu ab, Haoyu Xiao a, Shanling Ji a, Zhihao Wu a, Yunlin Quan a, Wang Zhen a, Xiao Li *ab, Jianxiong Zhu *a and Zhonghua Ni *a
a School of Mechanical Engineering, Jiangsu Key Laboratory for Design and Manufacture of Micro-Nano Biomedical Instruments, Southeast University, Nanjing, Jiangsu Province 211189, P. R. China. E-mail: lx2016@seu.edu.cn; mezhujx@seu.edu.cn; nzh2003@seu.edu.cn
b Advanced Ocean Institute of Southeast University, Nantong, Jiangsu Province 226010, P. R. China
First published on 13th August 2025
In early cancer diagnosis, extracellular vesicles (EVs) offer advantages over circulating tumor cells owing to their smaller size, greater stability, and enhanced tissue penetration. These qualities lead to higher EV concentrations in body fluids, facilitating early detection. This study leverages surface-enhanced Raman scattering (SERS) for EV detection, employing a novel biosensor made with a molybdenum disulfide (MoS2) composite film on silicon that achieves a lower limit of detection (LOD) than existing methods and supports simultaneous quantitative testing of multiple markers. The biosensor measures EV concentrations and simultaneously detects three specific proteins on ovarian cancer EVs (CD63, CD24, and CA125). Using the ovarian cancer cell line HO8910, the sensor demonstrated a detection limit of 1.4 × 10⁴ particles per mL and a wide linear range of 3.4 × 10⁴ particles per mL to 3.4 × 10⁸ particles per mL. It also effectively discriminated between serum samples from healthy individuals and ovarian cancer patients at different stages. Additionally, machine learning was applied to analyze the detection data, resulting in a diagnostic model with a 97.78% prediction accuracy. This highlights the sensor's potential for early cancer detection and for establishing new diagnostic models.
Nanomaterials10 generally refer to ultrafine particle materials composed of nanoparticles with sizes ranging from 1 to 100 nm. They exhibit distinct properties compared to bulk materials. In recent years, researchers have incorporated nanomaterials into biosensors, particularly in the field of EV detection, to develop highly sensitive and rapid detection biosensors. Firstly, nanomaterials with high surface areas and porous structures, such as graphene11 and molybdenum disulfide (MoS2),12 can provide more active binding sites for capturing antibodies. Combined with specific antibodies targeting EV membrane proteins, nanomaterials can be used for isolation, purification, and sensitive detection of EVs.13 Secondly, nanomaterials can be used for EV labeling. For example, by utilizing the surface modification of gold nanoparticles and specific antibodies recognizing EV membrane proteins, efficient probes14 for recognizing EV membrane proteins can be prepared to investigate their expression. Some two-dimensional nanomaterials have demonstrated excellent physicochemical properties and have shown great application prospects in biosensing, such as MoS2 and graphene.15,16
Traditional methods for EV detection, such as nanoparticle tracking analysis (NTA), dynamic light scattering (DLS), flow cytometry, and transmission electron microscopy (TEM),17–19 suffer from limitations including complex sample preparation, cumbersome operation, high equipment cost, and limited sensitivity, making them less suitable for high-throughput, low-cost, and rapid early screening. Surface-enhanced Raman scattering (SERS) spectroscopy, a fast, non-destructive, non-contact optical detection method, amplifies Raman signals through “hotspots” formed by plasmons20,21 or by the electromagnetic enhancement of gold, silver and other particles to achieve highly sensitive detection. It has been widely used for the quantitative detection of EVs and for exploring the expression of EV surface membrane proteins, and a series of significant advances have been made.22–24 Meanwhile, researchers have developed a series of biosensors incorporating nanomaterials.25–27 These biosensors offer high-sensitivity measurements of EV concentrations and detailed analyses of EV membrane protein expression, marking significant progress towards early cancer diagnosis. Furthermore, the integration of machine learning and Raman spectroscopy has advanced the analysis and interpretation of molecular structural data. Machine learning28 enhances data analysis efficiency, converting vast datasets into actionable insights, while Raman spectroscopy provides detailed, high-resolution spectral data.29,30 Together, these technologies improve the accuracy of EV detection and analysis, facilitating the optimization of algorithms and contributing to the advancement of diagnostic methodologies.31
In this work, to achieve high-sensitivity detection of the EV concentration and multi-channel analysis of surface membrane proteins, we introduced MoS2,32 a nanomaterial with a high specific surface area and excellent biocompatibility, and prepared a SERS biosensor capable of detecting multiple surface membrane proteins of ovarian cancer EVs simultaneously. The sensor was used to detect the expression of multiple membrane proteins on EVs from the ovarian cancer cell line HO8910, and the lowest detection limit reached 1.4 × 10⁴ particles per mL, with a linear range of 3.4 × 10⁴ particles per mL to 3.4 × 10⁸ particles per mL. Expression differences in surface membrane proteins among different cell lines were also evaluated. Furthermore, the sensor was used to detect EVs in serum samples from healthy individuals and early and late-stage ovarian cancer patients, and the expression levels of surface membrane proteins were statistically compared. Finally, machine learning was introduced to analyze and classify the detection data, and a cancer diagnostic classification model was established with an overall prediction accuracy of 97.78%, demonstrating the enormous potential of the sensor in early cancer diagnosis (Fig. 1).
Fig. 1 Illustrative diagram of a surface-enhanced Raman scattering biosensor based on the MoS2 composite film for ovarian cancer detection and diagnosis.
The experimental instruments used include a cell crusher (1500YC) from Nanjing Emmanuel and an ultrapure water system (UPT-II-10T) from Sichuan Youpu. Weighing measurements were conducted using an electronic balance (ME104E) from Mettler Toledo (Switzerland). Cleaning procedures were carried out using an ultrasonic cleaner (KQ-400DE) from Kunshan Ultrasonic Instruments, and high-speed centrifugation was performed with a Heraeus Multifuge X1 (Thermo Fisher, USA). For surface treatments, a plasma gun from Femto Science was employed. UV-vis spectroscopy was conducted using a UV-1600PC from Shanghai Mapada (MAPADA), while confocal Raman spectroscopy was performed with an Alpha 300 Raman-AFM from Witec (Germany). High-resolution imaging was achieved with a transmission electron microscope (Talos F200X) and a scanning electron microscope (FEI Inspect F50), both from Thermo Fisher. Cell culture was performed using DMEM medium sourced from GIBCO, and filtration was carried out with 0.22 μm filters from Merck Millipore Ltd. Ultracentrifuge tubes were provided by Beckman Coulter, and nanoparticle tracking analysis was conducted using a ZetaView PMX 110.
Raman detection, shown in Fig. 3e, was performed at an excitation wavelength of 633 nm. A 50 μm × 50 μm area on the film was selected for surface scanning. The laser power was set to 1 mW. The scanned area was divided into 10 × 10 small regions, and the integration time for each region was 2 seconds.
All other Raman detection experiments were performed at an excitation wavelength of 633 nm, and 4–5 scanning areas of 40 μm × 40 μm were randomly selected on the MoS2 composite films for surface scanning processing. The laser power was 1 mW. The scanning range was divided into 20 × 20 small regions, and the integration time of each small region was 2 seconds.
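The two scan settings above fix the step size and the total integration time per map. The following is a minimal sketch of that arithmetic, assuming a square grid of equal regions and ignoring any stage-movement or readout overhead (neither is stated in the text); the function name is illustrative only.

```python
def scan_summary(side_um, grid, integration_s):
    """Return (step size in um, number of regions, total integration time in s).

    Assumes a square scan area divided into grid x grid equal regions.
    """
    step_um = side_um / grid
    regions = grid * grid
    return step_um, regions, regions * integration_s


# Fig. 3e map: 50 um x 50 um, 10 x 10 regions, 2 s per region
fig3e_scan = scan_summary(50, 10, 2)
# All other maps: 40 um x 40 um, 20 x 20 regions, 2 s per region
other_scan = scan_summary(40, 20, 2)

print(fig3e_scan)
print(other_scan)
```

Under these assumptions, the Fig. 3e map uses 5 μm regions and 200 s of integration, while each of the other maps uses 2 μm regions and 800 s of integration.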
- For protein CD63, the linear equation is y = 1555.45x − 6463.47, R² = 0.9979, with a LOD of 1.4 × 10⁴ particles per mL;
- For protein CD24, the linear equation is y = 1419.05x − 5982.65, R² = 0.9950, with a LOD of 1.6 × 10⁴ particles per mL;
- For protein CA125, the linear equation is y = 816.89x − 3471.91, R² = 0.9974, with a LOD of 1.8 × 10⁴ particles per mL.
Based on the lowest protein detection limit, the detection limit of this SERS biosensor for EVs from ovarian cancer cell line HO8910 is determined to be 1.4 × 10⁴ particles per mL.
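The text does not define x in the calibration equations or state how the LOD was derived from them. As a consistency check only, the sketch below assumes that x is log10 of the EV concentration in particles per mL and computes the concentration at which each fitted line crosses zero intensity. The function name and the zero-crossing interpretation are assumptions, not the authors' stated procedure.

```python
# Calibration lines as reported: intensity y = slope * x + intercept.
CALIBRATION = {
    "CD63": (1555.45, -6463.47),
    "CD24": (1419.05, -5982.65),
    "CA125": (816.89, -3471.91),
}


def zero_crossing_concentration(slope, intercept):
    """Concentration at which the fitted line reaches y = 0.

    ASSUMPTION: x = log10(concentration in particles per mL).
    """
    return 10 ** (-intercept / slope)


estimates = {
    marker: zero_crossing_concentration(slope, intercept)
    for marker, (slope, intercept) in CALIBRATION.items()
}

for marker, concentration in estimates.items():
    print(f"{marker}: {concentration:.2e} particles per mL")
```

Under this assumption the zero crossings round to 1.4 × 10⁴, 1.6 × 10⁴ and 1.8 × 10⁴ particles per mL, which matches the three reported LODs.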
Multivariate statistical analysis was performed using MATLAB, including principal component analysis (PCA). For cancer staging detection, a support vector machine (SVM) model was implemented in MATLAB with the deep learning toolbox. Spectral data from healthy, early-stage cancer, and late-stage cancer groups were divided into training (3000 groups) and testing (150 groups) sets. The SVM model, trained with the “fitcecoc” function with a linear kernel, was validated using test set accuracy, the confusion matrix, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC).
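The 3000 training and 150 testing groups are described later in the text as Gaussian-noise augmentations of 25 and 5 original samples (120-fold and 30-fold). The sketch below illustrates that scheme in Python rather than MATLAB. The noise level `sigma`, the random choice of the 25/5 split, and the three-value feature vectors are placeholders, since the text does not specify them.

```python
import random


def augment(samples, copies, sigma, rng):
    """Return `copies` noisy versions of every (features, label) sample."""
    augmented = []
    for features, label in samples:
        for _ in range(copies):
            noisy = [value + rng.gauss(0.0, sigma) for value in features]
            augmented.append((noisy, label))
    return augmented


rng = random.Random(0)

# Hypothetical stand-in for the 30 serum measurements (10 per group).
labels = ["healthy"] * 10 + ["early"] * 10 + ["late"] * 10
samples = [([float(i), 2.0 * i, 3.0 * i], label) for i, label in enumerate(labels)]

# Split the ORIGINAL samples first so that no noisy copy of a test
# sample can end up in the training set.
rng.shuffle(samples)
train_source, test_source = samples[:25], samples[25:]

train_set = augment(train_source, copies=120, sigma=1.0, rng=rng)  # 25 x 120
test_set = augment(test_source, copies=30, sigma=1.0, rng=rng)     # 5 x 30

print(len(train_set), len(test_set))
```

Splitting before augmenting matters here: if noisy copies of the same measurement appeared on both sides of the split, the test accuracy would mostly measure how well the model memorized that measurement.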
MoS2 dispersion was prepared via a liquid-phase exfoliation method, followed by surface modification with sodium cholate to introduce carboxyl groups for antibody conjugation. Characterization, including zeta potential analysis, spectroscopic characterization, and TEM, confirmed the successful preparation of negatively charged MoS2 nanosheets (Fig. S1). Subsequently, the surface of the silicon wafer after ultrasonic cleaning was charged by argon plasma pretreatment, then immersed in PDDA and MoS2 dispersion solution in turn, and repeatedly assembled to form a MoS2 composite film (Fig. 2a). Moreover, by observing the surface morphology of the MoS2 film using SEM (Fig. 2b), we identified the presence of wrinkles and protrusions on the surface. Through a layer-by-layer self-assembly process, the two-dimensional nanomaterials formed a three-dimensional structure on the silicon substrate, effectively increasing the specific surface area of the detection substrate. This provides additional active sites for antibody connection, thereby enhancing the sensitivity of the sensor. Then, the Raman enhancement effect of the sensor was evaluated using DTNB as a Raman reporter (Fig. 2c), and we found that the 10-layer MoS2 showed significantly better enhancement than the 5-layer MoS2, while the improvement beyond 10 layers was minimal. Considering the fabrication time and cost, 10 layer MoS2 was selected for the composite film. With fewer layers, MoS2 and PDDA cannot form a complete film, resulting in thinner or even uncovered regions, leading to weaker overall Raman enhancement. As the layer count increases, the composite film becomes more complete, significantly enhancing the Raman effect, though this effect does not change significantly at higher layer counts.
We obtained a stable, wine-red gold nanoparticle solution using a thermal reduction method (Fig. S2). Based on TEM analysis (Fig. 2d), we statistically analysed 50 synthesized gold nanoparticles and determined an average particle diameter of approximately 12.9 nm. The overall morphology and size distribution of the nanoparticles are shown in Fig. S3. This particle size is suitable for capturing EVs in the diameter range of 40–200 nm and facilitates subsequent SERS analysis. Subsequently, a two-step method was employed, in which Raman reporters (DTNB) and recognition antibodies (anti-CD63) were sequentially added to the gold nanoparticle solution to modify the surface, yielding gold nanoparticle immunorecognition probes (Fig. 2e). The modification process was characterized via UV-vis absorption spectroscopy38 (Fig. 2f). The UV-vis spectra at different stages of modification showed that, upon the addition of Raman reporters to the gold nanoparticle surface, the overall particle size increased,39 causing the redshift of the absorption peak from 525 nm to 530 nm. When the recognition antibodies were adsorbed onto the modified surface, the absorption peak of Au further redshifted to 536 nm and the full width at half maximum (FWHM) increased by 9 nm, which is consistent with the size-dependent behavior of the UV-vis absorption peak of gold nanoparticles.40
The sensor detection process is shown in Fig. 3b. To systematically evaluate the SERS enhancement effect, equal volumes of DTNB solution or prepared Au@DTNB tags were, respectively, applied onto clean silicon substrates, MoS2-modified silicon substrates, and Au@DTNB tag-modified MoS2 substrates, and Raman measurements were performed under identical experimental conditions. As shown in Fig. 3c, the Raman signal intensity of DTNB was significantly enhanced on both the MoS2 film and gold nanoparticles. When both components were present, the Raman signal was further amplified, confirming the synergistic enhancement effect and the feasibility of the SERS sensor. Furthermore, upon formation of the sensor's sandwich-type structure, the close proximity between the gold nanoparticle immunorecognition probe and the MoS2 substrate further strengthened the Raman signal, thereby enhancing the sensitivity of the sensor. Before detecting ovarian cancer EVs, the sandwich-type immunosensor structure was verified: six control groups were set up, and each group of sensors was measured by Raman spectroscopy. The results showed that a signal response was obtained only when a complete sandwich-type immunosensor structure was formed (Fig. 3d).
The sensor surface was subsequently subjected to area scanning, with specific parameters provided in the Experimental section. Three spectral windows centered at 1328, 1627, and 2223 cm−1 (each spanning 50 cm−1) were selected, corresponding to the principal Raman peaks of DTNB, TFMBA, and 4-MBN, which were employed to label the exosomal membrane proteins CD63, CD24, and CA125, respectively. The signals corresponding to these proteins were visualized in red, green, and blue channels (Fig. 3e). Prominent signal intensities were observed within each spectral window, indicating successful EV capture and effective immunorecognition of the targeted membrane protein markers.
The bright regions in the scanned images of the three proteins notably overlapped, consistent with the multi-channel detection, suggesting that the EVs in this area feature multiple surface membrane proteins labeled by the immunorecognition probe.
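The three-channel mapping described above can be sketched as a per-pixel window integration. This is a minimal illustration that assumes each 50 cm−1 window is symmetric about its center and that the channel value is a plain sum of intensities inside the window; the text does not say whether a sum, an area, or a peak height was used, and the example spectrum is hypothetical.

```python
# Window centers (cm^-1) and the marker/reporter each one labels, as stated in the text.
WINDOWS = {
    "CD63 (DTNB, red)": 1328.0,
    "CD24 (TFMBA, green)": 1627.0,
    "CA125 (4-MBN, blue)": 2223.0,
}
WINDOW_WIDTH = 50.0  # full width in cm^-1


def window_sum(shifts, intensities, center, width=WINDOW_WIDTH):
    """Sum the intensities whose Raman shift lies within center +/- width / 2."""
    half = width / 2.0
    return sum(i for s, i in zip(shifts, intensities) if abs(s - center) <= half)


def channel_values(shifts, intensities):
    """Map one pixel's spectrum to its three channel values."""
    return {name: window_sum(shifts, intensities, c) for name, c in WINDOWS.items()}


# Hypothetical spectrum of a single map pixel.
shifts = [1300.0, 1328.0, 1350.0, 1627.0, 1700.0, 2223.0]
intensities = [5.0, 100.0, 20.0, 60.0, 7.0, 30.0]

channels = channel_values(shifts, intensities)
print(channels)
```

Pixels where all three channel values are high correspond to the overlapping bright regions, that is, locations where captured EVs carry all three labeled proteins.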
Then, EVs from the ovarian cancer cell line HO8910 were selected as detection targets to evaluate the analytical performance of surface protein marker detection. Initial characterization of EV samples by NTA showed a particle size distribution dominated by nanoparticles in the 100–175 nm range (Fig. S4), consistent with typical EV dimensions. Quantitative analysis revealed a sample concentration of 3.7 × 10⁸ particles per mL. Based on these findings, the EVs were serially diluted across a concentration gradient spanning 3.7 × 10² to 3.7 × 10⁸ particles per mL to establish detection systems at varying concentrations. The EpCAM antibody was used as the capture antibody to capture EVs, and the Raman reporter DTNB-CD63, TFMBA-CD24, and 4-MBN-CA125 immunodetection probes were used to label the surface membrane proteins of EVs. After removing the background signal, the corresponding Raman signal waterfall plot at each concentration was obtained (Fig. 4a–c). At extremely low concentrations, the signal response of the sensor to ovarian cancer EVs was nearly zero, while at higher concentrations, clear characteristic peaks of the Raman reporters DTNB, TFMBA, and 4-MBN were observed at 1328 cm−1, 1627 cm−1, and 2223 cm−1, respectively, and the peak intensity increased with the EV concentration in an approximately linear manner. Using the relationship between peak intensities and EV concentrations, we determined the detection limit of the sensor for EVs from the HO8910 cell line by linear regression. According to the linear curves (Fig. 4d–f), the linear detection ranges of the sensor for the three protein markers were determined to be 3.7 × 10⁴ particles per mL to 3.7 × 10⁸ particles per mL. The detection limit of the sensor for CD63 protein was determined to be 1.4 × 10⁴ particles per mL; for CD24 protein, it was 1.6 × 10⁴ particles per mL; and for CA125 protein, it was 1.8 × 10⁴ particles per mL.
Therefore, based on the lowest detection limit, the detection limit of the sensor for EVs from the HO8910 cell line was determined to be 1.4 × 10⁴ particles per mL. We compared the detection results obtained with our sensor with the LODs reported in the existing literature (Tables S1 and S2). Our sensor not only enables the simultaneous detection of multiple surface proteins but also has a significantly lower LOD, underscoring the advantages of this biosensor.
To explore the sensor's ability to simultaneously detect multiple membrane proteins on EVs and detect EVs from different ovarian cancer cell lines and to validate its feasibility in ovarian cancer diagnosis, we conducted experiments on EVs derived from the ovarian cancer cell lines HO8910 (a concentration of 3.7 × 10⁷ particles per mL) and SKOV3 (a concentration of 5.2 × 10⁷ particles per mL). We used Anti-EpCAM as the capture probe and Anti-CD24/TFMBA/Au, Anti-CD63/DTNB/Au, and Anti-CA125/4-MBN/Au as composite immunorecognition probes. These probes were mixed and simultaneously used to recognize and label the surface membrane proteins of EVs. The specific procedures followed the same protocol described in the previous section. After incubation, Raman spectroscopy was performed for detection (Fig. 5a). The results clearly showed the characteristic peaks of the corresponding Raman reporters for the same three proteins in both cell lines. This validated the sensor's capability to simultaneously identify and characterize multiple membrane protein expressions on EVs.
Furthermore, to assess the consistency of results across repeated measurements, the repeatability of the sensor's Raman signal was validated. Using DTNB to label the membrane protein CD63 on EVs at a concentration of 3.7 × 10⁷ particles per mL from the HO8910 cell line, Raman spectra were obtained from 30 random regions to assess signal uniformity across different areas of the same sensor film. The intensities of two Raman peaks at 1328 cm−1 and 1560 cm−1 were statistically analyzed. The results (Fig. 5b) showed that the relative standard deviations (RSD) of the Raman signal intensities at 1328 cm−1 and 1560 cm−1 were 4.45% and 8.04%, respectively, validating the good reproducibility of the biosensor's response. Additionally, we stored 40 prepared sensors for two weeks and recorded the Raman peak intensity at 1328 cm−1 of the DTNB Raman reporter by selecting five sensors every two days (Fig. 5c). By the end of the storage period, the intensity had decreased to 79.3% of the original value. These results indicate that the biosensor has good reproducibility and stability. It is worth noting that, although EVs are inherently heterogeneous in terms of size and surface marker expression, the average SERS signals obtained from multiple measurements remain stable and reproducible.47 Therefore, we consider that such heterogeneity does not significantly affect the evaluation of signal uniformity in our experiments.
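For reference, the RSD figures quoted above follow the usual definition: the standard deviation of the peak intensities divided by their mean, expressed as a percentage. The sketch below assumes the sample standard deviation (n − 1 denominator), which the text does not specify, and uses made-up intensities rather than the measured ones.

```python
import statistics


def relative_std_percent(values):
    """Relative standard deviation in percent (sample standard deviation / mean)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical 1328 cm^-1 peak intensities from five regions of one film.
peak_1328 = [980.0, 1020.0, 1000.0, 1040.0, 960.0]
rsd = relative_std_percent(peak_1328)

print(f"RSD = {rsd:.2f}%")
```

With 30 regions, as in the experiment, the choice between the sample and population standard deviation changes the RSD by less than 2% of its value, so it would not alter the conclusion.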
We examined EVs in a total of 30 serum samples, including 10 samples from healthy individuals, 10 samples from early-stage ovarian cancer patients (stage I/II), and 10 samples from late-stage ovarian cancer patients (stage III/IV), and plotted the results (Fig. 6a and b). The figures show that only background signals were observed in healthy samples, and as EpCAM expression in normal serum EVs is relatively low, theoretically, EVs secreted by normal cells would not be captured by the capture probe immobilized on the composite membrane, resulting in no corresponding Raman signals. Although a few samples might exhibit nonspecific adsorption of immune recognition probes or low EpCAM expression due to inflammation, leading to their capture by the probe, the overall signals corresponding to the three membrane proteins in patient samples were significantly higher than those in healthy samples. For early- and late-stage ovarian cancer patients, the expression levels of CD63 showed minimal differences, as CD63 is an exosome-specific protein expressed in almost all EVs, regardless of the cell origin or cancer stage.52 In contrast, the expression levels of CD24 and CA125 increased as the disease progressed. Furthermore, from the bar graph, it can be observed that even within the same group, the signal intensities of the same membrane protein varied due to differences in membrane protein expression on the EV surface, which makes it difficult to accurately classify serum samples using traditional data analysis methods.
Therefore, machine learning was introduced to visualize the multidimensional data,53 to demonstrate more intuitively the expression differences of EV surface membrane proteins in serum samples between healthy individuals and ovarian cancer patients, and to establish an ovarian cancer diagnosis and discrimination platform. Initially, we attempted to distinguish healthy individuals from ovarian cancer patients using Raman signal features of individual reporters. For the 30 sample sets, we extracted the Raman signal features corresponding to each reporter and divided the dataset into a training set of 20 samples and a test set of 10 samples. The model was trained using a support vector machine (SVM) and validated on the test set. The classification accuracies for the models based on CD63, CD24, and CA125 were 65%, 85%, and 75%, respectively (Fig. S5), indicating that the diagnostic accuracy of a single biomarker remains suboptimal. Subsequently, we explored a combined analysis using the Raman signal characteristics of all three biomarkers. The machine learning methods used in this analysis were principal component analysis (PCA)9 and SVM.54 PCA was used to reduce the dimensionality of the serum sample data collected from healthy individuals and from patients with early- and late-stage ovarian cancer, thereby highlighting the characteristic differences between the groups. The dataset was then expanded by data augmentation in order to train an SVM model with enhanced accuracy, aiming to improve the efficiency of early ovarian cancer diagnosis.
The PCA results, presented in Fig. 6c, reveal that the combined variance ratio of the first two principal components exceeded 90%, indicating that the two-dimensional space retains most of the variance in the data. Moreover, in this transformed coordinate system, there are distinct differences between healthy individuals and patients with early- and late-stage ovarian cancer. PC1 is strongly associated with the separation of the three sample groups, and it typically corresponds to overall intensity, which is consistent with our previous analyses. These distinctions underscore the potential of our biosensor for clinical applications, as it effectively discriminates between the different states of ovarian cancer.
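The variance ratio quoted for the first two principal components can be illustrated with a small, dependency-free sketch. It is not the authors' MATLAB analysis: it uses power iteration with deflation in place of a library eigen-solver, and the six three-marker intensity vectors are hypothetical values chosen so that overall intensity dominates.

```python
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, iterations=500, seed=0):
    """Largest eigenvalue and its eigenvector by power iteration."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * w[i] for i in range(d)), v


def explained_variance_ratios(rows, n_components=2):
    """Fraction of the total variance captured by each leading component."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    ratios = []
    for _ in range(n_components):
        eigenvalue, v = leading_eigenpair(cov)
        ratios.append(eigenvalue / total)
        # Deflation: remove the component just found before searching for the next.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return ratios


# Hypothetical (CD63, CD24, CA125) intensities: two healthy, two early, two late.
rows = [
    [100.0, 80.0, 50.0], [120.0, 90.0, 60.0],
    [900.0, 600.0, 300.0], [950.0, 650.0, 340.0],
    [1000.0, 1100.0, 700.0], [1050.0, 1200.0, 760.0],
]
ratios = explained_variance_ratios(rows)
print([round(r, 3) for r in ratios])
```

When all three marker signals rise together from healthy to late-stage samples, as in this toy data, the first component captures most of the variance, which is the sense in which PC1 tracks overall intensity.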
The SVM algorithm is a supervised learning model based on the principle of maximum margin separation. Its core concept involves identifying the optimal hyperplane that effectively separates data points of different categories. The support vectors determine the final classification boundary, enabling SVM to excel in classification tasks. Given that we only have detection results for 30 samples (10 samples each from healthy, stage I/II, and stage III/IV groups), such a small dataset can lead to overfitting and poor generalization when training the model. To address this issue, we expanded the dataset by adding Gaussian noise to the original data. Specifically, 25 sets of data were augmented 120-fold, resulting in a total of 3000 datasets for the training set, while 5 sets of data were augmented 30-fold, resulting in 150 datasets for the test set. The average spectrum of the expanded dataset is shown in Fig. 6d. The model was trained using the augmented training set, and the detailed procedure can be found in the Methods section. Upon completion of the training, the test set was used to evaluate the model's predictions. The confusion matrix of the prediction results is shown in Fig. 6f, and the overall prediction accuracy of the model is 97.78%. This high accuracy rate is indicative of the model's robust capability to discriminate between cancer stages in subjects. To address the potential risk of overfitting due to data augmentation, we also performed 10-fold cross-validation using only the original dataset (Fig. S6). The model maintained a high classification accuracy (93.3%) after cross-validation, indicating that data augmentation did not result in artificially inflated accuracy. This demonstrates that the model's discriminative power is indeed grounded in the genuine differences present in the original data. Moreover, the receiver operating characteristic (ROC) curves were plotted (Fig. 6g), and the area under the curve (AUC) was calculated. 
The three ROC curves were plotted using a one-vs-rest strategy, where each class (healthy, early-stage, and late-stage) was treated as the positive class and the other two classes as negative. All ROC curves and AUC values were derived from the same model. The AUC values for healthy individuals, early-stage ovarian cancer patients, and late-stage ovarian cancer patients were 1, 0.9988, and 0.9988, respectively. These AUC outcomes demonstrate that the ovarian cancer diagnosis and discrimination platform we developed exhibits considerable efficacy. Upon comparing the prediction accuracy with the existing literature (Table S1), it is found that our model not only effectively predicts different stages of cancer in patients but also demonstrates superior accuracy. Therefore, it holds promise as a powerful auxiliary tool for cancer diagnosis.
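The two evaluation quantities used above, overall accuracy from the confusion matrix and one-vs-rest AUC, can be sketched as follows. This is an illustration in plain Python, not the MATLAB workflow used in the study. The AUC is computed from its rank interpretation (the probability that a positive sample scores higher than a negative one, with ties counted as one half), and the labels and scores are invented toy values.

```python
from collections import Counter

CLASSES = ["healthy", "early", "late"]


def confusion_matrix(true_labels, predicted, classes=CLASSES):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted))
    return [[counts[(t, p)] for p in classes] for t in classes]


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def auc_one_vs_rest(scores, true_labels, positive):
    """AUC for `positive` against all other classes.

    `scores` are the classifier's scores for the positive class.
    """
    pos = [s for s, l in zip(scores, true_labels) if l == positive]
    neg = [s for s, l in zip(scores, true_labels) if l != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy predictions for six samples.
true_labels = ["healthy", "healthy", "early", "early", "late", "late"]
predicted = ["healthy", "healthy", "early", "late", "late", "late"]
late_scores = [0.05, 0.10, 0.30, 0.60, 0.55, 0.90]  # score for class "late"

matrix = confusion_matrix(true_labels, predicted)
overall_accuracy = accuracy(matrix)
late_auc = auc_one_vs_rest(late_scores, true_labels, "late")

print(matrix, overall_accuracy, late_auc)
```

Repeating `auc_one_vs_rest` with each class as the positive class, using that class's scores from the same model, gives the three AUC values reported for a one-vs-rest evaluation.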
This journal is © The Royal Society of Chemistry 2025