Non-destructive diagnostic testing of cardiac myxoma by serum confocal Raman microspectroscopy combined with multivariate analysis

Qiang Chen; Tao Shi; Dan Du; Bo Wang; Sha Zhao; Yang Gao; Shuang Wang; Zhanqin Zhang

doi:10.1039/D3AY00180F


DOI: 10.1039/D3AY00180F (Paper) Anal. Methods, 2023, 15, 2578-2587

Non-destructive diagnostic testing of cardiac myxoma by serum confocal Raman microspectroscopy combined with multivariate analysis†

Qiang Chen‡ ^a, Tao Shi‡ ^a, Dan Du ^b, Bo Wang ^b, Sha Zhao ^b, Yang Gao ^a, Shuang Wang ^c and Zhanqin Zhang *^b
^aDepartment of Cardiovascular Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China
^bDepartment of Anesthesiology, The First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, China. E-mail: zhangzhanqin12@sina.com
^cInstitute of Photonics and Photon-Technology, Northwest University, Xi'an, China

Received 4th February 2023 , Accepted 13th April 2023

First published on 14th April 2023

Abstract

The symptoms of cardiac myxoma (CM) mainly occur when the tumor is growing, and the diagnosis is determined by clinical presentation. Unfortunately, there is no evidence that specific blood tests are useful in CM diagnosis. Raman spectroscopy (RS) has emerged as a promising auxiliary diagnostic tool because of its ability to simultaneously detect multiple molecular features without labelling. The objective of this study was to identify spectral markers for CM, one of the most common benign cardiac tumors with insidious onset and rapid progression. A preliminary analysis was conducted based on serum Raman spectra to obtain the spectral differences between CM patients (CM group) and healthy control subjects (normal group). A principal component analysis-linear discriminant analysis (PCA-LDA) model was constructed to highlight the differences in the distribution of biochemical components between the groups according to the obtained spectral information. Principal component analysis was combined with a support vector machine model (PCA-SVM) based on three different kernel functions (linear, polynomial, and Gaussian radial basis function (RBF)) to resolve spectral variations between the study groups. The results showed that CM patients had lower serum levels of phenylalanine and carotenoids than the normal group, and increased levels of fatty acids. The resulting Raman data were used in a multivariate analysis to determine the Raman range that could be used for CM diagnosis. The chemical interpretation of the spectral results, based on the multivariate curve resolution-alternating least squares (MCR-ALS) method, is presented in the discussion section. These results suggest that RS is a promising adjunct tool for CM diagnosis, and that vibrations in the fingerprint region can be used as spectral markers for the disease under study.

Introduction

Cardiac myxoma (CM) is the most common primary benign cardiac tumor, formed by stellate to full-bodied mesenchymal cells in a mucinous-like stroma.¹ It can occur at different ages, most commonly in females between the ages of 50 and 60.² Although benign, CM can cause blood flow obstruction, fainting or sudden death.³ If tumor fragments are shed, they can cause systemic artery or pulmonary artery embolism.⁴ Current diagnosis of CM is based on clinical symptoms and radiographic or echocardiographic findings.³ The non-specific nature of the clinical symptoms of CM poses a challenge to current clinical diagnostic techniques,⁵ such as imaging or echocardiography, and prolongs the diagnostic period.^6,7 Although echocardiography is the preferred clinical diagnostic modality, the non-specific clinical symptoms and the general practitioner's limited familiarity with the tumor entity can lead to misdiagnosis and to a failure to provide accurate diagnosis and treatment recommendations to the patient.⁸ Hence, there is a strong need for a complementary diagnostic technique that can shorten the diagnostic period for CM, while also allowing analysis of the biochemical components to help establish objective diagnostic criteria for CM. Overcoming this challenge will facilitate early and accurate detection of CM and guide anticoagulation management and surgical resection.

Raman spectroscopy is a non-destructive and non-invasive technique that assesses the chemical composition of a sample based on Raman scattering generated by the rotational and vibrational modes of molecular bonds. The Raman shifts of various biomolecules (proteins, nucleic acids, lipids, and carbohydrates, among others) can therefore provide detailed information about chemical composition at the molecular level, which has shown excellent potential for applications in disease diagnosis and progression monitoring.⁹ For instance, Claudio Durastanti et al. successfully distinguished samples taken from tumor cells from those taken from healthy cells using Raman spectroscopy combined with multivariate statistical analysis.¹⁰ Furthermore, for the early detection of Alzheimer's disease using body fluids, Ralbovsky et al. combined Raman spectroscopy with machine learning and showed that saliva and serum carry potential biomarkers for the diagnosis of early Alzheimer's disease.^11,12 Raman spectroscopy thus allows for the acquisition of biochemical information about biological macromolecules, cells, tissues and organs, as well as the analysis of bodily fluids such as urine, saliva, blood and tears to reveal the relationship between the changing characteristics of the Raman spectra and clinical conditions in patients.^13–16 The capability of non-destructive detection and rapid analysis based on confocal Raman microspectroscopy enables the application of this technique as an adjunct to the diagnosis and treatment of diseases associated with biological tissues or bodily fluids.

Despite the potential of confocal Raman microspectroscopy for clinical diagnosis, no studies have been reported on the use of serum Raman spectroscopy to diagnose CM. In the present work, Raman spectroscopy was applied as an adjunct diagnostic tool with the aim of improving the diagnostic accuracy for CM. Serum samples were first collected from healthy subjects and CM patients, and Raman spectra were acquired. PCA-LDA¹⁶ and PCA-SVM¹⁷ models were then constructed based on the spectral differences and PCA dimensionality reduction to separate the groups and determine the diagnostic accuracy of Raman spectral features in CM patients. In addition, a correlation between serum Raman spectra and echocardiographic data in CM patients was demonstrated. In this manner we provide a detailed understanding of CM spectra based on serum samples to facilitate their possible clinical application.

Experimental

Human subjects

Twenty-three patients diagnosed with CM between January 2020 and January 2021 served as study subjects: 13 (56.5%) female patients and 10 (43.5%) male patients, with a mean age of 51.78 ± 14.43 years. In addition, 21 healthy subjects were included in the study as the normal group, with no significant age difference between the CM group and the normal group (51.78 ± 14.43 vs. 46.19 ± 15.03 years, p = 0.22). The CM patients were diagnosed by clinical examination and echocardiography. Echocardiography was performed by an experienced echocardiographer using a standard technique in a blinded manner. The results were as follows: 19 patients were diagnosed with left atrial myxoma and 4 patients with right atrial myxoma. Of the 23 patients, 21 (91.3%) had one cardiac myxoma and 2 (8.7%) had two smaller myxomas, for a total of 25 myxomas; tumor volume was assessed by 3D ultrasonography. The study was approved by the ethical review committee of the First Affiliated Hospital of Xi'an Jiaotong University, and each participant was informed of the purpose of the study and signed a written informed consent form. The study was registered with the ChiCTR registry (No. ChiCTR2100042308). The patient demographics are listed in Table 1.

Table 1 Demographic data of CM patients and healthy control subjects

Characteristics	Normal (n = 21)	CM (n = 23)	P-value
Age	46.19 ± 15.03	51.78 ± 14.43	0.22
Gender (male/female)	10/11	10/13
Tumor volume, mm³	—	42846.72 ± 53730.15

Serum sample preparation

Serum samples from CM patients and normal individuals that were not required for medical use were acquired from the First Affiliated Hospital of Xi'an Jiaotong University and stored at −80 °C until analysis. RS analysis was performed on air-dried samples for enhanced signal acquisition, once the serum droplets had formed coffee-ring-like regions at the rim.¹⁸ For regular sampling, four to six Raman spectra were taken for each serum sample in the center of the coffee ring to avoid the influence of chemical and physical inhomogeneities on the Raman signal. To avoid interference from the glass Raman bands, the serum samples were dropped (1 μL) onto a flat gold-covered glass slide (pGOLD Slides, Nirmidas Biotech, US) dedicated to Raman measurement.^19,20 The samples to be tested were allowed to stand at 18 °C for 10 min before drying.

Confocal Raman microspectroscopy

In this study, an Alpha 500R confocal Raman spectroscopy system (WITec GmbH, Germany) and a semiconductor laser with a 532 nm excitation wavelength were used for spectral acquisition, with the confocal objective connected to the laser via a single-mode fibre (10 μm). The experimental procedure was as follows. Before the experiments were carried out, the platform was calibrated, the experimental parameters were set and the spectral wavenumber was calibrated using the silicon Raman peak (520.7 cm⁻¹). After calibration, the sample was placed on a multi-axis piezoelectric scanning stage, the sample point was positioned using a 20× objective (NA = 1, Zeiss, Germany), and the spectral signal was acquired and transmitted to the spectrometer (UHTS300, WITec GmbH, Germany) through a multimode optical fibre (100 μm). Finally, a back-illuminated deep-depletion electron-multiplying CCD camera (Du401A-BR-DD-352, Andor Technology, UK) acquired, transmitted, processed and reproduced the Raman signal from the sample. The spectral integration time was three times a second, and the acquisition was repeated once at each spectral point. A total of 44 serum samples (normal: n = 21 and CM: n = 23) were included in this study, and a total of 200 serum spectra were obtained from the samples of the normal and CM groups for further analysis. To ensure an appropriately high signal-to-noise ratio without damage to the samples, the laser power was controlled at around 10 mW.

Raman spectroscopy for multivariate analysis

Pre-processing of the Raman spectra was performed with spectral data analysis software (NWUSA).²¹ Ninth-order polynomial fitting and five-point Savitzky–Golay smoothing were performed on the Raman spectra in the range 800–3100 cm⁻¹. The spectral range 1800–2800 cm⁻¹ was discarded because it is the Raman silent region, where Raman bands from biological molecules and tissues are minimal. After the spectral pre-processing, the acquired spectra were normalized by the area-under-curve method to reduce the effects of experimental variation, such as sample inhomogeneity and excitation light intensity drift.²²
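
The pre-processing steps above can be sketched in Python with NumPy. This is only an illustration under stated assumptions, not the NWUSA software used in the study: the iterative peak-clipping form of the polynomial baseline fit, the quadratic five-point Savitzky–Golay weights, and the names `preprocess` and `trapezoid_area` are hypothetical choices, since the text gives only the polynomial order, the smoothing window and the spectral ranges.

```python
import numpy as np

# Standard 5-point quadratic Savitzky-Golay smoothing weights (assumed polynomial order).
SG5 = np.array([-3.0, 12.0, 17.0, 12.0, -3.0]) / 35.0


def trapezoid_area(wn, y):
    """Trapezoidal area under y(wn) for ascending wavenumbers."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(wn)))


def preprocess(wn, y, order=9, n_iter=50):
    """Baseline-correct, smooth, crop and area-normalize one spectrum."""
    wn = np.asarray(wn, dtype=float)
    y = np.asarray(y, dtype=float)

    # Keep the 800-3100 cm^-1 analysis window.
    keep = (wn >= 800) & (wn <= 3100)
    wn, y = wn[keep], y[keep]

    # Assumed baseline scheme: refit a 9th-order polynomial while clipping
    # points above the fit, so that peaks do not pull the baseline upward.
    work = y.copy()
    for _ in range(n_iter):
        fit = np.polynomial.Polynomial.fit(wn, work, order)(wn)
        work = np.minimum(work, fit)
    y = y - fit

    # Five-point Savitzky-Golay smoothing; edge padding keeps the length unchanged.
    y = np.convolve(np.pad(y, 2, mode="edge"), SG5, mode="valid")

    # Discard the 1800-2800 cm^-1 silent region, then normalize to unit area.
    low, high = wn <= 1800, wn >= 2800
    area = trapezoid_area(wn[low], y[low]) + trapezoid_area(wn[high], y[high])
    sel = low | high
    return wn[sel], y[sel] / area
```

The area is computed separately on the two retained segments so that the gap left by the silent region does not contribute to the normalization.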

SVM, developed by Vapnik and Burges, is considered superior to traditional linear methods due to its ability to represent nonlinear features in data.^23,24 As a result, SVM is attracting increasing interest for the classification of spectral data.^25–28 Up to now, SVM detection of CM has not been investigated. Here, the SVM algorithm was used to identify changes in serum spectral information between the normal and CM groups. SVM separates different classes of sample data with an optimal hyperplane, i.e., the decision boundary with the maximum margin between the classes, and handles non-linear problems by introducing a kernel function. It provides balanced classification performance and minimizes structural risk by mapping the original linearly non-separable data vectors into a high-dimensional feature space in which they become separable. However, including multiple physical parameters in both spatial and temporal domains leads to complex computations and inefficiencies in the implementation of SVM algorithms.^29–31

To overcome these problems, the Raman spectral data can be reconstructed and reduced by performing PCA, which reduces the dimensionality of the data set while retaining the important biochemical information in the spectrum. The most significant principal components were then used as input variables to construct PCA-SVM models with linear, polynomial and Gaussian radial basis function kernels. Performing a feature transformation such as PCA prior to SVM classification of Raman spectra also acts as a denoising step that can improve classification performance, and PCA-SVM classification models have proven applications in the analysis of Raman spectral data. A grid search was used to find the best parameters for the SVM model, namely the penalty coefficient (C) and the kernel parameter (γ), and tenfold cross-validation was used for the PCA-SVM model. A PCA-LDA discriminant model was then constructed, and its sensitivity, specificity and overall accuracy were determined by leave-one-out cross-validation (LOOCV). Linear discriminant analysis (LDA) is a supervised machine learning technique for classification problems, but it is not applicable when the number of independent variables is much larger than the number of samples, whereas PCA is. Therefore, PCA is used first for dimensionality reduction and LDA then for discriminant analysis of the reduced data, which improves the classification efficiency. In addition, LDA uses the mean values of each category, which captures the differences between categories in the data and allows the classification performance for serum samples from the normal and CM groups to be validated and assessed. LOOCV is a general cross-validation error estimation method that gives a nearly unbiased estimate of the true error rate of the classifier. It uses a single observation from the original sample as the validation (test) set and the remaining observations as the training set, and the process is repeated so that each observation in the sample is used once as validation data.³²
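
A PCA-LDA model evaluated by LOOCV, as described above, might look like the following scikit-learn sketch. The number of principal components (`n_components=5`) and the function name `pca_lda_loocv` are placeholders, and fitting PCA inside each cross-validation fold is a design choice of this sketch; the text does not state how the original analysis handled either point.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline


def pca_lda_loocv(X, y, n_components=5):
    """Return the leave-one-out prediction for every spectrum.

    X: (n_spectra, n_wavenumbers) pre-processed intensities.
    y: (n_spectra,) class labels, e.g. 0 = normal, 1 = CM.
    """
    # PCA is refit on the training part of each fold, so the held-out
    # spectrum never influences the principal components.
    model = make_pipeline(PCA(n_components=n_components),
                          LinearDiscriminantAnalysis())
    return cross_val_predict(model, X, y, cv=LeaveOneOut())
```

Comparing the returned predictions with the true labels gives the confusion matrix from which sensitivity, specificity and overall accuracy follow.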

Furthermore, by applying the alternating least-squares approach to multivariate curve resolution from Joaquim's work,³³ we analyze and interpret the obtained spectra from a chemical rather than a purely statistical point of view, and obtain important information about the pure compounds in both groups. As an iterative self-modeling method, MCR-ALS does not require prior knowledge of the properties and composition of the mixture, and it can provide profiles that satisfactorily reflect the chemical significance.³⁴ MCR-ALS was performed in the 800–3100 cm⁻¹ range, and all spectral data were background subtracted and normalized.
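
The idea behind MCR-ALS can be illustrated with a small NumPy sketch that factorizes a data matrix D (spectra in rows) into non-negative concentration profiles C and component spectra S, so that D ≈ C·S. This is a simplified two-component version under stated assumptions, not the implementation cited above: the initialization from the two most dissimilar spectra, the non-negativity constraint enforced by clipping, and the name `mcr_als_two_component` are all choices made for illustration.

```python
import numpy as np


def mcr_als_two_component(D, n_iter=100):
    """Resolve D (n_spectra x n_wavenumbers) into C (n_spectra x 2) and S (2 x n_wavenumbers)."""
    D = np.asarray(D, dtype=float)

    # Assumed initialization: the two measured spectra with the lowest
    # cosine similarity serve as first guesses of the pure spectra.
    unit = D / (np.linalg.norm(D, axis=1, keepdims=True) + 1e-12)
    cosine = unit @ unit.T
    i, j = np.unravel_index(np.argmin(cosine), cosine.shape)
    S = D[[i, j]].copy()

    for _ in range(n_iter):
        # Least-squares update of concentrations, then of spectra,
        # each followed by a non-negativity constraint (clipping).
        C = np.clip(D @ np.linalg.pinv(S), 0.0, None)
        S = np.clip(np.linalg.pinv(C) @ D, 0.0, None)
        # Fix the scale ambiguity by giving each spectrum unit norm.
        S /= np.linalg.norm(S, axis=1, keepdims=True) + 1e-12

    C = np.clip(D @ np.linalg.pinv(S), 0.0, None)
    return C, S
```

Plotting the columns of C against the sample index gives a concentration profile, and the rows of S give the resolved component spectra.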

Results

Raman spectroscopic analysis

The average serum Raman spectra of the normal and CM groups are shown in Fig. 1A. Every point spectrum contributing to the mean spectra of the normal and CM groups was pre-processed with the same parameters to minimize sample- and instrument-induced inaccuracies. All point spectra were acquired at randomly chosen positions on the serum samples, and were normalized and averaged for comparison. As shown in Fig. 1A, the distribution of biochemical components in the serum of the normal and CM groups can be identified from the Raman features at 851, 952, 1008, 1153, 1208, 1346, 1457, 1518, 1571, 1663, 2929, and 3059 cm⁻¹. The biochemical assignments of these peaks are shown in Table 2.³⁵


	Fig. 1 (A) Area-under-curve normalized Raman spectra of the normal and CM groups. A 50-point spectrum for the normal group and a 94-point spectrum for the CM group were subsequently obtained as Raman mean spectra. (B) Differential spectra of the normal group versus patients with CM to highlight the subtle differences in the Raman mean spectra. (C) The violin plots of the normal group and the CM group were randomly selected to show the normalized spectral intensity, and the p-values of both groups were less than 0.05, which was statistically significant compared with the normal group.

Table 2 The peak positions and tentative assignments of acquired Raman spectra

Peak shift (cm⁻¹)	Vibrational mode	Major assignments
851	Ring breathing	Tyrosine
952	C–C stretching vibration	α-Helix, proline, valine
1008	C–C symmetric stretch	Phenylalanine
1153	C–C stretch mode	Carotenoids
1208	Ring vibration	L-Tryptophan, phenylalanine
1346	CH₃, CH₂ wagging	Tryptophan, adenine, guanine
1457	C–H bending	Protein
1518	C–C stretch mode	Carotenoids
1571	C=C bending vibration	Phenylalanine, acetoacetate
1663	C–C stretching vibration	Amide I
2929	C–H stretching vibration	Lipids
3059	C–H stretching vibration	Lipids

The intensity of the phenylalanine Raman peak at 1008 cm⁻¹ is slightly reduced in the CM group compared to the normal group.^36,37 The peaks at 1153 and 1518 cm⁻¹ are both attributed to the C–C stretching mode vibration of carotenoids.³⁸ The intensity of these two peaks was significantly decreased in the CM group compared to the normal group, indicating a sharp decrease in carotenoid content in the CM group; the intensity of the carotenoid peaks may also indicate that they are resonance Raman peaks. In contrast, the intensity of the Raman peak at 1663 cm⁻¹ is slightly increased in the CM group compared to the normal group,³⁹ indicating an increased amide content in the CM group. A similar increase in intensity can be observed for the Raman peak at 2929 cm⁻¹,⁴⁰ which was attributed to increased protein and fatty acid content in the CM group. Other prominent Raman bands, at 952, 1208, 1457 and 1571 cm⁻¹, are associated with protein CH-bond vibrations.⁴¹ In order to highlight the changes in the serum spectra due to CM, Raman differential spectra were obtained, as shown in Fig. 1B. The differential spectra show a decrease in the intensity of the Raman bands associated with biomolecules such as phenylalanine and carotenoids, with significant differences observed at 952, 1008, 1153, 1208 and 1518 cm⁻¹; these changes are derived from the vibrational modes of C–H or C–C bonds.³⁷ The most striking observation is the negative feature of carotenoids at two positions, 1153 cm⁻¹ and 1518 cm⁻¹.³⁸ These pronounced negative features in the difference spectra indicated a decrease in carotenoid content in the CM group. On the other hand, as shown in Fig. 1B, the tryptophan and protein bands at 1346 and 1457 cm⁻¹ showed a weak increase in intensity in the CM group, and at 2929 cm⁻¹ there is a significant increase in Raman intensity.⁴⁰ These observations indicate that the serum content of the corresponding substances is increased in the CM group compared to the normal group as a result of the lesion.

To investigate whether the mean spectra of the normal and CM groups differ, an independent two-sample t-test was performed with the normalized spectral intensity as a continuous variable. Normality tests were performed for each group prior to the t-test, and a Mann–Whitney U test was used instead for groups with non-normal distributions or unequal variances. Statistical significance was evaluated for 30 samples randomly selected from the normal and CM groups at the Raman peaks at 1008, 1153, 1457, 1518, 1663 and 2929 cm⁻¹. Violin plots for the randomly selected samples show the normalized spectral intensities with intermediate values (solid lines), as shown in Fig. 1C; the p-values were less than 0.05 at each peak, indicating statistically significant differences between the two groups.
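
The test-selection logic described above can be sketched with SciPy as follows. The text does not name the normality or variance tests that were used, so the Shapiro–Wilk and Levene tests here are assumptions, as are the 0.05 threshold for those checks and the function name `compare_peak`.

```python
import numpy as np
from scipy import stats


def compare_peak(normal, cm, alpha=0.05):
    """Compare normalized peak intensities of two groups.

    Returns the name of the test that was applied and its p-value.
    """
    normal = np.asarray(normal, dtype=float)
    cm = np.asarray(cm, dtype=float)

    # Assumed checks: Shapiro-Wilk for normality, Levene for equal variances.
    both_normal = (stats.shapiro(normal).pvalue > alpha
                   and stats.shapiro(cm).pvalue > alpha)
    equal_var = stats.levene(normal, cm).pvalue > alpha

    if both_normal and equal_var:
        return "t-test", float(stats.ttest_ind(normal, cm).pvalue)
    # Fall back to the non-parametric test when an assumption fails.
    result = stats.mannwhitneyu(normal, cm, alternative="two-sided")
    return "Mann-Whitney U", float(result.pvalue)
```

The function would be called once per Raman peak, with the intensities of the randomly selected spectra from each group as inputs.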

Spectral discrimination by PCA-LDA

Fig. 2A and B show the PCA score plots and loading spectra, respectively. The scores calculated for the normal and CM groups are shown as blue and light green dots in Fig. 2A, respectively. The score plot discriminates between the normal and CM groups along PC1 and PC2, and the PCA loading spectra contain both positive and negative features, attributed to the spectral contributions of specific biochemical components. In the PCA score plot (Fig. 2A), PC1 contributed predominantly to discriminating between the normal and CM groups, explaining 86.89% of the variance. The two groups were distributed on either side of the PC1 zero baseline, with the normal group on the negative half-axis to the left and the CM group on the positive half-axis to the right. With reference to the assignments in Fig. 1A and Table 2, the loading spectrum of PC1 was dominated by negative features at 1008, 1153, 1346 and 1518 cm⁻¹, as shown in Fig. 2B. This suggests that the between-group differences arise primarily from the reduced spectral contribution of phenylalanine and carotenoids in the CM group, while the positive feature at 2929 cm⁻¹ indicated an increased spectral contribution of lipids in the CM group. Although PC2 explained only 6.72% of the variance, it again revealed biochemical differences between the two groups: the positive features at 952, 1008, 1153, 1518 and 2929 cm⁻¹ in the PC2 loading spectrum (Fig. 2B) showed the major spectral differences between the normal and CM groups. These observations suggest that the variability in the serum Raman spectra between the CM and normal groups is mainly derived from changes in the content of biomolecules such as carotenoids and phenylalanine.


	Fig. 2 (A) PCA score plots for the first two principal components of the normal and CM groups. (B) PCA loading spectra for PC1 and PC2 from the normal group and CM group.
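
For readers unfamiliar with how scores, loadings and explained variance relate, the following NumPy sketch computes all three from a mean-centred data matrix by singular value decomposition. It is a generic illustration, not the software used in the study, and the name `pca_svd` is hypothetical.

```python
import numpy as np


def pca_svd(X, n_components=2):
    """Return PCA scores, loading spectra and explained-variance ratios.

    X: (n_spectra, n_wavenumbers) matrix of pre-processed spectra.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # mean-centre each wavenumber
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one point per spectrum
    loadings = Vt[:n_components]                 # one loading spectrum per PC
    ratio = (s ** 2) / np.sum(s ** 2)            # fraction of variance per PC
    return scores, loadings, ratio[:n_components]
```

Plotting the first two columns of `scores` gives a score plot of the kind shown in Fig. 2A, and each row of `loadings` plotted against wavenumber gives a loading spectrum as in Fig. 2B.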

In order to estimate the salient spectral features that differ between the serum spectra of the normal and CM groups, the first few PCs were used as input variables to LDA to generate a spectral discriminative diagnostic model. Fig. 3A and B depict the linear discriminant scores and the ROC curve of the PCA-LDA model for serum spectra, respectively. Because LDA projects the data onto at most (number of classes − 1) dimensions, the two sample groups are separated along a single discriminant function. As shown in Fig. 3A, the normal group spectra were distributed on the positive side of the first discriminant function, while the CM group spectra were distributed on the negative side. As shown in Fig. 3B, the area under the ROC curve for the normal and CM groups was 0.9349, which quantifies the performance of the differential diagnostic model for serum spectra. A higher degree of convexity of the ROC curve represents better model performance. Table 3 presents the LOOCV confusion matrix for the spectral classification of the normal and CM groups. As shown in Table 4, the discriminative model achieved sensitivities of 90% and 81.43% for the normal and CM groups, specificities of 81.43% and 90%, and an overall classification accuracy of 85%.


	Fig. 3 (A) A scatter plot representing the linear discriminant scores of the normal group and CM group. (B) ROC curve results for the normal and CM groups of the PCA-LDA model.

Table 3 The classification model based on the PCA-LDA algorithm and the leave-one-out cross-validation method were used to identify Raman spectral results from the CM group and normal group

Actual\predicted	Normal	CM
Normal group	45	5
CM group	13	57

Table 4 Cross-validation performance^a

Indicators\group	Normal	CM
Sensitivity	0.9000	0.8143
Specificity	0.8143	0.9000
a Overall accuracy: 85%.
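
The values in Table 4 follow directly from the confusion matrix in Table 3. The short Python sketch below shows the arithmetic; the function name `binary_metrics` is hypothetical, and the matrices are copied from Tables 3 and 9.

```python
def binary_metrics(matrix):
    """Per-class sensitivity and specificity plus overall accuracy.

    matrix[i][j] is the number of spectra of actual class i predicted as
    class j, with class 0 = normal and class 1 = CM.
    """
    (nn, nc), (cn, cc) = matrix
    recall_normal = nn / (nn + nc)   # normal spectra correctly identified
    recall_cm = cc / (cn + cc)       # CM spectra correctly identified
    return {
        # In a two-class problem the specificity for one class equals
        # the sensitivity for the other class.
        "sensitivity": (recall_normal, recall_cm),
        "specificity": (recall_cm, recall_normal),
        "accuracy": (nn + cc) / (nn + nc + cn + cc),
    }


table3 = binary_metrics([[45, 5], [13, 57]])   # PCA-LDA, LOOCV
table9 = binary_metrics([[10, 0], [1, 13]])    # RBF kernel PCA-SVM, test set
print(table3)  # sensitivities 45/50 = 0.9 and 57/70 ≈ 0.8143; accuracy 102/120 = 0.85
print(table9)  # sensitivities 1.0 and 13/14 ≈ 0.9286; accuracy 23/24 ≈ 0.9583
```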

Spectral discrimination by PCA-SVM

PCA-LDA models were constructed above to provide essential biochemical information for the differential diagnosis of CM based on serum Raman spectra. PCA-SVM models were then developed to investigate in more detail the biochemical changes in the serum Raman spectra of the normal and CM groups, and to classify and predict the serum samples of the two groups. As before, the most significant PC scores were employed as input variables for the SVM algorithm to construct the PCA-SVM model. A comparison of the sensitivity, specificity and overall accuracy of the PCA-SVM models with linear, polynomial and RBF kernels (see Tables 5–10 for details) showed that the PCA-SVM with an RBF kernel gave the best classification performance. The classification accuracy of the optimized PCA-SVM model training process and the relationship between the three kernel function parameters and the classification accuracy are shown in Fig. 4A–C. The 3D surface plot of the PCA-SVM model for the RBF kernel is displayed in Fig. 4C, which illustrates the relationship between classification accuracy and the parameters C and γ. During the grid search optimization, the parameters C and γ of the RBF kernel PCA-SVM model were varied over powers of 2 from 2⁻¹⁰ to 2¹⁰; the results indicate that larger values of C and smaller values of γ give lower classification accuracy. The RBF kernel SVM model reached its highest classification accuracy of 97.9167% when C was equal to 2⁷ and γ was equal to 2⁻⁴. The PCA-SVM classification scatter plots for the three kernel functions are shown in Fig. 4D–F. The sampled points of the normal and CM serum samples show a certain degree of dispersion. As shown in Fig. 4F, in the RBF kernel PCA-SVM the boundaries of the class domains are determined by the characteristics of the sampled points, unlike in the linear kernel and polynomial kernel PCA-SVM models (Fig. 4D and E). For the linear and polynomial kernel PCA-SVM models, many points were incorrectly classified, which indicated poor classification performance. The classification results for the PCA-SVM model with the RBF kernel on the normal and CM group test sets are shown in Table 9, where the overall accuracy on the test set was 95.8333%.
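
A grid search of this kind can be sketched with scikit-learn as follows. The power-of-two grid and the tenfold cross-validation follow the text; the use of two principal components follows the discussion section; the stratified shuffling, the random seed and the function name `grid_search_rbf` are assumptions of this sketch rather than details reported by the authors.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC


def grid_search_rbf(X, y, exponents=range(-10, 11), n_components=2, seed=0):
    """Tune C and gamma of an RBF-kernel PCA-SVM by tenfold cross-validation."""
    # Candidate values 2^-10 ... 2^10 for both C and gamma.
    values = 2.0 ** np.array(list(exponents), dtype=float)
    model = Pipeline([
        ("pca", PCA(n_components=n_components)),
        ("svc", SVC(kernel="rbf")),
    ])
    grid = {"svc__C": values, "svc__gamma": values}
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    search = GridSearchCV(model, grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search
```

After fitting, `search.best_params_` holds the selected C and γ, and `search.cv_results_["mean_test_score"]` can be reshaped into an accuracy surface comparable to the one described for Fig. 4C.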

Table 5 The classification confusion matrix of the linear kernel PCA-SVM model on the testing set (normal and CM group). Actual: pathological results. Predicted: Raman spectra combined with the classification model to obtain the discrimination results

Actual\predicted	Normal	CM
Normal group	7	3
CM group	0	14

Table 6 The classification performance (sensitivity, specificity, and overall accuracy) of the linear kernel PCA-SVM model on the testing set (normal and CM groups)^a

Indicators\group	Normal	CM
Sensitivity	0.7000	1
Specificity	1	0.7000
a Overall accuracy: 87.5%.

Table 7 The classification results of the polynomial kernel PCA-SVM models on the testing set. Actual: pathological results. Predicted: Raman spectra combined with the classification model to obtain the discrimination results

Actual\predicted	Normal	CM
Normal group	7	3
CM group	0	14

Table 8 The classification performance (sensitivity, specificity, and overall accuracy) of the polynomial kernel PCA-SVM models on the testing set

Indicators\group	Normal	CM
Sensitivity	0.7000	1
Specificity	1	0.7000


	Fig. 4 (A) A 2D plot of the linear kernel PCA-SVM model for the serum spectrum. (B and C) 3D surface plots of the PCA-SVM models for the polynomial kernel and the RBF kernel, respectively. In the linear kernel PCA-SVM model, the classification accuracy is a function of the parameter C; in the polynomial kernel PCA-SVM model, the classification accuracy is a function of the parameter C and the polynomial order; in the RBF kernel PCA-SVM model, the classification accuracy is a function of the parameters C and γ. (D–F) The linear, polynomial and RBF kernel PCA-SVM classification scatter distributions, respectively.

Table 9 The classification results of the RBF kernel PCA-SVM models on the testing set. Actual: pathological results. Predicted: Raman spectra combined with the classification model to obtain the discrimination results

Actual\predicted	Normal	CM
Normal group	10	0
CM group	1	13

Table 10 The classification performance (sensitivity, specificity, and overall accuracy) of RBF kernel PCA-SVM models on the testing set^a

Indicators\group	Normal	CM
Sensitivity	1	0.9286
Specificity	0.9286	1
a Overall accuracy: 95.8333%.

Discussion

In the present study, serum Raman spectra from patients with CM and normal subjects were identified and elucidated based on a confocal Raman microspectroscopic technique and a multivariate analysis method. After analyzing the serum spectral characteristics of the CM and normal groups, it was found that their protein content varied between the groups.

In this exploratory analysis, Fig. 1 and the multivariate analysis methods (PCA-LDA and PCA-SVM) showed that the differences in serum spectra between the CM and normal groups derived from changes in the content of, for example, carotenoids and related proteins, as reflected in the Raman peaks at 1008, 1153, 1208, 1518, 1663 and 2929 cm⁻¹. The marked increase in peak intensity at 2929 cm⁻¹ indicates an increased lipid content in the CM group, and we hypothesize that this elevated peak may represent an essential feature in the classification of serum spectra of CM patients and healthy individuals. The Raman band at 2929 cm⁻¹ was derived from fatty acids, and the disruptive effects of fatty acid accumulation in cardiac cells lead to mitochondrial dysfunction,⁴² which may in turn reduce biochemical activity.⁴³ Because the phenylalanine content can characterize the level of biochemical activity,^44,45 the reduced intensity of the Raman peak at 1008 cm⁻¹ may indicate a weaker biochemical activity in the CM group. Another report suggested that changes in serum phenylalanine concentrations were a powerful predictor of the risk of cardiovascular disease.⁴⁶ The intensity of the carotenoid peaks at 1153 and 1518 cm⁻¹ was lower in the CM group than in the normal group, indicating a significantly lower carotenoid content in the CM group. 
Such a sharp spectral difference suggested that the carotenoid bands between 1145 and 1170 cm⁻¹ and between 1506 and 1534 cm⁻¹ could be used as spectral markers to distinguish serum samples of CM patients from those of normal subjects.⁴⁷ Carotenoids have antioxidant activity and protect the body from damage caused by reactive nitrogen species (RNS) and reactive oxygen species (ROS).^48,49 Carotenoids are the main precursors of vitamin A in the human diet, and high dietary intakes and concentrations of carotenoids have been associated with changes in cardiovascular and cerebrovascular diseases and mortality of specific etiologies.^50,51 Conversely, a reduction in carotenoid content (reduced intensity of the Raman peak) may similarly be linked to increased morbidity and mortality from cardiovascular disease. Our results indicated that the fingerprint spectra of biomolecules such as carotenoids and lipids could be used to discriminate between the serum of CM patients and normal subjects. From an exploratory point of view, to serve as spectral markers of changes in CM patients, specific biochemical pathways should be clearly characterized and highly visible in the Raman spectra, as is the case for the fingerprint spectra mentioned above. Furthermore, PCs were used as input variables for the SVM algorithm to classify the CM and normal group spectra. A high accuracy of 95.83% was achieved when using the RBF non-linear kernel PCA-SVM model (87.5% for both linear and polynomial kernels), as indicated in Tables 5–10. These results demonstrate that, after optimizing the kernel function parameters of the SVM algorithm, the RBF kernel PCA-SVM model has better applicability and higher accuracy than the linear kernel or polynomial kernel PCA-SVM models for both the CM and normal groups. 
This dimensionality reduction is particularly valuable in applications where large amounts of data need to be processed quickly: PCA reduces hundreds of spectral variables to two, which greatly reduces the computational effort. Moreover, most of the sample features and the relationships between their variables are nonlinear, and the RBF kernel is a nonlinear kernel suited to such data.⁵² Thus, based on Raman spectral analysis and multivariate algorithms (PCA-LDA and PCA-SVM), serum Raman spectral features and variations were obtained and serum spectra from the CM and normal groups were successfully classified.
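
To make the PCA-SVM step more concrete, the following minimal sketch projects spectra onto the first two principal components and evaluates the three kernel functions compared above (linear, polynomial and RBF). It is an illustration under stated assumptions, not the authors' implementation: the data are random synthetic "spectra", and the function names (`pca_scores`, `linear_kernel`, `polynomial_kernel`, `rbf_kernel`) and the values of `gamma`, `degree` and `coef0` are hypothetical choices that do not come from the paper.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the mean-centred rows of X onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def linear_kernel(A, B):
    """K(a, b) = a . b"""
    return A @ B.T

def polynomial_kernel(A, B, degree=3, gamma=1.0, coef0=1.0):
    """K(a, b) = (gamma * a . b + coef0) ** degree"""
    return (gamma * (A @ B.T) + coef0) ** degree

def rbf_kernel(A, B, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2)"""
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2 * A @ B.T
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))

# Synthetic stand-in data (assumption): 120 spectra with 300 variables each.
rng = np.random.default_rng(0)
spectra = rng.normal(size=(120, 300))

scores = pca_scores(spectra)          # 300 variables reduced to 2 PC scores
K_lin = linear_kernel(scores, scores)
K_poly = polynomial_kernel(scores, scores)
K_rbf = rbf_kernel(scores, scores, gamma=0.5)
```

An SVM solver works on kernel matrices of this kind. The RBF kernel depends only on the distance between two score vectors, which is why it can separate groups that are not linearly separable in the PC plane.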

However, it should be emphasized that the results of the PCA-based reduced-dimensional multivariate analysis only express the correlation between the principal axes, and that these results are statistical rather than chemical in nature.⁵³ Hence, we used multivariate curve resolution-alternating least squares (MCR-ALS) to interpret our results in terms of changes in the chemical composition and distribution of the main biochemical components, and thereby to aid the analysis of the spectral variation between the CM and normal groups. As shown in Fig. 5A and B, based on the principles of MCR-ALS, the mixture of chemical components from serum samples in the CM and normal groups was resolved into individual component contributions, i.e. the MCR-ALS-resolved concentration profiles could be compared with the standard (normal group) to identify the chemical identity of the components in the CM group. In the biochemical component concentration distribution profile (Fig. 5A), the first 70 points were randomly selected from the CM group and the last 50 points from the normal group, for a total of 120 points. The biochemical substances represented by component 2 were higher in the CM group and lower in the normal group. Referring to Fig. 5B, the blue color represents the lipid Raman peak at 2929 cm⁻¹, which indicates that lipids were generally higher in serum samples from the CM group than in those from the normal group. In addition, the most significant changes were in the biochemical substances represented by component 1. As shown in Fig. 5A, component 1 was more abundant in the normal group, and, as shown in the component spectrum of Fig. 5B, it is characterized by the bands at 1008, 1153 and 1518 cm⁻¹.
The above results were consistent with our spectral analysis and multivariate analysis, which indicates the feasibility of the differential diagnosis of CM based on serum Raman spectroscopy combined with multivariate analysis.
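
The idea behind MCR-ALS can be illustrated with a short sketch: the data matrix D (samples × wavenumbers) is approximated as C·Sᵀ, where C holds concentration profiles and S holds component spectra, and the two are updated alternately by least squares. The version below is a deliberately simplified variant and not the authors' implementation. It assumes non-negativity as the only constraint (enforced by clipping), uses a random initial guess, and runs on synthetic two-component data whose band positions and 70/50 sample split only mimic the layout of Fig. 5; the function name `mcr_als` and all parameter values are hypothetical.

```python
import numpy as np

def mcr_als(D, n_components, n_iter=200, seed=0):
    """Approximate D (samples x wavenumbers) as C @ S.T with C >= 0 and S >= 0."""
    rng = np.random.default_rng(seed)
    S = rng.random((D.shape[1], n_components))  # random initial spectra (assumption)
    for _ in range(n_iter):
        # Least-squares concentrations for fixed spectra, then clip to >= 0.
        C = np.clip(D @ np.linalg.pinv(S.T), 0.0, None)
        # Least-squares spectra for fixed concentrations, then clip to >= 0.
        S = np.clip((np.linalg.pinv(C) @ D).T, 0.0, None)
    return C, S

def band(x, centre, width):
    """Gaussian band used to build the synthetic component spectra."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

# Synthetic data (assumption): component 1 has bands near 1008 and 1518 cm^-1,
# component 2 has a band near 2929 cm^-1.
x = np.linspace(800, 3100, 400)
S_true = np.column_stack([band(x, 1008, 15) + band(x, 1518, 15), band(x, 2929, 30)])

rng = np.random.default_rng(1)
C_true = np.vstack([
    # First 70 samples: low component 1, high component 2.
    np.column_stack([rng.uniform(0.2, 0.5, 70), rng.uniform(0.6, 1.0, 70)]),
    # Last 50 samples: high component 1, low component 2.
    np.column_stack([rng.uniform(0.6, 1.0, 50), rng.uniform(0.2, 0.5, 50)]),
])
D = C_true @ S_true.T

C, S = mcr_als(D, n_components=2)
rel_err = np.linalg.norm(D - C @ S.T) / np.linalg.norm(D)
```

The columns of `C` play the role of the concentration profiles in Fig. 5A and the columns of `S` the component spectra in Fig. 5B. Resolved components are recovered only up to ordering and scaling, so they should be matched to chemical species by their band positions.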


	Fig. 5 (A) The concentration profile features of the biochemical components of the serum samples from the CM and normal groups. The horizontal coordinates 0–70 indicate that the sample points came randomly from the CM group and 70–120 from the normal group. (B) The spectral components corresponding to the concentration profile features.

In summary, we have made an illustrative attempt to analyze the relationship between the serum spectral results and the distribution of biochemical components and metabolic changes in patients with CM; the deeper relationship between serum spectral results and CM should be further explored in preclinical models. From the fitted results of the Raman bands investigated in this study, the normalized intensity of the measured Raman bands varied significantly. Moreover, from the intensity variations among the groups of Raman spectra, we extracted the normalized peak area and full width at half maximum (FWHM) to examine the correlation between CM volumes and the Raman spectra. The correlation coefficients between areas and CM volumes (r = −0.37, −0.43, −0.45, 0.45 and 0.31 for area 1 (973–1033 cm⁻¹), area 2 (1119–1187 cm⁻¹), area 4 (1481–1550 cm⁻¹), area 5 (1642–1698 cm⁻¹) and area 6 (2841–3012 cm⁻¹), respectively) were moderate, but the associations were statistically significant (p = 2.14 × 10⁻³, 0.41 × 10⁻³, 0.19 × 10⁻³, 0.17 × 10⁻³ and 0.01, respectively). Several correlation coefficients between FWHM and CM volumes (r = 0.49, 0.37 and 0.46 for FWHM 1 (973–1033 cm⁻¹), FWHM 2 (1119–1187 cm⁻¹) and FWHM 4 (1481–1550 cm⁻¹)) were also moderate and statistically significant (p = 4.05 × 10⁻⁵, 2.39 × 10⁻³ and 0.13 × 10⁻³, respectively). This analysis suggests that peak area and FWHM may be useful indicators of CM volume. The relevant results and discussion are given in the ESI (Fig. S1–S3†).
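
As a rough illustration of the quantities used in this correlation analysis, the sketch below computes the area and FWHM of a single band and the Pearson correlation coefficient between a band parameter and tumor volume. It is not the authors' fitting procedure, which is described in the ESI. It assumes a baseline-corrected band containing one peak, and the Gaussian band, the toy `volumes` and `areas` values, and the function names `band_area`, `fwhm` and `pearson_r` are all hypothetical.

```python
import numpy as np

def band_area(x, y):
    """Area under the band by the trapezoidal rule (assumes a zero baseline)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def fwhm(x, y):
    """Full width at half maximum of a single peak, by linear interpolation."""
    half = y.max() / 2.0
    above = np.where(y >= half)[0]
    i, j = above[0], above[-1]
    # Interpolate the half-maximum crossing on each flank; this requires the
    # band to fall below half maximum at both ends of the window.
    left = x[i - 1] + (half - y[i - 1]) * (x[i] - x[i - 1]) / (y[i] - y[i - 1])
    right = x[j] + (half - y[j]) * (x[j + 1] - x[j]) / (y[j + 1] - y[j])
    return float(right - left)

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    return float(np.corrcoef(a, b)[0, 1])

# Synthetic band in the "area 1" window (973-1033 cm^-1): a Gaussian with a
# standard deviation of 10 cm^-1 (assumption, for illustration only).
x = np.arange(973.0, 1034.0, 1.0)
y = np.exp(-0.5 * ((x - 1003.0) / 10.0) ** 2)
area = band_area(x, y)
width = fwhm(x, y)

# Toy numbers (assumption): five hypothetical tumor volumes and band areas.
volumes = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
areas = np.array([26.0, 25.0, 25.5, 23.0, 22.0])
r = pearson_r(volumes, areas)

# Test statistic for the null hypothesis of no correlation; it follows a
# t distribution with n - 2 degrees of freedom under the usual assumptions.
n = len(volumes)
t_stat = r * np.sqrt((n - 2) / (1.0 - r**2))
```

For a Gaussian band the FWHM equals 2√(2 ln 2) ≈ 2.355 times the standard deviation, which gives a quick sanity check on the interpolated width. The p-value attached to a given r depends on the sample size, which is why moderate coefficients can still be statistically significant.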

Conclusions

In the present study, Raman microspectroscopy and multivariate analysis methods were used to investigate the compositional and structural changes reflected in the serum spectra of CM patients. The results showed that differences in Raman spectral information can be used as sensitive indicators to discriminate between normal subjects and CM patients. Building on the PCA-LDA differential diagnostic classification, we compared the overall accuracy of PCA-SVM models based on three kernels. The RBF kernel PCA-SVM model achieved the highest overall accuracy, which confirms the great potential of Raman microspectroscopy in combination with PCA-LDA and the SVM algorithm for the screening and diagnosis of CM. The results may also help us to understand the altered biochemical composition of serum in patients with CM and may provide an experimental basis for the development of suitable therapeutic regimens for CM patients based on changes in protein content in serum.

Author contributions

Q. C. and T. S. were responsible for writing the original draft and data curation; D. D., S. Z. and B. W. for investigation and methodology; Y. G. and S. W. for graphical work; S. W. for the scanning of sections and for reviewing and editing the original draft; and Z. Z. for conceptualization, project visualization, writing the original draft, and graphical work. The final draft was approved by all authors.

Conflicts of interest

There are no conflicts to declare.

Notes and references

1 A. P. Burke, H. Tazellar, J. Gómez-Román, R. Loire, P. Chopra, M. Tomsova, J. Veinot, T. Dijkhuizen, C. T. Basson, R. Rami-Porta, E. Maiers, A. E. Edwards, P. Walter, J. R. Galvin, S. Tsukamoto, D. Grandmougin and P. A. Araoz, Pathology and Genetics of Tumours of the Lung, Pleura, Thymus and Heart, World Health Organization Classification of Tumors, 2004, pp. 263–265.
2 T. Kusumi, M. Minakawa, K. Fukui, S. Saito, M. Ohashi, F. Sato, I. Fukuda and H. Kijima, Cardiovasc. Pathol., 2009, 18, 369–374.
3 G. Samanidis, M. Khoury, M. Balanika and D. N. Perrea, Kardiol. Pol., 2020, 78, 269–277.
4 S. M. Yuan, S. L. Yan and N. Wu, Anatolian J. Cardiol., 2017, 17, 241–247.
5 J. G. Wang, Y. J. Li, H. Liu, N. N. Li, J. Zhao and X. M. Xing, J. Thorac. Dis., 2012, 4, 272–283.
6 A. C. Storm and L. S. Lee, World J. Gastroenterol., 2016, 22, 8658–8669.
7 L. M. Best, V. Rawji, S. P. Pereira, B. R. Davidson and K. S. Gurusamy, Cochrane Database Syst. Rev., 2017, 4, CD010213.
8 V. D. Aiello and F. P. de Campos, Autops. Case Rep., 2016, 6, 5–7.
9 Z. Farhane, F. Bonnier, A. Casey and H. J. Byrne, Analyst, 2015, 140, 4212–4223.
10 C. Durastanti, E. N. M. Cirillo, I. De Benedictis, M. Ledda, A. Sciortino, A. Lisi, A. Convertino and V. Mussi, Micromachines, 2022, 13, 1388.
11 N. M. Ralbovsky, L. Halámková, K. Wall, C. Anderson-Hanley and I. K. Lednev, J. Alzheimer's Dis., 2019, 71, 1351–1359.
12 N. M. Ralbovsky and I. K. Lednev, Chem. Soc. Rev., 2020, 49, 7428–7453.
13 S. Kim, W. Kim, A. Bang, J. Y. Song, J. H. Shin and S. Choi, Anal. Methods, 2021, 13, 3249–3255.
14 E. Witkowska, T. Jagielski, A. Kamińska, A. Kowalska, A. Hryncewicz-Gwóźdź and J. Waluk, Anal. Methods, 2016, 8, 8427–8434.
15 S. Kim, S. H. Lee, Y. J. Kim, H. J. Lee and S. Choi, Anal. Methods, 2019, 11, 5381–5387.
16 K. Gajjar, L. D. Heppenstall, W. Pang, K. M. Ashton, J. Trevisan, I. I. Patel, V. Llabjani, H. F. Stringfellow, P. L. Martin-Hirsch, T. Dawson and F. L. Martin, Anal. Methods, 2012, 5, 89–102.
17 B. Li, Y. Wu, Z. Wang, M. Xing, W. Xu, Y. Zhu, P. Du, X. Wang and H. Yang, Anal. Methods, 2021, 13, 5264–5273.
18 M. Yang, D. Chen, J. Hu, X. Zheng, Z.-J. Lin and H. Zhu, TrAC, Trends Anal. Chem., 2022, 157, 116752.
19 N. M. Ralbovsky, G. S. Fitzgerald, E. C. McNay and I. K. Lednev, Spectrochim. Acta, Part A, 2021, 254, 119603.
20 G. Lu, X. Zheng, X. Lu, P. Chen, G. Wu and H. Wen, Photodiagn. Photodyn. Ther., 2021, 33, 102164.
21 D. Song, Y. Chen, J. Li, H. Wang, T. Ning and S. Wang, J. Biophotonics, 2021, 14, e202000456.
22 Y. Gong, S. Wang, Z. Liang, Z. Wang, X. Zhang, J. Li, J. Song, X. Hu, K. Wang, Q. He and J. Bai, Cell. Physiol. Biochem., 2018, 49, 1127–1142.
23 I. Schofield, D. C. Brodbelt, N. Kennedy, S. J. M. Niessen, D. B. Church, R. F. Geddes and D. G. O'Neill, Sci. Rep., 2021, 11, 9035.
24 M. N. Amin, M. A. Rushdi, R. N. Marzaban, A. Yosry, K. Kim and A. M. Mahmoud, Biomed. Signal Process. Control, 2019, 52, 84–96.
25 S. Khan, R. Ullah, A. Khan, N. Wahab, M. Bilal and M. Ahmed, Biomed. Opt. Express, 2016, 7, 2249–2256.
26 J. Wen, T. Tang, S. Kanwal, Y. Lu, C. Tao, L. Zheng, D. Zhang and Z. Gu, Front. Chem., 2021, 9, 641670.
27 A. A. Moawad, A. Silge, T. Bocklitz, K. Fischer, P. Rosch, U. Roesler, M. C. Elschner, J. Popp and H. Neubauer, Molecules, 2019, 24, 4516.
28 O. Gamulin, M. Skrabic, K. Serec, M. Par, M. Bakovic, M. Krajacic, S. D. Babic, N. Segedin, A. Osmani and M. Vodanovic, Molecules, 2021, 26, 3983.
29 J. Luts, F. Ojeda, R. Van de Plas, B. De Moor, S. Van Huffel and J. A. Suykens, Anal. Chim. Acta, 2010, 665, 129–145.
30 A. Z. Woldaregay, E. Årsand, T. Botsis, D. Albers, L. Mamykina and G. Hartvigsen, J. Med. Internet Res., 2019, 21, e11030.
31 F. Kazemi, T. A. Najafabadi and B. N. Araabi, J. Medical Signals Sens., 2016, 6, 183–193.
32 D. R. Kim, M. Ali, D. Sur, A. Khatib and T. F. Wierzba, Int. J. Health Geogr., 2012, 11, 10.
33 J. Jaumot, A. Juan and R. Tauler, Systems, 2015, 140, 1–12.
34 S. Benabou, C. Ruckebusch, M. Sliwa, A. Aviñó, R. Eritja, R. Gargallo and A. de Juan, Nucleic Acids Res., 2019, 47, 6590–6605.
35 A. E. Baker, A. R. Mantz and M. L. Chiu, mAbs, 2014, 6, 1509–1517.
36 A. M. Herrero, M. I. Cambero, J. A. Ordóñez, L. de la Hoz and P. Carmona, Food Chem., 2008, 109, 25–32.
37 X. Lu, Q. Liu, J. A. Benavides-Montano, A. V. Nicola, D. E. Aston, B. A. Rasco and H. C. Aguilar, J. Virol., 2013, 87, 3130–3142.
38 R. Ullah, S. Khan, F. Farman, M. Bilal, C. Krafft and S. Shahzad, Biomed. Opt. Express, 2019, 10, 600–609.
39 L. Shao, A. Zhang, Z. Rong, C. Wang, X. Jia, K. Zhang, R. Xiao and S. Wang, Nanomedicine, 2018, 14, 451–459.
40 C. Krafft, L. Neudert, T. Simat and R. Salzer, Spectrochim. Acta, Part A, 2005, 61, 1529–1535.
41 S. Chen, S. Zhu, X. Cui, W. Xu, C. Kong, Z. Zhang and W. Qian, Biomed. Opt. Express, 2019, 10, 3533–3544.
42 N. Fillmore, J. Mori and G. D. Lopaschuk, Br. J. Pharmacol., 2014, 171, 2080–2090.
43 W. J. Liu, Y. B. Yin, J. Y. Sun, S. Feng, J. K. Ma, X. Y. Fu, Y. J. Hou, M. F. Yang, B. L. Sun and C. D. Fan, OncoTargets Ther., 2018, 11, 5429–5439.
44 N. Qin, M. Qin, W. Shi, L. Kong, L. Wang, G. Xu, Y. Guo, J. Zhang and Q. Ma, Sci. Rep., 2022, 12, 13980.
45 D. Bai, S. Yu, S. Zhong, B. Zhao, S. Qiu, J. Chen, J. Lunagariya, X. Liao and S. Xu, Int. J. Mol. Sci., 2017, 18, 544.
46 C. Murr, T. B. Grammer, A. Meinitzer, M. E. Kleber, W. März and D. Fuchs, J. Amino Acids, 2014, 2014, 783730.
47 F. Messud-Petit, J. Gelfi, M. Delverdier, M. F. Amardeilh, R. Py, G. Sutter and S. Bertagnoli, J. Virol., 1998, 72, 7830–7839.
48 A. F. G. Cicero and A. Colletti, Curr. Pharm. Des., 2017, 23, 2422–2427.
49 E. Reboul, Nutrients, 2019, 11, 838.
50 D. Aune, N. Keum, E. Giovannucci, L. T. Fadnes, P. Boffetta, D. C. Greenwood, S. Tonstad, L. J. Vatten, E. Riboli and T. Norat, Am. J. Clin. Nutr., 2018, 108, 1069–1091.
51 J. Huang, S. J. Weinstein, K. Yu, S. Mannisto and D. Albanes, Circ. Res., 2018, 123, 1339–1349.
52 S. Boobier, D. R. J. Hose, A. J. Blacker and B. N. Nguyen, Nat. Commun., 2020, 11, 5753.
53 H. Wang, J. Li, J. Qin, J. Li, Y. Chen, D. Song, H. Zeng and S. Wang, J. Photochem. Photobiol., B, 2022, 226, 112366.

Footnotes

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3ay00180f

‡ These authors contributed equally to this work.
