Yifan Sun†(a), Xiao Peng†(a), Fusheng Du(b), Lin He(b), Yuan Lu*(c), Yufeng Yuan*(b) and Junle Qu(a)
(a) State Key Laboratory of Radio Frequency Heterogeneous Integration (Shenzhen University), College of Physics and Optoelectronic Engineering, Key Laboratory of Optoelectronic Devices and Systems of Ministry of Education and Guangdong Province, Shenzhen University, Shenzhen, Guangdong 518060, China
(b) School of Electronic Engineering and Intelligentization, Dongguan University of Technology, Dongguan, Guangdong 523808, China. E-mail: yufengyuan@dgut.edu.cn
(c) The Sixth People's Hospital of Shenzhen University, Shenzhen University, Shenzhen, Guangdong 518060, China. E-mail: chfsums@163.com
First published on 14th November 2025
As primary carriers of foodborne and zoonotic diseases, Bacillus spores pose a serious threat to food safety and human health. Thus, the precise identification of Bacillus spores is of great significance. Herein, we propose a living Bacillus spore identification platform: an adaptive Kolmogorov-Arnold network (KAN)-guided convolutional neural network (CNN) configuration combined with laser tweezers Raman spectroscopy (LTRS). To address the small size of the original single-cell Raman spectral datasets, Gaussian noise-based spectra augmentation was employed to significantly enlarge and enrich them. When the adaptive KAN was introduced into the dense layer of the CNN, the prediction accuracy for five Bacillus spore species reached 97.80% ± 1.79%. Moreover, the KAN-guided CNN configuration shows strong robustness and generalization ability, providing a prediction accuracy of 96% on an independent spectral dataset. To determine the classification contribution of each Raman band, a method that blocks individual Raman bands was proposed. The Raman band located at 1657 cm−1, belonging to the amide I vibration of protein, was identified as the dominant contributor, surpassing two Raman bands belonging to Ca-DPA at 1576 cm−1 and 1449 cm−1. The KAN-guided CNN configuration combined with LTRS therefore shows great promise for determining microbial identity, especially for unculturable microorganisms.
Conventional approaches to spore identification rely heavily on phenotypic characterization. Standard protocols involve culturing spores on solid agar media, followed by morphological and physiological analysis performed by skilled staff.8,9 In addition, molecular techniques including species-specific genetic probes and sequence-based assays have enhanced the discrimination accuracy of Bacillus spores.10,11 Moreover, both biosensors12 and ratiometric fluorescence nanoprobes13 have been proposed, utilizing engineered molecular recognition for targeted detection. Although the reliability of these mature methods has been verified, they still have inherent limitations. For example, phenotypic characterization requires a long incubation period and is therefore unsuitable for non-culturable spore species. In addition, biochemical assays usually require destructive sampling, making them unsuitable for living cell characterization. Moreover, both biosensor and nanoprobe designs require intricate preparation processes. Finally, it is almost impossible to achieve single-cell resolution with conventional platforms. Therefore, it is of great significance to develop a non-invasive, rapid, and high-precision characterization strategy at the single-cell level.
Confocal Raman spectroscopy has been considered as a promising solution for revealing rich information about the biological composition in living cells through identifying molecular fingerprint vibrations. However, confocal Raman spectroscopy usually suffers from low signal-to-noise ratios (SNRs) of an individual cell. Fortunately, this drawback can be addressed by well-known laser tweezers Raman spectroscopy (LTRS), which integrates an optical trapping technology with confocal Raman spectroscopy. To overcome the signal fluctuation of Raman scattering caused by Brownian motion, the optical trapping technology can efficiently capture and stabilize individual microparticles, resulting in high-quality Raman scattering signals.14,15 To date, LTRS has been widely employed in revealing biomolecular information in bacteria, spores,16 organelles,17 and environmental pollutant microplastics.18 Although the high SNR Raman scattering of a single cell is possible, it remains to be improved in directly distinguishing similar species, due to the spectral similarity originating from the overlapped biomolecular composition. Moreover, raw Raman spectra usually contain background noise, requiring a data processing algorithm to extract discriminative spectral features.
Conventional chemometric approaches, including hierarchical cluster analysis (HCA), linear discriminant analysis (LDA), and principal component analysis (PCA), were first proposed for the species classification of Raman spectra, relying on statistical models to reduce the data dimensionality.19,20 However, they cannot efficiently process high-dimensional datasets. Subsequently, early machine learning algorithms such as support vector machines (SVM) and Random Forest were developed to learn from large spectral datasets, significantly enhancing the analytical efficiency.19,21,22 However, they also have their own drawbacks. For example, SVM is better suited to binary classification tasks than to multiclass tasks, and Random Forest may neglect critical spectral features in the ensemble learning process. Based on these observations, neither conventional chemometrics nor early machine learning approaches can satisfy the demand for accurately classifying similar species.
In recent years, deep learning, an advanced branch of machine learning, has brought significant advancements in Raman spectral analysis. Among various networks, the CNN is the most widely employed to hierarchically extract spectral features from massive datasets, enabling it to automatically identify discriminative patterns. To date, the excellent identification capability of the CNN model has facilitated the precise prediction of marine bacteria and pathogens,23,24 and has gradually been extended to cancer diagnostics.25–27 However, a standalone CNN exhibits limited feature extraction capability when processing highly similar datasets. The reason is that the multilayer perceptron (MLP) widely employed in the dense layer of the CNN model relies on fixed activation functions, which limits its feature extraction capability. Fortunately, an adaptive network called KAN shows strong learning capability through the use of learnable activation functions. Therefore, a combined configuration incorporating KAN into a CNN has shown enhanced flexibility in capturing nonlinear data patterns.28,29
Inspired by these observations, we proposed a novel strategy to precisely identify living Bacillus spores using a single-cell LTRS platform combined with a KAN-guided CNN architecture. Firstly, the original single-cell Raman spectra from five Bacillus spore species were collected by our home-built LTRS platform. To overcome the small size of the original single-cell Raman spectra datasets, a Gaussian noise-based data augmentation approach was employed to significantly enlarge them. Afterwards, the classification performance of the KAN-guided CNN architecture was optimized by tuning the number of CNN training iterations. Finally, through the introduction of a blocking band algorithm, the classification weight of each Raman band was systematically evaluated. The main innovations of our proposed method can be summarized in three aspects: (1) our home-built LTRS platform can acquire non-invasive and high-quality Raman scattering signals from Bacillus spores at the single-cell level; (2) a novel deep learning architecture integrating CNN with KAN was developed to achieve Bacillus spore identification with high prediction accuracy, providing better performance than conventional machine learning models; (3) beyond achieving high prediction accuracy, an interpretable blocking band algorithm was introduced to explain the classification process of the model and provide biologically interpretable insights.
To reduce spore heterogeneity within the same Bacillus strain, Percoll discontinuous density gradient centrifugation was employed to synchronize the vegetative Bacillus cells. Afterwards, agar plates containing nutrient broth were employed to cultivate the vegetative Bacillus cells. Two different nutrient broths were involved in the spore culture process. Nutrient broth (CM1168; Thermo Scientific, UK) was employed to incubate Bacillus subtilis, Bacillus pumilus, and Bacillus cereus seed cells, whereas Bacillus marisflavi and Bacillus aryabhattai seed cells were cultured in marine nutrient broth 2216 (Becton Dickinson, USA). To enhance the spore yield, 10 μL of MnSO4 solution (50 g L−1, ≥99.5% purity; Tianjin Kermel Chemical Reagent Co., China) was injected into 100 mL of nutrient broth supplemented with 1.5% agar, because Mn2+ supplementation significantly improves the sporulation yield in Bacillus species.30 Prior to cell inoculation, the agar plates were sterilized by autoclaving. Then, the vegetative seed cells from the five Bacillus strains were spread onto five sterilized agar plates, and all plates were placed in a culture incubator at 37 °C. After a culture period of 120 hours, many Bacillus colonies had formed on the agar plates. To obtain synchronized spores, a single colony of each Bacillus strain was selected and further purified by high-speed centrifugation at 5000 rpm for 25 minutes. Next, the spore precipitate was washed three times by high-speed centrifugation. Finally, the mature spores were sealed and stored in sterile ultrapure water at 4 °C.
In addition, the laser spot focused on the spore sample had a diameter of 1.5 μm, and the laser power on the spore sample was measured to be 3.8 mW. Our previous study has shown that a laser power of 3.8 mW has little to no effect on the Raman spectra of Bacillus spores.31 For each species of Bacillus spore, approximately 150 single-cell Raman spectra of trapped spores were measured, with an acquisition time of 10 s per spore. These spectra were collected from two independent spore culture runs: 100 spectra from the first run and 50 from the second. The background Raman spectra were also measured, facilitating the extraction of the original Raman spectra of the trapped spores.
Data augmentation strategies such as Gaussian noise injection, spectra shifting, and spectra combination have been proposed to enlarge the size of datasets.32,33 An obvious advantage of spectral augmentation is that it significantly reduces the time required for measuring the spectra.
In our study, a controlled Gaussian noise injection strategy was employed to augment both the size and diversity of the single-cell Raman spectra datasets. In brief, the Gaussian noise was extracted by estimating the standard deviation at each wavenumber point across the original Raman spectra. This process was defined by the equation: $y'(\nu) = y(\nu) + \varepsilon$, where $\varepsilon \sim \mathcal{N}\left(0, (\alpha\,\sigma(\nu))^2\right)$. Here, $\varepsilon$ represents the Gaussian noise term drawn from a normal distribution with zero mean and standard deviation $\alpha\,\sigma(\nu)$. The parameter $\alpha$ denotes the noise intensity level, and $\sigma(\nu)$ stands for the standard deviation of spectral intensity at wavenumber $\nu$ computed across the entire training dataset.
Afterwards, the simulated Gaussian noise was injected into the original spectra data. Therefore, the new single-cell Raman spectra dataset was composed of two parts: the original spectra and the newly generated simulated Raman spectra. Specifically, noise at 10% of the standard deviation (α = 0.1) was injected into the original spectra data to form the new single-cell Raman spectra datasets. The single-cell Raman spectra datasets obtained with various degrees of spectral augmentation are presented in Table S1. For each augmentation operation, in addition to the original single-cell Raman spectra from each Bacillus species, there was another set of simulated Raman spectra.
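The noise injection described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `augment_with_gaussian_noise`, the `n_copies` parameter, and the synthetic toy spectra are assumptions introduced here.

```python
import numpy as np


def augment_with_gaussian_noise(spectra, alpha=0.10, n_copies=1, rng=None):
    """Return noisy copies of `spectra` following y'(v) = y(v) + eps.

    spectra : array of shape (n_spectra, n_wavenumbers).
    alpha   : noise intensity level (0.10 corresponds to the 10% setting).
    n_copies: simulated spectra generated per original spectrum
              (hypothetical parameter, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    spectra = np.asarray(spectra, dtype=float)
    # sigma(v): standard deviation of the intensity at each wavenumber point.
    sigma = spectra.std(axis=0)
    # Repeat each original spectrum, then add eps ~ N(0, (alpha * sigma(v))^2).
    base = np.repeat(spectra, n_copies, axis=0)
    noise = rng.normal(size=base.shape) * (alpha * sigma)
    return base + noise


# Toy data: 20 synthetic "spectra" with 50 wavenumber points each.
rng = np.random.default_rng(0)
original = rng.normal(loc=100.0, scale=5.0, size=(20, 50))
simulated = augment_with_gaussian_noise(original, alpha=0.10, n_copies=3, rng=rng)
augmented_set = np.vstack([original, simulated])  # original + simulated spectra
```

Scaling standard normal draws by `alpha * sigma` keeps the perturbation at each wavenumber proportional to the variability observed there, which matches the definition of ε given above.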
It is worth noting that an MLP is usually used as the dense layer of the CNN model. As shown in Fig. S1, the MLP consists of interconnected neurons, where each neuron processes its input through learnable weights that determine the influence of each specific input on the output. These weights, in combination with fixed activation functions, transform the input into an intermediate output, and the final output depends strongly on the CNN training. During the CNN training, the MLP enhances the identification ability by iteratively optimizing the weights. However, the MLP layer also has inherent limitations. For example, the high dependency on both fixed activation functions and learnable weights restricts the ability of the CNN model to fit complex data patterns, resulting in low feature discrimination.34
To overcome the drawbacks of MLP, the KAN algorithm was proposed to replace MLP by introducing a paradigm shift, as illustrated in Fig. 2. In the KAN model, each input neuron is required to pass through adaptive functions, and the outputs are performed through a sum operation rather than linear weight multiplication in MLP, as shown in Fig. S1. The structure development enables KAN to represent a multivariate continuous function with superior flexibility, especially for complex spectral patterns. In brief, the calculation formulation of KAN is defined as:
[Equation (1): the equation image is not available in this text.]
$$\phi(x) = w_b\,b(x) + w_s\,\mathrm{spline}(x) \tag{2}$$
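To make eqn (2) concrete, the following sketch builds a small KAN-style layer in NumPy in which every input-to-output edge carries its own activation ϕ(x) = w_b·b(x) + w_s·spline(x), and each output is the sum of its edge activations. Several choices are assumptions rather than details confirmed by the text: SiLU is used for b(x), the spline is a B-spline on a uniform grid evaluated with the Cox-de Boor recursion, and the names `KANLayer`, `bspline_basis`, `grid_size`, and `x_range` are hypothetical. No training loop is included.

```python
import numpy as np


def silu(x):
    # Basis function b(x); SiLU is an assumption made for this sketch.
    return x / (1.0 + np.exp(-x))


def bspline_basis(x, knots, order):
    """Evaluate all B-spline basis functions of degree `order` at points x
    (Cox-de Boor recursion). Returns shape (len(x), len(knots) - order - 1)."""
    x = np.asarray(x, dtype=float)[:, None]
    # Degree 0: indicator of each knot interval [t_i, t_{i+1}).
    basis = ((x >= knots[:-1]) & (x < knots[1:])).astype(float)
    for k in range(1, order + 1):
        left = (x - knots[:-(k + 1)]) / (knots[k:-1] - knots[:-(k + 1)])
        right = (knots[k + 1:] - x) / (knots[k + 1:] - knots[1:-k])
        basis = left * basis[:, :-1] + right * basis[:, 1:]
    return basis


class KANLayer:
    """One layer of learnable edge activations (forward pass only)."""

    def __init__(self, n_in, n_out, grid_size=5, order=3,
                 x_range=(-1.0, 1.0), rng=None):
        rng = np.random.default_rng() if rng is None else rng
        h = (x_range[1] - x_range[0]) / grid_size
        # Uniform knot vector, extended by `order` intervals on both sides.
        self.knots = x_range[0] + h * np.arange(-order, grid_size + order + 1)
        self.order = order
        n_basis = grid_size + order
        self.w_b = rng.normal(size=(n_out, n_in))            # weight of b(x)
        self.w_s = np.ones((n_out, n_in))                     # weight of spline(x)
        self.coef = 0.1 * rng.normal(size=(n_out, n_in, n_basis))

    def __call__(self, x):
        batch, n_in = x.shape
        basis = bspline_basis(x.ravel(), self.knots, self.order)
        basis = basis.reshape(batch, n_in, -1)
        # spline(x) on every edge: sum_k coef[o, i, k] * B_k(x_i)
        spline = np.einsum("bik,oik->boi", basis, self.coef)
        phi = self.w_b * silu(x)[:, None, :] + self.w_s * spline
        # Each output sums its incoming edge activations.
        return phi.sum(axis=2)


layer = KANLayer(n_in=4, n_out=3, rng=np.random.default_rng(0))
x = np.random.default_rng(1).uniform(-1.0, 1.0, size=(8, 4))
out = layer(x)  # shape (8, 3)
```

In contrast to an MLP, where a fixed nonlinearity follows a weighted sum, the learnable parameters here (`w_b`, `w_s`, `coef`) shape the activation on each edge. The `order` argument corresponds to the spline order that is tuned later in the text.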
Here, P_O denotes the classification accuracy obtained with the whole original spectra dataset, and P_B represents the classification accuracy obtained after blocking the target band. To comprehensively evaluate all the typical Raman bands, a sliding window with an identical range of 200 cm−1 was gradually moved across all the intervals. Finally, the relative contribution of each Raman band was averaged over 10 independent runs to mitigate stochastic variations.
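The band-blocking idea can be illustrated with a short sketch. The defining equation is not reproduced in this text, so the sketch assumes that the contribution is the accuracy drop P_O − P_B and that "blocking" means setting the intensities inside the band to zero; both are assumptions. The nearest-template classifier and the synthetic two-class data are hypothetical stand-ins for the trained model and the measured spectra.

```python
import numpy as np


def block_band(spectra, wavenumbers, lo, hi):
    """Return a copy of `spectra` with intensities in [lo, hi] cm^-1 set to zero."""
    blocked = np.array(spectra, dtype=float, copy=True)
    mask = (wavenumbers >= lo) & (wavenumbers <= hi)
    blocked[:, mask] = 0.0
    return blocked


def band_contribution(predict, spectra, labels, wavenumbers, lo, hi):
    """Accuracy drop P_O - P_B when the band [lo, hi] is blocked (assumed definition)."""
    p_o = np.mean(predict(spectra) == labels)
    p_b = np.mean(predict(block_band(spectra, wavenumbers, lo, hi)) == labels)
    return p_o - p_b


# Synthetic two-class data that differ only in the 1625-1716 cm^-1 interval.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(600.0, 1800.0, 241)
amide_mask = (wavenumbers >= 1625) & (wavenumbers <= 1716)
templates = np.ones((2, wavenumbers.size))
templates[1, amide_mask] += 1.0
labels = np.repeat([0, 1], 40)
spectra = templates[labels] + rng.normal(scale=0.05, size=(80, wavenumbers.size))


def predict(x):
    # Stand-in classifier: assign each spectrum to the nearest class template.
    dist = ((x[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


c_amide = band_contribution(predict, spectra, labels, wavenumbers, 1625, 1716)
c_660 = band_contribution(predict, spectra, labels, wavenumbers, 649, 687)
```

In this toy setting, blocking the interval that carries the class difference lowers the accuracy, whereas blocking an interval in which the classes are identical leaves the predictions unchanged. Repeating the calculation over several trained models and averaging would mirror the 10-run averaging described above.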
| Raman band (cm−1) | Assignment | Vibrational mode |
|---|---|---|
| 660 | Ca-DPA | Bending vibration of C–C in pyridine ring |
| 826 | Ca-DPA | Out-of-plane deformation of C–H |
| 1017 | Ca-DPA | Symmetric stretching of pyridine ring |
| 1250 | Amide III | C–N stretching and N–H bending |
| 1397 | Ca-DPA | Symmetric stretching of O–C–O |
| 1449 | Lipids and Ca-DPA | Symmetric bending of C–H in pyridine ring |
| 1576 | Ca-DPA | Asymmetric stretching of O–C–O |
| 1657 | Amide I | Amide I vibration of C=O |
| Data volume | CNN | KAN-guided CNN |
|---|---|---|
| Without data augmentation | 89.20% ± 4.21% | 91.80% ± 3.94% |
| With data augmentation by 200 | 91.60% ± 5.37% | 93.70% ± 3.38% |
| With data augmentation by 300 | 95.87% ± 1.07% | 97.80% ± 1.79% |
| With data augmentation by 400 | 96.10% ± 1.34% | 97.91% ± 1.92% |
| With data augmentation by 500 | 96.52% ± 1.73% | 98.15% ± 2.21% |
To address the spectral data scarcity, Gaussian noise-based spectra augmentation was employed to enlarge the single-cell Raman spectra datasets, using the standard deviation extracted from the original Raman spectra dataset. By injecting Gaussian noise at 10% of the standard deviation into the original Raman spectra, newly simulated single-cell Raman spectra were formed. It is worth noting that the single-cell Raman spectral datasets employed for CNN training contain two parts, the original and the simulated Raman spectral datasets, which significantly enhanced the spectral diversity. Importantly, simulated Raman spectra generated from the validation or testing sets must not be mixed into the training set; otherwise, the risk of data leakage increases. Therefore, only simulated Raman spectra generated from the training set should be employed in the training process.
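One way to respect this leakage constraint is to split the measured spectra first and to augment only the training split afterwards. The sketch below illustrates that ordering; the 105/22/23 split, the synthetic data, and all variable names are assumptions introduced for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
spectra = rng.normal(100.0, 5.0, size=(150, 40))  # toy stand-in for measured spectra
labels = rng.integers(0, 5, size=150)             # five species labels

# 1) Split the ORIGINAL spectra first (hypothetical 105/22/23 split).
order = rng.permutation(len(spectra))
train_idx, val_idx, test_idx = np.split(order, [105, 127])

# 2) Estimate sigma and generate simulated spectra from the training split only.
x_train, y_train = spectra[train_idx], labels[train_idx]
sigma = x_train.std(axis=0)
x_sim = x_train + rng.normal(size=x_train.shape) * (0.10 * sigma)

x_train_aug = np.vstack([x_train, x_sim])
y_train_aug = np.concatenate([y_train, y_train])

# 3) Validation and test sets remain purely experimental spectra.
x_val, y_val = spectra[val_idx], labels[val_idx]
x_test, y_test = spectra[test_idx], labels[test_idx]
```

Because both the noise scale `sigma` and the simulated spectra are derived from the training split alone, no information from the validation or test spectra reaches the model during training.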
To validate the reliability of newly simulated single-cell Raman spectra, the comparison between the original and newly simulated Raman spectra datasets was plotted, as shown in Fig. 4 and Fig. S4.
The global distribution of the newly simulated Raman spectra is well aligned with that of the original Raman spectra, confirming that the Gaussian noise-based spectra augmentation was effective without distorting critical spectral features. Here, the choice of a 10% noise intensity was guided by the relationship between the spectral intensity and the perturbation magnitude. Gaussian noise was assumed to introduce deviations proportional to the original signal intensity. This means that high-intensity Raman bands are likely to experience larger spectral variations. However, these Raman bands inherently exhibit stronger signals, making them more resilient to noise introduction. Conversely, low-intensity Raman bands holding subtle discriminative features are highly sensitive to perturbations of the spectral signal, and even minor noise could mask these subtle features. In this study, the effects of different levels of Gaussian noise were also studied, as shown in Table S3. When the Gaussian noise level was set to 10%, the prediction accuracy was the highest. Therefore, a noise level of 10% of the standard deviation was chosen to balance the spectra augmentation outcome and feature preservation.
Next, the effect on the prediction accuracy of integrating a KAN layer into the CNN was systematically studied, as shown in Table 2. Compared with the average prediction accuracy of 89.20% ± 4.21% provided by the single CNN, the KAN-guided CNN platform achieved a higher prediction accuracy of 91.80% ± 3.94% due to the introduction of the KAN layer. More importantly, with further assistance of the Gaussian noise-based augmentation technique, the prediction accuracy of the KAN-guided CNN platform remained higher than that of the single CNN model at every augmentation level. When the newly simulated Raman spectral datasets were further augmented by 300, the prediction accuracy of the KAN-guided CNN platform was as high as 97.80% ± 1.79%, indicating that it can identify the five Bacillus species with high accuracy. The high prediction accuracy of the KAN-guided CNN platform was attributed to a synergistic mechanism. Firstly, the Gaussian noise-based augmentation enriched the spectral diversity, enabling the CNN model to generalize across the subtle interspecies variations. Secondly, the adaptive activation functions of the KAN layer can efficiently capture the nonlinear relationships in spectral datasets, overcoming the limitation of the MLP in the dense layer. To further assess the performance of the Gaussian noise-based spectra augmentation, comparative studies were performed with three other augmentation strategies, namely baseline variations, Poisson noise and GAN spectra, as shown in Table S4. The resulting accuracies are comparable. However, the baseline variation method may distort significant Raman features. For Poisson noise, the scaling factor must be adjusted dynamically to guarantee that the noise is effective at each wavelength. Finally, GAN spectra augmentation carries a heavy computational load, and the convergence of the GAN model is difficult to achieve.36
The effects of different spline orders were studied to determine the optimal parameter for KAN, as shown in Table S5. When the spline order of KAN was set to 10, the prediction accuracy was the highest. Then, the iteration dynamics of the KAN-guided CNN platform were monitored using two loss functions and the Adam optimization algorithm (β1 = 0.9, β2 = 0.999, and learning rate = 0.001), with the KAN spline order fixed at 10. Next, the KAN-guided CNN platform was further optimized by varying the number of epochs from 10 to 500, as shown in Fig. S5. When the number of epochs was smaller than 300, both the training and validation curves fluctuated significantly, indicating that the KAN-guided CNN platform was underfitting. However, when the number of epochs was larger than 300, both the training and validation curves were almost stable, as shown in Fig. 5(A) and Fig. S5(D–F). Moreover, the optimized accuracies on both the training and validation datasets were approximately 0.98. More importantly, the KAN-guided CNN platform demonstrated strong robustness, achieving a prediction accuracy of 97.80% ± 1.79% for all five Bacillus species over at least 10 independent runs, as shown in Table S6. Interestingly, Fig. 5(B) shows that for a single independent run, the prediction accuracies on the original and simulated spectra are almost the same, indicating that the simulated Raman spectra were well aligned with the experimental Raman spectra. In addition, the misclassified Raman spectra were extracted, as shown in Fig. S7. The Raman bands at 1657 cm−1 of the misclassified Raman spectra (B. 2430) are highly similar to those of B. pumilus, causing the misclassification by the KAN-guided CNN platform. Moreover, the receiver operating characteristic (ROC) curves are shown in Fig. 5(C).
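For readers unfamiliar with the optimizer settings quoted above, the following sketch shows a single Adam update written in NumPy with β1 = 0.9, β2 = 0.999 and a learning rate of 0.001. The stabilizing constant `eps = 1e-8` is the commonly used default and is an assumption here, as is the toy quadratic objective; this is not the authors' training code.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)              # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Toy objective: minimize sum(w**2), whose gradient is 2 * w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
w_after_first_step = None
for t in range(1, 301):
    w, m, v = adam_step(w, 2.0 * w, m, v, t)
    if t == 1:
        w_after_first_step = w.copy()
```

With bias correction, the first step moves every parameter by almost exactly the learning rate, which is why a rate of 0.001 produces small, steady updates and many epochs may be needed before the loss curves stabilize.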
The per-class precision, recall and F1 scores are also given in Table S7. The true positive rates for the five Bacillus species are approximately equal to 1 and the F1 scores are larger than 90%, indicating that our proposed KAN-guided CNN platform has high sensitivity and precision for every species.
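Per-class precision, recall and F1 scores of this kind are derived from the confusion matrix. The sketch below shows the calculation on a small set of made-up labels; the function name `per_class_metrics` and the example labels are hypothetical and unrelated to the values in Table S7.

```python
import numpy as np


def per_class_metrics(y_true, y_pred, n_classes):
    """Return per-class precision, recall and F1 as arrays of length n_classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # rows: true class, columns: predicted class
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)          # spectra assigned to each class
    actual = cm.sum(axis=1)             # spectra truly belonging to each class
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1


# Made-up labels for three classes, for illustration only.
y_true = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 1, 1, 2, 2, 2, 0])
precision, recall, f1 = per_class_metrics(y_true, y_pred, n_classes=3)
```

Recall is the true positive rate (sensitivity) of a class, while precision measures how many spectra assigned to that class truly belong to it; F1 is their harmonic mean.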
To further verify the robustness and generalization of the KAN-guided CNN platform, a newly constructed single-cell Raman spectral dataset containing 250 spectra from newly cultivated Bacillus spores was collected as an independent validation dataset. In particular, the dataset was derived from newly cultured strains grown in separate batches on different dates, distinct from those used for model training. Over 10 independent runs, the optimal KAN-guided CNN platform provided a prediction accuracy of 96.00% ± 0.63%, as shown in Fig. 5(D) and Table S8. This result shows that our proposed KAN-guided CNN model generalizes well to spectra from independent cultures.
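Accuracies reported as mean ± standard deviation over repeated runs can be summarized as sketched below. The ten accuracy values are placeholders chosen for illustration and are not the results in Table S8; the use of the sample standard deviation (`ddof=1`) is also an assumption, since the text does not state which estimator was used.

```python
import numpy as np

# Placeholder accuracies from 10 hypothetical independent runs (not measured data).
run_accuracy = np.array([0.96, 0.95, 0.97, 0.96, 0.96, 0.95, 0.97, 0.96, 0.97, 0.95])

mean_acc = run_accuracy.mean()
std_acc = run_accuracy.std(ddof=1)  # sample standard deviation (assumed estimator)
summary = f"{100 * mean_acc:.2f}% ± {100 * std_acc:.2f}%"
```

Reporting the spread alongside the mean makes it possible to judge whether differences between models exceed the run-to-run variation.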
To assess the performance of the KAN-guided CNN platform against alternatives, comparative studies were performed with four conventional machine learning models and two advanced deep learning models, as shown in Fig. S6 and Table 3. Among the four conventional machine learning models, Random Forest achieved the highest accuracy of 88.89% ± 1.09%, followed by SVM (88.59% ± 1.06%), XGBoost (88.22% ± 1.99%), and K-nearest neighbors (KNN, 84.89% ± 2.57%). For the two advanced deep learning models, the prediction accuracy of ResNet was 96.83% ± 1.25%, and that of Transformer was 96.40% ± 1.78%. The prediction accuracy of the two advanced deep learning models is slightly higher than that of the single CNN model, but lower than that of the KAN-guided CNN model. Thus, our proposed KAN-guided CNN platform provided the highest prediction accuracy among the compared models.
| Classifier | Prediction accuracy |
|---|---|
| KAN-guided CNN | 97.80% ± 1.79% |
| ResNet | 96.83% ± 1.25% |
| Transformer | 96.40% ± 1.78% |
| Single CNN | 95.87% ± 1.07% |
| Random Forest | 88.89% ± 1.09% |
| SVM | 88.59% ± 1.06% |
| XGBoost | 88.22% ± 1.99% |
| KNN | 84.89% ± 2.57% |
As shown in Table 4 and Fig. 6, the relative weights of band-specific contributions were quantitatively determined. Most prominently, the Raman band at 1657 cm−1 was determined as the highest contributor (3.26%), surpassing the second highest contributor at 1576 cm−1 and the third highest contributor at 1449 cm−1, with contributions of 3.04% and 2.89%, respectively. In addition, the two intervals containing Raman bands at 1397 and 1017 cm−1 exhibited moderate contributions. Finally, the intervals containing the Raman bands at 660, 826 and 1250 cm−1 made the smallest contributions to the classification. In general, the spectral intervals with higher signal intensities were correlated with greater contributions. However, the correlation may deviate when the spectral variability outweighs the intensity. For example, the Raman band at 1657 cm−1 with modest signal intensity demonstrated a high classification contribution. In fact, the interspecies variability reflects the significant differences observed in the Raman spectra from various Bacillus species, as shown in Fig. 3 and Fig. S2. Thus, the classification efficacy is determined not only by peak intensity, but also by the interplay between spectral features and inter-species differences.
| Blocked Raman band (cm−1) | Prediction accuracy | Classification contribution |
|---|---|---|
| 649–687 containing 660 | 98.33% | 1.67% |
| 798–849 containing 826 | 97.33% | 2.67% |
| 982–1062 containing 1017 | 97.25% | 2.75% |
| 1062–1357 containing 1250 | 99.08% | 0.92% |
| 1358–1421 containing 1397 | 97.18% | 2.82% |
| 1421–1515 containing 1449 | 97.11% | 2.89% |
| 1515–1625 containing 1576 | 96.96% | 3.04% |
| 1625–1716 containing 1657 | 96.74% | 3.26% |
Footnote
† These authors contributed equally to this work.
This journal is © The Royal Society of Chemistry 2026