Open Access Article
Bo Zhou,ab Yu-Kai Tong,b Ru Zhang*a and Anpei Ye*b
aSchool of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China. E-mail: ruzhang@bupt.edu.cn
bKey Laboratory for the Physics and Chemistry of Nanodevices, School of Electronics, Peking University, Beijing 100871, China. E-mail: yap@pku.edu.cn
First published on 16th September 2022
Raman spectroscopy combined with a convolutional neural network (CNN) enables rapid and accurate identification of bacterial species. However, existing CNNs require complex hyperparameter design. Herein, we propose RamanNet, a new and simple network architecture with fewer hyperparameters and a low computational cost, for rapid and accurate identification of bacteria at the species level based on their Raman spectra. Compared with previous CNN methods, RamanNet reached comparable results on the Bacteria-ID and PKU-bacterial Raman spectral datasets while using only about 1/45 and 1/297 of the network parameters, respectively. On the Bacteria-ID dataset, RamanNet achieved an average isolate-level accuracy of 84.7 ± 0.3%, an antibiotic treatment identification accuracy of 97.1 ± 0.3%, and an accuracy of 81.6 ± 0.9% in distinguishing methicillin-resistant and -susceptible Staphylococcus aureus (MRSA and MSSA). Moreover, it achieved an average accuracy of 96.04% on the PKU-bacterial dataset. Because it has few parameters, the RamanNet model can be trained quickly even on a CPU. Therefore, our method has the potential to rapidly and accurately identify bacterial species based on their Raman spectra and can be easily extended to other classification tasks based on Raman spectra.
Raman spectroscopy enables the identification of bacteria in a fast, non-destructive and label-free manner.7–11 The technology allows for the analysis of the molecular structure and chemical composition of substances, thus leading to significant progress in classifying components of complex mixtures.12,13 Different components in bacterial cells produce unique spectral fingerprints in Raman spectra, and different bacteria can be identified based on these fingerprints. Overall, Raman spectroscopy has significant potential to identify bacteria at the species level or the level of antibiotic resistance.10,14 In recent years, the use of neural networks to automatically extract features from Raman spectra has significantly improved the identification accuracy of pathogenic bacteria compared to traditional machine learning algorithms.8,9,15
To improve the classification accuracy of convolutional neural networks (CNNs) for bacterial Raman spectra, many attempts have focused on using dozens of filters,14,16–18 increasing the depth of the network,14,16 and adopting more advanced classification models, approaches drawn mainly from optimization methods in the image classification field. These methods often result in longer training times and more complex hyperparameter designs. However, a common but significant problem remains unresolved: existing CNNs are not explicitly designed for Raman spectral data, so their advantages may not be fully exploited. In previous studies, state-of-the-art CNN techniques from image classification, such as the residual network (ResNet), were used to classify low signal-to-noise ratio (SNR) Raman spectral data.14 Ho et al. used dozens of filters in a 26-layer CNN to achieve average isolate-level accuracies exceeding 82%. Maruthamuthu et al.16 classified the Raman spectra of 12 microbes with an accuracy exceeding 95% using an 18-layer ResNet with hundreds of filters in each layer. Liu et al.20 explored the rapid identification of 13 microorganisms by single-cell Raman spectroscopy (scRS) and achieved an average accuracy of 88.5 ± 4% using a 3-layer one-dimensional convolutional neural network (1DCNN) with dozens of convolution kernels. In fact, Raman spectra, as one-dimensional sequences without depth, are much simpler than images (multi-dimensional matrices). Therefore, such a complex network structure may not be required for identifying bacteria from their Raman spectra at the species level. Through extensive experiments, we demonstrated that for bacterial Raman spectra, two or three convolutional layers with one filter per layer are sufficient to achieve fast and highly accurate identification at the species level.
Herein, we propose a novel and simple CNN model named RamanNet for rapid and accurate identification of bacteria at the species level based on bacterial scRS. RamanNet consists of an initial convolution layer followed by two residual layers, a flatten layer, and a final fully connected classification layer, and each convolutional layer contains only one filter. We validated the performance of RamanNet on both the Bacteria-ID dataset and the PKU-bacterial dataset. RamanNet achieved an average isolate-level accuracy of 84.7 ± 0.3% and an antibiotic treatment identification accuracy of 97.1 ± 0.3% on the Bacteria-ID dataset. Moreover, RamanNet has 296.9 times fewer parameters than ResNet but achieved a comparable classification accuracy of 96.04% on the PKU-bacterial dataset. Owing to the optimized RamanNet structure, the model can be trained rapidly using only a CPU rather than a GPU. We believe this research will provide guidance and a reference for bacterial Raman spectral analysis using convolutional neural networks.
| Subset | Spectral number | Measure time (s) | Isolate classes |
|---|---|---|---|
| Reference | 60 000 | 1 | 30 |
| Fine-tune | 3000 | 2 | 30 |
| Test | 3000 | 2 | 30 |
The wavenumber range of the spectra in all three subsets is 381.98 to 1792.4 cm−1. The spectral integration times in the reference, fine-tuning, and test subsets are 1 s, 2 s, and 2 s, respectively. To keep the SNR (SNR = 4.1) consistent across subsets, the integration time for the fine-tuning and test subsets was increased from 1 s to 2 s to compensate for the degradation of the optical system efficiency. The full isolate information and the specific Raman experimental process for this dataset are described in ref. 14, and the entire dataset was downloaded from https://github.com/csho33/bacteria-ID/.
The raw data of each scRS were pre-processed using in-house code developed in MATLAB (2021b) as follows: removing cosmic rays by median filtering, subtracting the system background using polynomial baseline correction, smoothing with five-point smoothing, and normalizing by area. The normalized spectral data in the PKU-bacterial dataset have a higher SNR than those in the Bacteria-ID dataset owing to the longer spectral integration time, as exemplified by S. epidermidis (Fig. 1). The dimension of all spectra in this dataset is 861.
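The four pre-processing steps can be sketched in Python as follows. This is not the authors' MATLAB code: the median window (5 points), the polynomial order (3), the iterative clipping scheme used for the baseline, and the function name `preprocess` are all assumptions chosen for illustration, and NumPy is assumed to be available.

```python
import numpy as np

def preprocess(intensity, wavenumber, median_window=5, poly_order=3, n_iter=50):
    """Median filter -> polynomial baseline -> 5-point smoothing -> area normalization."""
    x = np.asarray(wavenumber, dtype=float)
    y = np.asarray(intensity, dtype=float)

    # 1) Cosmic-ray removal: sliding median (edges padded with the end values).
    half = median_window // 2
    padded = np.pad(y, half, mode="edge")
    y = np.array([np.median(padded[i:i + median_window]) for i in range(y.size)])

    # 2) Baseline: repeatedly fit a polynomial and clip the signal to the fit,
    #    so the curve settles under the peaks (one common variant, assumed here).
    xs = (x - x.mean()) / x.std()  # rescale x for a well-conditioned fit
    work = y.copy()
    for _ in range(n_iter):
        baseline = np.polyval(np.polyfit(xs, work, poly_order), xs)
        work = np.minimum(work, baseline)
    y = y - baseline

    # 3) Five-point moving-average smoothing (edge-padded to keep the length).
    y = np.convolve(np.pad(y, 2, mode="edge"), np.ones(5) / 5, mode="valid")

    # 4) Area normalization: trapezoidal area of |y| is scaled to 1.
    area = np.sum((np.abs(y[1:]) + np.abs(y[:-1])) / 2 * np.diff(x))
    return y / area
```

The output has the same length as the input, so an 861-point spectrum stays 861 points long.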
Fig. 1 A randomly selected spectrum of S. epidermidis from (A) Bacteria-ID dataset and (B) PKU-bacterial dataset. It can be seen that the SNR of the latter is higher than that of the former.
The details of the bacterial single-cell Raman spectroscopy measurement, data pre-processing, and data augmentation strategy were described in ref. 15.
The RamanNet consists of an initial convolution layer followed by two residual layers, a flatten layer, and a final fully connected classification layer (Fig. 2). Each residual layer contains a shortcut connection between the input and output of one convolutional layer, allowing for better gradient propagation and stable training.19 The number of convolution kernels in every convolutional layer is set to 1, and the convolution kernel sizes are 7 and 3, respectively. This simple model reduces the computational complexity of a traditional CNN so that it can be trained on the Bacteria-ID dataset within 20 minutes, even on a CPU (Table 3). We adopted Kaiming normal initialization21 and a “categorical_crossentropy” loss function. These architecture hyperparameters were selected via grid search using one training and validation split on the 30-isolates classification task.
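A rough parameter count for this layout can be sketched as below. Several details are assumptions rather than statements from the text: kernel size 7 is taken to belong to the initial convolution and 3 to the two residual convolutions, every layer is assumed to have a bias, the convolutions are assumed to use stride 1 with "same" padding (so the flattened length equals the input length), and Bacteria-ID spectra are assumed to have 1000 points. The helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels=1, out_channels=1, bias=True):
    """Weights (+ optional bias) of one 1D convolutional layer."""
    return kernel_size * in_channels * out_channels + (out_channels if bias else 0)

def ramannet_params(spectrum_length, n_classes, first_kernel=7, residual_kernel=3):
    """Parameter count: one initial conv, two residual convs, one dense layer."""
    convs = conv1d_params(first_kernel) + 2 * conv1d_params(residual_kernel)
    # Single channel + "same" padding: flatten yields spectrum_length features.
    dense = spectrum_length * n_classes + n_classes
    return convs + dense

print(ramannet_params(1000, 30))  # 30046 (Bacteria-ID, assumed 1000 points)
print(ramannet_params(861, 15))   # 12946 (PKU-bacterial, 861 points)
```

Under these assumptions the counts come out at roughly 0.030 M and 0.013 M, and almost all parameters sit in the fully connected layer.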
| Model | Accuracy (30-isolates task) | Accuracy (8-treatments task) | Accuracy (MRSA/MSSA) | Parameters (mega) | Computation parameters (MACs) |
|---|---|---|---|---|---|
| RamanNet | 84.7 ± 0.3% | 97.1 ± 0.3% | 81.6 ± 0.9% | 0.030 | 0.03 M |
| ResNet (ref. 14) | 82.2 ± 0.3% | 97.0 ± 0.3% | 89.1 ± 0.1% | 1.341 | 395 M |
| Calculating unit | Pretraining (min) | Finetuning (min) | Prediction (s) |
|---|---|---|---|
| CPU (Intel i7-8700) | 16.7 | 2.7 | 7.6 |
| GPU (NVIDIA GeForce GTX 1080) | 14.8 | 2.4 | 7.3 |
…:1 as training and validation split, then trained the RamanNet on the training split and validated its accuracy on the validation split to perform model selection; thus, five fine-tuned models were obtained. Finally, we evaluated and reported the average test accuracy of these five fine-tuned models on the test subset, which was gathered from independently cultured and prepared samples. In addition, the receiver operating characteristic (ROC) curve was used to verify the practicality of the ResNet model by plotting the true positive rate (TPR, sensitivity) versus the false-positive rate (FPR, 1 − specificity) (Fig. 4).
Fig. 4 ResNet discriminates the receiver operating characteristic (ROC) curve for 30-isolates Raman spectra. The average AUC values of 30 isolates are more than 0.98.
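As a minimal illustration of how a ROC curve and its AUC are obtained from classifier scores, the sketch below treats one class as positive and all others as negative (a one-vs-rest reading, which is an assumption here). The function names and the toy scores are hypothetical and unrelated to the reported results.

```python
def roc_points(scores, labels):
    """ROC points for binary labels (1 = positive class, 0 = all other classes)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sweep the decision threshold from the highest score downwards.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        last_of_tie = rank == len(order) - 1 or scores[order[rank + 1]] != scores[i]
        if last_of_tie:
            fpr.append(fp / n_neg)  # FPR = 1 - specificity
            tpr.append(tp / n_pos)  # TPR = sensitivity
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4]  # toy scores for the positive class
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_points(scores, labels)
print(round(auc(fpr, tpr), 3))  # 0.889
```

For a multi-class task, one such curve per class can be computed and the AUC values averaged.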
In this work, we used the stochastic gradient descent (SGD) optimizer with a learning rate of 0.001 and a batch size of 10, together with early stopping, to prevent overfitting. Pre-training was performed for only ten epochs, and fine-tuning for only 30 epochs.
…:1. The training subset is augmented, whereas the validation subset is not. We adopted the following data augmentation strategies: (1) randomly shifting each spectrum left or right by a few wavenumbers, (2) adding 1% Gaussian noise to the spectrum at each wavenumber, and (3) linearly combining all spectra from the same bacterial species with randomly generated coefficients, thereby obtaining the augmented training subset for each species.
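The three augmentation strategies can be sketched as follows. The text leaves several details open, so the choices here are assumptions: the shift is at most 3 points with edge values repeated, "1% Gaussian noise" is read as a standard deviation of 1% of the intensity at each point, and the random combining coefficients are normalized to sum to 1. All function names are hypothetical.

```python
import random

def random_shift(spectrum, max_shift=3, rng=random):
    """Shift left or right by a few points, repeating the edge value."""
    s = rng.randint(-max_shift, max_shift)
    if s > 0:   # shift right
        return [spectrum[0]] * s + spectrum[:-s]
    if s < 0:   # shift left
        return spectrum[-s:] + [spectrum[-1]] * (-s)
    return list(spectrum)

def add_noise(spectrum, level=0.01, rng=random):
    """Add zero-mean Gaussian noise scaled to 1% of each intensity."""
    return [v + rng.gauss(0.0, level * abs(v)) for v in spectrum]

def combine(spectra, rng=random):
    """Random linear combination of spectra from the same species."""
    weights = [rng.random() for _ in spectra]
    total = sum(weights)
    weights = [w / total for w in weights]  # coefficients sum to 1 (assumption)
    n = len(spectra[0])
    return [sum(w * s[k] for w, s in zip(weights, spectra)) for k in range(n)]
```

Passing a seeded `random.Random` instance as `rng` makes the augmented set reproducible.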
Fig. 5 (a) Confusion matrix of MSSA/MRSA; (b) ROC curve of MSSA/MRSA, the average AUC values are 0.90.
These training and validation processes were repeated on the training set five times to optimize the model. The model with the highest accuracy among the five optimized models was taken as the optimal trained model. Subsequently, we evaluated the optimal trained RamanNet model on the test set and reported the confusion matrix of the 15 bacterial species (Fig. 6) and the ROC curve (Fig. 7).
Fig. 6 Confusion matrix for 15 bacterial species predicted by RamanNet. Blue represents Gram-negative bacteria, and red represents Gram-positive bacteria.
Fig. 7 RamanNet discriminates the receiver operating characteristic (ROC) curve for 15 species of bacterial scRS. The average AUC value of 15 species is more than 0.99.
In this experiment, we used the SGD optimizer with a learning rate of 0.001 and a batch size of 6. Training ran for up to 500 epochs, and early stopping was also used to prevent overfitting: if the accuracy on the validation subset does not rise for 100 consecutive epochs, the training process ends.
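The stopping rule described here (at most 500 epochs, stop after 100 consecutive epochs without a rise in validation accuracy) can be sketched as a small loop. The callable `run_epoch` is a hypothetical stand-in for one epoch of training followed by validation; it is assumed to return the validation accuracy.

```python
def fit(run_epoch, max_epochs=500, patience=100):
    """Train until max_epochs or until validation accuracy stalls for `patience` epochs.

    Returns (last_epoch_run, best_epoch, best_accuracy).
    """
    best_acc, best_epoch, stale = float("-inf"), 0, 0
    last_epoch = 0
    for epoch in range(1, max_epochs + 1):
        last_epoch = epoch
        acc = run_epoch(epoch)  # hypothetical: train one epoch, return val accuracy
        if acc > best_acc:
            best_acc, best_epoch, stale = acc, epoch, 0
        else:
            stale += 1          # no rise in validation accuracy this epoch
        if stale >= patience:
            break
    return last_epoch, best_epoch, best_acc

# Toy accuracy curve that improves until epoch 50 and then plateaus.
print(fit(lambda epoch: min(epoch, 50) / 100))  # (150, 50, 0.5)
```

In practice the model weights from `best_epoch` would be kept, which is a common convention rather than something the text specifies.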
Ibtehaz et al. proposed that the translational invariance of CNNs limits their application in Raman spectral classification, and used shifted multi-layer perceptrons to emulate stacked convolutional layers when analyzing shifted windows of Raman spectra.27 We argue that translational invariance arises from multiple convolutional operations combined with the global pooling operation. Our proposed RamanNet uses only two convolutional layers with a single channel (each convolutional layer has only one convolution kernel) and discards the global pooling operation, which greatly reduces both the translation invariance and the model complexity.
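The role of global pooling in this argument can be illustrated with a toy example. It is a general property of convolutions, not a reproduction of RamanNet: a single-filter convolution shifts its output along with the input, global max pooling then discards the position, whereas a flatten-plus-dense readout remains sensitive to it. The signal, kernel, and dense weights below are arbitrary illustrative values.

```python
def conv1d(signal, kernel):
    """'Valid' 1D cross-correlation with a single filter."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def make_peak(center, length=20):
    """Toy spectrum: one small peak on a zero background."""
    s = [0.0] * length
    s[center - 1], s[center], s[center + 1] = 1.0, 2.0, 1.0
    return s

kernel = [1.0, 2.0, 1.0]
shift = 3
a = conv1d(make_peak(9), kernel)           # feature map of the original peak
b = conv1d(make_peak(9 + shift), kernel)   # feature map of the shifted peak

# Convolution alone: the feature map moves with the input.
equivariant = b[shift:] == a[:-shift]
# Global max pooling: the peak position is discarded.
pooled_equal = max(a) == max(b)
# Flatten + dense (fixed arbitrary weights): the output depends on position.
dense_w = [float(i) for i in range(len(a))]
dense_a = sum(w * v for w, v in zip(dense_w, a))
dense_b = sum(w * v for w, v in zip(dense_w, b))

print(equivariant, pooled_equal, dense_a, dense_b)  # True True 128.0 176.0
```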
On the 30-class task, the average isolate-level accuracy is 84.7 ± 0.3% (Fig. 3A). Gram-negative bacteria are primarily misclassified as other Gram-negative bacteria; the same is true for Gram-positive bacteria; most misclassifications occur within the same genus. For 30 isolates and MSSA/MRSA, the mean area under the ROC curve (AUC) value was over 0.98 (Fig. 4) and 0.90 (Fig. 5B), respectively.
The 8-empiric-treatments task determines whether the model can provide the correct recommended empiric treatment; here, the accuracy reaches 97.1 ± 0.3% (Fig. 3B).
As shown in Table 2, compared to ResNet,14 RamanNet achieves better accuracy on the 30-isolates task and similar results on the 8-treatments task, while the number of parameters is reduced to 0.030 M (44.7× fewer) and the computational complexity (multiply-accumulate operations, MACs) is reduced to 0.030 M MACs (13 166.7× fewer). Moreover, the training time of the RamanNet model was less than 20 minutes on a CPU rather than a GPU (Table 3). In distinguishing MRSA/MSSA, however, the identification accuracy of RamanNet was 81.6 ± 0.9% (Fig. 5A), lower than the 89.1 ± 0.1% of ResNet (Table 2). This is probably because the two groups belong to the same species and are highly similar, so RamanNet may need further improvement in identifying bacterial subtypes. In brief, the above results demonstrate that RamanNet can achieve comparable or even better results at the species level than a traditional ResNet on large Raman spectral datasets with low SNR. Thus, our model provides a quick and efficient way to analyze the Raman spectra of bacteria, and it could be applied in the clinical diagnosis of bacterial diseases.
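The reduction factors quoted in the text follow directly from the tabulated values, as the short check below shows. The function name `reduction_factor` is hypothetical; the inputs are the parameter counts (in millions) and MACs (in millions) reported in the tables.

```python
def reduction_factor(baseline, proposed):
    """How many times smaller the proposed value is than the baseline."""
    return baseline / proposed

# Bacteria-ID dataset: ResNet vs. RamanNet.
print(round(reduction_factor(1.341, 0.030), 1))  # 44.7    (parameters)
print(round(reduction_factor(395, 0.030), 1))    # 13166.7 (MACs)
# PKU-bacterial dataset: ResNet vs. RamanNet.
print(round(reduction_factor(3.86, 0.013), 1))   # 296.9   (parameters)
```

These ratios correspond to the "about 1/45" and "1/297" parameter fractions stated for the two datasets.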
| Model | Accuracy | Parameters (mega) | Computation parameters (MACs) |
|---|---|---|---|
| RamanNet | 96.04% | 0.013 | 0.013 |
| ResNet | 94.53% | 3.86 | 9.64 |
This journal is © The Royal Society of Chemistry 2022