Guanghui Chen a, Peichao Zheng *b, Jinmei Wang b, Biao Li b, Xufeng Liu b, Zhi Yang b, Zhicheng Sun b, Hongwu Tian cd, Daming Dong cd and Lianbo Guo e
aSchool of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
bSchool of Optoelectronic Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China. E-mail: zhengpc@cqupt.edu.cn
cResearch Center of Intelligent Equipment, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
dKey Laboratory of Agricultural Sensors, Ministry of Agriculture and Rural Affairs, Beijing 100097, China
eWuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
First published on 10th June 2024
Owing to the complex laser–material interaction, large spectral fluctuation is one of the main causes of the relatively poor quantitative analysis performance of laser-induced breakdown spectroscopy (LIBS). In this study, a data preprocessing method that uses plasma image information, namely minimum distance and subsequent averaging (MD & A), is proposed to identify effective spectra and improve the spectral stability of LIBS. The correlations between plasma image features and spectral line intensity were analyzed, which indicated that the plasma average area intensity was highly correlated with the plasma state. To evaluate the performance of the proposed method, the original spectra were also preprocessed by traditional screening methods, and separate quantitative models for Mn, Cu, Cr and Si in steel samples were then established based on the partial least squares regression (PLSR) algorithm. Compared to the original spectra, the evaluation parameter R2 improved from 93% to nearly 99%, while the mean root mean squared error of prediction (RMSEP) and the mean average relative error of prediction (AREP) were both reduced by over 50%. In addition, MD & A reduced the mean relative standard deviation (RSD) of the different spectral line intensities from the range of 3.76% to 18.48% to the range of 1.07% to 3.12%. These results indicate that the proposed method is easy to implement and significantly improves the spectral stability and quantitative analysis performance of LIBS compared to traditional methods.
Laser-induced breakdown spectroscopy (LIBS) is a promising atomic spectroscopy technique that determines the elemental composition and content of materials by analyzing the ionic and atomic emission spectrum of the transient plasma produced by the interaction between a high-power laser and the material.5–7 Compared with traditional chemical analysis methods, LIBS offers in situ and real-time analysis, fewer sample preparation steps and multi-element detection. LIBS has therefore been widely applied for analytical purposes in fields such as biology,8 geology,9 metal alloy analysis,10 food safety,11 and space exploration.12 However, because of uncertain influencing factors such as the matrix effect, the self-absorption effect and the experimental environment, the small, inhomogeneous and rapidly evolving plasma produced by laser–material interaction shows large fluctuations in its properties (such as plasma temperature and electron density), and the signal stability of LIBS is relatively poor. This instability is regarded as one of the key bottlenecks in achieving improved quantitative performance and wide commercialization of LIBS.13–15 Although various methods, such as machine learning algorithm correction,16 dual pulse excitation,17 nano-enhancement,18 and spatial confinement,19 have been proposed to improve the spectral stability and analytical performance of LIBS, they still cannot avoid the generation of ineffective spectra. Therefore, it is necessary to reduce or eliminate the influence of ineffective spectral data on quantitative analysis by data preprocessing methods.
Compared with experimental methods such as dual pulse excitation and spatial confinement, data processing methods have a wide application range, are easy to operate and add little or no complexity to the LIBS system, and they have therefore become popular for improving LIBS analytical performance. For example, Hahn et al.20 used the peak-to-base ratio or signal-to-noise ratio (SNR) of a specific spectral line as a screening indicator to select the effective spectral data. Similarly, Yao et al.21 demonstrated that the standard deviation (SD) method is better suited to identifying spurious spectra of gas–solid flow than the SNR method and the absolute peak intensity method. Yan et al.22 proposed a minimum distance (MD) method based on the spectral intensities to eliminate invalid data and improve the rock classification accuracy of handheld LIBS. Kappeler et al.23 achieved an accurate quantitative analysis of the binder and conductive additives in Li-ion battery cathodes by using the average spectrum of multi-pulse LIBS measurements. However, these methods distinguish the effective spectra mainly on the basis of the spectral intensity of a specific element, which is easily affected by interference factors; this restricts their application to unknown samples or samples with complex composition. Moreover, because the spectral stability of LIBS is poor, with a relative standard deviation (RSD) that is generally much larger than 10%, averaging all the raw spectral data has a limited effect on the quantification performance of LIBS.24
The plasma image is a promising reference signal that provides two-dimensional information about the plasma morphology, the plasma radiation intensity distribution and other properties. In recent years, several researchers have applied plasma image information to improve the LIBS analysis capability and achieved substantial results. Motto-Ros et al.25 utilized the morphological features of the plasma image to achieve precise control over the plasma emission collection and improved the LIBS detection repeatability in both the short and long term. Ni et al.26 proposed a normalization method that reduces the spectral fluctuations by using the sum of pixel values in the plasma area as a reference signal. Zhang et al.27,28 used plasma image features to characterize the plasma property parameters, and proposed plasma image-spectrum fusion methods to reduce the measurement errors caused by the matrix effect and spectral fluctuation in LIBS. In particular, Chen et al.29 proposed an image auxiliary method that eliminates invalid spectral data by using plasma image area information, and achieved better quantitative analysis results than traditional data screening methods. The plasma image information directly characterizes part of the state of the laser–material interaction, and it is more comprehensive, more repeatable and less affected by the collection system than spectral signals, which helps in identifying effective spectral data.13,30 Although these methods can improve the LIBS detection accuracy and stability to a certain extent, there is still significant room for improvement in data preprocessing efficiency and analytical performance.
In this work, an image auxiliary data preprocessing method based on plasma image information is proposed to improve the spectral stability of LIBS. Correlations between the plasma images and the corresponding spectral lines of the matrix or trace elements in steel alloy were first studied. Then, the plasma average area intensity, which demonstrated the highest correlation with the spectral line intensity, was used as a reference index in combination with the minimum distance (MD) and averaging method to eliminate invalid data. The quantitative analysis of Cu, Si, Mn and Cr in steel was then conducted using a partial least squares regression (PLSR) model. Compared with the conventional averaging or SD methods based on the spectral line intensity, the proposed method effectively improved the spectral stability and quantitative analysis performance of LIBS.
No. | C | Si | Mn | Cr | Cu | Fe |
---|---|---|---|---|---|---|
1 | 0.0023 | 0.0540 | 0.0180 | 0.0230 | 0.0036 | 99.7784 |
2 | 0.0028 | 0.0770 | 0.1040 | 0.0420 | 0.0490 | 99.1698 |
3 | 0.0320 | 1.5500 | 0.3030 | 0.2360 | 0.4030 | 95.8989 |
4 | 0.0960 | 1.0900 | 0.6690 | 0.1020 | 0.3160 | 95.9517 |
5 | 0.2430 | 0.7690 | 1.0400 | 0.1060 | 0.2480 | 95.6516 |
6 | 0.3870 | 0.4360 | 1.4700 | 0.4090 | 0.1670 | 95.8404 |
7 | 0.4980 | 0.1760 | 2.1000 | 0.6120 | 0.0750 | 95.3222 |
8 | 0.1640 | 0.1740 | 1.0400 | 0.1940 | 0.0170 | 98.2670 |
9 | 0.4530 | 0.2720 | 0.6040 | 0.0440 | 0.1410 | 98.4004 |
10 | 0.2300 | 0.3540 | 1.6500 | 0.2710 | 0.1110 | 95.9956 |
11 | 0.7080 | 0.2380 | 0.5950 | 0.0250 | 0.0058 | 98.4022 |
12 | 0.3580 | 0.1950 | 0.7280 | 0.0280 | 0.0110 | 98.6381 |
13 | 0.1520 | 0.2040 | 0.6580 | 0.0280 | 0.0170 | 98.9167 |
14 | 0.0620 | 0.0380 | 0.7440 | 0.0120 | 0.0063 | 99.1163 |
15 | 0.2470 | 0.2650 | 0.5820 | 1.6800 | 0.0330 | 96.6112 |
According to the literature, features such as the “plasma max intensity”, “plasma area”, “plasma average area intensity”, “plasma roundness” and “brightness contrast” can be extracted from the plasma image to analyze the correlation between the plasma image and the spectra.27,29,31,32 Additionally, in the medical imaging field, the structural similarity index measure (SSIM) is used to quantify the similarity between two images as a value from 0 to 1 (a higher SSIM indicates greater similarity).33 In this work, the SSIM between each plasma image and the average plasma image of the same sample was also tested as an index, with the spectra corresponding to high SSIM values taken as candidate effective data. All of the above image features were extracted from the image area with pixel values over 1000.
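As a rough illustration of the thresholded feature extraction, the sketch below computes three of the listed features from a single plasma image with NumPy. The function name `plasma_image_features` is hypothetical, and the definitions are assumptions: "plasma area" is taken as the number of pixels above the threshold, and "plasma average area intensity" as the sum of those pixel values divided by that area. Roundness, brightness contrast and SSIM are omitted because their exact definitions are not given in this text.

```python
import numpy as np

def plasma_image_features(image, threshold=1000):
    """Extract simple features from the plasma region of one image.

    The plasma region is assumed to be every pixel whose value exceeds
    `threshold` (1000 in the text). The feature definitions below are
    illustrative assumptions, not the authors' confirmed formulas.
    """
    image = np.asarray(image, dtype=float)
    mask = image > threshold                 # pixels belonging to the plasma
    area = int(mask.sum())                   # plasma area in pixels
    if area == 0:
        raise ValueError("no pixel exceeds the threshold")
    plasma_pixels = image[mask]
    return {
        "max_intensity": float(plasma_pixels.max()),
        "area": area,
        # mean pixel value inside the plasma region
        "average_area_intensity": float(plasma_pixels.sum() / area),
    }

# Tiny synthetic 2 x 2 "image": two pixels exceed the threshold.
features = plasma_image_features([[0, 1500], [2500, 900]])
```

In practice one such feature vector would be computed per laser shot, giving 200 values of each feature per sample.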
The Pearson coefficient is widely applied to analyze the linear correlation between two sets of data.34 Considering the peak and stability of the spectral line intensity, the average Pearson coefficient r over the 15 samples is shown in Fig. 3 to represent the correlations between the spectral line intensities (Fe I 404.581 nm, Fe I 425.079 nm, C I 247.856 nm and Si I 288.158 nm) and the image features. There is no clear relation between the spectral line intensity and either the SSIM or the plasma roundness. The reason might be that, under different excited states, these parameters are sensitive to irregular variations in the morphology and intensity distribution of the plasma, which leads to a poor correlation between each plasma image and the average plasma image. On the other hand, compared with the spectral lines of the trace elements, the spectral lines of the matrix element Fe are strongly correlated with the plasma max intensity, plasma area, plasma average area intensity and brightness contrast; the average Pearson coefficients r between these features and Fe I 404.581 nm are 0.9492, 0.9023, 0.9668 and 0.959, respectively. Therefore, in the following section, the plasma average area intensity of the plasma image was used as the reference index of the proposed method to select effective data and improve the LIBS spectral stability.
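The correlation analysis can be sketched as follows, assuming that r is computed per sample between the shot-by-shot image feature and the shot-by-shot line intensity, and then averaged over the samples. The helper names `pearson_r` and `mean_pearson_r` are hypothetical, and the numbers in the usage lines are made up for illustration.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson linear correlation coefficient between two 1-D sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()                        # center both series
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def mean_pearson_r(features_by_sample, intensities_by_sample):
    """Average r over samples; each entry holds one sample's shot series."""
    rs = [pearson_r(f, s)
          for f, s in zip(features_by_sample, intensities_by_sample)]
    return float(np.mean(rs))

# Two synthetic "samples" with perfectly correlated feature and line intensity.
r_mean = mean_pearson_r([[1, 2, 3], [2, 4, 7]], [[2, 4, 6], [1, 2, 3.5]])
```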
First, the reference index F_i (i ≤ m, m = 200) was extracted from the spectra or plasma images, where i is the serial number of the collected data in an individual sample, and all the reference indexes F_i were arranged in an m × 1 column matrix X. Second, the absolute difference matrix D_1 was calculated by subtracting F_1 from each element of X and taking the absolute value; the elements of D_1 were denoted d_1i and their sum S_1. Third, D_2 to D_m and S_2 to S_m were calculated by repeating the above step with F_2 to F_m. The clustering center F_min of the reference index was then obtained by finding S_min (the minimum of S_1 to S_m), for which the total difference from the other reference indexes is smallest. Finally, the elements of D_min (the absolute difference matrix corresponding to S_min) were sorted in ascending order to give the matrix D_min2, and the spectra corresponding to the first n (n ≤ m) elements of D_min2 were selected as the effective spectra. These steps were repeated for each of the 15 samples to obtain the effective spectra from all samples.
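The selection steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `md_select` is hypothetical, ties in the sorted distances are broken by shot order, and all pairwise differences are formed at once instead of looping over D_1 to D_m.

```python
import numpy as np

def md_select(reference_index, n_keep):
    """Minimum-distance (MD) selection of effective shots.

    reference_index : the m reference index values F_1..F_m of one sample
    n_keep          : number n of shots to keep (1 <= n <= m)
    Returns (indices of the kept shots, index of the clustering center).
    """
    f = np.asarray(reference_index, dtype=float)
    m = f.size
    if not 1 <= n_keep <= m:
        raise ValueError("n_keep must satisfy 1 <= n_keep <= m")
    # Row j holds the absolute difference matrix D_j: |F_i - F_j| for all i.
    d = np.abs(f[None, :] - f[:, None])
    s = d.sum(axis=1)                        # S_1..S_m
    j_min = int(np.argmin(s))                # shot whose S_j is smallest
    # Sort D_min in ascending order and keep the first n shots.
    order = np.argsort(d[j_min], kind="stable")
    return order[:n_keep], j_min

# Five shots, one obvious outlier (50); keep the three closest to the center.
kept, center = md_select([10, 11, 12, 50, 11.5], n_keep=3)
```

With m = 200 shots per sample the pairwise matrix has only 200 × 200 entries, so the vectorized form is cheap. The returned indices would then be used to pick the corresponding rows of the spectra array.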
In addition, the SD and averaging method was used as a comparison to evaluate the performance of the proposed method. Because the SD and averaging method is a widely used approach for reducing the influence of ineffective spectral data on LIBS quantitative analysis, its detailed principles are not described here.
As described in Sections 3.1 and 3.2, the plasma average area intensity of the plasma image and the intensities of different spectral lines were each used as the reference index to select effective spectra using the MD method. We defined the “effective data ratio” as the ratio of the volume of selected effective data to that of the raw data. For example, ratios of 100% and 50% mean that all and half of the raw data, respectively, are retained as effective data. To avoid unreliable performance estimates from a small data set, the mean values of R2 and the root mean squared error of prediction (RMSEP) over 15 iterations were used to optimize the effective data ratio. The optimization range was 50% to 100%, and the step size was 2%. Taking the Cu element prediction as an example, the optimization results are shown in Fig. 5.
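For orientation, the ratio grid and the evaluation metrics can be sketched as below. The formulas for R2, RMSE and ARE are common textbook definitions and are an assumption here, since the paper's own equations are not reproduced in this text; the function names are hypothetical. The PLSR model fitting itself is not shown.

```python
import numpy as np

# Effective data ratios scanned in the text: 50% to 100% in steps of 2%.
ratios = np.arange(50, 101, 2) / 100.0

def n_effective(m, ratio):
    """Number of shots kept out of m for a given effective data ratio."""
    return int(round(m * ratio))

def r_squared(y_true, y_pred):
    """Coefficient of determination (assumed definition: 1 - SS_res / SS_tot)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def rmse(y_true, y_pred):
    """Root mean squared error, in the units of the concentration (wt%)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def are_percent(y_true, y_pred):
    """Average relative error in percent (assumed: mean of |error| / true)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / y_true) * 100.0)
```

Under these assumptions, each ratio in `ratios` would be evaluated by keeping `n_effective(200, ratio)` shots per sample, fitting the model, and averaging R2 and RMSEP over the 15 iterations.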
Fig. 5 Optimization results of the MD method as a function of the effective data ratio: (a) mean R2 and (b) RMSEP.
As shown in Fig. 5, the analysis performance for the steel samples improved almost monotonically as the effective data ratio decreased. Over the effective data ratio range of 50% to 100%, the results indicate that the PLS model extracted better features, and achieved the best performance (mean R2 = 0.9746 and mean RMSEP = 0.0203), when the raw spectra were sorted by the MD method using the image feature (plasma average area intensity) rather than a spectral line as the reference index. In addition, the analysis performance of the PLS model was similar for the different Fe spectral lines, and it was better than that obtained with the spectral lines of trace elements in the effective data ratio range of 66% to 100%. However, in the effective data ratio range of 50% to 66%, the analysis performance of the PLS model using the C spectral line was close to or even better than that of the best Fe spectral line, which means that the screening effect of a spectral line index is less consistent than, and inferior to, that of the image feature index in the MD method.
Furthermore, to reduce the measurement error, the raw spectra were sorted by the MD method with different feature indexes, and new spectra were then obtained by averaging each set of 5 sorted spectra. For comparison, the raw spectra were also processed by averaging first and then applying MD (A & MD) with Fe I 425.079 nm as the reference index, to evaluate the effect of the proposed method (MD and then averaging, MD & A). As shown in Fig. 6, the proposed method was clearly better than A & MD (mean R2 < 95% and mean RMSEP > 0.036), which indicates that averaging the raw data in the original order introduces more invalid LIBS spectra and leads to poor analysis results. The lower the effective data ratio, the higher the prediction accuracy of the proposed method for the Cu element; the trend is similar to that of the MD method alone, but the results are better. Finally, the best R2 and RMSEP of MD & A with the image feature index are 99.41% and 0.0138, respectively. In conclusion, the best effective data ratio of MD & A is 50%, and MD & A is adopted as the proposed method to improve the LIBS analysis capability.
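The MD & A step can be sketched as follows: sort the shots by their distance to the MD clustering center, keep the chosen fraction, then average consecutive groups of 5 sorted spectra. The function names and the synthetic data are hypothetical, and leftover spectra that do not fill a complete group are dropped here, which is an assumption the text does not address.

```python
import numpy as np

def md_order(reference_index):
    """Shot indices sorted by distance to the MD clustering center."""
    f = np.asarray(reference_index, dtype=float)
    d = np.abs(f[None, :] - f[:, None])      # row j is |F_i - F_j|
    j_min = int(np.argmin(d.sum(axis=1)))    # center: smallest summed distance
    return np.argsort(d[j_min], kind="stable")

def average_in_groups(spectra, group_size=5):
    """Average consecutive groups of rows; incomplete trailing rows are dropped."""
    spectra = np.asarray(spectra, dtype=float)
    n_groups = spectra.shape[0] // group_size
    trimmed = spectra[: n_groups * group_size]
    return trimmed.reshape(n_groups, group_size, -1).mean(axis=1)

def md_and_average(spectra, reference_index, ratio=0.5, group_size=5):
    """MD & A: select by MD first, then average the sorted effective spectra."""
    spectra = np.asarray(spectra, dtype=float)
    order = md_order(reference_index)
    n_keep = int(round(ratio * order.size))
    return average_in_groups(spectra[order[:n_keep]], group_size)

# 20 synthetic shots with 3 spectral channels; the index equals the intensity.
ref = np.arange(20, dtype=float)
spectra = ref[:, None] * np.ones((1, 3))
averaged = md_and_average(spectra, ref, ratio=0.5, group_size=5)
```

The A & MD comparison would instead call `average_in_groups` on the spectra in their original acquisition order and apply the MD selection to the averaged spectra afterwards.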
Fig. 6 Optimization results of the MD & A method as a function of the effective data ratio: (a) mean R2 and (b) RMSEP.
Finally, the quantitative analysis results of different elements under different methods are listed in Table 2. The mean RSD was used to reflect the stability of the spectral line intensities over all samples. As shown in Table 2, with the proposed method the quantitative performance for all analyzed elements followed trends similar to those of the Cu content prediction. In addition, although MD & A reduced the mean RSD of the different spectral lines from the range of 3.76% to 18.48% to the range of 1.07% to 3.12%, the mean RSDs obtained using SD & A were still no better than those of the original spectra. A possible reason is that, because of the uncertain influencing factors in LIBS detection, the stability of the spectral line intensity alone is not enough to evaluate the quality of the spectra, whereas the plasma image feature contains more information. Moreover, the SD method is usually influenced by the empirical selection of spectral lines, and the MD method may better reflect the statistical distribution of the spectral data, which leads to better performance of MD & A than SD & A.
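For reference, the mean RSD of a spectral line can be computed as in the sketch below, assuming that the RSD is the standard deviation of the line intensity divided by its mean within one sample, and that the mean RSD is the average of these values over all samples. The use of the sample standard deviation (`ddof=1`) and the function names are assumptions.

```python
import numpy as np

def rsd_percent(intensities, ddof=1):
    """Relative standard deviation (%) of one line's intensities in one sample."""
    x = np.asarray(intensities, dtype=float)
    return float(x.std(ddof=ddof) / x.mean() * 100.0)

def mean_rsd_percent(intensities_by_sample, ddof=1):
    """Average the per-sample RSD over all samples."""
    return float(np.mean([rsd_percent(x, ddof) for x in intensities_by_sample]))

# Two synthetic samples, each with a 10% RSD.
value = mean_rsd_percent([[9, 10, 11], [18, 20, 22]])
```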
Element (wavelength) | Method | Mean R2 (%) | Mean RMSE, calibration (wt%) | Mean RMSE, prediction (wt%) | Mean ARE, calibration (%) | Mean ARE, prediction (%) | Mean RSD of spectral lines (%) |
---|---|---|---|---|---|---|---|
Cu I (324.754 nm) | Ori & A (a) | 93.23 | 0.0305 | 0.0382 | 98.61 | 118.28 | 4.27 |
Cu I (324.754 nm) | SD & A | 98.57 | 0.0140 | 0.0196 | 37.52 | 49.19 | 6.00 |
Cu I (324.754 nm) | MD & A (b) | 99.33 | 0.0096 | 0.0138 | 29.71 | 40.06 | 1.14 |
Cu I (324.754 nm) | MD & A (c) | 99.41 | 0.0090 | 0.0121 | 26.56 | 35.19 | 1.14 |
Cr I (425.433 nm) | Ori & A (a) | 93.95 | 0.1381 | 0.1679 | 106.61 | 131.08 | 18.48 |
Cr I (425.433 nm) | SD & A | 98.86 | 0.0597 | 0.0752 | 48.85 | 60.21 | 32.48 |
Cr I (425.433 nm) | MD & A (b) | 99.51 | 0.0391 | 0.0512 | 30.46 | 44.54 | 2.54 |
Cr I (425.433 nm) | MD & A (c) | 99.47 | 0.0405 | 0.0441 | 29.51 | 40.69 | 2.99 |
Mn I (403.076 nm) | Ori & A (a) | 92.49 | 0.1658 | 0.1868 | 46.22 | 63.99 | 13.86 |
Mn I (403.076 nm) | SD & A | 98.02 | 0.0852 | 0.1005 | 18.85 | 33.62 | 19.23 |
Mn I (403.076 nm) | MD & A (b) | 99.13 | 0.0562 | 0.0757 | 14.70 | 29.74 | 2.84 |
Mn I (403.076 nm) | MD & A (c) | 99.13 | 0.0562 | 0.0709 | 14.04 | 25.94 | 3.12 |
Si I (288.158 nm) | Ori & A (a) | 89.61 | 0.1296 | 0.1789 | 50.14 | 71.36 | 3.76 |
Si I (288.158 nm) | SD & A | 96.13 | 0.0787 | 0.1256 | 29.11 | 52.55 | 7.04 |
Si I (288.158 nm) | MD & A (b) | 98.11 | 0.0550 | 0.0896 | 23.19 | 37.40 | 1.07 |
Si I (288.158 nm) | MD & A (c) | 98.29 | 0.0524 | 0.0884 | 21.34 | 38.05 | 1.13 |
(a) Original spectra processed by the averaging method. (b) MD & A method with the spectral line of Fe I 425.079 nm as the reference index. (c) MD & A method with the image feature of the plasma average area intensity as the reference index.
On the other hand, although the results of MD & A with the two different indexes are similar, the plasma average area intensity may be a more objective index because it is easily obtained and is highly correlated with spectral effectiveness. The above results indicate that the proposed data screening method is easy to implement and significantly improves the spectral stability and quantitative analysis performance of LIBS compared with traditional methods.
This journal is © The Royal Society of Chemistry 2024