Weng Shizhuang ab, Chen Sheng ab, Li Miao *b, Zeng Xinhua b, Zheng Shouguo b, Zhang Jian b, Chen Jin c and Chen Lei b
aUniversity of Science and Technology of China, Hefei, 230026, China
bHefei Institute of Intelligent Machine, Chinese Academy of Sciences, Hefei, 230031, China. E-mail: mli@iim.ac.cn
cSchool of Chemistry & Chemical Engineering, Anhui University, Hefei, 230601, China
First published on 25th November 2013
Surface-enhanced Raman spectroscopy (SERS) is a novel analytical technology mainly for quick, simple and ultrasensitive analysis. In this study, SERS and partial least squares regression (PLSR) were combined with a wavenumber selection method based on the adaptive genetic algorithm (AGA) for the quantitative analysis of thiram. The conventional genetic algorithm (CGA) was also used for wavenumber selection and the two methods were compared. Moreover, the impact of the number of subintervals included in the wavenumber selection on the analytical results was evaluated. We achieved the best results (root mean square error of cross-validation = 0.2957 μM, R = 0.9995) by PLSR using an AGA with 5 subintervals. The experiments indicated that the proposed AGA-based wavenumber selection was superior to the CGA-based selection and that multiple subintervals gave better wavenumber selection than a single subinterval.
Surface-enhanced Raman spectroscopy (SERS) is a novel analytical method providing ultrasensitive, simple and rapid detection of organic chemicals.6 SERS overcomes the low sensitivity of traditional Raman spectroscopy through the tremendous enhancement of scattering signals that occurs when analytes adsorb onto the rough surface of noble metals (gold, silver and so on). In addition, SERS is fast and inexpensive and can record spectra of analytes without pretreatment. Therefore, SERS is suitable for fast and accurate analysis of thiram.
Although SERS has long been of fundamental importance, it has only recently begun to be used as a quantitative analytical tool.7–9 Chemometric methods such as principal component regression (PCR), partial least squares regression (PLSR) and support vector machine regression (SVR) are generally employed to process the data in quantitative analysis. Some studies have also shown that proper wavelength or wavenumber selection before applying the chemometric methods can improve the results.10–13 In general, the methods used for wavenumber selection can be divided into manual and automated procedures. The manual methods select the spectral regions near the characteristic peaks of the analytes; the automated methods mainly depend on optimization algorithms. The latter are carried out without human intervention and are simple to operate. Optimization algorithms such as the genetic algorithm (GA),14,15 simulated annealing (SA),16 and ant colony optimization (ACO)17 have been shown to select appropriate wavenumber regions effectively. Since the GA is simple and easy to implement, it was used for wavenumber selection in this research. In GA-based wavenumber selection, a combination of spectral variables or subintervals is assigned to a chromosome. Through repeated crossover, mutation and selection, chromosomes of high fitness value are obtained, whose corresponding regression models have high predictive accuracy. Because the rates of mutation and crossover are fixed in the conventional GA (CGA), the algorithm tends to fall into local optima and to destroy excellent individuals.18 One possible way of avoiding this problem is to use the adaptive genetic algorithm (AGA), in which the rates of mutation and crossover are adjusted according to the fitness value of the individuals.
In the AGA, individuals with a high fitness value have low crossover and mutation rates, so they are preserved and optimized further. Individuals with a low fitness value, in contrast, are very likely to be eliminated because of their high mutation rate. The selection unit of wavenumber selection can be either spectral variables or spectral subintervals.19 The latter tends to give better analysis results because it takes into account the high correlation of the spectral variables.
In this study, SERS and PLSR were combined with AGA-based wavenumber selection for the quantitative analysis of thiram. Different GA-based wavenumber selection methods (AGA, CGA) were compared by 10-fold cross-validation on the basis of the correlation coefficient (Rp) and the root mean square error of cross-validation (RMSECV) of the corresponding regression models. Furthermore, the effects of a single subinterval and of multiple subintervals on wavenumber selection were also assessed. The details are given below.
Ultrasonically dispersed Ag nanoparticles were placed into a centrifuge tube. Thiram solutions of different concentrations (the same volume) were then added to the Ag sol, followed by mixing using ultrasonic dispersion for 10 min.22 The mixed solution was dropped onto a silicon wafer, which acted as the enhancing substrate. After evaporation of the solvent at room temperature, the SERS spectra were recorded. The spectra were acquired using a LabRAM HR800 Raman spectrometer equipped with a 532 nm diode laser source. All the spectra were recorded with a 50× microscope objective and 200 mW laser power. For each original spectrum, 10 scans were performed with 710 sampling points over the range of 400–1600 cm−1 and an integration time of 10 s. For each sample, 10 original spectra were recorded, taking into account the location of the sampling points. A total of 100 spectra were collected, with sampling performed once a week. A single original spectrum was not suitable for quantitative analysis because of the poor reproducibility of the SERS signal. To solve this problem, the mean of 30 spectra, randomly selected from the 100 original spectra, was used as an analysis spectrum, and 50 analysis spectra were obtained by repeating this operation 50 times. The baseline of each spectrum, which was caused by the fluorescence background, was approximated by a seventh-order polynomial fit and subtracted using NGSLabSpec software (Jobin Yvon).
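The averaging step described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `make_analysis_spectra`, the fixed seed and the choice of sampling without replacement are assumptions, since the text only states that 30 of the 100 original spectra were selected at random and that the operation was repeated 50 times. Baseline subtraction is not included.

```python
import random

def make_analysis_spectra(originals, n_pick=30, n_out=50, seed=0):
    """Build n_out analysis spectra, each the mean of n_pick random originals.

    originals: list of spectra, each a list of intensities of equal length.
    seed: assumption, used only to make the sketch reproducible.
    """
    rng = random.Random(seed)
    n_points = len(originals[0])
    analysis = []
    for _ in range(n_out):
        # Assumption: the 30 spectra are drawn without replacement.
        chosen = rng.sample(originals, n_pick)
        # Point-by-point mean over the chosen spectra.
        mean = [sum(s[j] for s in chosen) / n_pick for j in range(n_points)]
        analysis.append(mean)
    return analysis
```

Averaging 30 spectra reduces the spot-to-spot variation of the SERS signal at the cost of making the 50 analysis spectra statistically dependent, because they are drawn from the same pool of 100 originals.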
| $X = TP^{\mathrm{T}} + E,\quad Y = TQ^{\mathrm{T}} + F$ | (1) |
![]() | (2) |
is the covariance of X. So far, the relationship between X and Y is derived.
First, chromosomes are randomly generated to form the initial population. Then, to find the best combination of the wavenumber subintervals, the calculations for the fitness values, selection of chromosomes, crossover, and mutation are repeated until the number of iterations exceeds the predefined limit. In this study, one subinterval in a spectrum is represented by two integer values. The first value denotes the starting wavenumber of a subinterval, and the second value denotes the finishing wavenumber. If the range of the represented region exceeds the full spectral range or two regions overlap, the excess or redundant parts are ignored.
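The encoding described above, two integers per subinterval with out-of-range and overlapping parts ignored, can be sketched as a small decoding routine. The function name `decode_chromosome`, the flat gene layout `[s1, e1, s2, e2, ...]` and the tolerance for reversed pairs are assumptions made for illustration; the 400–1600 cm−1 limits are the spectral range given in the experimental section.

```python
def decode_chromosome(genes, lo=400, hi=1600):
    """Turn genes [s1, e1, s2, e2, ...] into clipped, non-overlapping intervals."""
    pairs = []
    for i in range(0, len(genes) - 1, 2):
        # Assumption: a reversed pair is read as the same interval.
        start, end = sorted((genes[i], genes[i + 1]))
        # The part outside the full spectral range is ignored.
        start, end = max(start, lo), min(end, hi)
        if start <= end:
            pairs.append((start, end))
    pairs.sort()
    merged = []
    for start, end in pairs:
        if merged and start <= merged[-1][1]:
            # Overlapping regions: the redundant part is counted once.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, `decode_chromosome([300, 500, 450, 700, 1500, 1700])` returns `[(400, 700), (1500, 1600)]`: the first two regions overlap and are merged, and the parts below 400 and above 1600 cm−1 are dropped.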
Two objectives are considered when designing the fitness function of the chromosomes. One is to maximize Rp, and the other is to minimize RMSECV. The fitness function was therefore designed as Rp/(1 + RMSECV), and the fitness values of the chromosomes were calculated by PLSR using the SERS spectra of the selected wavenumber regions. The other parameters were set as follows: the number of iterations was 500 and the size of the population was 20. The number of subintervals in each chromosome was 1 (single subinterval) or 5 (multi-subintervals). The crossover rate was 0.6 and the mutation rate was 0.1 in CGA; for AGA, the crossover rate (Pc) and the mutation rate (Pm) were set as follows:
![]() | (3) |
, fmin are the maximum fitness value, the average fitness value and the minimum fitness value of the individuals in each generation. The constants pc1 > pc2 > pc3 and pm1 > pm2 > pm3 were assigned values between 0.0 and 1.0. In eqn (3), individuals with low fitness have high rates of crossover and mutation, so they can be washed out during the iterations. For individuals with high fitness, the rates of crossover and mutation are small, so these individuals are inherited and further optimized. In CGA, by contrast, the fixed rates of crossover and mutation mean that individuals with low fitness may not be washed out quickly and individuals with high fitness are no more likely to be inherited, which makes it harder to obtain better individuals.
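The two ingredients of this section, the fitness function and a fitness-dependent rate, can be sketched as follows. The `fitness` function follows the expression Rp/(1 + RMSECV) given in the text. The `adaptive_rate` function is only an assumed piecewise-linear rule and is not eqn (3) itself; it merely reproduces the stated behaviour (three constants p1 > p2 > p3, high rates for low fitness, low rates for high fitness). All function and argument names, including `f_max` and `f_avg`, are hypothetical.

```python
import math

def fitness(y_true, y_pred):
    """Rp / (1 + RMSECV), computed from cross-validated predictions."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mt) ** 2 for t in y_true)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    # Pearson correlation; defined as 0 here if either series is constant.
    rp = cov / math.sqrt(var_t * var_p) if var_t > 0 and var_p > 0 else 0.0
    rmsecv = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return rp / (1.0 + rmsecv)

def adaptive_rate(f, f_min, f_avg, f_max, p1, p2, p3):
    """ASSUMED rule (not eqn (3)): p1 at f_min, p2 at f_avg, p3 at f_max."""
    if f >= f_avg:
        span = f_max - f_avg
        frac = (f - f_avg) / span if span > 0 else 0.0
        return p2 - (p2 - p3) * frac   # above average: rate falls from p2 to p3
    span = f_avg - f_min
    frac = (f - f_min) / span if span > 0 else 0.0
    return p1 - (p1 - p2) * frac       # below average: rate falls from p1 to p2
```

With the illustrative constants p1 = 0.9, p2 = 0.6 and p3 = 0.3, the worst individual of a generation receives a rate of 0.9, an average one 0.6 and the best one 0.3, whereas the CGA applies the same fixed rate to all of them.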
All the computations and chemometric methods were implemented in MATLAB 2011b (The Mathworks Inc., Natick, MA, USA). The PLSR and GA algorithms were implemented by ourselves.
S stretching modes, the peak at 561 cm−1, which is associated with the S–S stretching mode, and the peak at 925 cm−1, which is due to the stretching CH3N and C=S modes, are all clearly identifiable. Fig. 2 also shows that the CN stretching mode and the rocking CH3 mode occur at 1150 cm−1, and that the CN stretching mode and the rocking CH3 mode also occur at 1444 cm−1.26 In addition, the CN stretching mode and the deformation and rocking modes of CH3 can be observed at 1514 cm−1. The intensities of all these peaks increase with concentration. This partially indicates that quantitative analysis by SERS is possible.
The spectral wavenumber regions used to establish the regression model have a significant influence on the performance of the model, and the conventional methods directly select the spectra near the characteristic peaks. Although these methods use all the characteristic information of the spectra to develop the model, some of the included information, together with non-characteristic information, has a negative rather than a positive effect on the results. Thus, optimization algorithms (CGA, AGA) were used in this study to identify which few wavenumber regions to use.
| Data set / method | RMSECV (μM) | R |
|---|---|---|
| SERS spectra of 400–1600 cm−1 a | 0.4666 | 0.9934 |
| CGA1 | 0.4597 | 0.9953 |
| AGA1 | 0.3908 | 0.9977 |
| CGA5 | 0.3642 | 0.9986 |
| AGA5 | 0.2957 | 0.9995 |
| a Without wavenumber selection algorithms. | | |
Fig. 4 shows the PCA plot of the PLSR scores of the spectra selected by AGA5/CGA5. The PLSR scores of samples of the same concentration are more tightly clustered for AGA5, which means that the deviation between samples of the same class is reduced and the performance of the corresponding model is thus improved. The predictive error of the model built with the spectral wavenumbers selected by AGA5 is shown in Fig. 5. The analysis results are generally accurate, but the relative deviation for the low-concentration solution (0.1 μM) is noticeably large.
Fig. 4 PCA scores plot of PLSR latent variables of the thiram spectra selected by CGA5/AGA5; (a) CGA5, (b) AGA5.
This journal is © The Royal Society of Chemistry 2014