Mehrvash Varnasseri†, Yun Xu† and Royston Goodacre*
Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, L69 7ZB, UK. E-mail: Roy.Goodacre@liverpool.ac.uk
First published on 17th March 2022
Detecting food adulteration has always been an important task for food safety, especially when grapefruit is the adulterant as components in the juice have undesired interactions with many medicines. In this study we employed a handheld Raman device to detect adulteration of orange juices with grapefruit juices. Fresh fruits of orange and grapefruit were purchased from five different sources and fruit juices were made using a handheld juicer. The extracted juices were then mixed in a way that concentrations of grapefruit juices varied from 0% to 100% in 5% increments. In order to study the impact of the different sources of the fruits, three different sets of mixtures were prepared based on their spectral similarity and dissimilarity. Raman spectra were collected using a handheld instrument with an excitation laser at 785 nm and data analysed using principal component analysis (PCA), principal component-discriminant function analysis (PC-DFA) and partial least squares regression (PLS-R). PLS-R models were trained and validated on: (i) the full data set from the three different mixture sets, and (ii) each set of the three mixtures separately. The results showed that a good calibration model was obtained using full data which had a coefficient of determination (Q2) of 0.81 and a root mean square error of prediction (RMSEP) of 12.5%. Such results were improved when the PLS-R model was trained and validated on the three separate mixture combinations, where the Q2 varied from 0.85 to 0.89 and RMSEP varied from 9.9% to 11.6%. Finally, we adopted a two step approach in which a partial least squares for discriminant analysis (PLS-DA) was trained first to classify the three sample sources and then three different PLS-R models were subsequently trained on samples from the same source. This resulted in a Q2 of 0.83 and RMSEP of 12.0%. 
In conclusion, we have demonstrated that Raman spectroscopy can be used as a portable and rapid analytical tool for detecting adulteration of grapefruit juice added to orange juice.
Classical analytical platforms such as HPLC-DAD-MS/MS,11 UPLC-QToF MS, FT-IR and NMR spectroscopy have been reported for fruit juice product authentication.12–16 A few molecular DNA-based methods have also been reported in the literature for the same purpose.17–21 However, all these methods involve lengthy sample preparation and bulky instruments, which makes them unsuitable for rapid on-site detection. By contrast, Raman spectroscopy, especially on portable handheld platforms, has many attractive features: it is non-destructive, requires little to no sample preparation, yields spectra rich in chemical information, and is insensitive to water, which means that the juice can be measured directly. These features make Raman spectroscopy particularly suitable for such applications.22,23
In this study we employed a CBEx handheld Raman spectrometer from Snowy Range (Laramie, WY, U.S.A.) to measure freshly squeezed orange juices adulterated with different concentrations of grapefruit juice. The collected Raman spectra were then subjected to multivariate analysis, including principal component analysis (PCA) and partial least squares regression (PLS-R), to detect and quantify the level of grapefruit juice added to orange juice. The same fruit from different origins (e.g., site of purchase) may also have subtle differences in chemical composition (e.g., due to storage conditions and length of storage), and this may affect its Raman spectrum. To assess this effect, oranges and grapefruits were purchased from five different sources and their Raman spectra were analysed and compared.
Grapefruit and oranges were manually squeezed using a handheld juicer. The extracted juices were then filtered to remove solid debris and centrifuged at 3100g for 5 min. The supernatants were further centrifuged at 15871g for 3 min. All centrifugation steps were carried out at 4 °C and the processed juices (supernatants) were labelled and stored at −80 °C until further analysis. Mixtures of the two types of juice were prepared by mixing orange juice with grapefruit juice in glass vials; the concentration of grapefruit juice varied from 0% to 100% v/v in 5% increments, which resulted in 21 different concentrations of grapefruit juice within the mixtures. More details are provided in the results section.
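The dilution series described above can be enumerated with a short script. This is only a minimal sketch: the 1000 µL total volume per vial is an assumed, illustrative value (the volumes actually used are not stated), and the function name `mixture_design` is hypothetical.

```python
def mixture_design(total_volume_ul=1000.0, step_percent=5):
    """Return (grapefruit % v/v, grapefruit volume, orange volume) per mixture.

    total_volume_ul is an assumed, illustrative vial volume; the text does
    not state the volumes that were actually pipetted.
    """
    design = []
    for percent in range(0, 101, step_percent):
        grapefruit_ul = total_volume_ul * percent / 100.0
        orange_ul = total_volume_ul - grapefruit_ul
        design.append((percent, grapefruit_ul, orange_ul))
    return design


design = mixture_design()
print(len(design))  # number of concentration levels in the series
print(design[1])    # the 5% v/v grapefruit mixture
```

Running the same function once per mixture set would give the full sample list for the three sets described later in the results.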
A CBEx handheld Raman spectrometer from Snowy Range (Laramie, WY, U.S.) was employed to measure Raman spectra of juices. The spectrometer operates using a 70 mW (on sample) 785 nm laser with detection on a 2048 element CCD array, resulting in a 12–14 cm−1 spectral resolution. All spectra were collected in the spectral range 400–2300 cm−1 with an acquisition time of 2 s.
Prior to multivariate statistical analysis, all spectra were normalized using standard normal variate (SNV) normalization so that each spectrum had a mean of 0 and a standard deviation of 1. Principal component analysis (PCA) and principal component-discriminant function analysis (PC-DFA) were used to visualize the pattern of samples. Partial least squares regression (PLS-R) was then employed to quantify the level of grapefruit juice added to orange juice. The data were split into training and test sets based on the concentration of grapefruit juice in the mixture: samples with 0%, 10%, 20%, …, 100% grapefruit juice formed the training set and samples with 5%, 15%, 25%, …, 95% grapefruit juice formed the test set. The number of latent variables (LVs) of the PLS models was optimised by k-fold cross-validation on the training set only (k = 11, the number of different concentrations in the training set). The optimal number of LVs was the one that gave the minimal root mean square error of cross-validation (RMSECV). The trained PLS-R models were then applied to the test set and the results were reported as the coefficient of determination (Q2) and the root mean square error of prediction (RMSEP) of the test set, as given in eqn (1) and eqn (2).
![]() | (1) |
![]() | (2) |
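For orientation, the conventional forms of these two statistics are sketched below. The notation is an assumption rather than something taken from the text: y_i is the known grapefruit concentration of test sample i, ŷ_i is the corresponding PLS-R prediction, ȳ is the mean of the known concentrations and n is the number of test samples. Some authors take ȳ from the training set rather than the test set, and the text does not say which convention applies here.

```latex
Q^2 = 1 - \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}
               {\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
\qquad
\mathrm{RMSEP} = \sqrt{\frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{n}}
```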
Based on the PLS-R model, a limit of detection (LoD) was also estimated using the net analyte signal approach described by Olivieri et al.24
All data analysis was carried out in MATLAB 2020a (The MathWorks, MA, U.S.) and the relevant in-house MATLAB functions are freely available on our GitHub repository at https://github.com/biospec.
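The pre-processing and data-splitting steps in this section can be sketched in a few lines of Python. This is an illustration, not the in-house MATLAB code: the function names are hypothetical, the use of the sample standard deviation (n − 1) in SNV is an assumption, and reading "k = 11" as one cross-validation fold per training concentration is likewise an assumption.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre one spectrum to mean 0, scale to SD 1."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1) is assumed; some tools divide by n.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


def split_by_concentration(levels):
    """Multiples of 10% form the training set; the other levels the test set."""
    train = [c for c in levels if c % 10 == 0]
    test = [c for c in levels if c % 10 != 0]
    return train, test


def leave_one_level_out_folds(train_levels):
    """One fold per training concentration (assumed meaning of k = 11)."""
    return [([c for c in train_levels if c != held_out], [held_out])
            for held_out in train_levels]


levels = list(range(0, 101, 5))            # 0%, 5%, ..., 100% grapefruit
train_levels, test_levels = split_by_concentration(levels)
folds = leave_one_level_out_folds(train_levels)
normalised = snv([1.0, 2.0, 3.0, 4.0])     # toy four-point "spectrum"
```

Holding out a whole concentration level in each fold means the cross-validation error reflects interpolation to unseen concentrations, which mirrors how the test set is constructed.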
Raman band (cm−1) | Source of vibration (a) | Assignment |
---|---|---|
509–524 | GG/F/S | Skeletal vibration |
593–600 | S | Skeletal vibration |
625–631 | FF/s | Ring deformation |
705 | F | Skeletal vibration |
822 | FF | C–OH stretch |
837–850 | SS/G | Unknown |
868 | F | C–O–C cyclic alkyl ethers |
916–921 | f/G/s | CH, COH bend |
976–988 | F/s/g | Ring “breathing” |
1065–1074 | FF/SS/G | C–O–C cyclic alkyl ethers |
1127–1129 | SS/GG | C–OH deformation |

(a) Sources: G for glucose, F for fructose and S for sucrose. Strength of vibration is denoted as follows: x weak; X strong; XX dominant.
The PCA scores plot of the 10 pure juices is shown in Fig. 2(a). Separation between the two types of juice, and between different sources within each type of juice, can both be observed. It is also interesting to see that the distance between the two types of juice in PCA space varied significantly between the different sources of the fruits. The difference was largest between orange juice from source A and grapefruit juice from source B (highlighted in red), while it was much smaller between orange purchased from source C and grapefruit purchased from source E (highlighted in blue). This suggested that fruits from different sources could have subtle differences in their Raman spectra and that this could affect the capability of detecting adulteration levels of grapefruit juice added to orange juice. The loadings plot of this PCA model is shown in Fig. 2(b). The most significant bands are located between 800 and 900 cm−1, which correspond to the major sugar bands shown in Fig. 1, suggesting that subtle differences in sugar content also contribute to the separation between fruits from different sources.
With this factor in mind, three sets of mixtures of orange and grapefruit juices were prepared: (1) orange from source A mixed with grapefruit from source B, denoted as A1–B2; (2) orange from source D and grapefruit from source C, denoted as D1–C2; and (3) orange from source C and grapefruit from source E, denoted as C1–E2. These represent three scenarios where orange and grapefruit juices in PCA space were most different (A1–B2), moderately different (D1–C2) and least different (C1–E2). For each mixture set, a series of orange and grapefruit juice mixtures was prepared with the concentration of grapefruit juice varying from 0% to 100% v/v in 5% increments, as described in the experimental section. The PCA scores plot of one set of mixtures, D1–C2, is given in Fig. 3(a) and it is clear that there was a gradient along the PC 1 axis corresponding to the amount of adulterant within the mixture. The PC 1 loadings plot is shown in Fig. 3(b) and again the most significant bands are located between 800 and 900 cm−1, indicating that sugar differences are the main factor in differentiating these two types of fruit. The PCA scores and loadings plots of the other two mixture sets showed similar trends (data not shown).
A more interesting pattern was observed in the PC-DFA scores plot where all three mixture sets were analysed together (Fig. 4). The first, and therefore most significant, discriminant function captured the amount of grapefruit juice added to orange juice, while the source differences were observed on the 2nd and 3rd discriminant functions. This suggests that although the origin of the fruit might have some impact on quantifying the grapefruit juice added to orange juice, this impact was not large enough to make such quantification impossible. We also note in Fig. 4 that the three pure orange juices (A1, C1, D1) cluster more closely together than the three pure grapefruit juices (B2, C2, E2).
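The concentration gradient along PC 1 described above is what one would expect when mixture spectra are close to linear combinations of the two pure-juice spectra. The sketch below illustrates this with invented four-point "spectra" (not measured data) and a plain power-iteration PCA written for this example; it stands in for, and is not, the PCA routine used in the study.

```python
import math


def first_pc(rows, iterations=100):
    """Return (scores, loading) of the first principal component."""
    n, p = len(rows), len(rows[0])
    col_means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - col_means[j] for j in range(p)] for row in rows]
    # Start from the centred row with the largest norm so that the starting
    # direction is not orthogonal to the data.
    loading = list(max(centred, key=lambda r: sum(v * v for v in r)))
    for _ in range(iterations):
        scores = [sum(x * w for x, w in zip(row, loading)) for row in centred]
        # One power-iteration step: loading <- X^T (X loading), then normalise.
        loading = [sum(centred[i][j] * scores[i] for i in range(n))
                   for j in range(p)]
        norm = math.sqrt(sum(w * w for w in loading))
        loading = [w / norm for w in loading]
    scores = [sum(x * w for x, w in zip(row, loading)) for row in centred]
    return scores, loading


# Invented pure-juice "spectra", for illustration only.
orange = [1.0, 4.0, 2.0, 0.5]
grapefruit = [1.5, 2.0, 3.0, 0.5]
fractions = [k * 0.05 for k in range(21)]          # 0% to 100% in 5% steps
mixtures = [[(1 - f) * o + f * g for o, g in zip(orange, grapefruit)]
            for f in fractions]

scores, loading = first_pc(mixtures)
```

In this noise-free toy case the PC 1 scores change monotonically with the grapefruit fraction; the sign of the scores is arbitrary, as in any PCA.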
In light of the PCA and PC-DFA results, PLS-R modelling was carried out in three different ways. First, PLS-R models were trained and tested on the whole data set, regardless of the origins of the mixtures. This represents the results expected if the source of the fruits is ignored, which is the more realistic scenario as a single model that fits all the data may be preferred. Second, three separate PLS-R models were trained and tested on each set of mixtures separately; this represents the scenario in which a priori knowledge about the source of the fruit is available and models have been trained specifically for each source. Lastly, a two-step hierarchical model was adopted in which a PLS-DA classification model was first trained on the training set to classify the source of the fruits, i.e. A1–B2, D1–C2 or C1–E2. Three PLS-R models were then trained, one for each set of mixtures, using the corresponding subsets of samples in the training set. In the test phase, each sample in the test set was first subjected to the PLS-DA model to predict its source and, based on this prediction, was then assigned to the corresponding PLS-R model to predict the concentration of grapefruit adulteration. This means that if the source was incorrectly predicted in the classification stage, the wrong PLS-R model would be used for quantification.
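The routing logic of the two-step approach can be sketched as follows. The classifier and regressors here are hypothetical toy stand-ins (simple rules, not fitted PLS-DA or PLS-R models), and the function name `two_step_predict` is invented for this illustration.

```python
def two_step_predict(spectrum, classify_source, regressors):
    """Route a spectrum to the regressor of its predicted source.

    classify_source: callable returning a source label (stands in for PLS-DA).
    regressors: dict mapping source label -> callable returning % grapefruit
                (each stands in for a source-specific PLS-R model).
    """
    source = classify_source(spectrum)
    return source, regressors[source](spectrum)


# Hypothetical stand-ins, for illustration only (not fitted models).
def toy_classifier(spectrum):
    return "A1-B2" if spectrum[0] < 1.0 else "C1-E2"


toy_regressors = {
    "A1-B2": lambda s: 100.0 * s[1],
    "C1-E2": lambda s: 80.0 * s[1],
}

source_1, percent_1 = two_step_predict([0.5, 0.25], toy_classifier, toy_regressors)
source_2, percent_2 = two_step_predict([1.5, 0.25], toy_classifier, toy_regressors)
```

The two calls share the same second feature but are routed to different regressors, so they return different concentrations. This is the mechanism by which a misclassified source propagates into the quantification step.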
A predicted versus known plot of the PLS-R model predictions on all data (i.e., source of origin ignored) is given in Fig. 5. Good agreement between known and predicted concentrations was observed. The corresponding plots for all other models showed a similar pattern and are thus omitted for brevity. The number of latent variables (LVs), RMSECV, Q2, RMSEP and estimated LoDs of all PLS-R models are summarised in Table 2. When PLS-R modelling was performed on all data without considering the source of the fruit, the averaged Q2 was 0.8171 and the RMSEP was 12.47%, with an estimated LoD of 11.70%. The prediction accuracy improved when the PLS-R model was trained and tested on data from the same source. Q2 increased to 0.8467, 0.8693 and 0.8891 for the three sets A1–B2, D1–C2 and C1–E2, respectively, and their RMSEP reduced to 11.55%, 10.18% and 9.90%, respectively. The LoDs of these three models also improved to 8.10%, 7.67% and 6.51%, respectively. These differences highlight the adverse effect of different sources of origin on quantifying the level of grapefruit juice in orange juice. The accuracy of the PLS-DA model in predicting the source of origin was 87.0% and a detailed confusion matrix is given in Table 3. The subsequent prediction of grapefruit juice concentration resulted in a Q2 of 0.8322 and an RMSEP of 11.95%. These were worse than the results of the single-source PLS-R models, albeit better than those of the PLS-R model which completely ignored the source of origin. This demonstrates that introducing a classification model partially mitigated the effect of source, yet because the classification model was not perfect, the final accuracy was still worse than that of the single-source models.
Model | No. of LVs | Q2 | RMSECV (%) | RMSEP (%) | LoD (%) |
---|---|---|---|---|---|
All data | 3 | 0.8171 | 13.20 | 12.47 | 11.70 |
C1–E2 | 2 | 0.8891 | 10.13 | 9.90 | 6.51 |
D1–C2 | 2 | 0.8693 | 10.44 | 10.18 | 7.67 |
A1–B2 | 2 | 0.8467 | 11.23 | 11.55 | 8.10 |
PLS-DA/PLS-R (a) | 2/3 | 0.8322 | — | 11.95 | — |

(a) RMSECV and LoD are omitted for the two-step model as it used the same single-source PLS-R models.
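The Q2 and RMSEP figures of merit can be computed from known and predicted concentrations as in the sketch below. The conventional definitions are assumed (with the mean taken over the known test values), the function names are hypothetical, and the numbers are invented toy values rather than data from this study.

```python
import math


def rmsep(known, predicted):
    """Root mean square error of prediction."""
    n = len(known)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(known, predicted)) / n)


def q2(known, predicted):
    """Coefficient of determination on the test set (assumed convention)."""
    mean_known = sum(known) / len(known)
    residual_ss = sum((y - p) ** 2 for y, p in zip(known, predicted))
    total_ss = sum((y - mean_known) ** 2 for y in known)
    return 1.0 - residual_ss / total_ss


# Invented toy values (% grapefruit), not the study's data.
known = [5.0, 15.0, 25.0, 35.0]
predicted = [7.0, 13.0, 27.0, 33.0]

toy_rmsep = rmsep(known, predicted)
toy_q2 = q2(known, predicted)
```

Because RMSEP is expressed in the same units as the concentration (% v/v), it can be read directly as a typical prediction error, whereas Q2 is unitless and depends on the spread of the known values.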
Mixture set | C1–E2 (predicted) | D1–C2 (predicted) | A1–B2 (predicted) |
---|---|---|---|
C1–E2 (known) | 86.7% | 6.7% | 6.6% |
D1–C2 (known) | 7.9% | 89.5% | 2.6% |
A1–B2 (known) | 10.0% | 5.0% | 85.0% |
Footnote
† These authors contributed equally to the work.
This journal is © The Royal Society of Chemistry 2022