Analysis of trace elements in uranium by inductively coupled plasma-optical emission spectroscopy, design of experiments, and partial least squares regression †


Introduction
5][6] Trace elemental analysis by each technique is generally performed by diluting samples so that the U matrix concentration is low and matrix matching the calibration standards,1 or by separating the matrix completely.7 ICP-OES analytical data come from the emission spectra of elements excited within a plasma with temperatures as high as 10 000 K. ICP-OES measures photons, whereas ICP-MS measures ions of specific mass. ICP-OES is therefore limited by emission-rich f-block elements (e.g., U and Pu), which can interfere with trace element spectra.
The line-rich emission spectrum of U results in low-lying peaks that overlap with the emission spectra of most trace elements.][9][10][11][12][13][14][15] Recent efforts have focused on minimizing sample size, reducing method time, and automation to improve these highly effective separations.2 Separations are generally assumed to be the best way to obtain quality ICP-OES results for each element at all trace-level concentrations. However, multivariate chemometric regression techniques could account for complex optical emission spectral signatures directly, without needing matrix-matched samples.
In recent decades, advanced multivariate chemometric techniques have been developed to build high-fidelity regression models in systems with confounding, covarying, and overlapping spectral features.16 8][19] This technique has been implemented with great success in numerous fields of science and technology, including the food processing, pharmaceutical, and nuclear industries. Partial least squares regression (PLSR) is a factor analysis method that maximizes the covariance between two matrices corresponding to the spectra (X) and concentrations (Y) using combinations of latent variables (LVs).1][22][23][24] PLSR could be used to model optical emission spectra and avoid the need for U matrix removal, but such an approach has not been studied previously. This would improve the analytical time and efficiency of ICP-OES measurements.
Here, we optimize PLSR models built from optical emission spectra, determine limits of detection for numerous trace elements in a U matrix, and validate the method using quality control samples and two uranium oxide reference materials. Calibration and validation spectral data sets were selected by I-optimal designs to minimize the number of samples required in the training set, which spanned concentrations of U (4-1000 µg mL−1) and trace elements (0.02-2 µg mL−1) and covered the anticipated solution conditions (20-5000 µg per g U). These conditions are highly relevant to numerous applications in the nuclear field. Three points of scientific advancement are covered in this work: (1) multivariate analysis enabled direct quantification of trace elements and U without separations, (2) I-optimal design provided a statistical framework to minimize the number of samples in the training set without user bias, and (3) limits of detection were established for numerous trace elements in U using a novel PLSR approach. Herein, we report the first use of multivariate analysis to model optical emission spectra and accurately measure trace elements without U matrix removal.25 This new approach enables the simultaneous analysis of trace elements and U, which is expected to greatly improve the timeliness and efficiency of ICP-OES measurements in niche applications like U production, trace element determination in nuclear fuel, and intentional forensics.25,26 It also provides a viable option to measure elements that are difficult to chemically separate from U (e.g., Zr, Nb, and Th). This state-of-the-art approach can be extended to many applications within and beyond the nuclear field.

Experimental
All chemicals were commercially obtained (ACS grade) and used as received unless otherwise stated. Nitric acid (HNO3, 70%) was purchased from Sigma-Aldrich. NIST traceable U (10 000 µg mL−1) and multielement (100 µg mL−1) ICP-OES standard solutions in HNO3 were purchased from High-Purity Standards. Samples were prepared using deionized water with Milli-Q purity (18.2 MΩ cm at 25 °C).

Sample preparation
Training set samples contained U and trace elements covering the anticipated solution conditions (20-5000 µg per g U). The trace elements included in the sample set are listed in Tables S1 and S2.† Samples were prepared gravimetrically using a Mettler Toledo model XS204 balance with an accuracy of ±0.0001 g. Aliquots were diluted in 4% HNO3 or 2 M HNO3. Sample concentration uncertainties were determined by standard error propagation methods described in the ESI.† Uranium oxide (U3O8) certified reference material (CRM) 124-1 (New Brunswick Laboratory Program Office, Argonne, IL, USA) and a Canadian Uranium Product (CUP-2) U ore reference material (Ottawa, Canada) were prepared by digesting 250 mg in a Savillex vessel using 8 M HNO3 and 0.05 M hydrogen fluoride (HF) with heat (100 °C overnight). The resulting solution was diluted to 1000 µg per mL U in 2 M HNO3 before ICP-OES analysis.

Experimental design
Design of experiments was used to statistically select sample concentrations with Design-Expert (v.11.0.5.0) by Stat-Ease Inc., within the Unscrambler software package by Camo Analytics. A two-component I-optimal design was used to select training set sample concentrations using a quadratic process order and both point and coordinate exchange. The design required six model points to estimate the coefficients in the design model; these were included in the calibration set. The model points were augmented with ten lack-of-fit (LOF) points, which were used as either calibration or validation set samples. LOF points maximize the distance to other runs while satisfying the optimality criterion. The design included two numeric factors, trace element (0.02-2 µg mL−1) and U (4-1000 µg mL−1) concentrations, and the constraint −0.005 × U (µg mL−1) + trace (µg mL−1) ≤ 0 to ensure that trace concentrations relative to U ranged from 20 to 5000 µg per g U. The design was evaluated using the fraction of design space technique.27 The I-optimality criterion (also called IV or integrated variance), used to calculate I-optimal designs, is the most desirable option when prediction performance is important.28 The algorithm selects points to minimize the integral of the prediction variance throughout the design space.
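The I-optimality criterion described above can be illustrated with a short numerical sketch. This is not the Design-Expert algorithm: it only evaluates the average prediction variance of a six-term quadratic model over the constrained region by Monte Carlo sampling, and it does not perform the point or coordinate exchange search. The helper names, the random candidate designs, and the coded scaling to [−1, 1] are assumptions made for illustration.

```python
import numpy as np

U_RANGE = (4.0, 1000.0)      # U concentration, µg/mL (from the text)
TRACE_RANGE = (0.02, 2.0)    # trace concentration, µg/mL (from the text)


def feasible(u, trace):
    """Constraint from the text: -0.005*U + trace <= 0 (at most 5000 µg per g U)."""
    return trace - 0.005 * u <= 1e-12


def coded(values, low, high):
    """Scale a factor to the coded range [-1, 1] (an assumed convention)."""
    return 2.0 * (np.asarray(values, dtype=float) - low) / (high - low) - 1.0


def quadratic_model_matrix(u, trace):
    """Six-term quadratic model in coded units: 1, a, b, ab, a^2, b^2."""
    a = coded(u, *U_RANGE)
    b = coded(trace, *TRACE_RANGE)
    return np.column_stack([np.ones_like(a), a, b, a * b, a**2, b**2])


def sample_region(n, rng):
    """Rejection-sample n feasible (U, trace) points uniformly."""
    pts = []
    while len(pts) < n:
        u = rng.uniform(*U_RANGE)
        t = rng.uniform(*TRACE_RANGE)
        if feasible(u, t):
            pts.append((u, t))
    return np.array(pts)


def i_criterion(design, region):
    """Average of f(x)' (X'X)^-1 f(x) over the sampled region (units of sigma^2)."""
    X = quadratic_model_matrix(design[:, 0], design[:, 1])
    F = quadratic_model_matrix(region[:, 0], region[:, 1])
    xtx_inv = np.linalg.inv(X.T @ X)
    return float(np.mean(np.einsum("ij,jk,ik->i", F, xtx_inv, F)))


rng = np.random.default_rng(0)
region = sample_region(2000, rng)       # Monte Carlo stand-in for the integral
candidate_16 = sample_region(16, rng)   # hypothetical 16-run design (not Table 2)
candidate_8 = candidate_16[:8]          # hypothetical smaller design

print(i_criterion(candidate_8, region), i_criterion(candidate_16, region))
```

An I-optimal search would repeatedly swap or move design points and keep the changes that lower this value. The sketch only shows that adding runs to a design cannot increase the average prediction variance.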

Inductively coupled plasma-optical emission spectroscopy
ICP-OES was used to quantify elemental concentrations in each sample. Measurements were made in axial view, for enhanced sensitivity, using a Thermo Fisher (Bremen, Germany) iCAP PRO instrument operated at 1150 W with an Ar flow rate of 12 L min−1. The instrument is equipped with an echelle spectrometer and a high-speed charge injection device (CID) detector for the simultaneous detection of all wavelengths (167-852 nm). All samples were introduced with an Elemental Scientific Inc. (ESI, Omaha, NE, USA) SC-2DXi autosampler into a quartz nebulizer housed within a quartz spray chamber. Emission spectra were processed by adjusting the background correction and integration area using Qtegra™ Intelligent Scientific Data Solution™ software (Bremen, Germany). An interelement correction can be applied to raw intensity values when the spectral overlap from the dominant emission lines is known. External calibration was used to determine unknown elemental concentrations using either a standard calibration curve for each element (0.01-5 µg mL−1) or an I-optimal design selected training set (Section 2.2). Spectral data were postprocessed using Thermo Fisher software to help account for overlapping peaks when determining elemental concentrations from software-derived intensity values. Trace metal ICP-OES measurements in U were validated against quality control and reference samples (CRM 124-1 and CUP-2).

Partial least squares regression
PLSR is one of the most popular supervised multivariate modeling methods. It models both the X (spectra) and Y (concentration) matrices simultaneously to find the factors (also known as latent variables, LVs) in X that best predict Y, by iteratively maximizing the covariance between X and Y. The ideal number of LVs is typically selected by comparing the calibration and validation root mean square error (RMSE) against the number of factors in the model. Factor selection is typically performed using a set of test samples, from either cross validation or an independent set, to evaluate model performance. The last factor with an appreciable decrease in the RMSE of cross validation (RMSECV) generally corresponds to the ideal number of LVs. Including too many factors can overfit the model and introduce unwanted noise. A full cross validation, leaving one sample out at a time, was used.
The Unscrambler X soware (version 10.4) was used for multivariate analysis and data preprocessing.A NIPALS algorithm with 100 iterations was used for PLSR model calibration. 29LS2 models, which handle multiple Y responses simultaneously, were used unless otherwise stated.Variable selection based on signicant regression coefficients did not improve the models.Data preprocessing and feature selection methods were evaluated; however, these did not result in signicant improvements (data not shown here). 22,30,31

Statistics and limits of detection
The RMSE was used as the primary metric for cross validation (CV) statistics and prediction (P) error, defined in eqn (1):

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \quad (1)$$

where $y_i$ is the known concentration, $\hat{y}_i$ is the model predicted concentration, and $n$ is the total number of samples.32 RMSECV and RMSEP measure the dispersion of samples around the regression line when cross validation or the validation set is used, respectively. To simplify comparisons, the RMSEP values were divided by the median of the concentration range and converted to a percentage (RMSEP%). RMSEP% values ≤ 5% generally indicate acceptable model performance. The deviation (i.e., uncertainty) in predicted Y-values (i.e., concentrations) for each individual sample was estimated as a function of the global model error, sample leverage, and residual X-variance.33 Percent relative difference (% RD) was used to calculate how close the predictions were to the reported mean concentration in the verification samples using eqn (2):

$$\%\,\mathrm{RD} = \frac{C_1 - C_2}{C_2} \times 100 \quad (2)$$

where $C_1$ and $C_2$ correspond to the measured and reference concentrations, respectively. A zeta score (ζ) was used to evaluate PLSR predictions compared to reported reference values.8 Zeta scores between +1 and −1 are considered highly acceptable, whereas values above +2 or below −2 are questionable. Zeta scores were calculated using the experimental result ($x$) and its uncertainty $u^2(x)$ with the certified reference value ($x_a$) and its standard uncertainty $u^2(x_a)$ using eqn (3):

$$\zeta = \frac{x - x_a}{\sqrt{u^2(x) + u^2(x_a)}} \quad (3)$$

The International Union of Pure and Applied Chemistry defines the LOD as the lowest concentration that can be detected with reasonable certainty for a given method.34 Ortiz provided an expansion of the traditional univariate LOD equation for multivariate methods to determine a pseudounivariate LOD (LOD_pseudo),35 summarized in eqn (4):

$$\mathrm{LOD_{pseudo}} = 3.3\, s_{\mathrm{pseudo}}^{-1} \left[ \left( 1 + h_{0\min} + \frac{1}{n} \right) \mathrm{var_{pseudo}} \right]^{1/2} \quad (4)$$

where $s_{\mathrm{pseudo}}$ is the slope of the known calibration sample concentrations plotted against the model-predicted calibration sample concentrations, $h_{0\min}$ is the minimal calibration sample leverage, $n$ is the number of calibration samples, and $\mathrm{var_{pseudo}}$ is the variance of the model-predicted calibration sample concentrations. Note that these LOD_pseudo values are estimates and have been shown to be either consistent with or conservative when compared to more calculation-intensive LOD confidence bands.36

Results and discussion

Optical emission spectra
The optical emission spectra of most elements on the periodic table are well established. These spectra originate from atoms or ions that absorb energy from the plasma, causing electrons to move from ground to excited states. When excited electrons transition back to lower energy levels, each element emits characteristic photons at wavelengths corresponding to the energy change between levels. The linear relationship between the intensity of the emitted light and the number of emitting atoms is described by the Beer-Lambert law. Univariate regression curves are commonly used to describe how the intensity of light is related to the concentration of each element in solution. However, this calibration approach requires spectra that are free from overlapping spectral features related to other elements in solution.37 The linear relationship breaks down when measuring trace elements in nuclear materials with line-rich emission spectra, like U. Several emission spectra are shown in Fig. 1 to illustrate the range of U matrix spectral interference on trace element spectra. The U interference with the Fe 238.20 nm emission line resulted in a relatively simple baseline offset. On the other hand, the V 310.23 nm line was more significantly influenced by convolution with the U emission peaks. The effects of low-lying U peaks vary from element to element. Additionally, these effects vary significantly between emission lines from the same element (Fig. S1†). Thus, multiple emission peaks for each element must be considered.
Interferences from adjacent or overlapping emission lines from the matrix (U) complicated the quantification of most elements using standard univariate calibration and instrument software settings. Several examples are provided in Table 1. The emission spectra were postprocessed by adjusting the background integration area using Qtegra™ Intelligent Scientific Data Solution™ software. The univariate calibration curves for each species, without the U matrix, were used to quantify trace elements in the U matrix for several quality control samples (Table S4†). The RMSE% values for five trace elements, compared to reference values, are shown in Table 1. The RMSE% for U concentration (emission peak 385.96 nm) by the univariate approach was 2.6%. Trace Fe was the only element quantifiable by this standard approach (i.e., ≤5%). This confirms previous findings that required matrix separation of trace elements before quantification with standard ICP-OES methods.8 Therefore, a multivariate approach was investigated to account for overlapping U peaks and improve trace quantification by analyzing optical emission spectra directly.
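The way a U baseline offset biases a matrix-free univariate calibration can be shown with a small worked example. All numbers below (sensitivity, blank, and the extra counts attributed to low-lying U peaks) are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from Table 1.

```python
import numpy as np

# Hypothetical matrix-free standards: concentration (µg/mL) and intensity (counts)
std_conc = np.array([0.01, 0.1, 1.0, 5.0])
sensitivity, blank = 1000.0, 5.0             # assumed counts per µg/mL and blank counts
std_intensity = sensitivity * std_conc + blank

# Straight-line univariate calibration
slope, intercept = np.polyfit(std_conc, std_intensity, 1)


def univariate_predict(intensity):
    """Invert the straight-line calibration to get a concentration."""
    return (intensity - intercept) / slope


true_conc = 0.5                               # µg/mL trace element in the sample
u_offset = 300.0                              # assumed extra counts from low-lying U peaks

clean = univariate_predict(sensitivity * true_conc + blank)
with_u = univariate_predict(sensitivity * true_conc + blank + u_offset)
bias_percent = 100.0 * (with_u - true_conc) / true_conc

print(round(clean, 3), round(with_u, 3), round(bias_percent, 1))
```

With these assumed numbers the offset alone inflates the reported concentration from 0.5 to about 0.8 µg/mL, a 60% error, even though the calibration line itself is perfect. Convolved peaks, as for V, would distort the result in ways a constant offset correction cannot remove.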

Selecting sample concentrations
Supervised multivariate regression models must contain samples covering the anticipated conditions, here trace concentrations from 20 to 5000 µg per g U. Optimal designs have been used with great success to minimize the number of samples in spectral training sets.21,22 They are the most flexible, user-friendly, and efficient option for selecting training set concentrations when the fewest samples are desired. One-factor-at-a-time methods, more commonly used for selecting samples, result in numerous samples.24 For example, a 2-factor set varied at 5 levels would require 25 samples (5²). Sample concentrations were selected by I-optimal experimental design. Six model points were augmented with ten LOF points (Table 2). The ratio was calculated by dividing the trace (µg) by the U (g) to obtain µg per g U. LOF samples fall within the factor space (i.e., no vertex points) and can be added to the calibration set or used as a statistically derived validation set to avoid user bias. Here, the calibration set contained 12 samples, and the validation set contained 10 samples, including 4 LOF points and an additional set of 6 validation samples (Table S3†) to cover the factor space for each variable. Additional LOF points could be included in future designs to provide more quality controls. Optimal designs encompass both mixture and process variables, contain different high and low components, and accommodate constraints with factor limits, so they can easily be tailored to specific conditions.
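The sample counts and the ratio calculation above can be checked with a few lines of code. The five levels per factor are hypothetical and evenly spaced across the stated ranges; only the 5² grid size, the 6 + 10 run count, and the unit conversion come from the text.

```python
from itertools import product


def ratio_ug_per_g(trace_ug_per_ml, u_ug_per_ml):
    """Trace mass per gram of U: (µg trace / µg U) * 1e6 µg per g."""
    return trace_ug_per_ml / u_ug_per_ml * 1e6


def full_factorial(levels_per_factor):
    """All combinations of the given factor levels."""
    return list(product(*levels_per_factor))


u_levels = [4, 250, 500, 750, 1000]           # hypothetical levels, µg/mL
trace_levels = [0.02, 0.5, 1.0, 1.5, 2.0]     # hypothetical levels, µg/mL
grid = full_factorial([u_levels, trace_levels])

n_i_optimal = 6 + 10                          # model points + LOF points (from the text)
print(len(grid), n_i_optimal)

# End points of the studied ratio range
print(ratio_ug_per_g(0.02, 1000), ratio_ug_per_g(2.0, 400))
```

The grid needs 25 solutions against 16 for the augmented I-optimal design, and several grid points (for example 2 µg/mL trace in 4 µg/mL U) fall far outside the 20 to 5000 µg per g U window, so they would have to be discarded or replaced.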

PLSR model development and performance
PLSR was used to correlate optical emission spectra to analyte concentration. The concentrations of the samples in the calibration set are shown in Table 2. The predictor matrix X comprised the entire spectrum for each analyte. Trimming the spectra to include only regions specific to the trace elements did not improve the model performance (data not shown here). Individual PLSR models were built for each trace element. Uranium concentration was modeled using low-lying peaks in the trace element regions of interest for most elements. This allowed the quantification of both trace species and U simultaneously (i.e., in a single ICP-OES measurement). The limit of detection for U using low-lying peaks varied from element to element because the intensity of the low-lying U peaks relative to the trace element emission in each region varied significantly. For several elements, U measurements were improved by modeling (PLS1) the U emission peak (385.96 nm) to predict U concentration. PLS1 and PLS2 models could be combined in a stacked regression approach in future work.20
The optimal number of factors (i.e., LVs) in the PLSR models for each element was chosen by evaluating the percent root mean square error (RMSE%) versus the number of factors. RMSE values have the same units as the response variable (i.e., µg mL−1). An example model of the Zr 339.198 nm emission region is provided in Fig. 2a. The last significant decrease in RMSE% occurred at three factors for Zr and U, which suggests that three factors should be included. The PLSR model, with three factors for both Zr and U, was used to predict sample concentrations in a validation set to calculate RMSEP. Predicted versus reference parity plots for Zr and U are shown in Fig. 2b and c. A linear correlation near one for each measurement indicated robust calibration, CV, and prediction performance. Similar RMSEC, RMSECV, and RMSEP values indicated a balanced model for Zr and U.
RMSEC and RMSECV statistics differed significantly when fewer training set samples were used. This suggests that the number of samples in the training set was minimized effectively using I-optimal design and approached the optimum (∼12 samples). Future work could assess this in greater detail. The number of samples used to train the PLSR model was consistent with the traditional approach, which typically requires six trace element standards and six U standards (12 total). The RMSEP and RMSEP% values for U and trace elements are reported in Table 3. RMSEP values approximate the ± error associated with predicted values. The number of factors varied between elements. Two or three factors were most common, although several elements used four or even five factors. Zirconium (Zr) and niobium (Nb) emission spectra are convoluted with low-lying uranium spectra (Fig. 1 and S2†). Zirconium and niobium are difficult to separate from uranium using common methods (e.g., UTEVA).8 The PLSR approach measured both Nb and Zr with high accuracy without separation. This highlights a major benefit of this new approach for modeling emission spectra directly.
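The RMSEP and RMSEP% statistics reported in Table 3 can be computed as sketched below. The text divides RMSEP by the median of the concentration range; this sketch assumes that means the midpoint of the calibrated range, and the function names and example numbers are illustrative only.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between known and predicted concentrations."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def rmsep_percent(rmsep, conc_low, conc_high):
    """RMSEP as a percentage of the midpoint of the calibrated range (assumption)."""
    midpoint = (conc_low + conc_high) / 2.0
    return 100.0 * rmsep / midpoint


# Hypothetical validation results for a trace element (µg/mL)
known = [0.10, 0.50, 1.00, 1.50]
predicted = [0.12, 0.47, 1.03, 1.46]

rmsep = rmse(known, predicted)
print(rmsep, rmsep_percent(rmsep, 0.02, 2.0))
```

Under this reading, a trace element calibrated over 0.02-2 µg/mL meets the 5% guideline when RMSEP is at or below roughly 0.05 µg/mL.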
A different number of factors was used for different elements, despite there being two species (Y variables) in each PLSR model. This could be related to the dissimilar intensities of low-lying U peaks relative to trace element peaks. The explained variance plots were compared to X-loadings to better understand differences between models and confirm that the models were describing relevant features in the spectra. Line loadings should have a profile like the original spectra. An example with Zr and Mn models is shown in Fig. 3. The calibration total explained Y-variance for Zr (three factors) and Mn (two factors) was 99.95% and 99.94%, respectively, which indicated that most of the total variation in Y (i.e., the concentration matrix) was accounted for. CV explained variance plots matched the calibration, which suggests that each model can describe new data well and that there is no indication of overfitting (Fig. 3a and c).
X-loading plots show the wavelengths that provide the most important sources of information. They show how the spectral data relate to the variation in Y. Variables with the largest loadings in the earlier components describe the greatest differences between samples. The first loading in each model represents the emission band of the trace species. This was consistent with the explained Y-variance plot, which indicated that the first factor primarily describes the variation in the trace species. This was expected because the trace element emission peak is the greatest source of signal variation for most species. However, for some elements (e.g., V), the low-lying U spectrum is more intense than the trace element (Fig. 1) and the opposite trend in explained variance was observed (data not shown here). This could explain why PLSR models for some elements like V contained more than three factors.
Manganese X-loadings for factors 1 and 2 are shown in Fig. 3b. These correspond almost entirely to the Mn emission band (X-loading 1) and U low-lying peaks (X-loading 2). Zirconium X-loadings for factors 1, 2, and 3 are shown in Fig. 3d. The first and second loadings look like the optical emission spectra for Zr and U, respectively (Fig. 1). The X-loading for factor 2 looks primarily like the background component from the U low-lying peaks. This is consistent with the explained Y-variance plot, which shows that the second component describes mostly the U portion. The second component also describes some information related to the trace species, particularly in the Zr model. The Zr X-loading for factor 3 likely describes instrument drift combined with an adjustment for the convolution of the Zr and U emission peaks. These results illustrate that the PLSR models describe the data well and in a physically meaningful way.

Analysis of certied reference materials
The ICP-OES analysis of trace elements in U materials should be accompanied by quality control measurements using certified reference materials.5][6] Standards may contain as many as 66 well-characterized elemental concentrations ranging from 0.005 to 11 000 µg per g U.4 Here, we evaluated how well the PLSR approach predicted impurity levels within the studied concentration range (20-5000 µg per g U) in uranium oxide CRM 124-1 and CUP-2 reference materials.4,6 This concentration range could be expanded in future work by modifying the I-optimal design parameters. Best-case detection limits for ICP-OES measurements of some elements are near ∼5 µg per g U when separations are used.2 The limit of detection (LOD) with respect to U for each trace species varied from ∼20 to 200 µg per g U using PLSR. If the concentration range is extended further, more samples in the training set and a stacked regression approach could effectively cover the entire range while accounting for potential nonlinearity in the emission trends.20,21 The percent relative difference (% RD) values for 30 elements are reported in Table 4 (CUP-2) and Table 5 (CRM 124-1). Multiple wavelengths were evaluated for most elements. The results in Tables 4 and 5 are reported for the wavelength of each element with the best performance (i.e., lowest % RD). For example, one wavelength for Al (308.215 nm) and Mn (257.61 nm) missed the mark for the lowest or both concentrations, while the other wavelength provided highly accurate values (Tables 4 and 5). This shows the need to evaluate multiple wavelengths for each element to obtain the best results.
We also employed a pseudounivariate approach to calculate the method LOD based on how well the model predicts the samples in the calibration set (see Section 2.5). The LOD approximation was generally consistent with the measured reference material concentration results. For example, the LOD for Co was calculated as 38 µg per g U. We tested the model on CRM 124-1, which has a reported mean value of 23.3 ± 6.1 µg per g U, and the result fell outside the range at 17 ± 16 µg per g U Co, or −28.2% RD. The large uncertainty associated with the measurement also suggests that we were operating below the LOD. The reported mean values for B in CRM 124-1 and CUP-2 were 5.5 ± 1 and 73 ± 25 µg per g U, respectively. The PLSR model predicted 9.7 ± 7 and 73.5 ± 11 µg per g U B for CRM 124-1 and CUP-2, respectively. The only example that slightly missed the mark was the Cu 324.75 nm peak. With an estimated LOD of 28 µg per g U, the % RD values for both the CRM 124-1 (46.3 ± 9.4 µg per g U) and CUP-2 (31.6 ± 5.7 µg per g U) standards were expected to be in range. However, the predicted CUP-2 sample concentration was not within the expected % RSD bounds. This stresses the point that reference materials and quality controls must accompany each measurement to ensure accurate results. Ultimately, zeta scores and % RD values were used to compare PLSR model concentration values and reported reference values. Overall, the zeta scores for every element were within the ±1 range, indicating highly acceptable results. The prediction matched the reference mean only when the % RD was lower than the reported % RSD values. Although some elements, such as La, Nd, and Dy, were below the estimated detection limits, we still included % RD values. Most of the reported trace and U concentrations were predicted simultaneously using PLS2 regression models. For CUP-2 and CRM 124-1, the U concentrations of the measured solutions were 1223 ± 19 and 1167 ± 16 µg mL−1, respectively. Several elements (Ca, Fe, Mg, Na, Be) with minimal U overlap and relatively strong emission intensities fared slightly better using U concentrations provided by a PLS1 model built using the U 385.96 nm emission spectra. Many alkali/alkaline earth elements (e.g., Li and Ca) had emission peaks with much greater intensity than the low-lying U background, such that the quantification of U with low-lying peaks was compromised in the range studied. Calcium and sodium concentrations in CUP-2 were modeled to determine how well PLSR can predict sample concentrations outside of the modeled range (20-5000 µg per g U). The Ca and Na zeta scores of −0.76 and −0.078 and the % RD values indicated highly acceptable results.
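The comparisons against reference values discussed above can be sketched in a few functions. The sketch assumes that % RD is taken relative to the reference value, that the quoted ± values can be treated as standard uncertainties in the zeta score, and that the pseudounivariate LOD uses the commonly cited factor of 3.3. The function names and the LOD inputs are hypothetical; the B values for CUP-2 are the ones quoted in the text.

```python
import math


def percent_rd(measured, reference):
    """Percent relative difference of a measured value from the reference."""
    return 100.0 * (measured - reference) / reference


def zeta_score(x, u_x, x_ref, u_ref):
    """Difference between result and reference, scaled by the combined uncertainty."""
    return (x - x_ref) / math.sqrt(u_x**2 + u_ref**2)


def classify_zeta(z):
    """Bands named in the text; the 1-2 band is not classified there."""
    if abs(z) <= 1.0:
        return "highly acceptable"
    if abs(z) > 2.0:
        return "questionable"
    return "intermediate (not classified in the text)"


def lod_pseudo(slope, h0_min, n_cal, var_pseudo):
    """Pseudounivariate LOD, assuming 3.3 / slope * sqrt((1 + h0min + 1/n) * var)."""
    return 3.3 / slope * math.sqrt((1.0 + h0_min + 1.0 / n_cal) * var_pseudo)


# B in CUP-2 (µg per g U): predicted 73.5 ± 11, reference 73 ± 25 (values from the text)
rd_b = percent_rd(73.5, 73.0)
z_b = zeta_score(73.5, 11.0, 73.0, 25.0)
print(round(rd_b, 2), round(z_b, 3), classify_zeta(z_b))

# Hypothetical calibration statistics for an LOD estimate
print(lod_pseudo(slope=1.0, h0_min=0.25, n_cal=4, var_pseudo=1.5))
```

A predicted value that sits below the estimated LOD, as for Co in CRM 124-1, would be expected to show a large uncertainty and a large % RD even when its zeta score remains within ±1.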

Conclusions
For the rst time, PLSR models were developed to quantify trace element concentrations in U (20-5000 mg per g U) based solely on spectral variations in ICP-OES spectra, without prior chemical separations.The reduction in RMSEP compared to standard soware protocols shows how multivariate analysis can be used to account for convoluted spectral features.This method improves the timeliness of ICP-OES measurements that traditionally rely on chemical separations, 2 and this multivariate approach can be used to measure trace species that are difficult to separate from U (e.g., Zr, Nb, Th). 25 The analysis presented had an overall % RD < 10% for nearly 30 elements of interest compared to two certied reference materials (CMR 124-1 and CUP-2).The zeta score determination further demonstrates the effectiveness of this multivariate approach and that it is ready for the analysis of real process solutions.This methodology can likely be applied to every element with measurable optical emission spectra and can readily adapt to support applications within the nuclear eld and beyond.Future work may include further optimizing detection limits and characterizing other systems, such as mixed lanthanides and actinides, and testing this approach on other ICP-OES platforms.

Fig. 1 Optical emission spectra of trace elements from 0.02 to 1.5 µg mL−1 in 1000 µg per mL U for (a) V 310.23 nm, (b) Mn 259.37 nm, (c) Zr 339.20 nm, and (d) Fe 238.20 nm compared to a standard (1 µg mL−1 multielement standard) and a 1000 µg per mL U sample.

Fig. 2 Plot of (a) RMSE% versus the number of factors, (b) Zr parity plot with RMSE values, and (c) U parity plot with RMSE values. RMSE values are in parts per million (µg mL−1).

Table 1 RMSE% for five trace elements calculated by the standard univariate approach

Table 2 I-optimal selected analyte concentrations with space and build type and trace concentration (µg mL−1) a
a (*) LOF points included in the validation set. Required model points are bolded. Abbreviations include lack of fit (LOF). U and ratio concentrations were rounded to the nearest integer.

Table 3 RMSEP and RMSEP% values for U and trace elements and the number of factors included in each model

Table 4 Predicted µg per g U compared to the reported mean values for CUP-2 (ref. 4) a