Analytical Methods Committee, AMCTB No. 112
First published on 7th October 2022
Estimates of measurement uncertainty (MU) are now ubiquitous in analytical chemistry. Having sufficiently reliable estimates is important for decision making, e.g., deciding whether a particular measurement method produces results that are fit for the intended purpose (FFP). In some situations it can be useful to compare these estimates. For example, we may wish to establish whether the MU for an in situ method, where measurements are made directly in the field, is significantly different from that obtained using a more traditional laboratory method (one would often expect it to be larger). Or we might want to compare the different components of MU (e.g., compare the uncertainty arising from the sampling activity with that arising from the analytical method), enabling us to take a cost-effective approach to reducing the overall (combined) MU. Quoted values of MU are, however, only ever estimates, subject to their own uncertainties (AMCTB No. 105). This has implications when two values of MU are compared. An example is provided in which the sampling and analytical components of MU are compared for measurements of the nitrate concentration in a field of lettuces. It is shown that in this case it would be more cost effective to reduce the sampling component of MU in order to reduce the overall MU.
Fig. 1 Nested balanced experimental design for the duplicate method used in uncertainty estimation. Two samples are acquired using fresh interpretations of the same protocol, and two analyses performed on each.2
Standard deviations are calculated for each of the three levels in Fig. 1 by analysis of variance (ANOVA). This gives estimates of the standard uncertainty u at both the sampling and analytical levels. The overall MU is obtained by combining the sampling and analytical standard uncertainties in quadrature, i.e., by taking the square root of the sum of their squares. The expanded relative uncertainty U′ at any level can be expressed as a percentage, with a coverage factor of 2 for approximately 95% confidence, as U′ = 2 × 100 × u/x, where x is the concentration value.
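As a worked illustration (not part of the original brief), the combination in quadrature and the expanded relative uncertainty can be sketched in Python. The u values below are the robust 'Sampling' and 'Analytical' estimates used later in the example (Table 1); the mean nitrate concentration of 4000 mg kg−1 is a hypothetical value chosen only to show the arithmetic.

```python
import math

# Robust standard uncertainties (mg/kg) at the sampling and analytical levels
u_sampling = 319.0
u_analytical = 168.0

# Hypothetical mean concentration (mg/kg), for illustration only
concentration = 4000.0

# Combine in quadrature (the components are treated as independent)
u_meas = math.sqrt(u_sampling**2 + u_analytical**2)

# Expanded relative uncertainty (%) with coverage factor k = 2 (~95% confidence)
U_rel = 2 * 100 * u_meas / concentration

print(round(u_meas, 1), round(U_rel, 1))
```

With these inputs the combined u comes out at about 361 mg kg−1, matching the robust 'Measurement' value in Table 1.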
Another consideration is that in the particular case of variances estimated using the balanced design described above, the assumption of independence only strictly holds true when comparing variances at the lowest (analytical) level. This is because the variances at the higher levels are calculated by subtracting the variance of the level below. However, it will often be found in practice that variance at the sampling level is much greater than variance at the analytical level, and it may then be reasonable to use an F-test on the combined variances (i.e., on the squares of the MU values) as an approximation.4
An alternative to the F-test is to calculate confidence intervals (CIs) for the two uncertainties being compared. Note that these are CIs of the uncertainties themselves, not CIs of the measurement results (or of their mean values). The computer program RANOVA3 (ref. 5) includes an option to calculate CIs on uncertainties for the n × 2 × 2 experimental design introduced earlier (Fig. 1); discussion and details of the calculations of these CIs can be found in refs. 1 and 6. The CIs can then be compared for overlap. If they do not overlap, the variances are different at a significance level of p < 0.05, and we can be confident that a significant difference exists. However, this type of comparison has low power as a statistical test: where the CIs overlap, but the degree of overlap is not obviously large, we cannot tell whether a significant difference exists or not. For this reason the F-test is to be preferred if the conditions of normality and independence are either fully met, or we have reason to believe that an F-test will be a sufficiently good approximation.4
Fig. 2 Sampling of lettuce: the protocol (left) specifies taking 10 heads (numbers indicate the order in which the increments were taken) to make a single composite sample from each bay. The duplicate sample (right) was acquired using the same protocol but applying a different route.4
Table 1 Classical and robust estimates of u (mg kg−1) with 95% confidence intervals in parentheses

| | Between-sampling target (mg kg−1) | Sampling (mg kg−1) | Analytical (mg kg−1) | Measurement (mg kg−1) |
|---|---|---|---|---|
| SD (or u) classical | 556 (0, 1320) | 518 (334, 1008) | 148 (110, 226) | 539 (372, 1018)a |
| SD (or u) robust | 565 (347, 1176) | 319 (248, 705) | 168 (138, 204) | 361 (300, 724) |

a CI for measurement is an approximation based on linear combinations of variances.8
A comparison between the sampling and analytical uncertainty components can indicate where it would be most efficient to allocate resources if we wish to reduce the overall measurement uncertainty ('Measurement' in Table 1). In this case, the CIs of the 'Sampling' u and the 'Analytical' u do not overlap in either the classical or the robust results (e.g., 248 to 705 mg kg−1 for the robust sampling u does not overlap with 138 to 204 mg kg−1 for the robust analytical u). This shows that the sampling uncertainty is significantly larger (p < 0.05) and clearly dominant, and therefore it could be advantageous to reduce this component of uncertainty, depending on the relative costs of sampling and analysis.
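The overlap check itself is easy to automate. A minimal sketch (illustrative only, using the robust CIs from Table 1):

```python
def intervals_overlap(a, b):
    """True if closed intervals a = (lo, hi) and b = (lo, hi) overlap."""
    return a[0] <= b[1] and b[0] <= a[1]

# Robust 95% CIs from Table 1 (mg/kg)
ci_sampling = (248, 705)
ci_analytical = (138, 204)

print(intervals_overlap(ci_sampling, ci_analytical))  # no overlap: significant at p < 0.05
```

Recall that non-overlap establishes a significant difference, but overlap on its own is inconclusive because the comparison has low power.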
Unfortunately, it is not possible to compare the sampling and analytical uncertainties directly using an F-test. The nested design (see Fig. 1) enables ANOVA to calculate the variances at the target and sample levels by subtracting the mean-square values of the level below (i.e., analytical). Consequently, these two variances do not meet the requirement of independence. However, we can test whether the sampling variance is significantly different from zero using the ratio of the mean-square values at the sampling (MSS) and analytical (MSA) levels, MSS/MSA, which can be compared with the upper critical value of the F-distribution.4
In the following calculations, I is the number of (duplicated) sampling targets (I = 8), J is the number of samples per sampling target (J = 2) and K is the number of analyses per sample (K = 2).
The mean-square ratio can be calculated from the ANOVA results (Table 1) as follows, where sS and sA are the robust standard deviations at the 'Sampling' and 'Analytical' levels, respectively:

MSS/MSA = (K × sS² + sA²)/sA² = (2 × 319² + 168²)/168² = 8.2
Degrees of freedom for the sampling and analytical levels are calculated as I(J − 1) = 8 and IJ(K − 1) = 16, respectively. From these we can look up the critical value for this ratio, Fcrit(0.05, 8, 16) = 2.6. The ratio MSS/MSA = 8.2 is greater than this critical value, so we can reject the null hypothesis that the sampling variance at the population level (σS²) is zero, at a probability level of α = 0.05 (95% confidence).
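This calculation can be reproduced in a few lines of Python (an illustrative sketch, not the RANOVA3 implementation), using scipy.stats.f for the critical value:

```python
from scipy.stats import f

I, J, K = 8, 2, 2            # targets, samples per target, analyses per sample
s_S, s_A = 319.0, 168.0      # robust SDs at the 'Sampling' and 'Analytical' levels (Table 1)

MS_A = s_A**2                # analytical mean square
MS_S = K * s_S**2 + s_A**2   # sampling-level mean square
ratio = MS_S / MS_A          # ~8.2

df_S = I * (J - 1)           # 8
df_A = I * J * (K - 1)       # 16
F_crit = f.ppf(0.95, df_S, df_A)   # ~2.6

# Reject H0 (population sampling variance is zero) if the ratio exceeds F_crit
print(ratio > F_crit)
```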
We can also use the mean-square values to test whether the sampling and analytical variances are significantly different, because in the particular case where the population variances are equal (σS² = σA²), the expected value of MSS is (1 + K) times that of MSA, and MSS/MSA is distributed as (1 + K) times an F-statistic with I(J − 1) and IJ(K − 1) degrees of freedom. The observed ratio can therefore be compared with (1 + K) × Fcrit.
Since MSS/MSA = 8.2 is greater than (1 + K) = 3 times the critical value (2.6 × 3 = 7.8), the null hypothesis that the two population variances are equal (σS² = σA²) can also be rejected, indicating that the sampling and analytical variances are significantly different at the chosen probability level of α = 0.05. This supports the previous conclusion (based on CIs) that the sampling uncertainty is dominant.4
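The scaled comparison can be sketched in the same way (again an illustrative sketch using the Table 1 values, not the brief's own software):

```python
from scipy.stats import f

I, J, K = 8, 2, 2
s_S, s_A = 319.0, 168.0                      # robust SDs from Table 1
MS_ratio = (K * s_S**2 + s_A**2) / s_A**2    # ~8.2

# Under H0: sigma_S^2 == sigma_A^2, E[MS_S] = (1 + K) * E[MS_A],
# so the observed ratio is compared against (1 + K) times the F critical value
F_crit = f.ppf(0.95, I * (J - 1), I * J * (K - 1))
threshold = (1 + K) * F_crit                 # ~3 x 2.6 = 7.8

print(MS_ratio > threshold)  # reject equality of variances if True
```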
In general, this approach assumes either that there have been no significant systematic effects contributing to the MU (such as analytical bias), or that they have been corrected for, or included within the estimate of MU.
The example presented demonstrates a comparison between different components of uncertainty (the ‘Sampling’ and ‘Analytical’ components estimated using the experimental design in Fig. 1). In other situations, where we might wish to compare uncertainties between two different measurement methods, some approximations might be applicable. For example, if we wish to compare two measurement methods where the sampling uncertainty clearly dominates over the analytical uncertainty in both cases, an approximation can then be made using an F-test on the two values of the combined MU. Further details for these other situations are given in ref. 4.
Peter D. Rostron
This Technical Brief was prepared for the Analytical Methods Committee with contributions from members of the AMC Sampling Uncertainty and Statistics Expert Working Groups, and the Eurachem Working Group on Uncertainty from Sampling (both chaired by Michael H. Ramsey), and approved on 4th August 2022.
This journal is © The Royal Society of Chemistry 2022