Michael Thompsonᵃ, Barry J. Colesᵇ and Joseph K. Douglasᵃ
ᵃ School of Biological and Chemical Sciences, Birkbeck College (University of London), Gordon House, 29 Gordon Square, London, UK WC1H 0PP
ᵇ Department of Earth Science and Engineering, The Royal School of Mines, Imperial College of Science Technology and Medicine, London, UK SW7 2BP
First published on 13th December 2001
Quality control in sampling has been demonstrated as practicable in sampling procedures that require the combination of sample increments to form a composite sample. The proposed method requires no sampling resources or use of time beyond those normally used. Increments are allocated at random into two half-sized composites, each of which is analysed separately. The absolute difference between the two results is plotted on a one-sided control chart, which is interpreted like a Shewhart chart. In commonly prevailing circumstances the analytical precision is negligible and the chart represents sampling precision alone.
The proposed method is applicable to sampling procedures that depend on the combination of a number of increments taken from the sample target material. The composite sample so formed is then further processed to produce the laboratory sample for analysis. In the proposed QCSAM method the standard procedure is modified: the increments, once taken, are apportioned at random to either of two separate composite samples or ‘splits’. Each of these splits is processed and analysed separately. The absolute difference between the two results depends on the combined sampling precision and analytical precision and can be plotted on a chart related to the familiar Shewhart chart. The procedure is shown schematically in Fig. 1.
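The random apportioning step can be sketched in a few lines of Python. This is only an illustration: the function name `allocate_increments`, the increment labels, and the choice of equal-sized splits (suggested by the "half-sized composites" of the abstract) are assumptions, not part of the published protocol.

```python
import random


def allocate_increments(increments, rng=None):
    """Randomly apportion increments into two splits, A and B.

    Assumption: the splits are made as equal in size as possible;
    with an odd number of increments, split B receives the extra one.
    """
    rng = rng or random.Random()
    shuffled = list(increments)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)         # random order = random allocation
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical labels for ten increments taken from one sampling target.
increments = [f"inc{i:02d}" for i in range(1, 11)]
split_a, split_b = allocate_increments(increments, random.Random(1))
print("Split A:", split_a)
print("Split B:", split_b)
```

Each split would then be combined, processed, and analysed separately, so the total number of increments taken in the field is unchanged.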
Fig. 1 Schematic diagram showing a standard sampling/analytical protocol and the QCSAM protocol.
$$\sigma_x^2 = 2\sigma_{\mathrm{sam}}^2 + \sigma_{\mathrm{an}}^2$$

$$\sigma^2 = \sigma_x^2/2 = \sigma_{\mathrm{sam}}^2 + \sigma_{\mathrm{an}}^2/2,$$
Considering now the difference d = xA − xB between the results xA, xB obtained by separately analysing two complementary splits, the variance of d is given by

$$\sigma_d^2 = 2\sigma_x^2 = 2\left(2\sigma_{\mathrm{sam}}^2 + \sigma_{\mathrm{an}}^2\right).$$
In theory such a chart would reflect the variation in sampling and analysis jointly. However, it is often more costly to reduce the sampling variance, which is therefore usually regarded by analysts as the limiting factor. Consequently, in most practical instances the sampling variance would be the dominant term, and in those instances, the QCSAM chart would provide a visual indication essentially of sampling variation alone: the analytical contribution would be negligible. For example, if (hypothetically) σan = 0, then σd = 2σsam. If σan = 0.5σsam then σd would be dilated beyond this hypothetical value by a factor of only 1.06. Even if σan = σsam, σd would only be increased by a factor of 1.22. (However, if σan > 2σsam, it would be futile to attempt to monitor σsam at all, so there is only a small window in the possible values of σan/σsam where the analytical and sampling variation would be jointly represented on a QCSAM chart.)
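The dilation factors quoted above follow directly from the variance of d, σ²d = 2(2σ²sam + σ²an). A minimal Python check is given below; the function names `sigma_d` and `inflation_factor` are illustrative only.

```python
import math


def sigma_d(sigma_sam, sigma_an):
    """Standard deviation of d = xA - xB, from var(d) = 2(2*var_sam + var_an)."""
    return math.sqrt(2.0 * (2.0 * sigma_sam**2 + sigma_an**2))


def inflation_factor(ratio):
    """sigma_d relative to its value when sigma_an = 0.

    `ratio` is sigma_an / sigma_sam; sigma_sam is set to 1 without
    loss of generality because only the ratio matters.
    """
    return sigma_d(1.0, ratio) / sigma_d(1.0, 0.0)


for ratio in (0.0, 0.5, 1.0):
    print(f"sigma_an/sigma_sam = {ratio:.1f} -> factor {inflation_factor(ratio):.2f}")
# sigma_an/sigma_sam = 0.0 -> factor 1.00
# sigma_an/sigma_sam = 0.5 -> factor 1.06
# sigma_an/sigma_sam = 1.0 -> factor 1.22
```

The factor simplifies to sqrt(1 + ratio²/2), which shows why a moderate analytical standard deviation has so little effect on the chart.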
Because the constitutions of the complementary splits in a pair are random, there is no difference in status between the measurements made on them. It is preferable, therefore, to construct a one-sided control chart for absolute differences |d| between the results of complementary splits (rather than a standard Shewhart chart for the signed differences d). On such a one-sided chart the bottom line would fall at zero, the warning limit at +2σd, and the action limit at +3σd, to provide probabilities corresponding with the two-tailed limits of a classical Shewhart chart.
(The last point above possibly needs some elaboration. Making the one-sided chart would be equivalent to folding a Shewhart chart (and therefore the normal distribution) along the mean line. Fig. 2A shows a normal distribution with the areas outside μ ± 2σd shaded: we expect about 5% of all observations d to fall in the shaded regions jointly and about 2.5% to fall above +2σd. When the distribution is folded along the zero line (Fig. 2B) the negative values of d are transformed into positive values and the frequency of observations falling in each region is doubled in the resulting half-normal distribution (Fig. 2C). As a result about 5% of the values of |d| fall above the +2σd bound, shown by the shaded area in Fig. 2C. Results fall above the +2σd bound in the one-sided chart with the same probability that they fall outside the ±2σd limits on the standard Shewhart chart. Corresponding comments would apply to the 3σd limits also. Note that the chart bounds are here defined in terms of σd, the standard deviation of the signed values d. This is the equivalent of using a one-sided range chart for two observations.⁶)
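The folding argument can be checked numerically. The sketch below assumes that d is normally distributed with zero mean and uses only the standard library; the function name `prob_abs_exceeds` is illustrative.

```python
import math


def prob_abs_exceeds(k):
    """P(|d| > k * sigma_d) for d ~ N(0, sigma_d**2).

    For a standard normal Z, P(|Z| > k) = erfc(k / sqrt(2)), which is
    the two-tailed probability of a classical Shewhart chart and also
    the upper-tail probability of the folded (half-normal) distribution.
    """
    return math.erfc(k / math.sqrt(2.0))


p_warning = prob_abs_exceeds(2.0)   # roughly 0.0455, i.e. "about 5%"
p_action = prob_abs_exceeds(3.0)    # roughly 0.0027
print(f"P(|d| > 2 sigma_d) = {p_warning:.4f}")
print(f"P(|d| > 3 sigma_d) = {p_action:.4f}")
```

Each one-sided probability is exactly twice the corresponding single-tail normal probability, which is the doubling described for Fig. 2C.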
Fig. 2 Folding the normal distribution for a one-sided QCSAM chart.
The experimental points for plotting on the QCSAM chart would be the absolute differences |d| between the analytical results on complementary splits, a value that we call the split absolute difference or SAD. In the absence of an analytical problem (which should be controlled separately by existing methods) a point falling outside the action limit would indicate that either the sampling procedure had been executed ineptly (for example, with too few increments) or that the sampling target material was unusually heterogeneous. In either instance the discrepancy would demonstrate that the sample was suspect and possibly unfit for the purpose.
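As a sketch of how a single point might be assessed against the chart, the function below computes the SAD for one pair of splits and compares it with the warning (2σd) and action (3σd) limits. The function name, the returned labels, and the numerical values are hypothetical, and σd is assumed to be already known or estimated.

```python
def classify_sad(x_a, x_b, sigma_d):
    """Return the split absolute difference (SAD) and its chart status.

    x_a, x_b : analytical results on the two complementary splits
    sigma_d  : standard deviation of the signed difference d
    """
    sad = abs(x_a - x_b)
    if sad > 3.0 * sigma_d:
        status = "beyond action limit"
    elif sad > 2.0 * sigma_d:
        status = "beyond warning limit"
    else:
        status = "compliant"
    return sad, status


# Illustrative values only (arbitrary concentration units).
print(classify_sad(10.0, 9.0, sigma_d=1.0))   # (1.0, 'compliant')
print(classify_sad(10.0, 7.5, sigma_d=1.0))   # (2.5, 'beyond warning limit')
print(classify_sad(10.0, 6.0, sigma_d=1.0))   # (4.0, 'beyond action limit')
```

A point beyond the action limit would not by itself distinguish inept sampling from an unusually heterogeneous target; it only marks the sample as suspect.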
In common with all control charts, the control lines would have to be provisional and subject to review in the initial stages, because during that period the value of σd would be either unknown or poorly estimated. The setting-up strategy used in this study was the empirical method usually adopted in analytical science applications, namely, to use control limits based on an independent fitness-for-purpose criterion in the first instance (when there is no estimate of σd), and to replace them at specific stages with limits calculated from increasingly accurate estimates of σd based on the burgeoning experimental data. The value of σd can be estimated directly from the differences between the results from splits as stdev(d), the standard deviation of the differences d with due regard to the sign of the individual values. Alternatively, σd can be estimated from the SAD values |d| themselves, either as 1.2535 × mean(|d|) or as 1.6608 × stdev(|d|), although the value of the constants in these expressions depends on the assumption of the normal distribution. A robustified estimate (not so dependent on the normality assumption) could be obtained from the median SAD as 1.4826 × median(|d|). In any event, the estimation procedure would have to be protected in some way against undue influence from outlying results.
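The four estimators of σd described above can be compared side by side in a short sketch. It uses the closed-form half-normal constants sqrt(π/2) ≈ 1.2533 and 1/sqrt(1 − 2/π) ≈ 1.6589, which agree with the multipliers quoted in the text to about three significant figures. The data values and the function name `estimate_sigma_d` are hypothetical, and no protection against outliers is included.

```python
import math
import statistics


def estimate_sigma_d(d_values):
    """Estimate sigma_d from signed split differences d in four ways."""
    abs_d = [abs(d) for d in d_values]   # the SAD values
    return {
        # direct estimate from the signed differences
        "stdev_signed": statistics.stdev(d_values),
        # half-normal mean: E|d| = sigma_d * sqrt(2/pi)
        "from_mean_abs": math.sqrt(math.pi / 2.0) * statistics.mean(abs_d),
        # half-normal sd: sd(|d|) = sigma_d * sqrt(1 - 2/pi)
        "from_stdev_abs": statistics.stdev(abs_d) / math.sqrt(1.0 - 2.0 / math.pi),
        # robust estimate from the median SAD
        "from_median_abs": 1.4826 * statistics.median(abs_d),
    }


# Made-up signed differences, for illustration only.
d = [0.3, -0.5, 0.1, -0.2, 0.4, -0.1]
for name, value in estimate_sigma_d(d).items():
    print(f"{name:16s} {value:.3f}")
```

With so few values the estimates differ noticeably from one another, which is consistent with the need to treat early control limits as provisional.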
Suitability of the test materials for the study was established by a separate initial screening of the dataset for potential problems. The analytical data were examined by nested analysis of variance and visual inspection of a graphical representation of the data. There was no evidence of concentration-related heteroscedasticity in the sampling variance or analytical variance, because of the relatively short concentration ranges encountered.
$$\sigma_{\mathrm{sam}} \le \sigma_{\mathrm{ffp}} = 0.05c.$$

$$\sigma_d = 2\sigma_{\mathrm{sam}} \le 2\sigma_{\mathrm{ffp}},$$
If we calculate control limits in the normal way, that is so that the control limits simply describe the performance of the sampling/analytical system under statistical control, we have a situation exactly comparable with ordinary analytical internal quality control. Figs. 5 and 6 show values of SAD plotted on a chart provided with 2σd and 3σd limits where σd was estimated as the robust standard deviation⁶ of the signed differences (d) available up to the sampling of Site 13. Of course, 13 is a small number of observations for the unqualified use of robust estimates and, indeed, for setting up a control chart,⁸ but the purpose here is merely to illustrate the feasibility of the overall procedure, not to provide a definitive example.
Fig. 3 QCSAM chart for calcium, showing SAD values, with provisional control limits derived from a fitness-for-purpose criterion.
Fig. 4 QCSAM chart for lithium, showing SAD values, with provisional control limits derived from a fitness-for-purpose criterion.
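One way to turn the fitness-for-purpose criterion σsam ≤ σffp = 0.05c into provisional chart limits is sketched below. It rests on assumptions that are not confirmed by the surviving text: that the analytical variance is negligible (so that σd = 2σsam), that the provisional warning and action limits are placed at 2 and 3 times the upper bound 2σffp, and that c is a representative concentration for the analyte. The function name `ffp_limits` is illustrative.

```python
def ffp_limits(concentration, rel_sigma_ffp=0.05):
    """Provisional SAD limits from a fitness-for-purpose criterion.

    Assumes sigma_an is negligible, so sigma_d = 2 * sigma_sam, and
    takes sigma_sam at its largest acceptable value sigma_ffp.
    """
    sigma_ffp = rel_sigma_ffp * concentration   # sigma_ffp = 0.05 c
    sigma_d_max = 2.0 * sigma_ffp               # upper bound on sigma_d
    return {
        "warning": 2.0 * sigma_d_max,           # 0.2 c under these assumptions
        "action": 3.0 * sigma_d_max,            # 0.3 c under these assumptions
    }


# Hypothetical analyte concentration of 100 units.
limits = ffp_limits(100.0)
print(limits)
```

Limits of this kind would be replaced once enough SAD values had accumulated to estimate σd from the data.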
Fig. 5 shows the result for sodium, indicating a SAD value outside the action limit at Site 2. The sampling at Site 2 is found to be beyond the action limit for a total of four elements (Na, Sr, Ag, and P) and beyond the warning limit for three others (Ca, Ni, Cd). This is suggestive of a general problem with the sampling at Site 2, presumably because the soil was unevenly contaminated and therefore unusually heterogeneous. Apart from the results from Site 2, there were only two unrelated instances of beyond-action-limit samplings. Fig. 6 shows a contrasting situation where the SAD values for lead show the sampling to be compliant throughout.
Fig. 5 QCSAM chart for sodium, showing the SAD values, with control limits set at 2σd and 3σd, based on a robust estimate of σd.
Fig. 6 QCSAM chart for lead, showing the SAD values, with control limits set at 2σd and 3σd, based on a robust estimate of σd.
The QCSAM procedure as described is suitable for any sampling procedure that depends on the combination of random sample increments to form a composite sample. However, it could readily be adapted to other types of sampling procedure, for instance, coning and quartering, so long as the splits were identified at the first stage of the procedure. The proposed control charts of the SAD result will in most instances represent the sampling precision unambiguously, for two reasons: it is normal to find that the analytical precision is somewhat better than the sampling precision, and heteroscedasticity will usually be negligible given the short concentration range to be found in successive sampling targets.
QCSAM bears much the same relationship to uncertainty of sampling as does analytical internal quality control to analytical uncertainty. In both activities the respective uncertainties should be established when the procedures are validated. Thereafter, in successive runs of the operation, the statistical control is monitored by the ongoing quality control. However, the signed difference d has an expectation of zero under the assumption of statistical control, and is therefore invariant in respect of putative sampling biases. Analytical bias (at the laboratory and method levels) is an important contributor to analytical uncertainty. Whether sampling bias is an important aspect of sampling uncertainty, or even exists at all as a valid concept,⁹ is currently a moot point. It seems likely therefore that σ²sam, as estimated between sampling runs from QCSAM data, provides at least a reasonable representation of sampling uncertainty (certainly better than anything else available to date) and is suitable for combination with the overall analytical uncertainty to arrive at the uncertainty of the combined sampling/analytical procedure.
We have used the terms ‘compliant’ and ‘non-compliant’ to describe SAD values in this work because ‘in control’ and ‘out-of-control’ seem inappropriate in the context of sampling quality. As used in industrial production or analytical internal quality control, ‘out-of-control’ implies that the production method or analytical procedure is at fault. In QCSAM there is an alternative and perhaps more likely explanation to that of faulty procedure, namely that the sampling target is more heterogeneous than usual.
This journal is © The Royal Society of Chemistry 2002