Collaborative trial in sampling for the spatial delineation of contamination and the estimation of uncertainty

Sharon Squire*^a, Michael H. Ramsey^b and Michael J. Gardner^c
^a Environmental Geochemistry Research Group, T. H. Huxley School of Environment, Earth Science and Engineering, Imperial College of Science, Technology and Medicine, London, UK SW7 2BP
^b Centre for Environmental Research, School of Chemistry, Physics and Environmental Science, University of Sussex, Falmer, Brighton, UK BN1 9QJ
^c WRc-NSF, Henley Road, Medmenham, Marlow, Buckinghamshire, UK SL7 2HD

Received 6th October 1999, Accepted 22nd November 1999

First published on 7th January 2000


Abstract

The fitness-for-purpose of a sampling protocol to spatially delineate a region of contamination has been assessed for the first time by use of a collaborative trial in sampling (CTS), conducted on a synthetic reference sampling target (RST). This trial employed the RST to show the agreement of one participant’s estimate of the extent and intensity of contamination with the ‘true’ value and with the estimates of other participants, when they were all using the same nominal protocol. The collaborative trial showed the performance of the protocol when it was applied in any of its four, equally probable orientations. Nine samplers each independently collected soil samples using a herringbone sampling protocol, applied in two randomly selected orientations. Test portions of the samples were then chemically analysed using a single analytical system and the resulting ‘hot spot’ of contamination spatially delineated using two independent methodologies. This spatial extent of contamination was compared with the dimensions of the true hot spot to score the participants, based on a novel adaptation of the International Harmonised Protocol. The value of the score was derived from a weighted sum of the false negative and false positive areas designated as contaminated by the participants. Within- and between-sampler variations were used to assess the performance of the sampling protocol both for the spatial delineation and for the estimation of contaminant concentration at particular sampling locations. The sampling protocol investigated in this CTS was found to be fit-for-purpose on this relatively simple RST. For a single sampling location situated on a hot spot, sampling repeatability was estimated as 60.08%, and sampling reproducibility as 85.79%. This uncertainty contrasts with the sampling reproducibility of 3.77% for a single sampling location situated on the background population of uncontaminated soil. This difference is partially due to a variation in the soil heterogeneity between the contaminated and uncontaminated sample populations. Sampling bias was not significant for either samplers or the sampling protocol, although such a bias may have been masked by the heterogeneity of the sampling target.


Introduction

A Collaborative Trial (or method performance study) is an internationally required method for assessing new or amended analytical methodologies using procedures set out in the International Harmonised Protocol.1 It is an inter-organisational study in which portions of the same test sample are analysed in duplicate by the same method, in different laboratories (n ≥ 8). Variation within- and between-laboratories is used to determine the precision with which this test sample has been characterised. The results are used to assess the capabilities of the method, or to identify laboratories that may be recognised as competent users of the method.

Applied to sampling, the above approach requires a number of participants (called samplers) to take two sets of samples from a target, each applying their own interpretation of the same sampling protocol. Each sample is then analysed in duplicate, under randomised repeatability conditions, and hierarchical analysis of variance (ANOVA) is used to decide whether within-sampler and between-sampler precision are within a specified fitness-for-purpose criterion.2 Chemical analysis under repeatability conditions is required to avoid confusing analytical and sampling variations. The measurements of concentration are treated with ANOVA to estimate precision (as standard deviations) between-samplers ($s_2$), within-samplers ($s_1$) and between analytical duplicates ($s_0$). The within-sampler variation is also called the sampling repeatability standard deviation3 ($s_1 = s_{r(s)}$) and refers to one sampler using the same procedure and equipment over a short period of time. The reproducibility standard deviation is the square root of the sum of the squared within- and between-sampler standard deviations ($s_{R(s)} = \sqrt{s_1^2 + s_2^2}$) and refers to measurements made on a single or composite sample, collected by different participants using the same sampling protocol. The reproducibility standard deviation represents the uncertainty in measuring the mean concentration of an analyte using the selected protocol. If the uncertainty is found to be too large for particular investigations (i.e., not fit-for-purpose) then modifications to the protocol would be required, e.g., collecting composite rather than single samples.4
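For a balanced design with duplicate samples per sampler and duplicate analyses per sample, the three standard deviations can be obtained from the ANOVA mean squares in the usual way. The expressions below are the textbook method-of-moments estimators, offered as a sketch; the labels $MS_{\mathrm{analysis}}$, $MS_{\mathrm{sample}}$ and $MS_{\mathrm{sampler}}$ are introduced here for illustration and are not taken from the original protocol.

```latex
\begin{align}
s_0^2 &= MS_{\mathrm{analysis}} \\
s_1^2 &= \frac{MS_{\mathrm{sample}} - MS_{\mathrm{analysis}}}{2} \\
s_2^2 &= \frac{MS_{\mathrm{sampler}} - MS_{\mathrm{sample}}}{4} \\
s_{R(s)} &= \sqrt{s_1^2 + s_2^2}
\end{align}
```

The divisors 2 and 4 follow from two analyses per sample and two samples per sampler. A negative estimate of $s_1^2$ or $s_2^2$ is conventionally set to zero.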

The above methodology assesses the protocol in terms of precision, but makes no estimate of bias arising from the sampling methodology. The existence of this bias is a contentious issue, being questioned by some authors,5 but recognised by others.6 Such bias is usually difficult to estimate with respect to the true concentration of a contaminant within contaminated land, as the true concentration is never known. An alternative reference point for the estimation of the sampling bias is the consensus value from a substantial number of measurements made by different protocols4 and/or independent samplers.2 Previous applications of this methodology have taken no account of the spatial variability of the analyte in question, which is a parameter often required for assessing potentially contaminated areas for remediation.

The present study therefore used a synthetic reference sampling target (RST), comprising a single hot spot of known concentration and position, to act as a reference value against which to assess the performance of the sampling process.7 This collaborative trial in sampling (CTS), in addition to the objectives of previous trials, allows the first estimates of the bias from sampling to be obtained, these being traceable to a known mass of pure analyte. Such biases could arise from several causes, such as contamination from the sampling tools, inappropriate handling or selective sampling.8 A new scoring method, based on the true hot spot characteristics, was required to assess the fitness-for-purpose of the sampling protocol to spatially delineate an area of contamination. The results of the CTS are processed to provide an assessment of the sampling protocol, and its application by each participant, in the form of a score derived from a novel adaptation of the International Harmonised Protocol.9

The objectives of this study were therefore to determine: (1) whether it is possible to use a spatially resolved CTS to judge the fitness-for-purpose of a particular sampling protocol (e.g., herringbone pattern, n = 25); (2) whether the variation in spatial delineation by samplers was greater within-sampler or between-samplers; and (3) the measurement uncertainty caused by the precision and bias of the sampling and analytical methods at two selected sampling locations, where one location is on the hot spot and the other is on the background population of uncontaminated soil. The methodology for the estimation of spatial uncertainty from the regions of soil classified as contaminated will be described in a subsequent paper.

Experimental

A synthetic reference sampling target of dimensions 30 m by 30 m was constructed at the Imperial College campus at Silwood Park, Ascot (grid reference TQ 9373 6936) for use in inter-organisational sampling trials7 (Fig. 1). A single circular hot spot comprising 8% of the site area was spiked with barium sulfate to a depth of 15 cm (centred at co-ordinates x = 12 m, y = 13 m). The concentration of barium in the hot spot decreases towards its edge, comprising 5 rings, to mimic approximately the contamination that can result from some historical mining and smelting activities. This site was characterised in a pilot study using six different sampling protocols and intensive sampling of the hot spot.7 The pilot study showed relatively homogenous concentrations of barium in the background population of the uncontaminated soil (typically 154 ± 11 μg g−1 at 95% confidence). The threshold limit of concentration that indicates the edge of the hot spot was calculated to be 171 μg g−1. The temporal stability of the target was tested by comparing soil samples collected from the hot spot before and after the sampling trial. A two-sided t-test showed no significant temporal variability over the duration of the experiment, where a change of 95 μg g−1 would have been detected as significant for the highest concentration of Ba in the hot spot.
Fig. 1 Diagram of the sampling target showing the true hot spot location and four possible herringbone pattern orientations with sample locations.

Nine organisations, listed in the acknowledgements (five university departments and four commercial organisations), sent samplers to the site, sequentially over a period of 3 months, between October 1997 and January 1998. The samplers’ aim for the project was given to each participant 1 month before the first participant commenced sampling. This aim was to spatially delineate regions of soil containing ≥171 μg g−1 of barium. It was intended that the samplers should collect soil samples using a common protocol specified by the organisers. The organisers would then analyse the soils and use the results to arrive at an estimate of the location of the area of contamination, using two different methodologies. This spatial delineation step was not required of the participants in this trial so as to maintain the independence of a sampling proficiency test, which was also being undertaken by the same participants. Participants visited the site independently and did not observe any other participant during the sampling exercise. Holes left from sampling were closed to remove any visible trace of the sampling that might affect later participants.

Sampling protocol

The equipment available for use by samplers comprised the following: steel screw auger of 25 mm diameter; wet strength paper sample bags; wooden canes; two 30 m surveying tapes; and two indelible marker pens. A herringbone sampling design was employed in this CTS as an example of a protocol which has been recommended for the objective of locating hot spots of contamination.10 The taking of duplicate samples was added to the protocol for the estimation of sampling precision and measurement uncertainty. Participants were asked to take samples of the top 15 cm of topsoil, as this was the specified depth of the reference sampling target. The participants’ instructions were as follows. (i) Sample the test site using the sampling design. Orientation has been selected at random using random numbers (see Fig. 1). (ii) Use the measuring tape and canes (optional) to mark the sampling locations on the test site. (iii) Use the auger provided to take samples of 0–15 cm of topsoil. To do this remove the surface vegetation at the sampling point and then drill into the soil for 15 cm. Pull out the auger, trying to avoid smearing. (iv) Put the sample into the craft bag provided and move to the next sampling location, leaving no evidence of activity to maintain the statistical independence of the other participants. (v) Follow the same procedure at each sampling location and put each increment in a separate bag, clearly marked with the sampling design letter and sample number. (vi) Collect 8 duplicate samples at the sampling locations marked, which have been individually selected from random numbers. To take a duplicate sample, first collect a soil sample from the nominal design. Next, take a second sample 20 cm away from the first, in a random direction. Clearly mark the craft bag with the word ‘duplicate’, sampling design letter and sample location. (vii) Repeat the whole procedure (steps i–vi) using a different orientation (selected at random) of this protocol.
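The random choices in instructions (i), (vi) and (vii) can be sketched in a few lines of Python. This is an illustration only: the orientation labels A to D, the location numbering 1 to 25 and the function name `plan_visit` are assumptions made here, and the organisers' actual randomisation procedure is not described beyond "random numbers".

```python
import math
import random

def plan_visit(rng, n_locations=25, n_duplicates=8, offset_m=0.20):
    """Draw the random choices described in steps (i), (vi) and (vii)."""
    # Two different orientations out of the four shown in Fig. 1
    # (labels A-D are assumed here).
    orientations = rng.sample(["A", "B", "C", "D"], 2)
    plan = {}
    for orientation in orientations:
        # Locations (numbered 1..n_locations) that receive a duplicate sample.
        duplicate_sites = sorted(rng.sample(range(1, n_locations + 1), n_duplicates))
        offsets = {}
        for site in duplicate_sites:
            # Duplicate is taken 20 cm from the nominal point, random direction.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            offsets[site] = (offset_m * math.cos(angle), offset_m * math.sin(angle))
        plan[orientation] = offsets
    return plan

# Seeded generator so that the same plan can be regenerated.
plan = plan_visit(random.Random(1))
```

Each entry of `plan` maps an orientation to the eight duplicate locations and the (x, y) offset, in metres, at which the second sample would be taken.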

Participants were asked to use the equipment provided by the organisers, although it was optional as to how much of the equipment was used. This allowed participants to use their own judgement. In this way, the CTS was intended to give a realistic picture of the usual practice of samplers in interpreting a sampling protocol. Some of the interpretation of each participant was recorded using a video camera to make comparisons between the participants’ sampling techniques.

The soil samples from all participants were collected by the organisers for sample preparation and chemical analysis at Imperial College. The soil samples were dried at 65 °C, then disaggregated to liberate the natural grain size using a pestle and mortar. The soil size fraction passing through a 2 mm stainless steel sieve was ground in a chrome–steel pot within a swing mill to a grain size of <75 μm, to produce the laboratory sample. Analytical test portions of the laboratory samples were digested in a mixture of nitric, perchloric and hydrofluoric acids11 and analysed by ICP-AES for barium. This analytical method was chosen because it performed acceptably for reference materials when judged against their certified reference values, and it was sufficiently rapid and inexpensive. All the samples from the collaborative trial in sampling were analysed in randomised order within nine analytical batches. Analytical quality control procedures were used to determine analytical precision and bias, and to test whether there were any significant differences in the quality of measurements between the batches.

Certified reference materials (NIST 2709 and 2711) were analysed in duplicate, at random positions between each batch, to estimate analytical bias. House reference materials (HRM 1 and HRM 2) and a special house reference material (HRM 32) spiked with BaSO4 were analysed at random positions within each batch to estimate between-batch precision. The BaSO4 used in the preparation of the HRM 32 reference material was the same as that used to prepare the RST. Measurements were corrected for Ba where significant concentrations were found in reagent blanks. Analytical duplicates were used to estimate analytical precision. Sample duplicates were collected with a separation distance of 20 cm at 8 sample locations to represent potential surveying error. The analysis of variance (ANOVA) method was used to estimate the measurement uncertainty across the whole site, and for two different sampling locations.

Results and discussion

All of the samples collected from the CTS were analysed within nine analytical batches. Test portions from two sampling protocols were randomised within each batch, as it was impossible to analyse all the samples in one batch due to the large number of samples collected (n = 594). Many of the batches showed a small but statistically significant negative bias for the certified reference materials (typically −6%). Between the batches there were no significant differences in the mean concentrations of reference materials, as judged using one way analysis of variance. In this way the analytical bias does not affect the objectives of comparing within- and between-batch variations within this study. The overall analytical bias (i.e., −6%) was accounted for, however, when estimating the bias in spatial delineation of the hot spot, resulting from a particular sampling methodology.

Internal quality control of the herringbone sampling protocol

The taking of duplicate samples that are separated by a distance representative of that generated on site by the surveying technology employed (i.e., 20 cm) is a relatively cheap and quick method of obtaining the measurement uncertainty from a single sampler. The sampling duplicates are assumed to be representative of uncertainty across the whole sampling target. The method is primarily a ‘bottom up’ approach as it estimates different sources of variability and combines them to give an estimate of uncertainty.12

The variances described here for sampler number 5, protocol orientation C (abbreviated as 5C), were typical of all the participants’ results. The component standard deviations measured using robust analysis of variance (ANOVA)13 for this single sampler using a single sampling protocol were $s_{\mathrm{geochemical}}$ = 1.10 μg g−1, $s_{\mathrm{sampling}}$ = 1.96 μg g−1 and $s_{\mathrm{analysis}}$ = 1.53 μg g−1. The measurement uncertainty ($s_{\mathrm{meas}}$) was calculated as the square root of the sum of the sampling and analytical variances, $s_{\mathrm{meas}} = \sqrt{s_{\mathrm{sampling}}^2 + s_{\mathrm{analysis}}^2}$, giving 2.49 μg g−1 (1 s). The expanded uncertainty for 95% confidence, U, was 4.88 μg g−1 (1.96 s). This gives a relative measurement uncertainty, U, of 3.34%, expressed relative to the mean Ba concentration of 146 μg g−1.
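The arithmetic behind these figures can be checked with a short Python sketch. The input values are those quoted above for 5C; the variable names are chosen here for illustration. Small differences in the last digit arise because the quoted expanded uncertainty appears to have been computed from the rounded value of 2.49 μg g−1.

```python
import math

# Component standard deviations quoted in the text for sampler 5, orientation C
s_sampling = 1.96   # ug/g
s_analysis = 1.53   # ug/g
mean_ba = 146.0     # ug/g, mean Ba concentration

# Measurement uncertainty: sampling and analytical variances add
s_meas = math.sqrt(s_sampling**2 + s_analysis**2)

# Expanded uncertainty at 95% confidence (coverage factor 1.96)
expanded_u = 1.96 * s_meas

# Relative expanded uncertainty, as a percentage of the mean
relative_u = 100.0 * expanded_u / mean_ba

print(f"s_meas = {s_meas:.2f} ug/g, U = {expanded_u:.2f} ug/g, U = {relative_u:.2f}%")
```

The geochemical component (1.10 μg g−1) is deliberately left out of the sum, since it describes variation of the target rather than of the measurement.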

The RST used in this investigation contained a large proportion (92% by area) of relatively homogeneous background concentrations of uncontaminated soil. The majority of the sampling duplicates were therefore collected away from the hot spot. This resulted in a low value of measurement uncertainty (typically U of 3.34%), as there was very little small-scale variability between the sample duplicates in this homogeneous soil (154 ± 11 μg g−1 at 95% confidence). Therefore, this estimate of uncertainty is not simply a characteristic of the sampling and analysis procedures, but also of the site heterogeneity. This single estimate of measurement uncertainty can best be considered as a lower limit of measurement uncertainty that applies to such relatively homogeneous sites. Applying this value to all sample locations (including those on a hot spot) is therefore considered to underestimate the measurement uncertainty at sample locations within areas of higher geochemical variability.

Measurement uncertainty at a single sampling location

Within- and between-sampler variations were measured by classical analysis of variance (ANOVA) to determine the sampling repeatability (within-sampler standard deviation) and sampling reproducibility (total standard deviation) at two sampling locations (one on the hot spot, location 13, and one in the background population, location 3). The sampling repeatability standard deviation determines the uncertainty of a single sampler measuring the Ba concentration at a single sampling location. The reproducibility standard deviation quantifies the uncertainty of multiple samplers measuring the Ba concentration at a single sampling location.

The two sampling locations (numbered 3 and 13) were sampled twice by all nine samplers (once in each protocol orientation). Each sample was analysed once for Ba by ICP-AES and all of the measurements interpreted using classical analysis of variance (ANOVA) to estimate precision (as standard deviations) under sampling repeatability ($s_{r(s)} = s_1$) and reproducibility ($s_{R(s)} = \sqrt{s_1^2 + s_2^2}$) conditions. The symbols $s_1$ and $s_2$ refer to the within- and between-group standard deviations. Classical, rather than robust, statistics were applied in this case as there was no intention of focusing on the main population and down-weighting outlying values.

The ISO definition of analytical bias is the difference between the expectation of the test results and an accepted reference value.3 Sampling bias can therefore be defined, by analogy, as the difference between the mean of the population of sampling measurements and the assigned value of the sampling target. The assigned value for the RST was derived from the spiked concentration of barium sulfate added to the soil. The confidence limits (at 95% confidence) for the assigned value were based on the standard deviation of the measured concentration results. The mean concentrations for locations 3 and 13 for each sampler were compared with the respective assigned concentration value to estimate the sampling bias. The consensus mean was also compared with the assigned mean in order to determine if the protocol gave rise to an overall bias in concentration.

Location 13 is situated within Ring 4 of the hot spot and has an assigned concentration of 468 ± 451 μg g−1 at 95% confidence.7 The large uncertainty on this value was introduced inadvertently by the heterogeneous mixing in this zone of the hot spot. None of the measured Ba concentrations from the nine participants differed significantly from the assigned value for this location (as shown in Table 1). Even if the −6% analytical bias is allowed for, none of the measurements shows a significant bias against the assigned value. The large heterogeneity of Ba within this hot spot ring (RSD of 85%) made identifying sampling bias difficult using the assigned value at this location. This indicates the need for a more homogeneous sampling target, or for concentrations within the hot spot that are much higher than the background population. The performance of the sampling protocol was judged against the consensus value. The mean Ba concentration over all the participants (504 ± 376 μg g−1 at 95% confidence) was found to be not significantly different from the assigned value. The uncertainty of a single sampler quantifying the concentration at a single sampling location (sampling repeatability) was estimated at 60.08% at 95% confidence (using the equation $196 \times s_{\mathrm{within}}/\bar{x}$). Similarly, the uncertainty of multiple samplers quantifying the concentration at a single sampling location (sampling reproducibility) was estimated as 85.79% at 95% confidence. There was no statistically significant difference in within-sampler variance compared with between-sampler variance for location 13. For site investigations requiring lower uncertainty, such as those where the mis-classification of the land could cause unacceptable financial losses, one way of reducing this uncertainty would be the collection of larger or composite samples.

Table 1 Participants’ measurements on the hot spot (location 13) showing no significant difference within and between samplers and no significant measurement bias for either the samplers or the protocol

| Participant number | Location 13 concentration, sample 1/μg g−1 | Location 13 concentration, sample 2/μg g−1 | Average concentration/μg g−1 | Absolute bias(a)/μg g−1 |
|---|---|---|---|---|
| 1 | 472 | 869 | 671 | 203 |
| 2 | 832 | 641 | 737 | 269 |
| 3 | 255 | 318 | 287 | −182 |
| 4 | 146 | 256 | 201 | −267 |
| 5 | 393 | 294 | 344 | −125 |
| 6 | 790 | 629 | 710 | 242 |
| 7 | 398 | 576 | 487 | 19 |
| 8 | 748 | 374 | 561 | 93 |
| 9 | 592 | 484 | 538 | 70 |

(a) Where the assigned concentration is 468 ± 451 μg g−1 at 95% confidence.
x̄ = 504 ± 192 (1 s)
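The repeatability and reproducibility figures for location 13 can be approximately reproduced from the duplicate measurements in Table 1. The Python sketch below uses a balanced one-way classical ANOVA with two samples per sampler; the function name and the handling of a negative between-sampler variance (set to zero) are assumptions made here, as the original calculation is not given step by step. The results agree with the reported 60.08% and 85.79% to within rounding.

```python
import math

def sampling_precision(pairs):
    """Balanced one-way ANOVA on duplicate samples, one pair per sampler."""
    n = len(pairs)
    means = [(a + b) / 2.0 for a, b in pairs]
    grand_mean = sum(means) / n
    # Within-sampler mean square from the differences between duplicates.
    ms_within = sum((a - b) ** 2 for a, b in pairs) / (2.0 * n)
    # Between-sampler mean square (two measurements per sampler).
    ms_between = 2.0 * sum((m - grand_mean) ** 2 for m in means) / (n - 1)
    # Between-sampler variance component; negative estimates are set to zero.
    var_between = max(0.0, (ms_between - ms_within) / 2.0)
    s_r = math.sqrt(ms_within)                # repeatability, s1
    s_R = math.sqrt(ms_within + var_between)  # reproducibility, sqrt(s1^2 + s2^2)
    return {
        "mean": grand_mean,
        "repeatability_pct": 196.0 * s_r / grand_mean,
        "reproducibility_pct": 196.0 * s_R / grand_mean,
    }

# Sample 1 and sample 2 for participants 1-9 at location 13 (Table 1), ug/g
location_13 = [(472, 869), (832, 641), (255, 318), (146, 256), (393, 294),
               (790, 629), (398, 576), (748, 374), (592, 484)]
result = sampling_precision(location_13)
print(result)
```

Passing the location 3 duplicates to the same function gives a between-sampler mean square smaller than the within-sampler one, so the two percentages coincide, as reported for the background location.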


Location 3 is situated in the background population and has an assigned concentration of 154 ± 11 μg g−1 at 95% confidence. None of the nine participants measured statistically different concentrations of Ba from that assigned for location 3 (as shown in Table 2). No significant difference between the consensus of the nine participants and the assigned value was evident, even when the −6% analytical bias was taken into account. The performance of the sampling protocol judged against the consensus value showed the mean measured concentration (145 ± 2.59 μg g−1 at 95% confidence) to be not significantly different from the assigned concentration. The uncertainty of a single sampler quantifying the concentration at a single sampling location (sampling repeatability) was estimated at 3.77% at 95% confidence (using the equation $196 \times s_{\mathrm{within}}/\bar{x}$). For multiple samplers (sampling reproducibility) this uncertainty was the same (3.77%) as there was no extra variance between-samplers. These results contrast with those found in analogous situations encountered in collaborative trials in chemical analysis, where inter-laboratory variations tend to be greater than those within a laboratory. This is due to the relatively homogeneous background population of Ba and all samples being analysed within one laboratory. It can therefore be concluded that the protocol is fit for estimating the background concentrations of barium to within 3.77% of the consensus concentration at this location.

Table 2 Participants’ measurements away from the hot spot (location 3) showing no significant difference within and between samplers and no measurement bias for either the samplers or the protocol

| Participant number | Location 3 concentration, sample 1/μg g−1 | Location 3 concentration, sample 2/μg g−1 | Average concentration/μg g−1 | Absolute bias(a)/μg g−1 |
|---|---|---|---|---|
| 1 | 144 | 148 | 146 | −8 |
| 2 | 145 | 147 | 146 | −8 |
| 3 | 145 | 143 | 144 | −10 |
| 4 | 142 | 145 | 143.5 | −10.5 |
| 5 | 147 | 144 | 145.5 | −8.5 |
| 6 | 143 | 152 | 147.5 | −6.5 |
| 7 | 143 | 144 | 143.5 | −10.5 |
| 8 | 145 | 145 | 145 | −9 |
| 9 | 143 | 147 | 145 | −9 |

(a) Where the assigned concentration is 154 ± 11 μg g−1 at 95% confidence.
x̄ = 145 ± 1.32 (1 s)
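One simple way to read the bias columns of Tables 1 and 2 is to flag a participant only when the average falls outside the 95% confidence limits of the assigned value. The Python sketch below follows that reading; it is an assumption made for illustration, since the exact significance test used is not stated, and the function name `flag_bias` is hypothetical.

```python
def flag_bias(averages, assigned, half_width):
    """Return (bias, outside_limits) for each participant average."""
    low, high = assigned - half_width, assigned + half_width
    return [(avg - assigned, not (low <= avg <= high)) for avg in averages]

# Participant averages from Tables 1 and 2, ug/g
loc13_avg = [671, 737, 287, 201, 344, 710, 487, 561, 538]
loc3_avg = [146, 146, 144, 143.5, 145.5, 147.5, 143.5, 145, 145]

loc13 = flag_bias(loc13_avg, assigned=468, half_width=451)
loc3 = flag_bias(loc3_avg, assigned=154, half_width=11)

for label, rows in (("location 13", loc13), ("location 3", loc3)):
    flagged = [k + 1 for k, (_, outside) in enumerate(rows) if outside]
    print(label, "participants outside limits:", flagged)
```

Under this reading no participant is flagged at either location, which is consistent with the statement that no significant sampling bias was found. A small number of tabulated biases differ from these by one unit because the table was computed from unrounded averages.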


Scoring system for spatial delineation

Assessing the spatial performance of the herringbone sampling protocol required the development of a new methodology, to provide a score relating it to a fitness-for-purpose (FFP) criterion. The fitness-for-purpose scoring system for spatial delineation was based on a cost-effectiveness approach. Fig. 2 gives an example of an estimated hot spot area relative to the assigned or ‘true’ area. The score for each participant in the CTS was given by the excess cost:
 
$$\text{‘Excess cost’} = a(E - i) + b(T - i) \qquad (1)$$
where T is the hot spot assigned area, E the estimated hot spot area and i the area of overlap of the two regions, T and E. The cost of unnecessary remediation (false positive) is proportional to (E − i) and the cost of not remediating areas that should be remediated (false negative) is proportional to (T − i). Proportionality constants a and b are used to give weight to the importance of each classification. In this instance (a = 1 and b = 4) the financial penalty of a false negative classification was estimated to be four times that of a false positive classification. The value of the proportionality constants could be changed depending on the type of contaminant and remediation techniques used on an individual site. The value of the area, E, is expressed relative to the value of T, which is taken as unity, hence $E_{T=1} = E/T$. The value of i is likewise expressed as a fraction of T, so that $i_{T=1} = i/T$.

Fig. 2 Schematic diagram showing the false positive and false negative delineations of a hot spot, which are factors influencing the spatial scoring system for the CTS.

Scores for this trial were produced that ranged upwards from zero. A score of zero indicates perfect spatial delineation with no excess cost. A larger score reflects greater ‘excess cost’. The fitness-for-purpose criterion of this trial has been set at a score of ⩽3 based on professional judgement. Fig. 3 shows that when the measured area of the hot spot is equal to the assigned area, only a 40% overlap area is required to achieve a satisfactory score (i.e., ⩽3) from equation 1. A score better than required (e.g., 1) could be achieved when the measured hot spot area is the same size as the assigned area, with an overlap of 80%. For site investigations requiring less spatial precision an FFP score of 5 may be acceptable. Such a score could be achieved with the measured hot spot size being three times that of the assigned, with an overlap of 40%, for these particular values of a and b.
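The three worked cases above can be checked with a short Python sketch of the score. It reads equation 1 as a(E − i) + b(T − i), with areas expressed relative to T; the minus signs are inferred from the definitions of false positive and false negative area, and this reading reproduces the scores of 3, 1 and 5 quoted in the text. The function and argument names are chosen here for illustration.

```python
def excess_cost(estimated_area, overlap, true_area=1.0, a=1.0, b=4.0):
    """Fitness-for-purpose score: weighted false positive plus false negative area."""
    e = estimated_area / true_area   # estimated area relative to T
    i = overlap / true_area          # overlap relative to T
    false_positive = e - i           # remediated unnecessarily
    false_negative = 1.0 - i         # contaminated but missed
    return a * false_positive + b * false_negative

# Same size as the assigned hot spot, 40% overlap: just satisfactory
score_equal_40 = excess_cost(1.0, 0.4)
# Same size, 80% overlap: better than required
score_equal_80 = excess_cost(1.0, 0.8)
# Three times the assigned size, 40% overlap
score_triple_40 = excess_cost(3.0, 0.4)

print(round(score_equal_40, 2), round(score_equal_80, 2), round(score_triple_40, 2))
```

Because b is four times a, a missed fraction of the true hot spot costs four times as much as the same area of unnecessary remediation, which is why a large but well-centred estimate can still score acceptably.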


Fig. 3 Graphical demonstration of the fitness-for-purpose scoring system used for the CTS, derived from equation 1. The score for participants varies with the area and percentage overlap measured in comparison to the assigned hot spot. Participants achieving a FFP score of ⩽3 were classed as satisfactory in this CTS. Where the measured hot spot is the same area as the assigned and has an overlap of 40%, a satisfactory score can be achieved.

Spatial delineation of the hot spot from the CTS

The measured concentration values for each sampling protocol were spatially delineated using two consistent methodologies. The simplest option for the spatial delineation employed the joining of uncontaminated sampling locations around a hot spot using triangulation (3Plot software).14 A second option used linear interpolation of the concentration values between the contaminated and non-contaminated sampling points to locate the position of the threshold concentration. For the linear interpolation each data set was interpreted, without smoothing, on a square grid of 0.5 by 0.5 m (Surfer, Golden Software).15 This method has several limitations but was considered a suitable option for these small data sets in comparison with the other techniques available. Investigations into the use of geostatistical techniques, such as kriging,16 showed several problems that made the technique impractical for delineation of the hot spot, in particular the small number of samples within each data set (n = 25).
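The idea behind the linear interpolation option can be illustrated for a single pair of sampling points. The Python sketch below locates the point on the line between an uncontaminated and a contaminated location at which the concentration would equal the 171 μg g−1 threshold. It is not the gridding algorithm of the software used in the study; the function name, coordinates and concentrations are hypothetical.

```python
def threshold_crossing(p_clean, c_clean, p_hot, c_hot, threshold=171.0):
    """Point between two sampling locations where concentration equals threshold."""
    if not (c_clean < threshold <= c_hot):
        raise ValueError("threshold must lie between the two concentrations")
    # Fraction of the way from the clean point towards the contaminated point.
    t = (threshold - c_clean) / (c_hot - c_clean)
    return tuple(a + t * (b - a) for a, b in zip(p_clean, p_hot))

# Hypothetical pair: 150 ug/g at (10 m, 10 m) and 234 ug/g at (14 m, 10 m)
edge = threshold_crossing((10.0, 10.0), 150.0, (14.0, 10.0), 234.0)
print(edge)
```

In this example the threshold lies a quarter of the way from the clean point to the contaminated one, so the estimated hot spot edge falls 1 m from the clean location. Triangulation, by contrast, would place the edge at the clean location itself.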

The results of the spatial delineation of the CTS data using the linear interpolation, shown as the solid line in Fig. 4, indicate a greater extent of between-orientation variability (in rows) than within-orientation variability (in columns). Protocol design orientation A shows the greatest variability in spatial delineation between-samplers out of all orientations. Comparisons of the hot spot hits (against the accepted values for each individual sampling location) indicate the varying delineations to be partially a result of heterogeneity within the outer two rings of the hot spot. Participant number 4, orientation A (4A), is one such example, which showed no evidence of contamination (<171 μg g−1 of Ba) at 2 out of 3 locations within the hot spot. The fitness-for-purpose score for each organisation’s sampling designs (given in Fig. 4) was calculated using eqn. 1. These scores showed that the linear interpolation method indicates satisfactory performance (scores ⩽3) for all but one instance (4A).


Fig. 4 Spatial delineation of hot spots based on measurements made by participants in the CTS. The solid line is delineation based on linear interpolation. When compared with the assigned location of the hot spot (dashed line), the performance scores (given below each map) are mainly satisfactory (score of ⩽3) with one exception (4A). A second method of interpolation (based on joining the nearest uncontaminated sampling locations) shows a satisfactory performance for all participants. This indicates that the protocol is fit for the specified purpose of spatially delineating a single hot spot of contamination with minimal misclassification.

Triangulation was also used to define the edge of the hot spot using the measurements from each participant, to judge the possible effect of this method on the score. The triangulation method assumes a ‘worst case’ scenario in which the soil is contaminated right up to the nearest uncontaminated sampling location (<171 μg g−1 Ba). The advantage of this methodology is that it does not make any assumptions about the spatial distribution of the barium between the sampling locations. The triangulation results were similar to those of the linear interpolation with the exception of case 4A, which was found to be fit-for-purpose in this instance. Performance scores for triangulation were, on average, 25% higher than those for linear interpolation, primarily because of a greater proportion of ‘false positive’ classifications.

All but one of the protocol designs (Participant 4, design orientation A) had a fitness-for-purpose score of ⩽3, indicating that the herringbone sampling protocol was fit for the purpose of identifying the true hot spot location and dimensions on this RST, with minimal misclassification. One-way analysis of variance showed no significant difference between within-sampler and between-sampler scores. However, Fig. 4 shows that a particular protocol orientation does tend to produce a distinctive shape of the measured hot spot. This sampling target was very simple in design when compared with typical contaminated land investigations. The hot spot was perfectly circular and the site was perfectly square and flat, with no obstacles such as building foundations, mounds or trees. This closely corresponds with the idealised model assumed in the theoretical testing of this sampling protocol (Ferguson).17 A potentially more informative approach in the future would be to perform a CTS on a more realistic site with irregular hot spots and typical obstacles such as buildings, trees and topographic irregularities. Such an approach would then allow these protocols to be assessed in more realistic circumstances.
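The performance scores discussed above are based on the false positive and false negative areas of each delineation. The Python sketch below shows one way such areas could be estimated on a regular grid and combined into a weighted sum. It is an illustration under stated assumptions: the weights, the site dimensions, the grid cell size and the example shapes are hypothetical, and the scaling that turns the weighted area into the reported fitness-for-purpose score is not reproduced here.

```python
def circle(cx, cy, r):
    """Region predicate for a circular area."""
    return lambda x, y: (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2


def rectangle(x0, y0, x1, y1):
    """Region predicate for an axis-aligned rectangular area."""
    return lambda x, y: x0 <= x <= x1 and y0 <= y <= y1


def misclassified_areas(true_region, est_region, site_size, cell=0.5):
    """Approximate false positive and false negative areas on a square site
    by testing the centre of every grid cell against both regions."""
    n = int(round(site_size / cell))
    fp = fn = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * cell, (j + 0.5) * cell
            is_true, is_est = true_region(x, y), est_region(x, y)
            if is_est and not is_true:
                fp += cell * cell  # clean soil classified as contaminated
            elif is_true and not is_est:
                fn += cell * cell  # contaminated soil that was missed
    return fp, fn


def weighted_misclassification(fp, fn, w_fp=1.0, w_fn=2.0):
    """Weighted sum of the two areas; the weights here are hypothetical."""
    return w_fp * fp + w_fn * fn


# Hypothetical 30 m x 30 m site with a circular hot spot and an offset estimate.
fp, fn = misclassified_areas(circle(15, 15, 5), circle(16, 15, 6), site_size=30)
print(fp, fn, weighted_misclassification(fp, fn))
```

Giving the false negative area a larger weight than the false positive area is only one possible choice; it reflects the idea that missed contamination may be more costly than unnecessary remediation.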

Conclusions

The collaborative trial in sampling has been demonstrated to be a useful new tool in assessing the performance of a sampling protocol for the spatial delineation of contamination. The synthetic reference sampling target allows estimates of bias to be calculated, both in terms of the spatial position of any hot spots and the concentration at a single sampling location. The fitness-for-purpose scoring scheme based on cost-effectiveness allowed these parameters to be assessed for this site. There was no significant difference between within-sampler and between-sampler spatial performance scores. For the interpolation methods, all but one score met the fitness-for-purpose threshold (score ⩽3), indicating that the protocol was fit-for-purpose in spatially delineating a hot spot of contamination with acceptable levels of misclassification. The uncertainty in estimating the concentration at a single sampling location with a single sampler (sampling repeatability) was estimated to be 60.08% in the contaminated area (location 13) and 3.77% in the uncontaminated area (location 3). Using multiple samplers, this uncertainty (sampling reproducibility) was estimated to be 85.79% for location 13, but only 3.77% for location 3. There was no significant difference between the uncertainty measured for a single sampler and that measured for multiple samplers at locations 13 and 3. There was no significant sampling bias detected at individual sampling locations, although bias may have been masked at some locations by the heterogeneity of the hot spot. This initial investigation has highlighted the potential for constructing more realistic sites for use in inter-organisational sampling trials. Such sites could be used to take into account the depth of sampling, multiple irregularly shaped hot spots and obstacles within the site, in order to make these investigations more realistic.
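The distinction between sampling repeatability and sampling reproducibility can be sketched with a balanced one-way analysis of variance, in which each sampler contributes the same number of measurements at one location. The Python example below is a minimal sketch under assumptions: the function name and the data are hypothetical, and expressing the uncertainty as 200s/mean (a relative value at approximately 95% confidence) is assumed here rather than taken from the text above.

```python
from math import sqrt
from statistics import mean


def sampling_precision(duplicates):
    """Balanced one-way ANOVA for one sampling location.

    duplicates: one list per sampler, each holding the same number (>= 2)
    of concentration measurements made by that sampler.
    """
    p, n = len(duplicates), len(duplicates[0])
    if p < 2 or n < 2 or any(len(g) != n for g in duplicates):
        raise ValueError("need >= 2 samplers with equal numbers (>= 2) of results")
    group_means = [mean(g) for g in duplicates]
    grand = mean(v for g in duplicates for v in g)
    ss_within = sum((v - m) ** 2 for g, m in zip(duplicates, group_means) for v in g)
    ss_between = n * sum((m - grand) ** 2 for m in group_means)
    ms_within = ss_within / (p * (n - 1))
    ms_between = ss_between / (p - 1)
    # Between-sampler variance component; negative estimates are set to zero.
    var_between = max(0.0, (ms_between - ms_within) / n)
    s_r = sqrt(ms_within)                # repeatability standard deviation
    s_R = sqrt(ms_within + var_between)  # reproducibility standard deviation
    return {
        "F": ms_between / ms_within if ms_within > 0 else float("inf"),
        "s_r": s_r,
        "s_R": s_R,
        "repeatability_pct": 200.0 * s_r / grand,    # assumed convention
        "reproducibility_pct": 200.0 * s_R / grand,  # assumed convention
    }


# Hypothetical duplicate results (ug/g) from three samplers at one location.
print(sampling_precision([[10.0, 12.0], [14.0, 16.0], [9.0, 11.0]]))
```

When the between-sampler mean square does not exceed the within-sampler mean square, the between-sampler component is set to zero and the two estimates coincide. This is one way in which identical repeatability and reproducibility values, such as those reported for location 3, could arise.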
The expression of the uncertainty in the spatial delineation of the hot spot will be addressed in a subsequent paper reporting the findings of a sampling proficiency test. The methodology reported here should have general applicability to other situations where measurements are used to express the spatial distribution of analytes (e.g., electron microscope analysis of mineral grains).

Acknowledgements

The authors thank the following people and organisations for participating in this study: Ariadni Argyraki of Environmental Geochemistry Research Group, Imperial College; Paul Blackwell of British Geological Survey; Kevin Cordes of School of Environmental and Applied Sciences, University of Derby; Alex Ferguson of British Geological Survey; Matt Hill of Department of Environmental Science, University of Bradford; Kevin Seed of Postgraduate Research Institute for Sedimentology, University of Reading; Ross Stevens of Balfour Beatty Construction Limited; Mike Thompson of Chemistry Department, Birkbeck College; and Jane Turrell of Water Research Centre. Especial thanks go to Mike Thompson for his useful comments regarding the fitness-for-purpose scoring scheme and Dustin Lister for writing a computer software package to score the participants.

References

  1. W. Horwitz, Pure Appl. Chem., 1988, 60, 855.
  2. M. H. Ramsey, A. Argyraki and M. Thompson, Analyst, 1995, 120, 2309.
  3. ISO 3534-1:1993, Statistics, Vocabulary and Symbols, Part 1: Probability and General Statistical Terms, British Standards Institution, London, UK, 1993, p. 34.
  4. M. H. Ramsey, A. Argyraki and M. Thompson, Analyst, 1995, 120, 1353.
  5. P. M. Gy, Sampling of Heterogeneous and Dynamic Materials, Elsevier, Amsterdam, 1992, p. 26.
  6. M. Thompson and M. H. Ramsey, Analyst, 1995, 120, 261.
  7. M. H. Ramsey, S. Squire and M. J. Gardner, Analyst, 1999, 124, 1701.
  8. M. Thompson, Accreditation Qual. Assur., 1998, 3, 117.
  9. M. Thompson and R. Wood, Pure Appl. Chem., 1993, 65, 2123.
  10. Department of the Environment, Sampling Strategies for Contaminated Land, Contaminated Land Research Report No 4, The Centre for Research Into the Built Environment, Nottingham Trent University, NG1 4BU, 1994.
  11. M. Thompson and J. N. Walsh, Handbook of Inductively Coupled Plasma Spectrometry, Blackie, London, 2nd edn., 1989, p. 160.
  12. M. H. Ramsey and A. Argyraki, Sci. Total Environ., 1997, 198, 243.
  13. Analytical Methods Committee, Analyst, 1989, 114, 1693.
  14. 3PLOT (Version 4.40), 3Plot Program for Windows, Moscow, Russia, 1998.
  15. SURFER (Version 5.01), Surface Mapping System, Golden Software, CO, USA, 1994.
  16. E. H. Isaaks and R. M. Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, New York, 1989, p. 444.
  17. C. C. Ferguson, in Contaminated Soil ’93, ed. F. Arendt, G. J. Annokkée and R. Van den Bosman, Kluwer Academic Publishers, The Netherlands, 1993, p. 599.

This journal is © The Royal Society of Chemistry 2000