Michael Thompson
School of Biological and Chemical Sciences, Birkbeck College (University of London), Gordon House, 29 Gordon Square, London, UK WC1H 0PP
    
First published on 22nd February 2000
Recently conducted collaborative trials in which the analyte concentration was below 100 ppb provided reproducibility standard deviations that were systematically lower than the predictions of the Horwitz function. This study shows that such statistics are better represented by a model with a constant relative standard deviation. A modified function is suggested as suitable for use (with due caution) as a fitness-for-purpose criterion.
≡ 10^−6). This relationship is so widely recognised that it is used both as a benchmark to judge the efficacy of collaborative trials3 and as a fitness-for-purpose criterion in proficiency testing in the food and other sectors.4–6
In 1996, however, Horwitz7 reported that, at the low concentrations of analyte encountered in the analysis of pesticides, estimates of the reproducibility standard deviation (σR) were consistently lower than σH. The same tendency was reported and discussed in 1997 in a study of the experimental basis of the Horwitz function.2 In that study the data showed, remarkably, that laboratories achieved reproducibility standard deviations clustering around a trend, σR = c/3, that could be regarded as a definition of the ‘reproducibility detection limit’. The new trend was therefore attributed to a practical requirement: if a method is to be usable, its reproducibility precision must be no worse than that associated with its reproducibility detection limit. In other words, at concentrations below 10 ppb, the Horwitz function predicted inter-laboratory precisions so poor that, if they were realised in practice, the presence or absence of the analyte would be in doubt. Laboratories, when they needed to, could agree with each other more closely than the Horwitz function predicted.
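The concentration at which the Horwitz prediction becomes as poor as the detection-limit trend σR = c/3 can be checked numerically. The sketch below assumes the commonly quoted form of the Horwitz function, σH = 0.02c^0.8495 with c as a mass fraction; that formula does not appear in the surviving text above, so it is an assumption, and the function names are illustrative only.

```python
# Assumption: the usual statement of the Horwitz function, sigma_H = 0.02 * c**0.8495,
# with c expressed as a mass fraction (1 ppb = 1e-9). Not taken from the text above.
HORWITZ_COEFF = 0.02
HORWITZ_EXPONENT = 0.8495


def horwitz_sd(c: float) -> float:
    """Predicted reproducibility standard deviation at mass fraction c."""
    return HORWITZ_COEFF * c ** HORWITZ_EXPONENT


def detection_limit_sd(c: float) -> float:
    """The trend sigma_R = c / 3 described in the text."""
    return c / 3.0


# Solve 0.02 * c**0.8495 = c / 3 for c:
#   c**(1 - 0.8495) = 3 * 0.02  =>  c = (0.06) ** (1 / 0.1505)
crossover = (3.0 * HORWITZ_COEFF) ** (1.0 / (1.0 - HORWITZ_EXPONENT))

print(f"crossover concentration: {crossover:.2e} (mass fraction)")
for c in (1e-9, 1e-8, 1e-7):
    print(f"c = {c:.0e}: Horwitz RSD = {horwitz_sd(c) / c:.1%}, c/3 RSD = {1 / 3:.1%}")
```

Under these assumptions the crossover falls a little below 10 ppb, which agrees with the concentration quoted in the paragraph above: below it, the Horwitz prediction exceeds c/3.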
It was also clear that there was a more restricted deviation from the Horwitz function at higher concentrations: at concentrations greater than about 10^−1 (10% m/m) the reproducibility precision was on average again somewhat smaller than σH. The trend of the data at these concentrations could be represented as σ = 0.01c^0.5. This line intersects the Horwitz function at a concentration of about 10^−0.86, that is, 13.8% m/m.
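The quoted intersection can be reproduced with a few lines of arithmetic. This is a minimal sketch that again assumes the commonly quoted Horwitz form σH = 0.02c^0.8495 (an assumption, since the formula is not present in the surviving text); the variable names are illustrative.

```python
import math

# Assumed Horwitz function: sigma_H = 0.02 * c**0.8495 (c as a mass fraction).
# High-concentration trend from the text: sigma = 0.01 * c**0.5.
# Setting them equal: 0.02 * c**0.8495 = 0.01 * c**0.5
#   =>  c**(0.8495 - 0.5) = 0.01 / 0.02
#   =>  c = 0.5 ** (1 / 0.3495)
upper_crossover = (0.01 / 0.02) ** (1.0 / (0.8495 - 0.5))

print(f"log10(c) = {math.log10(upper_crossover):.2f}")
print(f"c = {upper_crossover:.3f} mass fraction = {upper_crossover:.1%} m/m")
```

With these constants the logarithm comes out close to −0.86, in agreement with the figure of 13.8% m/m given above.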
These facts have implications for the use of σH as a fitness-for-purpose criterion in proficiency tests. Previously it has been argued that the Horwitz function was an appropriate criterion, at least down to 10^−8, because analytical methods tend to evolve towards fitness for purpose by a kind of natural selection.6 However, it was clear that the function should not be used in that context for proficiency tests at the very low concentrations appropriate for analytes such as mycotoxins.
In the present study, data from recent (post-1997) collaborative trials, all involving analytes at concentrations below 10 ppb, were examined to see whether the previously noted trend was being maintained, and whether a modification of the Horwitz function could be formulated to serve as an objective fitness-for-purpose criterion. The trials all related to the determination of mycotoxins; in all, 47 different trial materials were analysed in nine separate studies.
Fig. 1 Results from recent collaborative trials of methods for the determination of mycotoxins, showing the trend of the data (solid line) and the Horwitz function (dashed line).
There are two main conclusions to be drawn from these findings. First, the deviation from the Horwitz function is more marked in the current data than in the 1997 study. It is not clear that this trend towards better precision has stabilised, although that would be a reasonable working assumption for the moment. Second, the deviations can be represented well by a simple generalisation relating precision to concentration, apparently without lack of fit apart from a few outliers. The generalisation is a better guide to true behaviour than individual results, because its errors would be smaller.
The following function is therefore suggested as a contemporary model for reproducibility standard deviation:
|  | (1) | 
Fig. 2 z-Scores for aflatoxin M1, calculated from results in FAPAS Round 0423, by using sigma values from both the Horwitz function and the modified function.
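To illustrate how the choice of sigma changes a z-score, the sketch below compares the two functions for a single result. Several things here are assumptions rather than content recoverable from this copy of the text: the Horwitz form σH = 0.02c^0.8495; the constants of the modified function (0.22c below 1.2 × 10^−7, and 0.01c^0.5 above 0.138), which are the values commonly quoted for it and should be checked against the published equation (1); and the usual proficiency-test definition z = (x − X)/σ. The assigned value and laboratory result are hypothetical and are not data from FAPAS Round 0423.

```python
def horwitz_sd(c: float) -> float:
    """Assumed Horwitz function; c is a mass fraction."""
    return 0.02 * c ** 0.8495


def modified_sd(c: float) -> float:
    """Assumed form of the modified function (constants are commonly quoted values)."""
    if c < 1.2e-7:
        return 0.22 * c          # constant relative standard deviation of 22%
    if c <= 0.138:
        return horwitz_sd(c)     # unmodified Horwitz region
    return 0.01 * c ** 0.5       # high-concentration trend from the text


def z_score(result: float, assigned: float, sigma: float) -> float:
    """Conventional z-score: (x - X) / sigma."""
    return (result - assigned) / sigma


# Hypothetical example: assigned value 0.050 ug/kg, reported result 0.065 ug/kg.
# 1 ug/kg corresponds to a mass fraction of 1e-9.
assigned = 0.050e-9
result = 0.065e-9

z_horwitz = z_score(result, assigned, horwitz_sd(assigned))
z_modified = z_score(result, assigned, modified_sd(assigned))

print(f"z (Horwitz sigma):  {z_horwitz:.2f}")
print(f"z (modified sigma): {z_modified:.2f}")
```

Because the modified sigma is smaller than σH at sub-ppb concentrations, the same deviation from the assigned value gives a larger z-score, so the modified criterion is the more demanding of the two in that range.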
This journal is © The Royal Society of Chemistry 2000