The use of Zipf's law in the screening of analytical data: a step beyond Benford
Abstract
This study demonstrates, for the first time, the effectiveness of Zipf's law in screening analytical data sets for outliers, data formatting errors and data transcription errors, particularly when the data sets are small. For pollutant concentrations in ambient air, the multivariate nature of the measurement and the relationship between the measured values of these multivariate quantities are the characteristics that allow a Zipf's law approach to data screening to succeed. Furthermore, Zipf's law is shown to have advantages over other novel data screening techniques, such as Benford's law, in terms of sensitivity and scope.