Dynamic baselines for the detection of water quality impacts – the case of shale gas development †

There is a need for the development of e ﬀ ective baselines against which the water quality impacts of new developments can be assessed. The speci ﬁ c conductance of ﬂ owback water from shale gas operations is typically many times the speci ﬁ c conductance of surface water and near-surface groundwater. This contrast in speci ﬁ c conductance means that speci ﬁ c conductance could be the ideal determinand for detecting water quality impacts from shale gas extraction. If speci ﬁ c conductance is to be used for detecting the impacts of shale gas operations, then a baseline of speci ﬁ c conductance in water bodies is required. Here, Bayesian hierarchical modelling of speci ﬁ c conductance was applied across English groundwater. The modelling used existing, spot-sampled data from the years 2000 to 2018 from 537 unique borehole locations. When the di ﬀ erences between boreholes was considered, then the approach was su ﬃ ciently sensitive to detect 1% mixing of fracking ﬂ uid in groundwater at a 95% con ﬁ dence interval. The Bayesian hierarchical modelling maximises the return on public investment and provides a means by which future observations can be judged. “ Dynamic baselines for the detection of water quality impacts – the case of shale gas development ” . This paper describes a new approach to de  ning environmental baselines. Environmental baselines are needed so as to be able to de  ne whether an impact has, or has not, occurred with acceptable con  dence. What is required is a probabilistic baseline which can predict the expected distribution at a monitoring location and so it is possible to judge whether a new observation is “ unusual ” at a given probability. In this study we use a Bayesian hierarchical analysis of the speci  c conductance of English groundwater to develop a baseline against which the impact of shale gas exploitation could be judged. The approach could detect 1% fracking  uid in groundwater at a 95% con  dence.


Introduction
Many developed countries have extensive water quality monitoring networks that have been active for many years and so generating a large resource of water quality monitoring data, earliest river water quality monitoring starts in the nineteenth century; 1 and in the developed world this monitoring is now spatially extensive. However, monitoring data has oen been acquired for very specic, sometimes local, purposes, and in the European Union this has oen been focused upon the "good ecological status" dictated by the Water Framework Directive. 2,3 The use of these data in ways beyond those originally intended is attractive but can be problematic. 4 Our aim was to determine whether this considerable quantity of data can be used not only to meet its original purpose but also aid in the protection and improvement of water resources from new challenges such as the development of a new onshore shale gas industry. One of the purposes of water quality monitoring is to assess whether activities have an adverse impact on a water body. To assess whether a human activity has an adverse impact or conversely demonstrate that no adverse impact has occurred, within an acceptable level of uncertainty, it is necessary to demonstrate that the activity (in this case shale gas exploitation) has changed the water quality above an acceptable limit or above a value that would have been expected without the activity present. In many cases, when the presence of pollutants is not overwhelming, then to demonstrate the presence or absence of an impact an effective baseline must exist, but such a baseline is not commonly available.
For a baseline, or indicator or change, to be of use in detecting and apportioning change or adverse impact there are a number of properties that a robust baseline should have. First, the monitored parameter should be related to the activities of concern, or provide an unrelated comparison for assurance purposes (e.g. a conservative marker), and the monitored parameter should be not related to other pollution sources. Second, the monitored parameter should be sensitive to the change with a large change in the magnitude for a small change in the activity. Third, any chosen water quality parameter should be a lead indicator of change, and in the case of water quality this may mean a determinand with little attenuation in its environment. It would be preferable that monitoring could provide early warning of any impact so that measures could be deployed in time to mitigate or avoid a problem. Finally, a larger sample size is preferable and that is more easily achieved if the measurement cost is low and readily deployable.
Shale gas exploitation is in its infancy in England and the potential impact on ground and surface water quality is a public concern. 5,6 Extensive monitoring campaigns have been undertaken around a shale gas exploration site in Lancashire including the determination of baseline conditions for a wide variety of determinands. 7 A number of technologies have been proposed directed at indicator substances for the monitoring of shale gas impacts on water resources, for example, measurement of dissolved CH 4 ; 8 radium; 9 barium and sulphate; 10 strontium isotopes 11 ). However, by far the greatest contrast between owback or produced water (i.e. water returning from shale gas operations) and shallow ground or surface water is their salinity, where salinity could be measured as total dissolved solids (TDS) or electrical conductivity. A review of the total dissolved solids (TDS) of shale gas owback water from US shale gas operations showed that its salinity was between 0.67 and 10 times the TDS of seawater TDS (log TDS seawater < 4. 6) and far larger than the TDS of freshwater (log TDS of freshwater $ 2.6). 12 So far only one fracking operation has been conducted in the UK, for which owback uid composition has been measured (Preese Hall, Lancashire). The conductivity of the owback uid from the Preese Hall well was between 133 730 and 150 614 mS cm À1 . 13 Measuring salinity, as electrical conductivity, requires no special equipment and is commonly and frequently measured in freshwater. Furthermore, the UK has been monitoring specic conductance (electrical conductivity of water corrected to a known temperature) across its range of fresh water bodies for decades. The availability and ease of collecting conductivity data are in contrast with some of the other indicators proposed for assessing shale gas, for example, the availability of strontium isotope data 14 and the expense and facility required to collect further data.
These idealised characteristics for a good baseline indicator parameter mean that salinity, and its allied measures, specic conductance and TDS, are excellent prospects for detecting change in water quality from any developing shale gas industry. Measures of salinity show a high contrast between owback uids and freshwater environments and that this contrast is highly specic to shale gas. The high specicity and contrast between the salinity of the owback uids and the salinity of freshwater means that salinity could be a lead indicator of pollution incidents. What is more, high salinity water from hydrocarbon exploitation has been shown to be toxic to exposed organisms. 15,16 Shale gas sites would be expected to have a range of mitigation measures in place; however, Davies et al. 17 showed that 6% of active, onshore oil wells in the UK have reported a pollution incident in the last decade. Clancy et al. 18 showed, for Colorado, that for conventional oil exploitation sites, 0.02% of the produced water was spilt. Meanwhile in Alberta, Canada, in 2015 there were 113 incidents of spills of owback and produced water documented. 19 Boothroyd et al. 20 showed that 30% of decommissioned oil and gas wells showed signicant CH 4 leaks even within 10 years of their decommissioning.
Although there are considerable numbers of specic conductance measurements available, these measurements were not collected with the purpose of creating a baseline for judging the impacts of a new industry. The government bodies responsible for water quality monitoring in England (The Environment Agency) are testing a range of statistical tools to use with monitoring data from shale sites but as yet they make no recommendations for indicator substances. Moreover, a simple, coherent method is required for the objective and transparent assessment of water quality monitoring. Worrall et al. 21 proposed to use existing riverwater specic conductance data to develop a baseline against which the impacts of shale gas exploitation could be assessed using Bayesian generalised linear modelling. The proposed method had several innate advantages; the method was data-driven; exible with respect to how the specic conductance data are represented; can use factorial (e.g. month of sampling) and include covariate information (e.g. land-use). Creating the statistical modelling within a Bayesian approach means that the approach creates a framework whereby all information has value. Equally, within this framework, new information can be added to update estimates and model outputs are a probability distribution which means that risk can be considered at all stages. The approach developed by Worrall et al., 21 creates a baseline which is dynamic in time and space as it predicts values for sites but updates with each new observation. However, the analysis of Worrall et al. 21 showed that for river water the sampling frequency and the instream residence time compared to the likely volume spilt in acute incidents would mean that monitoring would be unlikely to detect all but the most catastrophic spills. Bayesian modelling approaches have been successfully developed and applied elsewhere for environmental management: for detecting dissolved CH 4 ; 22 ecological risk assessment; 23 stream water quality; 24 impacts of climate change. 25 Therefore, the objective of this study was to develop a dynamic baseline for groundwater which is dynamic as it adapts in time and space as new data become available. Furthermore, the baseline, obtained by means of calculating the expected distribution of specic conductance in groundwater, will provide a probabilistic risk assessment of a pollution event. The approach of this study was to consider specic conductance in groundwater and use hierarchical Bayesian generalised linear modelling to predict the expected distribution at the time of sampling. Any sampled value can be compared to its predicted distribution to estimate the probability that an unusual event has occurred, i.e. there has been a pollution incident.

Study sites
The study considered all the specic conductance at 25 C measured in English groundwater wells and boreholes between the years 2000 and 2018no springs were included in this analysis. Only measurements that were from routine monitoring were included and data listed as being from pollution incidents were excluded as they will represent additional sampling at a higher frequency of times when higher specic conductance might be expected. Any boreholes that were listed as being part of or adjacent to landll sites were removed. The data distribution of specic conductance was examined and values reported as having specic conductance lower than 40 mS cm À1 were removed as these had specic conductance lower than that of rainfall 26 and such entries therefore were removed as being mis-entered data. Similarly, a QQ 0 plot was used to assess the distribution of the data and values above 6000 mS cm À1 were excluded as being outliersthis represented the data from one site. The analytical methodology for specic conductance is outlined and controlled by the UK government's Standing Committee of Analysts (http:// standingcommitteeofanalysts.co.uk/). Sites were retained within the study if they were sampled in at least two years during the period 2000 to 2018. This condition was used in part to restrict the number of sample sites that had to be considered and also to facilitate the estimation of interaction terms. Given the criteria for selection, there were 3370 observations from 537 unique locations that could be included in this analysis (Fig. 1).

Bayesian hierarchical generalised linear modelling
The estimation was based upon Bayesian hierarchical generalised linear modelling. Each data point (i.e., specic conductance measurement -k) within a generalised linear modelling approach is assumed to be generated from a specic distribution within the exponential family of distribution. For the prediction of groundwater specic conductance (k -mS cm À1 ), the generalised linear model for the prediction of k for site x at time t given a gamma distribution would be: where E(k xt ), the expected value of the groundwater specic conductance for site x at time t (mS cm À1 ); site, the factor representing the different monitoring sites for which data were available; year, the different years of sampling; month, the different months of sampling. In this way b xt is calculated for each monitoring site, for each month, across the record and represents the trend in the ground water specic conductance across the period for a particular site and month. Note that year was given as the difference from the mean of all the data (DYear) and in this way a xt is not the y-intercept, i.e. the value of k at year 0, rather a xt is the expected value of k at site x for the middle of the period of record. The approach of expressing year as the difference from the global mean value is so as to make estimation of more precise as a now sits in the middle of the observations rather than at one extreme as would be the case if was the y-intercept. Eqn (i) is a gamma regression; the gamma distribution rather than a normal distribution is used because it is not dened for values less than zero and so a negative k is never predicted. However, to test the t of the model proposed above, models were also tted using both normal and log normal distributions nd in the latter case the equations used are: The approach given in eqn (i) through (v) is hierarchical because the model is established with a prior distribution on a prior distribution. In this case, the prior distribution on the terms a and b depends on the factors and each of these factors has its own prior distribution. The conceptual structure of the model is illustrated in Fig. 2.
Eqn (i) to (v) are expressed in terms of site and month as factors and year as a covariate. Year was considered as a covariate, which meant that for each of the factors (e.g. the difference between the months of sampling), the value of a and b could be estimated for each factor level included in a particular model. Furthermore, the inclusion of year as Fig. 1 The location of all the boreholes included in this study. a covariate meant that predictions for future years could also be made for each of the factor level combinations. The site factor considers the difference between the monitoring sites and sites were included in the analysis if they were sampled in two or more years. Only sites with two or more years of sampling were retained as this facilitated the estimation of interaction terms, i.e. those terms that are the combination of factors. The month factor had 12 levels which is one for each calendar month.
As an alternative to the site factor, the aquifer class and aquifer type were considered. The aquifer class had seven classes dened by the British Geological Survey (BGS) and the classes are: 1A, 1B, 1C, 2A, 2B, 2C, and 3, where: 1 ¼ intergranular aquifer; 2 ¼ fractured aquifer; 3rocks with essentially no groundwater; furthermore, with A ¼ highly productive aquifer; B ¼ moderately productive; C ¼ low productive aquifer. 27 The aquifer type is based on the geology that the borehole penetrates. The aquifer type was established from the British Geological Survey (BGS) aquifer designation map and the aquifer type was included if two of more sites sampled that particular aquifer geology; in all it was possible to include 27 different aquifer types (https://www.bgs.ac.uk/products/ hydrogeology/aquiferDesignation.htmlthe dataset summarised by the aquifer class and type is given in Table S1 †). The approach is hierarchical because the model is established with a prior distribution on a prior distribution. In this case the prior distribution on the trend over time, e.g. the terms a and b (eqn (i)), depends on the factors site, month, etc., and each of these terms has its own prior distribution.
Markov Chain Monte Carlo (MCMC) simulation was used for the Bayesian analysis. The posterior distribution of the specic conductance was calculated using the Jags code called from R (version 4.0) using the R2Jags (version 0.6-1) library (R and JAGS code for the most complex model is included in the ESI †). An MCMC chain of length 10 000 iterations aer a 2000 burn was used with samples saved every 10 cycles and with 3 chains.
The prior distribution for the values of b were set as normal distributions with a mean of zero so that both positive and negative trends in time were equally favoured at the outset, and standard deviation of the priors was set to be an uninformative prior and the likely small value of values of b, e.g. N(0, 1000). The prior distributions for the values of a were also set as a normal distribution but with the mean set to be the mean of all the dataset and a standard deviation chosen to make negative values unlikely, e.g. N(6, 0.25). The choice of such a distribution is justied because if values of b are small (e.g. inuence of factors is weak), then a is the prediction of the specic conductance and thus should approximate the expected value of the distribution of the data as a whole. A half-t distribution was used for the prior distributions of the standard deviations for all terms as half-t distributions mean that a negative value of the standard deviation cannot occur, e.g. hal(0, 5, 1)T(0).
The t of model was tested by a number of approaches. Firstly, the adequacy of the MCMC process was assessed usingR, the convergence statistic, and values ofR < 1.1 were considered acceptable. IfR > 1.1, then the burn in process and number of iterations were increased. Second, for any factor, the 95% credible interval does not include zero, and going forward this is referred to as being a 95% probability of being signicantly different from zero. Third, when a factor, interaction, or covariate is included, this caused total model deviance to decreasedeviance is a goodness of t measure and is a generalization of the idea of using the sum of squares of residuals in ordinary least squares. Fourth, when an additional factor, interaction or covariate is included, there is a resulting decrease in the deviance information criterion (DIC). Because the inclusion of additional factors, covariates or interactions will increase the degrees of freedom of a model such inclusion would lead to a decrease in the total deviance of a model, and hence the need for another measure rather than just total deviance. The DIC accounts for the trade-off between the inclusion of more tting parameters against the additional t of the model and penalises for additional parameters relative to the t of the model -DIC is the general case of the Akaike Information Criterion. Finally, the effective number of parameters (pD) was monitored, and it would be expected that as a factor or covariate was added to the model, then the number of effective parameters would likewise increase, and if pD did not increase with the inclusion of a factor or covariate then that parameter has no effect on the model and could be removed. Furthermore, that pD should be close to the ideal case if all parameters are contributing, and so therefore the calculated pD can be expressed as a percentage of that maximum possiblethis value can never be greater than 100%. Finally, the t of any model was judged using a posterior prediction check, i.e. the output of the model was plotted against the observed values and the tted line between these two examinedit would be expected that a good t model would give a 1 : 1 line between observed and posterior predicted values. Given the purpose of the model the balance of these criteria would be for a model that includes signicant factors and gives the best possible t as described by the lowest deviance given a reasonable t across the range of observations shown in the posterior probability plots.

Model types
Three models were tted (i) Year, month and site factorsthis model was used to give prediction at individual sites and was designed to understand the temporal trend at each site and assess the capacity for prediction and baseline development.
(ii) Year, month and aquifer class factorsthe primary purpose of this model was to predict specic conductance for a previously unknown site. The aquifer into which any borehole will penetrate could be known before drilling, and so even if any previous sampling has not been included in the model development the question is whether it would be possible to predict specic conductance at sites not included in the analysis.
(iii) Year, month and aquifer type factorsas with the aquifer class we can consider a model based on the aquifer type as a means of predicting a specic conductance baseline for a site not previously sampled or not included in the model development.

Model verication and application
The data used for this study were groundwater data collected from the year 2000 to 2019. The models were developed on the groundwater data collected between 2000 and 2018, and to verify and further test the models, the models were used to predict the specic conductance for 2019. Given the factors included in the model, this prediction could be done for each borehole, aquifer class and aquifer type and for every month. These predictions were compared to the results of specic conductance in groundwater available for 2019.
As an example, the models were applied to the most and the least sampled sites and results shown over the study period.

The problem of the depth of sampling
The specic conductance of groundwater is expected to increase with the depth below the surface and therefore, any variation modelled could be due to variation in the depth that the sample came from. The depth of sampling is not recorded as part of the Environment Agency's monitoring; however, as part of separate investigations into the potential for geothermal energy in the UK a catalogue of depth constrained monitoring is available from Burley et al. 28 These data were used to test how much of the observed variation in specic conductance could be explained by the change in the depth of sampling. It should be noted that no groundwater body is delimited for water supply below 400 m in the UK. 29 No specic conductivity was reported within the available geothermal database but a standard relationship between salinity and specic conductance was used for the conversion. 30 The relationship between the specic conductance and depth for English groundwater was used to examine whether variation in specic conductance observed between sites within this study could simply be ascribed to variation in the depth of sampling.

Comparative data
To give context to the specic conductance of English groundwater, the Environment Agency database was examined for the same time period to give the distribution of specic conductance for English river water, lake water and sewage treatment discharge. For English rainfall records from UK Precip-NET were used (https://uk-air.defra.gov.uk/networks/network-info? view¼precipnet).

Results
Because the aquifer type was not dened for every borehole, the dataset when considering the aquifer type was slightly smaller with 3315 observations at 535 unique locations with 27 different aquifer types. The data have a median of 772.5 mS cm À1 and a 95% percentile range of 221 to 2350 mS cm À1 . The comparative data for English river water, lake water, rain water and nal sewage effluent (Table 1) show that groundwater does have a higher specic conductance The distribution of the data by covariates and factors shows that there is a decline in the specic conductance of groundwater over time since 2000 (Fig. 3). Patterns of difference are less clear for the month and aquifer class factors (Fig. 4). For the geology, there are some clear distinctions with the lowest values in the Upper Greensand and highest in the Upper Devonian formations, although both of these aquifers had sample size of less than 6 samples (Table S1 † - Fig. 5).

Statistical modelling
For all models theR < 1.01 and so the tting process for all models was deemed adequate. In all cases the best t model was log normal for each model and the results of model t relative to covariates and factors are detailed in Table 2. The pD will increase with the number of parameters included in the model, but the percentage of the expected pD may decrease as more parameters are included. The model with the most possible parameters (year + month + site) was the one with the lowest percentage of the expected pD. This low value of the percentage of the expected pD can be interpreted as that there being strong inter-and intra-annual relationships at some sites but by no means at all sites. The DIC value is at a minimum for the year + site model but increases for the model with the most parameters, i.e. year + month + site, implying that the most efficient model was year + site. However, model efficiency, as dened by DIC, may not be the most important criterion in terms of predicting specic conductance and the ability to t the data accurately would be more important. With respect to deviance the best-t model was the year + month + site model. The nal test to be considered is the posterior prediction comparison (Fig. 6) and this suggests that the model year + month + site was the best predictor. The slopes (b) estimated for each site shows that the mean slope estimates were negative for 497 out of the 535 sites with the remainder having positive estimates of the slope (Fig. 7). When the uncertainty in those estimates is considered, then it is possible to consider the chance that each of the slope estimates is zero, in which case at the 95% probability no site showed a positive slope and 7 showed a negative slope. The spatial distribution of the sites with signicant slope, and indeed negative, slopes are all relatively coastal when compared to the distribution of all the boreholes in the study (Fig. 7). The site with the greatest signicant slope would have a specic conductance decrease from 1685 mS cm À1 in the year 2000 to 572 mS cm À1 in 2018.
The slopes (b) estimated for each aquifer type shows that the mean slope estimates were negative for 20 out of the 27 aquifer types with the remainder having positive estimates of the slope. When the uncertainty in those estimates was considered. then 11 out of the 27 aquifer types had a 95% probability, while four  aquifer types showed a signicantly positive slope. The largest decline was for the Lias clay group and the greatest increase was for the Millstone Grit group.
The slopes estimated (b) for the aquifer classes showed that all seven aquifer classes had slopes signicantly different from zero with ve out seven aquifer classes showing a negative slope (aquifer classes: 1A, 1B, 1C, 2A and 2C), while two aquifer classes showed positive trends (aquifer classes: 2B and 3). The greatest increase was observed in aquifer class 3 and the largest decrease was estimated for aquifer class 2C.
The prediction of the model can be plotted both in terms of the expected value (the arithmetic mean as the tted distribution was normal) and measures of the distribution across the UK (Fig. 8).
The distribution of expected values across the UK shows no distinctive patterns with no apparent pattern contrasting coastal with inland sites; no geographical trend (e.g. west vs. east); or upland vs. lowland sites. The pattern does indeed reect the result in Fig. 6, i.e. the aquifer class and type are not as good predictors of groundwater specic conductance as site specic prediction.

Model verication application
The results for the most and the least sampled sites (Fig. 9) shows that in both cases the observed data are well within the envelope of the 95th percentile condence interval predicted from the statistical modelling. Further, verication of the model comes from the comparison between the predicted and observed values for 2019, and this shows that all but 4 out of 29 possible observations were as predicted (Fig. 10). It is of course possible that the predicted distribution does not t the observed because an unusual event has occurred at one of the sites. In each of the cases that could be considered in this study where the observed was outside the predicted distribution the observed was lower than the predicted distributiontherefore the modelling may be over-predicting values.

Problem of depth
From the Geothermal catalogue, the specic conductance is oen greater at depth than at the surface (Fig. 10). At the   Fig. 10 and in this box it would be difficult to suggest any relationship between the depth and specic conductance. Within the box drawn (as dened by the specic conductance within the dataset of this study), the depths of the samples from the UK Geothermal catalogue were up to 451 m below the surface. The lower extent of groundwater for water supply in the UK is oen considered as 400 mbgl. 29 Over the entire range of depths sampled the best-t equation was: where depth ¼ depth below ground surface (m). The value in the bracket is the standard error in the coefficient. Note that the constant term was not signicant at 95% probability. However, another interpretation of Fig. 11 is that specic conductance has saturated at depth and that no value of specic conductance greater than 181 895 mS cm À1 (TDS ¼ 228 728 mg L À1 ) was ever observed. For the range of specic conductance observed in this study (up to 2350 mS cm À1 ), the relationship with the depth was not signicant at a 95% probability of being different from zero, i.e. the inability to include the depth of sampling in this analysis was not critical to the result. Fig. 11 illustrates why analysing the specic conductance of groundwater will be useful for detecting water from depth. There have only been two fracking operations conducted in the UK at Preese Hall and Preston New Road, both in Lancashire, for which the conductivity of owback uid is known. The conductivity (not specic conductance) from the Preese Hall well varied from 133 730 and 150 614 mS cm À1 . 13,31 At the depth of fracking at Preese Hall, it would be expected that the specic conductance of the groundwater would be 450 000 mS cm À1 or even if Fig. 11 is viewed in linear as opposed to log-log space the predicted specic conductance would be 285 000 mS cm À1 , i.e. this would suggest that there is mixing of the deep groundwater with either the fracking uid itself or less saline water as the water raised back to the surface. The fracking uid used at Preese Hall was a "slick water" uid, 32 a uid that is predominantly composed of water and 1-2 additives (Table 3), usually a friction reducer and a chemical tracer. 33 At Preese Hall, 425 g of a "sodium salt" were used in 8399 m 3 of tap water, which would suggest that the TDS of the fracking uid would be of the  order of 400 mg L À1 or 1122 mS cm À1 . Therefore, given the measured conductivity of the Preese Hall owback uid; the predicted specic conductance of the groundwater at the depth of fracking; and the specic conductance of the fracking uids;

Discussion
this would suggest that fracking uid mixes with the groundwater at depth in 1 : 1 to 1 : 3 volume ratio. Given the likely conductance of owback uid (for Preese Hall: 133 730 and 150 614 mS cm À1 ) and the predicted specic  conductance of each monitored site (median ¼ 696 mS cm À1 , 95th percentile range of 404 to 1158 mS cm À1 ), it is possible to assess what amount of mixing of the owback uid would be sufficient to increase the specic conductance of the local shallow groundwater to a value greater than would be expected at a given probability, for example: where f ¼ the proportion of owback uid mixing into the surface groundwater at the site of concern; k x ¼ specic conductance (mS cm À1 ) with x as owback (the k of owback uid); 97.5 (the k of the site at 97.5% probability); and 2.5 (the k of the site at 2.5% probability). This test is conservative as it assumes that the mixing with the owback uid must be sufficient to cause the specic conductance of the shallow groundwater to rise from the 2.5th percentile of its predicted distribution to the 97.5th percentile of its predicted specic conductance distribution. Furthermore, if it is assumed that the k of the owback uid is the lowest value recorded for Preese Hall, then the values of f will be at the high end of the range. There is still little evidence that shale gas exploitation has actual signicant impacts on groundwater resources. Fontenot et al. 36 did not nd any signicant increase in TDS in well water within 3 km of shale gas operations in the Barnett shale basin, Texas. Similarly, Warner et al. 37 found that the shallow groundwater composition in 127 boreholes of the Fayetteville shale gas basin, Arkansas, did not reect the composition of the owback and produced water from the shale gas operations. Harkness et al. 38 and Warner et al. 37 did suggest that there was Fig. 10 The comparison of the predicted and observed specific conductance, which could be included in the modelling and for which measurements had been made in 2019. Fig. 11 The variation of specific conductance with the depth for data derived from Burley et al. (1984). The shaded box represents the values observed for surface water in England. A simple approach to analyzing these data would have been to observe the distribution at one site and simply compare the newest observation with that distribution. Such a comparison would be limited by the size of the sample set at each site and for the sites used in this study the average sample size was only 6 samples. But such a comparison would not be fair as all the samples at the site might have been taken in summer months or perhaps at the beginning of the sampling period rather than in the most recent years. It would be better to compare the most recent observations with observations for the current year, even better with the most recent time step, i.e. for the same year, month, day of the month or hour of the day. Of course it is unlikely that there will be sufficient observations to give such a reasonable distribution for any month for any year at any site. So it would be useful if information from other sites could be drawn upon to make the most appropriate comparison possible; this then is what the approach of this study has achieved. By using all available information, the approach of this study could estimate a distribution of observations for every month and for every year at each site. An analogous, non-Bayesian approach might be weighted regression analysis. 40,41 The Bayesian approach uses all available data to predict distributions at the sites of interest. The approach is a systematic and transparent approach to analysing data and provides a probability, with uncertainty, as to the nature of any observed data. Thus in turn the probability that any pollution has, or has not, occurred can be assessed. All risk assessment is actual a probability statement and the tools here use Bayesian approaches so all results will be a probability with an uncertainty. The Bayesian framework means that not only does the tool use all available data it also automatically updates as new information becomes available. A physical modelling approach would require more extensive data and observations to parameterise, calibrate and validate the physical model and so is not well suited to the type of dispersed data set that commonly arises from water quality monitoring. Conversely, the Bayesian approach developed here uses all the information and can return a result with clear uncertainty estimation.
For determinands such as specic conductance no specic legal standard exists within the UK or EU. For operators of works such as shale gas sites the review period for water quality monitoring is dictated in their license to operate, but, are data reviewed each time new data are produced and is the regulator informed if there is an issue? The regulator in the UK may be asked to report at any time to the secretary of state at the highest government level, although it is not clear how oen this might occur. The approach presented here could be used for each new datapoint, i.e. the calculation and method presented would give the probability that a new observation is exceptional and not what should be expected.
Could it be possible to improve on the current approach? It is possible to include covariates and other factors within the analysis. This study has already included the aquifer class and type, but there could be a spatial correlation within the monitoring that could relate boreholes between aquifers. Such a spatial correlation in the observations is not included in this modelling approach. Qian et al. 42 have developed a Bayesian hierarchical model for the calculation of nutrient loads in rivers that incorporated spatial correlation, but the spatial correlation was based on the ow through the river network which provides for one dimensional and directional correlation not appropriate for the three dimensions of the aquifers considered here.
Specic conductance could be expected to co-vary with some cations and anions and indeed different sources of water may have a similar range of specic conductance but a very diffident set of cations and anions that constitute that ionic strength. Other studies have considered other chemical indicators of pollution from unconventional hydrocarbons: Cl/ Br; 43 Sr isotopes; 14 or the ratio (Ba + Sr)/Mg. Indeed, Wilson and Van Briesen 44 used Cl/Br ratios to detect shale gas uids in surface water of the Monongahela river in Pennsylvania. However, a particular advantage of specic conductance is that it is regularly monitored whereas other determinands proposed in the literature have been far less frequently monitored.
This study has shown that between 2000 and 2018, specic conductance in English groundwater declined. Although only seven sites showed a signicant decline the distribution of the slopes showed that 93% of sties had a negative slope and no sites had a signicant increase. The salinisation of aquifers has been commonly reported across the globe, typically in coastal aquifers 45 and this can be due to overexploitation 46 or rising sea levels. 47 Not all salinication has occurred in coastal aquifers and Rivett et al. 48 have noted that in the UK we use between 1 and 3 Mtonnes of salt every year for road deicing and that this can be traced into aquifers. Indeed, many countries have noted deicing salt accumulating in aquifers (e.g. Canada 49 ) and Perera et al. 50 estimated that 19% of the applied road salt accumulated in the aquifer. These studies would suggest to expect salinisation in a country such as the UK where coastal aquifers are common and where there is an increasing demand for water with an expanding population. However, desalinisation was observed and never salinization. Han et al. 46 ascribed desalinisation in coastal karst aquifers to return ows from irrigation. Cloutier et al. 51 gave an example of desalinisation as a recovery from a period of a high sea level during a glacial period as modern recharge ushes the seawater out. We would propose that the decline in groundwater specic conductance observed in this study could be due to changes in recharge patterns driven by climate change. If wetter winters caused an increase in the recharge of meteoritic water, then the specic conductance of the groundwater would decrease. Wen et al. 39 found signicant declines in TDS since 1990 across 1384 in Pennsylvania and they ascribed these declines to declining atmospheric deposition and decreased inputs from the coal and steel industry.

Conclusions
This study has developed a Bayesian hierarchical model for determining the specic conductance of English groundwater. The model is able to estimate the distribution of specic conductance for any aquifer type, aquifer class or borehole that was included in the monitoring dataset. This estimation could be projected forward to predict distributions in the future, and furthermore, the Bayesian framework means that the probability estimation will continue to improve with all ongoing monitoring. The method was sensitive enough such that it could detect mixing of 1% of shale gas produced water into shallow groundwater.

Conflicts of interest
There are no conicts to declare.