Brian M. Pecson‡*a, Anya Kaufmann‡*a, Daniel Gerrityb, Charles N. Haasc, Edmund Setod, Nicholas J. Ashbolte, Theresa Slifkof, Emily Darbya and Adam Olivierig
aTrussell Technologies, Oakland, California, USA. E-mail: brianp@trusselltech.com; anyak@trusselltech.com
bApplied Research and Development Center, Southern Nevada Water Authority, Las Vegas, Nevada, USA
cDrexel University, Philadelphia, Pennsylvania, USA
dUniversity of Washington, Seattle, Washington, USA
eUniversity of South Australia, Adelaide, Australia
fMetropolitan Water District of Southern California, Los Angeles, California, USA
gEOA, 1410 Jackson Street, Oakland, California, USA
First published on 20th October 2023
Specifying appropriate pathogen treatment requirements is critical to ensure that direct potable reuse (DPR) systems provide consistent and reliable protection of public health. This study leverages several research efforts conducted on behalf of the California State Water Resources Control Board to provide guidance on selecting science-based pathogen treatment requirements for DPR. Advancements in pathogen detection methods have produced new robust, high-quality datasets that can be used to characterize the distribution of pathogen concentrations present in raw wastewater. Such probabilistic distributions should replace the deterministic point estimate approach previously used in regulatory development. Specifically, to calculate pathogen treatment requirements, pathogen distributions should be used in probabilistic quantitative microbial risk assessments that account for variability in concentrations. This approach was applied using the latest high-quality datasets to determine the log reduction targets necessary to achieve an annual risk goal of 1 in 10000 infections per person per year as well as a more stringent daily risk goal of 2.7 × 10−7 infections per person per day. The probabilistic approach resulted in pathogen log reduction targets of 13-log10 for enteric viruses, 10-log10 for Giardia, and 10-log10 for Cryptosporidium. An additional 4-log10 level of redundancy provides protection against undetected failures while maintaining high degrees of compliance with the daily (99%) and annual risk goals (>99%). The limitations of the use of molecular pathogen data are also discussed. While the recommendations and findings are targeted for California, they are broadly applicable to the development of DPR regulations outside California and the U.S.
Water impact: Pathogen log-reduction requirements for direct potable reuse (DPR) must ensure reliable protection of public health, but should be appropriately selected to avoid the economic, societal, and environmental costs of over-treatment. This study recommends both a framework and specific requirements for pathogen control in DPR using the highest-quality pathogen monitoring data in probabilistic microbial risk assessments.
California has already developed regulations for two forms of IPR (groundwater recharge and surface water augmentation), and the California State Water Resources Control Board (State Water Board) is under legislative mandate to develop DPR regulations by the end of 2023. In its 2016 DPR feasibility assessment, the State Water Board concluded that it needed to modernize the process of developing pathogen log reduction targets (LRTs) for DPR by 1) developing a new high-quality dataset to better characterize pathogen concentrations in raw wastewater, and 2) implementing an updated probabilistic approach for determining LRTs.3,4 The probabilistic approach prioritizes the use of high-quality monitoring methods to develop robust datasets that are used to describe statistical functions (probability distributions) that characterize the likely range in pathogen concentrations. The distributions are used to estimate microbial risk as well as the likelihoods that those values occur within a given range. The probabilistic approach allows for the risk manager to consider the entire distribution of risk and eliminates the need to assume extreme estimates for pathogen concentrations that could erroneously overestimate risk.
To meet this goal, the State Water Board undertook three research projects related to enteric pathogen control: two focused on the characterization of pathogen concentrations in wastewater and a third on the application of these new data to develop treatment requirements. In the first, a 14-month pathogen monitoring campaign was conducted to better characterize the concentrations of representative enteric pathogens in raw wastewater.5,6 This study was deemed critical because these concentrations define the starting point for calculating LRTs: higher raw wastewater concentrations require greater levels of treatment to reduce pathogen concentrations down to acceptable drinking water levels (and vice versa). The second research effort evaluated how these concentrations would be impacted during periods of higher disease occurrence, such as during outbreaks.7 Because the State Water Board acknowledged the need for improved methods to “provide more complete information on [pathogen] concentrations and their variability”,3 the pathogen monitoring effort developed new standard operating protocols (SOPs) adhering to strict QA/QC regimes to ensure the new data were of the highest quality.
The third research effort focused on how to use the data to determine the level of treatment needed to meet the State's risk goal of 2.7 × 10−7 infections per person per day.8 The main product of this effort is a publicly-accessible, web-based tool called DPRisk (https://cawaterdatadive.shinyapps.io/DPRisk/) that uses quantitative microbial risk assessment (QMRA) to 1) evaluate risk-based treatment requirements and 2) assess the performance of candidate DPR trains in meeting these goals.9 The DPRisk tool meets the State Water Board's goal of implementing a probabilistic QMRA method to confirm the necessary removal values for human-infectious viruses, Giardia, and Cryptosporidium.3 The research effort also provided the State Water Board's Division of Drinking Water (DDW) with quantitative insight regarding how key inputs in the QMRA impact pathogen risk and DPR treatment requirements.
This paper synthesizes the findings from the three pathogen research efforts to identify new scientifically supported pathogen LRTs using the highest-quality data sources and probabilistic risk assessment methods. It also provides recommendations for identifying and using high-quality data sources, describes challenges with the use of molecular pathogen data, and shows how redundancy can mitigate the impacts of treatment failures. While the effort was focused on developing recommendations for California, the approach and information are also transferable to other developed regions. Both the DPRisk tool and the new dataset provide flexibility to be adapted for site-specific conditions in other locations. If a more localized dataset is desired, the same monitoring campaign approach could be adopted, the resulting data integrated into the new dataset distributions,10 and the combined inputs evaluated under site-specific conditions within DPRisk.
The probability of infection across the exposure periods is calculated as:

\[ P_{\mathrm{inf}} = 1 - \prod_{n}\left[ 1 - \mathrm{DR}\left( C_n \cdot V_n \cdot 10^{-\mathrm{LRT}} \right) \right] \]

where:
P_inf = probability of infection
n = number of exposure periods
DR = dose–response function for the reference pathogen
V_n = volume of water ingested
C_n = pathogen concentration in the source water
LRT = log reduction target
DDW has stated that it will use an infection-based risk target of 1 in 10000 infections per person per year.11 This annual target is adapted, however, into a more stringent daily risk goal of 2.7 × 10−7 infections per person per day by dividing the annual risk evenly across each day of the year (10−4 infections per person per year/365 days). To evaluate the LRT required to meet the daily risk goal, LRT values were input into the equation above in increments of 0.1, from 0 up to 22. A Monte Carlo analysis was used to capture the inherent variability in pathogen concentrations at a 15-minute interval. Risk for a given 15-minute period was then calculated from the pathogen concentration, exposure volume, and LRT occurring during that period. The 15-minute interval was selected for several reasons: 1) many potable reuse regulations require that treatment process performance be measured “continuously”, which is defined as at least once every 15 minutes, 2) the State Water Board wanted the modeling to capture the variability in process performance at the same frequency as the monitoring, 3) the minimum duration of a process failure would be no less than 15 minutes based on this frequency, and 4) the use of higher frequency data allowed each day to be characterized by distributions (rather than point estimates) of both influent raw wastewater pathogen concentrations and unit process performance. Daily risk was then calculated for the given LRT. The LRT value resulting in the smallest difference between the calculated daily probability of infection and the daily risk goal was stored as the LRT for that day. This process was simulated 10000 times to develop a distribution of LRTs that met the daily risk goal.
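To make the procedure concrete, the following is a minimal Python sketch of the per-day Monte Carlo described above, applied to Cryptosporidium with the aggregated distribution from Table 1 and the beta-Poisson dose–response from Table 2. The fixed 2 L per day ingestion volume, its even split across the 96 15-minute periods, and the simple grid search are illustrative assumptions only; this is not the DPRisk implementation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Aggregated DPR-2 Cryptosporidium distribution (Table 1): log10(oocysts/L) ~ Normal(1.9, 0.6)
LOG10_MEAN, LOG10_SD = 1.9, 0.6
# Cryptosporidium beta-Poisson dose-response parameters (Table 2)
ALPHA, BETA = 0.116, 0.121
DAILY_RISK_GOAL = 2.7e-7        # infections per person per day
DAILY_VOLUME_L = 2.0            # assumed ingestion volume (not specified in this excerpt)
N_PERIODS = 96                  # 15-minute exposure periods per day
LRT_GRID = np.arange(0.0, 22.1, 0.1)

def beta_poisson(dose, alpha=ALPHA, beta=BETA):
    """Approximate beta-Poisson probability of infection for a given dose."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

def required_lrt_for_one_day(rng):
    """Simulate one day of 15-minute concentrations and return the LRT whose
    daily risk is closest to the daily risk goal."""
    conc = 10.0 ** rng.normal(LOG10_MEAN, LOG10_SD, N_PERIODS)      # oocysts/L
    per_period_dose = conc * (DAILY_VOLUME_L / N_PERIODS)           # oocysts ingested pre-treatment
    # Daily risk at each candidate LRT (treatment reduces the dose by 10^-LRT)
    dose = per_period_dose[None, :] * 10.0 ** (-LRT_GRID[:, None])
    daily_risk = 1.0 - np.prod(1.0 - beta_poisson(dose), axis=1)
    return LRT_GRID[np.argmin(np.abs(daily_risk - DAILY_RISK_GOAL))]

lrts = np.array([required_lrt_for_one_day(rng) for _ in range(10_000)])
print("LRT percentiles (0.01, 50, 99.99):", np.percentile(lrts, [0.01, 50, 99.99]))
```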
To evaluate daily risk for a given LRT or distribution of LRTs, a Monte Carlo analysis was used to capture the variability in pathogen concentrations at 15-minute intervals and a distribution of daily risk was developed by simulating the process 10000 times. To evaluate annual risk, a Monte Carlo process was used to sample from the daily risk distribution. This process was repeated to produce 100 simulations of annual risk.
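A correspondingly simplified sketch of the annual-risk step is shown below: daily risks are resampled 365 at a time and combined into an annual probability of infection, assuming independence across days. The placeholder daily-risk values are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)

def annual_risk(daily_risks, rng, n_years=100):
    """Sample 365 daily risks (with replacement) per simulated year and
    combine them into an annual probability of infection, assuming
    independence across days."""
    draws = rng.choice(daily_risks, size=(n_years, 365), replace=True)
    return 1.0 - np.prod(1.0 - draws, axis=1)

# Placeholder: every simulated day exactly meets the daily risk goal
daily_risks = np.full(10_000, 2.7e-7)
print(annual_risk(daily_risks, rng).mean())   # ~1e-4 when every day meets the daily goal
```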
In this study, the model was used to evaluate the impact of differing assumptions about wastewater pathogen concentrations and dose–response models on the LRT required to meet the daily risk threshold for each reference enteric pathogen. The model was also used to evaluate the impact of failures in treatment on the ability to meet daily and annual risk goals and evaluate the level of redundancy that would adequately protect against failures. To model failures, the LRTs in the model are adjusted to account for changes in the level of treatment. Information about the duration, magnitude, and frequency of failure assumed in this study is provided in section 3.3.
Reference pathogen | Distribution of concentration in raw wastewatera | Units | Data source
---|---|---|---
Norovirus GIIb | Normal (4.0, 1.2) | GC L−1 | 5
Norovirus GIIb | Point (9.0) | GC L−1 | 17, 18
Enterovirus spp. | Normal (3.2, 1.0)c | MPN L−1 | 5
Giardia spp.d | Normal (4.0, 0.4) | Cysts per L | 5
Giardia spp.d | Point (5.0) | Cysts per L | 19, 18
Cryptosporidium spp.d | Normal (1.9, 0.6) | Oocysts per L | 10
Cryptosporidium spp.d | Point (4.0) | Oocysts per L | 20, 21, 18

a Values are log10 transformed. Normal distribution parameters are listed as (mean, standard deviation). GC – genome copies. MPN – most probable number.
b Norovirus GII was selected because it is present at higher concentrations than both GIA and GIB.
c To develop recommended LRTs, the authors recommend an additional layer of conservatism: the assumption that only 10% of the total viruses present were culturable, which effectively shifts the mean of this distribution to 4.2.
d Giardia cysts and Cryptosporidium oocysts were determined microscopically. Infective cysts and oocysts were conservatively assumed to be equivalent to the total number determined microscopically. US EPA has previously provided a rationale for this assumption based on the presumption that overestimation of infectivity would be offset by underestimation of recovery.22
To evaluate the impact of the different assumptions about raw wastewater pathogen concentrations (i.e., distribution vs. point estimate) on the required LRT to meet the daily risk goal, the QMRA model was run holding all other variables constant (i.e., dose–response and consumption).
Reference pathogen | Model | Parameters | Parameter values | Units | Ref.
---|---|---|---|---|---
Norovirus (GI) | Hypergeometrica | Alpha | 0.04 | GC | 25
 | | Beta | 0.055 | |
Norovirus (GI and GII.4) | Fractional Poisson | P | 0.72 | GC | 26
 | | U | 1106 | |
 | | Beta | 2.80 | |
Giardia lamblia | Exponential | r | 0.0199 | Cysts | 23
Cryptosporidium spp. | Beta-Poisson | Alpha | 0.116 | Oocysts | 24
 | | Beta | 0.121 | |
Rotavirusb | Approximate beta-Poisson | Alpha | 0.253 | FFU | 28
 | | Beta | 0.426 | |

a For this analysis, the approximate beta-Poisson dose–response model was used instead of the hypergeometric dose–response model due to the significant differences in computing time between the two and the relatively small differences in resulting infection rate at low doses.
b Rotavirus dose–response function used in conjunction with enterovirus occurrence data for consistency with the virus reduction requirements of the Surface Water Treatment Rule. FFU: fluorescence focus units.
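For reference, the dose–response forms listed above can be written compactly as follows. This is an illustrative sketch only: the fractional Poisson function shown is the simplified, fully disaggregated form using just the P parameter and omits the aggregation parameters (U, Beta) listed in the table, and the beta-Poisson shown is the approximate form noted in footnote a.

```python
import numpy as np

def exponential_dr(dose, r=0.0199):
    """Exponential dose-response (Giardia lamblia, Table 2)."""
    return 1.0 - np.exp(-r * dose)

def beta_poisson_dr(dose, alpha, beta):
    """Approximate beta-Poisson dose-response.
    Cryptosporidium: alpha=0.116, beta=0.121; rotavirus: alpha=0.253, beta=0.426 (Table 2)."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

def fractional_poisson_dr(dose, p=0.72):
    """Simplified (fully disaggregated) fractional Poisson dose-response for norovirus.
    Omits the aggregation parameters listed in Table 2; illustrative only."""
    return p * (1.0 - np.exp(-dose))

# Example: probability of infection from a single ingested Cryptosporidium oocyst
print(beta_poisson_dr(1.0, alpha=0.116, beta=0.121))
```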
The log10-transformed GC:IU ratio data were fitted to a normal distribution using the function “fitdistcens” from the R package “fitdistrplus,” which estimates the mean and standard deviation for censored datasets using maximum likelihood estimation.33 Samples with virus concentrations below the limit of quantification (LOQ) for the molecular methods resulted in left-censored GC:IU ratios since the numerator in the ratio (GC) was below the LOQ; the right bound for these data points was based on the molecular method LOQ. Samples with virus concentrations below the LOQ for the culture methods resulted in right-censored GC:IU ratios since the denominator in the ratio (IU) was below the LOQ; the left bound for these data points was based on the culture method LOQ. Data points where the culture and molecular concentrations were both below the LOQ were excluded from the distribution. Approximately 9% of the samples (11/122 samples) had enterovirus concentrations below the LOQ with both the culture and molecular methods, and 19% of the samples (23/122 samples) had adenovirus concentrations below the LOQ with both the culture and molecular methods.
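The fit above was performed in R (fitdistrplus::fitdistcens). A roughly equivalent sketch in Python is shown below, maximizing a censored-normal log-likelihood over (lower, upper) bounds in log10 space; the example GC:IU values are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def censored_normal_fit(lower, upper):
    """Maximum-likelihood fit of a normal distribution to censored data.
    Each observation is a (lower, upper) bound in log10 space:
      lower == upper  -> exact observation
      lower == -inf   -> left-censored (value below `upper`)
      upper == +inf   -> right-censored (value above `lower`)
    """
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    exact = lower == upper

    def nll(params):
        mu, log_sd = params
        sd = np.exp(log_sd)  # keep the standard deviation positive
        ll = np.sum(norm.logpdf(lower[exact], mu, sd))
        ll += np.sum(np.log(norm.cdf(upper[~exact], mu, sd) - norm.cdf(lower[~exact], mu, sd)))
        return -ll

    start_mu = np.nanmean(np.where(exact, lower, np.nan)) if exact.any() else 0.0
    res = minimize(nll, x0=[start_mu, 0.0], method="Nelder-Mead")
    return res.x[0], np.exp(res.x[1])   # (mean, standard deviation)

# Hypothetical log10 GC:IU ratios: two exact values, one left-censored (< 2.0),
# one right-censored (> 3.0)
lo = [1.2, 2.5, -np.inf, 3.0]
hi = [1.2, 2.5, 2.0, np.inf]
print(censored_normal_fit(lo, hi))
```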
In light of these limitations, the DPR-2 study identified several characteristics of “optimal” high-quality datasets and then adapted methods to meet these criteria.5 The optimized methods improved sensitivity and allowed the distributions to be characterized from the lowest through the highest concentrations. For example, the DPR-2 Cryptosporidium method analyzed 1-L samples that provided greater sensitivity than the aforementioned 50-μl method, with the detection rate increasing from 40% to 98%. Additionally, matrix spikes were added to each sample to correct the concentrations for recovery efficiency. Several recent studies evaluating the accuracy of environmental monitoring have highlighted the importance of this and other QA/QC steps, since failing to correct for recovery efficiency can introduce errors of several orders of magnitude.34 Methods were adapted to meet a set of optimal criteria, and the subsequent campaign yielded the most robust raw wastewater pathogen dataset collected to date, encompassing multiple targets and detection methods (i.e., microscopy, culture and molecular) (Table 3).
Optimal criteria | Compliance of DPR-2 dataset
---|---
Large sample size | 120 samples evaluated using nine different assays
 | Two protozoa (Giardia and Cryptosporidium) enumerated via immunofluorescent microscopic methods
 | Five viruses (enterovirus, adenovirus, norovirus GI and GII, SARS-CoV-2) enumerated with culture and/or molecular methods
High method sensitivity | >90% detection rate for culture- and microscopy-based methods
Compatible with QMRA | Culture and microscopy data can be used directly in QMRA without conversions from molecular data (i.e., genome copies) to estimate concentrations of infectious pathogens
QA/QC | Full suite of QA/QC controls for all samples, including matrix spikes for each protozoa sample and every other virus sample
Geographic/scale distribution | Sampling at five wastewater treatment plants varying in size from 17 to 292 MGD and representing one-quarter of the California population
Temporal distribution | 24 samples at each sampling location over a 14-month period
To expand the dataset to include other geographic locations and time periods, Darby et al. identified other historical datasets meeting a minimum set of criteria and developed an approach to combine these highest-quality data into single pathogen distributions.10 All data points from each of the selected studies were pooled and log10-transformed, yielding an approximately normally distributed dataset. The parameters of the resulting log-normal distribution were estimated, and values from the distribution were sampled for the estimation of LRT requirements. The authors acknowledge that this pooling approach may obscure site-specific variations at the tails of the distributions, but believe it is still advantageous to incorporate the variation across multiple locations into a single distribution. Site-specific monitoring could be used to confirm the appropriateness of the DPR-2 dataset when applied in other locations.
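A minimal sketch of this pooling step is shown below, with placeholder concentrations standing in for the DPR-2 and historical datasets.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical recovery-corrected Cryptosporidium concentrations (oocysts/L)
# from three studies; real values come from the DPR-2 and historical datasets.
study_a = [45.0, 120.0, 16.0, 230.0]
study_b = [88.0, 30.0, 410.0]
study_c = [12.0, 95.0, 60.0, 150.0, 20.0]

pooled_log10 = np.log10(np.concatenate([study_a, study_b, study_c]))
mu, sd = norm.fit(pooled_log10)             # MLE of mean and standard deviation
print(f"log10 oocysts/L ~ Normal({mu:.2f}, {sd:.2f})")

# Values sampled from this distribution feed the LRT Monte Carlo described earlier
samples = 10.0 ** norm.rvs(mu, sd, size=5, random_state=0)
```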
The analysis showed that the new DPR-2 data are aligned with the historical distributions, with the exception of a Cryptosporidium dataset collected in Australia that led to a small but relevant shift in the distribution.21 The recovery-corrected distributions characterize concentrations from the lowest through the highest values, providing confidence in the data across the full extent of the distribution. The authors recommend that the aggregated distributions comprising the DPR-2 and high-quality historical datasets be used as the basis for regulatory development in California. This approach was also endorsed by the expert panel helping the State Water Board evaluate the public health protectiveness of the DPR criteria.35
The use of distributions in probabilistic assessments would represent an important shift from the point estimate approach the State Water Board used to develop their IPR regulations. Previously, the State Water Board used the single highest value reported in the literature to characterize wastewater concentrations of enteric virus, Giardia, and Cryptosporidium12 and they are using the same approach for DPR.18 The point estimate approach includes significant conservatism in that it assumes every raw influent wastewater contains the highest pathogen concentrations ever reported in the literature at all times. The use of conservative point estimates may be justified if there is significant uncertainty associated with the values or when the data only provide confidence in the highest values in the distribution, such as was the case previously for Cryptosporidium.20 However, when high-quality data meeting the optimal criteria are available, the full distribution of pathogens should replace point estimates and be incorporated in probabilistic assessments of risk. The authors recommend that regulators use the new recovery-corrected distributions to replace earlier point estimates.
This decision is relevant for regulatory development because it can impact the LRT required to protect public health. The following example shows how the Cryptosporidium LRT would be impacted by the use of 1) a conservative point estimate20 or 2) the aggregated DPR-2 distribution,10 while keeping all other QMRA inputs equal (see Methods). The point estimate leads to a single LRT value of approximately 11 logs to achieve the daily risk goal of 2.7 × 10−7 infections per person per day (Fig. 1). The aggregated DPR-2 distribution, however, results in a distribution of LRTs spanning from 8.8 to 9.5 logs over the 0.01st to the 99.99th percentile. As a result, the new, higher quality data provide a scientific justification for a treatment goal that is 1.5-logs lower than the LRT developed using the point estimate. A reduction in the LRT should not be misconstrued as a reduction in public health protection. While the public health goal remains the same—to provide treatment that reduces risk to acceptably low levels—our understanding of what it takes to achieve these goals has advanced. This advancement shows that layers of conservatism that were once justified can now be removed without compromising public health. In a time when the effects of climate change are impacting access to water sources across the globe, potable reuse will be increasingly relied upon as a necessary mitigation strategy. Selecting appropriate treatment requirements—neither too low nor too high—will increase the sustainability of potable reuse, reducing costs and expanding implementation in resource-scarce areas. Unnecessarily high treatment requirements may only increase the burden on municipalities without resulting in a real increase in public health protection.
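As a rough check on the point-estimate case, the required LRT can be obtained in closed form by inverting the approximate beta-Poisson model, assuming the Table 1 point value of 10^4.0 oocysts per L, the Table 2 parameters, and an illustrative 2 L per day ingestion volume (the consumption assumption is not stated in this excerpt).

```python
import numpy as np

ALPHA, BETA = 0.116, 0.121        # Cryptosporidium beta-Poisson (Table 2)
C_POINT = 10.0 ** 4.0             # point estimate, oocysts/L (Table 1)
VOLUME_L = 2.0                    # assumed daily ingestion volume
DAILY_RISK_GOAL = 2.7e-7

# Invert the approximate beta-Poisson to find the dose giving the daily risk goal,
# then back out the LRT needed to reduce the ingested dose to that level.
dose_at_goal = BETA * ((1.0 - DAILY_RISK_GOAL) ** (-1.0 / ALPHA) - 1.0)
lrt = np.log10(C_POINT * VOLUME_L / dose_at_goal)
print(f"LRT ≈ {lrt:.1f}")          # ~11 logs under these assumptions, consistent with Fig. 1
```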
Fig. 1 Cryptosporidium log reduction target (LRT) required to meet a risk goal of 2.7 × 10−7 infections per person per day based on the use of a point estimate of wastewater concentration using data from Robertson et al.20 (blue dashed line) or a distribution of concentrations from Darby et al.10 (orange solid line). Y-Axis denotes the probability that a given LRT would either meet or be below the daily risk goal. |
Several new lines of evidence provide a rationale for modifying the 1:1 GC:IU assumption. For years, researchers have shown that genomic material remains detectable for viruses that have been inactivated,38,39 demonstrating that GCs alone are not reliable indicators of infectivity. SARS-CoV-2 provides the most recent evidence of a virus whose GCs are present in wastewater without being infective.40 DPR-2 provides new quantitative data further supporting that GC:IU ratios in raw wastewater are not static but fluctuate over a wide range. Enteroviruses were quantified by culture and molecular methods and showed GC:IU ratios that ranged from as low as 1:1 to as high as 10000:1 (Fig. 2). GC:IU ratios for adenovirus ranged from approximately 1:1 to 100000:1 (data not shown). Recent work with an emerging norovirus culture system has shown that this phenomenon also applies to norovirus.41 The norovirus molecular signal persists longer than viable norovirus in the environment, showing that GC:IU ratios vary and are not statically 1:1.
Using the new DPR-2 data to bookend potential GC:IU ratios introduces 4 logs of variability in the resulting LRT. If applied to the State Water Board's norovirus LRT derivation, the LRT would extend from a single 16-log point estimate to a range spanning from as low as 12- to as high as 16-logs (Fig. 3). The uncertainty in the infectivity of GCs also impacts estimates of norovirus risk. The authors support recommendations to use a range of dose–response functions in QMRA.27 Using both high-end and low-end dose–response models (represented by the hypergeometric and fractional Poisson models) results in an approximate 3-log range for norovirus LRTs (Fig. 3). Using the DPR-2 distribution of norovirus concentrations rather than the point estimate introduces another 3-logs of variability (Fig. 3). Coupling together these three factors—1) the 4-log variability associated with the GC:IU ratio, 2) the 3-log variability associated with dose–response functions, and 3) the 3- to 4-log variability associated with the distribution vs. point estimate—leads to a >10 order of magnitude level of uncertainty for norovirus. The range of potential LRTs therefore extends from 6 to 16 logs (Fig. 3). While the authors acknowledge the public health importance of norovirus, current knowledge gaps lead to excessive degrees of uncertainty in estimating LRTs. Rather than arbitrarily selecting a single LRT from within this range, such as the 16-log extremity proposed by the State Water Board, the authors recommend the use of alternate reference pathogens to establish virus LRTs.
An alternative is the approach used by US EPA in the development of virus requirements for the 1989 Surface Water Treatment Rule.2,42 EPA acknowledged there was a wide diversity of relevant viruses, and one way to address this diversity was to assume worst-case characteristics along two lines: occurrence and infectivity. Specifically, EPA coupled the occurrence data of enteroviruses (a group of culturable human viruses present in high concentrations in wastewater) with the dose–response function for rotavirus (a highly infective human pathogen). This combination was intended to provide conservatism in viral treatment requirements. The approach benefits from regulatory precedent established with both the federal Surface Water Treatment Rule and all of California's IPR regulations.13 The authors recommend the use of this EPA approach for determining enteric virus treatment requirements based on its high degree of conservatism, consistency with previous regulatory frameworks, and independence from the limitations of molecular data.
One strategy to protect against undetected failures is to require sufficient treatment redundancy to offset the 4- to 6-log increase in risk. To evaluate these benefits, 4 and 5 logs of redundancy were added to the 10-log Cryptosporidium LRT described previously. The impact of a 6-log, 24-hour failure occurring 1% of the year (i.e., 3.65 days per year) on the daily and annual risk profiles is shown in Fig. 4. Even with a high (1%) rate of failure, a DPR system providing at least 4 logs of redundancy would protect against large, undetected failures and achieve the 2.7 × 10−7 daily risk goal 99% of the time (>361 days per year) and the 10−4 annual risk goal >99% of the time. Based on these findings, 4 logs of redundancy would be sufficient to protect DPR systems against even frequent, large magnitude failures. Identical redundancy requirements apply for virus and Giardia (see ESI†). While redundancy provisions are ultimately a risk management decision, this analysis provides a scientific basis to justify a 4-log redundancy requirement.
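A minimal sketch of how such a failure scenario can be layered onto the daily-risk Monte Carlo is shown below, reusing the illustrative assumptions from the earlier Cryptosporidium sketch (fixed 2 L per day ingestion, evenly split across 96 15-minute periods); it is not the analysis behind Fig. 4.

```python
import numpy as np

rng = np.random.default_rng(3)

ALPHA, BETA = 0.116, 0.121              # Cryptosporidium beta-Poisson (Table 2)
LOG10_MEAN, LOG10_SD = 1.9, 0.6         # aggregated DPR-2 distribution (Table 1)
DAILY_VOLUME_L, N_PERIODS = 2.0, 96     # assumed ingestion split across 15-min periods
DAILY_RISK_GOAL = 2.7e-7

BASE_LRT = 10.0          # Cryptosporidium LRT meeting the daily risk goal
REDUNDANCY = 4.0         # additional logs of treatment
FAILURE_MAGNITUDE = 6.0  # logs of treatment lost during a failure
FAILURE_FRACTION = 0.01  # fraction of days with a 24-hour failure (~3.65 days/yr)

def daily_risk(lrt, rng):
    """Daily probability of infection for one simulated day at a fixed LRT."""
    conc = 10.0 ** rng.normal(LOG10_MEAN, LOG10_SD, N_PERIODS)
    dose = conc * (DAILY_VOLUME_L / N_PERIODS) * 10.0 ** (-lrt)
    p = 1.0 - (1.0 + dose / BETA) ** (-ALPHA)
    return 1.0 - np.prod(1.0 - p)

risks = []
for _ in range(10_000):
    lrt = BASE_LRT + REDUNDANCY
    if rng.random() < FAILURE_FRACTION:      # undetected 24-hour, 6-log failure
        lrt -= FAILURE_MAGNITUDE
    risks.append(daily_risk(lrt, rng))

risks = np.array(risks)
print("Fraction of days meeting daily risk goal:", np.mean(risks <= DAILY_RISK_GOAL))
```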
A summary of the recommendations is presented in Table 4.
 | Virus | Giardia | Cryptosporidium
---|---|---|---
Raw wastewater dataset | Aggregated DPR-2 enterovirus culture distribution10a | Aggregated DPR-2 Giardia distribution10a | Aggregated DPR-2 Cryptosporidium distribution10a
Modifications | Assume 10% of total viruses quantified through culture | N/A | N/A
Dose response | Rotavirus beta-Poissonb | Exponentialb | Beta-Poissonb
Minimum LRTs for public health protection | 13-log | 10-log | 10-log
Redundancy against undetected failures | 4-log | 4-log | 4-log
Overall LRT requirements | 17-log | 14-log | 14-log

a See Table 1 for statistical distribution parameters. b See Table 2 for dose–response model parameters. LRT – log10 reduction target for pathogen control; N/A – not applicable.
• Distributions of pathogen concentrations, rather than point estimates, should be used for regulatory development, particularly when high-quality datasets are available.
• The aggregated, recovery-corrected DPR-2 dataset includes robust, high-quality data that should be used as the basis for raw wastewater inputs for QMRA. Modifications of the dataset (e.g., adjustments to account for incomplete virus enumeration through culture methods) should be considered for additional conservatism.
• Where high-quality local pathogen data are unavailable, the authors recommend the use of the aggregated DPR-2 dataset as a starting point for QMRA in locations within and outside the U.S. Site-specific monitoring could be used to confirm the appropriateness of the DPR-2 dataset in that location. When appropriate, the new data could be integrated using the approach of Darby et al. to create a more robust dataset.10
• Quantitative microbial risk assessments should be conducted using probabilistic approaches incorporating pathogen concentration distributions rather than deterministic methods relying exclusively on point estimates. Publicly-available, online tools such as DPRisk can be used for such analyses.
• Culture- and microscopy-based data reduce the uncertainty associated with the interpretation of molecular data. If molecular data are used, uncertainties should be understood by bookending the GC:IU and dose response assumptions with both upper- and lower-end estimates.
• The conclusions from a quantitative microbial risk assessment are a product of the underlying assumptions. Thus, risk assessments must be transparent and reproducible. To ensure reproducibility, publicly-available platforms like DPRisk can be used to catalog all decisions.
• The quantitative benefits of both treatment (i.e., engineered unit processes) and non-treatment barriers (e.g., small environmental buffers) should be assessed and incorporated into DPR criteria. Even small environmental buffers can provide redundancy and protection against failures.
Footnotes
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3ew00362k
‡ These authors contributed equally. |
This journal is © The Royal Society of Chemistry 2023 |