Open Access Article
Katherine Crank a, Katerina Papp a, Casey Barber ab, Kai Chung a, Emily Clements a, Wilbur Frehner a, Deena Hannoun a, Travis Lane a, Christina Morrison a, Bonnie Mull c, Edwin Oh d, Phillip Wang a and Daniel Gerrity *a
a Southern Nevada Water Authority, P.O. Box 99954, Las Vegas, NV 89193, USA. E-mail: daniel.gerrity@snwa.com
b School of Public Health, University of Nevada, 4700 S. Maryland Parkway, Suite 335, Mail Stop 3063, Las Vegas, NV 89119, USA
c BCS Laboratories Inc, 4609 NW 6th St, STE A, Gainesville, FL 32609, USA
d Laboratory of Neurogenetics and Precision Medicine, University of Nevada Las Vegas, 4505 S. Maryland Parkway, Las Vegas, NV 89154, USA
First published on 13th December 2024
Characterization of wastewater concentrations of human enteric pathogens and human fecal indicators provides valuable insights and data for use by regulators and other stakeholders when developing treatment criteria for water reuse applications, performing quantitative microbial risk assessments, or conducting microbial source tracking. Wastewater samples collected over three years during and after the COVID-19 pandemic were analyzed retrospectively (March 2020–September 2022) and prospectively (October 2022–December 2023) by qPCR for molecular markers of adenovirus, enterovirus, norovirus GI & GII, as well as the human fecal indicators pepper mild mottle virus, crAssphage, and HF183 (n = 1112). A sub-campaign was conducted, and wastewater samples were tested for the culturable enteric viruses adenovirus and enterovirus (n = 56) and the protozoan parasites Cryptosporidium and Giardia (n = 73) over one year (January–December 2023). All assays had high detection rates, ranging from 71% to 100%, and were fit to log-normal distributions. All molecular markers for enteric pathogens displayed seasonal and geographic variation, potentially explained by seasonal epidemiology of gastrointestinal illness, differing populations, and differing sample types. Additionally, the impact of Nevada-specific COVID-19 public health guidance (e.g., mask mandates, stay-at-home orders) on enteric pathogen concentrations was characterized, with significantly higher concentrations of molecular markers observed in “non-pandemic” conditions. This study provides high quality (i.e., high sensitivity, minimally censored, recovery adjusted) pathogen and indicator datasets with insights for use in academic, public health/epidemiological, and industry/regulatory applications.
Water impact
Wastewater samples collected over three years were analyzed for a variety of viral and protozoal pathogens and human fecal indicators to establish statistical distributions of concentrations. Pandemic conditions, seasonality, sample biobanking, and sample type all had various impacts on target concentrations with implications for use in risk assessments, regulatory rule setting, and in microbial source tracking applications.
The integration of pathogen concentrations, exposure pathways, and treatment efficiency datasets can be used in QMRA studies for assessing risk from exposure to wastewater pathogens.18,19 Increasingly severe drought, particularly in the southwestern United States (U.S.), and global climate change have heightened awareness of recycled water as a valuable component of water resource portfolios. This includes indirect (IPR) or direct potable reuse (DPR). Robust pathogen concentration datasets in untreated or partially treated wastewater are necessary for determining overall treatment levels that are needed for safe implementation of potable reuse, and these datasets are instrumental in forming regulatory guidelines. Regulatory frameworks often emphasize worst-case assumptions (e.g., peak pathogen concentrations), resulting in potentially unsustainable capital and O&M costs for the resultant treatment paradigms. By better characterizing conditions and factors that lead to these worst-case scenarios, it is possible to respond to these conditions in near-real-time rather than resorting to excessive levels of advanced treatment aimed at mitigating low frequency, high consequence, and potentially site-specific events.
Reported concentrations of pathogens and human fecal indicators in wastewater are highly variable due to extensive external (e.g., geographic location, dilution, disease burden, socio-economic status20) and internal (e.g., quantification methods, intra-laboratory variation21) factors. Few studies have characterized the occurrence and variability of a wide set of human enteric pathogens and fecal indicators over extended periods of time in an effort to explain the drivers of these observed concentration ranges and variability.4,22–26 Notably, Water Research Foundation (WRF) project 4989, “Pathogen Monitoring in Untreated Wastewater”,27 set out to perform an extensive monitoring campaign and literature review to develop a combined dataset of pathogen concentrations in wastewater for use by California regulators in establishing log reduction value (LRV) targets for DPR. WRF 4989 identified two major limitations in wastewater pathogen occurrence studies that have hindered the application of wastewater data, particularly for regulatory purposes, though these limitations extend to applications beyond regulatory development. These limitations include the lack of sufficient analytical sensitivity, resulting in high levels of non-detects and non-quantifiable (i.e., highly censored) data, and the absence of appropriate recovery spike-ins and controls. Other limitations that impact large-scale wastewater studies include limited analytical scope or using only one enumeration method, such as viral cell culture, microscopy, or molecular methods, rather than combining multiple approaches. For understanding pathogen trends in wastewater on a broad scale, it is useful to integrate multiple methods to uncover findings that go beyond methods-related variability.
In this study, we leverage efforts related to a SARS-CoV-2 wastewater surveillance campaign in Southern Nevada to develop a comprehensive wastewater dataset of diverse pathogens and indicators. This dataset characterizes the concentrations of human enteric pathogens and human fecal indicators in raw wastewater over a three-year-long monitoring effort yielding over 1000 samples analyzed by qPCR and over 50 samples analyzed by both cell culture and microscopy methods. The sampling period spanned from 2020–2024, encompassing the peak impact of the COVID-19 pandemic. Herein, “pandemic” and “non-pandemic” conditions were delineated by Nevada's statewide COVID-19 Declaration of Emergency. The targets chosen reflect the potential use of these data for DPR regulatory development by including both molecular and culture-based enumeration of the enteric viruses adenovirus (AdV) and enterovirus (EnV), molecular data for norovirus (NoV) GI and GII, and microscopy-based enumeration of the protozoa Cryptosporidium and Giardia. Additionally, three human fecal indicator targets were included for MST and WBE applications: the RNA of pepper mild mottle virus (PMMoV), the DNA of Bacteroides phage crAssphage (Carjivirus communis28), and a DNA marker for human-specific Bacteroides (HF183). To ensure that our data meet high quality standards required for use in regulatory contexts, we implement extensive quality control following recommendations outlined by WRF 4989 and related publications29,30 and expand on them to develop additional quality control recommendations for future studies utilizing biobanked nucleic acid samples.
000 weekly visitors, the University of Nevada Las Vegas, with ∼28 000 students, and Harry Reid International Airport, which serves ∼55 million travelers annually.31
| Facility | Population served | Flow rate (mgd) | Sample type and source | Sample collection volume (mL): qPCR assays | Sample collection volume (mL): protozoa assays | Sample collection volume (mL): viral culture assays |
|---|---|---|---|---|---|---|
| 1 | 872 009a | 100 | Grab primary effluent | 10 000 | NA | 1000 |
| | | | Grab influent | NA | 100 | NA |
| 2 | 86 330 | 5 | Composite influent | 10 000 | 100 | 1000 |
| 3 | 757 418 | 42 | Composite/grab influentd | 250 | 100 | 1000 |
| 4b | N/Ab | N/Ab | Composite influent | 250 | 100 | 1000 |
| 4Ac | 133 977 | 15 | Grab influent | 250 | NA | 1000 |
| 4B | 114 532 | 6 | Grab influent | 250 | NA | 1000 |
| 5 | 255 008 | 20 | Composite influent | 250 | 100 | 1000 |
| 6 | 16 399 | 0.8 | Grab influent | 250 | 100 | 1000 |

a Facility 1 also serves ∼700 000 weekly visitors and ∼55 million airport travelers annually.
b Facility 4 is the 24 hour composite of facility 4A (west trunk line) and facility 4B (east trunk line).
c Facility 4A (and by default facility 4) receives solids and bypass flows from facility 2.
d Grab influent samples were collected on 6/14/2021, 8/15/2022, 8/22/2022, and 8/29/2022 (otherwise composite).
BCoV recovery was determined for every sample using one of several approaches, depending on the history of the sample. If BCoV recovery was ultimately determined to be <1%, the sample was excluded (n = 89, <8% of all samples). For ‘fresh’ samples (i.e., non-archived and analyzed within approximately one week of collection), spiked BCoV was directly quantified in each sample to estimate recovery of molecular viral targets. Some samples collected in early 2020 were processed and analyzed before the BCoV spiking approach was incorporated into the monitoring effort (n = 37). For these samples, recovery was set to 2%, the average observed recovery for the combined HFUF-Centricon method and consistent with Gerrity et al. (2021).32 For ‘archived’ samples, several approaches to determining recovery were assessed.
The remainder of this section discusses the various approaches for assessing recovery and the potential effects of target degradation during long-term sample storage. By re-quantifying BCoV in archived nucleic acid extracts and comparing to the original recovery of any given sample, it is possible to not only account for original loss during sample processing but also for potential nucleic acid degradation and loss caused by several years of storage at −30 °C. The resulting correction factor (recovery×degradation) can then be applied to all assays in which BCoV can be quantified before and after storage. For samples in which BCoV was originally detected but then non-detect upon re-analysis of the archived extract, several recovery estimation methods were assessed: (1) substitution of an average BCoV recovery, (2) substitution of a PMMoV-derived degradation factor, and (3) supervised machine learning (SML). Data where BCoV recoveries were known (n = 610) were split randomly into two sets, one with 75% of the data and the other with the remaining 25%. The larger set (n = 458) was used to develop the averages, PMMoV degradation terms, and SML models. These models were then used to estimate the recovery for the smaller dataset (n = 152), and the differences between the estimated recovery values and the actual recovery values were evaluated using root mean square error (RMSE), R2, and mean absolute error (MAE) goodness-of-fit metrics.
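The hold-out evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' workflow (which used R): the function names are hypothetical, the random seed is arbitrary, and R2 is assumed to be the coefficient of determination (1 − SSres/SStot), an assumption that is at least consistent with the negative R2 values reported for the PM-only methods.

```python
import math
import random


def split_indices(n, train_fraction=0.75, seed=0):
    """Randomly split n sample indices into a training and a test set."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train_fraction)
    return idx[:n_train], idx[n_train:]


def rmse(actual, predicted):
    """Root mean square error between measured and estimated recovery."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error between measured and estimated recovery."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination; negative when worse than predicting the mean."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Split sizes for the 610 samples with known BCoV recovery.
train_idx, test_idx = split_indices(610)
print(len(train_idx), len(test_idx))

# Made-up recoveries (as fractions) purely to exercise the metrics.
measured = [0.10, 0.20, 0.30, 0.40]
estimated = [0.20, 0.30, 0.40, 0.50]
print(rmse(measured, estimated), mae(measured, estimated), r_squared(measured, estimated))
```

Each candidate method (averages, PMMoV terms, or a trained model) would supply its own `estimated` values for the held-out samples, and the three metrics are then compared across methods.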
Traditionally, recovery analyses involve only testing a subset of samples due to cost and labor constraints.21,42–45 The subset recoveries are then averaged and applied to all samples, or a distribution is fit.43 However, with qPCR applications, especially in wastewater matrices, recovery efficiencies vary widely across samples. Thus, applying a generalized average may not be appropriate, as this approach may contribute significant bias.42,43 To evaluate potential bias, we considered multiple methods for averaging BCoV recovery rates. The first approach was using the facility-specific average recovery determined from the re-quantification of BCoV after storage (R1). Another method was using the original facility-specific average of recovery from the original quantification of BCoV before storage (R2) or substituting the original sample-specific recovery values (R3). Additionally, we evaluated substituting an overall combined average recovery point-value determined from the re-quantification of BCoV after storage (R4) or an overall point-value average of original recovery values (R5) (Table S7†).
As with BCoV, PMMoV was quantified before and after sample archiving, with the difference in concentration theoretically accounting for losses during storage. In the SARS-CoV-2 wastewater surveillance campaign workflow, PMMoV was quantified using a SYBR-based qPCR assay for all samples collected March 2020–December 2023. A paired sub-analysis of fresh samples showed that the SYBR-based assay yielded significantly different concentrations compared to the probe-based qPCR assay used for analysis of archived samples, with the SYBR-based assay yielding higher concentrations on average (by ∼0.26 log10 gc L−1) (p < 0.0001, paired t-test on normally distributed log10 transformed data, n = 48). Accordingly, SYBR-based PMMoV concentrations were adjusted using multiple approaches before dividing to determine the degradation term (PM1, 2, and 3). First, SYBR-based PMMoV concentrations were converted to probe-based concentrations using a linear regression (Fig. S3†), and degradation was calculated as the probe-based concentration (post-storage value) divided by the normalized concentration (pre-storage value) (PM1). Second, the average difference in concentrations (0.26 log10 gc L−1) was subtracted from the SYBR-based concentrations to convert to probe-based concentrations (PM2), and the degradation factor was determined as above. Third, no adjustment for the different assays was applied, and the degradation factor was calculated with the probe-based concentrations (post-storage value) divided by the SYBR-based concentrations (pre-storage value) (PM3). PMMoV degradation factors over 100% (due to error in the assay correction factors) were set to 100%, indicating no degradation occurred.
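As a minimal sketch of the PM2 and PM3 calculations, the degradation factor can be computed directly from log10 concentrations. The function name and the example concentrations below are hypothetical; only the 0.26 log10 offset and the 100% cap come from the text. PM1 is omitted because the regression coefficients (Fig. S3†) are not reproduced here.

```python
def degradation_factor(post_probe_log10, pre_sybr_log10, sybr_offset_log10=0.0):
    """Fraction of PMMoV remaining after storage (1.0 means no degradation).

    post_probe_log10: probe-based concentration after storage (log10 gc/L)
    pre_sybr_log10:   SYBR-based concentration before storage (log10 gc/L)
    sybr_offset_log10: 0.26 for the PM2-style adjustment, 0.0 for PM3
    """
    # Convert the SYBR-based value to an approximate probe-based value.
    pre_adjusted_log10 = pre_sybr_log10 - sybr_offset_log10
    # Dividing concentrations equals subtracting their log10 values.
    factor = 10 ** (post_probe_log10 - pre_adjusted_log10)
    # Factors above 100% are set to 100% (no degradation).
    return min(factor, 1.0)


# Hypothetical sample: 9.26 log10 gc/L (SYBR, fresh) and 8.74 log10 gc/L (probe, archived).
pm2 = degradation_factor(8.74, 9.26, sybr_offset_log10=0.26)
pm3 = degradation_factor(8.74, 9.26)
capped = degradation_factor(9.10, 9.00, sybr_offset_log10=0.26)
print(round(pm2, 3), round(pm3, 3), capped)
```

Because the SYBR assay reads higher, skipping the assay adjustment (PM3) makes storage losses look larger than they are for the same pair of measurements.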
Calculating a correction term using PMMoV concentrations before and after archiving only accounts for losses due to degradation and the cDNA synthesis step for archived samples. To simultaneously correct for degradation via the PMMoV-derived correction factors and losses during initial processing, each iteration of the averaging approach (R1–R5) was also multiplied by each iteration of the PMMoV correction term (PM1–PM3) in separate models (R* × P*).
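The combined R* × P* term can be illustrated as follows. This sketch assumes the usual convention that a measured concentration is divided by the fraction recovered; the function names and example values are hypothetical and are not taken from the study.

```python
import math


def combined_correction(recovery, degradation):
    """Multiply a processing recovery (R*) by a storage degradation factor (PM*).

    Both inputs are fractions in (0, 1].
    """
    if not (0.0 < recovery <= 1.0 and 0.0 < degradation <= 1.0):
        raise ValueError("recovery and degradation must be fractions in (0, 1]")
    return recovery * degradation


def corrected_log10_concentration(measured_log10, recovery, degradation):
    """Adjust a measured log10 concentration for processing and storage losses."""
    # Dividing by the combined fraction is a subtraction of its log10 value.
    return measured_log10 - math.log10(combined_correction(recovery, degradation))


# Hypothetical archived sample: 20% facility-average recovery, 50% PMMoV remaining.
adjusted = corrected_log10_concentration(5.0, recovery=0.20, degradation=0.50)
print(round(adjusted, 2))
```

In this example the combined fraction is 0.10, so the adjusted concentration is one log10 unit above the measured value.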
Recovery efficiency appears to be a non-independent adjustment factor, meaning that recovery can be correlated or have interdependent relationships with concentrations of other targets, water quality parameters such as total dissolved solids (TDS),43 and sample handling procedures. For this reason, we also attempted to use SML to estimate recovery using metadata and target concentrations, including temperature, storage time, facility, concentration method, detection status for each assay and sample, and non-recovery-corrected concentrations of the markers. SML was conducted using the cubist model,46 which is a form of decision tree modeling, from the caret package in R.47 The least important variables were identified with the varImp function and omitted to determine whether equal or greater accuracy could be achieved in equal or less computation time. The most accurate model was then tested across a wider range of hyperparameter settings. Additional information on the SML model, including all variables considered, the most important variables identified, and algorithm information, is available in Text S4.† Each individual method (i.e., SML, R* × P*, R1–R5, and PM1–3) was evaluated to determine the best fit to the known data.
| Methoda | RMSE | R2 | MAE |
|---|---|---|---|
| SMLb | 0.158 | 0.863 | 0.096 |
| R2 × PM1b | 0.199 | 0.783 | 0.145 |
| R1 × PM1 | 0.216 | 0.745 | 0.141 |
| R3 × PM2 | 0.221 | 0.732 | 0.158 |
| R5 × PM1 | 0.223 | 0.727 | 0.158 |
| R2 × PM2 | 0.228 | 0.714 | 0.177 |
| R1 | 0.229 | 0.712 | 0.171 |
| R3 × PM3 | 0.237 | 0.690 | 0.153 |
| R1 × PM2 | 0.247 | 0.664 | 0.167 |
| R2 | 0.248 | 0.662 | 0.212 |
| R4 × PM1 | 0.248 | 0.661 | 0.158 |
| R3 × PM1 | 0.254 | 0.646 | 0.169 |
| R2 × PM3 | 0.262 | 0.622 | 0.171 |
| R5 × PM2 | 0.263 | 0.619 | 0.206 |
| R5 | 0.264 | 0.616 | 0.243 |
| R4 × PM2 | 0.278 | 0.576 | 0.189 |
| R4 | 0.279 | 0.572 | 0.200 |
| R5 × PM3 | 0.285 | 0.552 | 0.188 |
| R3b,c | 0.287 | 0.547 | 0.210 |
| R1 × PM3 | 0.288 | 0.545 | 0.181 |
| R4 × PM3 | 0.306 | 0.485 | 0.194 |
| PM1 | 0.550 | −0.660 | 0.452 |
| PM3 | 0.651 | −1.328 | 0.567 |
| PM2 | 0.725 | −1.892 | 0.668 |

a Method names and descriptions are available in Table S7.† “R” methods are BCoV recovery corrections, “PM” methods are PMMoV degradation corrections, “×” indicates multiplication of two corrections, and SML is supervised machine learning.
b Chosen for final pathogen distribution comparison (Text S4†).
c Sample-specific original recovery.
The recovery estimation method with the lowest RMSE, lowest MAE, and highest R2 was SML using the cubist model. The RMSE was 0.158, which suggests that SML predicts recovery on average within 15.8% of the measured recovery value in the test set. The second lowest RMSE (0.199) was for R2 × PM1, the substitution model in which the facility-specific average recovery determined from the original BCoV quantification was multiplied by the PMMoV degradation term determined after correcting SYBR-based concentrations with linear regression. A subsequent analysis of the impact of recovery estimation method on final target concentration distributions revealed no significant difference (p > 0.95) between the R2 × PM1 and SML methods, but neglecting to account for degradation (R3, original recovery values alone) yielded significantly lower overall concentrations (Text S5 and Table S8†). This is expected since concentrations can decrease due to degradation, and original recovery alone would not account for that change, thereby yielding artificially low concentrations. Since the R2 × PM1 and SML approaches were statistically indistinguishable (all markers p > 0.95; Table S8†) and because SML is less intuitive and more computationally intensive, R2 × PM1 was chosen for calculating recovery for samples with BCoV results that were non-detect or <LoQ upon re-analysis.
Archived biobanks face challenges with nucleic acid degradation of targets due to the effects of one or more freeze–thaws, overall duration of storage, and storage temperature. This is evident here in lower detection rates in archived samples vs. fresh samples (AdV: 66% vs. 93%, EnV: 46% vs. 87%, NoV GI: 57% vs. 100%, NoV GII: 68% vs. 99%). Whenever possible, these impacts must be accounted for to avoid underestimating target presence and concentrations, which occurs when sample degradation has occurred but is not considered or quantified. Ideally, controls should be spiked into each sample and quantified before and after storage. In cases where this is not possible, we recommend assessing several different recovery or degradation estimation factors. SML provided the most accurate estimate for recovery in this study. However, not all biobank studies have access to the volume of data, including metadata, that were used to train the model. Using the advanced machine learning method was shown to not significantly impact concentration distributions compared to a simpler, average-based approach using original facility averages multiplied by a PMMoV degradation factor, though SML might be more useful if there was more extensive metadata available, such as water quality parameters. While machine learning might not be necessary when recovery can be estimated more directly, it could be useful in cases with extensive metadata, and some form of critical assessment is essential to determine the best method for estimating recoveries in the absence of consistent recovery data. Critical to all these methods is the assumption of using a spike-in for individual samples, which should be considered a high priority for wastewater studies.
log10 gc L−1) compared to NoV GII (Table 3 and Fig. 2). Despite lower detection rates for GI (80%) compared to GII (90%), the fitted distribution mean for GI was higher (6.48 log10 gc L−1) than for GII (6.15 log10 gc L−1). Facility 3 had the highest concentrations of GI and GII and was significantly higher than facilities 1, 2, and 6 for GI and facilities 1, 2, 4A, 4B, and 6 for GII (p < 0.05) (Fig. S4†). The four highest GI concentrations were all greater than 9.17 log10 gc L−1, which is the concentration used as the basis for California's DPR regulatory rule setting,50,51 and the top three occurred in samples collected from facility 6, with the fourth occurring in facility 4B. Facility 6 and 4B were represented by grab samples, and facility 6 is also the smallest sewershed in the study (only ∼16 000 people), so it is possible that these particular grab samples constituted plugs of wastewater containing contributions from ‘supershedders’. In fact, the top two GI concentrations (9.46 log10 gc L−1 and 9.38 log10 gc L−1) both occurred in facility 6 in subsequent weeks in March 2023. The 9.17 log10 gc L−1 concentration utilized by California was observed in a grab sample from a very small facility in France (290 m3 per day or 0.08 mgd) serving approximately 1200 inhabitants during a known NoV GI outbreak.52 Given the small sewershed sizes for facility 6 in this study and the facility in France, coupled with their similar per capita wastewater generation rates, we hypothesize that the hydraulic characteristics of small systems (i.e., less dilution and dispersion) may result in higher observed pathogen peaks, especially during outbreaks. We detected 3 peak (≥9.17 log10 gc L−1) samples from facility 6 across 120 samples, for a similar peak frequency as the facility in France (1 out of 28 collected samples).
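The per capita flow and peak frequency comparison can be checked with simple arithmetic. The sketch below uses the population and flow reported for facility 6 in the facility table (16 399 people, 0.8 mgd) and the values quoted for the facility in France; the variable names are hypothetical and the gallon-to-liter factor is the standard US liquid gallon.

```python
LITERS_PER_US_GALLON = 3.785411784


def per_capita_flow_liters(flow_liters_per_day, population):
    """Average daily wastewater generation per person, in liters."""
    return flow_liters_per_day / population


# Facility 6: 0.8 million gallons per day serving 16 399 people.
facility6_lpcd = per_capita_flow_liters(0.8e6 * LITERS_PER_US_GALLON, 16399)

# Facility in France: 290 m3 per day serving about 1200 inhabitants.
france_lpcd = per_capita_flow_liters(290 * 1000, 1200)

# Frequency of samples at or above 9.17 log10 gc/L.
facility6_peak_frequency = 3 / 120
france_peak_frequency = 1 / 28

print(round(facility6_lpcd), round(france_lpcd))
print(round(facility6_peak_frequency, 3), round(france_peak_frequency, 3))
```

Under these inputs the two systems generate roughly 185 and 242 L per person per day, and the peak frequencies are about 2.5% and 3.6%, which is the sense in which the two are described as similar.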
| Parameter | Crypto | Giardia | AdV (cult.) | EnV (cult.) | AdV (mol.) | EnV (mol.) | NoV GIA (mol.) | NoV GIB (mol.) | NoV GI sum | NoV GII (mol.) | CP56 (mol.) | HF183 (mol.) | PMMoV (probe) | PMMoV (SYBR) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of samples (#) | 73 | 73 | 56 | 56 | 1107 | 1112 | 1112 | 1112 | 1112 | 1112 | 1112 | 1112 | 807 | 1108 |
| Detection frequency (%) | 81% | 100% | 96% | 96% | 84% | 82% | 71% | 77% | 80% | 90% | 100% | 99% | 100% | 100% |
| Meana (log10 target per L) | 2.18 | 3.73 | 3.29 | 3.77 | 6.53 | 5.92 | 6.71 | 6.65 | 6.91 | 6.41 | 9.19 | 8.04 | 9.30 | 9.07 |
| St. dev.a (log10 target per L) | 0.47 | 0.40 | 0.75 | 0.88 | 1.06 | 0.67 | 0.83 | 0.88 | 0.90 | 0.84 | 0.70 | 1.43 | 0.53 | 0.49 |
| Minb (log10 target per L) | 1.22 | 2.52 | 1.74 | 1.17 | 4.08 | 4.53 | 4.88 | 5.17 | 5.21 | 4.36 | 5.82 | 4.53 | 6.71 | 6.66 |
| Max (log10 target per L) | 3.31 | 4.69 | 5.30 | 5.76 | 9.25 | 8.15 | 9.27 | 9.31 | 9.46 | 8.57 | 11.54 | 11.45 | 11.14 | 10.68 |
| Recoveryc (%) | 31% (3–91) | 55% (3–90) | 34% (8–93)e | 23% (1–100)f | 31% (1–100) | |||||||||
| Fitted distributiond | μ = 2.04 | μ = 3.73 | μ = 3.24 | μ = 3.70 | μ = 6.23 | μ = 5.45 | μ = 6.26 | μ = 5.99 | μ = 6.48 | μ = 6.15 | μ = 9.18 | μ = 7.99 | μ = 8.80 | μ = 9.07 |
| | σ = 0.54 | σ = 0.40 | σ = 0.78 | σ = 0.93 | σ = 1.17 | σ = 1.1 | σ = 1.03 | σ = 1.43 | σ = 1.17 | σ = 1.11 | σ = 0.71 | σ = 1.47 | σ = 0.61 | σ = 0.50 |

a Mean and standard deviation are of samples with detected target only.
b Minimum is the lowest measured concentration above the LoQ.
c Recovery mean and range.
d Distribution fit to censored data using ‘fitdistcens’ with MLE or non-censored data with ‘fitdist’. Non-detect values were considered left-censored, and <LoQ values were considered interval-censored between the LoD and LoQ. Distributions are normal distributions of log10-transformed data, with mean and standard deviation reported in log10 target per L.
e Average of MS2 & phiX174 recoveries.
f Recovery separated by concentration method is available in Table S4.†
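The censored fitting described in the table notes (non-detects treated as left-censored below the LoD, <LoQ results as interval-censored between the LoD and LoQ) can be illustrated with a small maximum-likelihood sketch. The study used ‘fitdistcens’ in R; the Python below is not that implementation. It substitutes a coarse grid search for a proper optimizer, and the data, LoD, LoQ, and function names are all made up for illustration.

```python
import math


def norm_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def norm_logpdf(x, mu, sigma):
    """Log density of a normal distribution at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def censored_loglik(mu, sigma, quantified, n_nondetect, n_below_loq, lod, loq):
    """Log-likelihood with left-censored (<LoD) and interval-censored (LoD to LoQ) data."""
    p_left = norm_cdf(lod, mu, sigma)
    p_interval = norm_cdf(loq, mu, sigma) - p_left
    if (n_nondetect and p_left <= 0.0) or (n_below_loq and p_interval <= 0.0):
        return float("-inf")
    loglik = sum(norm_logpdf(x, mu, sigma) for x in quantified)
    if n_nondetect:
        loglik += n_nondetect * math.log(p_left)
    if n_below_loq:
        loglik += n_below_loq * math.log(p_interval)
    return loglik


def fit_censored_normal(quantified, n_nondetect, n_below_loq, lod, loq):
    """Coarse grid search over mu in [3, 8] and sigma in [0.02, 3] (illustrative ranges)."""
    best_ll, best_mu, best_sigma = float("-inf"), None, None
    for i in range(251):
        mu = 3.0 + 0.02 * i
        for j in range(1, 151):
            sigma = 0.02 * j
            ll = censored_loglik(mu, sigma, quantified, n_nondetect, n_below_loq, lod, loq)
            if ll > best_ll:
                best_ll, best_mu, best_sigma = ll, mu, sigma
    return best_mu, best_sigma


# Made-up log10 concentrations: 5 quantified, 2 non-detects, 1 result between LoD and LoQ.
quantified = [5.0, 5.5, 6.0, 6.5, 7.0]
mu_hat, sigma_hat = fit_censored_normal(quantified, 2, 1, lod=4.0, loq=4.5)
detect_only_mean = sum(quantified) / len(quantified)
print(round(mu_hat, 2), round(sigma_hat, 2), detect_only_mean)
```

Because the censored observations pull the fitted mean downward, `mu_hat` falls below the mean of the quantified values alone, which mirrors the way the fitted μ values in the table sit at or below the detect-only means.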
Fig. 2 Distributions of concentrations of NoV GI (sum of GIA and GIB) and GII across facilities, along with corresponding detection rates. Censored data estimated for visualization using regression on order statistics (ROS) in NADA.53 Dashed lines indicate the overall combined fitted distribution means, taking into account left-censored data.
In composite samples or samples taken further into the treatment train (e.g., after primary clarification at facility 1), there is a peak “averaging” effect.54 This is useful when the goal is to evaluate wastewater that is representative of the overall population, for instance in WBE applications. Grab samples taken from influent wastewater are more likely to capture non-representative (e.g., non-dispersed or single-sourced) plugs of wastewater, but these samples are useful when assessing wastewater pathogen concentrations from a hydraulic perspective (e.g., when examining treatment train performance and its tolerance to spikes). Therefore, the grab samples resulting in NoV GI spikes are important additions to the dataset, highlighting how small systems may be more susceptible to concentration extremes.
It should be noted that while these concentrations appear to be outliers in Fig. 2, these values do not actually fall in that category when censored data are included in the outlier determination, and this applies to all apparent outliers in the boxplots. To verify that these points are not true outliers, we removed the four values >9.17 log10 gc L−1 and re-fit the distribution. The exceedance probability for 9.17 log10 gc L−1 had only a slight change between the high-points-included distribution (1.0%) vs. the high-points-excluded distribution (0.85%), and there was no statistically significant difference between the two distributions (Kolmogorov–Smirnov, p = 0.24).
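The exceedance probability follows directly from the fitted normal distribution of log10 concentrations. As a rough check, the sketch below uses the rounded NoV GI sum parameters reported for the fitted distribution (μ = 6.48, σ = 1.17); the function name is hypothetical, and small differences from the reported value would be expected from rounding.

```python
import math


def exceedance_probability(threshold_log10, mu, sigma):
    """Probability that a log10 concentration exceeds the threshold under N(mu, sigma)."""
    z = (threshold_log10 - mu) / sigma
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


p_exceed = exceedance_probability(9.17, mu=6.48, sigma=1.17)
print(f"{100 * p_exceed:.2f}%")
```

With these inputs the threshold sits about 2.3 standard deviations above the mean, giving an exceedance probability of roughly 1%, in line with the value quoted for the distribution that includes the high points.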
log10 gc L−1. Concentrations of infectious EnV were lower, with an overall detection rate of 96% and distribution mean concentration of 3.70 log10 MPN L−1. Fig. 3 shows EnV concentrations, mean distribution across facilities, and detection rates. Facility 3 had the highest mean concentrations of EnV for both qPCR and cell culture. For qPCR, facility 3 was significantly higher than facilities 1, 2, 4A, 4B, and 6 (p < 0.05), and for culture methods, facility 3 was significantly higher than facilities 4A and 6 (p < 0.05).
log10 gc L−1. For cell culture, AdV had a 96% detection rate, with a distribution mean concentration of 3.24 log10 MPN L−1. No significant differences in culture-based concentrations were observed among facilities, but similar to EnV, facility 3 had the highest concentration of AdV via qPCR, and was significantly higher than facilities 1, 2, 4A, 4B, and 6 (p < 0.05) (Fig. 4).
:IU) ratios.
The subset of data analyzed by both qPCR and cell culture methods (n = 56) was used to develop distributions of GC:IU ratios for AdV and EnV (Fig. 5). GC:IU ratios varied widely across all samples and between facilities (Fig. S5†). GC:IU ratios ranged between 19:1 and 246 000:1 for AdV and between 1:1 and 54 400:1 for EnV for detectable data. These ranges (∼5 log10 and ∼4 log10, respectively) are consistent with Pecson et al. (2022),29 which utilized very similar methods. Notably, both the ratios observed here and in Pecson et al. (2022) were lower than those observed in a study utilizing similar methods on wastewater from San Diego, CA, where ratios for EnV ranged from 4.5–8 log10 GC:IU.55
Data were slightly censored (7% for AdV and 13% for EnV), so log10-transformed GC:IU ratios were fit to a censored normal distribution. The distribution mean GC:IU ratio for AdV was 3.67 log10, corresponding to a GC:IU ratio of 4699:1, and the mean GC:IU ratio for EnV was 2.45 log10, corresponding to a GC:IU ratio of 280:1. A recent study56 suggested that the cell culture method for enumerating EnV utilized here may not be optimal for detection of all infectious EnV in wastewater, and so subsequent QMRAs57 increased measured EnV culture concentrations by an assumed factor of 10 to correct for potential undercounting of virus in a viable-but-non-culturable (VBNC) state. This recently suggested correction factor, if applied to this study's reported EnV concentrations in future QMRAs, would bring the average EnV concentration by cell culture closer to the observed average concentration via qPCR (4.70 log10 MPN L−1 compared to 5.45 log10 gc L−1), and the distribution mean GC:IU ratio to only 28:1. Although molecular methods measured higher concentrations, detection rates were lower for gene copies compared to infectious units (84% vs. 96% for AdV, 82% vs. 96% for EnV). Two factors may explain this discrepancy: differences in ESV and storage degradation. Culture methods had higher ESVs (∼200 mL, Text S3†) compared to molecular methods (∼1 mL), increasing detection likelihood. Additionally, archived samples may have degraded, causing concentrations near the detection limit to fall below it. Notably, in paired samples measured by both methods, the molecular detection rate was greater than or equal to the culture detection rate (96% for AdV and 100% for EnV).
log10 oocysts per L. Cryptosporidium's highest mean concentration occurred in facility 5, although the differences between facility 5 and other facilities were not statistically significant (p values ranging between 0.1 and 1 for each facility comparison). Cryptosporidium also exhibited spikes, with the maximum concentration (3.31 log10 oocysts per L) occurring in facility 4. A Kruskal–Wallis test indicated a slightly significant difference (p = 0.01) between facilities, but Dunn's post hoc analysis revealed no individual significant differences between facilities. Giardia was found in every sample (100% detection rate), with a mean concentration of 3.73 log10 cysts per L across all facilities. Giardia concentrations varied slightly between facilities (one-way ANOVA; p = 0.00155) (Fig. 6), with two spikes in facility 6 (4.66 log10 cysts per L and 4.69 log10 cysts per L) on separate sampling dates, causing facility 6 to have the widest range of concentrations and highlighting again the impact of grab sampling from small systems. However, facility 1 had the highest mean concentration, with significant differences compared to facilities 2, 4, and 6 (ANOVA, Tukey post hoc; p < 0.05).
log10 per L for most targets, but notably over 2 log10 gc L−1 higher in the case of all NoV targets, and 1.9 log10 gc L−1 higher for AdV (qPCR) compared to the California dataset. Additionally, there was greater variability (as measured by standard deviation) for all molecular targets except AdV (1.6 vs. 1.2) in the Southern Nevada dataset. This variability may be driven by the large sample size paired with censored data, encompassing over three years of weekly and monthly sampling, as well as encompassing both pandemic conditions and ‘normal’ conditions. Nearly identical methods were employed between this study and WRF 4989, so while the differences are unlikely to be methods driven, there still could be differences due to archiving, recovery adjustment methods, and inherent sampling differences. Future research can use this study and the unprecedented volume of biobanks of wastewater data from other COVID-19 wastewater surveillance programs, which will result in an increase in published pathogen datasets, to create a new combined distribution.
| Pathogen | Meta-analysis<sup>a</sup> mean<sup>b</sup> | Meta-analysis<sup>a</sup> st. dev.<sup>b</sup> | California<sup>a</sup> mean<sup>b</sup> | California<sup>a</sup> st. dev.<sup>b</sup> | Current study mean<sup>b</sup> | Current study st. dev.<sup>b</sup> | Difference<sup>c</sup> (Δ) |
|---|---|---|---|---|---|---|---|
| Cryptosporidium microscopy | 1.9 | 0.6 | 1.7 | 0.4 | 2.0 | 0.5 | +0.1 |
| Giardia microscopy | 4.0 | 0.4 | 4.0 | 0.4 | 3.7 | 0.4 | −0.3 |
| Enterovirus culture | 3.2 | 1.0 | 3.2 | 1.0 | 3.7 | 0.9 | +0.5 |
| Adenovirus culture | — | — | 2.8 | 1.0 | 3.2 | 0.8 | +0.4 |
| Enterovirus molecular | 5.1 | 1.1 | 4.9 | 0.8 | 5.5 | 1.1 | +0.4 |
| Adenovirus molecular | — | — | 4.3 | 1.6 | 6.2 | 1.2 | +1.9 |
| Norovirus GIA molecular | — | — | 3.8 | 1.0 | 6.3 | 1.0 | +2.5 |
| Norovirus GIB molecular | — | — | 3.6 | 1.0 | 6.0 | 1.4 | +2.4 |
| Norovirus GI sum | — | — | — | — | 6.5 | 1.2 | — |
| Norovirus GII molecular | — | — | 4.0 | 0.2 | 6.2 | 1.1 | +2.2 |

<sup>a</sup> Data reproduced here from WRF 4989. <sup>b</sup> Mean and standard deviation are of a normal distribution of log10-transformed concentrations in units of target per L. <sup>c</sup> Δ = current study − literature (i.e., meta-analysis or California). For markers with no meta-analysis data, the California data are used.
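The Δ column follows the rule stated in the table footnote: subtract the meta-analysis mean from the current-study mean, falling back to the California mean when no meta-analysis value exists. The sketch below recomputes Δ from the means transcribed from the table. The dictionary layout and the function name `delta` are illustrative choices, not part of the original analysis.

```python
# Means transcribed from the comparison table (log10 target per L).
# Tuple order: (meta-analysis, California, current study).
# None marks entries reported as missing in the table.
TABLE_MEANS = {
    "Cryptosporidium microscopy": (1.9, 1.7, 2.0),
    "Giardia microscopy": (4.0, 4.0, 3.7),
    "Enterovirus culture": (3.2, 3.2, 3.7),
    "Adenovirus culture": (None, 2.8, 3.2),
    "Enterovirus molecular": (5.1, 4.9, 5.5),
    "Adenovirus molecular": (None, 4.3, 6.2),
    "Norovirus GIA molecular": (None, 3.8, 6.3),
    "Norovirus GIB molecular": (None, 3.6, 6.0),
    "Norovirus GI sum": (None, None, 6.5),
    "Norovirus GII molecular": (None, 4.0, 6.2),
}


def delta(meta, california, current):
    """Return current - literature, preferring the meta-analysis mean."""
    literature = meta if meta is not None else california
    if literature is None:
        return None  # no literature value to compare against
    return round(current - literature, 1)


deltas = {name: delta(*means) for name, means in TABLE_MEANS.items()}

for name, value in deltas.items():
    print(f"{name}: {'n/a' if value is None else f'{value:+.1f}'}")
```

Because the differences are in log10 units, a Δ of +2.2 corresponds to a mean concentration roughly 160-fold higher on the arithmetic scale.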
AdV and EnV did not display as substantial differences. AdV was slightly, but significantly, higher in the winter than in summer (p = 0.02); no other comparisons were statistically significant. Seasonality trends of gastrointestinal AdV are varied, largely depending on the particular location or timeframe.76,77 AdV gastroenteritis infections do not usually display strong seasonal patterns; instead, they generally peak sporadically throughout the year.78 For EnV, concentrations were higher in the fall than in spring or summer (p < 0.05). Though EnV gastroenteritis infections are sometimes associated with summertime illness,79 seasonality of EnV will largely depend on which EnV species are circulating within the population, as different enteroviruses have shown different seasonality in wastewater monitoring data.80
log10 higher) in samples collected during normal conditions, suggesting that gastrointestinal disease circulation was lower during the pandemic (Fig. 8 and S8†). Concentrations of all viral molecular markers were higher after the end of the state of emergency than in any other individual pandemic phase, with one exception: for NoV GII and AdV, there was no significant difference between the phase after the end of the state of emergency and the end of the second mask mandate. All other phases exhibited significantly lower concentrations than after the end of the state of emergency (Fig. S7 and S8†).
| Milestone | Sampling starts | Stay-at-home directive | State reopening | State pause | End of mask mandate | Mask mandate 2 | End of mask mandate 2 | End of state of emergency |
|---|---|---|---|---|---|---|---|---|
| Start date | 3/10/2020 | 3/18/2020 | 5/9/2020 | 11/22/2020 | 6/1/2021 | 7/27/2021 | 2/10/2022 | 5/20/2022 |
| Masking requirements | No | No | Yes<sup>a</sup> | Yes | No | Yes | No | No |
| Social distancing measures | No | Yes (lockdown) | Yes | Yes | Reopening at 100% capacity | No | No | No |
| Public school status | In-person | Remote | Remote | Remote/hybrid | Hybrid/summer break | Summer break/in-person | In-person | In-person |

<sup>a</sup> Masking requirements reinstituted on 06/24/2020.
Fig. 8 Locally estimated scatterplot smoothed (LOESS, a nonparametric method for smoothing82) concentrations with 95 percent confidence interval plotted over time and separated by pathogen. For plotting purposes only, <LoQ and non-detect data were set to the LoQ/ . Statistical significance as measured by the Kruskal–Wallis test and post hoc testing is indicated with solid shading, with solid shade of red indicating a significant increase from the previous phase and a solid shade of blue indicating a significant decrease from the previous phase. Hatching indicates no significant difference from the previous phase, and color of the hatching matches the previous shade. Concentrations are not smoothed across calendar year borders.
An argument can be made that, when applying these data, pandemic conditions should be distinguished from normal conditions, with the normal-condition distribution providing more conservative (i.e., higher) pathogen concentrations for use in QMRA. Alternatively, the full dataset provides a larger range of potential wastewater conditions, incorporating variability that could be useful for estimates of risk across broad scenarios. Separate pandemic-condition and normal-condition distribution fittings are available in Table S10.†
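As a minimal sketch of how the choice of distribution feeds into a QMRA input, the example below computes an upper percentile from a normal distribution of log10-transformed concentrations, which is the form of the fitted distributions reported in this study. The two parameter sets are hypothetical placeholders chosen only for illustration; the actual pandemic-condition and normal-condition fits are those in Table S10.†

```python
from statistics import NormalDist


def log10_percentile(mean_log10, sd_log10, p):
    """Percentile p (0-1) of a normal distribution of log10 concentrations."""
    return NormalDist(mu=mean_log10, sigma=sd_log10).inv_cdf(p)


# HYPOTHETICAL parameters (log10 gc per L), not values from this study.
full_dataset = {"mean": 5.8, "sd": 1.2}
normal_conditions = {"mean": 6.3, "sd": 1.0}

p95_full = log10_percentile(full_dataset["mean"], full_dataset["sd"], 0.95)
p95_normal = log10_percentile(
    normal_conditions["mean"], normal_conditions["sd"], 0.95
)

print(f"95th percentile, full dataset:      {p95_full:.2f} log10 gc/L")
print(f"95th percentile, normal conditions: {p95_normal:.2f} log10 gc/L")
```

Whether the normal-condition fit is also higher at an upper percentile depends on both the mean and the standard deviation, because the full dataset may have a lower mean but a wider spread.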
log10 gc L−1), followed by PMMoV (distribution mean concentration = 8.80 log10 gc L−1) and HF183 (distribution mean concentration = 7.99 log10 gc L−1). SYBR-based PMMoV data are not included in these analyses.
In Southern Nevada, human fecal indicator concentrations in wastewater should be fairly constant due to few and theoretically stable non-human contributions to the non-combined sewer system. However, we observed differences in human fecal indicator concentrations between facilities and pandemic phases, as well as slight seasonal variation. PMMoV and crAssphage exhibited lower variation in wastewater than the viral pathogens measured by qPCR (Fig. 9). The bacterial indicator HF183 had an unexpected drop in concentration in October 2022 before recovering to previous levels in 2023. This drop is difficult to explain because methods were consistent for the duration of the study, viral fecal indicators remained constant, and the drop was observed in all facilities. It also affected the standard deviation for HF183 (1.47 log10 gc L−1), the highest observed across all targets. We note that the wastewater concentration methods involve removing the solids fraction, which may contain a significant portion of bacteria due to their larger size. Therefore, these methods may not be optimal for bacterial markers. Seasonal trends were not observed for crAssphage (p = 0.09). For PMMoV and HF183, concentrations were higher in summer than in all other seasons (p < 0.0001). All fecal indicators displayed significant variation between facilities, with the most obvious trend being that facility 1 had significantly lower concentrations (p < 0.005) than all other facilities; the only exception was PMMoV, for which facilities 1 and 2 were not statistically distinguishable (p = 1). The facility 1 sample is of primary effluent (influent after undergoing a settling step), so fecal material may have been somewhat removed. Moreover, because the primary effluent is collected as a grab sample representing influent arriving at ∼5:00–6:00 am, there may be a diurnal effect leading to lower human fecal inputs.32
Interestingly, there were significant changes between individual pandemic phases for all indicators. For instance, crAssphage increased somewhat during the state reopening phase (p = 0.01), stayed high and constant, and then significantly decreased when the second mask mandate was issued (p = 0.007), potentially due to changes in commuting behaviors, tourism, or other factors. PMMoV did not follow this pattern, however, with a significant increase during the state reopening phase (p = 0.04), increasing again during the state pause (p = 0.01), decreasing at the end of the mask mandate (p = 0.03), constant through the second mask mandate, and a final significant decrease from the end of the second mask mandate into the end of the state of emergency phase (p < 0.0001). When divided into pandemic and normal phases, crAssphage showed no significant difference between phases (p = 0.71, Mann–Whitney), whereas PMMoV was significantly higher (p < 0.0001, Mann–Whitney) in normal/non-pandemic phases and HF183 was significantly higher (p < 0.0001, Mann–Whitney) in pandemic phases.
Ultimately, due to the high variability of HF183 and the potential non-human sources of PMMoV (e.g., food preparation and food waste disposal down drains), our data suggest that crAssphage is the best performing molecular fecal indicator for MST, or potentially for future data normalization approaches, at least specifically in the studied watershed.
Also, here we characterize microbial constituents (excluding Giardia and Cryptosporidium) in the liquid portion of wastewater, although we recognize that enteric pathogens and fecal indicators exist in both liquid and solid phases.4,84 The WastewaterSCAN program, initially developed to monitor SARS-CoV-2 in wastewater, has now expanded to include other targets, including enteric pathogens. However, it solely focuses on pathogen concentrations in the solid phase of wastewater, which presents certain limitations for environmental applications, wastewater treatment optimization, and regulatory decision making. While its current form is validated, optimized, and highly useful for public health applications, the dataset's utility can be significantly enhanced in future research by incorporating methodologies to convert between solid and liquid phase concentrations using partitioning coefficients. For instance, research could include back-calculating overall influent concentrations for water reuse LRV development, and also infection estimates for WBE applications. Characterizing partitioning coefficients for a growing list of viruses and pairing these coefficients with total suspended solids data could facilitate translation of reported gene copies per gram of solids to overall gene copies per liter of wastewater. This approach could allow future research to compare pathogen wastewater dynamics in Southern Nevada to the United States in general, allowing for a more comprehensive utilization of the WastewaterSCAN data, enhancing its applicability across various multidisciplinary fields.
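The conversion described above can be sketched as follows, assuming a simple linear equilibrium partitioning model in which the partitioning coefficient Kd is the ratio of the solid-phase concentration (gc per g) to the liquid-phase concentration (gc per L). Under that assumption, the total concentration per liter is the liquid-phase concentration plus the solid-phase concentration multiplied by total suspended solids (TSS). The function name, units, and example values are hypothetical and are not taken from this study or from WastewaterSCAN.

```python
def total_gc_per_liter(solids_gc_per_g, kd_l_per_g, tss_g_per_l):
    """Estimate total gene copies per liter from a solids measurement.

    Assumes linear equilibrium partitioning: Kd = C_solid / C_liquid,
    with C_solid in gc/g, C_liquid in gc/L, and Kd in L/g.
    """
    if kd_l_per_g <= 0:
        raise ValueError("Kd must be positive")
    if tss_g_per_l < 0:
        raise ValueError("TSS cannot be negative")
    liquid_gc_per_l = solids_gc_per_g / kd_l_per_g     # dissolved/liquid phase
    solid_gc_per_l = solids_gc_per_g * tss_g_per_l     # carried on solids
    return liquid_gc_per_l + solid_gc_per_l


# HYPOTHETICAL inputs: 1e6 gc per g solids, Kd = 1 L/g, TSS = 0.25 g/L.
example_total = total_gc_per_liter(1e6, 1.0, 0.25)
print(f"Estimated total: {example_total:.3g} gc/L")
```

In practice, Kd would need to be characterized for each target and may vary with wastewater matrix, so a distribution of Kd values rather than a single point estimate may be more appropriate.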
Major products of our study include robust fitted distributions for culture-based enteric viruses, qPCR-based enteric viruses, protozoan pathogens, and human fecal indicators. Additionally, we developed methods for recovery estimation when degradation of biobank samples may be a concern, and we established GC:IU ratio distributions. These ratios are critical parameters for QMRAs when converting molecular data, which include both non-infectious and infectious genetic material, into infectious units. Our analyses of wastewater concentrations of enteric viruses during the COVID-19 pandemic support the hypothesis that concentrations of enteric pathogens were significantly lower in some wastewater systems during pandemic conditions. This decrease is potentially due to reduced spread of gastrointestinal illnesses during social distancing and other pandemic response measures. Our study confirmed that enteric viruses, as measured by molecular methods, exhibited seasonal variation, with norovirus GI and GII following well documented trends in the literature. Furthermore, our findings indicated that fresh samples had fewer non-detects compared to archived samples, suggesting that storage conditions impact nucleic acid integrity. Therefore, incorporating appropriate storage and degradation controls is crucial for studies of biobanked nucleic acid extracts.
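To illustrate how a GC:IU ratio distribution might be applied in a QMRA, the sketch below converts molecular concentrations to infectious units by Monte Carlo sampling. On the log10 scale, dividing gene copies by the GC:IU ratio becomes a subtraction. The example assumes that both quantities are normally distributed on the log10 scale and are sampled independently, which is a simplification. The molecular parameters use the enterovirus values from the comparison table (mean 5.5, st. dev. 1.1), whereas the ratio parameters are hypothetical placeholders rather than the fitted ratios from this study.

```python
import random
from statistics import mean


def log10_infectious_units(log10_gc, log10_gc_iu_ratio):
    """log10(IU) = log10(GC) - log10(GC:IU ratio)."""
    return log10_gc - log10_gc_iu_ratio


def simulate_log10_iu(n, gc_mean, gc_sd, ratio_mean, ratio_sd, seed=0):
    """Draw n Monte Carlo samples of log10 IU per L (independent draws)."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [
        log10_infectious_units(
            rng.gauss(gc_mean, gc_sd), rng.gauss(ratio_mean, ratio_sd)
        )
        for _ in range(n)
    ]


# Molecular enterovirus parameters from the comparison table (log10 gc/L).
# Ratio parameters are HYPOTHETICAL (log10 of gc per infectious unit).
samples = simulate_log10_iu(
    n=20000, gc_mean=5.5, gc_sd=1.1, ratio_mean=2.0, ratio_sd=0.5
)
print(f"Mean simulated concentration: {mean(samples):.2f} log10 IU/L")
```

If molecular and culture results from paired samples are correlated, sampling the two distributions independently would overstate the variance of the result, so a paired or joint approach may be preferable.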
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4ew00620h
This journal is © The Royal Society of Chemistry 2025