 Open Access Article
      
        
          
Amita Muralidharan, Rachel Olson, C. Winston Bess and Heather N. Bischel*
      
Department of Civil and Environmental Engineering, University of California Davis, Davis, California 95616, USA. E-mail: hbischel@ucdavis.edu
    
First published on 24th October 2024
Sub-city, or sub-sewershed, wastewater monitoring for infectious diseases offers a data-driven strategy to inform local public health response and complements city-wide data from centralized wastewater treatment plants. Developing strategies for equitable representation of diverse populations in sub-city wastewater sampling frameworks is complicated by misalignment between demographic data and sampling zones. We address this challenge by: (1) developing a geospatial analysis tool that probabilistically assigns demographic data for subgroups aggregated by race and age from census blocks to sub-city sampling zones; (2) evaluating representativeness of subgroup populations for COVID-19 wastewater-based disease surveillance in Davis, California; and (3) demonstrating scenario planning that prioritizes vulnerable populations. We monitored SARS-CoV-2 in wastewater as a proxy for COVID-19 incidence in Davis (November 2021–September 2022). Daily city-wide sampling and thrice-weekly sub-city sampling from 16 maintenance holes covered nearly the entire city population. Sub-city wastewater data, aggregated as a population-weighted mean, correlated strongly with centralized treatment plant data (Spearman's correlation 0.909). Probabilistic assignment of demographic data can inform decisions when adapting sampling locations to prioritize vulnerable groups. We considered four scenarios that reduced the number of sampling zones from baseline by 25% and 50%, chosen randomly or to prioritize coverage of >65-year-old populations. Prioritizing representation increased coverage of >65-year-olds from 51.1% to 67.2% when removing half the zones, while increasing coverage of Black or African American populations from 67.5% to 76.7%. Downscaling had little effect on correlations between sub-city and centralized data (Spearman's correlations ranged from 0.875 to 0.917), with strongest correlations observed when prioritizing coverage of >65-year-old populations.
| Water impact: Wastewater-based disease surveillance should aim to achieve equitable representation of vulnerable groups within sampling regions. The probabilistic assignment approach in this study helps determine the distribution of demographic groups within sampling areas at sub-sewershed levels. When resource constraints necessitate downscaling the number of sampling sites, the approach demonstrated herein can inform decisions to preserve spatial representation of vulnerable populations. | 
According to the Centers for Disease Control and Prevention (CDC), the three major factors driving the unequal impact of COVID-19 are age, race, and ethnicity, with age cited as the main risk factor for severe COVID-19 outcomes.5 Data from the National Vital Statistics System show that, relative to people aged 18–29 years, the risk of death from COVID-19 is 25 times higher for people aged 50–64 years, 60 times higher for people aged 65–74 years, 140 times higher for people aged 75–84 years, and 340 times higher for those aged 85 years and older.6 Moreover, the COVID-19 pandemic underscored racial and ethnic health inequities. Individuals from racial and ethnic minority groups were disproportionately affected by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) transmission, leading to increased rates of hospitalization, emergency room visits, and premature death compared to non-Hispanic White individuals.6 Since March 2020, the average daily increase in COVID-19 mortality was found to be much higher in rural U.S. counties with predominantly Black and Hispanic populations.7 Among rural counties, those in the top quartile of percent Black populations had an average daily increase in COVID-19 mortality rates 70% higher than counties in the bottom quartile, and counties in the top quartile of percent Hispanic populations had an average daily increase that was 50% higher.7
Wastewater-based disease surveillance (WDS), which involves analyzing community-pooled wastewater samples from centralized wastewater treatment plants or sewer collection systems for disease biomarkers, has emerged as a viable strategy to provide insight into population-level disease trends. WDS has been favored as a minimally invasive, anonymous, and cost-effective way to track virus spread compared to testing individuals within the population since more than 80% of U.S. residents are on a piped sewer system.1 Each wastewater sample can represent hundreds to over a million people depending on the sample collection location. Those infected with SARS-CoV-2, including asymptomatic, pre-symptomatic, and symptomatic individuals, can shed viral particles and associated ribonucleic acid (RNA) through fecal matter. SARS-CoV-2 RNA remains readily detectable in wastewater even though fecal–oral transmission has not been reported for this virus. WDS has thus filled gaps associated with underreporting of cases, and can serve as an early indicator of potential outbreaks.8 Moreover, WDS has proven to be a more comprehensive approach to tracking viral outbreaks and community infections since it does not rely on community members having access to clinical testing services or seeking healthcare when they are experiencing symptoms.9 WDS can be particularly useful in resource-limited settings (e.g., where clinical testing services are constrained).
Trends in wastewater data at a sub-city level (e.g., at the census block level, the most granular level at which public demographic data can be obtained) have also been used in WDS to inform public health responses.10,11 Early in the pandemic, many WDS programs were established rapidly via academic and government partnerships, and sampling locations were not always selected in a methodical way. While the utility of WDS is evident, sampling paradigms that rely on convenient points of access within the sewer network may not equitably serve the public or public health, as some populations may be underrepresented depending on where sampling occurs. There may be similar disparities in traditional monitoring efforts as there are disparities in access to clinical testing and vaccinations.12 Recent efforts in the field have acknowledged the importance of taking steps targeted at reducing inequities,13 but standardized measures for assessing performance of WDS towards improving health equity are lacking. Additional analysis is needed to evaluate whether chosen sampling locations are appropriately representative of a given community, especially in such cases as wastewater surveillance where any given sample pools together information from a broad population.
Equitable protection of public health can be guided through inclusive wastewater surveillance efforts.4 Specific considerations are needed to promote inclusion of underrepresented groups and equity in responses to public health threats, including evaluations regarding the extent to which vulnerable populations are represented in wastewater monitoring programs.14 Ultimately, the success of a WDS program relies on the assurance that there is equitable representation of high-risk and/or underserved communities in the sampling design. Evaluating demographic representation of sub-sewershed zone populations can prove difficult because census-block-level data do not align with sewer networks. In other words, flows of wastewater from populations within census blocks depend on city-wide sewer system connections and do not conform to the population zones represented by wastewater samples collected from maintenance holes (MHs) within a city.
This project offers a strategy for WDS sub-city (or sub-sewershed) health equity evaluations using census data at a block-level with the goal of enhancing inclusivity in the design of sampling frameworks within the constraints of a sewer system. First, we developed a probabilistic assignment approach to determine the expected subgroup population that is represented by collecting samples at different locations in a city sewer system. We used sub-sewershed wastewater surveillance during the COVID-19 pandemic and demographic data in Davis, California to demonstrate the approach. Second, we compared trends in sub-sewershed wastewater data to city-wide trends. Sampling frameworks may seek to achieve representativeness of overall community disease trends in addition to achieving representation based on demographic characteristics. Finally, we evaluated scenarios in which adaptive sampling strategies are implemented to prioritize representation of high-risk or vulnerable populations under resource-constrained conditions. The overall framework offers a strategy to evaluate sub-city sampling designs for wastewater surveillance to enhance health equity goals.
850 in Yolo County, California.15 HDT offered widespread saliva-based asymptomatic and symptomatic testing for free, conducting over 1.6 million COVID-19 tests for the community and the university across 120 locations including testing sites at local schools, community centers, the university campus, and mobile clinics.16 From September 2020 to September 2022, we conducted wastewater surveillance throughout the City of Davis (COD) at the city, sub-city, and building/neighborhood levels.17 Table 1 provides definitions for key terms associated with the sampling framework. At the city level, samples were collected from the influent to the COD Wastewater Treatment Plant (COD WWTP). At the sub-sewershed level, samples were collected from up to sixteen nodes that each represent a sub-sewershed within the COD. At the building/neighborhood level, samples were collected from up to seven additional nodes for building complexes or neighborhoods identified as priority areas by HDT and local officials for potential communication and/or health interventions. The number of sampling locations and frequency of sampling increased through time. By April 2021, HDT sampled daily from the COD WWTP and three times per week from MHs in each of the sub-sewershed and building/neighborhood zones. Safford et al. (2022)17 describe HDT wastewater surveillance conducted from September 2020 to June 2021, for which wastewater samples were collected at seven building/neighborhood locations, 16 sub-sewershed nodes, and the city level. These samples were analyzed using reverse transcription quantitative polymerase chain reaction (RT-qPCR) and showed good correlation with clinical test results at the city level and sub-sewershed level. Daza-Torres et al. (2023)18 report city-level (COD WWTP) wastewater surveillance data from December 1, 2021, to March 31, 2022, which were measured using reverse transcription droplet digital PCR (RT-ddPCR). 
The present study describes and analyzes HDT wastewater surveillance conducted from November 22, 2021, to September 30, 2022, for which wastewater samples were collected daily from the COD WWTP and three times per week at 15 sub-sewershed nodes. All samples in this study were analyzed using RT-ddPCR as reported by Daza-Torres et al. (2023)18 and described below.
        
| Key term | Definition | 
|---|---|
| Sewershed | The area that contributes wastewater to a common end point. In this study, the sewershed refers to the area whose sewers flow to the City of Davis Wastewater Treatment Plant (COD WWTP) | 
| Sewershed node | A maintenance hole (MH) that serves as a wastewater sampling location | 
| Sub-sewershed zone | The area represented by one or more sewershed nodes. Also referred to as sub-city zone | 
| Subgroup | A subset of the overall city population that shares a specific demographic characteristic (e.g., race or age) | 
000 gallons.18 Based on the influent flow of 3.6 million gallons per day, about 24 pulses were expected per day. Each sample date recorded corresponds to the date that an autosampler program was completed. The COD WWTP provided 12 mL samples in new 15 mL polypropylene centrifuge tubes. The samples were stored at 4 °C. Samples were transported to the analytical laboratory at UC Davis in coolers on ice and generally processed the same day. Samples were first pasteurized for 30 minutes at 60 °C to mitigate biohazard risk while maintaining RNA quality and equilibrated to 4 °C prior to additional processing described below.
      
      
To avoid contamination, preparation and plating of the ddPCR master mix were conducted manually in a PCR hood in a separate location from sample loading, which was performed using an epMotion® 5075 (Eppendorf) liquid handler. Each reaction plate included duplicate positive controls (stock mixture of synthesized gene fragments, gBlocks™ from Integrated DNA Technologies, for the target regions) and duplicate no-template controls (nuclease-free water). Results were analyzed using the QX One Software Regular Edition (Bio-Rad) and thresholds were adjusted by visual inspection in samples and controls. The results were considered invalid if the distribution of positive or negative droplets appeared abnormal or if the total number of droplets generated was below 10 000 droplets in a well.
| Rank = 0.5 + 0.95 × (number of measurements) | (1) | 
The rank position was rounded up to provide a more conservative LOB since the calculated value was a non-integer. The theoretical LOD was determined by adding two times the standard deviation of all the replicate results to the LOB. The LOD and LOB values are reported in Table S3.† The highest number of positive droplets in the merged wells of the blank samples were 6 (N1) and 8 (N2). The cutoff was set at 3 (N1) and 4 (N2) since wastewater samples were routinely analyzed in duplicate. This way, it was possible to mark samples below the droplet threshold. Additionally, if the samples had fewer N1 and N2 droplets than twice the number of droplets in the extraction control blank that was analyzed on the same day, they were considered below the droplet threshold. Furthermore, runs that had an extraction control blank with greater than 15 positive droplets (N1 or N2) were considered contaminated, and the extracts were re-processed.
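The rank-based calculation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names are hypothetical, the LOB is assumed to be the blank measurement found at the rounded-up rank position in ascending order, and the sample standard deviation is assumed for the LOD term.

```python
import math
import statistics


def limit_of_blank(blank_values):
    """Return the blank measurement at rank 0.5 + 0.95 * n, rounded up."""
    ordered = sorted(blank_values)
    rank = math.ceil(0.5 + 0.95 * len(ordered))  # eqn (1), rounded up
    rank = min(rank, len(ordered))  # guard: rank cannot exceed n
    return ordered[rank - 1]  # ranks are 1-based


def limit_of_detection(lob, replicate_results):
    """LOD = LOB + 2 x standard deviation of the replicate results."""
    # Assumption: sample (n - 1) standard deviation.
    return lob + 2 * statistics.stdev(replicate_results)
```

With 20 blank measurements, for example, the rank evaluates to 19.5 and is rounded up to 20, so the largest blank value is taken as the LOB.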
If the samples satisfied the criteria above, the relative concentration of N gene was calculated. This was done by merging the duplicate results for each target and calculating the concentration of each target in the RT-ddPCR reaction, assuming a Poisson distribution using the QXOne Software 1.1.1 Standard Edition. To obtain the average SARS-CoV-2 RNA concentration in the initial wastewater sample, the N1 and N2 results were averaged after correcting for the sample and reagent volumes used. The resultant value was reported as genome copies (gc) per mL wastewater. Concentrations of targets were not corrected for BCoV recovery efficiency. A threshold BCoV recovery value of 10% was used to retain the recorded SARS-CoV-2 concentrations that had a recovery rate equal to or greater than the threshold value. High variability is to be expected among BCoV recovery values due to variability of sample characteristics. Other recovery analyses have reported average BCoV recovery values ranging from 4.8% to 36.1%, depending on the virus concentration method employed.22 Targets were excluded from the average concentration if N1 or N2 merged droplet counts were below the minimum droplet threshold, and the concentration was reported as 0 if both the N1 and N2 targets were below the droplet threshold. We use N/PMMoV (the average SARS-CoV-2 RNA concentration (N) divided by the concentration of PMMoV) as the wastewater signal for subsequent analysis.
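The screening and averaging rules in this paragraph can be summarized as a short Python sketch. The function and argument names are hypothetical, a target is assumed to pass when its merged droplet count is at or above the cutoff (3 for N1, 4 for N2), and the additional comparison against the extraction control blank is omitted for brevity.

```python
N1_CUTOFF = 3  # minimum merged positive droplets for N1
N2_CUTOFF = 4  # minimum merged positive droplets for N2
MIN_BCOV_RECOVERY = 0.10  # 10% recovery threshold


def n_over_pmmov(n1_gc_ml, n1_droplets, n2_gc_ml, n2_droplets,
                 pmmov_gc_ml, bcov_recovery):
    """Return the N/PMMoV signal, or None if BCoV recovery is too low."""
    if bcov_recovery < MIN_BCOV_RECOVERY:
        return None  # sample not retained

    # Keep only targets at or above their droplet cutoff (assumed inclusive).
    valid = []
    if n1_droplets >= N1_CUTOFF:
        valid.append(n1_gc_ml)
    if n2_droplets >= N2_CUTOFF:
        valid.append(n2_gc_ml)

    # Concentration is 0 when both targets fall below the threshold.
    n_mean = sum(valid) / len(valid) if valid else 0.0
    return n_mean / pmmov_gc_ml
```

Returning `None` for low-recovery samples, rather than 0, keeps excluded samples distinguishable from true non-detects in later averaging.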
Fig. 2 Map of the nine sub-sewershed zones for the SARS-CoV-2 wastewater-based epidemiology efforts in Davis (census blocks outlined).
Because of the high spatial granularity of census-block-level data, it is important to note that only certain demographic factors had sufficient data available for this analysis. Additionally, increased margins of error were reported in the census data collected in 2020 due to the difficulty of collecting responses during the COVID-19 pandemic.28 Consequently, this study focuses on the following two demographic factors: race and age. No other demographic variables were available in the 2020 census data at the block level during the time of our analysis.
The tabular census data was grouped according to the categories specified on the California Department of Public Health (CDPH) Health Equity Dashboard.29 The census data presenting the racial composition of each census block was filtered to include the following seven groups: White, Black or African American, American Indian and Alaska Native, Asian, Native Hawaiian and Other Pacific Islander, Other, and Multi-Race. The census data presenting the composition of people by age in each census block was grouped into the following ten age categories: less than 5 years, 5–17 years, 18–34 years, 35–49 years, 50–59 years, 60–64 years, 65–69 years, 70–74 years, 75–79 years, and greater than or equal to 80 years.
All analyses were performed using Python version 3.11.5 (the Python script used for implementation is available at https://tinyurl.com/HealthEquityWBE).
Fig. 3 Simplified illustration of the probabilistic assignment method adopted from Safford et al. (2022),17 demonstrating how demographic data is distributed from a census block to a sub-sewershed zone.
Fig. 3 illustrates how the locations of the MHs in the COD sewershed can be used to probabilistically assign demographic data from census blocks to sub-sewershed monitoring zones, whereas Safford et al. (2022)17 probabilistically assigned clinical case count data to sampling zones. This approach allows us to assess whether sampling locations were chosen in a way that appropriately represents subgroups within the overall population. The probabilistic assignment determines the expected subgroup population members whose wastewater is captured by sampling at a specific location. In the example outlined in Fig. 3, the sampler location depicted at the bottom covers a sub-sewershed monitoring zone that spans two census blocks. The census block on the left has a total population of 60, of which 21 individuals belong to the subgroup of interest (e.g., >65 year olds). There are three MHs in this census block. The census block on the right has a total population of 85, of which 25 individuals belong to the subgroup of interest. There are five MHs in this census block. The tool's objective is to determine a predicted number of subgroup members captured by each MH in the census block. To calculate this value, we divided the subgroup population by the number of MHs, under the assumptions noted above. In the case of the census block on the left, we divided the 21 individuals' wastewater contributions across all three MHs, which results in a value of seven. This value is the probabilistically assigned subgroup population represented by a sample taken at a given MH in the census block on the left. For the census block on the right, we split the 25 individuals' wastewater contributions across all five MHs, resulting in a probabilistically assigned value of five individuals. The wastewater flow was tracked through the dataset containing the directional connections between all the MHs in the city and summed at the sampler location. 
In this example, we obtained a predicted population of 12 people who belong to the subgroup of interest. Note that this is a simplified representation of the methodology behind the probabilistic tool. It is possible to obtain decimal resultant values using this method, but tool outputs can be rounded to whole numbers and still provide meaningful insight.
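The worked example can be reproduced with a minimal Python sketch. This is an illustration under stated assumptions rather than the published tool: the function names, MH labels, and network layout are hypothetical, and exactly one MH from each census block is assumed to drain to the sampler, which is what yields 7 + 5 = 12.

```python
def per_manhole_share(block_subgroup_pop, block_manholes):
    """Split each census block's subgroup population equally across its MHs."""
    share = {}
    for block, population in block_subgroup_pop.items():
        manholes = block_manholes[block]
        for mh in manholes:
            share[mh] = share.get(mh, 0.0) + population / len(manholes)
    return share


def population_at_sampler(share, downstream, sampler):
    """Sum the shares of every MH whose flow path reaches the sampler."""
    total = 0.0
    for mh, value in share.items():
        node, visited = mh, set()
        while node is not None and node not in visited:
            if node == sampler:
                total += value
                break
            visited.add(node)
            node = downstream.get(node)  # None once the path leaves the network
    return total


# Hypothetical layout: 21 people over 3 MHs (left), 25 people over 5 MHs (right).
block_subgroup_pop = {"left": 21, "right": 25}
block_manholes = {
    "left": ["L1", "L2", "L3"],
    "right": ["R1", "R2", "R3", "R4", "R5"],
}
downstream = {"L3": "S", "R5": "S"}  # only these two MHs drain to sampler "S"

share = per_manhole_share(block_subgroup_pop, block_manholes)
captured = population_at_sampler(share, downstream, "S")  # 7.0 + 5.0
```

Following each MH downstream through a dictionary of directional connections mirrors the description of tracking flow through the MH connection dataset. A city-scale network would likely call for a graph library or memoized traversal instead of this per-MH walk.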
The output of the probabilistic tool for a single run can be generated in the format of a summary table, shown in Table 2. The table displays the number of community members from the subgroup of interest who are represented by the sample taken at a node under the given sampling zone boundaries. Additional columns denote sampling nodes by their MH identification (ID) number, and additional rows of the output table represent data from additional census blocks within the city. The values within each column under a MH ID number are the probabilistically assigned subgroup populations represented by the wastewater sample. We ran the probabilistic distribution tool for each of the seven racial groups and each of the ten age categories. Then, the output table was filtered for the census blocks within each sub-sewershed zone to be able to calculate summary statistics by sub-sewershed zone rather than discrete census blocks.
| Census block | Total subgroup population | Node M16-011 | Node N13-045 | Node N11-062 | Node N11-072 | Node N12-066 | Node O13-002 | Node O20-001 | … |
|---|---|---|---|---|---|---|---|---|---|
| 061130104012002 | 13 | 0 | 0 | 0 | 0 | 0 | 0 | 13.0 | … |
| 061130107041016 | 8 | 0 | 4.0 | 0 | 0 | 0 | 4.0 | 0 | … |
| … | … | … | … | … | … | … | … | … | … |
We then compared the results of the probabilistic assignment tool to “manually derived” subgroup populations for each sub-sewershed zone. We obtained the manually derived population values by visually assigning the census-reported subgroup population value to a given census block under the delineated zone boundaries. We calculated the absolute percent difference (APD) between the probabilistically assigned subgroup population value and the manually derived subgroup population value using the following equation:
|  | (2) | 
Fig. 4 Population-weighted, 10-day right-aligned trimmed moving average of the normalized SARS-CoV-2 concentration in wastewater (N/PMMoV) for each sub-sewershed monitoring zone in the study area.
To quantify the error associated with the cumulative population-weighted moving average values, we calculated the mean absolute error (MAE) for each sub-sewershed zone's set of population-weighted moving averages, using the following equation:
|  | (3) | 
|  | (4) | 
826 (95.5% of the census-reported total population in 2020). Scenarios 1 and 2 evaluate impacts for a minor (∼25%) reduction in the number of sampling nodes, while scenarios 3 and 4 evaluate a reduction of sampling nodes by approximately 50%. In scenarios 1 and 3, the sampling sites removed were chosen at random. In scenarios 2 and 4, the sampling sites removed were selected such that the coverage (total subgroup population in the included zones divided by the total subgroup population in Davis) of the >65-year-old population was prioritized. In other words, the nodes that were removed corresponded to sub-sewershed zones with the fewest number of >65-year-olds. For each scenario we determined: the percentage of the total city population covered by the scenario, the percentage of the >65-year-old subgroup population covered by the sampling regime, the percentage of Black or African American subgroup population covered under that same regime, and a Spearman's rank correlation coefficient that reports the strength of correlation between the aggregated wastewater signals for the included zones (PWMA values) and the COD WWTP signals.
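A minimal sketch of the two node-removal strategies follows, assuming one node per zone for simplicity. The function names, zone labels, and population values are hypothetical and are not taken from the study data. Coverage follows the definition given above: subgroup population in the retained zones divided by the total subgroup population in the city.

```python
import random


def coverage(kept_zones, zone_subgroup_pop, city_subgroup_total):
    """Fraction of the city-wide subgroup population inside the kept zones."""
    return sum(zone_subgroup_pop[z] for z in kept_zones) / city_subgroup_total


def remove_prioritized(zone_subgroup_pop, n_remove):
    """Drop the zones with the fewest subgroup members (scenarios 2 and 4)."""
    ranked = sorted(zone_subgroup_pop, key=zone_subgroup_pop.get)
    return ranked[:n_remove], ranked[n_remove:]


def remove_random(zone_subgroup_pop, n_remove, seed=0):
    """Drop zones at random (scenarios 1 and 3); seed fixed for repeatability."""
    zones = sorted(zone_subgroup_pop)
    removed = random.Random(seed).sample(zones, n_remove)
    return removed, [z for z in zones if z not in removed]


# Hypothetical >65-year-old populations per zone and city-wide total.
zone_pop_65 = {"Z1": 400, "Z2": 250, "Z3": 150, "Z4": 100}
removed, kept = remove_prioritized(zone_pop_65, n_remove=2)
prioritized_coverage = coverage(kept, zone_pop_65, city_subgroup_total=1000)
```

The same `coverage` function can be reused with a second subgroup (for example, Black or African American populations) to check how a selection made for one group affects another.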
      
    
    
We assessed the APD values spatially (sub-sewershed zones) and by demographics (race and age). APD values across all sub-sewershed zones in Davis for race categories (Table 3) ranged from approximately 0% to 43%. Relatively low APD values were observed by race category in SR-A, SR-D, and SR-I, while SR-B and SR-E showed consistently higher APD values. APD values across all sub-sewershed zones in Davis for age categories (Table 4) ranged from approximately 0% to 60%. Relatively low APD values were observed in SR-A, SR-D, SR-H, and SR-I. Consistently higher APD values were observed in SR-B, SR-C, and SR-E.
Greater percent differences were observed more often for minority subgroup populations compared to White populations. This highlights how boundary effects can be significant for subgroups present in low absolute numbers. In these cases, some corrections may need to be made for census block populations at sub-sewershed zone boundaries. This could be done by intersecting census data with unique boundaries (i.e., the sub-sewershed zone boundaries) and using apportionment to divide the census block population between two neighboring zones. However, the accuracy of apportioning by area would be challenging to validate. Low-population zones (e.g., SR-E) will tend to exhibit high APDs when small absolute changes in zone reassignment lead to a large percent change. The more populous zones (e.g., SR-A) tended to yield low APDs overall. Locally relevant infrastructure conditions can help to interpret incongruencies and modify planning strategies for sampling designs. For instance, zone SR-D showed no differences between the probabilistic assignment and manual derivation methods (APD = 0%) but also has the lowest total population amongst the seven zones. The SR-D area encompasses a relatively new development with distinct, non-overlapping census blocks. We chose to use the probabilistic assignment tool to assess the representation of subgroup populations across Davis under different planning scenarios.
Individual sub-sewershed zones exhibited variability in the magnitude and timing of wastewater signals (Fig. 4). As expected, the propagated error for the aggregated sub-sewershed zone data (6.48 × 10⁻³) was much higher than that of the COD WWTP moving averages (3.60 × 10⁻⁴) due to the greater number of operations performed to generate the data. Nevertheless, the cumulative population-weighted mean average of the sub-sewershed zones exhibited similar magnitudes and patterns compared to the COD WWTP moving average wastewater data (Fig. 5). While the population in Davis is known to fluctuate during the summer and early winter months, due to the movement of students in and out of the city, there was no clear impact of these mobility trends on the correlations between sub-sewershed and city-wide wastewater data. The Spearman's rank correlation coefficient over the study period was 0.909, with a statistically significant positive correlation (p-value of 5.88 × 10⁻²⁸). We also calculated a Spearman's rank correlation coefficient between each sub-sewershed zone and the COD WWTP (Table S5†). Correlation coefficients ranged from approximately 0.732 to 0.935, and all p-values were less than 0.05. This suggests that many combinations of sub-sewershed zones may offer reasonable representation of wastewater disease dynamics at the city-level. Moreover, results for wastewater data at the sub-sewershed level indicate that the data can provide utility when alerting health professionals to potential surges. For instance, wastewater data from four sub-sewershed zones (SR-A, SR-B3, SR-C1, and SR-C2) tended to rise earliest amongst all the zones, pointing to potential regions for early health interventions. Greater fluctuation in sub-sewershed zone data compared to city-level aggregated results provides finer resolution when tailoring local interventions. 
In fact, HDT actively utilized wastewater data from sub-sewershed zones at the time of the data collection to target communications as a component of a multi-faceted strategy for precision public health.16,17
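The aggregation and comparison described above can be sketched as follows: per-zone signals are combined into a population-weighted mean for each day, and the resulting series is compared with the treatment plant series using Spearman's rank correlation. This is a minimal standard-library illustration under stated assumptions, not the study's code. The zone names, populations, and signal values are invented, and the moving-average smoothing applied in the study is omitted.

```python
def population_weighted_mean(zone_values, zone_populations):
    """Combine per-zone signals for one day into a population-weighted mean."""
    total = sum(zone_populations[z] for z in zone_values)
    return sum(zone_values[z] * zone_populations[z] for z in zone_values) / total


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one average rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical populations and four days of normalized signal for two zones.
populations = {"A": 1000, "B": 3000}
daily_zone_signal = [
    {"A": 2.0, "B": 1.0},
    {"A": 4.0, "B": 2.0},
    {"A": 3.0, "B": 5.0},
    {"A": 8.0, "B": 6.0},
]
plant_signal = [1.0, 3.0, 2.5, 7.0]  # hypothetical city-wide series

aggregated = [population_weighted_mean(day, populations) for day in daily_zone_signal]
rho = spearman(aggregated, plant_signal)
print(aggregated, rho)
```

Because Spearman's coefficient depends only on ranks, it rewards agreement in the ordering of high and low days rather than in absolute magnitude, which suits a comparison between series that were produced through different numbers of processing steps.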
Downscaling sampling may impact the representativeness of the data to the city overall, introducing tradeoffs between resource constraints and equitable representation of population subgroups. In this example, we stratified the sub-sewershed sampling by age, retaining sites to over-sample the high-risk population of individuals ages 65 and older. We evaluated the impact of this prioritization scheme on the representation of two subgroups disproportionately impacted by COVID-19 (elderly populations and Black or African American populations). In each of the four scenarios considered, we evaluated the impact of sampling node selections on: (1) the representation of the >65-year-old population (who comprise about 14.5% of the COD population), (2) the representation of the Black or African American population (who comprise about 2.31% of the COD population), and (3) correlations of aggregated sub-sewershed zone wastewater data with city-level data.
In scenarios 1 and 2, we examined the city-level effect of artificially pausing sampling at approximately 25% of the sampling sites, reducing the number of nodes from 15 to 11. In scenario 1, we removed nodes at random (SR-A, SR-D, SR-F1, and SR-F2 removed). The resulting coverage of the >65-year-old population decreased to 57.5%. The coverage of the Black or African American population declined to 70.3%. For comparison, scenario 2 maximized coverage of the >65-year-old population under the same resource constraints (SR-D, SR-E, SR-F1, and SR-F2 were removed). As expected, coverage of the >65-year-old population improved relative to scenario 1, to 80.5%. The coverage of Black or African American populations also improved in this scenario, though unexpectedly, to 89.7%. The aggregated wastewater signal in both scenarios was minimally affected by the exclusion of sampling nodes. Spearman's rank correlation coefficients of the aggregated wastewater signals from sub-city zones in each scenario relative to the COD WWTP moving average remained high and significant (Table 5), reflecting minimal change from baseline.
| Scenario | Description | % of total city population represented | % of >65-year-old subgroup population covered | % of Black or African American subgroup population covered | Wastewater signal correlation (p-value) |
|---|---|---|---|---|---|
| Baseline | Includes all sewershed nodes monitored in this study | 95.5% | 86.9% | 82.9% | 0.909 (5.88 × 10⁻²⁸) |
| 1 | Baseline minus 25% of sites (randomly selected) | 69.1% | 57.5% | 70.3% | 0.907 (1.19 × 10⁻²⁷) |
| 2 | Baseline minus 25% of sites (prioritized to maximize coverage of >65-year-old population) | 84.9% | 80.5% | 89.7% | 0.900 (1.19 × 10⁻²⁶) |
| 3 | Baseline minus 50% of sites (randomly selected) | 61.8% | 51.1% | 67.5% | 0.875 (1.87 × 10⁻²³) |
| 4 | Baseline minus 50% of sites (prioritized to maximize coverage of >65-year-old population) | 70.5% | 67.2% | 76.7% | 0.917 (2.83 × 10⁻²⁹) |
In scenarios 3 and 4, we examined the effect of pausing sampling at approximately 50% of the sampling sites, reducing the number of nodes from 15 to 9. A random selection of nodes in scenario 3 (SR-B1, SR-B2, SR-B3, SR-B4, SR-H, and SR-I removed) corresponded to a decline in the coverage of the >65-year-old population to 51.1%. The coverage of the Black or African American population decreased to 67.5%. In scenario 4, which once again maximized coverage of >65-year-old populations (SR-D, SR-E, SR-F1, SR-F2, SR-G, and SR-H removed), coverage of both >65-year-olds (67.2%) and Black or African American individuals (76.7%) improved relative to scenario 3. For both scenarios, the Spearman's rank correlation coefficients of the aggregated wastewater signals from sub-city zones relative to the COD WWTP moving average remained strong and significant. Notably, the aggregated wastewater signal in scenario 4 correlated most strongly with the city overall amongst all scenarios considered. This finding suggests that the zones removed in scenario 4 deviated from the city-level results during the study period.
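The coverage comparison across scenarios can be illustrated with a short sketch. It assumes zones do not overlap, so that for a fixed number of removals, dropping the zones with the fewest residents in the prioritized subgroup maximizes that subgroup's coverage. This is an illustrative rule rather than a description of how zones were selected in the study, and the zone names and counts below are hypothetical.

```python
def coverage(retained, subgroup_by_zone, city_total):
    """Percent of a subgroup's city-wide population living in the retained zones."""
    return 100.0 * sum(subgroup_by_zone[z] for z in retained) / city_total


def drop_smallest(subgroup_by_zone, n_remove):
    """Zones kept after removing the n_remove zones with the fewest subgroup residents."""
    ranked = sorted(subgroup_by_zone, key=subgroup_by_zone.get)
    return sorted(ranked[n_remove:])


# Hypothetical subgroup counts per zone; city totals include unmonitored areas.
over_65 = {"Z1": 400, "Z2": 100, "Z3": 300, "Z4": 50}
second_group = {"Z1": 20, "Z2": 40, "Z3": 30, "Z4": 5}
city_over_65, city_second_group = 1000, 100

# Prioritized scale-back: remove half the zones while protecting >65 coverage.
prioritized = drop_smallest(over_65, n_remove=2)
prioritized_over_65 = coverage(prioritized, over_65, city_over_65)
prioritized_second = coverage(prioritized, second_group, city_second_group)

# One arbitrary alternative removal of the same size, for comparison.
arbitrary = ["Z2", "Z4"]
arbitrary_over_65 = coverage(arbitrary, over_65, city_over_65)
arbitrary_second = coverage(arbitrary, second_group, city_second_group)

print(prioritized, prioritized_over_65, prioritized_second)
print(arbitrary, arbitrary_over_65, arbitrary_second)
```

Reporting coverage for a second subgroup alongside the prioritized one makes the side effects of a prioritization rule visible before any sampling site is paused.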
Our scenario planning example, though rather simplistic in nature, demonstrates a strategy for assessing the impact of sampling design decisions on the coverage of vulnerable populations. As expected, demographic comparison revealed that greater coverage of >65-year-olds is achieved when scale-back of sampling is methodical rather than random. We illustrate how tailoring the sampling plan to prioritize one vulnerable population impacts another, and we recommend that decision-makers consider potential impacts of prioritization schemes on subsequent health intervention strategies. After all, prioritization of certain vulnerable groups may lead to underrepresentation of others. To balance these competing priorities in real-world settings, public health agencies may refer to general trends in wastewater surveillance data in a sampling area to identify areas of high virus concentration, irrespective of demographic factors. Another way to contend with this impact may be to utilize a combination of this study's approach and the conventional approach of placing sites for maximal spatial coverage. Randomly selecting sample sites to remove may be appropriate situationally, especially if the scale-back efforts are modest (e.g., removing less than 25% of sites) and impacts on data equity are evaluated. In our study, removal of sub-city zones had little effect on aggregated wastewater signal correlations with data collected directly at the centralized wastewater treatment plant. Site-specific variations in the timing and magnitude of wastewater virus concentrations were observed, illuminating the utility of sub-sewershed monitoring to inform targeted interventions. In any scenario, it remains important to evaluate the impact of prioritization schemes across multiple demographics and disease dynamics, to assess how these changes will impact a precision public health approach. 
When contending with health disparities, wastewater disease surveillance can support a precision public health approach for greater health equity.
Decision-makers hold the ultimate responsibility to determine whether the coverages of varying demographics under different sampling scenarios are sufficiently representative of the city population. When making decisions related to improved health interventions within a city, it is simultaneously important to avoid targeting specific populations and creating additional stigmas associated with disease transmission.1 Decision-makers must balance perspectives of officials at the city and county levels, health and medical professionals from public and private organizations, and other diverse stakeholders. Stakeholders may factor fluctuations in demographic representation within specific sub-sewershed zones into decision-making instead of opting for the highest subgroup population percentage at the city level.
Second, we note that our approach may need to be modified when extended to other locations, as differences will arise from regional variations. This study specifically examines monitoring a suburban town with institutional support for its rigorous wastewater testing program during the pandemic. Differences including sewer system layout and coverage, access to digitized sewer system data, and population density may necessitate modifications to the approach in other locations.
Third, this study does not account for the mobility of populations. Assignment of demographic metrics to wastewater data does not take into account differences between individuals' place of residency and place of work or study.8 Davis also has a large student population from the University. Demographics for sub-sewershed zones with more students may change more often than areas with long-term residents. Application of the approach outlined in this study to other regions would offer opportunities for cross-comparisons.
Fourth and finally, this study analyzes race and age separately, as though their distribution of impact is mutually exclusive. In reality, there are populations (e.g., Black or African American population above the age of 65) that are especially vulnerable to COVID-19. Evaluating health equity on the basis of single characteristics is inherently limiting and risks homogenizing marginalized groups.
Several reasonable extensions of this project arise as we continue to understand community transmission of infectious diseases using wastewater. Health disparities are driven by a combination of many factors, including socioeconomic status, healthcare coverage, immigrant status, native language, and educational attainment. Future analyses should consider intersections amongst multiple community attributes using publicly available data. The American Community Survey, for instance, offers a breadth of demographic data at broader scales (e.g., census block group or tract).34 While the present study focuses on community transmission at the city-level, population representation should also be assessed at county and regional levels.12 For a well-balanced distribution of sampling nodes at broader geographic scales, optimization models can quantitatively consider parameters of interest, including the population served, spatial coverage, social vulnerability, and dissimilarity of wastewater signals.11,35 The methodology presented here can also be applied to other diseases that have known or anticipated links to demographic factors. For example, higher morbidity is observed in young children for respiratory syncytial virus (RSV).36 In preparedness efforts, cities might develop a suite of sampling strategies in anticipation of varied disease transmission scenarios and high-risk groups.
A successful wastewater-based disease surveillance program should aim for equitable representation of vulnerable groups within the sampling design, while preserving the anonymity of the sampled populations. When sampling frameworks are designed appropriately, wastewater data can inform health interventions that address disparities faced by underserved communities. The probabilistic assignment approach demonstrated in this study offers a way to assess impacts of changes to sampling regimes at a sub-sewershed level, facilitating design amidst shifting priorities and variable conditions.
This study also demonstrates the utility of equity assessments in scenario planning for wastewater monitoring. Before a sampling effort is undertaken, the approach can be used to determine whether the sampling area includes zones with more vulnerable subgroup populations, and sampling zones can be adjusted to ensure collection at appropriate MHs. If a limited number of autosamplers can be deployed in an area, the procedure outlined in the study could be used to determine favorable locations to optimize a chosen measure (e.g., capture a larger proportion of the population over the age of 65, who are more vulnerable to serious illness from COVID-19). Sampling locations can then be chosen strategically while accounting for different urban contexts, such as sewer system layout, MH access points, and population density. Ultimately, using this approach can help stakeholders adapt their sampling strategies to the local landscape, supporting incorporation of equity-centered public health interventions.
Footnote

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4ew00552j

This journal is © The Royal Society of Chemistry 2025