Amy L. Pochodylo and Damian E. Helbling*
School of Civil and Environmental Engineering, Cornell University, 220 Hollister Hall, Ithaca, NY, USA. E-mail: damian.helbling@cornell.edu; Fax: +1 607 255 9004; Tel: +1 607 255 5146
First published on 24th November 2016
The emergence of suspect screening has enabled the comprehensive characterization of micropollutants in water systems. In this work, we developed a sensitive suspect screening workflow and applied it to characterize the occurrence of micropollutants in eighteen water samples collected from an urban water system in New York State. We used high-resolution mass spectrometry to collect full-scan and data-dependent tandem mass spectra from the water samples and compiled a suspect database that contained 1113 chemical substances including pesticides, pharmaceuticals, personal care products, and industrial chemicals. The suspect screening workflow included peak picking, suspect database matching, isotopic pattern scoring, a replication filter, blank subtraction and artifact removal, and clustering of suspect hits. Each step in the workflow relied only on the quality of the analytical data, and was optimized and validated using a set of compounds that covered a broad range of physicochemical properties. After applying the optimized suspect screening workflow to the data acquired from the water samples, we developed a series of prioritization strategies that ranked the resulting suspect hits according to metrics that we hypothesized would favor true positive detections. We then acquired authentic standards for suspect hits based on their ranking on the priority lists to confirm or reject their occurrence. With this approach, we confirmed the presence of 112 micropollutants in at least one of the eighteen water samples. Comparing these results to the scope of conventional micropollutant monitoring methods, we approximate that our suspect screening approach more than doubled the number of micropollutants that may otherwise have been identified.
Water impact
Man-made chemicals such as pesticides, pharmaceuticals, and personal care products have been measured in water resources around the world. This work introduces a new approach to comprehensively characterize the occurrence of these so-called micropollutants in environmental samples. With this approach, we identified 112 micropollutants occurring in at least one sample collected from drinking water, wastewater, and surface water systems.
Despite the clear value of the data acquired during target screening, there remain limitations to this approach. First, target screening focuses water quality monitoring on a fixed set of micropollutants. However, the numbers and types of micropollutants that may occur in a water system are dependent on a variety of local features such as land use,8 proximity to industry,9 type of sewer system,10 type of wastewater treatment system,10 and population demographics.11 These factors suggest that analytical methods need to be flexible and easily adaptable to address the numbers and types of micropollutants that may occur in any region of interest. Second, target screening requires the use of authentic standards for the identification and quantification of target analytes. In addition to being laborious and economically inefficient, the process of selecting analytes for a target screening method is particularly challenging when the types of micropollutants that may be present in a water system are unknown. Finally, even the broadest multi-residue target screening methods only evaluate the occurrence of a fraction of the micropollutants that are expected to occur in any water system. As a result, risk assessments based on the results of target screening methods can significantly underestimate the chemical risk associated with micropollutant occurrence.12
Recent advances in high-resolution mass spectrometry (HRMS) have enabled the development of more comprehensive and versatile analytical methods for water quality monitoring without the need for authentic standards.13–16 Suspect screening is an emerging approach that relies on the high mass accuracy and high mass resolution afforded by HRMS to link features in mass spectral acquisitions to suspect chemicals that may occur in a sample.17 The general approach involves peak picking in the full-scan mass spectral acquisition and matching the accurate masses of the picked peaks to the exact masses of the major adducts (e.g., [M + H]+ or [M − H]−) and the theoretical isotopic patterns of suspect chemicals. Each match is a tentative detection of a suspect chemical and is referred to subsequently as a “suspect hit.” Suspect screening methods have been described for identifying suspect chemicals in a variety of matrices including water and wastewater,12 lake sediments,18 urine,19 and processed animal products.20
There are at least two important considerations that must be addressed prior to developing a new suspect screening workflow for a particular application. First, it is important to consider the numbers and types of chemical substances to be included as suspect chemicals. One approach is to “screen smart”, where a relatively small number of suspect chemicals is selected whose presence will provide key insights into a particular problem. For example, suspect screening has been applied to identify putative transformation products of sulfonamide antibiotics,21 photo-degradation products of iodinated contrast media,22 and biotransformation products of structurally-similar chemical substances.23,24 Another approach is to “screen big”, where a suspect database that includes thousands of suspect chemicals is built to represent the universe of likely chemical substances that may be present in a particular sample.13,16,25 This latter approach may be the most appropriate when the goal is comprehensive characterization of micropollutants in environmental samples, though large suspect databases should be applied with caution as more suspect chemicals are likely to result in more false positive detections. Second, it is important to consider how the suspect screening workflow should be optimized. Much recent research has focused on improving the false positive rate of suspect screening methods. Some commonly explored strategies include in silico prediction of the retention times26–28 or tandem mass spectral (MS2) fragments of suspect chemicals,13,29 data which can be incorporated into suspect screening workflows to further evaluate suspect hits. 
Others have considered intensity-dependent mass error adjustments30 or statistical rejection filters.31,32 Whereas these techniques have led to successful suspect screening discoveries and a general reduction in false positive rates, those benefits come at the expense of higher false negative rates, which narrow the comprehensiveness of the suspect screening. Another optimization approach is to balance the false positive and false negative rates,14,33,34 though the concession made in balancing the error rates may also lead to less comprehensive coverage of the suspect screening method. To the best of our knowledge, no suspect screening method has been described that explicitly aims to minimize the false negative rate to enable the most comprehensive characterization of micropollutant occurrence in a water system.
The goal of this research was to develop and apply a suspect screening workflow to comprehensively characterize the occurrence of micropollutants in an urban water system in New York State. To meet this goal, we collected water samples at the intake and from the finished water of a drinking water treatment plant (DWTP), at the influent and effluent of a wastewater treatment plant (WWTP), and from a surface water system that receives the effluent of the WWTP. We then: (i) developed and optimized a novel suspect screening workflow; (ii) validated the performance of the suspect screening workflow in each of the matrices; and (iii) applied the suspect screening workflow to the set of water samples to identify suspect micropollutants. Our approach was to “screen smart” while remaining comprehensive because the occurrence of micropollutants had never been assessed in the study area. Therefore, the suspect database contained 1113 chemical substances that have been reported as water-relevant micropollutants in water systems around the world and are likely to be detected by our HRMS analytical method. Additionally, we systematically optimized the suspect screening workflow to minimize the false negative rate. We then developed a series of novel prioritization strategies to rank suspect hits in a way that we expected would give priority to true positive detections. Authentic standards were acquired to confirm or reject the occurrence of all prioritized suspect chemicals.
Fig. 1 Schematic of the steps involved in the development and optimization, validation, and application of the suspect screening workflow.
We used the TraceFinder v3.1 software for peak picking, though a number of open source software packages are available for peak picking within high-resolution mass spectra.40,41 Peak picking algorithms rely on a number of user-defined peak picking parameters that determine how mass spectra are clustered and whether or not a cluster of mass spectra will be defined as a peak. We systematically adjusted the magnitude of each peak picking parameter to investigate its effect on the results. We then compared the accurate masses of each of the picked peaks with the exact masses of the [M + H]+ and [M − H]− adducts of each of the 1113 chemicals in the suspect database to identify suspect hits. Other major adducts such as [M + Na]+ or [M + NH4]+ were not considered during suspect database matching because our analyses demonstrated that inclusion of other adducts did not improve the method sensitivity but significantly lowered the method selectivity. We optimized each peak picking parameter by identifying the parameter value that resulted in identification of all 45 validation compounds while minimizing the total number of suspect hits identified in representative high (750 ng L−1) and low (25 ng L−1) concentration samples of the dilution series. The parameters that had the largest influence on the results of peak picking were the area noise factor, the peak noise factor, the baseline window, the peak area threshold, and the signal-to-noise ratio. Details on the optimization of each of these parameters are provided in the ESI.†
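The suspect database matching step described above can be sketched as follows. This is a minimal illustration, not the TraceFinder implementation: the function names, the 5 ppm mass tolerance, and the two example suspects with their monoisotopic masses are assumptions chosen for the example, since this section does not state the mass tolerance that was used.

```python
PROTON_MASS = 1.007276  # Da; mass of a proton (hydrogen atom minus one electron)


def adduct_mz(monoisotopic_mass, polarity):
    """Exact m/z of the singly charged [M+H]+ ("+") or [M-H]- ("-") adduct."""
    if polarity == "+":
        return monoisotopic_mass + PROTON_MASS
    if polarity == "-":
        return monoisotopic_mass - PROTON_MASS
    raise ValueError("polarity must be '+' or '-'")


def ppm_error(measured_mz, exact_mz):
    """Signed mass error of a picked peak in parts per million."""
    return (measured_mz - exact_mz) / exact_mz * 1e6


def match_suspects(peaks, suspects, tol_ppm=5.0):
    """Return (name, polarity, ppm error) for every peak/suspect pair in tolerance.

    peaks    -- iterable of (measured m/z, polarity) tuples from peak picking
    suspects -- dict of suspect name -> monoisotopic mass in Da
    tol_ppm  -- assumed tolerance; not taken from the source
    """
    hits = []
    for measured_mz, polarity in peaks:
        for name, mass in suspects.items():
            error = ppm_error(measured_mz, adduct_mz(mass, polarity))
            if abs(error) <= tol_ppm:
                hits.append((name, polarity, round(error, 2)))
    return hits


# Illustrative two-entry suspect list (the real database holds 1113 substances).
suspects = {"caffeine": 194.08038, "sucralose": 396.01455}
# Hypothetical picked peaks: two plausible matches and one unrelated peak.
peaks = [(195.0877, "+"), (395.0073, "-"), (300.1234, "+")]
hits = match_suspects(peaks, suspects)
```

Each entry in `hits` is a tentative detection only; restricting the adduct list to [M + H]+ and [M − H]−, as the authors did, keeps the number of candidate matches per peak small.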
The optimized peak picking and suspect database matching routine was applied to the full-scan mass spectra acquired from the representative high and low concentration samples from the dilution series. A total of 893 and 647 suspect hits were identified in each of the samples, respectively, as shown in Fig. 2. Peaks representing each of the 45 validation compounds were picked in both samples reflecting a method sensitivity of 100%. However, the large number of suspect hits yielded a poor method selectivity of 6.0% across the dilution series. Therefore, additional suspect screening workflow steps were developed to reduce false positive suspect hits and improve the method selectivity.
Isotopic pattern scoring can be applied to suspect screening workflows to remove suspect hits that do not contain an isotopic pattern matching the theoretical isotopic pattern of the suspect chemical. Isotopic pattern scores can be assigned in TraceFinder or other software packages based on deviations between the measured and predicted masses and intensities of the isotopic pattern. We optimized the isotopic pattern scoring in the same way that we optimized the peak picking parameters, as described above; details are available in the ESI.† After applying the optimized isotopic pattern scoring routine to the high and low concentration samples of the dilution series, the total number of suspect hits was reduced to 604 and 452, respectively, as shown in Fig. 2. Isotopic pattern scoring had no effect on method sensitivity, but the reduction in the total number of suspect hits resulted in an improved method selectivity of 8.8%.
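To make the idea of an isotopic pattern score concrete, the sketch below scores a measured pattern against a theoretical one using an allowable mass deviation and an allowable intensity deviation. It is an assumed, simplified scoring rule and not the algorithm used by TraceFinder; the function name, the tolerances, and the example pattern for a singly chlorinated ion are all illustrative.

```python
def isotope_pattern_score(theoretical, measured, mass_tol_ppm=5.0, intensity_tol=0.10):
    """Percent of the theoretical isotope intensity that is matched by measured peaks.

    theoretical, measured -- lists of (m/z, relative intensity), base peak = 1.0
    mass_tol_ppm          -- assumed allowable mass deviation
    intensity_tol         -- assumed allowable absolute deviation in relative intensity
    """
    total = sum(intensity for _, intensity in theoretical)
    matched = 0.0
    for t_mz, t_int in theoretical:
        for m_mz, m_int in measured:
            mass_ok = abs(m_mz - t_mz) / t_mz * 1e6 <= mass_tol_ppm
            intensity_ok = abs(m_int - t_int) <= intensity_tol
            if mass_ok and intensity_ok:
                matched += t_int  # credit this theoretical peak once
                break
    return 100.0 * matched / total


# Illustrative pattern for an ion containing one chlorine atom (M, M+1, M+2).
theoretical = [(216.1010, 1.00), (217.1040, 0.10), (218.0981, 0.32)]
# Hypothetical measurement showing all three isotope peaks.
with_m2 = [(216.1011, 1.00), (217.1042, 0.12), (218.0983, 0.30)]
# Hypothetical measurement in which the M+2 peak was not observed.
without_m2 = [(216.1011, 1.00), (217.1042, 0.12)]

score_with_m2 = isotope_pattern_score(theoretical, with_m2)
score_without_m2 = isotope_pattern_score(theoretical, without_m2)
```

A suspect hit would then be retained only if its score meets a chosen threshold; with a threshold of 90, for example, the second measurement would be discarded.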
The remaining steps in the suspect screening workflow were developed to remove false positive suspect hits resulting from analytical noise or matrix constituents. First, the replication filter was developed to remove suspect hits that were not detected robustly over replicate analytical injections from the same sample. We reasoned that the peak picking algorithm may pick peaks related to noise or other transient substances in any single analytical injection, but chemical substances that are present and stable in the sample should generate a robust series of picked peaks across a set of multiple injections. We determined that three replicate analytical injections were sufficient to eliminate a significant number of false positive suspect hits resulting from analytical noise, as detailed in the ESI.† Second, the blank subtraction and artifact removal step was added to remove matrix constituents from the list of suspect hits. For blank subtraction, we removed suspect hits from a sample if a suspect hit was present in the 0 ng L−1 sample (the blank) and had a peak area greater than or equal to the peak area measured for that suspect hit in the sample. This is a conservative approach that does not incorporate a peak area amplifier into blank subtraction as has been reported elsewhere.12 Instead, we developed an artifact removal step, which removed all suspect hits that were identified in every sample of the dilution series and had peak areas that did not vary significantly over the dilution series. Artifact removal enables removal of matrix constituents that may be present in the blank at slightly lower peak areas than in the sample without the need for applying an arbitrary amplifier. Finally, the clustering of suspect hits removes all extra annotations of a single suspect chemical to generate a list of unique suspect hits identified in each sample. The effects that each of these steps had on reducing the total number of suspect hits are presented in Fig. 2.
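The replication filter, blank subtraction, and artifact removal steps can be sketched as three small functions. This is a hedged illustration under stated assumptions: requiring detection in all three injections and using a 20% relative standard deviation as the test for peak areas that "did not vary significantly" are choices made for the example, and the data structures and names are hypothetical.

```python
from statistics import mean, stdev


def replication_filter(injections, min_detections=3):
    """Keep hits detected in at least `min_detections` replicate injections.

    injections -- list of sets of suspect-hit identifiers, one set per injection
    """
    counts = {}
    for hits in injections:
        for hit in hits:
            counts[hit] = counts.get(hit, 0) + 1
    return {hit for hit, n in counts.items() if n >= min_detections}


def blank_subtraction(sample_areas, blank_areas):
    """Drop a hit when it is in the blank with a peak area >= its sample area."""
    kept = {}
    for hit, area in sample_areas.items():
        blank_area = blank_areas.get(hit)
        if blank_area is None or blank_area < area:
            kept[hit] = area
    return kept


def find_artifacts(series_areas, max_rsd=0.20):
    """Flag hits present in every dilution-series sample with near-constant area.

    series_areas -- dict of hit -> list of peak areas across the dilution series,
                    with None where the hit was not detected
    max_rsd      -- assumed cutoff on relative standard deviation
    """
    artifacts = set()
    for hit, areas in series_areas.items():
        if any(area is None for area in areas):
            continue  # not present in every sample, so not treated as an artifact
        if stdev(areas) / mean(areas) <= max_rsd:
            artifacts.add(hit)
    return artifacts


# Hypothetical data for three replicate injections of one sample.
replicated = replication_filter([{"A", "B", "C"}, {"A", "B"}, {"A", "C"}])
# Hypothetical peak areas in a sample and in the 0 ng/L blank.
after_blank = blank_subtraction({"A": 1000.0, "B": 500.0, "C": 200.0},
                                {"B": 600.0, "C": 100.0})
# Hypothetical peak areas across a four-level dilution series.
artifacts = find_artifacts({"X": [100.0, 105.0, 98.0, 102.0],
                            "Y": [10.0, 100.0, 1000.0, 5000.0],
                            "Z": [100.0, None, 100.0, 100.0]})
```

Note that hit "C" survives blank subtraction because its blank area is lower than its sample area; this is the case that artifact removal is meant to catch when the area stays flat across the dilution series.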
After applying the full suspect screening workflow, the total number of suspect hits was reduced to 203 and 164 in the high and low concentration samples from the dilution series, respectively. One of the validation compounds (primidone) was lost from the low concentration sample during the blank subtraction step resulting in a method sensitivity of 98.9%. However, the continued reduction in the total number of suspect hits resulted in an improved method selectivity of 24.8%. It is important to note that selectivity is a function of the number of chemical substances contained in the suspect database. For example, if the suspect database contained only the 45 validation compounds, the selectivity of the suspect screening workflow would be 100%. Therefore, the optimization of the suspect screening workflow resulted in a highly sensitive method that has a method selectivity of approximately 25% when the suspect database contains 1113 chemical substances.
An accounting of the true positive and false negative detections is presented in Fig. 3. Twenty-four of the 45 validation compounds were confirmed to occur in at least one of the eighteen water samples. The suspect screening yielded a total of 349 suspect hits among the eighteen water samples; 203 of those were confirmed as true positives and the remaining 146 were false positives. The majority of false positives were identified as such based on non-matching retention times when compared to the authentic standard. While an in silico retention time prediction step may have eliminated many of the false positive hits, the increased selectivity would have come at the expense of reduced sensitivity due to the uncertainty inherent in retention time prediction. This observation highlights the need for improved in silico tools for the prediction of retention times of suspect chemicals in liquid chromatography applications. Some false positive suspect hits were also rejected following inspection of MS or MS2 spectra. These were substances that were measured with very low intensities that either did not yield strong MS signals or did not trigger the dd-MS2 experiment. These substances could be considered as true positives that were below the limit of detection of the suspect screening workflow, but were conservatively identified as false positives here.
There were also 24 instances in which validation compounds were confirmed to occur following manual inspection of the analytical data but were not identified as suspect hits and are therefore false negatives. The majority of the compounds that were identified as false negatives were filtered out of the suspect screening workflow during the isotopic pattern scoring step, though the isotopic pattern was clear upon manual inspection. Adjusting the isotopic pattern scoring threshold or the allowable mass and intensity deviations did not improve the performance of the suspect screening method, so no modifications were made to the suspect screening workflow based on this observation. Other validation compounds were lost during the replication filter, particularly substances that were present at low intensity and in complex wastewater influent or effluent matrices. Together, these observations suggest that the limit of detection of our suspect screening workflow will not be as low as the limit of detection of an analogous targeted analytical method. Nevertheless, the method sensitivity across varying matrices based on these results was 89.4% and the method selectivity was 58.2%.
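The reported sensitivity and selectivity are consistent with the following definitions, which are inferred here from the counts given above rather than stated explicitly in this section: sensitivity is the share of true occurrences that were flagged as suspect hits, and selectivity is the share of suspect hits that were true positives. A short check with the validation counts:

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real occurrences that the workflow reported as suspect hits."""
    return true_positives / (true_positives + false_negatives)


def selectivity(true_positives, false_positives):
    """Fraction of reported suspect hits that were confirmed as real."""
    return true_positives / (true_positives + false_positives)


# Counts reported for the eighteen water samples during validation.
tp, fp, fn = 203, 146, 24

sensitivity_pct = round(100 * sensitivity(tp, fn), 1)
selectivity_pct = round(100 * selectivity(tp, fp), 1)
```

Because selectivity defined this way depends on the number of false positive hits, it falls as the suspect database grows, which matches the authors' remark that a database of only the 45 validation compounds would give a selectivity of 100%.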
We developed two groups of novel prioritization strategies that ranked the resulting suspect hits according to metrics that we hypothesized would favor true positive detections. We then acquired authentic standards (when available) for suspect hits in the order in which they were ranked on the priority list and collected analytical data for confirmation of the suspect hits. We continued evaluating suspect hits on each priority list until we investigated the top 30 suspect hits or the running selectivity of the prioritization dropped below 60%, whichever came later. We selected 60% as the running selectivity threshold based on the results of method validation; we reasoned that attaining the selectivity obtained when applying the suspect screening method with a suspect list that contained only 45 chemical substances would be an ambitious benchmark to achieve with a larger suspect database. The running selectivity was calculated as the selectivity of the method as a function of the number of true positives and false positives identified as we evaluated the priority list. Suspect hits for which authentic standards were not acquired were included in the calculation of running selectivity and were assigned a selectivity of 25%, which is based on the conservative assumption that only one out of four suspect hits for which authentic standards were not acquired is a true positive, as was observed during method optimization.
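The running selectivity and stopping rule can be sketched as below. The 25% weight for hits without an authentic standard, the 30-hit minimum, and the 60% threshold come from the text; the function names, the outcome labels, and the reading of the stopping rule as "the first position at or beyond 30 hits where the running selectivity is below the threshold" are assumptions.

```python
def running_selectivity(outcomes, unknown_weight=0.25):
    """Running selectivity after each evaluated hit on a priority list.

    outcomes -- ranked list of "confirmed", "rejected", or "no_standard"
    """
    weights = {"confirmed": 1.0, "rejected": 0.0, "no_standard": unknown_weight}
    total = 0.0
    values = []
    for n, outcome in enumerate(outcomes, start=1):
        total += weights[outcome]
        values.append(total / n)
    return values


def stopping_point(outcomes, min_hits=30, threshold=0.60, unknown_weight=0.25):
    """Number of hits evaluated before stopping, under the assumed rule."""
    values = running_selectivity(outcomes, unknown_weight)
    for n, value in enumerate(values, start=1):
        if n >= min_hits and value < threshold:
            return n
    return len(outcomes)


# Totals reported for the WOS list: 22 confirmed, 14 rejected, 2 without a standard.
# The order within the list is not given in this section, so this order is arbitrary.
wos_outcomes = ["confirmed"] * 22 + ["rejected"] * 14 + ["no_standard"] * 2
wos_final = running_selectivity(wos_outcomes)[-1]

# Hypothetical list used only to exercise the stopping rule.
example_stop = stopping_point(["confirmed"] * 20 + ["rejected"] * 20)
```

With the WOS totals, the final value is 22.5/38, or about 59%, which agrees with the statement that the running selectivity fell to roughly 60% after 38 suspect hits.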
The first group of prioritization strategies was based on Web of Science (WOS) searches for each of the 534 suspect hits using the search string “environment* AND water AND [name of suspect hit]”. We ranked each of the suspect hits based on the number of WOS search returns that were received for each suspect hit as of February 2016. We reasoned that suspect hits with more WOS search returns would be more likely to occur in our water samples. As is summarized in Table 1, the WOS prioritization resulted in the investigation of 36 suspect hits with an authentic standard and 22 of those were confirmed for a confirmation rate of 61%. The running selectivity of the method dropped to 60% after 38 suspect hits were investigated, and there were two suspect hits for which authentic standards could not be acquired. We then coupled the WOS ranking with other metrics aiming to further refine the prioritization of the suspect hits. For example, we developed a priority list based on the WOS rankings and considered only suspect hits that were present in both the WWTP influent and effluent samples during at least two sampling events (WOS + WWTPs). We reasoned that this strategy would prioritize persistent wastewater-derived micropollutants. We investigated 63 suspect hits based on this prioritization and confirmed 46 of them, 36 of which were additional unique confirmations beyond the WOS priority list alone. We also prioritized suspect hits based on the WOS ranking of suspect hits present in all lake samples (WOS + lake), present in all DWTP intake samples (WOS + DWTP), containing at least one chlorine atom (WOS + Cl), contained in the USGS wastewater methods (WOS + USGS), contained in the New York State PIMS database (WOS + PIMS), and pharmaceuticals that were present in the WWTP influent and effluent during at least two sampling events (WOS + WWTPs + pharmaceuticals). 
All combinations resulted in the confirmed identification of unique compounds beyond the WOS priority list alone. The results of these prioritization strategies are summarized in Table 1.
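A combined strategy such as WOS + WWTPs amounts to filtering the suspect hits and then ranking what remains by WOS search returns. The sketch below is illustrative only: the record layout, the field names, the hit names, and the search counts are hypothetical, and "present in both influent and effluent during at least two sampling events" is interpreted as detection in both matrices within the same event.

```python
def prioritize(hits, keep=lambda hit: True):
    """Rank hits by descending WOS return count, keeping only hits that pass `keep`."""
    eligible = [hit for hit in hits if keep(hit)]
    return sorted(eligible, key=lambda hit: hit["wos_returns"], reverse=True)


def in_influent_and_effluent(hit, min_events=2):
    """True if the hit was found in both WWTP matrices in at least `min_events` events."""
    shared_events = hit["influent_events"] & hit["effluent_events"]
    return len(shared_events) >= min_events


# Hypothetical suspect hits with made-up WOS counts and sampling-event detections.
hits = [
    {"name": "hit_a", "wos_returns": 120,
     "influent_events": {1, 2, 3}, "effluent_events": {2, 3}},
    {"name": "hit_b", "wos_returns": 450,
     "influent_events": {1}, "effluent_events": {1}},
    {"name": "hit_c", "wos_returns": 300,
     "influent_events": {1, 2}, "effluent_events": {1, 2, 3}},
]

wos_list = [hit["name"] for hit in prioritize(hits)]
wos_wwtp_list = [hit["name"] for hit in prioritize(hits, keep=in_influent_and_effluent)]
```

Passing the filter as a function keeps the ranking step unchanged, so the other strategies (lake, DWTP, chlorine-containing, USGS, PIMS) would differ only in the predicate supplied.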
Prioritization strategy | Length of list | Compounds investigated | Confirmed compounds | % confirmed | Unique confirmations (a) | % unique confirmations
---|---|---|---|---|---|---
Web of Science (WOS) | 38 | 36 | 22 | 61% | n/a | n/a
WOS + WWTPs | 87 | 63 | 46 | 73% | 36 | 78%
WOS + lake | 30 | 23 | 15 | 65% | 12 | 80%
WOS + DWTP | 30 | 18 | 14 | 78% | 12 | 86%
WOS + Cl | 33 | 26 | 18 | 69% | 12 | 67%
WOS + USGS | 89 | 57 | 45 | 79% | 36 | 80%
WOS + PIMS | 30 | 25 | 12 | 48% | 5 | 42%
WOS + WWTPs + Pharms | 97 | 69 | 51 | 74% | 24 | 47%
(a) Unique confirmations are confirmations made beyond the WOS prioritization alone. For the WOS + WWTPs + Pharms prioritization strategy, unique confirmations are confirmations beyond the WOS + WWTPs prioritization.
The second group of prioritization strategies was based on the maximum peak area recorded for each suspect hit. We reasoned that there would be greater confidence in the results of each of the steps in the suspect screening workflow for suspect hits with larger peak areas. Further, this was a means to prioritize suspect hits in a way that is independent of whether or not the suspect chemical has been previously reported as a water pollutant. As is summarized in Table 2, the peak area prioritization resulted in 44 suspect hits investigated with an authentic standard and 38 of those were confirmed for a confirmation rate of 86%. The running selectivity of the method reached 60% after 77 suspect hits were investigated. We coupled the peak area metric to the same metrics described for the WOS search and those results are presented in Table 2.
Prioritization strategy | Length of list | Compounds investigated | Confirmed compounds | % confirmed | Unique confirmations (a) | % unique confirmations
---|---|---|---|---|---|---
Peak area | 77 | 44 | 38 | 86% | n/a | n/a
Peak area + WWTPs | 78 | 44 | 38 | 86% | 3 | 8%
Peak area + lake | 32 | 19 | 16 | 84% | 1 | 6%
Peak area + DWTP | 30 | 18 | 14 | 78% | 2 | 14%
Peak area + Cl | 42 | 26 | 21 | 81% | 20 | 95%
Peak area + USGS | 97 | 60 | 49 | 82% | 22 | 45%
Peak area + PIMS | 30 | 15 | 9 | 60% | 7 | 78%
Peak area + WWTPs + Pharms | 81 | 48 | 40 | 83% | 17 | 43%
(a) Unique confirmations are confirmations made beyond the peak area prioritization alone. For the peak area + WWTPs + Pharms prioritization strategy, unique confirmations are confirmations beyond the peak area + WWTPs prioritization.
In total, the WOS group of prioritization strategies enabled the confirmation of 103 suspect hits and the peak area group of prioritization strategies enabled the confirmation of 92 suspect hits. Many suspect hits were prioritized and confirmed in both strategies, but 20 suspect hits were only prioritized and confirmed in the WOS group and 9 suspect hits were only prioritized and confirmed in the peak area group. Plots of the running selectivity for each of the prioritization strategies and a complete list of all suspect hits compared with an authentic standard are provided in the ESI.† Suspect hits that were not confirmed or rejected with an authentic standard are not discussed in this manuscript.
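The counts in this paragraph can be cross-checked with simple set arithmetic, treating each group of strategies as a set of confirmed suspect hits. The variable names are illustrative; the numbers are those reported above.

```python
wos_confirmed = 103        # confirmed through the WOS group of strategies
peak_area_confirmed = 92   # confirmed through the peak area group of strategies
wos_only = 20              # confirmed only in the WOS group
peak_area_only = 9         # confirmed only in the peak area group

# Hits confirmed in both groups, computed from each side independently.
shared_from_wos = wos_confirmed - wos_only
shared_from_peak_area = peak_area_confirmed - peak_area_only

# Total number of distinct confirmed suspect hits across both groups.
total_confirmed = shared_from_wos + wos_only + peak_area_only
```

Both routes give 83 shared confirmations and a combined total of 112, which matches the number of micropollutants reported as confirmed in at least one of the eighteen water samples.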
Fig. 4 An accounting of the 88 suspect chemicals that were confirmed to occur in at least one of the eighteen water samples. All confirmed suspect chemicals are true positives (TR). 1: Carbamazepine-10,11-epoxide; 2: dextromethorphan; 3: ethyl 3-(N-butylacetamido) propionate; 4: hydrochlorothiazide; 5: N4-acetylsulfamethoxazole; 6: perfluorobutyric acid; 7: perfluorooctanoic acid; 8: tris(1,3-dichloro-2-propyl)phosphate; 9: tris(2-chloro-ethyl) phosphate.
We compared the numbers of micropollutants confirmed by our suspect screening approach to the numbers that may otherwise have been identified using more conventional target screening approaches. The USGS National Water Quality Laboratory maintains an index of target screening methods for micropollutants in water and wastewater matrices. When five of the most comprehensive methods are combined,4,5,43–45 they enable target screening for over 250 micropollutants amenable to analysis by HPLC-HRMS including pharmaceuticals, pesticides, personal care products and industrial chemicals. Of the 112 micropollutants identified in this research, 54 are included in these target screening methods. The fractions of micropollutants identified in each of our water samples that are or are not included in these target screening methods are provided in Fig. 5. Based on this comparison, we approximate that our suspect screening approach more than doubled the number of micropollutants that may have otherwise been identified, even with a very comprehensive target screening approach.
An exhaustive discussion of the types of micropollutants that we identified in this work is beyond the scope of this manuscript. However, there are several observations worth noting. First, there were 8 micropollutants that were present in every WWTP effluent sample and in every lake sample: 5-methyl-1H-benzotriazole, atenolol acid, caffeine, DEET, gabapentin, metformin, saccharin, and sucralose. The importance of each of these persistent micropollutants as indicators of anthropogenic influence and concerns over their respective toxicities has been discussed elsewhere,46–49 though we are unaware of any previous work that has identified this mixture of micropollutants in a single water system or with a single analytical approach. Second, some micropollutants that were detected have received attention for their putative or known health effects on exposed ecosystems or human populations. Two perfluorinated alkyl substances (PFASs), PFOA and PFBA, were confirmed to occur in wastewater and surface water samples in the study area. There has been increasing concern over the occurrence of PFASs in water, particularly in areas adjacent to military installations or chemical manufacturing industries.9 The study area is not situated near these types of sources, but the trace detection of two PFASs demonstrates their prevalence in the environment. Additionally, a number of the pesticides (e.g., 2,4-D, atrazine, metolachlor, simazine) identified in the surface water samples are known or putative endocrine disruptors and their occurrence must be noted.50 Third, most of the micropollutants that were detected in this research are polar chemicals that are expected to favor partitioning to water, but some have exhibited the potential for bioaccumulation including the PFASs and the UV filters benzophenone and benzophenone-3.51,52 Finally, many of the micropollutants confirmed in our study have frequently been reported to occur in water resources around the world.
However, some of the micropollutants have rarely been reported as water contaminants or are believed to be reported here for the first time. These include the anticonvulsant levetiracetam, the antihistamine fexofenadine, the antiviral drug emtricitabine, the cough suppressant dextromethorphan, the diuretic triamterene, the fungicide iodocarb, the insect repellent ethyl butylacetylaminopropionate, and the muscle relaxants carisoprodol, metaxalone, and methocarbamol. These results are not only interesting from a novelty perspective, but also demonstrate the breadth of chemical coverage that suspect screening affords, as these chemical substances represent a broad range of chemical structures and physicochemical properties and are unlikely to be included together in conventional target screening methods.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c6ew00248j
This journal is © The Royal Society of Chemistry 2017