Waterworks-specific composition of drinking water disinfection by-products †

Reactions between chemical disinfectants and natural organic matter (NOM) upon drinking water treatment result in formation of potentially harmful disinfection by-products (DBPs). The diversity of DBPs formed is high and a large portion remains unknown. Previous studies have shown that non-volatile DBPs are important, as much of the total toxicity from DBPs has been related to this fraction. To further understand the composition and variation of DBPs associated with this fraction, non-target analysis with ultrahigh resolution Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was employed to detect DBPs at four Swedish waterworks using different types of raw water and treatments. Samples were collected five times covering a full year. A common group of DBPs formed at all four waterworks was detected, suggesting a similar pool of DBP precursors in all raw waters that might be related to phenolic moieties. However, the largest proportion (64 – 92%) of the assigned chlorinated and brominated molecular formulae were unique, i.e. were solely found in one of the four waterworks. In contrast, the compositional variations of NOM in the raw waters and samples collected prior to chemical disinfection were rather limited. This indicated that waterworks-specific DBPs presumably originated from matrix effects at the point of disinfection, primarily explained by differences in bromide levels, disinfectants (chlorine versus chloramine) and different relative abundances of isomers among the NOM compositions studied. The large variation of observed DBPs in the toxicologically relevant non-volatile fraction indicates that non-targeted monitoring strategies might be valuable to ensure relevant DBP monitoring in the future. 


Introduction
Chemical disinfection is an important drinking water treatment step used to inactivate pathogens, limit microbial regrowth in the distribution network, and prevent the spread of waterborne diseases worldwide. However, the use of disinfectants such as chlorine, chloramine, chlorine dioxide and ozone leads to the formation of disinfection by-products (DBPs), many of which are toxic. 1 Exposure to DBPs has been linked to an increased risk of e.g. bladder cancer, miscarriages, and birth defects. 2,3 The formation pathways of DBPs are complex and the various chemical disinfectants are considered to produce distinct sets of compounds upon reaction with natural organic matter (NOM) or anthropogenic organic compounds, with a certain overlap. 4,5 Other than disinfectant type, formation of DBPs depends on the concentration and characteristics of NOM, water temperature, pH, disinfectant dosage, and contact time. 6,7 For example, speciation of free chlorine in water is highly dependent on pH, and reactive species can shift from HOCl at lower pH (e.g. pH 5) to OCl− at pH 7.5 and higher. DBPs are typically chlorinated organic compounds with various chemical structures, but bromide or iodide, when present in the source water, might lead to the formation of brominated or iodinated DBPs. 4,8-10 To date, approximately 700 different DBPs have been identified. 5 Only eleven of these, five haloacetic acids (HAA5), four trihalomethanes (THM4), bromate, and chlorite, are typically regulated. 4 The diversity of DBPs formed makes effective monitoring challenging. 1,11 Typically, only the regulated DBPs are monitored, e.g. THM4 and HAA5, which provides very limited information on the overall DBP exposure. Furthermore, consideration of differences in toxicity among the DBPs formed is critical for a relevant DBP assessment.
Bioluminescence inhibition assays indicate that DBPs in the non-purgeable fraction (in the specific study defined as the total amount of adsorbable organic halogens (AOX) present after purging the sample with nitrogen for 30 minutes) are of higher toxicological relevance than DBPs in the purgeable fraction. 12 In the same study, AOX and a range of known DBPs were analysed. Most of the AOX in the purgeable fraction, i.e. semi- to highly volatile compounds, could be explained by known DBPs, while less than 16% could be explained in the non-purgeable fraction, i.e. non-volatile to semi-volatile compounds, demonstrating the lack of available information on DBPs in this pool. 12 Recent work summarizing the challenges and opportunities within DBP research points towards the need to find the key "toxicity drivers" that can explain the observed risk for bladder cancer, so that efforts to minimize exposure focus on the relevant targets. 13 A number of studies have analyzed the non-volatile fraction of DBPs using non-target approaches; some have focused on lab experiments, 14-18 and a few on real waterworks. 19-21 Previous studies indicate that there is variation and overlap of non-volatile DBPs formed at different waterworks. 19-21 However, sampling in those studies was restricted to a single occasion, and the degree of similarity or variability between water treatment plants and treatment methods remains inconclusive. Expanding the knowledge about compositional variability of non-volatile DBPs is important to link DBP formation to water treatment conditions and to evaluate remaining toxicity caused by DBPs. This study was undertaken to investigate the formation of non-volatile DBPs, covering a full seasonal cycle in four different waterworks, using different raw water sources, combinations of treatment steps, and disinfectants.

Methodology
A non-target approach was chosen to detect both known and unknown DBPs. Ultrahigh resolution Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) is well known for its high mass accuracy and mass resolution, and was considered the best option for screening of the elemental composition of readily ionizable compounds at low levels of abundance. 22,23 FT-ICR MS can provide elemental compositions of several thousand individual compounds in complex mixtures. 24-26 Electrospray ionization FT-ICR MS operated in negative mode has already been shown to successfully determine new DBPs. 16,19-21 Importantly, due to selectivity in sample preparation and susceptibility of ionization, the compounds observed with this approach typically include non-volatile, hydrophobic to semi-polar, oxygen-containing compounds, 27,28 e.g. containing carboxyl or hydroxyl groups.

Treatment processes at the four waterworks
Four Swedish waterworks were investigated: Berggården, located in Linköping (LIN), using water from the river Motala ström; Borg, located in Norrköping (NOR), using water from the same river as LIN but approximately 50 km downstream, after having passed two lakes; Görväln, located in Stockholm (STO), using water from lake Mälaren; and Bulltofta, located in Malmö (MAL), using groundwater from the Grevie aquifer. These waterworks use different combinations of treatment steps and chemical disinfectants (Fig. 1). At LIN, rapid and slow sand filtration is followed by UV disinfection and hypochlorite addition. At NOR, coagulation with Al2(SO4)3, flocculation and sedimentation (these processes are summarized as coagulation in Fig. 1) is followed by GAC (granular activated carbon) filtration, slow sand filtration and chloramination. Ammonia and hypochlorite in excess are consecutively added to the water stream, producing a small primary disinfection effect. However, the free chlorine (i.e. HOCl and OCl−) is consumed rapidly and the finished water featured combined chlorine only (i.e. primarily monochloramine, NH2Cl). At STO, the treatment steps include coagulation (Al2(SO4)3), flocculation and sedimentation (summarized as coagulation in Fig. 1), followed by rapid sand filtration, GAC, UV disinfection and chloramination with preformed monochloramine. At MAL, aeration is followed by water softening, rapid sand filtration and chloramination. For chloramination, ammonia and hypochlorite are added separately to the water stream, at the same time, in proportions forming monochloramine (i.e. hypochlorite is not added in excess as in NOR). Halfway into the sampling campaign, after three sampling events, UV disinfection was installed prior to chloramination. Table 1 presents basic water characteristics relevant to the DBP formation process at the four waterworks.
pH was measured at room temperature within six hours after sampling, dissolved organic carbon (DOC) was measured using the nonpurgeable organic carbon (NPOC) method at the accredited lab associated with each treatment plant and total chlorine (as Cl 2 ) was measured on-line. Temperatures fluctuated more in LIN and NOR, compared to STO and were almost constant in MAL. pH was 8.0-8.5 at the point of disinfection at all waterworks except STO, where pH was 6.7-7.0. At STO, pH was raised to 8.0-8.5 before water was distributed. Total chlorine residual levels peaked in August for LIN and in November for NOR and STO, respectively. At MAL, all parameters measured were relatively constant throughout the year.

Sampling and solid phase extraction
Duplicate samples from the four waterworks, including raw water and water before and after chlorination/chloramination, were collected on five occasions throughout one year: in March, May, August, and November 2016, and January 2017 (months are abbreviated Mar, May, Aug, Nov and Jan in figures and tables). The water samples (5 L) were collected in amber glass bottles and filtered immediately after sampling (GF/F, pore size 0.7 μm, Whatman) into another set of amber glass bottles. Before the first sampling event, glass bottles were cleaned three times with methanol, and between sampling events they were washed three times with Milli-Q water. 250 mL of the filtered water was stored for DOC analysis and the remainder was used for extraction, which was initiated within seven hours after sample collection. To avoid potential effects from NOM interactions and interferences upon FT-ICR MS analysis, no agents were added to quench residual chlorine. For the extraction, a volume of 4 L filtered water was adjusted to pH ≈ 2.5 using 3 M HCl, prepared from 32% hydrochloric acid (puriss P.A.) and ultrapure water (spectrophotometric grade). Water samples were connected via polytetrafluoroethylene (PTFE) tubes to Bond Elut PPL cartridges (modified styrene divinylbenzene polymer, 1 g, 6 mL cartridge, Agilent Technologies, acquired from Scantec, Partille, Sweden) on a vacuum manifold (Standard 24-port, 57250-U, Sigma-Aldrich). The extraction was driven by a peristaltic pump (Vantage 3000 C S10, Svenska pump AB) creating vacuum in the manifold while pumping out water. The flow rate was kept at 20 mL min−1 or slightly below. Prior to extraction, cartridges were conditioned with methanol (10 mL, LC-MS Ultra CHROMASOLV®) and acidified ultrapure water (10 mL, pH 2.5, spectrophotometric grade). After loading, the cartridges were washed using pure water with 0.1% formic acid (10 mL, LC-MS Ultra CHROMASOLV®) to eliminate ions that might form adducts (e.g. Cl−) and hence could interfere during FT-ICR MS analysis. Cartridges were then dried for 15 seconds using nitrogen gas (except for STO, where nitrogen gas was not available, and air with a hydrocarbon trap was used instead). The extracts were eluted with methanol (10 mL, LC-MS Ultra CHROMASOLV®), collected in glass vials (22 mL), and stored at −20 °C until FT-ICR MS analysis in April 2017. A schematic presentation of the steps involved in extraction with Bond Elut PPL is found elsewhere. 27

FT-ICR MS analysis
To characterize DBPs formed at the four waterworks, a Bruker Solarix 12 T FT-ICR MS, operating with a negative mode electrospray ionization ESI(−) source, was used. The methanol extracts were diluted to a DOC concentration of ∼3.5 μg mL−1 to prevent overload of the ICR cell, which can lead to peak splitting and other interferences between ions due to space charge processes. The spray voltage was set to −3.6 kV and the flow rate to 2 μL min−1. For each spectrum, 300 scans were acquired over a mass range of m/z 122 to 1500. Blank methanol samples were run between the samples from each of the four waterworks. Spectra were internally calibrated using a reference mass list of common natural organic molecules (mass accuracy < 0.2 ppm error). To assign molecular formulae to m/z ions in the mass spectrum, in-house software developed at Helmholtz Center Munich (Germany) was used, limiting the search to the following chemical elements (allowed atom counts in parentheses): 12C (0-100), 1H (0-∞), 16O (0-80), 14N (0-3), 32S (0-2), 35Cl (0-3) and 79Br (0-3). Iodine was not included, because an initial search among these samples revealed no presence of iodinated DBPs. In the text, molecular compositions containing C, H and O atoms are referred to as CHO formulae, regardless of the number of C, H and O atoms. However, the number of chlorine and bromine atoms of the DBPs is specified, e.g. CHOCl refers to any CHO molecular composition with a single chlorine atom and CHOBr2 refers to any CHO molecular composition with two bromine atoms, and so on.
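The 0.2 ppm mass accuracy criterion can be illustrated with a short sketch. This is not the in-house assignment software; the function names are hypothetical, the monoisotopic masses are standard reference values rounded to eight decimals, and singly charged [M−H]− ions are assumed.

```python
# Monoisotopic masses in Da (standard reference values, rounded).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462,
        "Cl": 34.96885268, "Br": 78.9183371}
ELECTRON = 0.00054858  # electron mass in Da


def deprotonated_mz(formula):
    """Theoretical m/z of the singly charged [M-H]- ion of a neutral formula.

    `formula` is a dict of atom counts, e.g. {"C": 10, "H": 9, "O": 5, "Cl": 1}.
    """
    neutral = sum(MASS[el] * n for el, n in formula.items())
    # Losing a proton equals losing a hydrogen atom and keeping its electron.
    return neutral - MASS["H"] + ELECTRON


def ppm_error(measured_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(measured_mz, formula, tol_ppm=0.2):
    """True if the measured m/z matches the formula within tol_ppm."""
    return abs(ppm_error(measured_mz, deprotonated_mz(formula))) <= tol_ppm
```

At m/z 243, a tolerance of 0.2 ppm corresponds to roughly 0.00005 Da, which shows why ultrahigh resolution is needed to separate candidate formulae.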

Data analysis
2.5.1 Data filtration and verification. Assigned molecular formulae were filtered based on intensity (total ion count (TIC) > 3 000 000), mass error (<0.2 ppm), and the nitrogen rule. The nitrogen rule was applied to remove false assignments and states that nitrogen-containing ions with even mass will have odd numbers of nitrogen atoms, and vice versa. 24 If this was not the case, candidate CHNO compounds were removed. The remaining molecular formulae were further processed to identify and verify chlorinated and brominated DBPs based on four steps. First, the proportions of C, H and O were used to discard unrealistic combinations. The requirements for keeping molecular formulae were: C, O and H > 0, O/C ratio ≤ 1, H/C ratio ≤ 2.5, and double bond equivalents (DBE) ≥ 0. DBE is defined as the total number of rings and double bonds in a molecule and was computed based on the number of atoms and their valence. 29 As this study focuses on halogenated DBPs, the second step removed molecular formulae without Cl or Br. In the third step, the preliminary molecular formula assignments of the chlorinated and brominated compounds were verified using the predictable sets of m/z ions for the same molecular formulae having different combinations of stable halogen isotopes (in this case 35/37Cl and 79/81Br), which is typically referred to as isotope simulation matching. Here, the m/z ions with the more common stable halogen isotope 35Cl should co-occur with a proportionally less intense m/z ion with the 37Cl isotope, whereas m/z ions containing 79Br and 81Br should show near identical mass peak amplitude. Molecules containing both chlorine and bromine, as well as those containing more than a single halogen atom, show predictable patterns of isotopomers as well. 30 Only DBP molecular formulae for which both the 35Cl and 37Cl isotopes (for chlorine-containing DBPs) and both the 79Br and 81Br isotopes (for bromine-containing DBPs) were detected were accepted as verified DBPs. This approach is conservative, as true halogenated compounds were rejected when the corresponding isotope m/z ions were below the detection limit. In the fourth step, the molecular formulae remaining after steps 1-3 that were solely detected after chemical disinfection were regarded as verified DBPs. Very few (1-4%) of the verified DBPs contained nitrogen or sulphur. In order to reduce the complexity of data analysis, only DBPs containing carbon, hydrogen, oxygen, chlorine and/or bromine (CHOCl(Br)-DBPs) were included for the comparative analysis in this study.
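The four verification steps can be sketched as follows. This is a minimal illustration, not the processing used in the study: the `Formula` container and all function names are hypothetical, DBE uses the common valence-based expression with halogens counted like hydrogen, and the 0.2 ppm window for locating the heavy-isotope peak is an assumption borrowed from the mass-error filter. Only the presence of the heavy-isotope peak is checked, not its relative intensity.

```python
from dataclasses import dataclass

# Mass differences between heavy and light stable halogen isotopes (Da).
DELTA_CL = 36.96590259 - 34.96885268   # 37Cl - 35Cl
DELTA_BR = 80.9162906 - 78.9183371     # 81Br - 79Br


@dataclass(frozen=True)
class Formula:
    """Neutral elemental composition (hypothetical container)."""
    C: int
    H: int
    O: int
    N: int = 0
    Cl: int = 0
    Br: int = 0


def dbe(f):
    # Valence-based DBE; monovalent halogens are counted like hydrogen.
    return 1 + f.C - (f.H + f.Cl + f.Br) / 2 + f.N / 2


def plausible(f):
    """Step 1: C, H, O > 0, O/C <= 1, H/C <= 2.5 and DBE >= 0."""
    return (min(f.C, f.H, f.O) > 0 and f.O / f.C <= 1
            and f.H / f.C <= 2.5 and dbe(f) >= 0)


def halogenated(f):
    """Step 2: keep only formulae containing Cl or Br."""
    return f.Cl + f.Br > 0


def has_partner(mz, delta, detected, tol_ppm=0.2):
    """True if a peak exists at mz + delta within tol_ppm."""
    target = mz + delta
    return any(abs(p - target) / target * 1e6 <= tol_ppm for p in detected)


def isotope_verified(f, mz, detected, tol_ppm=0.2):
    """Step 3: each halogen present must show its heavy-isotope peak."""
    ok_cl = f.Cl == 0 or has_partner(mz, DELTA_CL, detected, tol_ppm)
    ok_br = f.Br == 0 or has_partner(mz, DELTA_BR, detected, tol_ppm)
    return ok_cl and ok_br


def verified_dbps(candidates, detected, before):
    """Apply steps 1-4.

    candidates: dict mapping Formula -> light-isotope m/z after disinfection
    detected:   all m/z values detected in the same spectrum
    before:     set of Formula objects also assigned before disinfection
    """
    return [f for f, mz in candidates.items()
            if plausible(f) and halogenated(f)
            and isotope_verified(f, mz, detected)
            and f not in before]          # step 4
```

The sketch also makes the conservative nature of step 3 visible: a true chlorinated formula is dropped as soon as its 37Cl peak falls below the detection limit and is therefore missing from `detected`.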
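2.5.2 Data processing and analysis. Verified DBPs formed at the four waterworks were visualized using van Krevelen diagrams, H/C to mass and modified Kendrick plots, described in more detail elsewhere. 31,32 In short, van Krevelen diagrams visualize molar ratios of hydrogen to carbon (H/C) plotted against those of oxygen to carbon (O/C), which provides information about the degree of saturation and oxygenation (which relates to oxidation) for each assigned molecular formula. Plotting H/C ratios against mass complements common van Krevelen diagrams by showing the mass distribution as well. Kendrick mass defect (KMD) is used to organize the molecular formulae in homologous series, in which formulae are related through differences in molecular entities such as CH2, CHOO, C2H2 and H2. 33 In this study, the homologous series were created based on CH2, which means that compounds that have the same number of double bond equivalents (DBE, which may represent rings and double bonds) and heteroatoms (limited to oxygen in this case), but different numbers of CH2 entities, will have the same Kendrick mass defect. The z-score (z*) is another independent parameter that describes homologous series, 34 and the combined KMD and z* diagram was used to create a modified diagram, −KMD/z* plotted against mass. 31 This diagram shows differences in counts of CH2 along the x-axis, nominal exchange of CH4 against O along the y-axis and nominal exchange of CH4 against H2 along diagonals. For further characterization of DBPs, various indices were computed, describing double bond equivalence (DBE), aromaticity index (AI_mod), and average oxidation state of carbon (C_OS). DBE and AI_mod were computed according to Koch and Dittmar (2006) 29 and C_OS was calculated as eqn (1), where n_H is the number of hydrogen atoms (neutral form), and n_O, n_Cl, n_Br and n_C are the counts of oxygen, chlorine, bromine and carbon atoms, respectively.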
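Eqn (1) is not reproduced in this text. A form consistent with the symbols listed, assuming the usual oxidation states (H +1, O −2, Cl and Br −1), would be C_OS = (2 n_O + n_Cl + n_Br − n_H) / n_C. The sketch below uses that assumed expression together with a CH2-based Kendrick mass defect. The sign convention for KMD (nominal Kendrick mass minus Kendrick mass) and all function names are illustrative choices, not taken from the study.

```python
# Monoisotopic masses in Da (standard reference values, rounded).
MASS = {"C": 12.0, "H": 1.00782503, "O": 15.99491462,
        "Cl": 34.96885268, "Br": 78.9183371}
CH2_EXACT = MASS["C"] + 2 * MASS["H"]   # exact mass of CH2, about 14.01565
CH2_NOMINAL = 14.0


def exact_mass(f):
    """Monoisotopic mass of a neutral formula given as a dict of counts."""
    return sum(MASS[el] * n for el, n in f.items())


def kendrick_mass_defect(f):
    """CH2-based KMD, taken here as nominal Kendrick mass minus Kendrick mass."""
    km = exact_mass(f) * CH2_NOMINAL / CH2_EXACT
    return round(km) - km


def carbon_oxidation_state(f):
    """Average carbon oxidation state under the assumed form of eqn (1)."""
    def n(el):
        return f.get(el, 0)
    return (2 * n("O") + n("Cl") + n("Br") - n("H")) / n("C")
```

Two formulae that differ only by CH2 units, for example C10H10O5 and C11H12O5, give the same KMD and therefore fall on one horizontal line of a Kendrick plot, while removing H2 shifts the KMD.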
Data processing was focused on the presence or absence of individual verified DBP formulae, disregarding differences in relative intensities. To analyze the compositional variation of DBPs formed at the four waterworks in a clear visual way, series of Venn diagrams were created by performing multiple comparisons of individual DBPs formed at each plant. Comparisons were made (1) for each sampling month separately and (2) by combining all DBPs formed throughout the five sampling events. For comparisons between the four waterworks, 15 segments built up the Venn diagram, including both unique and shared DBPs. The data processing was performed in Matlab 2017 using the assigned elemental formula as the variable for sorting. Among duplicate FT-ICR MS spectra from each extract, there was some variability in the amount of organic material injected, leading to variability in whether the lowest-intensity peaks exceeded the threshold used as detection limit. Hence, we observed some variability among duplicates in the number of detected and verified DBPs, caused by the low relative intensity of many of the alternative stable isotopic composition peaks used for verification. Overall, this verification method also underestimated the number of DBPs detected because the verification peaks in many cases could not reach the specified detection limit. Hence, and due to the stringency of this verification approach, all peaks that could be verified were accepted, and as a consequence the duplicate yielding the spectrum with the highest intensities was used. Relative mass peak intensities were not used in the data analysis, except to compute weighted average values of elemental compositions and indices. However, relative mass peak intensities were considered indirectly through the isotope verification filter.
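The presence/absence comparison behind the Venn diagrams can be sketched in a few lines. The study used Matlab; the Python below is only an illustration with hypothetical function names, and the formula strings in the toy input are made up rather than taken from the data. Four plants give 2^4 − 1 = 15 possible non-empty segments.

```python
from collections import defaultdict


def venn_segments(plants):
    """Group formulae by the exact set of plants in which they occur.

    `plants` maps a plant label to the set of DBP formulae verified there.
    """
    found_in = defaultdict(set)
    for plant, formulae in plants.items():
        for formula in formulae:
            found_in[formula].add(plant)
    segments = defaultdict(set)
    for formula, members in found_in.items():
        segments[frozenset(members)].add(formula)
    return dict(segments)


def percent_unique(plants, plant):
    """Share of a plant's formulae that occur in no other plant."""
    own = plants[plant]
    others = set().union(*(s for p, s in plants.items() if p != plant))
    return 100 * len(own - others) / len(own)


# Toy input (made-up formula strings, not data from the study).
toy = {
    "LIN": {"C10H9O5Cl", "C11H11O5Cl", "C12H12O6Cl2"},
    "NOR": {"C10H9O5Cl", "C12H12O6Cl2", "C13H13O6Cl"},
    "STO": {"C10H9O5Cl", "C14H15O7Cl"},
    "MAL": {"C10H9O5Cl", "C12H11O5Br"},
}
segments = venn_segments(toy)
```

Because the elemental formula is the only key, isomers that share a formula are counted as one entry, which matters for the interpretation of shared raw water formulae.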
Table 2 presents the number of total and verified DBP molecular formulae together with average elemental compositions, elemental ratios, and various index values (all weighted against relative abundance) of verified DBPs formed at the four waterworks throughout the five sampling events. The verification process reduced the number of DBPs to 25-35% of the total number of halogenated CHO formulae found (Table 2). A large fraction of the non-verified halogenated compounds was positioned along with and in the outer range of the verified DBPs in the van Krevelen diagram, and likely had too low intensity for the alternative isotope to be found. Another group of potential DBP signatures among the non-verified DBPs was characterized by a high mass and few oxygen atoms, and was found both before and after disinfection.

DBP composition
Table 2 Average values, weighted by relative abundance, of verified DBPs (both chlorinated and brominated CHO molecular formulae selected based on criteria described in section 2.5.1) in neutral form (the mass of a proton added) computed from negative electrospray (ESI) 12 T FT-ICR mass spectra

A summary of the verified DBP types formed at the four waterworks is shown in Fig. 2. A full list of the verified DBP formulae is provided in the ESI † (Table S1). DBPs show profound differences between the waterworks, which is also reflected in the relative abundance of chlorine and bromine containing molecular compositions among verified DBPs (Table 2). Primarily, CHO molecular compositions with one chlorine, referred to as CHOCl formulae, were formed in LIN, NOR and STO, whereas CHO molecular compositions with one bromine, referred to as CHOBr formulae, were abundant in MAL. DBPs with two chlorine atoms were consistently formed in LIN and NOR, while few such components were found in the other plants. Tentatively, this could be related to disinfectant type since both LIN and NOR waters were exposed to free chlorine, which is a more effective halogenation agent compared to chloramine, facilitating the incorporation of multiple chlorine atoms. In previous DBP studies based on FT-ICR MS, CHOCl2 molecular formulae were frequently found at waterworks using NaOCl disinfection and not at those using chloramine, supporting this interpretation. 19-21 Overall, few CHOCl3 molecular formulae were found, even at LIN and NOR. For activated aromatic structures, such as phenolic compounds, that undergo electrophilic aromatic substitution reactions, the chlorination reaction rate decreases after each chlorine incorporation, which means that the conditions required for chlorine incorporation change during the continuous chlorination. 35 This may explain the low number of CHOCl3 molecular formulae found.
Possibly, CHOCl3 molecular formulae did form, but were further transformed to end-DBPs, such as THMs, through rapid hydrolysis reactions (more specifically referred to as the haloform reaction), and were no longer amenable to FT-ICR MS detection (due to losses of volatile compounds during sample preparation and because of their low molecular weight). Previous chlorination experiments on isolated NOM fractions prior to FT-ICR MS detection showed that a fraction constituting oxygenated, unsaturated compounds was the only fraction producing DBPs with multiple chlorine atoms, indicating that these differences are also linked to NOM characteristics. 14,36 When comparing those CHOCl2 formulae with the CHOCl2 formulae found in this study, 12 of 65 DBP formulae were common, indicating that part of the CHOCl2 formulae detected in this study was formed through reaction with similar precursors. 14 The bromide concentration in the raw water at MAL was 0.28 mg L−1, i.e. substantially higher compared to the other waterworks, explaining the formation of elevated levels of brominated DBPs there. In this case, bromide probably entered the groundwater from the old marine sediments dominating the region. 37 It is evident that the presence of bromide in the MAL water drives the formation towards brominated DBPs while minimizing the formation of chlorinated DBPs. This may be explained by the greater reaction rate constant of NH2Br, NHBr2 and NHClBr (which are formed through reactions between NH2Cl and Br−) to form Br-DBPs, compared to the reaction rate between NH2Cl and NOM to form Cl-DBPs, assuming slow reaction sites, 38 which are estimated to be the major reaction sites for Br-DBP formation and about 50% of Cl-DBP formation upon chloramination. 39 To further explain the concept of fast and slow reaction sites, halogen oxidants have been found to follow an initial rapid consumption phase, in which e.g. aromatic, phenolic compounds (fast reaction sites) react, followed by a slower consumption phase where NOM with e.g. electron withdrawing functional groups, such as carboxyl or carbonyl groups (slow reaction sites), reacts. 38 Also, the low DOC level in MAL results in a relatively high Br−/DOC ratio, which also favours the reaction pathway towards brominated DBPs. 6 CHOBr formulae were the dominant Br-DBP type detected, while few CHOBr2 or CHOBr3 formulae were found. This might be due to the lower substitution rates of the bromoamines and bromochloroamine species, compared to chlorine (HOCl/OCl−). 39 In general, very few combined chlorine-bromine molecular formulae were assigned and those formulae either had high H/C and low O/C ratios, suggesting aliphatic compounds, or constituted a simple composition, C5HO3ClBr2, which has been identified as 2,2,4-dibromochloro-5-hydroxy-4-cyclopentene-1,3-dione (HCD) in previous work. 40 The low number of verified DBPs determined for August and November in STO was caused by low mass spectral intensities of those samples, causing few DBP isotope mass peaks to be found. The low intensities were likely due to ion suppression or non-optimal dilution of those samples specifically. However, CHOCl and CHOBr formulae not present before disinfection were formed in STO also in Aug and Nov, but very few were verified with their corresponding isotope mass peaks.

DBP characteristics
The majority of verified DBP compositions were observed in the mass range 300-500 Da and distributed at O/C and H/C ratios of 0.3-0.7 and 0.7-1.4, respectively (Fig. 3). DBPs from the four waterworks were overall similar in average H, C and O elemental compositions (Table 2). However, DBPs formed in MAL (mostly brominated) showed higher relative abundances of carbon and higher DBE and AI_mod (see section 2.5.2) compared to those found in the other plants. The latter indicated higher proportions of aromatic, unsaturated compounds among the brominated DBPs. DBPs formed in LIN showed higher average mass (Table 2), compared to those found in NOR and STO. This might be explained by the narrower mass range (Fig. 3) and the larger relative abundance of highly oxygenated DBPs (10 and 11 oxygen atoms, ESI † Fig. S3) among LIN DBPs. MAL DBPs were distributed at higher masses in part due to the exchange of chlorine for bromine, and showed lower O/C ratios indicating a lower degree of oxygenation. The average carbon oxidation states (C_OS) of DBPs were low in both MAL and STO, compared to LIN and NOR, which may be related to the milder oxidant chloramine exclusively used in those plants. Generally, the CHOCl2 formulae were distributed at higher O/C ratios in a more restricted H/C space compared to the single chlorinated CHOCl formulae (Fig. 3). In addition, the CHOCl2 formulae showed high counts of DBE for low numbers of carbon (ESI † Fig. S1), indicating a larger degree of fused aromatic structures among those molecular formulae. In −KMD/z* plots, LIN and NOR showed different mass distributions along CH2-based molecular series not clearly visible in the mass-edited H/C ratio diagrams (Fig. 3), which might be indicative of different precursor material. DBPs with −KMD/z* above 0.15 were also plotted separately in a van Krevelen diagram, ESI † Fig. S2.
The CHOCl2 formulae are interesting since these compounds seem linked to chlorine disinfection specifically, as demonstrated by their abundance in LIN and NOR in this study, but also through comparisons with previous studies, see section 3.5. These CHOCl2 DBPs are distributed in two distinct regions of the modified Kendrick plots (Fig. 3), which indicates two groups of compositionally diverse molecules. One region is characterized by unsaturated compounds (lower region of the −KMD/z* plot) and the other by oxygenated but rather saturated compounds (higher region of the −KMD/z* plot). This indicates that the CHOCl2 formulae have been formed through reaction with two groups of precursors holding these properties.
One of the big challenges in DBP research is the great diversity of NOM precursors, which largely explains the great diversity of DBPs, making it difficult to explain more than a fraction of the total organic halogen (TOX) in a sample. 13 However, the patterns seen in the modified Kendrick plots indicate that the related DBPs formed within each plant arise from distinct groups of precursor materials.
The verified DBPs showed near congruent distributions (Fig. 3), in spite of the observed specific differences between waterworks, also when compared to previous studies. 20,21 Based on their elemental ratios, most of the observed DBPs can be referred to as polyphenol-like compounds; 24,32 however, FT-ICR MS is not capable of supplying structural information beyond what can be inferred from molecular formulae alone. DBPs with low H/C ratios (∼0.7-0.8) indicate a deficiency in H and suggest the presence of more condensed aromatic structures among these DBPs. The verified DBPs detected here typically have 12-20 carbon atoms and 6-11 oxygen atoms, where 8 oxygen atoms is most common when combining all DBPs formed, including both chlorinated and brominated species (ESI † Fig. S3 and S4). When comparing with a previous fractionation study (ESI † Table S5), this description of FT-ICR MS amenable DBPs is consistent with DBPs formed after chlorination of two distinct NOM fractions, 14 one characterized by aromatic compounds with high oxygen content (e.g. polyphenol-like compounds) and the other by compounds with larger carbon skeletons, typically having more double bonds than oxygen atoms (e.g. complex aromatic and aliphatic ring structures). 14,36

Unique and shared DBP formulae
The Venn diagrams in ESI † Fig. S5 show the number of unique DBP formulae formed at each of the four waterworks and the number of DBPs shared, i.e. common for two, three, or all four plants. These Venn diagrams are based on comparisons of DBP formulae formed at each sampling event separately. Considering chlorinated and brominated CHO molecular formulae formed at each sampling month, 64-92% of the verified DBPs formed were unique, i.e., found in one plant only (ESI † Fig. S6). It is striking that only a single DBP at one sampling date (May) was shared between all waterworks (ESI † Fig. S5). The common DBPs were primarily shared between LIN, NOR and STO in different combinations, most likely because of their similar raw water types (all use surface waters; LIN and NOR take water from the same river but at locations approximately 50 km apart; NOR and STO have similar water treatments which reflect a similarity of raw water characteristics). Of the four waterworks, STO and LIN showed the highest proportion of shared DBPs (ESI † Fig. S7). Notably, in LIN, more unique DBPs were formed in August compared with other months, e.g. characterized by O/C ∼ 0.4 and H/C ∼ 1.4. This might be explained by the higher chlorine residual (0.44 mg L −1 as Cl 2 ) in August enhancing DBP formation from precursors with slower reaction sites. Other factors that were different in August compared to the other months, such as temperature, could also have contributed to these results. We also investigated whether other sources of precursor material were available at this month specifically, e.g. algal metabolites, but there were no reports of algae in the source water of LIN during this time. In MAL, 84-98% of DBPs formed were unique, due to the high number of brominated DBPs. When combining all DBP formulae formed throughout the five sampling events prior to the comparison (Fig. 4) the distribution of unique and shared DBPs remained rather unchanged. 
However, the number of unique formulae decreased (to 56%) and 16 CHOCl formulae were now found to be common to all plants (a full list of unique and shared DBP formulae is provided in ESI † Table S2). Fig. S8-S12 in the ESI † visualize DBPs of the different segments of the Venn diagram (Fig. 4), including the unique DBPs for each treatment plant (ESI † Fig. S8 and S9) and the DBPs that were common to all four, three and two of the waterworks (ESI † Fig. S10-S12). Notably, the unique LIN DBPs appeared in a more confined chemical space, while unique NOR DBPs covered a larger area of the van Krevelen diagram, indicating a greater diversity of DBPs formed at NOR, most likely due to a greater diversity of raw water NOM molecules. At both LIN and NOR, CHOCl 2 formulae were part of the unique DBPs as well as shared between the two, which supports previous reasoning regarding connections of DBPs to free chlorine exposure in those plants. Interestingly, the DBPs shared between all waterworks occupied a well-defined region of the van Krevelen diagram, suggesting that these compounds were favourably formed even for different raw waters and treatments (Fig. 5). The modified Kendrick plot of these DBPs also showed a clear pattern, which indicates that these compounds might be related via a few nominal chemical transformations, including methylation/demethylation, alkyl chain elongation, and oxidation/reduction.
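The comparison underlying the Venn diagrams amounts to set operations on the assigned formulae. A minimal Python sketch is given below, assuming that each waterworks is represented by a set of formula strings. The formulae are invented placeholders, not entries from ESI † Table S2, and the function names are hypothetical.

```python
from collections import Counter

def pool_events(formulae_by_event):
    """Combine the formula sets of several sampling events into one set."""
    pooled = set()
    for formulae in formulae_by_event:
        pooled |= set(formulae)
    return pooled

def venn_summary(formulae_by_plant):
    """Count in how many plants each formula occurs and summarize the result."""
    n_plants = len(formulae_by_plant)
    occurrence = Counter()
    for formulae in formulae_by_plant.values():
        occurrence.update(set(formulae))  # each plant counts a formula once
    unique = {f for f, n in occurrence.items() if n == 1}
    shared_by_all = {f for f, n in occurrence.items() if n == n_plants}
    return {
        "n_formulae": len(occurrence),
        "unique_fraction": len(unique) / len(occurrence) if occurrence else 0.0,
        "shared_by_all": shared_by_all,
    }

# Placeholder data: LIN is pooled from two sampling events, the others are given directly.
plants = {
    "LIN": pool_events([{"C14H11O8Cl", "C13H8O7Cl2"}, {"C14H11O8Cl", "C15H13O8Cl"}]),
    "NOR": {"C14H11O8Cl", "C13H8O7Cl2", "C16H15O9Cl"},
    "STO": {"C14H11O8Cl", "C17H17O9Cl"},
    "MAL": {"C14H11O8Cl", "C14H11O8Br", "C15H13O8Br"},
}
summary = venn_summary(plants)
print(summary)
```

A formula detected at different plants in different months counts as unique in each per-event comparison but as shared once the events are pooled, which is one way the share of unique formulae can decrease after pooling.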

Do raw water NOM or treatment processes determine the DBP composition?
To investigate the origin of the DBP variation between the four plants, analogous comparisons were performed using raw water CHO molecular formulae from each water treatment plant. Interestingly, these Venn diagrams (ESI † Fig. S13) showed opposite trends, displaying a large portion of shared formulae in the case of CHO compounds as opposed to DBPs (cf. above). For all five occasions of sample collection, 77-87% of the CHO formulae present in the raw waters were common to two or more plants and the majority were shared between all four plants (Fig. S13 †). However, it is very likely that several isomers with identical molecular composition exist in one sample, particularly among the highly numerous CHO compounds. 26 Consequently, it is possible that variations in the observed DBP composition arose from differences in isomeric structures of NOM and their relative abundance in the four waterworks. Therefore, DBP diversity could reflect the actual NOM molecular diversity in a more comprehensible way than the highly convoluted mass spectra of NOM itself would show. This might apply to both raw waters (Fig. S13 †) and waters right before the point of chemical disinfection, where a similar pattern was observed, i.e. the majority of CHO formulae were shared between all four plants (Fig. S14 †). A less probable alternative is that the unique sets of predisinfection CHO formulae produced the unique sets of DBPs. This hypothesis was investigated by back-calculating the respective CHO precursors from the verified DBPs formed at each plant, each month separately, assuming both substitution and addition reactions. When comparing each of the 20 Venn diagram segments of unique CHO formulae with the 20 Venn diagram segments of unique back-calculated CHO precursors (based on verified DBPs), only seven formulae of 689 matched, lending little support to this hypothesis.
Instead, a large portion (70-100%) of the potential CHO precursors, determined from unique DBP formulae, was found among the CHO formulae shared between all four plants. Thus, information on the structural features of NOM will be necessary to complement NOM compositional characteristics in order to fully understand the variation of DBPs. However, obtaining such information is beyond the scope of this study.
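The back-calculation of CHO precursors from DBP formulae can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the procedure used in this study: substitution is taken as replacement of one H by each halogen (precursor = DBP − X + H), addition is taken as addition of HOX across a double bond (precursor = DBP − HOX), and all halogens in a formula are assumed to have entered by the same route. The source does not specify the addition reaction, and the formulae and function names below are hypothetical.

```python
import re

def parse_formula(formula):
    """Parse a simple formula string such as 'C14H10O8Cl2' into element counts."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def candidate_precursors(dbp_formula):
    """Return candidate CHO precursor formulae for a chlorinated/brominated CHO formula."""
    counts = parse_formula(dbp_formula)
    c, h, o = counts["C"], counts.get("H", 0), counts.get("O", 0)
    n_x = counts.get("Cl", 0) + counts.get("Br", 0)
    # Assumed substitution route: each halogen replaced one hydrogen atom.
    candidates = {"substitution": f"C{c}H{h + n_x}O{o}"}
    # Assumed addition route: each halogen entered as HOX, so remove H and O per halogen.
    if h >= n_x and o >= n_x:
        candidates["addition"] = f"C{c}H{h - n_x}O{o - n_x}"
    return candidates

def match_precursors(dbp_formulae, cho_formulae):
    """Keep only candidate precursors that occur in a set of observed CHO formulae."""
    matches = {}
    for dbp in dbp_formulae:
        hits = {route: precursor
                for route, precursor in candidate_precursors(dbp).items()
                if precursor in cho_formulae}
        if hits:
            matches[dbp] = hits
    return matches

# Placeholder CHO pool, written in the same 'CxHyOz' format the functions produce.
cho_pool = {"C14H12O8", "C15H14O8"}
matches = match_precursors(["C14H10O8Cl2", "C15H13O8Br"], cho_pool)
print(matches)
```

Running the same matching against the unique and the shared segments of the CHO Venn diagrams separately would give the kind of comparison described above.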

Comparisons with previous studies
The verified DBP formulae found in this study were also compared to those identified in previous studies, summarized in ESI † Tables S3-S5. It is important to note that two different types of SPE cartridges have been used to extract NOM in these studies, Bond Elut PPL and Sep-Pak C18 (ESI † Tables S3 and S4). Comparative analysis of extracted Suwannee River NOM has recently demonstrated fair congruence of PPL- and C18-derived ESI-FT-ICR mass spectra. 28 However, differences in selectivity might affect the range of DBPs captured with the different cartridges. Of the DBPs detected in our study, 23% were found by Gonsior et al. (2014) 20 and 20% by Lavonen et al. (2013), 21 both assessing DBP formation in other Swedish waterworks (ESI † Table S3). The DBPs common to all three studies were primarily CHOCl 2 formulae and were typically shared between waterworks using chlorine (NaOCl) disinfection. Of the DBPs found in this study, 8% were also found in a Chinese drinking water treatment plant using chlorination (Table S3 †). 19 All of these common DBPs were CHOCl 2 formulae. Comparing with formation potential experiments using high-dose chlorination of three Chinese source waters 16 and of Suwannee River fulvic acid standards in the presence of bromide, 17 63% and 93% of the DBPs found in this study, respectively, were also detected in those experiments (ESI † Table S4). All 16 DBP compositions that were shared between all plants in this study were also found in the latter study. 17 Of the brominated DBPs found in this study, 49% were also formed upon high-dose chloramination of Suwannee River fulvic acid (in the presence of bromide) (ESI † Table S4). 15 The comparisons described above demonstrated a stronger link between different waterworks that use chlorine as compared to those using chloramine, regarding the formation of CHOCl 2 formulae, indicating that the variation of CHOCl 2 DBPs is smaller than that of CHOCl DBPs.
One reason for the matching CHOCl 2 formulae (and not CHOCl) might be that a more specific precursor site is needed to kinetically favour the incorporation of multiple chlorine atoms, e.g. phenols with multiple hydroxyl groups or compounds with a methylene group between two carbonyl groups. 35,41 The high-dose formation potential experiments with Suwannee River fulvic acid in the presence of bromide (1848 DBP compositions in total) yielded almost the entire set of verified DBP formulae found in this study (360 different DBPs), formed both through chlorine and chloramine disinfection. 17 This is logical, because in high-dose chlorination experiments, all potential DBPs that can result from the precursor material are expected to form due to the large excess of chlorine, and the Suwannee River may have highly diverse NOM including precursors for most possible DBPs. In the Swedish waterworks, the aim is to minimize DBP production by removing NOM in upstream treatment processes and by minimizing the dose of reactive chlorine, resulting in fewer DBPs being formed than in high-dose experiments on the full range of NOM. Instead, local conditions steer and limit DBP formation, e.g. through the abundance and reactivity of specific NOM structures, variations in bromide levels and disinfectant type, driving the formation of waterworks-specific DBPs.

Conclusions
Through non-target analysis and comparison at the molecular level, this study has brought qualitative insights into DBP formation and the presence of common versus plant-specific fingerprints of DBPs formed at four Swedish waterworks. Three of the four waterworks investigated in this study do not produce detectable levels of THMs, which are regulated in Sweden. However, our non-target approach detected a clear presence and a large variability of DBPs in all treatment plants. Given that non-volatile DBPs appear most relevant in terms of health concern, some kind of screening or non-target approach might be required for relevant monitoring due to the highly waterworks-specific DBP composition (as demonstrated in this study). Such monitoring might be achieved by adopting suspect screening, where relevant masses are monitored at a broader scale using high resolution mass spectrometry, or through bioanalytical approaches where the toxic effect from the sum of DBPs is assessed. These approaches also have the potential to offer simultaneous monitoring of DBPs and other non-volatile organic compounds of health concern. However, further knowledge about how this complex DBP mixture changes on its way to the consumer's tap is needed to identify the optimal targets or "non-targets" to reduce human DBP exposure.

Conflicts of interest
There are no conflicts of interest to declare.