Alexandra K. Richardson a, Marcus Chadha b, Helena Rapp-Wright ac, Graham A. Mills d, Gary R. Fones d, Anthony Gravell e, Stephen Stürzenbaum a, David A. Cowan a, David J. Neep f and Leon P. Barron *ag
aDept. Analytical, Environmental & Forensic Sciences, School of Population Health & Environmental Sciences, Faculty of Life Sciences & Medicine, King's College London, 150 Stamford Street, London, SE1 9NH, UK
bAgilent Technologies UK Limited, 5500 Lakeside, Cheadle, SK8 3GR, UK
cSchool of Chemical Sciences, Dublin City University, Glasnevin, Dublin 9, Ireland
dFaculty of Science and Health, University of Portsmouth, White Swan Road, Portsmouth, PO1 2DT, UK
eNatural Resources Wales, Faraday Building, Swansea University, Singleton Campus, Swansea SA2 8PP, UK
fAgilent Technologies UK Limited, Church Stretton, Essex Road, SY6 6AX, UK
gEnvironmental Research Group, School of Public Health, Faculty of Medicine, Imperial College London, 80 Wood Lane, London W12 7TA, UK
First published on 30th December 2020
A novel and rapid approach to characterise the occurrence of contaminants of emerging concern (CECs) in river water is presented using multi-residue targeted analysis and machine learning-assisted in silico suspect screening of passive sampler extracts. Passive samplers (Chemcatcher®) configured with hydrophilic–lipophilic balanced (HLB) sorbents were deployed in the Central London region of the tidal River Thames (UK) catchment in winter and summer campaigns in 2018 and 2019. Extracts were analysed by: (a) a rapid 5.5 min direct injection targeted liquid chromatography-tandem mass spectrometry (LC-MS/MS) method for 164 CECs and (b) a full-scan LC coupled to quadrupole time-of-flight mass spectrometry (QTOF-MS) method using data-independent acquisition over 15 min. From targeted analysis of grab water samples, a total of 33 pharmaceuticals, illicit drugs, drug metabolites, personal care products and pesticides (including several EU Watch-List chemicals) were identified, and mean concentrations determined at 40 ± 37 ng L−1. For targeted analysis of passive sampler extracts, 65 unique compounds were detected with differences observed between summer and winter campaigns. For suspect screening, 59 additional compounds were shortlisted based on mass spectral database matching, followed by machine learning-assisted retention time prediction. Many of these included additional pharmaceuticals and pesticides, but also new metabolites and industrial chemicals. The novelty in this approach lies in the convenience of using passive samplers together with machine learning-assisted chemical analysis methods for rapid, time-integrated catchment monitoring of CECs.
The majority of studies characterising CEC occurrence in aquatic media have focussed on the use of grab or composite sampling. Whilst these methods enable near real-time monitoring of CECs, they require time- and labour-intensive monitoring campaigns using repeated sampling to capture the breadth of CEC occurrence and their fluctuation. As an alternative, passive sampling enables time-weighted average occurrence characterisation over extended periods. Analyte accumulation in the sorbent can also improve the analytical performance through enhanced sensitivity using appropriately selective chemistries. However, passive samplers often fail to capture pulsed sources of CECs. Several different passive sampling approaches and formats exist. For CECs, mixed-mode sorbents within metal or plastic housings configured with porous membranes are popular, including polar organic chemical integrative samplers (POCIS) and Chemcatcher® devices.
With the advent and increased availability of full-scan high-resolution accurate mass spectrometry (HRMS), the potential for simultaneous targeted, untargeted and suspect screening of environmental samples for larger numbers of CECs in all environmental compartments has been realised.5,6 For the latter in particular, suspect screening with HRMS now offers the ability to retrospectively interrogate acquired full-scan sample data to potentially identify additional compounds post hoc. However, by comparison with its application to water (e.g., either directly or following solid-phase extraction),5,7 few reports of untargeted or suspect screening of passive sampler extracts exist for CECs. Soulier and colleagues analysed POCIS extracts for CECs and demonstrated occurrence of ∼30 industrial chemicals, pesticides, pharmaceuticals and personal care products across two selected sites in France using database matching by retention time and HRMS.6 The likelihood of larger numbers of contaminants being present in passive sampler extracts is high, as demonstrated by the complexity of the untargeted data analysis subsequently performed by these authors. Rimayi et al. recently used Chemcatcher® samplers for suspect screening of CECs in South Africa, revealing the occurrence of >200 compounds including general medicines and psychotropic compounds in wastewater impacted river catchments, of which ∼180 were detected for the first time.7 Again, suspect matching was performed using large databases incorporating accurate mass ±5 ppm, isotopic fit and ±0.5 min retention time thresholds. However, in many cases, retention data is either not available for such large numbers of compounds in databases, or the analytical methods used do not match the chromatographic datasets, rendering them unusable for matching. 
For unknowns, HRMS allows the collection of full-scan data at high sensitivity, mass accuracy and resolution,8–10 enabling in silico tentative identification to be performed in many cases, either by exact mass matching or through comparison with accurate-mass databases.10,11 Current mass spectral libraries are extensive, containing reference data for thousands of compounds, thus allowing a single sample to be screened and a list of potentially matching contaminants to be delivered in a relatively short period of time. However, in many cases, identification of suspects using HRMS libraries still requires a reference chromatographic retention time for comparison. Obtaining reference standards for suspect compounds can be costly and these are not always commercially available, particularly for metabolites or transformation products.12 In these circumstances, predictive retention time models have recently been shown to be useful tools to raise assurance much further for shortlisted suspect contaminants identified using HRMS spectral libraries where retention data does not exist or is not usable. For example, previous work in our laboratory showed that the application of retention time prediction reduced the number of suspects shortlisted in untreated wastewater by one third, allowing prioritisation for reference standard purchase.13 In this way, the combination of a suitably accurate matched predicted retention time and appropriate MS criteria could arguably elevate a lower level match to Level 2(a) “probable structure” classification according to the widely adopted framework proposed by Schymanski et al.14
Multiple methods for in silico prediction of liquid chromatography retention times have been published in the literature, ranging from simple logP based models12,15,16 to complex multivariate quantitative structure–retention relationships (QSRR) models.17,18 More recently, machine learning-based QSRR methods have emerged for retention time prediction including support vector machines,19 tree-based learners,20 and artificial neural networks (ANNs).8,13,21–23 We have extensively evaluated the latter and even demonstrated good generalisability across multiple reversed-phase LC methods, instruments and sample types for >1100 unique compounds.21 Recently, we applied this approach to identify retrospectively 37 additional CECs in influent and effluent wastewaters in London in LC-HRMS data.13 Given that the River Thames is subject to regular wastewater impact from CECs arising from combined sewer overflows (CSOs),24 the potential combination of passive sampling and machine-learning assisted high-resolution suspect screening analysis could present a powerful new method for CEC characterisation, including the ability to utilise HRMS databases more fully where LC methods do not match or where analyte retention data is lacking. With the constant development and improving performance of analytical tools and methods especially for large numbers of compounds, better prediction of gradient retention time is now possible.
The aim of this work was to improve understanding of the occurrence of CECs using passive samplers deployed in the River Thames (UK) using both targeted LC-MS/MS analysis and machine learning-assisted in silico LC-HRMS suspect screening. To achieve this, the objectives were: (a) to perform differential targeted analysis of river water and passive sampler extracts using a rapid, direct injection LC-MS/MS method;25 (b) to develop and apply an ANN-based model for multi-analyte retention prediction in a gradient reversed-phase LC method and (c) to apply the developed LC-HRMS suspect screening workflow to characterise the occurrence of new and additional CECs in two river monitoring campaigns in winter and summer in 2018/19. This new approach is likely to improve the value of passive sampler extract data as a more rapid in silico shortlisting step for new or additional CECs.
Chemcatcher® samplers were deployed on two occasions in the River Thames UK at two proximal sites located in Central London. This region of the river is tidal, brackish and CEC concentrations at both sites were previously found not to be statistically different over a weeklong grab sampling period.24 This sampling area is also close to several CSO vents, which discharge untreated wastewater into the Thames with a frequency of roughly once a week, especially during times of heavy rainfall. During both deployments, Chemcatcher® samplers were fastened via drilled pilot holes and cable ties to 34 × 15 cm solid plastic boards. These were then affixed to pontoons and submerged at a relatively consistent 1 m depth underwater using a 3 kg dive weight. The first campaign (winter, 21st December 2018–6th January 2019) was performed at the London Fire Brigade (LFB) Lambeth River Fire Station pontoon (51°29′35.1′′N; 0°07′19.9′′W) using four Chemcatcher® devices. This site allowed secure access away from the shore and over the holiday period to deploy and collect samplers, as needed. The second campaign (summer, 27th August 2019–9th September 2019) was located ∼2 km downriver at the Transport for London (TFL) Blackfriars Pier (51°30′38.5′′N; 0°06′00.6′′W) again allowing access to a Central London region of the catchment and three devices were deployed. A field blank was exposed during both deployments and retrieval and analysed using LC-MS/MS (as for deployed samplers using LC-HRMS). After the deployment periods, the Chemcatcher® housing was disassembled, the PES membranes discarded and the HLB disks were removed and air-dried overnight at room temperature alongside the field blank to account for contamination before storage at −20 °C in the dark until analysis. HLB disks (samples and field blanks) were eluted using 40 mL of MeOH at ambient temperature under vacuum. 
The use of successive elution steps with solvents of different pH was not considered here to minimise complexity, but could be used to potentially increase the number of compounds eluted from the sorbent. Extracts were dried using a Genevac centrifugal rotary evaporator (SP Scientific, Ipswich, UK) at 40 °C for 2 h. Prior to instrumental analysis samples were reconstituted in 1 mL of MeOH. The full procedure is described in Taylor et al.28
Targeted direct injection LC-MS/MS analysis of water samples and extracts from the passive samplers was performed using a Nexera X2 LC system coupled to an LCMS-8060 (Shimadzu Corp., Kyoto, Japan) fitted with an electrospray ionisation source. Rapid separations (Fig. S1†) were performed on a 5 × 3.0 mm, 2.7 μm Raptor biphenyl guard column (Restek, Pennsylvania, USA). The LC method comprised a binary gradient of 0.1% v/v aqueous formic acid (mobile phase C – MPC) and 0.1% v/v formic acid in 50:50 MeOH:MeCN (mobile phase D – MPD). The elution profile consisted of 10% MPD for 0.2 min, 10–60% MPD from 0.2 to 3.0 min, and 100% MPD to 4.0 min. The re-equilibration time was 1.5 min at 10% MPD. The column was kept at ambient temperature with a flow rate of 0.5 mL min−1 and an injection volume of 10 μL. Where possible, two transitions for each analyte were monitored using multiple reaction monitoring (MRM) with the dwell time varying between 1 and 20 ms depending on the analyte (Table S1†). The threshold for a retention time match to a reference standard was set to 0.2 min. Further method details can be found in Ng et al.25 For river water samples, these guard columns were replaced approximately every 3000 injections. Qualitative and quantitative method performance for detected compounds in water samples was assessed in accordance with the tripartite guidelines published by the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH).29
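As a rough illustration of the elution profile described above, the following Python sketch returns the percentage of MPD at a given time in the 5.5 min run. The function name `percent_mpd` is hypothetical, and treating the 3.0 to 4.0 min segment as an immediate step to 100% MPD followed by a hold is an assumption, since the text does not state whether this segment is a step or a ramp.

```python
def percent_mpd(t_min: float) -> float:
    """Return %MPD at time t_min (minutes) for the stated binary gradient."""
    if not 0.0 <= t_min <= 5.5:
        raise ValueError("time lies outside the 5.5 min run")
    if t_min <= 0.2:
        return 10.0  # initial hold at 10% MPD
    if t_min <= 3.0:
        # linear ramp from 10% to 60% MPD between 0.2 and 3.0 min
        return 10.0 + (60.0 - 10.0) * (t_min - 0.2) / (3.0 - 0.2)
    if t_min <= 4.0:
        return 100.0  # ASSUMPTION: step to 100% MPD and hold until 4.0 min
    return 10.0  # 1.5 min re-equilibration at 10% MPD


# Print the programmed composition at a few time points.
for t in (0.0, 1.6, 3.0, 3.5, 5.0):
    print(f"{t:.1f} min: {percent_mpd(t):.1f}% MPD")
```

The total run time of 4.0 min plus 1.5 min re-equilibration matches the 5.5 min method length quoted for the targeted analysis.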
Following this, and to refine this initial shortlist of compounds further, ANN-based retention time prediction was employed.21 Measured retention data used to train the model were generated from triplicate LC-QTOF-MS/MS injections of a mix of 239 pesticide standards (see S2 for details†), commercially available through Agilent Technologies UK Ltd. Simplified molecular-input line-entry specifications (SMILES) from PubChem were used to generate data on each compound for 16 molecular descriptors. These descriptors were selected based on a combination of curated descriptors relevant to reversed-phase liquid chromatography mechanisms, correlation with tR and genetic feature selection. In addition, with 239 compounds used for model development and 16 inputs, the ratio of cases to inputs (approximately 15:1) far exceeded the 5:1 threshold proposed by Topliss and Costello.30 Full details can be found in Mollerup et al.,31 Munro et al.24 and Miller et al.22 Dragon version 7.0 (Kode Chemoinformatics srl, Pisa, Italy) was used to generate data for hydrophilic factor (Hy), unsaturation index (Ui), Ghose–Crippen and Moriguchi logP (AlogP, MlogP), number of benzene-like rings (nBnz), number of carbon and oxygen atoms (nC, nO), number of double and triple bonds (nDB, nTB) and number of 4–9 membered rings (nR04–nR09). For logD (mobile phase pH = 3.0), data were generated using Percepta PhysChem Profiler (ACD Laboratories, Ontario, Canada). All molecular descriptor data are given in Table S2†; the selection of these descriptors was based on previous work. The data were used as inputs to train a three-layer multilayer perceptron (3MLP) using Trajan v6.0 (Trajan Software Ltd., Lincolnshire, UK) with an optimised 16-4-1 architecture and with retention time as the output. The training of models was performed in two phases.
In Phase 1, the dataset was split 70:15:15 (training:verification:test) using random sampling, and the most appropriate neural network type was selected from linear models, probabilistic neural networks (PNNs), generalised regression neural networks (GRNNs), radial basis functions (RBFs), and three- and four-layer multilayer perceptrons (3MLPs and 4MLPs). Thousands of models were built and evaluated over several separate 10 min training phases and the performance of the best 50 was summarised in each case. The best model type was then selected based on the lowest and most consistent error returned across each set. In Phase 2, the architecture of the best model was further optimised. The dataset was partitioned 70:30 and bootstrap sampling was applied in ten replicated rounds of training of 5 min each. The best multilayer perceptron model (in this case, a 3MLP with a 16-4-1 architecture) used conjugate gradient descent and back propagation to optimise performance.32,33
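To make the data partitioning and network dimensions concrete, the sketch below splits 239 cases 70:15:15 at random and runs a forward pass through a 16-4-1 multilayer perceptron. It is only an illustration in NumPy and not the Trajan workflow used in this work: the function names, the tanh hidden-layer activation, the random placeholder weights and the random descriptor matrix are all assumptions introduced here.

```python
import numpy as np

N_COMPOUNDS = 239    # pesticide standards used for model development
N_DESCRIPTORS = 16   # molecular descriptor inputs
N_HIDDEN = 4         # hidden nodes in the 16-4-1 architecture


def split_indices(n_cases, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition case indices into training, verification and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cases)
    n_train = round(fractions[0] * n_cases)
    n_verify = round(fractions[1] * n_cases)
    return (order[:n_train],
            order[n_train:n_train + n_verify],
            order[n_train + n_verify:])


def mlp_forward(x, w1, b1, w2, b2):
    """Forward pass of a three-layer perceptron (tanh hidden layer is an ASSUMPTION)."""
    hidden = np.tanh(x @ w1 + b1)
    return hidden @ w2 + b2


train_idx, verify_idx, test_idx = split_indices(N_COMPOUNDS)

# Placeholder (untrained) weights, included only to show the shapes involved.
rng = np.random.default_rng(1)
w1 = rng.normal(size=(N_DESCRIPTORS, N_HIDDEN))
b1 = np.zeros(N_HIDDEN)
w2 = rng.normal(size=(N_HIDDEN, 1))
b2 = np.zeros(1)

# Hypothetical descriptor matrix standing in for the real descriptor data.
descriptors = rng.normal(size=(N_COMPOUNDS, N_DESCRIPTORS))
predicted_tr = mlp_forward(descriptors[test_idx], w1, b1, w2, b2)

n_parameters = w1.size + b1.size + w2.size + b2.size
cases_per_input = N_COMPOUNDS / N_DESCRIPTORS
print(len(train_idx), len(verify_idx), len(test_idx), n_parameters)
```

A network of this size has few adjustable weights relative to the number of training cases, which is in keeping with the cases-to-inputs argument made for this dataset.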
a Represents n ≥ 5 calibrants measured in river water matrix and all tested over the range 10–2000 ng L−1. b LOD determined using 3 × standard deviation of the regression line divided by the slope. c LLOQ determined as 3.3 × LOD. d Represents the mean of n = 6 replicate measures of the percentage of background-subtracted responses measured for a 1000 ng L−1 spiked Thames river water sample compared to a standard at the same concentration (negative values represent suppression and vice versa). e Frequency represents the number of passive sampler extracts where occurrence was confirmed for that compound. f Hermes et al. (2018) LLOQ in surface waters.42 g Boix et al. (2015) LLOQ in surface waters.41 h Martínez Bueno et al. (2011) LLOQ in surface waters;43 — not detected.

Analyte | Linearity a | LOD b (ng L−1) | LLOQ c (ng L−1) | Other DI LC-MS/MS methods LLOQs (ng L−1) | Matrix effects d (%) | Winter: min, max [CEC] (ng L−1) | Winter: frequency e | Summer: min, max [CEC] (ng L−1) | Summer: frequency e |
---|---|---|---|---|---|---|---|---|---|
4-Fluoromethcathinone (4-FMC) | 0.992 | 4 | 13 | — | −18 | — | — | 18 | 1/6 |
Acetamiprid | 0.998 | 4 | 11 | — | +6 | — | — | 17, 33 | 4/6 |
Amitriptyline | 0.991 | 4 | 11 | — | −35 | 12 | 1/3 | <LLOQ | 1/6 |
Amphetamine | 0.994 | 4 | 12 | 6.3g, 200.0h | −14 | 25 | 1/3 | 19, 41 | 6/6 |
Azoxystrobin | 0.997 | 4 | 11 | — | −5 | — | — | <LLOQ | 4/6 |
Benzoylecgonine | 0.998 | 4 | 12 | 0.1g, 13.0h | +1 | <LLOQ | 3/3 | <LLOQ | 3/6 |
Bisoprolol | 0.997 | 4 | 12 | — | +14 | — | — | <LLOQ | 3/6 |
Carbamazepine | 0.960 | 4 | 13 | 1.0f, 0.2g | +12 | 24, 33 | 3/3 | 77, 117 | 6/6 |
Citalopram | 0.987 | 5 | 14 | 10.0f | +26 | <LLOQ | 3/3 | <LLOQ, 14 | 6/6 |
Clopidogrel | 0.998 | 4 | 11 | 0.5f | +2 | — | — | <LLOQ | 1/6 |
Clozapine | 0.998 | 4 | 11 | — | +59 | — | — | <LLOQ | 5/6 |
Cocaine | 0.997 | 4 | 11 | 1.0g, 10.0h | −4 | — | — | <LLOQ | 4/6 |
Cyclouron | 0.985 | 4 | 12 | — | +2 | — | — | 50 | 1/6 |
Diclofenac | 0.987 | 4 | 12 | 2.0f, 6.8g | +2 | — | — | 24, 31 | 2/6 |
Fenuron | 0.987 | 4 | 12 | — | −6 | 33, 43 | 3/3 | 27, 46 | 6/6 |
Imidacloprid | 0.927 | 8 | 24 | 15.0f | +12 | — | — | 26, 30 | 2/6 |
Ketamine | 0.995 | 4 | 11 | 25.0h | +4 | <LLOQ, 13 | 2/3 | 21, 31 | 6/6 |
Lidocaine | 0.999 | 4 | 11 | 2.0f | +1 | 15, 19 | 3/3 | 31, 51 | 6/6 |
MDMA | 0.996 | 4 | 12 | 0.5g, 100.0h | +3 | — | — | <LLOQ | 5/6 |
Memantine | 0.992 | 4 | 13 | — | +9 | — | — | <LLOQ | 2/6 |
Methamphetamine | 0.994 | 4 | 13 | — | +2 | — | — | <LLOQ | 3/6 |
Nicotine | 0.987 | 5 | 14 | 200.0h | −3 | 32 | 1/3 | 17 | 1/6 |
Oxazepam | 0.902 | 7 | 22 | — | +8 | 41, 58 | 3/3 | 52, 75 | 6/6 |
Propamocarb | 0.995 | 4 | 11 | — | −2 | <LLOQ | 3/3 | <LLOQ, 12 | 4/6 |
Propranolol | 0.992 | 4 | 12 | — | +19 | <LLOQ | 1/3 | 16 | 1/6 |
Pyracarbolid | 0.992 | 3 | 9 | — | −8 | — | — | 12 | 1/6 |
Salicylic acid | 0.995 | 3 | 10 | 37.5g | +128 | 64 | 1/3 | 44, 78 | 4/6 |
Sulfapyridine | 0.991 | 4 | 13 | — | +3 | <LLOQ, 17 | 3/3 | <LLOQ | 1/6 |
Temazepam | 0.985 | 3 | 10 | — | +2 | <LLOQ | 1/3 | 10, 20 | 6/6 |
Terbutryn | 0.996 | 4 | 11 | 1.0f | 0 | — | 0/3 | <LLOQ | 6/6 |
Tramadol | 0.990 | 4 | 11 | 15.0f | 0 | 78, 93 | 3/3 | 169, 251 | 6/6 |
Trimethoprim | 0.998 | 4 | 11 | 10.0f, 1.8g | −4 | <LLOQ | 2/3 | <LLOQ, 13 | 5/6 |
Venlafaxine | 0.997 | 4 | 11 | 2.0f, 0.2g | +1 | 19, 20 | 3/3 | 37, 75 | 6/6 |
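The LOD and LLOQ definitions given in the table footnotes can be sketched in a few lines of Python. This is a minimal illustration only: the function name `lod_lloq` is hypothetical, the calibration values are invented for demonstration, and interpreting "standard deviation of the regression line" as the residual standard deviation of a least-squares fit is an assumption.

```python
import numpy as np


def lod_lloq(concentrations, responses):
    """Estimate LOD (3 x s / slope) and LLOQ (3.3 x LOD) from a linear calibration."""
    conc = np.asarray(concentrations, dtype=float)
    resp = np.asarray(responses, dtype=float)
    slope, intercept = np.polyfit(conc, resp, 1)
    residuals = resp - (slope * conc + intercept)
    # ASSUMPTION: residual standard deviation with n - 2 degrees of freedom
    s_res = np.sqrt(np.sum(residuals ** 2) / (conc.size - 2))
    lod = 3.0 * s_res / slope
    lloq = 3.3 * lod
    return lod, lloq


# Hypothetical calibration data over the 10-2000 ng/L range (not measured values).
conc_ng_l = [10, 50, 100, 500, 1000, 2000]
peak_area = [130, 480, 1050, 4900, 10150, 19900]

lod, lloq = lod_lloq(conc_ng_l, peak_area)
print(f"LOD = {lod:.1f} ng/L, LLOQ = {lloq:.1f} ng/L")
```

Under these definitions the LLOQ is always 3.3 times the LOD, which agrees with the tabulated pairs once rounding is taken into account (for example, an LOD of 4 with an LLOQ of 11 to 13 ng L−1).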
In addition to those EU Watch List compounds detected in water samples, the macrolide antibiotics (azithromycin and clarithromycin), neonicotinoid pesticides (clothianidin and thiacloprid) and the triazine herbicide (ametryn) were identified in passive sampler extracts across campaigns.38 Seven additional compounds were also identified (i.e., diazepam, fluoxetine, metoprolol, nortriptyline, sulfamethazine, sulfamethoxazole and warfarin) and this occurrence was consistent with previous studies of the Thames and its surrounding catchments over the range of 5–305 ng L−1 (median: 50 ng L−1).24,34–37 Conversely, 17 and 8 compounds that were present in water samples were not detected in the winter and summer Chemcatcher® extracts, respectively (18 unique compounds in total). Given the time-integrated averaging nature of passive sampling, pulsed introductions of contaminants can be missed, which could partly explain this. However, there were no reported sewer overflow events during either deployment (see S3 for more details†).
The range of logD values for all compounds sequestered onto these HLB sorbents during both campaigns was −1.16 to 6.09 at the mean river pH and similar to previous works.44–49 The logD of all 18 compounds unique to water samples covered a range of −0.3 to 4.21. Despite methanol being a recognised solvent for passive sampler sorbent elution,28 incomplete elution or ion suppression for some compounds may have occurred. However, a stronger solvent is likely to elute more heavily retained matrix components, and successive elution using different solvents or at different pH was considered excessive for practical application. Matrix effects on LC-MS signals were relatively low for most analytes following direct measurement of river water samples, despite their brackish nature (Table 1).25 Therefore, despite this limitation for passive sampler extracts, the combination of both direct injection and passive sampler methods was still considered to be very useful for rapid targeted monitoring of river catchments for a relatively large number of CECs.
The first step of the suspect screening workflow involved comparing passive sampler extract data to the Agilent MS databases (forensic toxicology database = 9002 compounds; pesticide database = 1684 compounds; and water screening database = 1451 compounds). This resulted in an initial shortlist of 8485 unique possible compounds in extracts. When these data were further curated using the methods described in 2.5, this was reduced to 237 unique compounds identified across all passive sampler extracts (149 in winter and 157 in summer). Within this set, multiple matches were returned for 95 compounds. The scale of this occurrence data not only demonstrates the advantages of using HLB-type passive samplers for time-integrated catchment occurrence characterisation but also that the scale of data generated would make routine monitoring impractical. Thus, to rapidly prioritise the compounds potentially present and to increase confidence in compound identity, machine learning was employed to predict retention time as a further data curation process to reduce the number of candidates to a practicable number for risk management purposes. Of course, candidate shortlists are all dependent on the database selected. Larger databases such as the US EPA CompTox Chemicals Dashboard would have returned more suspect candidates. Nevertheless, the use of the vendor-supplied database in the first instance was taken as a starting point to demonstrate the proof of concept. As sensitivity was expected to be poorer for CECs than that of the targeted method, suspect screening of directly injected water samples on the LC-QTOF/MS was not performed. However, this could prove beneficial for wider xenobiotic exposure characterisation in future work as technology advances.
The optimised model (a 16-4-1 3MLP) for the prediction of retention time showed excellent correlation and agreement across training, verification and blind test data (coefficient of determination, R2 = 0.885, 0.871 and 0.874, respectively (Fig. S5(a)†)). The mean absolute error (MAE) across all cases in the training, verification and blind test sets was 26, 26 and 29 s, respectively (Fig. S5(b)†). The applicability domain of the prediction model was defined by investigating the molecular descriptors used to generate the prediction models using principal component analysis (PCA) in Python (Fig. S6(a)†). Following mass spectral database suspect shortlisting, the model was applied to all 237 compounds tentatively identified in the passive sampler extracts (Table S3†). For all compounds, the retention time difference (ΔtR) between the measured (tR) and predicted (tPR) retention times was calculated. Compounds with ΔtR outside the 75th percentile of model error (52 s) were discarded, as previously proposed by our group.13 Predictions may have been improved if a more diverse set of training cases had been used, including other classes of chemicals. Furthermore, ab initio molecular descriptor selection for this specific method was considered, which may have been similarly successful. However, the descriptors used here were previously found to generalise well across several reversed-phase LC-based methods and were the preferred option.21
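The retention time filter described above can be sketched as follows. This is an illustrative example under stated assumptions: the function names, suspect labels, retention times and error values are all hypothetical, and the absolute value of ΔtR is used because the text does not specify how signed errors were handled.

```python
import numpy as np

THRESHOLD_S = 52.0  # 75th percentile of model error reported in the text


def percentile_threshold(abs_errors_s, q=75):
    """Return the q-th percentile of absolute prediction errors (seconds)."""
    return float(np.percentile(np.asarray(abs_errors_s, dtype=float), q))


def filter_suspects(suspects, threshold_s=THRESHOLD_S):
    """Keep suspects whose |measured - predicted| retention time is within threshold.

    Each suspect is (name, measured tR in min, predicted tR in min).
    """
    kept = []
    for name, measured_min, predicted_min in suspects:
        delta_s = abs(measured_min - predicted_min) * 60.0  # min -> s
        if delta_s <= threshold_s:
            kept.append(name)
    return kept


# Hypothetical suspects, not data from this study.
suspects = [
    ("suspect_A", 7.00, 7.65),   # 39 s difference: retained
    ("suspect_B", 5.00, 6.20),   # 72 s difference: discarded
    ("suspect_C", 11.30, 11.40), # 6 s difference: retained
]
shortlist = filter_suspects(suspects)
print(shortlist)

# How a threshold could be derived from a set of hypothetical model errors.
example_threshold = percentile_threshold([10, 20, 30, 40, 50])
print(example_threshold)
```

Tying the cut-off to a percentile of the model's own error distribution means the filter tightens automatically as prediction accuracy improves.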
This process resulted in a shortlist of 59 (n = 43 in winter and n = 37 in summer) compounds across all passive sampler extracts with ΔtR data within this threshold (Fig. 2 and Table S4†). The majority of compounds clustered well within a 95% confidence interval of PCA data for molecular descriptors used to define the applicability domain (Fig. S6(b)†). A range of classes was tentatively identified including flame retardants, PPCPs, controlled drugs, pesticides, industrial chemicals and metabolites. Of all 59 compounds detected, 21 were common to both winter and summer. The largest class of compounds detected in common overall were PPCPs. Eight compounds were present in all sampler extracts in each campaign and of these, two were present in all samplers from both campaigns, i.e., O-desmethylvenlafaxine (a metabolite of the antidepressant, venlafaxine) and tri-(2-chloroisopropyl)phosphate (TCPP, a flame retardant). Others were only prevalent in the winter campaign, including 4-hydroxyphenyl-pyruvic acid (an intermediate metabolite of phenylalanine), butylacetanilide (insect repellent), aniline (industrial synthetic precursor) and dicamba (a broad-spectrum herbicide). Unique to summer were amisulpride (an antiemetic and antipsychotic) and dilaurylthiodipropionate (an antioxidant prevalent in food and cosmetics). Importantly, nine shortlisted compounds could not be found in the literature for river water (Table 2).
a Error in retention time prediction. b [M + H]+ adduct. c [M − H]− adduct. d ‘Probable structure’ based on precursor ion + at least one product ion and with only one database match from MassBank. e ‘Unequivocal molecular formula’ based on precursor ion + isotope pattern match to the library. f ‘Tentative candidate(s)’ based on precursor ion + product ions.

Compound | Primary use(s) | CAS | Measured m/z + isotope match | ppm | Qualifier fragment(s) | Measured tR (min) | ΔtPR a (min) | Current Schymanski framework level (now all raised to 2(a) with tR prediction or higher with reference standard confirmation) |
---|---|---|---|---|---|---|---|---|
2,4-Dinitro-o-cresol (DNOC) | Herbicide | 534-52-1 | 198.0235b | −1.17 | 180.0177 | 7.00 | −0.65 | 3f |
8-Hydroxy-efavirenz | Antiretroviral pharmaceutical metabolite | 205754-32-1 | 330.0159c | −2.77 | 257.9963, 246.0139, 286.0252, 250.0485 | 8.69 | −0.56 | 2(a)d |
9-Octadecenamide | Manufacturing lubricant | 301-02-0 | 282.2790b | 0.47 | 247.242, 97.1012, 83.0855, 135.1168, 265.2526 | 11.34 | 0.79 | 2(a)d |
Benhepazone | Nonsteroidal anti-inflammatory | 363-13-3 | 237.1022b | 0.34 | — | 7.23 | −0.09 | 4e |
Benzhydryl cyanide | Pharmaceutical precursor | 86-29-3 | 194.0972b | −3.89 | — | 7.23 | −0.46 | 4e |
Dilaurylthiodipropionate | Antioxidant | 123-28-4 | 515.4123b | 1.18 | 143.0161, 329.2145, 115.0212, 161.0267, 89.0056 | 11.28 | −0.12 | 2(a)d |
Furegrelate | Cardiovascular pharmaceutical | 85666-24-6 | 271.1080b | −1.00 | 210.0913 | 6.03 | 0.75 | 3f |
Methylthiouracil | Antithyroid agent | 56-04-2 | 143.0275b | −1.03 | 84.0444 | 3.33 | 0.39 | 2(a)d |
Proguanil | Malaria prevention | 500-92-5 | 254.1169b | −0.75 | 170.0480, 153.0214, 102.1026, 128.0262 | 6.72 | 0.03 | 2(a)d |
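For readers less familiar with the ppm column, mass accuracy is conventionally expressed as the relative difference between measured and theoretical m/z, scaled by 10^6. The sketch below shows the calculation; the function name, the sign convention (measured minus theoretical) and the example m/z values are assumptions chosen for illustration and are not taken from the table.

```python
def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Return the mass error in parts per million (measured minus theoretical)."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical values: a 0.0015 Da offset at m/z 300 corresponds to 5 ppm.
example_ppm = ppm_error(300.0015, 300.0000)
print(f"{example_ppm:.2f} ppm")
```

Because the error is relative, the same absolute offset in Da gives a larger ppm value at low m/z than at high m/z.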
Among those tentatively identified were a few interesting cases that illustrate the performance of the new in silico suspect screening workflow. Firstly, an active metabolite of lidocaine (3-hydroxylidocaine, 3-HL) was shortlisted in passive sampler extracts. A clear precursor ion was detected at m/z 251.1762 [M + H]+ (Fig. 3(a)). Based on this ion alone, four chromatographic peaks were detected. Application of the predictive retention time model isolated a single chromatographic peak within a 19 s error which also corresponded to the presence of its qualifier fragment at m/z 89.0964 ([CH2N(CH2CH3)2]+).50 This, therefore, allowed a 2(a) identification according to the Schymanski et al. framework. 3-HL is formed in humans by cytochrome P450 enzymes 1A2 and 3A4. Although it has not been reported in river water before, its presence is unsurprising given that lidocaine itself was detected in the targeted analysis of river water in both campaigns. Lidocaine is widely used as a local anaesthetic in both animals and humans and is available on prescription and as an over-the-counter medication to treat teething pain in children, skin burns/irritations, poisonous stings/bites and haemorrhoids. Lidocaine is also regularly used as an adulterant in illicit street drugs, such as cocaine.51 A second novel metabolite tentatively identified using in silico suspect screening was 8-hydroxyefavirenz, the primary metabolite of the antiretroviral, efavirenz,52 used to treat HIV-1 infection in the UK. A matching [M − H]− isotope abundance, several fragment ions and predicted retention time were all detected (Fig. 3(b)). To our knowledge, this is the first reported environmental occurrence of this metabolite in river water. In human liver microsomal studies, CYP2B6 was shown to play a major role in efavirenz clearance via 8-hydroxylation (∼77% (ref. 53)).
Globally, reports of efavirenz occurrence are limited,54 but recently, concentrations as high as 37.6 μg L−1 have been measured in wastewater effluent in South Africa despite high sorption potential via sludge treatment. Lastly, tris(1-chloro-2-propyl)phosphate (TCPP) (Fig. 3(c)) was identified. TCPP is an organophosphate flame retardant that has multiple applications, including in electronics and furniture manufacture. In these applications, TCPP is typically used in a film coating format rather than chemically bonded to the material, and is thus prone to release into the environment.55 TCPP has been previously reported at ng L−1 concentrations in seawater and is known to cause detrimental effects in multiple animal taxa.55,56 In zebrafish, the lethal concentration (LC50) 96 h post fertilisation of TCPP was observed to be 3.7 mg L−1.57 Exposure to TCPP has resulted in decreases in neurobehavioral responses in fish, invertebrate and rodent species58–60 as well as endocrine disruption, and developmental and reproductive toxicity.56 When human cells have been exposed to TCPP through in vitro experiments, studies report inhibition in cell viability, growth rate, protein synthesis and cell cycle arrest.56 As such, TCPP is classified as a high hazard by the US EPA.61 Again, the [M + H]+ ion was detected at m/z 327.0081 along with two fragments at m/z 98.9842 ([H4PO4]+) and m/z 174.9921 ([C3H6ClO4PH3]+). No other compound was shortlisted that corresponded to the mass spectral data alone, and retention prediction was again accurate to within 13 s of the detected peak in the extract. Previous work focussing on evaluation of retention time prediction models for suspect screening in wastewater showed a success rate of between 73 and 83%.24
Of all compounds tentatively identified using suspect screening, 15 more were confirmed using curated database entries which included retention time data. Of these, one compound (phenytoin) was confirmed using database retention times within the Agilent Forensic database. Passive sampler extracts were also analysed on a separate LC-QTOF-MS method which held curated database LC retention time and accurate MS data for 14 more compounds and their presence was confirmed in all cases (see S4 for method details†). These were amisulpride, atenolol, bicalutamide, celiprolol, disopyramide, erythromycin, flecainide, irbesartan, O-desmethylvenlafaxine, practolol, proguanil, sotalol, sulpiride and tapentadol. Therefore, with respect to the Schymanski et al. identification framework,14 the compounds initially shortlisted using the Agilent HRMS databases were mostly classified between Level 4 (unequivocal molecular formula) and Level 2(a) (probable structure), depending on the presence of unique fragment ions. With the addition of predicted and curated library retention time data, we propose that matching compounds which had only one positive library spectrum match could be elevated to Level 2(a). However, to elevate compounds to Level 1 (confirmed structure), confirmation with an analytical standard is still required. That being said, the workflow presented above rapidly and efficiently aided compound occurrence confirmation workflows in environmental samples. Furthermore, according to the manufacturer, the LC-MS/MS instrument used for targeted analysis is capable of monitoring 555 transitions simultaneously and there is sufficient scope to add these and several more compounds to the targeted analytical method if required, including multiple transitions for each (to this point, 292 transitions were monitored including two for each compound and at least one for each SIL-IS).
Even where the number of transitions to be monitored exceeds this threshold, the speed of the LC-MS/MS method leaves scope for the incorporation of multiple rapid injections of the same small sample using different target analyte sets, each of several hundred CECs. Using passive sampling together with both targeted analysis and machine learning-assisted suspect screening, therefore, offers a new, flexible and rapid capability for near time-integrated catchment monitoring of CECs and potentially at large scale.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d0ay02013c
This journal is © The Royal Society of Chemistry 2021