Suspect and non-target screening of ovarian follicular fluid and serum – identification of anthropogenic chemicals and investigation of their association to fertility

In this work, ultra-high performance liquid chromatography-high resolution (Orbitrap) mass spectrometry-based suspect and non-target screening was applied to follicular fluid (n = 161) and serum (n = 116) from women undergoing in vitro fertilization in order to identify substances that may be associated with decreased fertility. Detected features were prioritized for identification based on (i) hazard/exposure scores in a database of chemicals on the Swedish market and an in-house database on per- and polyfluoroalkyl substances (PFAS); (ii) enrichment in follicular fluid relative to serum; and (iii) association with treatment outcomes. Non-target screening detected 20 644 features in follicular fluid and 13 740 in serum. Two hundred and sixty-two features accumulated in follicular fluid (follicular fluid:serum ratio >20) and another 252 features were associated with embryo quality. Standards were used to confirm the identities of 21 compounds, including 11 PFAS. 6-Hydroxyindole was associated with lower embryo quality and 4-aminophenol was associated with higher embryo quality. Overall, we show the complexity of follicular fluid and the applicability of suspect and non-target screening for discovering both anthropogenic and endogenous substances, which may play a role in fertility in women.


Introduction
Human infertility is dened as the inability to conceive within 12 months of actively trying, and is estimated to affect up to one out of six couples, [1][2][3][4][5][6] comprising approximately 25 million citizens in the European Union alone. 7In 25-30% of fully investigated couples, the reason for infertility remains unknown. 5,6he quality, or developmental competence, of the oocyte affects the early survival of the embryo as well as the establishment of pregnancy and subsequent fetal development. 8Oocyte quality is in turn dependent on follicle growth and oocyte maturation, which is achieved during folliculogenesis, a process that takes at least half a year in humans from primordial to ovulatory stage.Primordial follicles develop through primary and secondary stages and further to antral follicles which contain follicular uid.][20][21][22][23][24] Chemicals including bisphenol A (BPA), polychlorinated biphenyls (PCBs), dichlorodiphenyltrichloroethane (DDT), perand polyuoroalkyl substances (PFAS), and polybrominated diphenyl ethers (PBDEs) have been found in ovarian FF exposing the oocyte, 13,[25][26][27][28][29][30][31] with the possibility of interfering with oocyte maturation.PFAS are of particular concern due to their widespread occurrence in human blood.However, there are conicting results regarding the effects of PFAS on fertility in epidemiological studies, 25,27,[32][33][34][35][36][37] which stresses the need for further investigations, especially since PFAS seem to target the ovary in different ways. 38For example, experimental in vitro studies suggest adverse effects of PFAS on oocyte maturation. 39,40ccording to recent estimates, over 300 000 chemicals are or have been in global chemical commerce and now constitute potential environmental pollutants. 
41Traditionally, highly specic and targeted analytical approaches are applied to characterize human exposure to known substances.As a result, numerous chemicals and their transformation products are overlooked.To ll this knowledge gap, suspect-and non-target screening (NTS) approaches have emerged over the last decade as promising techniques for capturing novel anthropogenic substances.Prioritization strategies, which reduce the quantity of data from tens of thousands of features, are critical to the success of these methods. 42For example, time trends have been used to prioritize anthropogenic chemicals over endogenous substances in human serum 43 and whole blood 44,45 while case/ control strategies have been employed for elucidating chemicals specically associated with occupational exposure. 46,47ther studies have employed suspect screening approaches, focusing on matching features to mass spectral databases of anthropogenic substances. 43,48n this work, we applied an NTS approach to human ovarian FF and serum in order to identify substances that could potentially affect fertility in women.A 3-tiered feature prioritization approach was utilized, involving (i) matching to substances with high exposure and unknown-or moderate-high hazard scores in the Swedish Chemicals Agency Market List and further an in-house PFAS database; (ii) enrichment in FF relative to serum; and nally, (iii) association with in vitro fertilization outcomes with a focus on embryo quality.To the best of our knowledge, this is among the rst studies to use NTS to investigate ovarian FF.In addition to providing insight into associations between chemicals and reproductive outcomes, the acquisition of non-target data offers the opportunity for retrospective mining, as new chemicals become available.

Patient recruitment
This study was approved by the Swedish Ethical Review Authority (Dnr 2015/798-31/2, 2016/360-32 and 2016/1523-32). Patients who visited the Carl von Linné clinic in Uppsala, Sweden, for an assisted reproductive technology (ART) procedure consisting of ovum pick-up (OPU) for in vitro fertilization (IVF) were invited to participate in the study. Participation was voluntary; the patients could withdraw their consent at any point and they were informed that their participation would not affect their IVF treatment. Patients received written and oral information about the study from a nurse and those agreeing to participate signed a written consent form in accordance with the Declaration of Helsinki.
From April 23 to June 16, 2016, 244 patients visited the clinic for OPU. Fifty-four patients were not invited to participate due to lack of time (n = 36), non-Swedish speaking patients (n = 14), freezing of oocytes for storage and later use (n = 1), and avoiding pressure on emotionally stressed patients (n = 3). This resulted in 190 invited patients, of whom five declined and 185 agreed to participate in the study. All patients enrolled in the study were assigned a random 3-digit code for pseudonymization of the samples. Information regarding reproductive history, cause of infertility and other health parameters was collected from the patients' records. Data were handled in compliance with relevant laws and institutional guidelines (the Swedish data protection law, PUL, and the general data protection regulation, GDPR), and the biological samples were registered at Uppsala Biobank (IVO 627) following the Swedish law on biobanking in health care. In the final analysis, we included patients who were non-smokers with body mass indices (BMIs) of <30 and ongoing treatment with OPU during the recruitment period. Nine patients were excluded due to smoking during the last 12 months and 14 were excluded due to BMIs ≥30, resulting in 162 patients for sample processing (Fig. 1).

Sample collection
Clear FF with no visible blood contamination (n = 161) was freshly collected after OPU in 50 mL test-tubes (559 001, Sarstedt, Nümbrecht, Germany) on ice, discarding the first aliquot due to possible contamination with wash-fluid used in the OPU-tubing system. All FF aliquots from a single patient were pooled together into one sample. The samples were centrifuged at 500 × g to separate cells from the supernatant FF, and the supernatant was subsequently aliquoted and stored at −80 °C until analysis. Serum collection was carried out as follows: prior to OPU, patients were given an intravenous catheter and blood was drawn into two tubes (GREI456089, VWR, Stockholm, Sweden). The first tube was discarded in order to avoid contamination from the catheter and the second was kept for processing. The blood was centrifuged within 30 minutes at 1400 × g (5 minute duration) and serum was separated and stored at −80 °C until analysis. For NTS, >1.5 mL of serum was needed for analysis, resulting in 116 serum samples.

Standards and reagents
A full list of 28 authentic and 11 isotopically labelled substances can be found in Tables S1 and S2 of the ESI.† The internal standard mixture was readily available in the laboratory and included substances with a wide range of physicochemical properties (e.g. log K_ow ranging from ~−1 [sucralose] to 5.7 [tonalide]).

Extraction
The extraction procedure was originally developed for human whole blood samples and is described in detail elsewhere.44,49 Prior to extraction, samples (1.5 mL of serum or 2 mL of follicular fluid) were thawed at room temperature and then spiked with 10 µL of internal standard mixture (50-125 ng of each substance; see Table S1 in the ESI † for a full list). Two mL of acetonitrile, 200 mg NaCl and 800 mg MgSO4 were added to 2 mL of FF, while 2 mL of acetonitrile, 150 mg NaCl and 600 mg MgSO4 were added to 1.5 mL of serum. After addition of steel beads, a bead blender (1600 Mini-G, SPEX Sample Prep, Metuchen, NJ) was used to homogenize the samples for 5 min at 1500 rpm. The mixture was centrifuged for 10 min at 2200 × g and the supernatant was removed. This extraction was repeated with another volume of acetonitrile and the combined extracts were concentrated under nitrogen to 150 µL. After freezing overnight, the samples were centrifuged (5 min at 8000 × g), and 100 µL of extract was transferred to an LC-vial containing 100 µL LC-MS grade water.

Instrumental analysis
Instrumental analysis was carried out using a previously developed method.44 Briefly, chromatographic separation of analytes was carried out with a Dionex UltiMate 3000 ultra-high performance liquid chromatograph equipped with a Hypersil GOLD aQ analytical column (2.1 mm × 100 mm, 1.9 µm particle size) and a prefilter (2.1 mm, 0.2 µm) (Thermo Scientific, USA). The mobile phases consisted of LC-MS grade water with 0.1% formic acid (A) and acetonitrile with 0.1% formic acid (B). The gradient started at 5% B with a linear increase over 10 min to 99% B, followed by a 5.5 min hold and re-equilibration at 5% B for 2 min. The injection volume was set to 5 µL and the column temperature was held at 40 °C. Detection was carried out on a Q Exactive HF Orbitrap (Thermo Scientific, USA), equipped with a heated electrospray ionization (HESI) source. The capillary temperature was set to 350 °C with a spray voltage of 4.5 kV (positive mode) and 3.7 kV (negative mode), sheath gas (nitrogen) at 30/45 arbitrary units (pos/neg mode), auxiliary gas at 10/5 au (pos/neg) and auxiliary gas heater at 350 °C. A full scan was combined with data-dependent MS2 (ddMS2) fragmentation of the top five precursor ions. The full scan was run with a resolution of 120 000 Full Width at Half Maximum (FWHM) at 200 m/z and a scan range of 100-1000 Da. ddMS2 scans were run with a resolution of 15 000 FWHM at 200 m/z, a normalized collision energy of 30%, an intensity threshold of 1 × 10^5, a dynamic exclusion of 10 seconds and an apex search between 1-10 seconds. Samples, quality controls (QCs) and blanks were run in random order, with one internal standard solution injected every 15 samples. Samples were run in four sequences: FF-positive mode, FF-negative mode, serum-positive mode, and serum-negative mode. A mass calibration of the Orbitrap was performed before each sequence of samples.

Data processing
Alignment, peak picking and feature determination were carried out using Compound Discoverer, versions 2.0 and 3.1 (Thermo Scientific, USA). Aggregation of peaks and adducts was implemented to define unique features. All Compound Discoverer parameters are listed in Tables S3 and S4 in the ESI.† The four sequences (FF pos/neg, serum pos/neg) were processed separately in Compound Discoverer 2.0 (parameters listed in Table S3, ESI †), resulting in four separate feature lists (containing peak areas in all samples for each feature). For Tier 2 (see below) we wanted to directly compare feature peak areas in FF to those in serum, which was done by reprocessing with Compound Discoverer 3.1 (this version became available when work on this part began; parameters are listed in Table S4, ESI †). This time only two feature lists were generated, one for positive and one for negative mode, each including both FF and serum samples from 116 patients (including procedural blanks for blank subtraction). For these combined feature lists the intensity thresholds during peak picking had to be raised compared to the first processing, due to the larger number of samples that needed to be processed (and by extension, the amount of time this would take). All calculated ratios between FF and serum for detected suspects were taken from these combined feature lists.
Following blank subtraction (max area sample/area blank > 5), feature prioritization was carried out using a three-tier approach (Fig. 2) which is described in detail below.
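The blank-subtraction criterion (max area sample/area blank > 5) can be illustrated with a minimal Python sketch. The function name and the choice to average the procedural blanks are assumptions made for illustration (the averaging follows the description under Quality control); the actual filtering was performed in Compound Discoverer.

```python
from statistics import mean


def passes_blank_filter(sample_areas, blank_areas, min_ratio=5.0):
    """Return True if the feature survives blank subtraction.

    sample_areas: peak areas of one feature across all patient samples.
    blank_areas:  peak areas of the same feature in the procedural blanks.
    The feature is kept when the maximum sample area exceeds min_ratio
    times the mean blank area (averaging the blanks is an assumption).
    """
    max_sample = max(sample_areas)
    if not blank_areas or mean(blank_areas) == 0.0:
        # Feature absent from all blanks: keep it if it was detected at all.
        return max_sample > 0.0
    return max_sample / mean(blank_areas) > min_ratio


# Hypothetical peak areas for two features
kept = passes_blank_filter([100.0, 600.0], [100.0, 100.0])     # 600/100 = 6
removed = passes_blank_filter([100.0, 400.0], [100.0, 100.0])  # 400/100 = 4
```

A feature whose ratio is exactly 5 is removed in this sketch, because the criterion is a strict inequality.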

Substances identied through this workow were assigned a condence level using the Schymanski scale. 50In brief, suspects identied through databases based on exact mass (AE5 ppm) but with insufficient information for one exact structure were dened as a tentative identication and assigned a condence level [CL] from 3-5 depending on the available data.With further information such as MS2 matches to a library spectrum the suspect was dened as a probable identication with CL 2a or as 2b when there was additional experimental evidence like homologue series.Suspects conrmed with an authentic standard were considered CL 1.Further details can be found in the ESI, Table S5.†

Quality control
In order to avoid false positives, both sampling blanks and procedural blanks were processed together with samples. Sampling blanks (n = 5) were prepared immediately prior to follicular fluid collection by rinsing the sample collection tubing with buffer (G-RINSE, Vitrolife, Göteborg, Sweden), using enough volume to fill the system with fluid and collect a volume of ~0.5 mL after the rinse, which was then saved for analysis. During the initial stage of data processing, features with signal intensities within 5-fold of the average procedural blank intensity were removed from the dataset and were not considered thereafter. This removed 13-54% of the features in the datasets (see Table S6, ESI † for exact numbers). Moreover, once a feature was identified using an authentic standard, we re-confirmed its absence in both the sampling and procedural blanks.
In order to account for procedural losses and confirm that the extraction procedure was suitable across a wide range of substances, a suite of 11 internal standards (Table S1, ESI †) was spiked into all samples prior to extraction. Internal standard recoveries in both FF and serum samples were acceptable, ranging from 61 to 107% for all substances (relative standard deviations (RSDs) ranging from 13-30% in serum and 13-23% in FF for all samples; see Fig. S1 in the ESI †). Each extraction batch of FF or serum contained a pooled QC sample, prepared from n = 3 samples of FF or from 150 µL portions of all serum samples, which was extracted the same way as patient samples. Internal standard recoveries in both FF and serum QCs ranged from 45-119% with RSDs of 5.4-15.9% in serum QCs and 4.5-14.2% in FF QCs (Fig. S1 and Table S8, ESI †). Finally, during instrumental analysis, internal standard solutions prepared in acetonitrile were analysed every 15 samples to monitor instrumental drift over the course of the run. RSDs for internal standard peak areas in these solutions ranged from 1.8 to 21% and showed an absence of signal drift over the course of the run (Table S8, ESI †).

Tier 1: prioritization based on database comparisons
Applying the four separate feature lists, features detected in FF or serum samples were compared to two different databases: the Swedish Chemicals Agency Market List (KEMI Market List, referred to herein as the "Market List") and an in-house exact mass list containing 279 PFAS previously detected in house or reported in the literature. The Market List is available on the NORMAN Suspect List Exchange website 51 and contains 30 000 substances (industrial chemicals, pharmaceuticals, pesticides, etc.) from different national/regional inventory lists with a focus on the EU market. The Market List prioritizes substances based on both human hazard and exposure scores, which are based on confidential data from the importer/manufacturer supplied to the Swedish Chemicals Agency. The exposure score is calculated based on risk of environmental contamination, such as the quantity and degree of uncontrolled release during use and the extent of use on the market (range 0-27, where 27 represents the greatest exposure). The hazard score is calculated based on hazard classifications described by the EU's Classification, Labelling and Packaging (CLP) regulation, using toxicological information regarding carcinogenic, mutagenic and reproductive toxicity (range 0-9, where 9 represents the greatest hazard) (for details, see Market List documentation).52 The entire feature list (i.e. exact masses with 5 ppm tolerance) in serum and FF was compared to the Market List and resulted in 18 462 matches. Tentative assignments were filtered based on a high exposure score (≥15) and an unknown or moderate-to-high hazard score (unknown or ≥3) in FF. Filtered features occurring in >30% of the FF samples (n = 170) were prioritized for further investigation to limit the search to more common exposures.
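The Market List filter described above (exposure score ≥15, hazard score unknown or ≥3, detection in >30% of FF samples) can be expressed as a short Python sketch. The `MarketListMatch` class and its field names are hypothetical and do not reflect the actual column names of the Market List.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MarketListMatch:
    name: str
    exposure_score: int            # 0-27, 27 = greatest exposure
    hazard_score: Optional[int]    # 0-9, None = unknown hazard
    detection_frequency: float     # fraction of FF samples containing the feature


def is_tier1_candidate(m, min_exposure=15, min_hazard=3, min_frequency=0.30):
    """Apply the three Tier 1 criteria to a tentative Market List match."""
    hazard_ok = m.hazard_score is None or m.hazard_score >= min_hazard
    return (m.exposure_score >= min_exposure
            and hazard_ok
            and m.detection_frequency > min_frequency)


# Hypothetical matches
matches = [
    MarketListMatch("substance A", 20, None, 0.57),  # unknown hazard: kept
    MarketListMatch("substance B", 20, 1, 0.90),     # low hazard: dropped
    MarketListMatch("substance C", 10, 9, 0.90),     # low exposure: dropped
    MarketListMatch("substance D", 15, 3, 0.10),     # too rare: dropped
]
prioritized = [m.name for m in matches if is_tier1_candidate(m)]
```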
Additionally, an in-house database containing PFAS exact masses was matched with the feature lists of FF and serum. Afterwards, reference standards were used to confirm the identities of 11 detected PFAS suspects (see ESI Table S2 †).
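Exact-mass matching with a 5 ppm tolerance, as used for both the Market List and the in-house PFAS list, can be sketched as follows. The function names are hypothetical, and the two m/z values are approximate [M−H]− values included purely for illustration; they are not taken from the in-house database.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def match_suspects(feature_mz, suspects, tol_ppm=5.0):
    """Return names of suspects whose mass lies within tol_ppm of the feature."""
    return [name for name, mz in suspects.items()
            if abs(ppm_error(feature_mz, mz)) <= tol_ppm]


# Illustrative suspect list (approximate [M-H]- m/z values, assumed for this sketch)
pfas_suspects = {"PFOS": 498.9302, "PFOA": 412.9664}

hits = match_suspects(498.9310, pfas_suspects)    # about 1.6 ppm from PFOS
misses = match_suspects(498.9400, pfas_suspects)  # about 20 ppm from PFOS
```

A match at this stage is only a tentative assignment; confirmation still requires MS2 evidence or an authentic standard.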

Tier 2: prioritization based on follicular uid to serum ratio
This prioritization strategy was based on the premise that features enriched in FF may have a greater impact on the maturing oocyte.To prioritize these features, we used the combined FF/serum datasets (see data processing section) to determine the ratio of the peak area of a given feature in FF to its corresponding peak area in serum from the same individual.The features with a ratio >20 in at least 110 out of 116 patients were then searched against the full Market List and the mass spectral library mzCloud.In order to assess the impact of matrix-induced ionization effects on the calculated ratios, we compared internal standard responses between FF and serum.The ratios of IS areas in FF to serum ranged from 0.015 to 0.89 (see Table S7 in the ESI †), with a median ratio in negative mode of 0.73 and in positive mode of 0.22 suggesting more matrix suppression in FF compared to serum in positive mode.
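The Tier 2 criterion (FF:serum peak-area ratio >20 in at least 110 of 116 patients) can be sketched in Python as below. The function name is hypothetical, and the handling of features not detected in serum (skipped rather than counted as enriched) is an assumption, since the text does not specify it.

```python
def is_enriched_in_ff(ff_areas, serum_areas, min_ratio=20.0, min_patients=110):
    """Check whether a feature is enriched in FF relative to serum.

    ff_areas and serum_areas are paired per patient (same order).
    Patients with a serum area of zero are skipped (assumption).
    """
    n_enriched = 0
    for ff, serum in zip(ff_areas, serum_areas):
        if serum > 0 and ff / serum > min_ratio:
            n_enriched += 1
    return n_enriched >= min_patients


# Hypothetical feature: ratio 25 in 112 patients, ratio 1 in the remaining 4
ff = [2500.0] * 112 + [100.0] * 4
serum = [100.0] * 116
enriched = is_enriched_in_ff(ff, serum)
```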
Generally, all median ratios were below 1, which results in an underreporting of enrichment factors when using peak areas in FF relative to serum. We note, however, that matrix effects could only be assessed for internal standards; we cannot rule out the possibility of higher or lower matrix-induced ionization effects for non-targets.

Tier 3: prioritization based on reproductive outcomes
Tier 3 suspects were prioritized using statistical comparisons between features in FF and serum depending on assisted reproductive technology (ART) outcomes based on (i) the live birth of an offspring, (ii) a positive pregnancy test determined by a home urine hCG test performed by the patient, and (iii) embryo quality (at least one top quality embryo ≥9.1, range 1-10) assessed on day 2 by a well-established method.53 In brief, the embryo score incorporates cleavage stage with information on embryo variables associated with higher implantation rates at day two, i.e. variation in blastomere size and the number of mononucleated blastomeres.
Features associated with differences in reproductive outcomes were identified using MetaboAnalyst.54,55 Prior to processing, missing values (i.e. below detection limit) were replaced by a small value and the data were filtered by variance to remove excessive noise 56 before normalization of the data by mean centering. The difference between the ART outcomes was investigated using multivariate statistical approaches. However, we found that orthogonal partial least squares discriminant analysis (OPLS-DA) resulted in poor separation between the groups (top embryo quality: R2 = 0.14, pregnancy test: R2 = 0.10, live birth: R2 = 0.07), possibly due to the massive amount of data and the influence of other parameters, so that the variation in FF and serum composition alone was not sufficient to describe differences in ART outcome. Further analyses were therefore focused on embryo quality, which had the highest R2 of the three ART outcome models and, more importantly, allowed a more focused assessment of the direct effect of chemicals in the follicular fluid on the maturing oocyte and embryo development. A fold change threshold between peak area in the two ART groups of ≥1.5 (in >75% of pairs/variable) combined with a t-test significance threshold of p < 0.05 provided significantly different features in FF and serum of patients with top quality embryos compared to those with lower quality. Since we are using this approach as a prioritization strategy, we were willing to accept some false positives, which may arise when not adjusting for multiple testing. The significantly different features were compared to the databases mzCloud and the Market List for further prioritization.

Logistic regression (glm model of CRAN package, R 3.6.1) was used to assess if the groups (high vs. low embryo quality) differed in age or BMI, as these are possible confounding factors. Difference in ovarian reserve, estimated by the biomarker anti-Müllerian hormone (AMH) in serum, was investigated using the same method. P-values < 0.05 were considered significant. After prioritization, the association between the identified exogenous substances and high/low embryo quality was investigated using a logistic regression model with age, BMI and AMH as explanatory variables. AMH and features were log-transformed to reduce skewness. The variables "fertility cause" (male/female origin) and "parity" were considered for inclusion in the model but removed based on goodness of fit of the model and significance.
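The Tier 3 screening rule (fold change ≥1.5 combined with p < 0.05) can be illustrated with the Python sketch below. To keep the example free of external dependencies, an exact two-sided permutation test on the difference in group means is swapped in for the t-test applied in MetaboAnalyst, so the p-values are not the ones the study would have obtained. Treating a change in either direction (≥1.5 or ≤1/1.5) as meeting the threshold is an assumption, and the ">75% of pairs" criterion is not reproduced. All names and peak areas are hypothetical.

```python
from itertools import combinations
from statistics import mean


def fold_change(high, low):
    """Ratio of mean peak areas (high embryo quality / low embryo quality)."""
    return mean(high) / mean(low)


def permutation_p_value(high, low):
    """Exact two-sided permutation test on the difference in group means.

    Enumerates every way of splitting the pooled values into two groups of
    the original sizes; only feasible for small groups.
    """
    pooled = list(high) + list(low)
    n_high, n_low = len(high), len(low)
    observed = abs(mean(high) - mean(low))
    total = sum(pooled)
    n_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_high):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_high - (total - group_sum) / n_low)
        if diff >= observed - 1e-12:
            n_extreme += 1
        n_splits += 1
    return n_extreme / n_splits


def is_tier3_feature(high, low, min_fc=1.5, alpha=0.05):
    """Fold change in either direction plus a significant group difference."""
    fc = fold_change(high, low)
    changed = fc >= min_fc or fc <= 1.0 / min_fc
    return changed and permutation_p_value(high, low) < alpha


# Hypothetical peak areas of one feature in the two embryo-quality groups
high_quality = [10.0, 11.0, 12.0, 13.0]
low_quality = [1.0, 2.0, 3.0, 4.0]
p = permutation_p_value(high_quality, low_quality)  # 2 of 70 splits are as extreme
selected = is_tier3_feature(high_quality, low_quality)
```

As in the study, no correction for multiple testing is applied here, so some selected features are expected to be false positives.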

Recruited patients
Baseline statistics of patients and the fresh cycles included in the study are presented in Table 1. The reasons behind the patients' infertility as stated in the medical records were unknown (43.8% of the cases), male infertility (24.8%), tubal factor (6.5%), endometriosis (3.9%), ovarian factor (10.4%), anovulation (9.8%) or sterilization (0.5%). The distribution of the diagnoses is similar to larger cohorts from the same clinic.57 Oocytes were fertilized either with IVF (48.5%), intracytoplasmic sperm injection (ICSI, 43.5%) or a combination thereof (8%). For treatments resulting in developing embryos, 49% resulted in at least one top quality embryo. The average score of the top embryo was 8.6, ranging from 0.3-10 in all patients.

Number of detected features
In the individual feature lists for FF, a total of 20 644 features were detected (14 474 in positive and 6170 in negative mode; features ionizing in both modes would be counted twice), with an average of 3141 (standard deviation (SD) ± 380) unique features per patient. Of these features, 745 were present in >90% of the patients and 2349 in >50%. In addition, 6385 features were patient-specific, i.e. detected in only one (but not necessarily the same) patient. In serum, we detected 13 740 features (6493 in positive and 7247 in negative mode), with an average of 2508 (SD ± 228) unique features per patient. A total of 2049 features occurred in >50% of patients, of which less than half (885 features) were observable in >90% of patients. In serum, 4254 features were detected in no more than one (but not necessarily the same) patient.
The combined lists (including FF and serum) comprised 3034 features in positive mode and 9128 features in negative mode (166 were detected in >50% of patients in both follicular fluid and serum and 41 in >90%). The inconsistency in the number of features between the individual and combined feature lists is due to (a) differences in the number of samples used for data processing (individual feature lists included all 161 patients for FF, while the combined feature lists only included data from the 116 patients for whom both FF and serum data were collected); (b) the version of Compound Discoverer used; and (c) differences in settings in Compound Discoverer used for data processing (more specifically, different thresholds during peak picking and alignment; see Data processing for more details).
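The detection-frequency counts reported above (features present in >90% or >50% of patients, and patient-specific features) can be derived from a feature-by-patient detection table, as in this minimal Python sketch. The data structure and names are hypothetical; note that the >50% count includes the features that also exceed 90%, as in the text.

```python
def detection_summary(detected, n_patients):
    """Count features by detection frequency.

    detected maps a feature id to the set of patient codes in which
    the feature was found.
    """
    summary = {"gt_90_percent": 0, "gt_50_percent": 0, "single_patient": 0}
    for patients in detected.values():
        frequency = len(patients) / n_patients
        if frequency > 0.9:
            summary["gt_90_percent"] += 1
        if frequency > 0.5:
            summary["gt_50_percent"] += 1
        if len(patients) == 1:
            summary["single_patient"] += 1
    return summary


# Hypothetical detections in a cohort of 10 patients
detected = {
    "feature_1": set(range(10)),  # found in all patients
    "feature_2": set(range(6)),   # found in 60% of patients
    "feature_3": {4},             # patient-specific
}
summary = detection_summary(detected, n_patients=10)
```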

Tier 1: prioritization based on database comparison
Of the 3306 features occurring in ≥30% of the FF samples, 170 matched substances with high exposure scores and unknown or moderate-to-high hazard scores in the Market List (CL 3-5). Of these 170 tentatively identified substances, analysis of authentic standards for 14 chemicals confirmed the identities of five substances (and at the same time ruled out nine identities). Substances identified at CL 1-2 are presented in Table 2 and the remaining tentative identifications (CL 3-5) can be found in Tables S9 and S10 in the ESI.† 3-Pyridinecarboxamide (CL = 1; also known as nicotinamide or vitamin B3) was confirmed. Two additional substances, tris(2-butoxyethyl) phosphate (TBEP, CL = 1) and dibutylamine (CL = 1; also known as N-butyl-1-butanamine), were observed frequently in follicular fluid but were also observable at similar intensities in the sampling blanks; consequently, these substances were not considered further.
Although our focus was on anthropogenic chemicals, several endogenous compounds were identified, including 12-hydroxyoctadecanoic acid (CL = 1, detected in 57% of the patients) and dodecanedioic acid (CL = 1, detected in 67% of the patients). In addition to their natural occurrence, these substances are manufactured commercially and were prioritized in Tier 1 due to their presence on the Market List and unknown or unrecorded hazards.
Eleven PFAS were tentatively identified by matching the combined FF and serum feature list with the in-house database on PFAS; their identities were subsequently confirmed using reference standards (Table 2). The median FF:serum ratio ranged from 0.64 (PFOS) to 1.04 (PFHxS). These values are comparable to previously reported data,58 a comparison of which can be found in Table S11 in the ESI.† Some PFAS are known to cause developmental toxicity in experimental animals 59 and disturb lipid metabolism in vitro 39,60 and in humans.61 Some PFAS are also associated with affected ART outcomes according to previous studies.25,34 However, despite clear connections with effects reported in the peer-reviewed literature, PFAS did not appear using the other prioritization strategies in this study (i.e. enrichment in FF and connection to reproductive outcomes). In addition, several EDCs commonly investigated in relation to human health outcomes were not prioritized for identification using the approach in Tier 1. Some of these compounds were not included because they do not fulfill the criterion of ubiquitous exposure even though they are listed in the Market List (some phthalates, for example), and others were given low hazard scores in the Market List and thus were not included in the prioritization used in this study (some PFAS and parabens).
Tier 2: prioritization based on follicular uid to serum ratio The identication of substances enriched in FF (Tier 2) is particularly important for facilitating reproductive risk assessment based on exposure of the oocyte during maturationa specic and sensitive period during development. 62In Tier 2, twenty-two percent of all features in the combined feature lists (2709 of 12 162 as sum of pos and neg mode) were found in both FF and serum.A total of 262 features with FF:serum ratios of $20 in 110 patients were compared to the full Market List and mzCloud.Suspects including identication level can be found in Table S9 and 10 (ESI).† One compound, lidocaine, was conrmed with a reference standard (CL ¼ 1); however, this substance is used as a local anesthetic during OPU and the accumulation in FF compared to serum can be assumed from the local injection close to the sampling site.An additional four hormones and hormone derivatives (progesterone, epitestosterone, 17a-hydroxyprogesterone, and 20a-hydroxyprogesterone) could be conrmed using library spectrum data (mzCloud hit >85, CL ¼ 2a).][65] The ratio that was chosen as a prioritization limit in this study (>20) reects a substantial enrichment in follicular uid.Enrichment of anthropogenic chemicals at lower ratios are also of importance to investigate, but were not included here in an effort to reduce the number of features to a reasonable amount for identication.

Tier 3: prioritization based on reproductive outcomes
Multivariate statistical models for ART outcomes (positive pregnancy test, live birth) resulted in poor group separation. Further analysis was therefore only implemented for embryo quality. The patients with top quality embryos did not significantly differ from those with lower quality with respect to age (intercept 35.6, estimate −1.07, p = 0.07) or BMI (23.27, 0.32, p = 0.55). However, the patients with high embryo quality also had higher AMH levels (2.58, 1.17, p = 0.027). NTS detected 252 significantly different features in patients with top quality embryos compared to those with lower quality (134 in FF and 118 in serum). Features associated with patients with different embryo quality may represent substances that play a role in oocyte developmental competence. The significantly different features were further analysed and compared to the Market List and mzCloud. The Market List matched 11 suspects in FF and 22 in serum, while hits in mzCloud all scored <85%, which indicates insufficient data for a probable confirmation (CL 3-5, presented in Tables S9 and S10 in the ESI †). Authentic standards resulted in three confirmations: 12-hydroxydodecanoic acid (CL = 1; endogenous, discussed in Tier 1), 4-aminophenol (or isomers, see below; CL = 1) and 6-hydroxyindole (CL = 2a). The aggregated list of identified compounds is presented in Table 2.

When using a fold change threshold of ≥1.5 between peak areas of high/low embryo quality patients combined with a t-test significance threshold (p < 0.05), 4-aminophenol was associated with higher embryo quality. However, this could not be confirmed with logistic regression that included the confounding factors age, BMI and AMH (Table S12 in the ESI †). 4-Aminophenol was observed in 83% of the patients, but we cannot rule out that this assignment may correspond to aminophenol isomers. This chemical is found in consumer products such as personal care products and cosmetics and is used in industry for staining fur, leather and textiles as well as in the manufacturing of pharmaceuticals such as paracetamol. In humans, the chemical aniline is metabolized into paracetamol/acetaminophen, and 4-aminophenol is in turn a minor metabolite of paracetamol that is nephrotoxic.66 In this study, 4-aminophenol was present in lower concentration in serum of women with lower quality embryos and it was also detected in FF. This suggests that 4-aminophenol might have effects on oocyte maturation. An epidemiological study found no association between acetaminophen or 4-aminophenol urine concentrations in females and time to pregnancy, but a significant association for males.67

Both the Tier 3 prioritization method and a logistic regression model that included confounding factors (age, BMI and AMH) found an association between 6-hydroxyindole and low embryo quality (p = 0.02, Table S12 in the ESI †). 6-Hydroxyindole was detected in serum in 34% of the patients and was found in FF. 6-Hydroxyindole is commonly used in hair dyes. Gut bacterial degradation of certain amino acids produces indole that is absorbed from the gut and further metabolized into 6-hydroxyindole by CYP11A1.68 CYP11A1 regulates androgens and has been shown to be important in the etiology of polycystic ovary syndrome.69

Compounds previously associated with IVF outcomes such as PCBs, phthalates and bisphenols were not identified using this approach. This should not be interpreted as a negative result but rather as a function of the analytical technique (these substances are analyzed by GC-MS rather than LC-MS) and/or the prioritization method or outcome (embryo quality).

Conclusion
The results from this study show that FF contains a complex mixture of endogenous and anthropogenic substances. It is possible that the occurrence of anthropogenic substances inside the follicle could disrupt the composition of the FF, which in turn could lead to difficulties in conceiving. For example, studies using omics approaches to define FF composition and the effect on fertility parameters have shown that endogenous substances differ between high and low oocyte-yielding cows70 and that there are differences in protein patterns in FF that determine whether the oocyte could be fertilized or not.71 By using in vitro models, exposure effects on oocyte maturation and the pre-implantation embryo can be further explored, for example in the bovine or porcine model.39,62 Only a fraction of the detected features in FF and serum were prioritized for identification in our study (Tiers 1-3), but the data from this study can be used retrospectively for further identifications as new concerns arise or new compounds are discovered.72

Environmental Science: Processes & Impacts. Paper. Open Access Article. Published on 24 September 2021. Downloaded on 3/14/2022 2:27:26 PM. This article is licensed under a Creative Commons Attribution 3.0 Unported Licence.
Serum was collected using venous catheters (Venflon PVK 20 G, BD Medical Surgical Systems, Stockholm, Sweden), and sampling blanks were prepared and processed in the same way as for follicular fluid, by rinsing venous catheters (n = 3) prior to patient sampling. Each extraction batch of FF (n = 16 samples/batch) and serum (n = 19 samples/batch) included one procedural blank consisting of 1 mL of MilliQ water (in total n = 10 procedural blanks for FF samples and n = 6 for serum samples).
Fig. 2 Overview of the tiered prioritization strategy, including numbers of features and tentatively identified suspects identified using the databases the Market List and mzCloud, and confirmed substances measured in positive (+) and negative (−) ionization mode, presented for all tiers. Features are defined as the combination of all ions (i.e. adducts, parent ion, in-source fragments, etc.) at a given retention time. Some duplication may exist for substances ionizing in both positive and negative mode. The combined list (yellow box) was generated by re-processing the data using different thresholds (see Data processing for details) and applied in Tier 2. The Market List matches were filtered on both high exposure and unknown-to-high hazard score (see Data processing for details). a Significantly (p < 0.05) different (fold change > 1.5) suspects in the two groups.

Table 1 Characteristics of patients enrolled (n = 161)

Parameter                          Value
Age in years, mean (SD)            34.6 (4.6)
AMH a, µg L−1 (SD)                 3.25 (2.9)
Previous IVF-treatments, n (%)
Pregnancy rate b, %                37.9
Live birth rate b, %               29.6

a Ovarian reserve estimated by the biomarker anti-müllerian hormone (AMH), measured in µg L−1. b Results from fresh IVF cycles where 85% of the cycles resulted in an embryo transfer.

Table 2 Summary of substances identified using the various prioritization strategies

a Neutral monoisotopic mass (Da). b Median ratios were determined by calculating the peak area in FF/peak area in serum from the same individual. c RSD, relative standard deviation of the ratios. d Confidence level of confirmation, see section Data processing, Materials and methods for description.50 For all substances at CL 1 and 2b, chromatograms (and, where high enough intensities were present in the samples, also MS2 spectra) are listed in Fig. S2A-R in the ESI. e Hydroxyoctadecanoic acid found through both Tier 1 & 3.
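The median FF:serum ratio and its RSD reported in Table 2 are paired statistics: each ratio is formed within one individual before summarizing across individuals. The following Python sketch shows one way such a summary could be computed. The function name, the patient identifiers and the peak areas are hypothetical, and the choice of the sample standard deviation for the RSD is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, median, stdev


def paired_ratio_summary(ff_areas, serum_areas):
    """Median and RSD (%) of FF/serum peak-area ratios across individuals.

    Both arguments map a patient ID to that patient's peak area. Only
    patients present in both matrices, with a nonzero serum area,
    contribute a ratio.
    """
    ratios = [
        ff_areas[pid] / serum_areas[pid]
        for pid in ff_areas
        if pid in serum_areas and serum_areas[pid] > 0
    ]
    if len(ratios) < 2:
        raise ValueError("need at least two paired samples")
    # Assumption: RSD uses the sample standard deviation (n - 1).
    rsd_percent = 100 * stdev(ratios) / mean(ratios)
    return median(ratios), rsd_percent


# Hypothetical peak areas for one substance (arbitrary units).
ff = {"P1": 200.0, "P2": 300.0, "P3": 100.0, "P4": 50.0}
serum = {"P1": 100.0, "P2": 100.0, "P3": 100.0}
med, rsd = paired_ratio_summary(ff, serum)
print(med, rsd)
```

Patient P4 has no serum measurement in this example and is therefore left out, which reflects the restriction of the ratios to FF and serum samples from the same individual.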