Application of novel Solid Phase Extraction-NMR protocols for metabolic profiling of human urine

Metabolite identification and annotation procedures are necessary for the discovery of biomarkers indicative of phenotypes or disease states, but these processes can be bottlenecked by the sheer complexity of biofluids containing thousands of different compounds. Here we describe low-cost novel SPE-NMR protocols utilising different cartridges and conditions, on both natural and artificial urine mixtures, which produce unique retention profiles useful to metabolic profiling. We find that different SPE methods applied to biofluids such as urine can be used to selectively retain metabolites based on compound taxonomy or other key functional groups, reducing peak overlap through concentration and fractionation of unknowns and hence promising greater control over the metabolite annotation/identification process. Novel SPE-NMR methods were developed for selective retention of metabolites in human urine.


Introduction
Structure elucidation is a necessary and complex task for synthetic chemistry, drug discovery, and natural products research, but is also a major challenge in areas related to the life sciences. Metabolite structure elucidation is considered a bottleneck in metabolic profiling (the study of low molecular weight metabolite patterns in organisms): metabolites found within biofluids can be identified and described as biomarkers characteristic of specific phenotypes and disease states. For example, Elliott et al. 1 were able to characterise the biomarkers of adiposity in US adults through analysis of urine samples. Metabolites representative of different metabolite classes and biochemical pathways (such as N-acetylneuraminate and trimethylamine) were shown to have significant association with BMI. Hence, structure elucidation for metabolic profiling has been demonstrated to indirectly deepen knowledge of metabolic pathways, which may aid in development of future diagnostic and therapeutic techniques. Within metabolic profiling, ¹H Nuclear Magnetic Resonance (NMR) spectroscopy has become an extremely valuable tool for the characterisation of complex mixtures 2; statistical spectroscopy tools have also been applied to extract information from complex spectral sets 3. Posma et al. 4 have demonstrated the use of statistical 2D NMR, utilising the RED-STORM probabilistic statistical spectroscopy tool on data acquired from diet-controlled human urine samples, in order to further expand understanding of dietary biomarkers.
As such, there is currently a high interest in identification of metabolites. Urine has found widespread use in metabolic profiling due to its non-invasive ease of collection; however, the human urinary metabolome remains only partially mapped. This is in part because of the sheer number and dynamic range of compounds within a given urine sample and, as a corollary of this, because of peak overlap which frustrates annotation efforts. In a high-throughput NMR study of human urine, it was found by Bingol et al. 5 that in a sample ¹³C-¹H HSQC spectrum, of the 1012 peaks detected, only 437 peaks (belonging to 98 individual metabolites) could be assigned. In 2013, Bouatra et al. 6 ... There is a need to expand understanding of the human urinary metabolome in order to improve our capacity for characterisation of population phenotypes and for discovery of biomarkers related to disease and diet. Currently, the two most powerful analytical techniques used for metabolite annotation (the putative identification of metabolites based on spectral similarity to literature or external data) and identification (confirmation of molecular identity based on 'a minimum of two independent and orthogonal datasets relative to an authentic reference standard') 7 are mass spectrometry (MS) and NMR spectroscopy, as they provide orthogonal qualitative data, as well as absolute and relative quantification, in a very precise and high-throughput manner. In the framework of LC-MS analysis, different column chemistries have been explored to ensure wide metabolome coverage of biofluids, and to take account of the physicochemical diversity of their components. Integration of NMR spectroscopy allows acquisition of complementary information for definitive structure elucidation and confirmation.
Despite its reproducibility, ease of use, and quantitative data generation, the limited sensitivity of NMR remains a bottleneck to metabolite identification: low-concentration compounds become indistinguishable from noise, and peak overlap by more concentrated compounds regularly obscures signals from less concentrated compounds. The use of SPE-NMR of urine has been demonstrated previously by Wilson and Nicholson for retention of specific drug metabolites such as paracetamol, ibuprofen, and naproxen 8. The fractionation of human urine using SPE to reduce NMR peak overlap has also been demonstrated by Yang et al. 9 and Jacobs et al. 10 using C18 and HLB cartridges respectively. All previous SPE-NMR experiments have utilised standard or widely available methods using classic reversed phase sorbents, but have not varied or altered conditions such as pH to generate different retention profiles; additionally, the use of ion exchange SPE cartridges on biofluids for the purpose of metabolite annotation has not been attempted before. The expansion of available SPE methods should promise greater control over metabolite retention, and hence greater insight into the human urinary metabolome; this can be done through studying how different SPE methods can retain not only individual metabolites, but entire compound classes. Hence this approach is highly relevant to the detection and characterisation of unknowns in many complex biological mixtures.

Sample Collection
Urine was collected from 12 healthy volunteers in 500mL Corning™ tubes pre-rinsed with ultrapure water; each volunteer provided informed consent in writing. Each urine sample was individually analysed by NMR as a check for polyethylene glycol contamination. Samples were pooled into a pre-rinsed polypropylene carboy with a stirbar. The pooled sample was homogenised by stirring for five minutes, after which 15mL aliquots were dispensed into 20mL Sterilin sample tubes. Samples were labelled sequentially and stored at -80°C; samples to be used were subsequently transferred to a 4°C fridge for thawing.

pH-altered urine samples.
Concentrated HCl or NaOH solution was added dropwise to 250mL of pooled urine until the desired pH was achieved. The acidified (pH 2 and pH 5) and basified (pH 11 and pH 9) samples were then stored at 4°C.

Artificial urine preparation
500mL ultrapure water was added to a 1L flask. Constituent compounds and salts were then added (see appendix) with constant stirring. An additional 500mL of distilled water was added to make 1L of artificial urine.

Strong cation exchange, neutral pH sample
A Macherey-Nagel Chromabond™ SCX cartridge (6mL capacity, 500mg bed weight) was conditioned with methanol (6mL), then equilibrated with water (6mL). Pooled urine (3mL) was loaded onto the cartridge, which was then washed with 2% formic acid solution (6mL). Methanol (6mL) was used to elute the first set of metabolites, followed by 5% NH4OH in methanol (6mL) for the second elution.

Phenylboronic acid, sodium phosphate buffer
A Macherey-Nagel Chromabond™ PBA cartridge (6mL capacity, 500mg bed weight) was conditioned with a solution of 1% HCl in 70:30 water:acetonitrile (6mL), then equilibrated with sodium phosphate buffer basified to pH 10 with sodium hydroxide (6mL). Pooled urine (3mL) was loaded onto the cartridge, which was then washed with sodium phosphate buffer basified to pH 8.5 with sodium hydroxide (6mL). Water (6mL) was used to elute the first set of metabolites, followed by a solution of 1% HCl in 70:30 water:acetonitrile (6mL) for the second elution.

NMR sample preparation
Washes and elutions were dried under nitrogen and reconstituted in ultrapure water (3mL). Buffer containing trimethylsilylpropionate (TSP) as an internal reference standard was added to 540µL of reconstituted sample, as described by Dona et al. 11. 580µL of the manually vortexed samples were then transferred into 5mm SampleJet NMR racks.
Samples which required additional 2D NMR experiments were dried under nitrogen and reconstituted in D2O (3mL). TSP phosphate buffer (60µL) was added to 540µL of reconstituted sample, and 580µL of the resulting manually vortexed sample was transferred into 5mm NMR tubes.

NMR data acquisition
All 1D experiments were run using a Bruker Avance III 600MHz spectrometer equipped with SampleJet. Samples were analysed using one-dimensional water-suppressed ¹H NOESY experiments at 300K.
Additional ¹H-¹H J-resolved experiments, and 2D-NMR experiments, including ¹H-¹H Total Correlation Spectroscopy (TOCSY), ¹H-¹H Correlation Spectroscopy (COSY), and ¹H-¹³C Heteronuclear Single Quantum Coherence spectroscopy (HSQC), were utilised for metabolite annotation. The data from the 2D NMR experiments were acquired using a Bruker Avance III 600MHz spectrometer equipped with a cryoprobe.

Data analysis
NMR datasets were imported into MATLAB using the Imperial Metabolic Profiling and Chemometrics Toolbox (IMPaCTS) 12. Water (4.26-5.50ppm) and formate (8.25-8.63ppm) regions were removed from the spectra to eliminate interferences; the spectra were then normalised against the TSP region (-0.5 to 0.5ppm) using a probabilistic quotient normalisation function 13. Principal Component Analysis (PCA) plots were subsequently constructed with 5 principal components.
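The preprocessing described above can be sketched in Python as follows. This is an illustrative sketch only, not the IMPaCTS implementation: the function names `remove_regions` and `pqn_normalise` are hypothetical, the reference spectrum is assumed to be the median spectrum of the dataset, and how the TSP region enters the toolbox's normalisation function is not stated in the text, so it is not modelled here.

```python
import numpy as np

def remove_regions(ppm, spectra, regions):
    """Drop every data point whose chemical shift lies inside a (low, high) ppm window."""
    keep = np.ones(ppm.shape, dtype=bool)
    for low, high in regions:
        keep &= ~((ppm >= low) & (ppm <= high))
    return ppm[keep], spectra[:, keep]

def pqn_normalise(spectra, reference=None):
    """Probabilistic quotient normalisation; rows are samples, columns are ppm points.

    Assumption: the reference defaults to the median spectrum of the dataset.
    """
    spectra = np.asarray(spectra, dtype=float)
    if reference is None:
        reference = np.median(spectra, axis=0)
    usable = reference > 0                         # avoid dividing by zero or negative baseline
    quotients = spectra[:, usable] / reference[usable]
    dilution = np.median(quotients, axis=1)        # most probable dilution factor per sample
    return spectra / dilution[:, None], dilution

# Exclusion windows taken from the text (water and formate regions).
EXCLUDED_PPM = [(4.26, 5.50), (8.25, 8.63)]
```

A typical call would be `ppm, X = remove_regions(ppm, X, EXCLUDED_PPM)` followed by `X_norm, factors = pqn_normalise(X)`; samples that differ only by dilution then collapse onto the same normalised spectrum.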

Results
Different cartridge chemistries were utilised in order to produce unique retention profiles for different compound classes (as demonstrated in Figure 1). All samples utilised a sample load incorporating 3mL of urine as a compromise between substrate retention capacity and spectral resolution. Each elution demonstrated differing retention profiles for each method; replicates of the same method, however, showed little difference between spectra, which supports the reproducibility of the SPE methods outlined here.
Quantifying the extent of retention of different classes of compounds can be done using molecules representative of a chemical class. In the Human Urine Metabolome database 14, chemicals are assigned taxonomically (first to classes, then to subclasses) using the ChemOnt automated taxonomy. As the taxonomy is automated, its class structure can be utilised in order to demonstrate retention profiles for given methods. Two separate lists of metabolites were utilised to generate a list of representative compounds. One set of metabolites was examined and ranked according to their frequency of occurrence in the human urinary metabolome 6, such that the metabolites being examined would have a significant chance of being characterised in pooled urine samples. The second set was generated from a method for producing artificial urine 15. The two lists were combined, and the resulting set of metabolites (Table 2) was then characterised by their assigned subclass from the ChemOnt automated taxonomy. The intensities of specific peaks corresponding to these metabolites in the elutions were compared to the intensities of the same peaks in the raw pooled urine samples. From this, percentage retention per compound class could be estimated; hence, insight into which methods can selectively retain different compound classes can be achieved. For example, the peak intensity of retained creatinine (belonging to the subclass 'alpha amino acids and derivatives') from a method utilising a C18 cartridge at neutral conditions was measured at 257. When compared to the intensity of the same peak in the 'raw' urine sample (5027), this gives an estimated retention capacity of creatinine using this method of 5%. After accounting for the other members of that subclass, C18 at neutral conditions has an overall retention of α-amino acids and derivatives of 0.75%.
The compound class retentions per method were then summed to produce a measure of the total retention capacity out of 100, where methods ranking 0 would retain nothing, while methods ranking 100 would retain everything. It can be expected that the sum of the retention capacity estimates of the elutions and the washes for a given method is equal to 100.
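The creatinine example above can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and averaging over subclass members is assumed here purely for illustration, since the text does not specify how the individual metabolites within a subclass (or the classes within a method) were weighted.

```python
def percent_retention(eluted_intensity, raw_intensity):
    """Peak intensity in the elution as a percentage of the same peak in raw urine."""
    if raw_intensity <= 0:
        raise ValueError("raw intensity must be positive")
    return 100.0 * eluted_intensity / raw_intensity

def class_retention(peaks):
    """Mean percentage retention over (eluted, raw) intensity pairs of one subclass.

    Assumption: members are weighted equally; the source does not state the weighting.
    """
    values = [percent_retention(eluted, raw) for eluted, raw in peaks]
    return sum(values) / len(values)

# Intensities quoted in the text for creatinine on C18 at neutral conditions.
creatinine = percent_retention(257, 5027)
print(f"creatinine retention: {creatinine:.1f}%")
```

With the quoted intensities the estimate is roughly 5%, matching the value given in the text.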

Principal component analysis (PCA)
An alternative approach to quantifying the retention profiles of each SPE cartridge can be achieved using PCA. PCA is an analytical method which can be used to reduce the dimensionality of data and produce visual representations of correlations between datasets; for NMR spectra, it allows for clustering and trends between experiments to be demonstrated, as well as for discovering potential outliers.
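As an illustration of the technique, a PCA of a spectral matrix can be sketched with a singular value decomposition. This is not the MATLAB code used in the study: the function name `pca` is hypothetical, and mean-centring without further scaling is an assumption, since the scaling applied before PCA is not stated.

```python
import numpy as np

def pca(spectra, n_components=5):
    """PCA by SVD; rows are spectra (samples), columns are ppm points."""
    X = np.asarray(spectra, dtype=float)
    X = X - X.mean(axis=0)                       # mean-centre each spectral variable
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates on each PC
    loadings = Vt[:n_components]                       # spectral pattern behind each PC
    explained = (s ** 2) / np.sum(s ** 2)              # fraction of variance per PC
    return scores, loadings, explained[:n_components]
```

Plotting two columns of `scores` against each other gives the kind of scores plot in which clusters of elutions are inspected, while each row of `loadings` can be read like a spectrum to see which signals drive the separation.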
A PCA structure built with the datasets from all elution methods utilising natural urine demonstrated that the elutions from acidified ion exchange methods were clearly separated from the other chemistries (Figure 2). Table 4 lists the NMR signals visible in the PCA loadings plot, as well as their correlation coefficient and their tentative assignment; all assignments are made with comparison to the reported values in the literature, and hence can be considered annotated to a level 2 standard 16. It demonstrates that a small number of peaks (creatinine, histidine, creatine, trimethylamine-N-oxide, and 3-methylhistidine being the most prominent) contributed to 69.24% of the differences between elutions. These metabolites were all retained by SCX cartridges under acidic conditions, and their spectral peaks are sensitive to pH changes, causing significant chemical shifts in even buffered samples.
[Table 4 fragment: Creatine 0.83]
The ion exchange method most separated from other datasets is the one which produces elutions from strongly acidified urine using an SCX cartridge. However, there is also significant differentiation with the SAX elutions with 2% formic acid; under these conditions, the SAX cartridge begins to retain compounds like creatinine and histidine where it otherwise would not under neutral or basic conditions. It is not clear why the SAX retention profile would begin to resemble that of SCX; one possibility could be that the silanol groups from the silica on which the SO3− and ammonium modifications are based are more able to retain these compounds under acidic conditions. However, silanols are usually protonated under acidic conditions, and hence would not express this ionic character; the explanation additionally does not account for why C18 with 2% formic acid (which, similarly, contains silanol groups) does not retain creatinine or histidine.
Besides the separation between ion exchange and reversed phase methods, there is additional separation along the secondary principal component axis between phenyl, C18 neutral, SAX neutral, and SCX neutral elutions, and C18 acidified and HLB acidified elutions, with HLB neutral elutions found in between the two clusters (Figure 3). Removing elutions from ion exchange cartridges and reconstructing the PCA affords a similar differentiation with more clarity. The clusters suggest that C18 neutral elutions have more in common with phenyl elutions than with acidified C18 or even HLB neutral elutions. The NMR signals responsible for the separation between reversed phase elutions are tabulated in Table 5; it also notes whether the annotated metabolites or unknowns are positively correlated with phenyl cartridges (and hence retained by phenyl), or are negatively correlated (and hence retained by C18/HLB). Signals marked with an asterisk * are not visible in the loadings plot, but can be observed with manual inspection of the spectra.
Clustering can be observed forming an almost linear scale for C18 and HLB cartridges (Figure 4), with C18 neutral elutions at one end, acidified elutions at the other, and HLB neutral elutions in between. Many of the assigned peaks (Table 4) in the aromatic region are caused by differences in chemical shift between identical compounds (likely due to pH differences); for example, N-methyl-2-pyridone-5-carboxamide (2PY) is significantly correlated both positively and negatively with phenyl elutions (Figure 5), as its spectral peaks undergo chemical shifting due to pH differences in different experiments.
The PC3 vs PC4 plot in the all-elutions structure also demonstrated clustering (Figure 6) of the PBA elutions, positively correlated with the PC4 dimension. The PC4 loadings hence closely resemble the averaged spectra from the PBA elutions, the metabolites (Table 7) being mostly represented by mannitol and N-methylnicotinamide.

Artificial Urine
A similar PCA can be constructed for reversed phase elutions utilising artificial urine; again, there is separation across the first component, demonstrating a notable difference between phenyl and C18/HLB retention profiles. As with the natural urine elutions, the loadings (Table 8) can be annotated to demonstrate the most important spectral differences separating different methods. Here, artificial urine was used to demonstrate that a mixture of representative compounds can be used to estimate the retention capacity of cartridges without using natural urine: as with the natural urine reversed phase elutions, metabolites such as hippurate can be shown to be retained on C18/HLB, but not on phenyl cartridges; similarly, trigonelline can be shown to be retained on phenyl, but not on C18/HLB. This allows for greater control over future SPE experiments aimed at characterising retention profiles of SPE cartridges.

Discussion
The guiding philosophy behind this use of SPE-NMR suggests that not only each cartridge, but each pH and solvent system utilised in a given experiment would result in different retention profiles. These retention profiles can be classified either through annotation of a selection of common metabolites from different compound classes, or by using a more holistic approach, determining the compounds more likely to be retained under different conditions through data treatment. Selective use of methods can then be utilised to reduce peak overlap and aid with metabolite identification. One example of this is displayed in Figure 7; interferences in the 'raw' pooled urine sample are removed by use of a PBA-based method to clearly reveal mannitol. Annotation of a selection of common metabolites is facile and provides immediate and useful information about individual compounds. Ideally, this could be done using a list of metabolites representative of the compound classes generally found in human urine; unfortunately, the partial identification of the human urinary metabolome hinders the creation of a fully representative sample. Additionally, the natural rate of occurrence of metabolites may make a representation of an 'average' sample difficult or even impossible. Peak intensities can also be impacted by NMR shimming, peak overlap, and pH changes, all of which can affect the intensity recorded. Despite these shortcomings, general trends can be established by considering functional groups and structural commonalities between compound classes.
Clustering in PCA plots can be used to demonstrate substantial differences between datasets. The Strong Cation Exchange (SCX) cartridge, utilising the pseudo-permanently charged phenylsulfonic derivative (pKa ≈2.1), provides the greatest separation between clusters when included in PCA structures, due to the ion exchange mechanisms not present in reversed phase chromatography. Ion exchange retention profiles rely heavily on pH control, since all compounds must have at least one positively charged atom in order to have sufficient attraction to the sorbent to be retained; hence, compounds which do not have a positive charge at physiological pH must be in acidic solution for retention to occur. This intrusive sample adjustment will naturally affect the chemistry of the biofluid; using acidified (or basified) conditions is, then, necessarily a trade-off between greater insight into the metabolome through retention, and authenticity of the sample itself. It is additionally feasible that compounds not normally present in the sample may be formed and retained due to the change of conditions, although this was not noted during the course of the experiments.
The importance of pH control is reflected in the retention capacity of SCX cartridges; the neutral pH retention profile is one of the least retaining methods with an estimated retention capacity of 0, while at pH 2 its estimated retention capacity (14.67) is comparable to a more widely recognised reversed phase method, such as HLB with 2% formic acid in all steps (17.40). The compounds best retained on SCX under acidic conditions were predominantly histidine based, with histidine, 3-methylhistidine, and 1-methylhistidine being well retained at pH 2. Creatinine and TMAO were also present in the elutions. A cationic nitrogen atom, possibly stabilised by electron-donating groups through hyperconjugation, may serve as the most important characteristic uniting the compounds; compounds without nitrogen-containing functional groups (such as those found in amino acids) were generally not present in the acidified SCX elutions, although the presence of a nitrogen-containing functional group did not necessarily result in retention. For example, of the proteinogenic amino acids, only histidine was retained in any significant quantity. It may be notable that the histidine side chain has a pKa of 6.04, far lower than those of the other positively charged amino acids arginine (pKa 12.10) and lysine (pKa 10.67).
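The pH dependence discussed above can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a basic group carrying its proton at a given pH. The sketch below is illustrative only: the function name is hypothetical, the pKa values are those quoted in the text, and each side chain is treated as an isolated ionisable group, which ignores the other charged groups in the molecule and any interaction with the sorbent.

```python
def fraction_protonated(pH, pKa):
    """Fraction of a basic group in its protonated (cationic) form at the given pH."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))

# Side-chain pKa values as quoted in the text.
SIDE_CHAIN_PKA = {"histidine": 6.04, "lysine": 10.67, "arginine": 12.10}

for name, pKa in SIDE_CHAIN_PKA.items():
    at_neutral = fraction_protonated(7.0, pKa)
    at_acidic = fraction_protonated(2.0, pKa)
    print(f"{name}: {at_neutral:.3f} protonated at pH 7, {at_acidic:.5f} at pH 2")
```

Under this simplified picture the histidine side chain is mostly neutral at pH 7 and almost fully cationic at pH 2, whereas the lysine and arginine side chains are cationic at both. Side-chain charge alone therefore does not account for the observed selectivity, consistent with the caution expressed in the text.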
Removing ion exchange elutions from the dataset and reconstructing a PCA plot demonstrates additional separation between C18/HLB and phenyl elutions and allows for further probing into the differences between the reversed phase methods. Previous uses of phenyl cartridges in the literature have remarked on their similar retention capabilities to C18, with slightly better retention for polycyclic aromatic compounds 18, but slightly worse retention for other hydrocarbons 19. Phenyl cartridges utilise π-stacking on top of hydrophobic forces in order to provide additional retention for aromatic compounds; however, the strength of π-π interactions tends to lie between around 8 and 12kJ/mol for benzene dimers 20. For comparison, hydrophobic forces may be up to 4 times stronger 21; hence, despite having an additional mechanism of action, the actual retention capacity for phenyl cartridges is significantly lower than that of C18 or HLB cartridges across all methods due to a weaker hydrophobic retention mechanism.
The two major compounds which were retained selectively by phenyl (but not by C18 or HLB) were trigonelline and creatinine, both nitrogen-containing heterocycles. Other compounds with phenyl functional groups, such as phenylalanine or hippurate, did not experience greater retention using phenyl cartridges, and in fact were retained much less, if at all. Conversely, the cyclic metabolites retained under acidic conditions by C18/HLB, but not by phenyl, are predominantly aromatic, with both heterocycles (2-furoylglycine, quinolinate, N-methyl-2-pyridone-5-carboxamide) and hydrocarbon rings (hippurate, phenylacetylglutamine, and p-hydroxyphenylacetate) present. The aromatic heterocycles here do not have the ability to form cationic nitrogen in the rings themselves, unlike the aromatic compounds retained in phenyl elutions, such as trigonelline. Short chain fatty acids such as valeric acid were also retained under non-neutral conditions. It is unclear why charged metabolites like trigonelline would be better retained on phenyl cartridges, although as the charge is positive, it is hypothetically possible that the π-electron clouds located above and below the benzene rings are able to electrostatically ...
Out of the remaining reversed phase sorbents, the C18 and HLB cartridges are known to have similar retention profiles to each other 10, with HLB cartridges often being preferred for their tolerance to drying and the possibility for elimination of conditioning and equilibration steps. The differences between the two can be demonstrated through comparison of elutions: while the two have comparable retention, HLB cartridges tend to retain a slightly larger range of compounds in greater quantities. This is especially true at neutral conditions, where C18 cartridges retain relatively little. Generally speaking, as with the SCX cartridges, more acidic conditions result in greater retention, possibly due to the deionisation of silanol groups on the surface of the sorbent.
However, this again comes with the trade-off of authenticity, as the acidic conditions may cause signal suppression, unwanted reactions between metabolites, or general degradation of the sample itself. Use of 2% formic acid in all steps allows for a balance between the two: while the retention does not extend as deeply as that under pH 2 conditions, the higher pH environment should not be as destructive to sample authenticity, and chemical shifts caused by drastic pH changes should be absent. HLB/C18 under acidic conditions are powerful analytical methods which can reveal signals which are otherwise not visible; for example, unknown N, which displays a doublet of doublet of doublets (ddd) signal at 7.12ppm, is normally obscured by a 3-methylhistidine peak (Figure 8). The use of reversed phase methods can hence provide additional information about the human urinary metabolome. C18 and HLB cartridges themselves can also be differentiated from each other. On top of the obvious differences in sorbent structure (a hydrocarbon chain, compared to a polymer containing divinylbenzene), HLB cartridges are not silica-based; hence, silanol groups present in C18 cartridges are not present in HLB. These silanol groups produce secondary interactions with metabolites, commonly expressed as a weak cation exchanger, which can influence retention. This is reflected in the differences between C18 elutions under neutral and acidified conditions: at lower pH, the silanols are generally protonated; at neutral pH, at least some silanols are deprotonated, giving the cartridge the ability to selectively retain some cations. Indeed, observing the PCA results shows that compounds such as creatinine and TMAO are retained at neutral conditions, these compounds also being retained by the SCX cartridge under the appropriate conditions.
The final elutions to consider were those afforded from methods utilising phenylboronic acid (PBA) cartridges. PBA cartridges utilise a unique covalent bonding mechanism in order to selectively retain diols, α-hydroxy ketones, or any other functional groups where two unsubstituted heteroatoms are separated by at least one carbon 22,23. It is not clear whether the diols must adopt a specific isomerism for retention to occur: mannitol is heavily retained in the elutions, but contains both R and S carbon centres, as well as terminal hydroxyls which can rotate to become a given conformer; its retention hence does not give additional insight. Other compounds retained include acetate, acetamide, and N-methylnicotinamide, a metabolite of niacin. The presence of adjacent heteroatoms does not guarantee good retention: for example, citric acid is poorly retained, despite having three carboxylate groups. There is also some retention of dimethylamine in both artificial and natural urine samples, despite it not being a diol; however, it could hypothetically be retained through a single substitution of water at the boronate, rather than through a double substitution, as is normally the case.

Conclusions
We have demonstrated and compared the use of different SPE methods for the retention of different compound classes within the human urinary metabolome. Different retention profiles can give unique insight into the metabolome by revealing metabolite peaks in NMR spectroscopy which had previously been obscured by peak overlap; SPE can hence be used to either remove the suppressing metabolite(s) or to isolate the unknown metabolite itself. These retention profiles can be differentiated based on their retention of not only individual metabolites, but of broader subclasses of compounds united by their structural commonalities, including shared functional groups. Hence, different methods can be utilised in order to give greater control over the annotation process. On top of the metabolites identified by comparison to metabolite databases and available literature, several unknowns were annotated in the elutions; further experiments and comparisons to an authentic chemical reference will be required to identify them to a level 1 standard 16. There also exists the possibility to study washes from the SPE experiments further, as well as to use multiple SPE methods in series, in order to narrow down a set of retained compounds even further. Finally, it may be possible to transfer these methods to an automated SPE system for