Comparative mass spectrometry-based metabolomics strategies for the investigation of microbial secondary metabolites

Brett C. Covington a, John A. McLean ab and Brian O. Bachmann *a
aDepartment of Chemistry, Vanderbilt University, 7330 Stevenson Center, Nashville, TN 37235, USA. E-mail: brian.bachmann@vanderbilt.edu
bCenter for Innovative Technology, Vanderbilt University, 5401 Stevenson Center, Nashville, TN 37235, USA

Received 22nd April 2016

First published on 7th September 2016


Covering: 2000 to 2016

The labor-intensive process of microbial natural product discovery is contingent upon identifying discrete secondary metabolites of interest within complex biological extracts, which contain inventories of all extractable small molecules produced by an organism or consortium. Historically, compound isolation prioritization has been driven by observed biological activity and/or relative metabolite abundance, followed by dereplication via accurate mass analysis. Decades of discovery using variants of these methods have generated the natural pharmacopeia but also contribute to recent high rediscovery rates. However, genomic sequencing reveals substantial untapped potential in previously mined organisms, and can provide useful prescience of potentially new secondary metabolites that ultimately enables isolation. Recently, advances in comparative metabolomics analyses have been coupled to secondary metabolic predictions to accelerate bioactivity- and abundance-independent discovery workflows. In this review we discuss the various analytical and computational techniques that enable MS-based metabolomic applications to natural product discovery and consider the future prospects for comparative metabolomics in the field.



Brett C. Covington

Brett Covington received a B.S. in Chemistry from Austin Peay State University in 2012. He was awarded an NIH-sponsored Chemistry–Biology Interface training grant in 2013 and is currently a 4th year PhD candidate at Vanderbilt University under the supervision of Brian Bachmann. His research in the natural product discovery section of the Bachmann lab has focused primarily on the application of untargeted metabolomic methods to identify microbial secondary metabolite responses to environmental stimuli.


John A. McLean

John McLean graduated in 2001 with a PhD in Chemistry from George Washington University. Following postdoctoral research at Forschungszentrum Jülich in Germany and then at Texas A&M University with Prof. David H. Russell, he began at Vanderbilt University in 2006 where he is Stevenson Professor of Chemistry, Director of the Center for Innovative Technology, co-Director of the Automated Biosystems Core, and Deputy Director of the Institute for Integrative Biosystems Research and Education at Vanderbilt University. McLean and colleagues focus on the conceptualization, design, and construction of structural mass spectrometers, specifically targeting complex samples in systems, synthetic, and chemical biology.


Brian O. Bachmann

Brian Bachmann graduated in 2000 with a PhD in Chemistry from The Johns Hopkins University, studying with Prof. Craig Townsend. Following this, he joined Ecopia Biosciences in Montreal, where he was Director of Chemistry, prior to moving to Vanderbilt, where he is currently Professor of Chemistry and Associate Director of the Vanderbilt Institute for Chemical Biology. He has a long-standing interest in studying and harnessing the biosynthetic capabilities of living systems in order to radically accelerate the discovery of new bioactive compounds and revolutionize how molecules are synthesized.


1 Introduction

Genomic sequencing of both cultivated microorganisms and uncultivated microbiomes has revealed that most of the biosynthetic potential of microorganisms remains inaccessible to date. Even genomes of extensively studied microorganisms contain a large fraction of secondary metabolite biosynthetic gene clusters for which natural products have not been identified.1–3 It is now estimated that greater than 90% of secondary metabolite gene clusters are not expressed under standard laboratory growth conditions4,5 and/or yield products that are difficult to identify within extracted metabolomes. Secondary metabolites play crucial roles in the chemical ecology of their producing organisms,6–8 and these roles often correlate translationally into applications in human medicine.9 Therefore, solving the linked problems of secondary metabolite gene expression and the identification of secondary metabolites within metabolomic inventories has become a central effort in the field of natural product discovery.

Natural product biosynthetic potential can be rapidly estimated from genomic sequence data via automated bioinformatics platforms capable of comparing sequenced biosynthetic gene clusters to those of previously sequenced microorganisms and predicting putative structures of natural products by biosynthetic inference.10–16 Recently, several reviews have described evolving computational tools for biosynthetic gene cluster analysis.17,18 Increasingly sophisticated methodologies have been developed to tackle the biosynthetic gene cluster expression problem, which may be subdivided into heterologous and native approaches. Heterologous strategies endeavour to recapitulate functional secondary metabolic biosynthetic gene clusters in surrogate producers. Gene clusters may be cloned and/or synthesized and refactored into alternate organisms with the aim of detecting newly produced metabolites in comparison to a clean background.19–24 The success of heterologous expression is dependent upon functional expression within the host organism, which is a function of successful transcription, translation, and precursor availability as discussed in other reviews.19,25,26 Given the phylogenetic diversity of microbial secondary metabolite producers, a number of hurdles must be addressed to successfully identify constructs for functional expression. Beyond the optimization of genetic regulatory elements for heterologous expression, differences in protein stability, post-translational modification of biosynthetic enzymes, and precursor availability must be addressed. Alternatively, native expression methods endeavour to activate secondary metabolite production from within the native producer.4,27,28

In addition to refactoring biosynthetic gene clusters via genome editing, many chemical and biological stimuli have been reported over the past few decades that activate secondary metabolite expression. The practice of exposing microorganisms to an array of growth conditions for the purpose of eliciting the production of multiple compounds is not a recent development.29 However, contemporary studies of the impact of stimuli on microbial metabolism, in addition to fulfilling the goals of natural product discovery, now model the chemical ecology and environmental microbiology of microorganisms.4,27,30–32 For example, subinhibitory antibiotic exposure33,34 and vertically acquired antibiotic resistance mutations, which alter transcription and translation machinery,35–40 have been demonstrated to activate the production of a fraction of previously undiscovered metabolites. Similarly, rare earth metal exposure,41–43 which may affect circulating levels of pleiotropic factors, has been demonstrated to modulate secondary metabolite production. A particularly successful strategy for activating secondary metabolite expression in microbes is stimulation with competing organisms.44–48 Taken together, these phenomena suggest that secondary metabolites are indeed produced by microorganisms to respond to environmental stimuli,45 and this is supported by the apparent biosynthetic gene cluster activation selectivity of various stimulatory methods as well as the intrinsically complex nature of secondary metabolite gene cluster regulation.1,49

Regardless, all categories of genome-prioritized natural product discovery require a means of measuring the modulation of secondary metabolite production within the extracted metabolome of native or heterologous producers, and metabolomics methods have been continually adapted to this task. Metabolomics is often defined as the comprehensive study of small molecules within a biological system and provides a direct measure of detectable secondary metabolite production within an organism of interest. Currently there are two analytical platforms to facilitate metabolome profiling for natural product discovery. Nuclear magnetic resonance (NMR) based metabolomic analyses, reviewed elsewhere,50 are not biased by molecular class and provide enhanced structural information for metabolites but are limited by the inherently low sensitivity of NMR. In contrast, the metabolomic analyses through mass spectrometry (MS), which will be the focus of our review, are exceptionally sensitive but are exclusively biased towards ionisable metabolites. The structural diversity of secondary metabolites, which span a broad range of functionality, molecular weight, and ionization efficiency, renders comprehensive detection of all metabolites through MS a challenging endeavour, and there is no universal approach for bioanalytical detection. For this reason, the development of metabolomics methods with MS strategies necessitates a discussion of contemporary practices and advances in analytical instrumentation.

As the product of the central dogma, the metabolome also contains information regarding a wide variety of cellular processes unrelated, or indirectly related, to secondary metabolism. Correspondingly, metabolomics information may encode insights into how secondary metabolite producing organisms respond to chemical and biological stimuli and may also provide a means of investigating the biological mechanisms of newly isolated natural products, antibiotics, and chemotherapeutics, from the metabolomic changes engendered within treated organisms. Secondary metabolites are generally biosynthetic end-products and unlike primary metabolites, they accumulate at higher levels than the fluxes observed in central metabolism.51 Hence, comparatively abundant secondary metabolites are well suited for comparative metabolomics work-flows. Other recent reviews have highlighted some applications of mass spectrometry for the discovery of natural products deriving from plant52 and microbial53 sources. In this review we provide a foundational overview of the analytical techniques that underlie MS-based metabolomic applications to natural product discovery and describe how these various techniques provide differentiating molecular characteristics for detected metabolites. We discuss the computational methods used to process complex metabolomics data and bioinformatics methods that utilize the molecular characteristics of detected metabolites to prioritize and dereplicate leads for natural product discovery. Lastly, we describe how these metabolomic methods are being applied to investigate biological activities for natural products and discuss future prospects for the field.

2 Methods of generating inventories of microbial metabolites

A variety of MS techniques are available to acquire metabolomics data with corresponding advantages and challenges depending on the analytical descriptor(s) that is/are desired. In each method, the end result of the analysis is a set of metabolomic ‘features’, ions with a determined mass-to-charge ratio (m/z) and potentially additional descriptive information. This additional information may include descriptors such as mass accuracy, chromatographic retention time, isotopic envelope, size and shape information, fragmentation data, and topological distribution, among others. A summary of key descriptors and the information they provide in secondary metabolite characterization is provided in Table 1. As the dimensionality of feature characterization has an impact upon the subsequent effectiveness of comparative metabolomics analyses, we will briefly discuss in this section commonly utilized methods for MS acquisition and highlight several of these key molecular characteristics that can be obtained with MS.
Table 1 Overview of analytical descriptors relevant to metabolomics-based natural product discovery
Analytical descriptor | Description | Analytical technique
Mass accuracy | Deviation of the experimentally determined m/z from the true m/z. Expressed as the mass error (e.g. ppm); with sufficiently small error an exact chemical formula can be determined | Mass analyzer: space-dispersive (e.g. ion trap, quadrupole); time-dispersive (e.g. time of flight)
Isotopic modeling | Comparison of the abundances of specific isotopes in the molecular isotopic envelope. Can provide rapid indication of amount and identity of heteroatoms | Chemometrics: theoretical isotope calculators; mass defect analysis; quantitation
Chromatographic retention time | Time required for fluid-solid phase partitioning across a column. Provides separation on the basis of a differentiating characteristic orthogonal to mass | Chromatography: hydropathy (e.g. liquid chromatography); volatility (e.g. gas chromatography); size and charge (e.g. size exclusion, charge exclusion, ion capture)
Ion mobility drift time | Gas-phase electrophoretic separation based on size and shape of the metabolite as ions pass through a gas filled drift tube | Ion drift tube: time-dispersive (e.g. drift time ion mobility, traveling-wave ion mobility); space-dispersive (e.g. field-asymmetric ion mobility)
Fragmentation | Tandem MS using ion activation to provide characteristic fragment species. Provides metabolite structural information to prioritize which of multiple isomers are the likely identity for a given elemental formula | Ion activation: collisional (e.g. collision induced dissociation and surface induced dissociation); electron (e.g. electron transfer dissociation and electron capture dissociation); photoactivation (e.g. infrared multiphoton dissociation and wavelength-tunable ultraviolet photodissociation)



2.1 Mass measurement accuracy

The mass-to-charge ratio (m/z) of detected metabolites is the primary property used to initiate the process of dereplication. For more than a decade mass analyzers have been able to measure m/z with an error of under 1 ppm.54–57 This level of mass accuracy allows for the determination of elemental composition boundaries for compounds under 600 Da (ref. 58 and 59) when coupled with isotopic mass ratios.60 While Fourier transform ion cyclotron resonance MS (FTICRMS) can now routinely perform at sub-ppm mass errors, typical instrumentation (e.g. time-of-flight MS) provides mass errors in the range of 1 to 10 ppm. Unfortunately, accurate mass alone is insufficient to confidently dereplicate features because of the extensive number of potential isomers for a given elemental composition.61 Early compound dereplication is therefore often dependent on obtaining additional distinguishing characteristics such as those listed in Table 1, or others such as UV/Vis spectra and biological activity.62,63 It is also noteworthy that MS analysis is predicated on the ability to generate ions of the species of interest. Neutral or poorly ionizing species are invisible to MS, and because of this the number of detectable compounds from a metabolomic extract will vary depending on the analytical methods used during acquisition, in particular the specific ionization source and ionization conditions.
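
To make the ppm figures in this section concrete, the following minimal sketch computes the signed mass error of a measured m/z against a theoretical value, and the m/z window implied by a chosen tolerance. The function names, the caffeine example, and the measured value are illustrative assumptions and are not taken from the studies cited here.

```python
# Monoisotopic masses (Da) of the lightest stable isotopes, plus the proton mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646


def protonated_mz(formula: dict) -> float:
    """Theoretical m/z of the singly charged [M+H]+ ion for an element-count dict."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral + PROTON


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def mz_window(theoretical_mz: float, tol_ppm: float) -> tuple:
    """Lower and upper m/z bounds for a +/- tol_ppm tolerance."""
    delta = theoretical_mz * tol_ppm * 1e-6
    return theoretical_mz - delta, theoretical_mz + delta


# Hypothetical example: caffeine (C8H10N4O2) with an invented measured value.
caffeine_mz = protonated_mz({"C": 8, "H": 10, "N": 4, "O": 2})
error = ppm_error(195.0880, caffeine_mz)
low, high = mz_window(caffeine_mz, 5.0)
```

At m/z 195 a 5 ppm window is only about 0.002 m/z units wide. This narrows the candidate elemental compositions but cannot distinguish isomers that share a formula.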

2.2 Isotopic modeling

The isotopic envelope, comprising both the major and minor isotopic contributions to the elemental formula, provides several opportunities for enhanced characterization,64 including: (i) detecting the presence of heteroatoms,65 and (ii) isotopic enrichment strategies for relative and absolute quantitation of the abundance of the secondary metabolite.66 The MS analysis of most biological molecules typically concerns elemental formulae comprising C, H, O, and N. The shared characteristic of these elements is that the most abundant isotope is also the lowest mass isotope; thus, for low molecular weight compounds, the lowest mass peak in the envelope (the monoisotopic peak) is also the most abundant. However, the vast majority of the periodic table is characterized by isotopic abundances that vary from lightest to heaviest mass isotope, and these isotopic signatures are oftentimes used in MS-based atomic analyses for identification purposes.67 The presence of heteroatoms such as chlorine or bromine is readily discernible from their contribution to the isotopic abundances observed for secondary metabolites, and their stoichiometric contribution can be quickly verified through the use of isotopic calculator algorithms.68 Furthermore, these approaches are equally well suited to non-natural isotopic enrichment or depletion for determining the relative or absolute abundance of the secondary metabolite that is expressed. One such approach, termed stable isotope labeling by amino acids in cell culture (SILAC), has been demonstrated as a facile tool for incorporating enrichment or depletion in experimental protocols.69 Finally, genomic-based structural predictions, implying biosynthetic precursors, can be combined with stable isotope studies to identify targeted metabolites within organisms.70
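
As an illustration of how halogens reveal themselves in the isotopic envelope, the sketch below computes the relative intensities of the M, M+2, M+4, ... peaks that arise from chlorine and bromine alone. It is a simplified isotope calculator under stated assumptions: the function name is hypothetical, the abundances are approximate natural values, and contributions from 13C and other elements are deliberately ignored.

```python
from math import comb

# Approximate natural abundances of the heavy isotopes (assumed constants).
CL37_ABUNDANCE = 0.2424  # 37Cl; the remainder is 35Cl
BR81_ABUNDANCE = 0.4931  # 81Br; the remainder is 79Br


def binomial(n: int, p: float) -> list:
    """Probability of k heavy isotopes among n atoms, for k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def halogen_pattern(n_cl: int, n_br: int) -> list:
    """Relative intensities of M, M+2, M+4, ... scaled so the tallest peak is 1."""
    cl = binomial(n_cl, CL37_ABUNDANCE)
    br = binomial(n_br, BR81_ABUNDANCE)
    peaks = [0.0] * (n_cl + n_br + 1)
    for i, p_cl in enumerate(cl):
        for j, p_br in enumerate(br):
            peaks[i + j] += p_cl * p_br  # each heavy halogen adds ~2 Da
    tallest = max(peaks)
    return [p / tallest for p in peaks]


one_cl = halogen_pattern(1, 0)   # M+2 at roughly one third of M
one_br = halogen_pattern(0, 1)   # M and M+2 of nearly equal height
cl_and_br = halogen_pattern(1, 1)
```

Comparing an observed envelope against such patterns for a few candidate halogen counts is usually enough to suggest how many chlorine or bromine atoms a feature contains, before a full isotope model including carbon is applied.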

2.3 Chromatographic retention time

Liquid chromatography (LC) is one of the most commonly used approaches to separate individual constituents of complex natural product extracts, and various LC methods and their applications have been previously reviewed.71–73 For natural product separations, a water–acetonitrile or water–methanol gradient is most commonly employed.74 Separation is typically performed on the basis of hydropathy, where reversed phase LC (RPLC) and hydrophilic interaction chromatography (HILIC) are most commonly utilized,75 and column retention is affected by the ionization of acidic and basic functional groups in the analytes. Mobile phase pH can thereby significantly affect the separation efficiency for natural product extracts. Due to the dependence of compound retention on pH, and to assist ionization, mobile phases are commonly acidified with acetic acid, trifluoroacetic acid, or formic acid76 to protonate acidic sites and facilitate retention. However, as low pH may suppress detection of negatively charged species in switched scanning modalities, neutral volatile buffers are often preferred.

Liquid chromatography-mass spectrometry (LC-MS) acquisition can take minutes to hours per chromatographic separation, and environmental changes throughout the course of the sample set (column conditioning, instrumental sensitivity and accuracy drift, etc.) can affect the quality of the data. Consequently, for multiple extract samples analysed in a sequential fashion, conditional changes between the start and end of analysis could lead to significant artefactual differences in group metabolomes, which complicate interpretation of subsequent comparative analyses. Although these problems are challenging, recent reviews have outlined metabolomic experimental design strategies to accommodate them.77,78

2.4 Size and shape by ion mobility

Additional metabolomic feature information can be obtained by using gas-phase ion mobility-MS (IM-MS), without significantly increasing analysis time over MS alone.79 The mechanism and utility of IM-MS have been the topic of several recent reviews.79–81 Briefly, in time-dispersive IM-MS, a uniform weak electric field is applied to a post-ionization ion drift tube containing an inert gas, where the velocity of an ion through the chamber depends on its thermal collisions with the background gas and on its charge state.82 The number of collisions an ion experiences as it traverses the drift cell is proportional to its collision cross-sectional area, providing distinguishing information regarding the ion's shape and/or conformation in the gas phase.83 The separations in IM are very low energy in comparison with the collisions used for fragmentation analysis: in IM the ions experience approximately 10⁴ to 10⁶ collisions across a size separation, versus one or several high energy collisions in collision induced dissociation (CID). Typical drift tube resolving power is sufficient that conformationally restricted or extended metabolites, such as cyclic peptides, polycyclic polyketides, and polyenes, often possess distinct ion mobility profiles, which are obtained over the course of microseconds to milliseconds. IM is often coupled with time-of-flight (TOF) MS, which can acquire the m/z ratios for ions eluting from the IM cell within a few microseconds. This frequency of data collection allows IM-TOFMS to be coupled to chromatography with sufficient sampling across chromatographic peaks, which occur over the course of minutes.79,84 When applied to microbial metabolomics the enhanced separation and sensitivity provided by IM-MS has been beneficial for identifying known secondary metabolites, dereplication, and prioritizing features.
We have previously used IM-MS to help obtain high quality fragmentation data (IM-MS/MS) for all detected ions from crude extracts while comparing the differences between antibiotic resistant and wild-type Nocardiopsis.40 This facilitated the putative identification of several metabolites over-produced in mutant strains. IM-MS has been applied to differentiate halogenated natural products in cyanobacteria85 as well as peptide natural products from cave actinomycetes.86 IM-MS has also been applied to investigate the 3-dimensional structures of lasso peptides, interlocked microbial peptides with a range of bioactivities,87 and this technology will likely find other useful applications to natural product discovery as the technology becomes more widely available.
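
The relationship between drift time and mobility in a uniform-field drift tube can be sketched with the standard low-field expressions, K = L²/(V·t) and its normalization to standard pressure and temperature. These relations are textbook assumptions rather than anything stated in the studies above, the function name and all instrument parameters are hypothetical, and the sketch ignores the time ions spend outside the drift region.

```python
def reduced_mobility(drift_time_s: float, length_cm: float, voltage_v: float,
                     pressure_torr: float, temperature_k: float) -> float:
    """Reduced mobility K0 (cm^2 V^-1 s^-1) for a uniform-field drift tube."""
    # Mobility is drift velocity (L / t) divided by field strength (V / L).
    mobility = length_cm ** 2 / (voltage_v * drift_time_s)
    # Normalize to 760 Torr and 273.15 K so values are comparable across runs.
    return mobility * (pressure_torr / 760.0) * (273.15 / temperature_k)


# Hypothetical instrument settings and two invented drift times.
settings = dict(length_cm=80.0, voltage_v=1400.0, pressure_torr=4.0, temperature_k=300.0)
k0_compact = reduced_mobility(0.020, **settings)   # earlier arrival
k0_extended = reduced_mobility(0.024, **settings)  # later arrival
```

An ion that arrives later has a lower mobility, which for ions of the same charge corresponds to a larger collision cross section. This is the basis on which compact and extended conformers of similar mass are separated.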

2.5 Ion fragmentation for structural information

Both time-dispersive (e.g. TOFMS) and scanning mass spectrometers (e.g. quadrupole MS and ion trap MS) can be used to acquire both precursor and fragment ion information (i.e. tandem MS),88 which can provide a wealth of highly specific structural information to help identify and dereplicate metabolites.89–93 In metabolomics-driven natural product discovery workflows, fragmentation data are commonly collected via an automated data-dependent acquisition method in which the most abundant ions within a scanning cycle are automatically selected for fragmentation. Fragmentation data analysis facilitates natural product dereplication which, as will be discussed in more detail below, is a critical step in the process of natural product discovery.94 A variety of methods are applied to activate and induce dissociation of target ions. These are primarily categorized by how the ion is activated (collisionally, by electron attachment, or through photon absorption), and the observed fragment ions will vary with the method and parameters selected for fragmentation. For small molecules, collision induced dissociation (CID)95 and surface induced dissociation (SID)96 are commonly utilized. The degree of fragmentation observed using these methods depends on the number and lability of scissile bonds within a given molecule as well as the resulting internal ion energies used for analysis. In automated data-dependent tandem mass spectrometric fragmentation analysis, a single set of dissociation parameters may not be appropriate for every feature of interest within a sample, so multiple experiments may be required to determine optimal fragmentation parameters and to effectively capture fragmentation data for a broad cross section of molecular classes.
Ultimately, these methods provide characteristic fragmentation spectra that can be compared to established libraries of secondary metabolite fragmentation data to identify known secondary metabolites within the experimental sample.63 Additionally, tandem mass spectrometric data are useful for elucidating the structures of peptide natural products and have been used in ‘peptidogenomics’ strategies to link ribosomal and non-ribosomal peptide natural products to their cognate biosynthetic gene clusters.97–99
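
Comparison of an experimental fragmentation spectrum against a library is often scored with some form of spectral similarity. The sketch below uses a plain binned cosine score as one simple possibility; it is not the scoring scheme of any particular library or tool cited here, and the function names, bin width, compound names, and spectra are all hypothetical.

```python
from math import sqrt


def bin_spectrum(peaks: list, bin_width: float) -> dict:
    """Sum intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_score(spectrum_a: list, spectrum_b: list, bin_width: float = 0.01) -> float:
    """Cosine similarity of two binned spectra: 1 for identical shape, 0 for no shared bins."""
    a = bin_spectrum(spectrum_a, bin_width)
    b = bin_spectrum(spectrum_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented library and query spectra as (fragment m/z, intensity) pairs.
library = {
    "compound_A": [(105.07, 100.0), (133.06, 40.0), (161.06, 15.0)],
    "compound_B": [(91.05, 100.0), (119.05, 60.0)],
}
query = [(105.07, 90.0), (133.06, 50.0), (161.06, 10.0)]
scores = {name: cosine_score(query, peaks) for name, peaks in library.items()}
best_match = max(scores, key=scores.get)
```

Fixed-width binning is the simplest choice but can split peaks that fall close to a bin edge; tolerance-based peak matching avoids this at the cost of a little more code.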

2.6 Leveraging spatiotemporal metabolomics inventories to capture inter-organism interactions

Secondary metabolite producing microorganisms can be cultivated on agar medium100–102 or in planktonic liquid103–109 culture medium, and several methods have been developed to extract and chromatographically separate resulting metabolomes.110–112 However, microorganisms cultivated as monocultures or mixed cultures on agar may display planar metabolite distributions containing valuable information about chemical pleiotropism, nutrient dependence, and chemical ecology,113–115 and bulk liquid extractions discard the spatial metabolomic feature differentiation that could otherwise be observed.116 Correspondingly, imaging mass spectrometry (IMS) methods have been developed for agar cultivated117 and environmental118–120 microbial samples to provide a second distinguishing ion characteristic, spatial localization. In IMS experiments, the area of a sample is divided into pixels which are individually analysed by the mass spectrometer. Matrix assisted laser desorption ionization (MALDI) is a commonly used ionization technique for IMS.121,122 MALDI requires the application of an ionization matrix to facilitate ionization of cell and agar embedded metabolites. This ablative technique has been applied to visualize spatiotemporal distributions of secondary metabolite production in marine cyanobacteria51,120 and to elucidate microbial producers responsible for observed secondary metabolite biosynthesis.119,123 MALDI-IMS has also been used to visualize metabolic exchange between interacting organisms117,124–129 and identify novel antibiotic production in Streptomyces.130 The efficiency of MALDI ionization is matrix dependent, and varies across metabolite classes.
Correspondingly, desorption electrospray ionization (DESI) and secondary ion mass spectrometry (SIMS), which do not require the addition of an ionizing matrix, have also been applied to visualize natural product distribution through IMS,131–134 among others.135 Determining the spatial distribution of produced secondary metabolites can be useful in natural product research, and as these IMS technologies continue to develop they are expected to become an integral component of metabolomic investigations into microbial secondary metabolites.136 Metabolomic features generated via IMS consist of an m/z value and its corresponding Cartesian coordinates in agar culture.
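
Because each IMS feature pairs an m/z value with pixel coordinates, an ion image is simply the intensity of one m/z window plotted over the pixel grid. The following minimal sketch assembles such an image; the data layout, function name, and all values are hypothetical and chosen only for illustration.

```python
def ion_image(pixels: list, target_mz: float, tol: float, width: int, height: int) -> list:
    """Build a height x width grid of summed intensity within target_mz +/- tol.

    Each pixel is (x, y, spectrum), where spectrum is a list of (m/z, intensity).
    """
    image = [[0.0] * width for _ in range(height)]
    for x, y, spectrum in pixels:
        image[y][x] = sum(
            intensity for mz, intensity in spectrum if abs(mz - target_mz) <= tol
        )
    return image


# Invented 2 x 2 acquisition with two ions of interest.
pixels = [
    (0, 0, [(500.30, 10.0), (622.40, 5.0)]),
    (1, 0, [(500.31, 250.0)]),
    (0, 1, [(622.40, 80.0)]),
    (1, 1, [(500.29, 40.0), (622.41, 75.0)]),
]
image_500 = ion_image(pixels, target_mz=500.30, tol=0.02, width=2, height=2)
image_622 = ion_image(pixels, target_mz=622.40, tol=0.02, width=2, height=2)
```

Comparing the two grids shows the ions concentrated in different regions of the sample, which is the kind of spatial differentiation that a bulk extraction would average away.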

3 Preparation of high content mass spectral data for metabolomics studies

3.1 Strategies for formatting data for effective comparative analysis

Typical LC-MS analysis of extracted metabolites results in thousands of detectable features characterized by m/z and retention time, as well as potentially ion mobility and fragmentation.137 Unbiased manual comparison of features between samples is challenging, especially when analysing a large number of samples. Therefore, it is necessary to automate feature collection from the acquired data in several processing steps that will facilitate data analysis (Fig. 1). There are a variety of incompatible vendor-specific data file formats for mass spectrometric data, which originally impeded the development of universal processing software. To address this issue the Proteomics Standards Initiative (PSI) group138 developed a standardized format, mzData, to facilitate data exchange.139 An additional format, mzXML, was developed to serve as a standard format for MS and MS/MS data processing.140,141 While both of these formats were popular, the scientific community pushed for a unified standard format to simplify software development. A new format, mzML, was released to replace both mzXML and mzData;142 however, all three are still commonly used for metabolomics data. Correspondingly, one of the first steps in metabolomics data processing is to convert the data files from a vendor-specific format into one of the standard formats listed above that the processing software accepts. One common utility for this is ProteoWizard's MSconvert,143 which also has the ability to pre-filter the data with user defined parameters.
Fig. 1 General metabolomics workflow. Metabolites are extracted from experimental conditions and detected through MS analysis. MS data is then formatted and processed before undergoing statistical analyses to determine important metabolomic changes between the sample groups. These results may then be used to direct new experiments to optimize secondary metabolite production or test biological hypotheses generated from the initial experiment.

3.2 Methods and considerations for metabolite peak detection and alignment

After data format conversion, metabolite peaks must be identified and extracted from the data and aligned for all samples. A number of reviews have covered and compared the various processing packages and their algorithms.110,144–146 In this section we highlight a few of the common computational methods used for natural product discovery based metabolomics. The initial peak identification can be fairly challenging, as LC-MS ionization methods typically generate high levels of background chemical noise largely from mobile phases and buffers.147 Therefore, the automated processing methods must be able to identify genuine sample features while omitting detected chemical background and instrumental noise, and there have been several algorithms developed to accomplish this task. Vectorized peak detection algorithms identify data points above a set intensity threshold in both the m/z and retention time dimensions.148,149 There are also a number of 1-dimensional LC-MS processing algorithms commonly used for peptide analysis which detect peaks by using the isotope patterns in the m/z dimension.150–152 Another of the more common methods involves separating the LC-MS data into extracted ion chromatograms (EIC), each covering a very narrow m/z range. This process is called binning and, while fast and generally effective, this can lead to problems if the bin size is too large or too small. A matched filter153 is commonly applied to EICs to select for m/z peak shapes in the chromatographic time domain, and if features are split between multiple bins due to inappropriate sizing, they can be excluded by the algorithm resulting in false negatives. 
The traditional peak detection algorithm of XCMS, a widely used LC-MS processing software package, sections the data into 0.1 Da wide EICs and then applies a second derivative Gaussian filter that aids in the discrimination of authentic peaks from noise, along with a 10:1 signal to noise intensity threshold.154 An alternative to the binning approach for high resolution MS data is the centWave algorithm, which identifies ion dense regions of interest in centroid data.155 Peaks are detected along these regions using a continuous wavelet transform, which accommodates a much wider range of peak shapes.155,156 The quality and validation of peak detection from increasingly complex datasets remains an area of intense research effort.157,158
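
The idea of filtering a binned EIC with a second derivative Gaussian and then applying a signal to noise threshold can be sketched as follows. This is a simplified illustration and not the XCMS implementation: the function names, the kernel width, the noise estimate (median absolute filtered value), and the synthetic chromatogram are all assumptions made for the example.

```python
from math import exp
from statistics import median


def second_derivative_gaussian(sigma: float, half_width: int) -> list:
    """Negated second derivative of a Gaussian, shifted so the taps sum to zero."""
    kernel = []
    for i in range(-half_width, half_width + 1):
        u2 = (i / sigma) ** 2
        kernel.append((1.0 - u2) * exp(-u2 / 2.0))
    mean = sum(kernel) / len(kernel)
    return [k - mean for k in kernel]  # zero sum: a flat baseline filters to 0


def apply_filter(eic: list, kernel: list) -> list:
    """Correlate the EIC with the kernel; scans the kernel cannot fully cover stay 0."""
    half = len(kernel) // 2
    filtered = [0.0] * len(eic)
    for center in range(half, len(eic) - half):
        window = eic[center - half:center + half + 1]
        filtered[center] = sum(k * x for k, x in zip(kernel, window))
    return filtered


def detect_peaks(eic: list, sigma: float = 3.0, snthresh: float = 10.0) -> list:
    """Scan indices that are local maxima of the filtered trace above snthresh x noise."""
    filtered = apply_filter(eic, second_derivative_gaussian(sigma, int(4 * sigma)))
    # Assumption: noise is estimated as the median absolute filtered value.
    noise = median(abs(v) for v in filtered) or 1e-12
    return [
        i for i in range(1, len(filtered) - 1)
        if filtered[i] > filtered[i - 1]
        and filtered[i] >= filtered[i + 1]
        and filtered[i] / noise >= snthresh
    ]


# Synthetic EIC: baseline with a small repeating ripple plus one Gaussian peak at scan 100.
eic = [5.0 + (i % 3) + 1000.0 * exp(-((i - 100) ** 2) / (2 * 3.0 ** 2)) for i in range(200)]
peak_scans = detect_peaks(eic)
```

Because the kernel sums to zero, a constant baseline contributes nothing to the filtered trace, while signals whose width matches the kernel are amplified. This is why a matched filter separates chromatographic peaks from background more reliably than a raw intensity threshold.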

Another consideration in data processing is the tendency for retention times of features to vary between multiple injections due to the changes in chromatographic conditions discussed previously. It is therefore necessary to match mass features between samples of an experiment, and align the retention times of matched peaks to generate a discrete feature list. Originally, internal reference standards were used to adjust retention times of each sample.159,160 However, retention time drifts throughout an acquisition are often not linear,154,161 and the use of standards also requires additional sample preparation steps. A variety of algorithms have been developed to align features between sample runs without the use of internal standards. The original XCMS alignment algorithm identifies hundreds to thousands of peak groups that are present in a large number of samples. These “well behaved” groups are used as markers to align the remaining detected features. Typically, the number of these markers identified from metabolome extracts is sufficient to cover the chromatographic profile of samples and correctly align the nonlinear retention time drifts. Local regression, LOESS,162 is then used to approximate drifts for regions without sufficient peak markers. Several alignment algorithms have been developed to process LC-MS data,163–169 and in a comparative study of six freely available retention time alignment methods the XCMS algorithm was shown to be the best for processing metabolomics data.170 However, it was noted that the selection of parameters used for the methods could have a large impact on the data output, such that the apparent success of any particular method is dependent upon the user's experience.
A software package, Isotopologue Parameter Optimization (IPO), was recently released to automatically optimize XCMS parameter settings using natural 13C isotopic peaks.171 This software applies to a variety of different sample types, chromatographic strategies, and instrument methods, and helps to simplify and systematize method development while optimizing metabolomics processing for non-experts.
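The marker based alignment described above can be sketched as a local regression of retention time deviation against retention time. The code below is an illustrative, standard library only approximation and not the XCMS routine: `loess_predict` and `correct_rt` are hypothetical names, the tricube weighting and the `frac` span are common LOESS choices assumed here, and the marker deviations are taken to be each marker's observed retention time in one sample minus its consensus retention time across samples.

```python
import math


def loess_predict(x_query, xs, ys, frac=0.5):
    """Locally weighted linear fit (tricube weights) evaluated at x_query."""
    n = len(xs)
    k = max(2, math.ceil(frac * n))
    dists = sorted(abs(x - x_query) for x in xs)
    # Slightly widen the window so the k-th neighbour keeps a non-zero weight.
    bandwidth = dists[k - 1] * 1.0001 + 1e-12
    weights = [(1.0 - min(abs(x - x_query) / bandwidth, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw
    my = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x_query - mx)


def correct_rt(rt, marker_rts, marker_deviations, frac=0.5):
    """Subtract the locally estimated drift from an observed retention time."""
    return rt - loess_predict(rt, marker_rts, marker_deviations, frac)
```

The span `frac` plays the same role as the user-selected parameters discussed above: too small a span chases noise in the markers, while too large a span cannot follow nonlinear drift.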

After peak alignment it is common for several mass/retention time features to possess few or even no matches between samples. This may be because some peaks are entirely unique to a subset of experimental samples, but it can also stem from errors in peak detection due to inappropriate parameter settings, noisy data, etc. Gap filling is commonly used to ensure these are not false negatives and to provide a non-zero value for subsequent statistical analyses. In the absence of a detectable peak, the values obtained through gap filling reflect noise within the region in which peaks were detected in other samples. For low abundance features, the integrated noise level over the peak region may be similar to the value determined for the feature, and this can lead to the observation of a metabolite ion that statistically correlates with a single condition while its lower abundance isotopes show no correlations in subsequent statistical analyses.
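A minimal sketch of the gap filling step, under stated assumptions, is given below. The data layout is hypothetical: `feature_table` maps each feature to per-sample areas with `None` marking a missing value, `raw_traces` holds the raw chromatographic trace for each (sample, feature) pair, and `regions` gives the consensus retention time window in which the peak was found in other samples.

```python
def integrate_region(rts, intensities, rt_min, rt_max):
    """Trapezoidal area of the raw signal for segments lying inside [rt_min, rt_max]."""
    area = 0.0
    for i in range(len(rts) - 1):
        if rts[i] >= rt_min and rts[i + 1] <= rt_max:
            area += 0.5 * (intensities[i] + intensities[i + 1]) * (rts[i + 1] - rts[i])
    return area


def fill_gaps(feature_table, raw_traces, regions):
    """Replace missing values with raw signal integrated over the consensus region."""
    filled = {}
    for feature, per_sample in feature_table.items():
        rt_min, rt_max = regions[feature]
        filled[feature] = {}
        for sample, area in per_sample.items():
            if area is None:
                rts, intensities = raw_traces[(sample, feature)]
                area = integrate_region(rts, intensities, rt_min, rt_max)
            filled[feature][sample] = area
    return filled
```

Note that a filled value is simply whatever signal lies in the window, so for a sample with no real peak it is integrated baseline noise, which is the source of the low abundance artefact described above.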

4 Analysis of metabolomics data in the context of secondary metabolites

The next stage of metabolomics analysis consists of applying one or more methods to compare metabolomics datasets. Depending on the objectives of a given study, several complementary methods may be applied. In the following discussion we review extant methods for comparative metabolomics analysis summarized in Table 2. To illustrate the application of these methods, we apply them to the analysis of a metabolomics dataset focused on the cytotoxic macrolide producing organism, Nocardiopsis sp. FU40, and its exposure to multiple competing organisms in mixed culture. In selected mixed culture conditions, this organism increased production of the secondary metabolites called ciromicins, which we highlight throughout data analyses.
Table 2 Overview of methods for metabolomic data analysis
Method | Description | Applications | Disadvantages
Principal component analysis and projections to latent structures | MVSA to identify significant covariance within data | Identifying data outliers; strain prioritization; grouping samples; compound prioritization | Less effective with large datasets
Self organizing maps | Organizes features into a 2-dimensional map based on feature response trends across a variety of experimental conditions | Grouping samples; compound prioritization; comparing large numbers of experimental conditions | Less effective with small numbers of conditions
Molecular networking | Organizes features into a connectivity network based on similarities in molecular fragmentation patterns | Compound prioritization; compound dereplication | Fragmentation can vary with instrument parameters


4.1 Multivariate statistical analysis and data projections for identification of abundant covarying metabolites

Subsequent to pre-processing, metabolomics data can be analysed through multivariate statistical analyses (MVSA), which simplify the data and identify significant correlations within them. Two common methods for metabolomics data analysis are partial least squares (PLS, also known as projections to latent structures) regression and principal component analysis (PCA), both reviewed in more detail elsewhere.172,173 Briefly, PLS methods assume that changes within the data are largely driven by a subset of latent variables, which are not themselves measured within the data but are more abstract, such as experimental treatments/stimulants or conditions. With this assumption, a PLS analysis will identify latent vectors within the data that describe the maximal covariance between user defined groups.174,175 Alternatively, PCA makes no assumptions about group membership and identifies the sources of the highest variance across the samples to distinguish the samples from one another.176 The fundamental difference between these two analyses is that PLS is supervised with user defined groups while PCA is an unsupervised variable reduction method. Orthogonal signal corrections can be applied to PLS regressions to improve separation between predictive and non-predictive variation.177 The products of these PCA and PLS analyses are scoresplots, or projections of samples onto a hyperplane within the data describing sample covariance. Scoresplots show the separation of samples based on feature variance and so indicate which samples are similar (nearby in Cartesian space) and dissimilar (far away) with regard to their most significantly varying features. Replicate analyses of the same sample should cluster within the scoresplot, and in this way scoresplots are a useful means of identifying errors in sample acquisition or data pre-processing. Additionally, a control comprised of pooled samples should locate close to the origin of a PCA plot.
Another useful product of PCA is the loadings plot, which shows correlations between variables in the data and summarizes these variables' impacts on the scoresplot. Nearby features are positively correlated, while features on opposite sides of the origin are negatively correlated, and features in the same region as samples in the scoresplot will be more abundantly or uniquely present in those samples.
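The relationship between scores and loadings can be illustrated with a small sketch that extracts the first principal component by power iteration. This is a teaching example under stated assumptions, not the procedure of any package cited here: the function name is hypothetical, the data are only mean-centred (no Pareto or unit variance scaling), and only one component is computed.

```python
import math


def pca_first_component(data, iterations=200):
    """Return (scores, loadings, explained variance fraction) for PC1.

    `data` is a list of samples, each a list of feature intensities.
    """
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    loading = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        # One power-iteration step on X^T X: project, back-project, normalise.
        scores = [sum(r[j] * loading[j] for j in range(p)) for r in centered]
        new = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        loading = [v / norm for v in new]
    scores = [sum(r[j] * loading[j] for j in range(p)) for r in centered]
    total = sum(v * v for r in centered for v in r)
    explained = sum(s * s for s in scores) / total if total else 0.0
    return scores, loading, explained
```

Each sample receives one score (its coordinate on the scoresplot axis) and each feature receives one loading (its weight on that axis), which is why features with large loadings sit in the same direction as the samples in which they are most abundant. Mean-centring also places the average sample at the origin, consistent with the expectation that a pooled control locates near the origin.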
4.1.1 Strain prioritization via principal component analysis. One approach to the discovery of new natural products has been to prioritize organisms distinguished as metabolically unique through PCA. There is often a great deal of redundancy in the compounds identified through microbial natural product screening endeavours, and this redundancy can be reduced through the selection of metabolomically diverse microbial strains.178 Under the hypothesis that organisms with similar secondary metabolic potential would cluster in PCA space, Hou et al. analysed 47 microbial strains to demonstrate how MVSA could prioritize strains with diverse secondary metabolic potential.179 Similarly, PCA has been used to prioritize marine microbial symbionts180 as well as phylogenetically similar Streptomyces181 for natural product isolation. Fig. 2 demonstrates how metabolically unique organisms are distinguishable along the principal component vectors of the PCA scoresplot. This method may also be useful for identifying new classes of bioactive microbial compounds, as has been done for plant extracts.182 However, caveats to this approach include (1) that the correlating features responsible for PCA prioritization of a subset of organisms from a library may not be secondary metabolites, which are generally present in relatively low abundance within crude extracts, and (2) that low abundance secondary metabolites will not be emphasized by these methods.
Fig. 2 Using a PCA scores plot to prioritize microbial producers. A panel of actinomycetes, including the Microbispora, Streptomyces, and Nonomuraea genera, was analysed. In this analysis, 14 strains grown under identical conditions were compared and principal component analysis was used to display metabolomic feature variance between the strains. Principal component 1 primarily separates Streptomyces from other strains, and component 2 further distinguishes Nocardiopsis sp. FU40 as metabolomically unique compared to other tested strains. Percentages shown in parentheses correspond to the variance between the samples contained within the specific component.
4.1.2 Secondary metabolite prioritization within metabolomics data via principal component and regression analyses. One important application of PCA and PLS metabolomics for natural product discovery is to prioritize induced secondary metabolites in comparative analyses between chemically and/or biologically stimulated and control conditions. Secondary metabolite production can be activated in microorganisms through a variety of chemical and environmental stimuli,29,30,183,184 and PCA and PLS are commonly applied to identify abundantly produced features in these conditions.40,185–187 Binary comparisons using S-plots can be used to identify group specific features of a PLS model.188 These graphs separate features by their covariance along the x-axis and their correlation to user defined groups on the y-axis. More simply, more abundant features are farther from the origin on the x-axis, and features with correlations closer to 1 or −1 are likely to be unique or specific to one group or the other. Volcano plots have also recently been used to identify significantly covarying metabolites in binary comparisons of natural product extracts.189 Volcano plots show each feature's statistical significance (p-value) on the y-axis and fold change along the x-axis.190,191 Similarly, PCA loadings plots can be used to visualize significant feature differences between sample sets. Fig. 3 demonstrates how S-plots, volcano plots, and loadings plots can distinguish induced metabolomic features in our Nocardiopsis case study. The loadings plot tripartite comparison identifies features that correlate with either the Nocardiopsis or Rhodococcus monocultures or a mixed culture where the two compete for nutrients. In this plot the induced cytotoxic macrolactam ciromicin is clearly distinguishable as positively correlated with the mixed culture extract.
Similarly, ciromicin was clearly identified through the S-plot and volcano plot comparisons between the Nocardiopsis monoculture and the mixed culture. These methods can be very powerful, and freely available online metabolomics packages, such as XCMS Online192,193 and MetaboAnalyst,194,195 can perform some routine MVSA data analyses in addition to data pre-processing. An alternative and distinctive comparative analysis available through XCMS Online is the cloud plot.196 These plots convey feature fold changes, m/z, retention time, and statistical distribution in the same figure, and can perform both binary and multigroup comparisons.193
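As a worked illustration of how volcano plot coordinates are obtained for a single feature, the sketch below computes a log2 fold change and a p-value for a binary comparison. The choice of an exact permutation test on the difference of group means is an assumption made to keep the example free of external libraries; published workflows more commonly use a t-test, and the function names are hypothetical.

```python
import math
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    total = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        if abs(mean(a) - mean(b)) >= observed - 1e-9:
            hits += 1
        total += 1
    return hits / total


def volcano_coordinates(control, treated):
    """Return (log2 fold change, -log10 p) for one feature."""
    fold = math.log2(mean(treated) / mean(control))
    p = permutation_p_value(control, treated)
    return fold, -math.log10(p)
```

With three replicates per group there are only 20 possible relabelings, so the smallest attainable two-sided p-value from this test is 0.1 regardless of how large the fold change is. This is one reason replicate number matters when interpreting the y-axis of such plots.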
Fig. 3 MVSA, S-plots, loadings, and volcano plots to identify induced features. (a) The scoresplot reveals group separation between the Nocardiopsis monoculture (NF), the Rhodococcus wratis competitor monoculture (RW), and the mixed culture (RW&NF). (b) S-plot shows ciromicin significantly correlates (p < 0.1) to the mixed culture group in a binary comparison vs. the Nocardiopsis monoculture. (c) Loadings plot of features shows ciromicin contributes significantly to group differentiation on the PCA scoresplot. (d) Volcano plot also prioritizes ciromicin, which has a high correlation (low p-value) on the y-axis and high fold change on the x-axis. Shading in panels a and c is used to highlight the data corresponding to the different sample subtypes.

4.2 Discovering molecular inventories of microbial responses via self organizing map analytics

A strength of MVSA in the analysis of metabolomics datasets is the identification of the most unique and abundant features between small numbers of treatment conditions. However, these methods are limited to displaying data in two or three dimensions and are biased towards the largest differences within the entire dataset. Therefore, the utility of MVSA to represent an experiment diminishes as the number and diversity of samples increases. For instance, it is common to screen a target organism under dozens of stimulus conditions to optimize compound production, or to induce silent biosynthetic gene clusters, and in these cases we have previously demonstrated that an alternative method utilizing Kohonen self-organizing map (SOM) analytics can be more effective at representing multiplexed stimuli data than PCA.184 As discussed above, metabolomic analysis via LC-MS results in the acquisition of thousands of detectable features. Through SOM analyses these features are organized using an artificial neural network into a 2-dimensional grid based on feature response patterns across all experimental conditions. Features that share similar trend patterns are grouped in nearby nodes of the map as shown in Fig. 4. Through multiple iterations, typically several hundred, this organization is improved, ultimately resulting in a feature map in which regions of the map correspond to clusters of similar response trends. Unlike MVSA, SOM analyses improve with increasing amounts of data and response conditions (e.g. stimuli), as this leads to more varied response trends which in turn enhances feature organization. A metabolomics workflow, molecular expression dynamics inspector (MEDI), provides an open access methodology for SOM analysis from MS data and is readily applicable to microbial metabolomics.197 In MEDI, each tile, or node, of the grid is coloured based on the centroid intensity of its features to generate heatmaps.
Difference maps can be generated by subtractive analysis (e.g. control and stimulus conditions) to readily prioritize abundant and treatment-specific metabolomic features into regions of interest. We applied these SOM analytics to map stimuli-induced metabolomic responses from 23 distinct conditions in Streptomyces coelicolor.184 In this study 16 detected secondary metabolites produced by S. coelicolor were induced in one of the 23 conditions and prioritized through the SOM analysis.184 Application of SOM analyses to investigate metabolomic changes engendered through microbial competition in our Nocardiopsis case study prioritized several metabolites unique to mixed culture conditions, including the ciromicins. Additionally, the single SOM analysis recapitulates the results of multiple MVSA analyses. In Fig. 5 three PCA loadings plots are compared with three SOM heatmaps from the Nocardiopsis mixed culture example study. When the features held within regions of interest on the SOMs are sorted by abundance they are highly consistent with the loadings plots from PCA analyses. Indeed, there is correspondence between PCA and the neural networks used for MEDI analysis. As the data and/or conditions become sparser, the SOM heatmap begins to decompose into a similar functional form as PCA.
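The feature organization step can be sketched with a very small Kohonen map. The code below is not MEDI: the grid size, learning rate schedule, Gaussian neighbourhood, random initialization, and normalization of each feature's response profile to its own maximum are all illustrative assumptions, and the function names are hypothetical.

```python
import math
import random


def normalize(profile):
    """Scale a response profile to its maximum so that trend, not magnitude, drives placement."""
    peak = max(profile)
    return [v / peak for v in profile] if peak > 0 else list(profile)


def best_matching_unit(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], x)))


def train_som(profiles, rows=3, cols=3, iterations=300, seed=0):
    """Train a rows x cols self-organizing map on a list of equal-length profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for step in range(iterations):
        frac = step / iterations
        rate = 0.5 * (1.0 - frac)                                # learning rate decays
        radius = max(rows, cols) / 2.0 * (1.0 - frac) + 0.5      # neighbourhood shrinks
        x = profiles[rng.randrange(len(profiles))]
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
            influence = math.exp(-d2 / (2.0 * radius * radius))
            for j in range(dim):
                w[j] += rate * influence * (x[j] - w[j])
    return weights


def map_features(feature_profiles, rows=3, cols=3):
    """Return {feature name: grid node} after training on the normalized profiles."""
    normalized = {name: normalize(p) for name, p in feature_profiles.items()}
    weights = train_som(list(normalized.values()), rows, cols)
    return {name: best_matching_unit(weights, p) for name, p in normalized.items()}
```

A heatmap for one condition would then be drawn by summing, per node, the intensities of the features assigned to that node, and a difference map by subtracting two such heatmaps.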
Fig. 4 Feature organization within a self-organizing map analysis. Feature abundance profiles are illustrated for each feature as a response trend across all experimental conditions shown in the upper right. These trends are organized for similarity as shown on the bottom right. These organized data serve as the basis for visual heatmap representations of the observed metabolomic content of experimental cultures.

Fig. 5 Three example comparisons of prioritized features through principal component and self-organizing map analyses on mixed cultures with Nocardiopsis FU40 (NF), Rhodococcus wratis (RW), Tsukamurella pulmonis (TP), and Bacillus subtilis (BS). Features prioritized within SOM regions of interest recapitulate PCA tripartite analyses when sorted by abundance, or percentage of the region of interest (% ROI). Shading in scores and loadings plots is used to highlight the data corresponding to the different sample subtypes.

4.3 Molecular networking to reveal structural uniqueness and relatedness in large datasets

Microorganisms have been extensively mined for natural products throughout much of the past century in the search for new pharmaceuticals, and the rediscovery of known compounds or known families of compounds is quite common. Identifying and removing these rediscovered natural products, a process known as dereplication, is both critical and challenging.59,198 Typically, accurate masses or determined molecular formulae of extracted compounds are used to search databases of known natural products. However, the large number of isobaric compounds complicates dereplication. UV/Vis absorbance spectra59 and chromatographic retention times199 can be used to further match extracted features to database compounds, and as technologies and databases improve, it is likely that ion mobility will play a role in natural product dereplication as well.85,86,200 Fragmentation spectra acquired through tandem MS are another useful property for dereplication. Metabolite fragmentation patterns observed through MS/MS analysis can be matched to those in databases like PubChem, METLIN201 and MassBank202 to putatively identify MS features. Kernel based machine learning algorithms have recently been applied to dereplicate metabolites using multiple levels of tandem MS,91,203 and while this works well for primary metabolites, public databases for microbial secondary metabolites with fragmentation spectra encompass only a small fraction of known natural products. For example, GNPS: Global Natural Products Social Molecular Networking, the largest natural product public database with MS/MS spectra, contains more than 140 000 natural products,198 whereas there are an estimated 600 000 published natural compounds.204
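The accurate mass search, and the way isobaric compounds complicate it, can be illustrated with a short sketch. The database entries below are placeholders with invented masses, the function names are hypothetical, and the example assumes a singly protonated ion ([M+H]+) and a 5 ppm tolerance.

```python
PROTON_MASS = 1.007276  # Da, used here to convert an [M+H]+ m/z to a neutral mass


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def dereplicate(observed_mz, database, tolerance_ppm=5.0):
    """Return names of database entries whose neutral mass matches within tolerance.

    `database` is a list of (name, monoisotopic neutral mass) pairs.
    """
    neutral = observed_mz - PROTON_MASS
    return [name for name, mass in database
            if abs(ppm_error(neutral, mass)) <= tolerance_ppm]


# Placeholder entries with made-up masses, for illustration only.
EXAMPLE_DATABASE = [
    ("known_A", 500.2500),
    ("known_B", 500.2510),
    ("known_C", 612.3000),
]
```

In this toy case an observed m/z of 501.257776 corresponds to a neutral mass that lies about 1 ppm from both `known_A` and `known_B`, so mass alone cannot distinguish them. This is the situation in which orthogonal properties such as retention time, UV/Vis absorbance, ion mobility, or fragmentation spectra become necessary.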

Computational methods to generate theoretical fragmentation spectra have been employed to compensate for the lack of experimental data on natural products.205 These in silico MS/MS spectral databases can further facilitate natural product dereplication when coupled with molecular networking,206 and as both experimental and in silico database coverage improves, comparisons of fragmentation spectra may become the most useful method of natural product dereplication. In addition to matching fragmentation spectra with database compounds, fragmentation data can be used to cluster related classes of molecules by fragment similarity. Molecular networking analyses cluster families of molecules through vector correlations between fragment ions.207 Yang et al. demonstrated the utility of this approach for natural product discovery by dereplicating 58 natural products from marine and terrestrial microorganisms.208 Molecular networking in this study also identified a number of novel analogs to known compounds, which are more difficult to obtain through other dereplication methods. In Fig. 6 we have applied molecular networking to our Nocardiopsis example dataset. Using the network visualizer on the GNPS website, fragmentation spectra for each object in the network can be easily viewed and compared to matched reference spectra from the GNPS library. The features in the network can also be coloured by the user-defined group or condition to which they are correlated. In Fig. 6 we have highlighted features unique to mixed culture conditions in red. As shown, molecular networking identifies unique features in a manner that is unbiased by compound abundance. Several features of this dataset share no significant fragment similarity with the network and are isolated as “self-loops”.
In fact, ciromicin A is among these uniquely fragmenting features, and this in itself may be another useful means to prioritize leads, as outlying features may be more structurally unique. Molecular networking analysis can be enhanced by combining additional metabolomics techniques. Klitgaard et al. used a combination of molecular networking and stable isotope labelling to identify novel analogs of nidulanin A and fungisporin in the well-studied fungus Aspergillus nidulans.70 Fragment based clustering in this manner can also be used to identify modified natural products stemming from interactions between organisms. Moree et al. used molecular networking with imaging mass spectrometry to investigate the interkingdom interactions between Pseudomonas and Aspergillus and observed a variety of biotransformed metabolites arising from this microbial competition.209 Similarly, Briand et al. applied molecular networking to identify new compounds and analogs arising from intraspecific interactions between algae.210 The application of molecular networking for the Nocardiopsis mixed culture data shown in Fig. 6 links a number of features found in the Nocardiopsis monoculture with similarly fragmenting features only detectable in mixed culture. These may represent compounds made by Nocardiopsis that are stimulated or modified in some way by the competitor Tsukamurella.
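The vector correlation underlying molecular networking can be sketched as a cosine similarity between binned fragment spectra, followed by linking spectra whose score exceeds a cutoff. This is a simplified illustration and not the GNPS implementation, which uses a modified cosine that also accounts for precursor mass differences; the bin width, the 0.7 cutoff, and the function names are assumptions.

```python
import math
from collections import defaultdict
from itertools import combinations


def bin_spectrum(peaks, bin_width=0.01):
    """Convert (m/z, intensity) pairs into a sparse vector keyed by m/z bin."""
    vector = defaultdict(float)
    for mz, intensity in peaks:
        vector[round(mz / bin_width)] += intensity
    return vector


def cosine_score(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity between two fragment spectra."""
    a, b = bin_spectrum(peaks_a, bin_width), bin_spectrum(peaks_b, bin_width)
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_network(spectra, threshold=0.7):
    """Return (edges, singletons) for a dict of {name: [(m/z, intensity), ...]}."""
    edges = []
    connected = set()
    for name_a, name_b in combinations(sorted(spectra), 2):
        if cosine_score(spectra[name_a], spectra[name_b]) >= threshold:
            edges.append((name_a, name_b))
            connected.update((name_a, name_b))
    singletons = [name for name in sorted(spectra) if name not in connected]
    return edges, singletons
```

Features returned in `singletons` correspond to the unconnected "self-loop" nodes discussed above, that is, spectra with no sufficiently similar partner in the dataset.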


Fig. 6 Applications of molecular networking to explore data. Comparisons of acquired fragmentation spectra to established databases facilitate putative feature identification. Connectivity between features, shown with blue lines, indicates structural similarity. Reference compounds seeded into the network can identify structural analogs. Feature distributions between experimental conditions are indicated by node colouring: red for mixed culture specific features, and grey for features detected within the monoculture.

Molecular networking can also prioritize features by linking observed natural products to their cognate biosynthetic gene clusters and gene cluster families99,211 when used in conjunction with genomic sequence analysis. This can be an advantageous means of prioritizing metabolite leads as demonstrated by the work from Kleigrewe et al. where molecular networking was combined with genomic sequence analysis to identify a novel group of acyl amides, termed columbamides, from marine cyanobacteria.212 The Crawford lab has recently employed ‘pathway-targeted’ molecular network analyses to identify metabolites from the colibactin gene cluster, which had been linked to increased virulence in E. coli.213–215 As previously discussed, heterologous hosts are often used for the production of microbial secondary metabolites,216–218 and molecular networking is a useful tool for comparative metabolomics to visualize the output of these heterologous hosts. Schorn et al. used molecular networking to identify novel eponemycin congeners produced through heterologous expression in Streptomyces albus J1046.219 Molecular networking has even been applied to identify virulence factors in pathogenic organisms,220,221 and this method will become more beneficial for natural product discovery as databases and technologies improve.

5 Investigations of secondary metabolite bioactivity

Natural products are intrinsically biologically active; however, the clinical relevance of this activity may not always be discernible. Typically, natural product structure and mode of action are determined fairly late in the natural product discovery pipeline, which contributes to high rediscovery rates. Therefore, prioritizing natural product leads by deep profiling of pharmacologically relevant biological activities would expedite natural product based drug discovery. Natural product extracts are commonly divided into multiple fractions which are then screened to identify the components underlying the desired biological activity.222–225 However, low abundance compounds can often be overlooked in complex extracts, and recently MVSA have been utilized to help link observed fraction bioactivity to detectable features from metabolomic analyses.39,226 Even after correlating metabolites with biological activity, determining the mode of action for active compounds can be difficult and expensive.227–230 One approach has been developed that uses the antibiotic spectrum of activities across different organisms, mode of action profiles (BioMAP), to group similar antibiotics.231 This method was effectively able to cluster antibiotics of the same compound class and led to the identification of a novel naphthoquinone antibiotic, arromycin.231 Gene expression profiling with either the entire transcriptome232,233 or a subset of reporter genes234,235 has also been used to predict modes of action for natural products. However, because these transcriptomic screens are still relatively costly, there is great interest in applying metabolomics analyses to predict natural product modes of action using either natural product extracts236 or purified compounds.237–242
Vincent et al. have recently shown that untargeted metabolomics can effectively identify compound modes of action when specific metabolic pathways are the primary drug target.243 Metabolomic consequences of drug combinations may additionally be able to identify synergism or antagonism between coadministered drug therapies.241 In a study with M. smegmatis, Halouska et al. observed that antibiotics which share similar biological targets engender similar metabolomic changes and are grouped together through MVSA.239 The group additionally applied their metabolomic methods to investigate antibiotics with unknown biological targets and found them to group with the cell wall targeting antibiotics ampicillin, D-cycloserine, and vancomycin.239 This methodology could prove very useful to prioritize compounds for isolation. Antimicrobial extracts which separate themselves metabolomically through MVSA or other analyses may exert their activity through a novel biological target or mechanism. In this way pharmaceutically relevant natural products could be prioritized for isolation.
These metabolomic analyses have even been applied to investigate the underlying mechanisms by which known antibiotics kill pathogens.244 Another approach, cytological profiling, uses automated image and microscopy analyses to identify phenotypic changes induced by bioactive compounds,245,246 and this method has been used to classify biologically active compounds by their respective modes of action247,248 even within more complex marine derived bacterial extracts.249 A combined approach integrating these phenotypic screens with untargeted metabolomics has recently been developed to predict the modes of action for complex libraries of natural products and prioritize unique bioactive components.250 Applying this method, Compound Activity Mapping, to data from 234 natural product extracts led to the discovery of the quinocinnolinomycins, a new family of natural products implicated in inducing endoplasmic reticulum stress based on further cytological profile clustering.250 Ultimately, these multi-omic combinatorial methods may become the preferred means of predicting molecular modes of action. Integrating the phenotypic data from cytological profiling and the transcriptomic functional signature ontologies235 with metabolomics data using one or combinations of the powerful analytical platforms discussed in this review (self-organizing maps, molecular networking, MVSA, etc.) could provide new insights into the modes of action of bioactive compounds and greatly facilitate novel drug discovery.

6 Conclusions

Metabolomic analyses are powerful tools for natural product discovery. However, while metabolomics can provide a wealth of information regarding the activity and responses of microorganisms, with current technologies it is practically impossible to analyse the entire metabolome of an organism comprehensively due to variations in ionization efficiency and limitations in detection across a wide dynamic range of concentrations. Instead, only detectable metabolites, which make up a fraction of the total metabolites present, are used to draw conclusions from current studies. While the full transcriptomic and proteomic potential of an organism can be determined through modern genome sequencing, there is no readily discernible limit to the number of metabolites present within organisms, so it is difficult to predict the number of metabolites omitted by current analyses. Due to these limitations, extra care must be taken when drawing conclusions from metabolomics datasets. Nonetheless, metabolomics analyses benefit microbial natural product discovery pipelines in a variety of ways as described in this review. They can be used to prioritize organisms, identify activated compounds from stimuli exposure, prioritize features through bioactivity spectra or molecular class, and even dereplicate prioritized secondary metabolites. The metabolomics methods described herein may also facilitate investigations into the fundamental purpose behind secondary metabolite production within microbial communities. It is largely unclear how the production of secondary metabolites is regulated in situ as well as which ecological stimuli trigger secondary metabolic production. Such studies would benefit natural product discovery endeavours by facilitating predictions of stimuli to induce secondary metabolite production within the endogenous producer and could additionally provide insight into human health and wellness.
Microbial secondary metabolites have a significant impact on human health by means of both isolated pharmaceuticals and compounds produced in situ from within human microbiomes,13,251–255 and metabolomics may be able to offer insights into how these organisms modulate their secondary metabolism in response to diet, medicine, and endogenous host factors.

Comparative metabolomics methods are helping to unleash the repressed and/or hidden wealth of microbial secondary metabolism predicted by whole genome sequencing. The combination of complementary methods (e.g. SOM and molecular networking) has the potential to provide new tools to accelerate discovery. Ultimately, the purpose of these efforts is to identify biological roles for secondary metabolites, be they biochemical, chemical ecological, or translational in human medicine. We believe that the next era in secondary metabolite discovery and application will be facilitated by methods that combine high content biological activity data measurements for metabolites within metabolomes with corresponding multidimensional metabolomic data to illuminate effectors of natural small molecule interactions, and their roles in biological systems.

7 Acknowledgements

This work was supported by the National Institutes of Health (no. R01GM092218 awarded to B. O. B. and J. A. M., and T32 no. GM0650086 awarded to B. C. C.), the Vanderbilt Institute of Chemical Biology, the Vanderbilt Institute for Integrative Biosystems Research and Education, and the Vanderbilt University College of Arts and Sciences.

8 References

  1. S. D. Bentley, K. F. Chater, A. M. M. Cerdeño-Tárraga, G. L. Challis, N. R. Thomson, K. D. James, D. E. Harris, M. A. Quail, H. Kieser, D. Harper, A. Bateman, S. Brown, G. Chandra, C. W. Chen, M. Collins, A. Cronin, A. Fraser, A. Goble, J. Hidalgo, T. Hornsby, S. Howarth, C. H. H. Huang, T. Kieser, L. Larke, L. Murphy, K. Oliver, S. O'Neil, E. Rabbinowitsch, M. A. A. Rajandream, K. Rutherford, S. Rutter, K. Seeger, D. Saunders, S. Sharp, R. Squares, S. Squares, K. Taylor, T. Warren, A. Wietzorrek, J. Woodward, B. G. Barrell, J. Parkhill and D. A. Hopwood, Nature, 2002, 417, 141–147.
  2. H. Ikeda, J. Ishikawa, A. Hanamoto, M. Shinose, H. Kikuchi, T. Shiba, Y. Sakaki, M. Hattori and S. Omura, Nat. Biotechnol., 2003, 21, 526–531.
  3. J. S. Zarins-Tutt, T. T. Barberi, H. Gao, A. Mearns-Spragg, L. Zhang, D. J. Newman and R. J. M. Goss, Nat. Prod. Rep., 2016, 33, 54–72.
  4. K. Scherlach and C. Hertweck, Org. Biomol. Chem., 2009, 7, 1753–1760.
  5. C. T. Walsh and M. A. Fischbach, J. Am. Chem. Soc., 2010, 132, 2469–2493.
  6. G. Shabuer, K. Ishida, S. J. Pidot, M. Roth, H.-M. M. Dahse and C. Hertweck, Science, 2015, 350, 670–674.
  7. B. L. Findlay, ACS Chem. Biol., 2016, 11, 1502–1510.
  8. M. I. Vizcaino, X. Guo and J. M. Crawford, J. Ind. Microbiol. Biotechnol., 2014, 41, 285–299.
  9. D. J. Newman and G. M. Cragg, J. Nat. Prod., 2016, 79, 629–661.
  10. P. Zakrzewski, M. A. Fischbach and T. Weber, Nucleic Acids Res., 2011, 39, 339–346.
  11. M. H. T. Li, P. M. U. Ung and J. Zajkowski, BMC Bioinf., 2009, 10, 185.
  12. K.-S. S. Ju, J. Gao, J. R. Doroghazi, K.-K. A. Wang, C. J. Thibodeaux, S. Li, E. Metzger, J. Fudala, J. Su, J. K. Zhang, J. Lee, J. P. Cioni, B. S. Evans, R. Hirota, D. P. Labeda, W. A. van der Donk and W. W. Metcalf, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 12175–12180.
  13. M. S. Donia, P. Cimermancic, C. J. Schulze, L. C. Wieland Brown, J. Martin, M. Mitreva, J. Clardy, R. G. Linington and M. A. Fischbach, Cell, 2014, 158, 1402–1414.
  14. T. Weber, K. Blin, S. Duddela, D. Krug, H. U. Kim, R. Bruccoleri, S. Y. Lee, M. A. Fischbach, R. Müller, W. Wohlleben, R. Breitling, E. Takano and M. H. Medema, Nucleic Acids Res., 2015, 43, 237–243.
  15. P. R. Jensen, K. L. Chavarria, W. Fenical and B. S. Moore, J. Ind. Microbiol. Biotechnol., 2014, 41, 203–209.
  16. C. M. Farnet and E. Zazopoulos, in Natural Products: Drug Discovery and Therapeutic Medicine, ed. L. Zhang and A. L. Demain, Humana Press, Totowa, NJ, 2005, pp. 95–106.
  17. N. Ziemert, M. Alanjary and T. Weber, Nat. Prod. Rep., 2016, 33, 988–1005.
  18. M. H. Medema and M. A. Fischbach, Nat. Chem. Biol., 2015, 11, 639–648.
  19. G.-E. Juan Pablo and J. B. Mervyn, J. Ind. Microbiol. Biotechnol., 2013, 41, 425–431 Search PubMed.
  20. M. Komatsu, T. Uchiyama, S. Mura, D. E. Cane and H. Ikeda, Proc. Natl. Acad. Sci. U. S. A., 2010, 107, 2646–2651 Search PubMed.
  21. E. Kim, B. S. Moore and Y. J. Yoon, Nat. Chem. Biol., 2015, 11, 649–659 Search PubMed.
  22. X. Tang, J. Li, N. Millán-Aguiñaga and J. J. Zhang, ACS Chem. Biol., 2015, 10, 2841–2849 Search PubMed.
  23. A. C. Ross, L. E. S. Gulland and P. C. Dorrestein, ACS Synth. Biol., 2014, 4, 414–420 Search PubMed.
  24. M. S. Donia, D. E. Ruffner, S. Cao and E. W. Schmidt, ChemBioChem, 2011, 12, 1230–1236 Search PubMed.
  25. S. E. Ongley, X. Bian, B. A. Neilan and R. Müller, Nat. Prod. Rep., 2013, 30, 1121–1138 Search PubMed.
  26. Y. Luo, B. Enghiad and H. Zhao, Nat. Prod. Rep., 2016, 33, 174–182 Search PubMed.
  27. M. J. Bibb, Curr. Opin. Microbiol., 2005, 8, 208–215 Search PubMed.
  28. M. R. Seyedsayamdost, J. R. Chandler and J. A. V. Blodgett, Org. Lett., 2010, 12, 716–719 Search PubMed.
  29. H. B. Bode, B. Bethe, R. Höfs and A. Zeeck, ChemBioChem, 2002, 3, 619–627 Search PubMed.
  30. P. J. Rutledge and G. L. Challis, Nat. Rev. Microbiol., 2015, 13, 509–523 Search PubMed.
  31. K. Ochi and T. Hosaka, Appl. Microbiol. Biotechnol., 2013, 97, 87–98 Search PubMed.
  32. F. J. Reen, S. Romano, A. D. W. Dobson and F. O'Gara, Mar. Drugs, 2015, 13, 4754–4783 Search PubMed.
  33. Y. Imai, S. Sato, Y. Tanaka, K. Ochi and T. Hosaka, Appl. Environ. Microbiol., 2015, 81, 3869–3879 Search PubMed.
  34. W. Wang, J. Ji, X. Li, J. Wang, S. Li, G. Pan, K. Fan and K. Yang, Proc. Natl. Acad. Sci. U. S. A., 2014, 111, 5688–5693 Search PubMed.
  35. S. Tokuyama, A. Kaji, H. Ikeda and K. Ochi, Appl. Environ. Microbiol., 2009, 75, 4919–4922 Search PubMed.
  36. Y. Tsurumi, S. Kodani, M. Yoshida, A. Fujie and K. Ochi, Nat. Biotechnol., 2009, 27, 462–464 Search PubMed.
  37. G. Wang, T. Hosaka and K. Ochi, Appl. Environ. Microbiol., 2008, 74, 2834–2840 CrossRef CAS PubMed.
  38. K. Ochi, S. Okamoto, Y. Tozawa, T. Inaoka, T. Hosaka, J. Xu and K. Kurosawa, Adv. Appl. Microbiol., 2003, 56, 155–184 Search PubMed.
  39. C. Wu, C. Du, J. Gubbens and Y. H. Choi, J. Nat. Prod., 2015, 78, 2355–2363 Search PubMed.
  40. D. K. Derewacz, C. R. Goodwin, R. C. McNees, J. A. McLean and B. O. Bachmann, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 2336–2341 Search PubMed.
  41. Y. Tanaka, T. Hosaka and K. Ochi, J. Antibiot., 2010, 63, 477–481 CrossRef CAS PubMed.
  42. K. Kawai, G. Wang, S. Okamoto and K. Ochi, FEMS Microbiol. Lett., 2007, 274, 311–315 CrossRef CAS PubMed.
  43. G. Haferburg and E. Kothe, J. Basic Microbiol., 2007, 47, 453–467 Search PubMed.
  44. U. R. Abdelmohsen, T. Grkovic, S. Balasubramanian, M. S. Kamel, R. J. Quinn and U. Hentschel, Biotechnol. Adv., 2015, 33, 798–811 Search PubMed.
  45. D. K. Derewacz, B. C. Covington, J. A. McLean and B. O. Bachmann, ACS Chem. Biol., 2015, 10, 1998–2006 Search PubMed.
  46. S. Angell, B. J. Bench, H. Williams and C. M. H. Watanabe, Chem. Biol., 2006, 13, 1349–1359 Search PubMed.
  47. D.-C. Oh, C. A. Kauffman, P. R. Jensen and W. Fenical, J. Nat. Prod., 2007, 70, 515–520 Search PubMed.
  48. M. Cueto, P. R. Jensen, C. Kauffman, W. Fenical, E. Lobkovsky and J. Clardy, J. Nat. Prod., 2001, 64, 1444–1446 Search PubMed.
  49. A. A. Brakhage, Nat. Rev. Microbiol., 2012, 11, 21–32 Search PubMed.
  50. R. R. Forseth and F. C. Schroeder, Curr. Opin. Chem. Biol., 2011, 15, 38–47 Search PubMed.
  51. E. Esquenazi, A. C. Jones, T. Byrum, P. C. Dorrestein and W. H. Gerwick, Proc. Natl. Acad. Sci. U. S. A., 2011, 108, 5226–5231 Search PubMed.
  52. A. K. Jarmusch and G. R. Cooks, Nat. Prod. Rep., 2014, 31, 730–738 Search PubMed.
  53. D. Krug and R. Müller, Nat. Prod. Rep., 2014, 31, 768–783 Search PubMed.
  54. M. S. Bereman, M. M. Lyndon and R. B. Dixon, Rapid Commun. Mass Spectrom., 2008, 22, 1563–1566 Search PubMed.
  55. J. G. Stroh, C. J. Petucci, S. J. Brecker and N. Huang, J. Am. Soc. Mass Spectrom., 2007, 18, 1612–1616 CrossRef CAS PubMed.
  56. L. Sleno, D. A. Volmer and A. G. Marshall, J. Am. Soc. Mass Spectrom., 2005, 16, 183–198 Search PubMed.
  57. A. W. T. Bristow and K. S. Webb, J. Am. Soc. Mass Spectrom., 2003, 14, 1086–1098 Search PubMed.
  58. T. Kind and O. Fiehn, Bioanal. Rev., 2010, 2, 23–60 CrossRef PubMed.
  59. K. F. Nielsen, M. Månsson, C. Rank, J. C. Frisvad and T. O. Larsen, J. Nat. Prod., 2011, 74, 2338–2348 Search PubMed.
  60. T. Kind and O. Fiehn, BMC Bioinf., 2007, 8, 105 Search PubMed.
  61. J. C. May and J. A. McLean, Annu. Rev. Anal. Chem., 2016, 9, 387–409 Search PubMed.
  62. S. Kildgaard, M. Mansson, I. Dosen and A. Klitgaard, Mar. Drugs, 2014, 12, 3681–3705 Search PubMed.
  63. T. El-Elimat, M. Figueroa, B. M. Ehrmann, N. B. Cech, C. J. Pearce and N. H. Oberlies, J. Nat. Prod., 2013, 76, 1709–1716 Search PubMed.
  64. S. Neumann and S. Böcker, Anal. Bioanal. Chem., 2010, 398, 2779–2788 Search PubMed.
  65. M. Z. Hernandes, S. M. Cavalcanti, D. R. Moreira, W. F. de Azevedo Junior and A. C. Leite, Curr. Drug Targets, 2010, 11, 303–314 Search PubMed.
  66. J. K. Kim, K. Harada, T. Bamba and E. Fukusaki, Biosci., Biotechnol., Biochem., 2005, 69, 1331–1340 Search PubMed.
  67. S. Böcker, M. C. Letzel, Z. Lipták and A. Pervukhin, Bioinformatics, 2009, 25, 218–224 Search PubMed.
  68. P. S. Haglund, K. Löfstrand, K. Siek and L. Asplund, Mass Spectrom., 2013, 2, S0018 Search PubMed.
  69. S.-E. Ong, B. Blagoev, I. Kratchmarova, D. B. Kristensen, H. Steen, A. Pandey and M. Mann, Mol. Cell. Proteomics, 2002, 1, 376–386 Search PubMed.
  70. A. Klitgaard, J. B. Nielsen, R. J. N. Frandsen, M. R. Andersen and K. F. Nielsen, Anal. Chem., 2015, 87, 6520–6526 Search PubMed.
  71. F. Bucar, A. Wube and M. Schmid, Nat. Prod. Rep., 2013, 30, 525–545 Search PubMed.
  72. S. S. Ebada, R. A. Edrada, W. Lin and P. Proksch, Nat. Protoc., 2008, 3, 1820–1831 Search PubMed.
  73. O. Sticher, Nat. Prod. Rep., 2008, 25, 517–554 Search PubMed.
  74. J. L. Wolfender, G. Marti, A. Thomas and S. Bertrand, J. Chromatogr. A, 2015, 1382, 136–164 Search PubMed.
  75. M. Månsson, R. K. Phipps, L. Gram, M. H. Munro, T. O. Larsen and K. F. Nielsen, J. Nat. Prod., 2010, 73, 1126–1132 Search PubMed.
  76. K. F. Nielsen and T. O. Larsen, Front. Microbiol., 2015, 6, 71 Search PubMed.
  77. B. M. Hounoum, H. Blasco, P. Emond and S. Mavel, TrAC, Trends Anal. Chem., 2016, 75, 118–128 Search PubMed.
  78. S. Beisken, M. Eiden and R. M. Salek, Expert Rev. Mol. Diagn., 2015, 15, 97–109 Search PubMed.
  79. J. C. May and J. A. McLean, Anal. Chem., 2015, 87, 1422–1436 Search PubMed.
  80. F. Lanucara, S. W. Holman, C. J. Gray and C. E. Eyers, Nat. Chem., 2014, 6, 281–294 Search PubMed.
  81. R. Cumeras, E. Figueras, C. E. Davis and J. I. Baumbach, Analyst, 2015, 140, 1476–1490 Search PubMed.
  82. J. A. McLean, B. T. Ruotolo, K. J. Gillig and D. H. Russell, Int. J. Mass Spectrom., 2005, 240, 301–315 Search PubMed.
  83. C. S. Creaser, J. R. Griffiths, C. J. Bramwell and S. Noreen, Analyst, 2004, 129, 984–994 Search PubMed.
  84. M. Kulchania, C. A. S. Barnes and D. E. Clemmer, Int. J. Mass Spectrom., 2001, 212, 97–109 Search PubMed.
  85. E. Esquenazi, M. Daly and T. Bahrainwala, Bioorg. Med. Chem., 2011, 19, 6639–6644 Search PubMed.
  86. C. R. Goodwin, L. S. Fenn, D. K. Derewacz, B. O. Bachmann and J. A. McLean, J. Nat. Prod., 2012, 75, 48–53 Search PubMed.
  87. K. Jeanne Dit Fouque, C. Afonso, S. Zirah, J. D. Hegemann, M. Zimmermann, M. A. Marahiel, S. Rebuffat and H. Lavanant, Anal. Chem., 2015, 87, 1166–1172 Search PubMed.
  88. A. H. Payne and G. L. Glish, Methods Enzymol., 2005, 402, 109–148 Search PubMed.
  89. Z.-J. J. Zhu, A. W. Schultz, J. Wang, C. H. Johnson, S. M. Yannone, G. J. Patti and G. Siuzdak, Nat. Protoc., 2013, 8, 451–460 Search PubMed.
  90. H. P. Benton, D. M. Wong, S. A. Trauger and G. Siuzdak, Anal. Chem., 2008, 80, 6382–6389 Search PubMed.
  91. K. Dührkop, H. Shen, M. Meusel, J. Rousu and S. Böcker, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 12580–12585 Search PubMed.
  92. F. Hufsky and S. Böcker, Mass Spectrom. Rev., 2016, 9999, 1–10 Search PubMed.
  93. A. Vaniya and O. Fiehn, TrAC, Trends Anal. Chem., 2015, 69, 52–61 Search PubMed.
  94. A. F. Tawfike, C. Viegelmann and R. Edrada-Ebel, Methods Mol. Biol., 2012, 1055, 227–244 Search PubMed.
  95. T. M. Kertesz, L. H. Hall, D. W. Hill and D. F. Grant, J. Am. Soc. Mass Spectrom., 2009, 20, 1759–1767 Search PubMed.
  96. V. H. Wysocki, K. E. Joyce and C. M. Jones, J. Am. Soc. Mass Spectrom., 2008, 19, 190–208 Search PubMed.
  97. R. D. Kersten, Y.-L. L. Yang, Y. Xu, P. Cimermancic, S.-J. J. Nam, W. Fenical, M. A. Fischbach, B. S. Moore and P. C. Dorrestein, Nat. Chem. Biol., 2011, 7, 794–802 Search PubMed.
  98. R. D. Kersten, N. Ziemert, D. J. Gonzalez, B. M. Duggan, V. Nizet, P. C. Dorrestein and B. S. Moore, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 4407–4416 Search PubMed.
  99. D. D. Nguyen, C.-H. H. Wu, W. J. Moree, A. Lamsa, M. H. Medema, X. Zhao, R. G. Gavilan, M. Aparicio, L. Atencio, C. Jackson, J. Ballesteros, J. Sanchez, J. D. Watrous, V. V. Phelan, C. van de Wiel, R. D. Kersten, S. Mehnaz, R. De Mot, E. A. Shank, P. Charusanti, H. Nagarajan, B. M. Duggan, B. S. Moore, N. Bandeira, B. Ø. Palsson, K. Pogliano, M. Gutiérrez and P. C. Dorrestein, Proc. Natl. Acad. Sci. U. S. A., 2013, 110, 2611–2620 Search PubMed.
  100. S. Reeta Rani, P. Anil Kumar, R. S. Carlos and P. Ashok, Biochem. Eng. J., 2009, 44, 13–18 Search PubMed.
  101. S. Ing-Lung, K. Chia-Yu, H. Feng-Chia, K. Suey-Sheng and H. Chienyan, J. Chin. Inst. Chem. Eng., 2008, 39, 635–643 Search PubMed.
  102. T. Robinson, D. Singh and P. Nigam, Appl. Microbiol. Biotechnol., 2001, 55, 284–289 Search PubMed.
  103. B. Kunze, B. Bohlendorf and H. Reichenbach, J. Antibiot., 2008, 61, 18–26 Search PubMed.
  104. M. S. Abdelfattah, M. K. Kharel and J. A. Hitron, J. Nat. Prod., 2008, 71, 1569–1573 CrossRef CAS PubMed.
  105. F. Surup, O. Wagner and J. von Frieling, J. Org. Chem., 2007, 72, 5085–5090 Search PubMed.
  106. A. Rančić, M. Soković, A. Karioti and J. Vukojević, Environ. Toxicol. Pharmacol., 2006, 22, 80–84 Search PubMed.
  107. T. M. Yoon, J. W. Kim, J. G. Kim, W. G. Kim and J. W. Suh, J. Antibiot., 2006, 59, 640–645 Search PubMed.
  108. K. Umezawa, Y. Ikeda and O. Kawase, J. Chem. Soc., 2001, 1, 1550–1553 Search PubMed.
  109. B. Murphy, K. Anderson, C. Borissow and P. Caffrey, Org. Biomol. Chem., 2010, 8, 3758–3770 Search PubMed.
  110. B. Zhou, J. Xiao, L. Tuli and H. W. Ressom, Mol. BioSyst., 2012, 8, 470–481 Search PubMed.
  111. S. Forcisi, F. Moritz, B. Kanawati and D. Tziotis, J. Chromatogr. A, 2013, 1292, 51–65 Search PubMed.
  112. N. L. Kuehnbaum and P. Britz-McKibbin, Chem. Rev., 2013, 113, 2437–2468 Search PubMed.
  113. P. D. Straight and R. Kolter, Annu. Rev. Microbiol., 2009, 63, 99–118 Search PubMed.
  114. A. E. Little, C. J. Robinson, S. B. Peterson, K. F. Raffa and J. Handelsman, Annu. Rev. Microbiol., 2008, 62, 375–401 Search PubMed.
  115. R. P. Ryan and J. M. Dow, Microbiology, 2008, 154, 1845–1858 Search PubMed.
  116. J. D. Watrous and P. C. Dorrestein, Nat. Rev. Microbiol., 2011, 9, 683–694 CrossRef CAS PubMed.
  117. J. Y. Yang, V. V. Phelan, R. Simkovsky, J. D. Watrous, R. M. Trial, T. C. Fleming, R. Wenter, B. S. Moore, S. S. Golden, K. Pogliano and P. C. Dorrestein, J. Bacteriol., 2012, 194, 6023–6028 Search PubMed.
  118. A. L. Lane, L. Nyadong and A. S. Galhena, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 7314–7319 CrossRef CAS PubMed.
  119. T. L. Simmons, R. C. Coates, B. R. Clark, N. Engene, D. Gonzalez, E. Esquenazi, P. C. Dorrestein and W. H. Gerwick, Proc. Natl. Acad. Sci. U. S. A., 2008, 105, 4587–4594 Search PubMed.
  120. E. Esquenazi, C. Coates, L. Simmons, D. Gonzalez, W. H. Gerwick and P. C. Dorrestein, Mol. Biosyst., 2008, 4, 562–570 Search PubMed.
  121. C. Eriksson, N. Masaki, I. Yao and T. Hayasaka, Mass Spectrom., 2013, 2, S0022 Search PubMed.
  122. M. M. Gessel, J. L. Norris and R. M. Caprioli, J. Proteomics, 2014, 107, 71–82 Search PubMed.
  123. T. Masatoshi, K. N. Joshawna, E. Niclas, E. Eduardo, B. Tara, C. D. Pieter and H. G. William, J. Nat. Prod., 2010, 73, 393–398 Search PubMed.
  124. Y. Yu-Liang, X. Yuquan, S. Paul and C. D. Pieter, Nat. Chem. Biol., 2009, 5, 885–887 Search PubMed.
  125. R. Bleich, J. D. Watrous, P. C. Dorrestein, A. A. Bowers and E. A. Shank, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 3086–3091 Search PubMed.
  126. K. Johannes, K. Martin, S. Bernd, S. Maria-Gabriele, H. Christian, M. Ravi Kumar, S. Erhard and S. Aleš, Nat. Chem. Biol., 2010, 6, 261–263 Search PubMed.
  127. D. J. Gonzalez, N. M. Haste, A. Hollands, T. C. Fleming, M. Hamby, K. Pogliano, V. Nizet and P. C. Dorrestein, Microbiology, 2011, 157, 2485–2492 Search PubMed.
  128. C.-J. J. Shih, P.-Y. Y. Chen, C.-C. C. Liaw, Y.-M. M. Lai and Y.-L. L. Yang, Nat. Prod. Rep., 2014, 31, 739–755 Search PubMed.
  129. Y. C. Harn, M. J. Powers, E. A. Shank and V. Jojic, Bioinformatics, 2015, 31, 42–50 Search PubMed.
  130. L. Wei-Ting, D. K. Roland, Y. Yu-Liang, S. M. Bradley and C. D. Pieter, J. Am. Chem. Soc., 2011, 133, 18010–18013 Search PubMed.
  131. M. F. Traxler, J. D. Watrous, T. Alexandrov and P. C. Dorrestein, mBio, 2013, 4, e00459-13 Search PubMed.
  132. S. Vaidyanathan, J. Fletcher, R. Goodacre, N. Lockyer, J. Micklefield and J. Vickerman, Anal. Chem., 2008, 80, 1942–1951 Search PubMed.
  133. S. Vaidyanathan, J. Fletcher, N. Lockyer and J. Vickerman, Appl. Surf. Sci., 2008, 255, 922–925 Search PubMed.
  134. E. Esquenazi, P. C. Dorrestein and W. H. Gerwick, Proc. Natl. Acad. Sci. U. S. A., 2009, 106, 7269–7270 Search PubMed.
  135. C. Bhardwaj and L. Hanley, Nat. Prod. Rep., 2014, 31, 756–767 Search PubMed.
  136. A. Bouslimani, L. M. Sanchez, N. Garg and P. C. Dorrestein, Nat. Prod. Rep., 2014, 31, 718–729 Search PubMed.
  137. G. J. Patti, O. Yanes and G. Siuzdak, Nat. Rev. Mol. Cell Biol., 2012, 13, 263–269 Search PubMed.
  138. S. Orchard, H. Hermjakob and R. Apweiler, Proteomics, 2003, 3, 1374–1376 Search PubMed.
  139. H. Hermjakob, Proteomics, 2006, 6, 34–38 Search PubMed.
  140. P. G. A. Pedrioli, J. K. Eng, R. Hubley and M. Vogelzang, Nat. Biotechnol., 2004, 22, 1459–1466 Search PubMed.
  141. S. M. Lin, L. Zhu, A. Q. Winter and M. Sasinowski, Expert Rev. Proteomics, 2005, 2, 839–845 Search PubMed.
  142. L. Martens, M. Chambers, M. Sturm and D. Kessner, Mol. Cell. Proteomics, 2011, 10, R110.000133 Search PubMed.
  143. D. Kessner, M. Chambers, R. Burke and D. Agus, Bioinformatics, 2008, 24, 2534–2536 Search PubMed.
  144. M. Hendriks, F. A. Eeuwijk, R. H. Jellema, J. A. Westerhuis, T. H. Reijmers, H. C. J. Hoefsloot and A. K. Smilde, TrAC, Trends Anal. Chem., 2011, 30, 1685–1698 Search PubMed.
  145. N. G. Mahieu, J. L. Genenbacher and G. J. Patti, Curr. Opin. Chem. Biol., 2016, 30, 87–93 Search PubMed.
  146. K. H. Liland, TrAC, Trends Anal. Chem., 2011, 30, 827–841 CrossRef CAS.
  147. W. Willem, J. M. Phalp and W. P. Alan, Anal. Chem., 1996, 68, 3602–3606 Search PubMed.
  148. C. A. Hastings, S. M. Norton and S. Roy, Rapid Commun. Mass Spectrom., 2002, 16, 462–467 Search PubMed.
  149. S. Roy, T. A. Shaler, L. R. Hill, S. Norton and P. Kumar, Anal. Chem., 2003, 75, 4818–4826 Search PubMed.
  150. M. E. Monroe, N. Tolić, N. Jaitly, J. L. Shaw and J. N. Adkins, Bioinformatics, 2007, 23, 2021–2023 Search PubMed.
  151. L. N. Mueller, O. Rinner, A. Schmidt and S. Letarte, Proteomics, 2007, 7, 3470–3480 Search PubMed.
  152. M. Sturm, A. Bertsch, C. Gröpl and A. Hildebrandt, BMC Bioinf., 2008, 9, 163 CrossRef PubMed.
  153. D. Rolf, B. Dan and E. M. Karin, Anal. Chim. Acta, 2002, 454, 167–184 Search PubMed.
  154. C. A. Smith, E. J. Want, G. O'Maille, R. Abagyan and G. Siuzdak, Anal. Chem., 2006, 78, 779–787 Search PubMed.
  155. R. Tautenhahn, C. Böttcher and S. Neumann, BMC Bioinf., 2008, 9, 504 Search PubMed.
  156. P. Du, W. A. Kibbe and S. M. Lin, Bioinformatics, 2006, 22, 2059–2065 Search PubMed.
  157. W. R. French, L. J. Zimmerman, B. Schilling, B. W. Gibson, C. A. Miller, R. R. Townsend, S. D. Sherrod, C. R. Goodwin, J. A. McLean and D. L. Tabb, J. Proteome Res., 2014, 14, 1299–1307 Search PubMed.
  158. K. Aoshima, K. Takahashi, M. Ikawa, T. Kimura, M. Fukuda, S. Tanaka, H. E. Parry, Y. Fujita, A. C. Yoshizawa, S.-i. Utsunomiya, S. Kajihara, K. Tanaka and Y. Oda, BMC Bioinf., 2014, 15, 1–14 Search PubMed.
  159. K. Dettmer, P. A. Aronov and B. D. Hammock, Mass Spectrom. Rev., 2007, 26, 51–78 Search PubMed.
  160. T. Frenzel, A. Miller and K. H. Engel, Eur. Food Res. Technol., 2003, 216, 335–342 Search PubMed.
  161. K. Podwojski, A. Fritsch, D. C. Chamrad and W. Paul, Bioinformatics, 2009, 25, 758–764 Search PubMed.
  162. W. S. Cleveland and E. Grosse, Stat. Comput., 1991, 1, 47–62 Search PubMed.
  163. V. Perera, M. D. T. Zabala, H. Florance, N. Smirnoff and M. Grant, Metabolomics, 2012, 8, 175–185 Search PubMed.
  164. E. Lange, C. Gröpl, O. Schulz-Trieglaff, A. Leinenbach, C. Huber and K. Reinert, Bioinformatics, 2007, 23, 273–281 Search PubMed.
  165. A. Lommen, Anal. Chem., 2009, 81, 3079–3086 Search PubMed.
  166. C. Bork, K. Ng, Y. Liu and A. Yee, Biotechnol. Prog., 2013, 29, 394–402 Search PubMed.
  167. B. Voss, M. Hanselmann, B. Y. Renard, M. S. Lindner, U. Köthe, M. Kirchner and F. A. Hamprecht, Bioinformatics, 2011, 27, 987–993 Search PubMed.
  168. M. Katajamaa and M. Oresic, BMC Bioinf., 2005, 6, 179 Search PubMed.
  169. T. Pluskal, S. Castillo, A. Villar-Briones and M. Oresic, BMC Bioinf., 2010, 11, 395 Search PubMed.
  170. E. Lange, R. Tautenhahn and S. Neumann, BMC Bioinf., 2008, 9, 375 Search PubMed.
  171. G. Libiseller, M. Dvorzak, U. Kleb, E. Gander, T. Eisenberg, F. Madeo, S. Neumann, G. Trausinger, F. Sinner, T. Pieber and C. Magnes, BMC Bioinf., 2015, 16, 118 Search PubMed.
  172. B. Worley and R. Powers, Curr. Metabolomics, 2013, 1, 92–107 Search PubMed.
  173. J. M. Fonville, S. E. Richards and R. H. Barton, J. Chemom., 2010, 24, 636–649 Search PubMed.
  174. H. Abdi, Wiley Interdisciplinary Reviews: Computational Statistics, 2010, 2, 97–106 Search PubMed.
  175. R. Rosipal and N. Krämer, Lect. Notes Comput. Sci. Eng., 2006, 3940, 34–51 Search PubMed.
  176. R. Bro and A. K. Smilde, Anal. Methods, 2014, 6, 2812–2831 Search PubMed.
  177. M. Bylesjö, M. Rantalainen and O. Cloarec, J. Chemom., 2006, 20, 341–351 Search PubMed.
  178. O. Genilloud, I. González, O. Salazar, J. Martín, J. R. R. Tormo and F. Vicente, J. Ind. Microbiol. Biotechnol., 2011, 38, 375–389 Search PubMed.
  179. Y. Hou, D. R. Braun, C. R. Michel, J. L. Klassen, N. Adnani, T. P. Wyche and T. S. Bugni, Anal. Chem., 2012, 84, 4277–4283 Search PubMed.
  180. L. Macintyre, T. Zhang, C. Viegelmann, I. Martinez, C. Cheng, C. Dowdells, U. Abdelmohsen, C. Gernert, U. Hentschel and R. Edrada-Ebel, Mar. Drugs, 2014, 12, 3416–3448 Search PubMed.
  181. D. Forner, F. Berrué, H. Correa, K. Duncan and R. G. Kerr, Anal. Chim. Acta, 2013, 805, 70–79 Search PubMed.
  182. S. Norazwana, T. Pei Jean, S. Khozirah, A. Faridah and L. Hong Boon, Anal. Chem., 2014, 86, 1324–1331 Search PubMed.
  183. S. Grond, I. Papastavrou and A. Zeeck, Eur. J. Org. Chem., 2002, 2002, 3237–3242 Search PubMed.
  184. C. R. Goodwin, B. C. Covington, D. K. Derewacz, C. R. McNees, J. P. Wikswo, J. A. McLean and B. O. Bachmann, Chem. Biol., 2015, 22, 661–670 Search PubMed.
  185. D. Krug, G. Zurek, B. Schneider, R. Garcia and R. Müller, Anal. Chim. Acta, 2008, 624, 97–106 Search PubMed.
  186. N. S. Cortina, D. Krug, A. Plaza, O. Revermann and R. Müller, Angew. Chem., Int. Ed. Engl., 2012, 51, 811–816 Search PubMed.
  187. D. Krug, G. Zurek, O. Revermann, M. Vos, G. J. Velicer and R. Muller, Appl. Environ. Microbiol., 2008, 74, 3058–3068 Search PubMed.
  188. S. Wiklund, E. Johansson and L. Sjöström, Anal. Chem., 2008, 80, 115–122 Search PubMed.
  189. V. González-Menéndez, M. Pérez-Bonilla, I. Pérez-Victoria, J. Martín, F. Muñoz, F. Reyes, J. Tormo and O. Genilloud, Molecules, 2016, 21, 234 Search PubMed.
  190. M. Hur, A. A. Campbell, M. Almeida-de-Macedo, L. Li, N. Ransom, A. Jose, M. Crispin, B. J. Nikolau and E. S. Wurtele, Nat. Prod. Rep., 2013, 30, 565–583 Search PubMed.
  191. J. C. Albright, M. T. Henke, A. A. Soukup, R. A. McClure, R. J. Thomson, N. P. Keller and N. L. Kelleher, ACS Chem. Biol., 2015, 10, 1535–1541 Search PubMed.
  192. R. Tautenhahn, G. J. Patti, D. Rinehart and G. Siuzdak, Anal. Chem., 2012, 84, 5035–5039 Search PubMed.
  193. H. Gowda, J. Ivanisevic, C. H. Johnson, M. E. Kurczy, H. P. Benton, D. Rinehart, T. Nguyen, J. Ray, J. Kuehl, B. Arevalo, P. D. Westenskow, J. Wang, A. P. Arkin, A. M. Deutschbauer, G. J. Patti and G. Siuzdak, Anal. Chem., 2014, 86, 6931–6939 Search PubMed.
  194. J. Xia and D. S. Wishart, Curr. Protoc. Bioinformatics, 2011, ch. 14, unit 14 Search PubMed.
  195. J. Xia, I. V. Sinelnikov, B. Han and D. S. Wishart, Nucleic Acids Res., 2015, 43, W251–W257 Search PubMed.
  196. G. J. Patti, R. Tautenhahn, D. Rinehart, K. Cho, L. Shriver, M. Manchester, I. Nikolskiy, C. Johnson, N. Mahieu and G. Siuzdak, Anal. Chem., 2013, 85, 798–804 Search PubMed.
  197. C. R. Goodwin, S. D. Sherrod, C. C. Marasco, B. O. Bachmann, N. Schramm-Sapyta, J. P. Wikswo and J. A. McLean, Anal. Chem., 2014, 86, 6563–6571 CrossRef CAS PubMed.
  198. S. P. Gaudêncio and F. Pereira, Nat. Prod. Rep., 2015, 32, 779–810 Search PubMed.
  199. P. J. Eugster, J. Boccard, B. Debrus, L. Bréant, J.-L. Wolfender, S. Martel and P.-A. Carrupt, Phytochemistry, 2014, 108, 196–207 Search PubMed.
  200. S. M. Stow, C. R. Goodwin, M. Kliman, B. O. Bachmann, J. A. McLean and T. P. Lybrand, J. Phys. Chem. B, 2014, 118, 13812–13820 Search PubMed.
  201. C. A. Smith, G. O’Maille, E. J. Want, C. Qin, S. A. Trauger, T. R. Brandon, D. E. Custodio, R. Abagyan and G. Siuzdak, Ther. Drug Monit., 2005, 27, 747–751 Search PubMed.
  202. H. Horai, M. Arita, S. Kanaya, Y. Nihei, T. Ikeda, K. Suwa, Y. Ojima, K. Tanaka, S. Tanaka, K. Aoshima, Y. Oda, Y. Kakazu, M. Kusano, T. Tohge, F. Matsuda, Y. Sawada, M. Y. Hirai, H. Nakanishi, K. Ikeda, N. Akimoto, T. Maoka, H. Takahashi, T. Ara, N. Sakurai, H. Suzuki, D. Shibata, S. Neumann, T. Iida, K. Tanaka, K. Funatsu, F. Matsuura, T. Soga, R. Taguchi, K. Saito and T. Nishioka, J. Mass Spectrom., 2010, 45, 703–714 Search PubMed.
  203. H. Shen, K. Dührkop, S. Böcker and J. Rousu, Bioinformatics, 2014, 30, i157–i164 Search PubMed.
  204. J. Bérdy, J. Antibiot., 2012, 65, 385–395 Search PubMed.
  205. F. Hufsky, K. Scheubert and S. Böcker, Nat. Prod. Rep., 2014, 31, 807–817 Search PubMed.
  206. P.-M. M. Allard, T. Péresse, J. Bisson, K. Gindro, L. Marcourt, V. C. Pham, F. Roussi, M. Litaudon and J.-L. L. Wolfender, Anal. Chem., 2016, 88, 3317–3323 Search PubMed.
  207. J. Watrous, P. Roach, T. Alexandrov, B. S. Heath, J. Y. Yang, R. D. Kersten, M. van der Voort, K. Pogliano, H. Gross, J. M. Raaijmakers, B. S. Moore, J. Laskin, N. Bandeira and P. C. Dorrestein, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 1743–1752 Search PubMed.
  208. J. Y. Yang, L. M. Sanchez, C. M. Rath, X. Liu, P. D. Boudreau, N. Bruns, E. Glukhov, A. Wodtke, R. de Felicio, A. Fenner, W. R. Wong, R. G. Linington, L. Zhang, H. M. Debonsi, W. H. Gerwick and P. C. Dorrestein, J. Nat. Prod., 2013, 76, 1686–1699 Search PubMed.
  209. W. J. Moree, V. V. Phelan, C.-H. Wu, N. Bandeira, D. S. Cornett, B. M. Duggan and P. C. Dorrestein, Proc. Natl. Acad. Sci. U. S. A., 2012, 109, 13811–13816 Search PubMed.
  210. E. Briand, M. Bormans, M. Gugger, P. C. Dorrestein and W. H. Gerwick, Environ. Microbiol., 2015, 18, 384–400 CrossRef PubMed.
  211. K. R. Duncan, M. Crüsemann, A. Lechner, A. Sarkar, J. Li, N. Ziemert, M. Wang, N. Bandeira, B. S. Moore, P. C. Dorrestein and P. R. Jensen, Chem. Biol., 2015, 22, 460–471 CrossRef CAS PubMed.
  212. K. Kleigrewe, J. Almaliti, I. Y. Tian and R. B. Kinnel, J. Nat. Prod., 2015, 78, 1671–1682 Search PubMed.
  213. M. I. Vizcaino and J. M. Crawford, Nat. Chem., 2015, 7, 411–417 Search PubMed.
  214. E. P. Trautman and J. M. Crawford, Curr. Top. Med. Chem., 2015, 16, 1705–1716 CrossRef.
  215. M. I. Vizcaino, P. Engel, E. Trautman and J. M. Crawford, J. Am. Chem. Soc., 2014, 136, 9244–9247 Search PubMed.
  216. Y. Luo, B.-Z. Li, D. Liu, L. Zhang, Y. Chen, B. Jia, B.-X. Zeng, H. Zhao and Y.-J. Yuan, Chem. Soc. Rev., 2015, 44, 5265–5290 Search PubMed.
  217. H. Zhang, B. A. Boghigian and J. Armando, Nat. Prod. Rep., 2011, 28, 125–151 Search PubMed.
  218. M. S. Donia, D. E. Ruffner, S. Cao and E. W. Schmidt, ChemBioChem, 2011, 12, 1230–1236 Search PubMed.
  219. M. Schorn, J. Zettler, J. P. Noel and P. C. Dorrestein, ACS Chem. Biol., 2013, 9, 301–309 CrossRef PubMed.
  220. D. J. Gonzalez, R. Corriden, K. Akong-Moore and J. Olson, Chem. Biol., 2014, 21, 1457–1462 Search PubMed.
  221. D. J. Gonzalez, L. Vuong, I. S. Gonzalez, N. Keller, D. McGrosso, J. H. Hwang, J. Hung, A. Zinkernagel, J. E. Dixon and P. C. Dorrestein, Mol. Cell. Proteomics, 2014, 13, 1262–1272 Search PubMed.
  222. V. R. Macherla, J. Liu, M. Sunga and D. J. White, J. Nat. Prod., 2007, 70, 1454–1457 Search PubMed.
  223. K. Desjardine, A. Pereira and H. Wright, J. Nat. Prod., 2007, 70, 1850–1853 Search PubMed.
  224. K. A. McArthur, S. S. Mitchell and G. Tsueng, J. Nat. Prod., 2008, 71, 1732–1737 Search PubMed.
  225. M. E. Teasdale, T. L. Shearer and S. Engel, J. Org. Chem., 2012, 77, 8000–8006 Search PubMed.
  226. J. J. Kellogg, D. A. Todd, J. M. Egan and H. A. Raja, J. Nat. Prod., 2016, 79, 376–386 Search PubMed.
  227. L. A. Salvador-Reyes and H. Luesch, Nat. Prod. Rep., 2015, 32, 478–503 Search PubMed.
  228. C. W. Johnston, M. A. Skinnider, C. A. Dejong, P. N. Rees, G. M. Chen, C. G. Walker, S. French, E. D. Brown, J. Bérdy, D. Y. Liu and N. A. Magarvey, Nat. Chem. Biol., 2016, 12, 233–239 Search PubMed.
  229. M. G. Rees, B. Seashore-Ludlow, J. H. Cheah, D. J. Adams, E. V. Price, S. Gill, S. Javaid, M. E. Coletti, V. L. Jones, N. E. Bodycombe, C. K. Soule, B. Alexander, A. Li, P. Montgomery, J. D. Kotz, C. S. Hon, B. Munoz, T. Liefeld, V. Dančík, D. A. Haber, C. B. Clish, J. A. Bittker, M. Palmer, B. K. Wagner, P. A. Clemons, A. F. Shamji and S. L. Schreiber, Nat. Chem. Biol., 2015, 12, 109–116 Search PubMed.
  230. M. A. Farha and E. D. Brown, Nat. Prod. Rep., 2016, 33, 668–680 Search PubMed.
  231. W. R. Wong, A. G. Oliver and R. G. Linington, Chem. Biol., 2012, 19, 1483–1495 Search PubMed.
  232. J. Lamb, E. D. Crawford, D. Peck, J. W. Modell and I. C. Blat, Science, 2006, 313, 1929–1935 Search PubMed.
  233. B. Hutter, C. Schaab, S. Albrecht, M. Borgmann, N. A. Brunner, C. Freiberg, K. Ziegelbauer, C. O. Rock, I. Ivanov and H. Loferer, Antimicrob. Agents Chemother., 2004, 48, 2838–2844 Search PubMed.
  234. Y. Hu, M. B. Potts, D. Colosimo, M. L. Herrera-Herrera, A. G. Legako, M. Yousufuddin, M. A. White and J. B. MacMillan, J. Am. Chem. Soc., 2013, 135, 13387–13392 Search PubMed.
  235. M. B. Potts, H. S. Kim, K. W. Fisher, Y. Hu and Y. P. Carrasco, Sci. Signaling, 2013, 6, ra90 CrossRef PubMed.
  236. Y. Yu, Z. Yi and Y. Z. Liang, FEBS Lett., 2007, 581, 4179–4183 Search PubMed.
  237. Y. Liu, J. Wen, Y. Wang, Y. Li and W. Xu, Chromatographia, 2010, 71, 253–528 Search PubMed.
  238. S. Halouska, O. Chacon, R. J. Fenton, D. K. Zinniel, R. G. Barletta and R. Powers, J. Proteome Res., 2007, 6, 4608–4614 Search PubMed.
  239. S. Halouska, R. J. Fenton, R. G. Barletta and R. Powers, ACS Chem. Biol., 2012, 7, 166–171 Search PubMed.
  240. I. M. Vincent, S. Weidt, L. Rivas, K. Burgess, T. K. Smith and M. Ouellette, Int. J. Parasitol.: Drugs Drug Resist., 2014, 4, 20–27 CrossRef PubMed.
  241. I. M. Vincent, D. J. Creek, K. Burgess and D. J. Woods, PLoS Neglected Trop. Dis., 2012, 6, e1618 Search PubMed.
  242. A. Trochine, D. J. Creek, P. Faral-Tello and M. P. Barrett, PLoS Neglected Trop. Dis., 2014, 8, e2844 Search PubMed.
  243. I. M. Vincent, D. E. Ehmann, S. Mills and M. Perros, Antimicrob. Agents Chemother., 2016, 4, 20–27 Search PubMed.
  244. M. A. Lobritz, P. Belenky, C. B. Porter, A. Gutierrez, J. H. Yang, E. G. Schwarz, D. J. Dwyer, A. S. Khalil and J. J. Collins, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 8173–8180 Search PubMed.
  245. Z. E. Perlman, M. D. Slack, Y. Feng and T. J. Mitchison, Science, 2004, 306, 1194–1198 Search PubMed.
  246. M. H. Woehrmann, W. M. Bray, J. K. Durbin and S. C. Nisam, Mol. Biosyst., 2013, 9, 2604–2617 Search PubMed.
  247. K. C. Peach, W. M. Bray, D. Winslow and P. F. Linington, Mol. Biosyst., 2013, 9, 1837–1848 Search PubMed.
  248. G. Navarro, A. T. Cheng and K. C. Peach, Antimicrob. Agents Chemother., 2014, 58, 1092–1099 Search PubMed.
  249. C. J. Schulze, W. M. Bray, M. H. Woerhmann and J. Stuart, Chem. Biol., 2013, 20, 285–295 Search PubMed.
  250. K. L. Kurita, E. Glassey and R. G. Linington, Proc. Natl. Acad. Sci. U. S. A., 2015, 112, 11999–12004 Search PubMed.
  251. E. W. Schmidt, Nat. Chem., 2015, 7, 375–376 Search PubMed.
  252. G. Sharon, N. Garg, J. Debelius, R. Knight, P. C. Dorrestein and S. K. Mazmanian, Cell Metab., 2014, 20, 719–730 CrossRef CAS PubMed.
  253. L. V. Hooper, D. R. Littman and A. J. Macpherson, Science, 2012, 336, 1268–1273 CrossRef CAS PubMed.
  254. T. B. Clarke, K. M. Davis, E. S. Lysenko, A. Y. Zhou and Y. Yu, Nat. Med., 2010, 16, 228–231 CrossRef CAS PubMed.
  255. N. Koppel and E. P. Balskus, Cell Chemical Biology, 2016, 23, 18–30 Search PubMed.

This journal is © The Royal Society of Chemistry 2017