Kathleen R. Murphy,*(a,b) Colin A. Stedmon,(c) Philip Wenig(d) and Rasmus Bro(e)
(a) University of New South Wales, Water Research Centre, Sydney, Australia. E-mail: krm@unsw.edu.au; Fax: +61 2 9313 8624; Tel: +61 2 9385 4601
(b) Chalmers University of Technology, Water Environment Technology, Gothenburg, Sweden
(c) Technical University of Denmark, National Institute for Aquatic Resources, Charlottenlund, Denmark. E-mail: cost@aqua.dtu.dk
(d) Ernst-Kabel-Stieg 5a, 22087 Hamburg, Germany. E-mail: philip.wenig@gmx.net
(e) University of Copenhagen, Dept. Food Science, Frederiksberg, Denmark. E-mail: rb@life.ku.dk
First published on 11th December 2013
An online repository of published organic fluorescence spectra has been developed, which can be searched for quantitative matches with any set of unknown spectra. It fills a critical gap by increasing access to measured and modelled (PARAFAC) spectra, and linking across studies and systems to reveal “global” fluorescence trends.
Fig. 1 A dataset of fluorescence excitation emission matrices (EEMs) decomposed into six underlying components using PARAFAC.
There are now well over 100 published PARAFAC models of dissolved and natural organic matter (both referred to hereafter as NOM) and over 500 published PARAFAC components.9,10
However, no agreed measure exists for determining whether the same PARAFAC components were found in different studies. Furthermore, while scientists have some idea of the chemical structures likely to be responsible for NOM fluorescence, few reference data are readily available and even fewer studies have drawn reliable comparisons between PARAFAC components and pure organic compounds. It is presently unclear how often PARAFAC components extracted from NOM accurately represent the spectra of pure compounds or mixtures, or the degree to which PARAFAC decompositions are impaired by potential non-ideal chemical behaviours such as spectral shifting,4 energy or electron transfer,6,11 and charge–transfer interactions.12
It is widely supposed that spectrally similar PARAFAC components extracted from unrelated datasets are attributable to similar organic matter sources, and depict the same or similar underlying compounds having similar ecological functions. However, since the spectra of published PARAFAC components are typically available only as images or summary tables in the original publications, this hypothesis is extremely difficult to test. Thus, Ishii and Boyer13 recently reviewed the reported distributions and responses to physicochemical processes of three apparently widespread humic-like PARAFAC components, finding numerous inconsistencies between studies with regard to their reported behaviours. However, in that review as in the overwhelming majority of reviewed studies, PARAFAC components were equated on the basis of broad criteria such as the number and positions of spectral peaks, with peak positions approximately defined and allowed to vary over a broad wavelength range. In the earlier literature, PARAFAC components have been equated to specific compounds and redox states with little or no quantification of spectral similarity. This widespread use of qualitative or subjective criteria for equating components between studies is a serious confounding factor for interpreting global trends in component distributions and behaviours, or for deducing the organic structures likely to be responsible for the observed patterns. Recent papers have emphasised the importance of standardised approaches to measuring EEMs14,15 and deriving PARAFAC models,9,16 and a systematic way of comparing the results of different studies is urgently needed.
To support quantitative comparisons of fluorescence spectra between studies, an open-access spectral database (http://www.openfluor.org) has been developed. The database is accessible using any modern web browser (e.g. Mozilla, Chrome, Internet Explorer) on desktops, tablets or smartphones. All interactions between the user and the database occur via a simple graphical user interface with no programming necessary. The supporting use of HTML5, jQuery and JavaScript creates a rich and interactive graphical user interface within the browser. When a search query is run on a set of unknown spectra, quantitatively similar spectra are retrieved from the database.
Algorithms for quantifying spectral similarity have been the subject of extensive research in other branches of analytical chemistry,17–19 but are undeveloped in the context of fluorescence. Currently, OpenFluor identifies similar spectra as having a Tucker congruence20 (θ) exceeding 0.95 on the excitation and emission spectra simultaneously (eqn (1)). A more targeted search for matching spectra will be implemented in the future as improved algorithms for matching spectra become available.
$$\theta = \theta_{\mathrm{ex}} \times \theta_{\mathrm{em}} \geq 0.95 \qquad (1)$$
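The matching criterion in eqn (1) can be illustrated with a minimal Python sketch. It is not OpenFluor's own implementation: the function names `tucker_congruence` and `spectra_match` are hypothetical, and the sketch assumes that both components have already been sampled on the same excitation and emission wavelength grids. The congruence between two spectra is computed in the standard way, as their inner product divided by the product of their Euclidean norms, and the threshold is applied to the product of the excitation and emission values as written in eqn (1).

```python
import numpy as np


def tucker_congruence(x, y):
    """Tucker congruence between two spectra sampled on the same wavelengths."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("spectra must share the same wavelength grid")
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    if denom == 0.0:
        raise ValueError("congruence is undefined for an all-zero spectrum")
    return float(np.dot(x, y) / denom)


def spectra_match(ex_a, em_a, ex_b, em_b, threshold=0.95):
    """Apply eqn (1): theta = theta_ex * theta_em must reach the threshold.

    Returns (is_match, theta). The default threshold of 0.95 is taken from
    the text; everything else here is an illustrative assumption.
    """
    theta_ex = tucker_congruence(ex_a, ex_b)
    theta_em = tucker_congruence(em_a, em_b)
    theta = theta_ex * theta_em
    return theta >= threshold, theta
```

Because the congruence is normalised by the vector norms, it is insensitive to the overall scaling of each spectrum, so components reported with different intensity normalisations can still be compared. Spectra measured on different wavelength grids would first need to be interpolated onto a common grid, a step this sketch leaves out.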
Records in the OpenFluor database are accompanied by synopses of the study that generated the data, including a short methodological description and an active link to the published record at http://dx.doi.org. Unregistered visitors to the website may temporarily upload spectra and search for quantitatively similar spectra in the database. Completion of a free one-time registration process allows the user to browse descriptions of matching models, generate plots, and download matched data. Registered users may elect to submit published spectra to the database, thereby making their own research results available for searching by other members of the fluorescence community.
Fig. 2 illustrates the potential for a spectral database to reveal similarities as well as differences between PARAFAC spectra. Each of the humic-like components depicted in Fig. 2A–C fulfils the description of “reoccurring Component 2” given by Ishii and Boyer13 (excitation maxima approximately <240–275 nm and 339–420 nm; emission approximately 434–520 nm). Dozens of other spectra in the OpenFluor database also conform to this general description, yet are relatively poor quantitative matches for these spectra. In Fig. 2B, the four PARAFAC components shown share nearly identical emission spectra, but their excitation spectra fall into two distinct groups: the components from water treatment plants in Denmark21 and Australia5 have different excitation spectra from those in the models of datasets from the Florida Everglades22 and the South Atlantic Bight.23 Since nearly all components have primary excitation maxima near the limits of the measured or modelled range (<250 nm), they are mainly distinguishable by the position of their secondary excitation peak in conjunction with the position of the emission maximum (Cex/em). In Fig. 2C, the strongly overlapping components shown appear to cluster mainly in two sets, described here as C400/518 nm and C380/500 nm. The ESI† lists published sources for components in Fig. 2.
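The Cex/em labelling used above can be sketched in Python as follows. This is an illustrative assumption rather than a procedure taken from the paper: the helper names `secondary_excitation_peak` and `component_label` are hypothetical, and the 250 nm cutoff used to exclude the primary excitation maximum is a parameter chosen to mirror the range limit mentioned in the text.

```python
import numpy as np


def secondary_excitation_peak(ex_wl, ex_loading, min_wl=250.0):
    """Wavelength of the highest interior local maximum above min_wl, or None."""
    wl = np.asarray(ex_wl, dtype=float)
    y = np.asarray(ex_loading, dtype=float)
    idx = np.arange(1, len(y) - 1)  # interior points only
    is_peak = (y[idx] > y[idx - 1]) & (y[idx] >= y[idx + 1])
    candidates = idx[is_peak & (wl[idx] > min_wl)]
    if candidates.size == 0:
        return None
    best = candidates[np.argmax(y[candidates])]
    return float(wl[best])


def component_label(ex_wl, ex_loading, em_wl, em_loading, min_wl=250.0):
    """Build a label of the form 'C<ex>/<em>' from the peak positions in nm."""
    ex_peak = secondary_excitation_peak(ex_wl, ex_loading, min_wl)
    if ex_peak is None:
        return None
    em_wl = np.asarray(em_wl, dtype=float)
    em_peak = float(em_wl[np.argmax(np.asarray(em_loading, dtype=float))])
    return f"C{ex_peak:.0f}/{em_peak:.0f}"
```

Searching for local maxima, rather than taking the largest loading above the cutoff, avoids mistaking the tail of the primary excitation peak for the secondary peak. The resolution of the resulting label is limited by the wavelength step of the measured or modelled spectra.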
Fig. 2D–F depict humic-like components identified in a number of studies, each fulfilling the description of Ishii and Boyer's reoccurring component 3 (excitation maxima approximately 240–260 nm and 295–380 nm, and emission maximum approximately 374–450 nm).13 The component depicted in Fig. 2D was identified repeatedly in a study of water treatment plants around Australia,5 in which samples were measured on a single instrument but independent PARAFAC models were developed for each plant. A similar component is seen in several other studies (Fig. 2E), although those spectra are more variable. Fig. 2F depicts a different component, or given the apparent continuum of peak locations, possibly a suite of components representing different compounds or groups thereof. As the number of datasets in OpenFluor increases, a more robust picture of such components should emerge.
Fig. 2G–I illustrate three different protein-like components in the database that have each been described as “tryptophan-like”. Fig. 2G depicts a component common to studies that sampled in Baltic24 sea ice, Antarctic25 sea ice, the North Atlantic ocean24,26 and the Florida Everglades.22 The spectra are extremely similar in each study, down to fine detail in the emission spectra, which suggests that a discrete organic compound rather than a mixture of compounds may be responsible for this signal. Fig. 2H depicts a component identified in models from natural and artificial environments.5,27,28 The component depicted using dashed lines in this figure was strongly correlated with lignin concentration in one study.27 Fig. 2I depicts a commonly-observed component with spectra similar to free dissolved tryptophan. The shape of the emission spectrum for this component differs between studies, possibly because it is derived from a group of compounds, and possibly also because interference by Raman scatter makes it difficult to accurately resolve its spectra.
The OpenFluor spectral database aims to address a serious deficiency affecting the current interpretation of NOM-PARAFAC models. Thus, although it is widely assumed that spectrally similar PARAFAC components identified in unrelated studies have similar sources and ecological functions, quantitative spectral comparisons have been implemented only rarely5,10 and with respect to a small number of studies. At the same time, many studies have drawn conclusions about the origins and behaviours of various components on the basis of qualitative comparisons with earlier studies. It is therefore likely that inconsistencies between reported behaviours of similar PARAFAC components are at least partly attributable to the unintentional grouping of NOM components that are spectrally similar, yet chemically and behaviourally distinct.
It is also important to realise that many fluorophores could have very similar spectra, so identifying similar PARAFAC components in two different studies does not guarantee that the same compounds are responsible in both cases. Fig. 3 compares a PARAFAC component identified in the Mackenzie River plume29 in northern Canada with the spectrum of pure dissolved sodium salicylate (C.A. Stedmon, unpublished data), a common pharmaceutical derived from wintergreen plants. Since the Mackenzie River watershed is mostly covered by virgin forests and wetlands and is minimally influenced by human activities,30 a pharmaceutical source for this component can be ruled out. Instead, it is more likely to represent forest-derived phenolic compounds with very similar spectral characteristics to sodium salicylate. The database may therefore be more useful for detecting patterns in the occurrence of fluorescence components, and deducing relationships between them, than as a tool for identifying the specific chemical structures responsible for the observed signals.
Fig. 3 Fluorescence spectra of a terrestrially derived PARAFAC component in the nearly pristine Mackenzie River watershed29 (lines) compared with pure dissolved sodium salicylate (dashes).
Footnote
† Electronic supplementary information (ESI) available: Source of PARAFAC components depicted in Fig. 2. See DOI: 10.1039/c3ay41935e
This journal is © The Royal Society of Chemistry 2014