Pasi Soininen‡ ab, Antti J. Kangas‡ b, Peter Würtz bc, Taru Tukiainen b, Tuulia Tynkkynen a, Reino Laatikainen a, Marjo-Riitta Järvelin bdef, Mika Kähönen g, Terho Lehtimäki h, Jorma Viikari i, Olli T. Raitakari c, Markku J. Savolainen bj and Mika Ala-Korpela* abj
a NMR Metabonomics Laboratory, Laboratory of Chemistry, Department of Biosciences, University of Kuopio, Kuopio, Finland
b Computational Medicine Research Group, Institute of Clinical Medicine, Faculty of Medicine, University of Oulu and Biocenter Oulu, P.O. Box 5000, FI-90014 University of Oulu, Oulu, Finland. E-mail: mika.ala-korpela@computationalmedicine.fi; Tel: +358 50 35 35 457; Web: http://www.computationalmedicine.fi/
c Department of Clinical Physiology and the Research Centre of Applied and Preventive Cardiovascular Medicine, University of Turku and Turku University Central Hospital, Turku, Finland
d National Institute of Health and Wellbeing, Department of Child and Adolescent Health, Oulu, Finland
e Department of Epidemiology and Public Health, Imperial College London, London, UK
f Institute of Health Sciences and Biocenter Oulu, University of Oulu, Oulu, Finland
g Department of Clinical Physiology, Tampere University Hospital and University of Tampere, Tampere, Finland
h Department of Clinical Chemistry, Tampere University Hospital and University of Tampere, Tampere, Finland
i Department of Medicine, University of Turku and Turku University Central Hospital, Turku, Finland
j Department of Internal Medicine and Biocenter Oulu, Clinical Research Center, University of Oulu, Oulu, Finland
First published on 30th July 2009
A high-throughput proton (1H) nuclear magnetic resonance (NMR) metabonomics approach is introduced to characterise systemic metabolic phenotypes. The methodology combines two molecular windows that contain the majority of the metabolic information available by 1H NMR from native serum, for example serum lipids, lipoprotein subclasses, and various low-molecular-weight metabolites. The experimentation is robotics-controlled and fully automated with a capacity of about 150–180 samples in 24 h. To the best of our knowledge, the presented set-up is unique in terms of experimental throughput, cost-effectiveness, and automated multi-metabolic data analyses. As an example, we demonstrate that the NMR data as such reveal associations between systemic metabolic phenotypes and the metabolic syndrome (n = 4407). The throughput of up to 50 000 serum samples per year is also paving the way for this technology in large-scale clinical and epidemiological studies. In contrast to single ‘biomarkers’, the application of this holistic NMR approach and the integrated computational methods provides a data-driven systems biology approach to biomedical research.
Proton NMR offers high analytical reproducibility and provides specific quantitative data in a non-selective manner.9,10 Thus, it is well suited for probing continuous multi-metabolic variations with respect to potential systemic effectors, for example, in the grey zone of atherothrombosis pathophysiology between health and disease.11,12 While the biochemistry of serum is reflected by the atherothrombotic processes in the arterial wall (and vice versa), the biological heterogeneity as well as the slow development and progression of pathological conditions make the borderline between health and disease inherently indistinct. It is in these types of situations that single ‘biomarkers’ are currently being recognised to fail in describing the complex molecular foundations of diseases, and that metabonomics approaches have a remarkable potential to provide cost-effective solutions via high-throughput analytics and advanced computational methods.1–15
Here, we describe a new 1H NMR spectroscopy protocol for high-throughput metabonomics of serum to characterise individual systemic metabolic phenotypes. Blood serum is the primary body fluid connected to systemic metabolism and is therefore the natural choice for studies related to vascular and systemic diseases.1,4,8–12 Proton NMR per se allows fast and reliable detection of a large number of metabolites. However, the molecular variety and multiple environments, particularly in serum, partly hamper the molecular identification and quantification. To overcome this issue, we have adopted an approach based on two molecular windows, LIPO and LMWM. The LIPO window represents a conventional water-suppressed 1H NMR spectrum of serum providing mainly information on lipoprotein lipids and subclasses, while the LMWM window is acquired in such a way that the majority of the broad signals characterising the LIPO window are not visible and the detection of the low-molecular-weight metabolites is considerably improved. These combined data represent the majority of the metabolic information available by 1H NMR metabonomics of native serum.1,4,9,10 The spectral characteristics and the metabolic contents of these molecular windows are illustrated in Fig. 1. The protocol to be presented was established, optimised, and tested in our laboratory over approximately six months. In addition to all the testing and optimisation, we have now run around 12 000 actual study samples from a few epidemiological studies.
Fig. 1 The NMR spectral characteristics and the metabolic contents of the two molecular windows: LIPO and LMWM. The LIPO window is dominated by broad signals arising from macromolecules, mainly lipoprotein lipids and albumin. Despite the broad overall characteristics and heavy overlap of the resonances, appropriate data analyses provide abundant information on lipoprotein particles as indicated by the inset illustrating the lipoprotein subclasses. In the LMWM window a pulse sequence that suppresses the macromolecule signals is applied, thus enhancing the detection of smaller solutes. The residual water peak region (4.2–5.0 ppm) in the LIPO and LMWM windows is not shown. Abbreviations used: HDL, high-density lipoprotein; IDL, intermediate-density lipoprotein; LDL, low-density lipoprotein; VLDL, very-low-density lipoprotein.
Fig. 2 Some of the key technical devices for high-throughput serum NMR metabonomics. (a) A Gilson 215 Liquid Handler for automatic sample preparation. (b) Prepared serum samples in a 96-tube rack and a 5 mm outer-diameter SampleJet NMR tube with serum (300 µl of original serum + 300 µl of the NMR buffer). (c) View inside the SampleJet™ (Bruker BioSpin GmbH, Germany) sample changer showing the well-plate positions to place the sample racks. The robotic sample changer includes a cooling unit as well as a preheating unit. The actual experimentation takes place in an NMR probehead inside the superconducting magnet.
The LMWM data are acquired with spectrometer settings (a T2-relaxation-filtered pulse sequence) that suppress most of the broad macromolecule and lipoprotein lipid signals and thereby enhance the detection of rapidly tumbling smaller solutes. A characteristic LMWM spectrum is shown at the bottom of Fig. 1. The LMWM data are recorded with 64k data points using 24 (or 16) transients acquired after 4 steady-state scans with a Bruker 1D CPMG pulse sequence with water peak suppression and a 78 ms T2-filter with a fixed echo delay of 403 µs to minimise diffusion and J-modulation effects. The acquisition time is 3.3 s and the relaxation delay 3.0 s.
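The quoted LMWM parameters allow a rough estimate of the time spent on this window per sample. The sketch below is only back-of-the-envelope arithmetic: it assumes that every scan costs the relaxation delay plus the T2-filter plus the acquisition time, that the steady-state scans cost the same as recorded transients, and it ignores water presaturation details, sample changing, and shimming. The function name is hypothetical.

```python
# Rough duration of one LMWM experiment from the parameters quoted in the text.
# Assumption: per-scan cost = relaxation delay + T2 filter + acquisition time,
# and steady-state (dummy) scans take as long as recorded transients.

def lmwm_experiment_seconds(transients, dummy_scans=4,
                            relaxation_delay=3.0, t2_filter=0.078,
                            acquisition_time=3.3):
    per_scan = relaxation_delay + t2_filter + acquisition_time
    return (transients + dummy_scans) * per_scan

for n in (24, 16):
    seconds = lmwm_experiment_seconds(n)
    print(f"{n} transients: {seconds:.1f} s ({seconds / 60:.1f} min)")
```

Under these assumptions, dropping from 24 to 16 transients saves a little under one minute per sample.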
Both LIPO and LMWM data are processed and phase corrected in an automated fashion. Prior to Fourier transformations to spectra, the measured free induction decays for both LIPO and LMWM windows are zero-filled to 128k data points and then multiplied with an exponential window function with a 1.0 Hz line broadening. In total, the experimental time needed to record and process the LIPO and LMWM windows for one serum sample is less than 9 min (or around 8 min if 16 transients are acquired for the LMWM data), and the current set-up enables measurement of up to 180 serum samples per 24 h.
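The processing steps described above (zero-filling, exponential multiplication, Fourier transformation) can be sketched with NumPy as follows. This is an illustrative sketch and not the authors' processing code: it assumes that "64k" and "128k" mean 65 536 and 131 072 complex points, uses the standard exponential window exp(−π·LB·t), omits phase correction, and runs on a synthetic single-resonance signal rather than real serum data. The function name `process_fid` is hypothetical.

```python
import numpy as np

def process_fid(fid, acquisition_time, zero_fill_to=131072, line_broadening_hz=1.0):
    """Zero-fill, apodise and Fourier transform a complex FID (no phasing)."""
    fid = np.asarray(fid, dtype=complex)
    n = fid.shape[0]
    dwell = acquisition_time / n                 # seconds per complex point
    padded = np.zeros(zero_fill_to, dtype=complex)
    padded[:n] = fid                             # zero-filling
    t = np.arange(zero_fill_to) * dwell
    window = np.exp(-np.pi * line_broadening_hz * t)   # adds LB Hz of linewidth
    spectrum = np.fft.fftshift(np.fft.fft(padded * window))
    freqs = np.fft.fftshift(np.fft.fftfreq(zero_fill_to, d=dwell))
    return freqs, spectrum

# Synthetic test signal: one resonance at +500 Hz with a 0.5 s decay constant.
n_points, acq_time = 65536, 3.3
t = np.arange(n_points) * (acq_time / n_points)
fid = np.exp(2j * np.pi * 500.0 * t) * np.exp(-t / 0.5)

freqs, spectrum = process_fid(fid, acq_time)
peak_hz = freqs[np.argmax(np.abs(spectrum))]
print(f"peak found at {peak_hz:.2f} Hz")
```

Zero-filling to twice the recorded length halves the spacing between spectral points, while the 1.0 Hz exponential window trades a small amount of resolution for an improved signal-to-noise ratio.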
The spectroscopic data can be seen either as a holistic profile for statistical classification of the sample (in terms of, e.g., metabolic phenotypes, risk assessment, or diagnostics) or as a quantitative source for particular metabolites (e.g. triglycerides or glucose). In most of the published metabonomics applications the former approach has been taken.1,8,17 It is quite a natural choice and makes the statistical analyses rather straightforward since there is no need to focus on individual resonances and their quantification. Instead, the strategy is to commence relating the whole spectral pattern to the biochemical and/or clinical condition of interest with the aid of statistical and chemometric data analyses.1,8,17
We usually start the data analysis and visualisation by applying self-organising map (SOM) analysis to the combined experimental NMR data from the two molecular windows. The SOM is an unsupervised pattern recognition technique that organises the input data according to data-driven similarity criteria. The end result is a two-dimensional map where, in this context, mutually similar NMR spectra of serum (as representatives of the individual metabolic phenotypes) are placed next to each other and on which all the clinical and biochemical measures can easily be visualised and compared.4,5,18 We have recently developed and implemented the SOM analysis into a metabonomics framework with incorporated p-value statistics.4,5
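To make the idea concrete, a minimal SOM can be sketched in a few lines of NumPy. This is a generic textbook-style online SOM, not the framework referred to above: it assumes a rectangular grid (the maps in this work use hexagonal units), linearly decaying learning rate and neighbourhood width, and no p-value statistics. The function names and the synthetic two-cluster data are hypothetical stand-ins for real spectra.

```python
import numpy as np

def train_som(data, rows=6, cols=8, epochs=20, seed=0):
    """Train a small rectangular SOM; returns unit weights and grid coordinates."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Initialise each unit from a randomly chosen sample.
    weights = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * n_samples
    sigma0 = max(rows, cols) / 2.0
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            frac = step / n_steps
            lr = 0.5 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighbourhood
            bmu = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            grid_d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (data[i] - weights)
            step += 1
    return weights, grid

def best_matching_units(data, weights):
    """Index of the closest unit for every sample."""
    d2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Synthetic stand-in for spectra: two well-separated groups of 20-point vectors.
rng = np.random.default_rng(1)
spectra = np.vstack([rng.normal(0.0, 1.0, size=(50, 20)),
                     rng.normal(5.0, 1.0, size=(50, 20))])
weights, grid = train_som(spectra, rows=4, cols=4, epochs=10)
bmus = best_matching_units(spectra, weights)
print("units occupied by group 1:", sorted(set(bmus[:50].tolist())))
print("units occupied by group 2:", sorted(set(bmus[50:].tolist())))
```

Because training uses only the input vectors, the resulting placement of samples on the map is independent of any clinical labels.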
In the long run, we believe that bringing metabonomics to clinical and epidemiological use necessitates a more specific metabolite approach than the one currently provided by the chemometric approaches analysing the entire spectral data.1,8,17,19 In the case of lipoprotein lipid quantification from the LIPO window, we are implementing automated regression models with which the computation of the measures is instant (since the analysis with trained models is non-iterative).20 The method of choice to quantify the low-molecular-weight metabolites (as well as the residual lipoprotein lipid resonances) from the LMWM spectra is iterative lineshape fitting analysis.9,21
Fig. 3 shows the SOM analysis of the LIPO and LMWM spectra together with visualisation of some associated biochemical and clinical measures for 4407 serum samples from the Cardiovascular Risk in Young Finns Study.22 It is notable that the data analysis via the SOM was based solely on the NMR spectral data and thereby provides purely data-driven metabolic phenotyping. The NMR-independent clinical and biochemical variables were used only to elucidate and interpret the observed structure of the SOM.
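The way an NMR-independent variable is overlaid on a trained map can be illustrated with a short sketch: each sample keeps its map position, and the variable is averaged within each unit and compared with its overall median. This is a simplified illustration under stated assumptions; it does no smoothing across neighbouring units and none of the p-value statistics of the published framework, and the function name and toy numbers are hypothetical.

```python
import numpy as np

def component_plane(bmu_index, values, n_units):
    """Per-unit mean of an external variable, expressed relative to its median."""
    bmu_index = np.asarray(bmu_index, dtype=int)
    values = np.asarray(values, dtype=float)
    sums = np.bincount(bmu_index, weights=values, minlength=n_units)
    counts = np.bincount(bmu_index, minlength=n_units)
    means = np.full(n_units, np.nan)          # empty units stay undefined
    occupied = counts > 0
    means[occupied] = sums[occupied] / counts[occupied]
    # Positive: above the median (reddish); negative: below (bluish).
    return means - np.median(values), counts

# Toy example: four samples placed on a four-unit map.
deviation, counts = component_plane([0, 0, 1, 2], [1.0, 3.0, 5.0, 7.0], n_units=4)
print(deviation, counts)
```

The same sample-to-unit assignment is reused for every variable, which is why a given person sits in the same place in each component plane.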
Fig. 3 Statistical colourings of selected clinical and biochemical variables in the SOM analysis of the combined LIPO and LMWM molecular windows from 1H NMR spectroscopy of 4407 serum samples (corresponding to 8814 spectra, see Fig. 1) from the Cardiovascular Risk in Young Finns Study.22 The SOM analysis positions the samples so that the multi-metabolite differences in the metabolic phenotypes between nearby samples are minimised. A given serum sample (person) is in exactly the same place in each of the component planes. The colouring is according to the corresponding measures of the local residents within each hexagonal unit. The values in each component plane are colour-coded to visualise whether the value is above (reddish), at (white) or below (bluish) the median of the variable. The numbers on selected units give the local mean value for that particular region. MetS refers to the metabolic syndrome with NCEP, IDF and EGIR being three different clinical definitions of the condition, namely by the National Cholesterol Education Program (2005 revised criteria), by the International Diabetes Federation, and by the European Group for the Study of Insulin Resistance, respectively.22 BMI refers to body mass index, SBP to systolic blood pressure, TG to triglycerides, and HDL-C to high-density lipoprotein cholesterol.
The component planes in Fig. 3 relate to clinically defined metabolic syndrome (MetS), a combination of disorders that together increase the risk of developing cardiovascular disease and diabetes.22 The distribution of MetS in Fig. 3 indicates that the metabolic phenotypes characterising the northeast corner of the SOM relate to a high preponderance of MetS (as indicated by the three different clinical definitions of MetS).22 The other seven measures shown in Fig. 3 are all components of the clinical definitions of MetS, namely, waist circumference, body mass index (BMI), systolic blood pressure (SBP), insulin, glucose, serum triglycerides (TG) and high-density lipoprotein cholesterol (HDL-C). In the case of the first six, the MetS is associated with high values and for the HDL-C with low values of the corresponding measures. This indicates that the metabolic data revealed by serum NMR metabonomics bear a clear link to the (patho)physiological pathways connected to the metabolic syndrome. This is by no means a surprise since, for example, glucose, serum TG, and HDL-C are known to be quantifiable from the LIPO window.9,10 Overall, these first preliminary results, obtained using the newly established high-throughput NMR metabonomics set-up and the integrated data analysis, give further support to the applications of serum NMR metabonomics in clinical and epidemiological research.
Footnotes
† Authors' contributions: P. S., A. J. K., P. W., J. V., O. T. R., and M. A. K. conceived and designed the study; P. S., A. J. K., and M. A. K. designed the serum NMR platform; P. S., A. J. K., P. W., T. Tu., T. Ty., R. L., and M. A. K. took part in analysing the spectroscopic data; M. R. J., M. K., T. L., J. V., O. T. R., and M. J. S. collected and interpreted clinical and biochemical data; all authors interpreted the results; P. S., A. J. K., P. W., and M. A. K. wrote the paper. All authors read and approved the final manuscript.
‡ Contributed equally to this work.
This journal is © The Royal Society of Chemistry 2009