Understanding the molecular information contained in principal component analysis of vibrational spectra of biological systems

F. Bonnier; H. J. Byrne

doi:10.1039/C1AN15821J

F. Bonnier*^a and H. J. Byrne^a

* Corresponding authors

^a Focas Research Institute, Dublin Institute of Technology, Kevin Street, Dublin 8, Ireland
E-mail: fbonnier@dit.ie
Fax: +353 1 4027904
Tel: +353 1 4027917

Abstract

K-means clustering followed by Principal Component Analysis (PCA) is employed to analyse Raman spectroscopic maps of single biological cells. K-means clustering successfully identifies regions of cellular cytoplasm, nucleus and nucleoli, but the mean spectra do not differentiate their biochemical composition. The loadings of the principal components identified by PCA shed further light on the spectral basis for differentiation, but they are complex and, as the number of spectra per cluster is imbalanced, particularly in the case of the nucleoli, the loadings under-represent the basis for differentiation of some cellular regions. Analysis of pure biomolecules, both structurally and spectrally distinct, in the case of histone, ceramide and RNA, and similarly in the case of the proteins albumin, collagen and histone, shows the relatively strong representation of spectrally sharp features in the spectral loadings, and the systematic variation of the loadings as one cluster becomes reduced in number. The more complex cellular environment is simulated by weighted sums of spectra, illustrating that, although the loadings become increasingly complex, their origin in a weighted sum of the constituent molecular components is still evident. Returning to the cellular analysis, the number of spectra per cluster is artificially balanced by increasing the weighting of the spectra of the clusters containing fewer spectra. While this renders the PCA loadings more complex for the three-way analysis, a pairwise analysis illustrates clear differences between the identified subcellular regions, and notably the molecular differences between nuclear and nucleoli regions are elucidated. Overall, the study demonstrates how appropriate consideration of the data available can improve the understanding of the information delivered by PCA.
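The effect of cluster imbalance on PCA described above can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration and is not taken from the article: the Gaussian band positions and widths, the noise level, the cluster sizes (200 versus 5 spectra), and the choice to balance the clusters by simply repeating the spectra of the smaller cluster. It compares the first principal component of an imbalanced and a balanced two-cluster data set with the known difference between the two noise-free spectra.

```python
import numpy as np

def band(x, centre, width):
    """Gaussian band used as a stand-in for a Raman feature (hypothetical)."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)

def pca_first_component(spectra):
    """Return the first PCA loading and its explained-variance fraction."""
    centred = spectra - spectra.mean(axis=0)          # mean-centre each variable
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return vt[0], explained[0]

rng = np.random.default_rng(0)
x = np.linspace(600, 1800, 300)                       # arbitrary wavenumber axis

# Two synthetic "pure" spectra sharing one broad band and differing in one
# sharp band each. Band positions are arbitrary choices for this sketch.
shared = band(x, 1660, 30)
pure_a = shared + band(x, 1003, 10)
pure_b = shared + band(x, 785, 10)

# Assumed cluster sizes and noise level: cluster B is heavily under-sampled.
n_a, n_b, noise = 200, 5, 0.05
cluster_a = pure_a + noise * rng.standard_normal((n_a, x.size))
cluster_b = pure_b + noise * rng.standard_normal((n_b, x.size))

imbalanced = np.vstack([cluster_a, cluster_b])
# Assumed balancing scheme: repeat each spectrum of the smaller cluster.
balanced = np.vstack([cluster_a, np.repeat(cluster_b, n_a // n_b, axis=0)])

# Known between-cluster difference, normalised for comparison with loadings.
true_diff = pure_a - pure_b
true_diff /= np.linalg.norm(true_diff)

results = {}
for name, data in (("imbalanced", imbalanced), ("balanced", balanced)):
    loading, evr = pca_first_component(data)
    # The sign of a PCA loading is arbitrary, so compare absolute cosines.
    cosine = abs(float(loading @ true_diff))
    results[name] = (cosine, float(evr))
    print(f"{name:>10}: |cos(PC1, difference)| = {cosine:.3f}, "
          f"PC1 explained variance = {evr:.3f}")
```

In this toy case the first loading follows the difference spectrum in both data sets, but the fraction of total variance it captures is much smaller when one cluster is under-sampled, which is one simple way to see why loadings can under-represent a small cluster. Real cellular spectra are weighted sums of many components, so the behaviour there is more complex than this two-component sketch suggests.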

Article information

https://doi.org/10.1039/C1AN15821J

Article type

Paper

Submitted

05 Sep 2011

Accepted

09 Nov 2011

First published

24 Nov 2011

Analyst, 2012, 137, 322-332


F. Bonnier and H. J. Byrne, Analyst, 2012, 137, 322 DOI: 10.1039/C1AN15821J
