Robust determination of differential abundance in shotgun proteomics using nonparametric statistics

Patrick Slama; Michael R. Hoopmann; Robert L. Moritz; Donald Geman

doi:10.1039/C8MO00077H

Patrick Slama,‡^ab Michael R. Hoopmann,‡^c Robert L. Moritz^c and Donald Geman*^ad

Author affiliations

* Corresponding authors

^a Center for Imaging Science, Institute for Computational Medicine, Johns Hopkins University, USA
E-mail: geman@jhu.edu

^b Independent Researcher, Paris, France

^c Institute for Systems Biology, 401 Terry Avenue N, Seattle, WA, USA

^d Department of Applied Mathematics and Statistics, Johns Hopkins University, 3400 N. Charles St., Baltimore, MD, USA

Abstract

Label-free shotgun mass spectrometry enables the detection of significant changes in protein abundance between different conditions. Because cohort sizes and replication are often limited, the large ratio of potential protein markers to samples, together with multiple null measurements, poses important technical challenges for conventional parametric models. From a statistical perspective, a scenario similar to that of label-free proteomics is encountered in genomics when looking for differentially expressed genes. Still, the difficulty of detecting a large fraction of the true positives without a high false discovery rate is arguably greater in proteomics, due to even smaller sample sizes and peptide-to-peptide variability in detectability. These constraints argue for nonparametric (distribution-free) tests on normalized peptide values, which minimize the number of free parameters, and for measuring significance with permutation testing. We propose such a procedure with a class-based statistic, no parametric assumptions, and no parameters to select other than a nominal false discovery rate. Our method was tested on a new dataset, available via ProteomeXchange with identifier PXD006447. The dataset was prepared using a standard proteolytic digest of a human protein mixture at 1.5-fold to 3-fold protein concentration changes, diluted into a constant background of yeast proteins. We demonstrate the method's superiority relative to other approaches in terms of the realized sensitivity and realized false discovery rates determined by ground truth, and recommend it for detecting differentially abundant proteins from MS data.
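To make the general idea of a distribution-free test with permutation-based significance concrete, the following is a minimal sketch and not the procedure proposed in the article. It assumes a plain rank-sum statistic in place of the authors' class-based statistic, enumerates all relabelings exactly (feasible only for small replicate counts), and uses the Benjamini-Hochberg step-up rule as a stand-in for controlling a nominal false discovery rate. All function names and input values are hypothetical.

```python
from itertools import combinations


def rank_sum(values, group_a_idx):
    """Sum of midranks held by group A within the pooled sample."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values so they share one average rank.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mid = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mid
        i = j + 1
    return sum(ranks[i] for i in group_a_idx)


def exact_permutation_pvalue(group_a, group_b):
    """Two-sided p-value from enumerating every relabeling of the samples."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2  # mean rank sum under the null
    observed = abs(rank_sum(pooled, range(n_a)) - expected)
    hits = total = 0
    for idx in combinations(range(len(pooled)), n_a):
        total += 1
        if abs(rank_sum(pooled, idx) - expected) >= observed - 1e-12:
            hits += 1
    return hits / total


def benjamini_hochberg(pvalues, fdr=0.05):
    """Indices of hypotheses rejected at the nominal FDR (BH step-up rule)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= fdr * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical normalized intensities for one protein, 3 replicates per condition.
p = exact_permutation_pvalue([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

With three replicates per condition there are only 20 possible relabelings, so the smallest two-sided p-value this sketch can return is 2/20 = 0.1, even for perfectly separated groups. This illustrates why small sample sizes are a central difficulty for any per-protein permutation test.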

Molecular Omics
