Centre for Research and Innovation, Fondazione Edmund Mach, Via E. Mach 1, San Michele all'Adige (TN), Italy
E-mail: ron.wehrens@fmach.it; Fax: +39 0461 615200; Tel: +39 0461 615563
Mol. BioSyst., 2012, 8, 2339-2346
DOI: 10.1039/C2MB25121C
Received 29 Mar 2012, Accepted 07 Jun 2012
First published online 13 Jun 2012
Biomarker selection is an important topic in the omics sciences, where holistic measurement methods routinely generate results for many variables simultaneously. Very often, only a small fraction of these variables is truly associated with the phenomena of interest. Selecting and identifying these biomarkers is essential for understanding the complex biological processes under study. Finding biomarkers, however, is a difficult task. Even if a relative order can be established, e.g., on the basis of p values, it is usually hard to determine where to stop including candidates in the final set. Higher Criticism (HC) is an approach for finding data-dependent cutoff values when comparing two distinct groups of samples. Here, we extend its use to multivariate data, providing a principled compromise between not selecting too many variables and catching as many true positives as possible. The results show a marked improvement in biomarker selection compared to the standard settings available for some methods. Interestingly, HC thresholds can differ considerably from what has been suggested in the literature before, again showing that it is not possible to use the same cutoff value for all data sets. The data-specific cutoff values provided by HC also open the way to fairer comparisons between biomarker selection methods, not biased by unlucky or suboptimal threshold choices.
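To make the idea of a data-dependent cutoff concrete, the following is a minimal sketch of univariate HC thresholding on a list of p values. It assumes the commonly used form of the HC objective, HC(i) = sqrt(N) (i/N - p_(i)) / sqrt((i/N)(1 - i/N)), maximised over the smallest fraction alpha0 of the sorted p values. The abstract does not state the exact objective or search range, and the multivariate extension is not covered here. The function name `hc_threshold`, the default `alpha0 = 0.1`, and the example data are illustrative assumptions.

```python
import math


def hc_threshold(p_values, alpha0=0.1):
    """Return (cutoff, n_selected, hc_score) for a collection of p values.

    Assumed objective (not taken from the abstract):
        HC(i) = sqrt(N) * (i/N - p_(i)) / sqrt((i/N) * (1 - i/N))
    evaluated for the i smallest p values, i = 1 .. floor(alpha0 * N).
    """
    p = sorted(float(x) for x in p_values)
    n = len(p)
    if n < 2:
        raise ValueError("need at least two p values")

    # Search only the smallest alpha0 fraction; cap at n - 1 so that
    # i/n never reaches 1 and the denominator stays positive.
    k = min(max(1, int(alpha0 * n)), n - 1)

    best_i, best_hc = 1, -math.inf
    for i in range(1, k + 1):
        frac = i / n  # expected fraction under the null
        hc = math.sqrt(n) * (frac - p[i - 1]) / math.sqrt(frac * (1.0 - frac))
        if hc > best_hc:
            best_i, best_hc = i, hc

    # Variables with p value <= cutoff form the candidate biomarker set.
    return p[best_i - 1], best_i, best_hc


# Hypothetical example: 5 strong signals among 95 evenly spread null p values.
pvals = [1e-6 * (m + 1) for m in range(5)] + [(j + 1) / 96 for j in range(95)]
cutoff, n_selected, score = hc_threshold(pvals)
selected = [idx for idx, pv in enumerate(pvals) if pv <= cutoff]
```

The cutoff here is the p value at which the observed fraction of small p values deviates most from what a uniform null distribution would give, so it changes with every data set instead of being fixed in advance.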