Proteogenomics: Proteomics for Genome Annotation

Ghali, Fawaz; Jones, Andrew R.

doi:10.1039/9781782626732-00365

New Developments in Mass Spectrometry

Proteome Informatics

DOI: https://doi.org/10.1039/9781782626732

Hardback ISBN: 978-1-78262-428-8

PDF ISBN: 978-1-78262-673-2

EPUB ISBN: 978-1-78262-957-3

Special Collection: 2016 ebook collection

Series: New Developments in Mass Spectrometry

No. of Pages: 412

Publication date: 15 Nov 2016

Book Chapter

Chapter 15: Proteogenomics: Proteomics for Genome Annotation

Fawaz Ghali

Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool L69 7ZB

andrew.jones@liverpool.ac.uk

School of Computing, Mathematics and Digital Technology, Manchester Metropolitan University, Chester Street, Manchester M1 5GD

Andrew R. Jones

Institute of Integrative Biology, University of Liverpool, Biosciences Building, Crown Street, Liverpool L69 7ZB

andrew.jones@liverpool.ac.uk

One of the major bottlenecks in omics biology is the generation of accurate gene models, including correct calling of the start codon, splicing of introns (taking account of alternative splicing), and the stop codon; this process is collectively called genome annotation. Current genome annotation approaches for newly sequenced genomes are generally based on automated or semi-automated methods. These usually involve gene-finding software that looks for intrinsic gene-like signatures (motifs) in the DNA sequence, the propagation of annotations from related, better-annotated species, and the mapping of experimental data sets, particularly from RNA sequencing (RNA-Seq). Large-scale proteomics data can also play an important role in confirming and correcting gene models. While proteomics approaches tend not to have the same level of sensitivity as RNA-Seq, they have the advantage that they can provide evidence that a predicted gene or transcript is indeed protein-coding. The use of proteomics data for genome annotation is called proteogenomics, and it forms the basis of this chapter. We describe the theoretical underpinnings, the different software packages that have been developed for proteogenomics, statistical approaches for validating the evidence, and support for proteogenomics data in file formats, standards and databases.
