Girish Mahajan (Editor)

Proteogenomics

Proteogenomics is an emerging field of biological research at the intersection of proteomics and genomics. While this intersection is large and can be defined in multiple ways, the term proteogenomics commonly refers to studies that use proteomic information, often derived from mass spectrometry, to improve gene annotations.

Practical applications

Proteogenomics has been applied to improve the gene annotations of various organisms. The term proteogenomics was first used in this context by a Harvard team in 2004, although research in this field had been building up over the previous decade. Since then, the approach has been extended to other species, including Arabidopsis thaliana, humans, chicken, and multiple species of Shewanella bacteria, among many others.

Besides improving gene annotations, proteogenomic studies can also provide valuable information about the presence of programmed frameshifts, N-terminal methionine excision, signal peptides, proteolysis and other posttranslational modifications.

Methodology

The main idea behind the proteogenomic approach is to identify peptides in a biological sample using mass spectrometry by searching the six-frame translation of the genome sequence, rather than a database of already-annotated protein sequences. This makes it possible to identify protein regions that are absent from, or incorrectly represented in, current gene annotations, and thus to improve those annotations.
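The six-frame search can be illustrated with a minimal Python sketch. It assumes the standard genetic code and a plain DNA string, and it replaces real spectrum matching with an exact substring search for a peptide that has already been identified. The function names (six_frame_translation, find_peptide) and the toy sequence are hypothetical and chosen only for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(genome):
    """Translate three reading frames on each strand of the genome."""
    genome = genome.upper()
    frames = {}
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def find_peptide(peptide, frames):
    """List (frame, amino-acid offset) for every exact match of the peptide."""
    hits = []
    for name, protein in frames.items():
        start = protein.find(peptide)
        while start != -1:
            hits.append((name, start))
            start = protein.find(peptide, start + 1)
    return hits


# Toy genome (an assumption for illustration, not real data).
frames = six_frame_translation("ATGGCCAAGTAA")
print(frames["+1"])                  # forward strand, first frame
print(find_peptide("MAK", frames))   # frames in which the peptide occurs
```

A peptide that matches one of the six frames but falls outside every annotated gene is a candidate for a missing or incorrect annotation. Practical pipelines score observed spectra against theoretical spectra of candidate peptides rather than matching strings, and must control for the much larger search space that six-frame translation creates.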

Comparative proteogenomics is a branch of proteogenomics that compares proteomic data from multiple related species concurrently and exploits the homology between their proteins to improve annotations with higher statistical confidence.
