Puneet Varma (Editor)

Proteome Analyst

Research center: University of Alberta
Release date: 2004
Laboratory: Dr. David Wishart
Description: For predicting protein subcellular localizations
Data types captured: Data input: Protein sequence in FASTA format. Data output: Localization predictions in tab-delimited format.

Proteome Analyst (PA) is a freely available web server and online toolkit for predicting protein subcellular localization, that is, where a protein resides in a cell. In proteomics, accurately predicting a protein's subcellular localization is an important step in the large-scale study of proteins. This computational problem is known as protein subcellular localization prediction. Over the last decade, more than a dozen web servers and computer programs have been developed to address it, and Proteome Analyst is one of the better-performing of these tools. It makes predictions for both prokaryotic and eukaryotic proteins using a text mining approach. Proteome Analyst was originally developed by the Proteome Analyst Research Group at the University of Alberta and was first released in March 2004. It was updated in January 2014.

Input/Output and Method

Users submit requests to the Proteome Analyst web server by selecting the organism type and then uploading a text file containing the protein sequences in FASTA format. Proteome Analyst first uses BLAST to search the UniProt database for similar proteins that carry subcellular localization annotations. It then applies a machine-learned classifier to the annotation text fields of the most similar proteins found in the UniProt search to make the final subcellular localization predictions. Users can view and download the results, or ask Proteome Analyst to explain its predictions.
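The input and output formats described above can be illustrated with a minimal Python sketch. It reads FASTA records and writes predictions as tab-delimited text. The function names, the column headers, and the example sequences are assumptions made for illustration only; the actual column layout of Proteome Analyst's output is not specified here, and the BLAST search and classification steps are not shown.

```python
import io


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; emit the previous one if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def to_tab_delimited(rows):
    """Format (protein_id, localization) pairs as tab-delimited text.

    The two column names are hypothetical, not Proteome Analyst's own.
    """
    lines = ["protein_id\tpredicted_localization"]
    for protein_id, localization in rows:
        lines.append(f"{protein_id}\t{localization}")
    return "\n".join(lines)


# Made-up example input: two short records, the first wrapped over two lines.
example = ">protA hypothetical protein\nMKT\nAIL\n>protB\nMSS\n"
records = list(parse_fasta(io.StringIO(example)))
```

In a real submission the FASTA text would come from the uploaded file, and the localization values would come from the classifier rather than being supplied by hand.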

Technology

Proteome Analyst consists of more than 30,000 lines of Java code and can be deployed on a computer cluster, where it uses multiple CPUs to speed up processing. The initial release used a Naïve Bayes classifier to make its predictions; the current version uses Support Vector Machine classifiers. Proteome Analyst currently supports subcellular localization predictions for five organism types: three eukaryotic (animal, plant, and fungi) and two prokaryotic (gram-positive and gram-negative bacteria).
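To make the idea of classifying annotation text concrete, here is a minimal Python sketch of a multinomial Naïve Bayes classifier over keyword tokens, with add-one smoothing. It is not Proteome Analyst's implementation (which is written in Java and now uses Support Vector Machines). The class name, the tiny training set, and the localization labels are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()             # documents seen per label
        self.token_counts = defaultdict(Counter)  # token frequencies per label
        self.vocab = set()

    def fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            n_tokens = sum(self.token_counts[label].values())
            for tok in tokens:
                if tok in self.vocab:  # ignore tokens never seen in training
                    count = self.token_counts[label][tok]
                    score += math.log((count + 1) / (n_tokens + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: annotation keywords paired with a localization label.
train_docs = [
    ["ribosomal", "translation"],
    ["glycolysis", "kinase"],
    ["transmembrane", "transport"],
    ["transmembrane", "receptor"],
]
train_labels = ["cytoplasm", "cytoplasm", "membrane", "membrane"]

model = NaiveBayesText().fit(train_docs, train_labels)
prediction = model.predict(["transmembrane", "kinase"])
```

Here the tokens stand in for keywords drawn from the annotation fields of a query protein's closest database matches. A production system would train on far more annotated proteins and many more localization classes.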
