Large-Scale Predictions of Gram-Negative Bacterial Protein Subcellular Locations

Kuo-Chen Chou* and Hong-Bin Shen
Gordon Life Science Institute, 13784 Torrey Del Mar Drive, San Diego, California 92130, and Institute of Image Processing & Pattern Recognition, Shanghai Jiaotong University, 1954 Hua-Shan Road, Shanghai 200030, China
J. Proteome Res., 2006, 5 (12), pp 3420–3428
DOI: 10.1021/pr060404b
Publication Date (Web): October 28, 2006
Copyright © 2006 American Chemical Society
* To whom correspondence should be addressed. E-mail: kchou@san.rr.com.

Abstract


Many species of Gram-negative bacteria are pathogenic and can cause disease in a host organism. This pathogenic capability is usually associated with certain components in Gram-negative cells. Therefore, developing an automated method for fast and reliable prediction of Gram-negative protein subcellular location will allow us not only to annotate gene products in a timely manner, but also to screen candidates for drug discovery. However, protein subcellular location prediction is a very difficult problem, particularly when more location sites need to be covered and when query proteins do not have significant homology to proteins of known subcellular location. PSORT-B, a recently updated version of PSORT that is widely used for predicting Gram-negative protein subcellular location, covers only five location sites. Also, the data set used to train PSORT-B contains many proteins with high degrees of sequence identity within the same location group and, hence, may bear a strong homology bias. To overcome these problems, a new predictor, called “Gneg-PLoc”, is developed. It fuses many basic classifiers, each trained on a stringent data set in which proteins in the same location group have strictly less than 25% sequence identity to one another, and it covers eight subcellular locations: cytoplasm, extracellular space, fimbrium, flagellum, inner membrane, nucleoid, outer membrane, and periplasm. In comparison with PSORT-B, the new predictor not only covers more subcellular locations, but also yields remarkably higher success rates. Gneg-PLoc is available as a Web server at http://202.120.37.186/bioinf/Gneg. To support researchers working in the relevant areas, a downloadable file is provided at the same Web site listing the results identified by Gneg-PLoc for 49 907 Gram-negative protein entries in the Swiss-Prot database that have no subcellular location annotation or are annotated with uncertain terms. The large-scale results will be updated twice a year to cover new entries of Gram-negative bacterial proteins and to reflect new developments in Gneg-PLoc.
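
The abstract describes Gneg-PLoc as a fusion of many basic classifiers, and the keywords name the K-nearest neighbor rule. The general idea can be sketched as below, under loud assumptions: this is a generic majority-vote fusion of KNN classifiers that differ only in K, not the authors' implementation. The function names, the choice of K values, the Euclidean distance, and the two-dimensional toy features are all hypothetical; the actual predictor works on sequence-derived descriptors that this page does not specify.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs. Euclidean distance is
    an assumption made for this sketch.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def fused_predict(train, query, ks=(1, 3, 5)):
    """Fuse several KNN classifiers (one per value of k) by majority vote."""
    ballots = Counter(knn_predict(train, query, k) for k in ks)
    return ballots.most_common(1)[0][0]


# Hypothetical toy data: two well-separated location groups.
TOY_TRAIN = [
    ((0.0, 0.0), "cytoplasm"),
    ((0.1, 0.2), "cytoplasm"),
    ((0.2, 0.1), "cytoplasm"),
    ((1.0, 1.0), "periplasm"),
    ((0.9, 1.1), "periplasm"),
    ((1.1, 0.9), "periplasm"),
]

if __name__ == "__main__":
    print(fused_predict(TOY_TRAIN, (0.05, 0.05)))
    print(fused_predict(TOY_TRAIN, (1.0, 0.95)))
```

Each individual classifier casts one ballot, and the fused prediction is the label with the most ballots, so a single classifier misled by an unlucky K can be outvoted by the others.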

Keywords: Gram-negative • Subcellular compartment • Gene ontology • Amphiphilic pseudo amino acid composition • Fusion • K-nearest neighbor rule

History

  • Published In Issue December 01, 2006
  • Received August 10, 2006
