Structure-Related Statistical Singularities along Protein Sequences:  A Correlation Study

Mauro Colafranceschi, Alfredo Colosimo, Joseph P. Zbilut, Vladimir N. Uversky,§ and Alessandro Giuliani*
Department of Human Physiology and Pharmacology - University of Rome La Sapienza, P.le A. Moro, 5-00185 Rome, Italy, Department of Molecular Biophysics and Physiology, Rush Medical College, Chicago, Illinois 60612, Department of Chemistry and Biochemistry, University of California, Santa Cruz, California 95064, Institute for Biological Instrumentation of the Russian Academy of Sciences, Pushchino, Moscow Region, 142290 Russia, and Environment and Health Department, Istituto Superiore di Sanità, Viale Regina Elena, 299-00161 Rome, Italy
J. Chem. Inf. Model., 2005, 45 (1), pp 183–189
DOI: 10.1021/ci049838m
Publication Date (Web): November 24, 2004
Copyright © 2005 American Chemical Society

University of Rome “La Sapienza”.

Rush Medical College.

§ University of California and Institute for Biological Instrumentation of the Russian Academy of Sciences.

* Corresponding author phone: ++39 06 49902579; fax: ++39 06 49902355; e-mail: alessandro.giuliani@iss.it.

Istituto Superiore di Sanità.

Abstract

A data set composed of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledgebase was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) protein sequences embed specific information that is present neither in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein−protein and protein−DNA interactions and (ii) to display a significantly higher proportion of structural disorder than the data set average. A study of how the average determinism scales with the setting parameters of RQA (embedding dimension and radius) allows for the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.
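
As an illustration of the two RQA variables named in the abstract, the following is a minimal sketch of how Recurrence and Determinism can be computed for a numerically coded sequence. It is not the authors' implementation: the unit delay, the Euclidean distance, the minimum diagonal line length of 2, and the names `TOY_SCALE`, `code_sequence`, `embed`, and `rqa` are all assumptions made for this example, and the two-letter scale holds made-up values rather than any published hydrophobicity scale.

```python
import math

# Hypothetical two-residue scale with made-up values (NOT a published scale).
TOY_SCALE = {"A": 0.0, "L": 1.0}

def code_sequence(sequence, scale):
    """Replace each residue letter by its numerical value on the scale."""
    return [scale[residue] for residue in sequence]

def embed(series, dim, delay=1):
    """Build delay-embedded vectors of length `dim` from a 1-D series."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]

def rqa(series, dim=3, radius=0.5, min_line=2):
    """Return (recurrence %, determinism %) for a numerical series.

    Recurrence: share of vector pairs (i < j) closer than `radius`.
    Determinism: share of recurrent pairs lying on diagonal lines
    of at least `min_line` consecutive recurrent points.
    """
    vectors = embed(series, dim)
    n = len(vectors)
    recurrent = [[False] * n for _ in range(n)]
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(vectors[i], vectors[j]) <= radius:
                recurrent[i][j] = True
                total += 1

    # Scan each diagonal above the main one and sum the lengths of
    # runs of recurrent points that reach the minimum line length.
    in_lines = 0
    for offset in range(1, n):
        run = 0
        for i in range(n - offset):
            if recurrent[i][i + offset]:
                run += 1
            else:
                if run >= min_line:
                    in_lines += run
                run = 0
        if run >= min_line:
            in_lines += run

    pairs = n * (n - 1) // 2
    recurrence = 100.0 * total / pairs if pairs else 0.0
    determinism = 100.0 * in_lines / total if total else 0.0
    return recurrence, determinism

profile = code_sequence("ALALALAL", TOY_SCALE)
rec, det = rqa(profile, dim=2, radius=0.0)
print(f"recurrence = {rec:.2f}%, determinism = {det:.2f}%")
```

In this sketch, raising `dim` or lowering `radius` makes the recurrence criterion stricter, which is the kind of parameter dependence the scaling study in the abstract refers to.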

History

  • Published In Issue January 24, 2005
  • Received May 18, 2004
