Structure-Related Statistical Singularities along Protein Sequences: A Correlation Study
- Mauro Colafranceschi,
- Alfredo Colosimo,
- Joseph P. Zbilut,
- Vladimir N. Uversky, and
- Alessandro Giuliani
Abstract
A data set of 1141 proteins representative of all eukaryotic protein sequences in the Swiss-Prot Protein Knowledgebase was coded by seven physicochemical properties of amino acid residues. The resulting numerical profiles were submitted to correlation analysis after the application of a linear (simple mean) and a nonlinear (Recurrence Quantification Analysis, RQA) filter. The main RQA variables, Recurrence and Determinism, were subsequently analyzed by Principal Component Analysis. The RQA descriptors showed that (i) protein sequences embed specific information that is present neither in the codes nor in the amino acid composition and (ii) the most sensitive code for detecting ordered recurrent (deterministic) patterns of residues in protein sequences is the Miyazawa-Jernigan hydrophobicity scale. The most deterministic proteins in terms of autocorrelation properties of primary structures were found (i) to be involved in protein–protein and protein–DNA interactions and (ii) to display a significantly higher proportion of structural disorder than the data set average. A study of how the average determinism scales with the RQA setting parameters (embedding dimension and radius) allows the identification of patterns of minimal length (six residues) as possible markers of zones specifically prone to inter- and intramolecular interactions.
† University of Rome “La Sapienza”.
‡ Rush Medical College.
§ University of California and Institute for Biological Instrumentation of the Russian Academy of Sciences.
* Corresponding author phone: ++39 06 49902579; fax: ++39 06 49902355; e-mail: [email protected]
‖ Istituto Superiore di Sanità.
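The abstract names two RQA variables, Recurrence and Determinism, and two setting parameters, embedding dimension and radius. The following is a minimal sketch of how these quantities are commonly computed for a numerically coded sequence. It is not the authors' implementation. The function names, the `min_line` threshold, the Euclidean distance, and the `TOY_SCALE` values are assumptions made for illustration only; in particular, `TOY_SCALE` holds arbitrary numbers and is not the Miyazawa-Jernigan scale.

```python
from itertools import combinations
from math import sqrt

# Hypothetical, arbitrary residue values for illustration only.
# These are NOT Miyazawa-Jernigan hydrophobicity values.
TOY_SCALE = {"A": 1.0, "L": 2.0, "G": 0.0, "K": -1.0}


def encode(sequence, scale):
    """Turn a residue string into a numerical profile using a property scale."""
    return [scale[residue] for residue in sequence]


def embed(profile, dim):
    """Overlapping windows of `dim` consecutive values (lag-1 embedding)."""
    return [tuple(profile[i:i + dim]) for i in range(len(profile) - dim + 1)]


def recurrence_points(vectors, radius):
    """Pairs (i, j), i < j, whose embedded vectors lie within `radius`."""
    points = set()
    for i, j in combinations(range(len(vectors)), 2):
        dist = sqrt(sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j])))
        if dist <= radius:
            points.add((i, j))
    return points


def rqa(profile, dim, radius, min_line=2):
    """Return (percent recurrence, percent determinism) for a profile.

    Recurrence: share of vector pairs (upper triangle) that are recurrent.
    Determinism: share of recurrent points lying on diagonal lines of at
    least `min_line` consecutive points (min_line=2 is an assumed default).
    """
    vectors = embed(profile, dim)
    n = len(vectors)
    points = recurrence_points(vectors, radius)
    total_pairs = n * (n - 1) // 2
    rec = 100.0 * len(points) / total_pairs if total_pairs else 0.0

    on_lines = 0
    for i, j in points:
        if (i - 1, j - 1) in points:
            continue  # not the first point of its diagonal line
        length = 1
        while (i + length, j + length) in points:
            length += 1
        if length >= min_line:
            on_lines += length
    det = 100.0 * on_lines / len(points) if points else 0.0
    return rec, det


# Usage: a strictly alternating toy sequence gives long diagonal lines.
profile = encode("ALALALAL", TOY_SCALE)
rec, det = rqa(profile, dim=2, radius=0.0)
```

In this sketch, raising `dim` lengthens the residue window that must match and raising `radius` loosens what counts as a match, which is the kind of parameter scan the abstract refers to when it discusses the scaling of average determinism.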