Reconstructing the Remote Origins of a Fold Singleton from a Flavodoxin-Like Ancestor
- Saacnicteh Toledo-Patiño (Department of Biochemistry, University of Bayreuth, 95447 Bayreuth, Germany; Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany)
- Manish Chaubey (Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany)
- Murray Coles (Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany)
- Birte Höcker* (Department of Biochemistry, University of Bayreuth, 95447 Bayreuth, Germany; Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany). *E-mail: [email protected]
Abstract
Evolutionary processes that led to the emergence of structured protein domains left footprints in the sequences of modern proteins. We searched for such footprints using state-of-the-art sequence analysis and found evidence that the HemD-like fold emerged from the flavodoxin-like fold through segment swap and gene duplication. To test this hypothesis, we reversed these evolutionary steps experimentally, constructing a HemD-half that adopts the canonical flavodoxin-like architecture. This reconstruction of one fold from the sequence of a different fold strongly supports our hypothesis of common ancestry. It further illustrates the plasticity with which modern proteins can form new folded proteins.
Sequence and Structure Comparisons
Experimental Reconstruction
Implications for Protein Fold Evolution
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.biochem.9b00900.
Experimental methods and solution structure statistics (PDF)
The UniProt ID of hemD is P48246, and the NCBI accession is 4ES6_A. The PDB ID of the NMR structure is 6TH8, and the BMRB ID is 34452.
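For readers who want to retrieve the deposited data, the accessions above can be turned into download URLs. The following is a minimal sketch, not part of the article: the helper names are illustrative, and the UniProt REST and RCSB file-server URL patterns are assumptions about those public services that may change over time. The code only builds the URLs and performs no network access.

```python
"""Build download URLs for the accessions listed above (no network access)."""

UNIPROT_ID = "P48246"  # hemD, as given in the text
PDB_ID = "6TH8"        # NMR structure, as given in the text


def uniprot_fasta_url(accession: str) -> str:
    """Return the assumed UniProt REST URL for a FASTA record."""
    return f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def rcsb_pdb_url(pdb_id: str) -> str:
    """Return the assumed RCSB file-server URL for a PDB-format file."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


if __name__ == "__main__":
    print(uniprot_fasta_url(UNIPROT_ID))
    print(rcsb_pdb_url(PDB_ID))
```

The resulting URLs can be passed to any HTTP client or download tool; keeping URL construction separate from fetching makes the accessions easy to check before any request is made.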
Acknowledgments
We thank Prof. Gunter Schneider and Dr. Robert Schnell for kindly providing us with the U3Spa construct and Sooruban Shanmugaratnam for excellent technical assistance.
References
This article references 18 other publications.
- 1Söding, J. and Lupas, A. N. (2003) More than the Sum of Their Parts: On the Evolution of Proteins from Peptides. BioEssays 25 (9), 837– 846, DOI: 10.1002/bies.10321Google Scholar1https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD3szpslOltw%253D%253D&md5=ef9e186bca6e3e8484c2d6608ec9885fMore than the sum of their parts: on the evolution of proteins from peptidesSoding Johannes; Lupas Andrei NBioEssays : news and reviews in molecular, cellular and developmental biology (2003), 25 (9), 837-46 ISSN:0265-9247.Despite their seemingly endless diversity, proteins adopt a limited number of structural forms. It has been estimated that 80% of proteins will be found to adopt one of only about 400 folds, most of which are already known. These folds are largely formed by a limited 'vocabulary' of recurring supersecondary structure elements, often by repetition of the same element and, increasingly, elements similar in both structure and sequence are discovered. This suggests that modern proteins evolved by fusion and recombination from a more ancient peptide world and that many of the core folds observed today may contain homologous building blocks. The peptides forming these building blocks would not in themselves have had the ability to fold, but would have emerged as cofactors supporting RNA-based replication and catalysis (the 'RNA world'). Their association into larger structures and eventual fusion into polypeptide chains would have allowed them to become independent of their RNA scaffold, leading to the evolution of a novel type of macromolecule: the folded protein.
- 2Murzin, A.G. P (1998) How Far Divergent Evolution Goes in Proteins. Curr. Opin. Struct. Biol. 8 (3), 380– 387, DOI: 10.1016/S0959-440X(98)80073-0Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK1cXks1egu7s%253D&md5=6fd71121962d77c8c2cb907424b83655How far divergent evolution goes in proteinsMurzin, Alexey G.Current Opinion in Structural Biology (1998), 8 (3), 380-387CODEN: COSBEF; ISSN:0959-440X. (Current Biology Ltd.)A review with 32 refs. In theory, mutations of protein sequences may eventually generate different functions as well as different structures. The observation of such records of protein evolution have been obscured by the dissipation of memory about the ancestors. In the past year, new advances in our understanding of divergent evolution were allowed by new protein structure detns., including the ClpP protease, steroid Δ-isomerase, carboxypeptidase G2, the thrombin inhibitor triabin and the chloroplast Rieske protein. There is strong evidence for their distant homol. with proteins of known structure despite significant functional or structural differences.
- 3Bork, P., Sander, C., and Valencia, A. (1993) Convergent Evolution of Similar Enzymatic Function on Different Protein Folds: The Hexokinase, Ribokinase, and Galactokinase Families of Sugar Kinases. Protein Sci. 2 (1), 31– 40, DOI: 10.1002/pro.5560020104Google Scholar3https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK3sXhvVKrsLY%253D&md5=29e6355f725f8bf8db0eea3f04821738Convergent evolution of similar enzymic function on different protein folds: The hexokinase, ribokinase, and galactokinase families of sugar kinasesBork, Peer; Sander, Chris; Valencia, AlfonsoProtein Science (1993), 2 (1), 31-40CODEN: PRCIEI; ISSN:0961-8368.Sugar kinases, that catalyze phosphorylation of sugars, can be divided into at least 3 distinct nonhomologous families. The 1st is the hexokinase family, which contains many prokaryotic and eukaryotic sugar kinases with diverse specificities, including a new member, rhamnokinase from Salmonella typhimurium. The 3-dimensional structure of hexokinase is known and can be used to build models of functionally important regions of other kinases in this family. The 2nd is the ribokinase family, of unknown 3-dimensional structure, and comprises pro- and eukaryotic ribokinases, bacterial fructokinases, the minor 6-phosphofructokinase 2 from Escherichia coli, 6-phosphotagatokinase, 1-phosphofructokinase, and, possibly, inosine-guanosine kinase. The 3rd family, also of unknown 3-dimensional structure, contains several bacterial and yeast galactokinases and eukaryotic mevalonate and phosphomevalonate kinases and may have a substrate binding region in common with homoserine kinases. Each of the 3 families of sugar kinases appears to have a distinct 3-dimensional fold, since conserved sequence patterns are strikingly different for the 3 families. However, each catalyzes chem. equiv. reactions on similar or identical substrates. 
The enzymic function of sugar phosphorylation appears to have evolved independently on the 3 distinct structural frameworks, by convergent evolution. In addn., evolutionary trees reveal that (1) fructokinase specificity has evolved independently in both the hexokinase and ribokinase families and (2) glucose specificity has evolved independently in different branches of the hexokinase family. These are examples of independent Darwinian adaptation of a structure to the same substrate at different evolutionary times. The flexible combination of active sites and 3-dimensional folds obsd. in nature can be exploited by protein engineers in designing and optimizing enzymic function.
- 4Söding, J., Biegert, A., and Lupas, A. N. (2005) The HHpred Interactive Server for Protein Homology Detection and Structure Prediction. Nucleic Acids Res. 33, W244– 248, DOI: 10.1093/nar/gki408Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD2MzjtVSntg%253D%253D&md5=58a35ae4ec7df2e6b8e6fcbb6fe146d5The HHpred interactive server for protein homology detection and structure predictionSoding Johannes; Biegert Andreas; Lupas Andrei NNucleic acids research (2005), 33 (Web Server issue), W244-8 ISSN:.HHpred is a fast server for remote protein homology detection and structure prediction and is the first to implement pairwise comparison of profile hidden Markov models (HMMs). It allows to search a wide choice of databases, such as the PDB, SCOP, Pfam, SMART, COGs and CDD. It accepts a single query sequence or a multiple alignment as input. Within only a few minutes it returns the search results in a user-friendly format similar to that of PSI-BLAST. Search options include local or global alignment and scoring secondary structure similarity. HHpred can produce pairwise query-template alignments, multiple alignments of the query with a set of templates selected from the search results, as well as 3D structural models that are calculated by the MODELLER software from these alignments. A detailed help facility is available. As a demonstration, we analyze the sequence of SpoVT, a transcriptional regulator from Bacillus subtilis. HHpred can be accessed at http://protevo.eb.tuebingen.mpg.de/hhpred.
- 5Farias-Rico, J. A., Schmidt, S., and Höcker, B. (2014) Evolutionary Relationship of two ancient protein superfolds. Nat. Chem. Biol. 10 (9), 710– 715, DOI: 10.1038/nchembio.1579Google Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhtFygs7nN&md5=2be43499fad84afa2d2d4eee88131b10Evolutionary relationship of two ancient protein superfoldsFarias-Rico, Jose Arcadio; Schmidt, Steffen; Hoecker, BirteNature Chemical Biology (2014), 10 (9), 710-715CODEN: NCBABT; ISSN:1552-4450. (Nature Publishing Group)Proteins are the mol. machines of the cell that fold into specific three-dimensional structures to fulfill their functions. To improve our understanding of how the structure and function of proteins arises, it is crucial to understand how evolution has generated the structural diversity we observe today. Classically, proteins that adopt different folds are considered to be nonhomologous. However, using state-of-the-art tools for homol. detection, we found evidence of homol. between proteins of two ancient and highly populated protein folds, the (βα)8-barrel and the flavodoxin-like fold. We detected a family of sequences that show intermediate features between both folds and detd. what is to our knowledge the first representative crystal structure of one of its members, giving new insights into the evolutionary link of two of the earliest folds. Our findings contribute to an emergent vision where protein superfolds share common ancestry and encourage further approaches to complete the mapping of structure space onto sequence space.
- 6Fortian, A., Castano, D., Ortega, G., Lain, A., Pons, M., and Millet, O. (2009) Uroporhyrinogen III Synthase Mutations Related to Congenital Erythropoietic Porphyria Indentify Key Helix for Protein Stability. Biochemistry 48 (2), 454– 461, DOI: 10.1021/bi801731qGoogle Scholar6https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXhsFanurzK&md5=964c53f4854b4ce6ce1d3ebdd39e7382Uroporphyrinogen III Synthase Mutations Related to Congenital Erythropoietic Porphyria Identify a Key Helix for Protein StabilityFortian, Arola; Castano, David; Ortega, Gabriel; Lain, Ana; Pons, Miquel; Millet, OscarBiochemistry (2009), 48 (2), 454-461CODEN: BICHAW; ISSN:0006-2960. (American Chemical Society)In the present study we have investigated deleterious mutants in the uroporphyrinogen III synthase (UROIIIS) that are related to the congenital erythropoietic porphyria (CEP). The 25 missense mutants found in CEP patients have been cloned, expressed, and purified. Their enzymic activities have been measured relative to wild-type UROIIIS activity. All mutants retain measurable activity, consistent with the recessive character of the disease. Most of the mutants with a significant decrease in activity involve residues likely assocd. in binding. However, other mutants are fully active, indicating that different mechanisms may contribute to enzyme misfunction. UROIIIS is a thermolabile enzyme undergoing irreversible denaturation. The unfolding kinetics of wild-type UROIIIS and the suite of mutants have been monitored by CD. This anal. allowed the identification of a helical region in the mol., essential to retain the kinetic stability of the folded conformation. C73R is found in one-third of CEP patients, and Cys73 is part of this helix. The integrated anal. of the enzymic activity and kinetic stability data is used to gain insight in the relationship between defects in UROIIIS sequence and CEP.
- 7Szilagyi, A., Györffy, D., and Zavodszky, P. (2017) Segment Swapping Aided the Evolution of Enzyme Function: The Case of Uroporphyrinogen III Synthase. Proteins: Struct., Funct., Genet. 85 (1), 46– 53, DOI: 10.1002/prot.25190Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhslyjtLnM&md5=f1a0d549c40c051186aa9bcdc741c070Segment swapping aided the evolution of enzyme function: The case of uroporphyrinogen III synthaseSzilagyi, Andras; Gyoerffy, Daniel; Zavodszky, PeterProteins: Structure, Function, and Bioinformatics (2017), 85 (1), 46-53CODEN: PSFBAF; ISSN:1097-0134. (Wiley-Blackwell)In an earlier study, we showed that two-domain segment-swapped proteins can evolve by domain swapping and fusion, resulting in a protein with two linkers connecting its domains. We proposed that a potential evolutionary advantage of this topol. may be the restriction of interdomain motions, which may facilitate domain closure by a hinge-like movement, crucial for the function of many enzymes. Here, we test this hypothesis computationally on uroporphyrinogen III synthase (U3S), a two-domain segment-swapped enzyme essential in porphyrin metab. To compare the interdomain flexibility between the wild-type, segment-swapped enzyme (having two interdomain linkers) and circular permutants of the same enzyme having only one interdomain linker, we performed geometric and mol. dynamics simulations for these species in their ligand-free and ligand-bound forms. We find that in the ligand-free form, interdomain motions in the wild-type enzyme are significantly more restricted than they would be with only one interdomain linker, while the flexibility difference is negligible in the ligand-bound form. We also estd. the entropy costs of ligand binding assocd. with the interdomain motions, and find that the change in domain connectivity due to segment swapping results in a redn. 
of this entropy cost, corresponding to ∼20% of the total ligand binding free energy. In addn., the restriction of interdomain motions may also help the functional domain-closure motion required for catalysis. This suggests that the evolution of the segment-swapped topol. facilitated the evolution of enzyme function for this protein by influencing its dynamic properties. Proteins 2016. © 2016 Wiley Periodicals, Inc.
- 8Alva, V., Nam, S. Z., Söding, J., and Lupas, A. N. (2016) The MPI Bioinformatics Toolkit as an Integrative Platform for Advanced Protein Sequence and Structure Analysis. Nucleic Acids Res. 44 (W1), W410– W415, DOI: 10.1093/nar/gkw348Google Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2it77E&md5=05641d8bca799244ae60ef06e5e3d039The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysisAlva, Vikram; Nam, Seung-Zin; Soeding, Johannes; Lupas, Andrei N.Nucleic Acids Research (2016), 44 (W1), W410-W415CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The MPI Bioinformatics Toolkit (http://toolkit. tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic anal. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current prodn.-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400,000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for exptl. scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.
- 9Sander, C. and Schneider, R. (1991) Database of Homology-Derived protein structures and the Structural Meaning of Sequence Alignment. Proteins: Struct., Funct., Genet. 9 (1), 56– 68, DOI: 10.1002/prot.340090107Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK3MXhtVSlsrw%253D&md5=e8c5c9faa7243a26e38113308cad1ac2Database of homology-derived protein structures and the structural meaning of sequence alignmentSander, Chris; Schneider, ReinhardProteins: Structure, Function, and Genetics (1991), 9 (1), 56-68CODEN: PSFGEY; ISSN:0887-3585.The database of known protein three-dimensional structures can be significantly increased by the use of sequence homol. The database of known sequences, currently at >12,000 proteins, is two orders of magnitude larger than the database of known structures. The currently most powerful method of predicting protein structures is model building by homol. Structural homol. can be inferred from the level of sequence similarity. The threshold of sequence similarity sufficient for structural homol. depends strongly on the length of the alignment. The relation is quantified between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homol. threshold curve as a function of alignment length. A database of homol.-derived secondary structure of proteins (HSSP) is produced by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the no. of known protein structures by a factor of 5 to >1800. 
The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homol.
- 10Houwman, J. A. and van Mierlo, C. P. M. (2017) Folding of proteins with a flavodoxin-like architecture. FEBS J. 284, 3145– 3167, DOI: 10.1111/febs.14077Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXmvVCgsLw%253D&md5=1d70e4310189d8a2b8fd979101f005acFolding of proteins with a flavodoxin-like architectureHouwman, Joseline A.; van Mierlo, Carlo P. M.FEBS Journal (2017), 284 (19), 3145-3167CODEN: FJEOAC; ISSN:1742-464X. (Wiley-Blackwell)A review. The flavodoxin-like fold is a protein architecture that can be traced back to the universal ancestor of the three kingdoms of life. Many proteins share this α-β parallel topol. and hence it is highly relevant to illuminate how they fold. Here, we review expts. and simulations concerning the folding of flavodoxins and CheY-like proteins, which share the flavodoxin-like fold. These polypeptides tend to temporarily misfold during unassisted folding to their functionally active forms. This susceptibility to frustration is caused by the more rapid formation of an α-helix compared to a β-sheet, particularly when a parallel β-sheet is involved. As a result, flavodoxin-like proteins form intermediates that are off-pathway to native protein and several of these species are molten globules (MGs). Expts. suggest that the off-pathway species are of helical nature and that flavodoxin-like proteins have a nonconserved transition state that dets. the rate of productive folding. The folding of flavodoxin from Azotobacter vinelandii has been investigated extensively, enabling a schematic construction of its folding energy landscape. It is the only flavodoxin-like protein of which cotranslational folding has been probed. New insights that emphasize differences between in vivo and in vitro folding energy landscapes are emerging: the ribosome modulates MG formation in nascent apoflavodoxin and forces this polypeptide toward the native state.
- 11Holm, L. and Laakso, L. M. (2016) DALI Server Update. Nucleic Acids Res. 44 (W1), W351– W355, DOI: 10.1093/nar/gkw357Google Scholar11https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2itrrO&md5=cd2b5fabf6a01a4b35f350342ae7ddc2Dali server updateHolm, Liisa; Laakso, Laura M.Nucleic Acids Research (2016), 44 (W1), W351-W355CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The Dali server is a network service for comparing protein structures in 3D. In favorable cases, comparing 3D structures may reveal biol. interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo.
- 12Dawson, N. L., Lewis, T. E., Das, S., Lees, J. G., Lee, D., Ashford, P., Orengo, C. A., and Sillitoe, I. (2017) CATH: an Expanded Resource to Predict Protein Function through Structure and Sequence. Nucleic Acids Res. 45 (D1), D289– D295, DOI: 10.1093/nar/gkw1098Google Scholar12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhslWhu7k%253D&md5=0f0181b39280024e816e8baa299859c1CATH: an expanded resource to predict protein function through structure and sequenceDawson, Natalie L.; Lewis, Tony E.; Das, Sayoni; Lees, Jonathan G.; Lee, David; Ashford, Paul; Orengo, Christine A.; Sillitoe, IanNucleic Acids Research (2017), 45 (D1), D289-D295CODEN: NARHAD; ISSN:1362-4962. (Oxford University Press)The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the no. of predicted protein domains in the previous version. The daily-updated CATH-B, which contains our very latest domain assignment data, provides putative classifications for over 100 000 addnl. protein domains. This article describes developments to the CATH-Gene3D resource over the last two years since the publication in 2015, including: significant increases to our structural and sequence coverage; expansion of the functional families in CATH; building a support vector machine (SVM) to automatically assign domains to superfamilies; improved search facilities to return alignments of query sequences against multiple sequence alignments; the redesign of the web pages and download site.
- 13Chandonia, J. M., Fox, N. K., and Brenner, S. E. (2017) SCOPe:Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database. J. Mol. Biol. 429 (3), 348– 355, DOI: 10.1016/j.jmb.2016.11.023Google Scholar13https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XitVSmsLrF&md5=4b1574a02f8c1895fc860ab576277f2aSCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended DatabaseChandonia, John-Marc; Fox, Naomi K.; Brenner, Steven E.Journal of Molecular Biology (2017), 429 (3), 348-355CODEN: JMOBAK; ISSN:0022-2836. (Elsevier Ltd.)SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the anal. of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now sepd. into their own class, in order to distinguish them from the homol.-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP.
- 14Szilagyi, A., Zhang, Y., and Zavodszky, P. (2012) Intra-Chain 3D Segment Swapping Spawns the Evolution of New Multidomain Protein Architectures. J. Mol. Biol. 415 (1), 221– 235, DOI: 10.1016/j.jmb.2011.10.045Google Scholar14https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XhtFWmtg%253D%253D&md5=d1c81e8cc06fd375a9c26a721adf73d1Intra-chain 3D segment swapping spawns the evolution of new multidomain protein architecturesSzilagyi, Andras; Zhang, Yang; Zavodszky, PeterJournal of Molecular Biology (2012), 415 (1), 221-235CODEN: JMOBAK; ISSN:0022-2836. (Elsevier Ltd.)Multidomain proteins form in evolution through the concatenation of domains, but structural domains may comprise multiple segments of the chain. Here, the authors demonstrate that new multidomain architectures can evolve by an apparent 3-dimensional swap of segments between structurally similar domains within a single-chain monomer. By a comprehensive structural search of the current Protein Data Bank (PDB), the authors identified 32 well-defined segment-swapped proteins (SSPs) belonging to 18 structural families. Nearly 13% of all multidomain proteins in the PDB may have a segment-swapped evolutionary precursor as estd. by more permissive searching criteria. The formation of SSPs could be explained by 2 principal evolutionary mechanisms: (1) domain swapping and fusion (DSF) and (2) circular permutation (CP). By large-scale comparative analyses using structural alignment and hidden Markov model methods, it was found that the majority of SSPs evolved via the DSF mechanism, and a much smaller fraction, via CP. Functional analyses further revealed that segment swapping, which resulted in 2 linkers connecting the domains, may impart directed flexibility to multidomain proteins and contributes to the development of new functions. 
Thus, inter-domain segment swapping represents a novel general mechanism by which new protein folds and multidomain architectures arise in evolution, and SSPs have structural and functional properties that make them worth defining as a sep. group.
- 15Cameron, A. D., Olin, B., Riderström, M., Mannervik, B., and Jones, T. A. (1997) Crystal Structure of Human Glyoxalase I - Evidence for Gene Duplication and 3D Domain Swapping. EMBO. J. 16, 3386– 3395, DOI: 10.1093/emboj/16.12.3386Google Scholar15https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2sXktlCjtr4%253D&md5=e592d5acfade0552011f8174fae6cb32Crystal structure of human glyoxalase I-evidence for gene duplication and 3D domain swappingCameron, Alexander D.; Olin, Birgit; Ridderstrom, Marianne; Mannervik, Bengt; Jones, T. AlwynEMBO Journal (1997), 16 (12), 3386-3395CODEN: EMJODG; ISSN:0261-4189. (Oxford University Press)The zinc metalloenzyme glyoxalase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal. The structure of the dimeric human enzyme in complex with S-benzyl-glutathione has been detd. by multiple isomorphous replacement (MIR) and refined at 2.2 Å resoln. Each monomer consists of two domains. Despite only low sequence homol. between them, these domains are structurally equiv. and appear to have arisen by a gene duplication. There is no structural homol. to the 'glutathione binding domain' found in other glutathione-linked proteins. 3D domain swapping of the N- and C-terminal domains has resulted in the active site being situated in the dimer interface, with the inhibitor and essential zinc ion interacting with side chains from both subunits. Two structurally equiv. residues from each domain contribute to a square pyramidal coordination of the zinc ion, rarely seen in zinc enzymes. Comparison of glyoxalase I with other known structures shows the enzyme to belong to a new structural family which includes the Fe2+-dependent dihydroxybiphenyl dioxygenase and the bleomycin resistance protein. This structural family appears to allow members to form with or without domain swapping.
- 16Caetano-Anolles, G., Kim, H. S., and Mittenthal, J. E. (2007) The Origin of Modern Metabolic Networks Inferred from Phylogenomic Analysis of Protein Architecture. Proc. Natl. Acad. Sci. U. S. A. 104 (22), 9358– 9363, DOI: 10.1073/pnas.0701214104Google Scholar16https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2sXmtlehtLc%253D&md5=199d3b6cab5e41e29696acfb93bf2450The origin of modern metabolic networks inferred from phylogenomic analysis of protein architectureCaetano-Anolles, Gustavo; Kim, Hee Shin; Mittenthal, Jay E.Proceedings of the National Academy of Sciences of the United States of America (2007), 104 (22), 9358-9363CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)Metab. represents a complex collection of enzymic reactions and transport processes that convert metabolites into mols. capable of supporting cellular life. Here we explore the origins and evolution of modern metab. Using phylogenomic information linked to the structure of metabolic enzymes, we sort out recruitment processes and discover that most enzymic activities were assocd. with the nine most ancient and widely distributed protein fold architectures. An anal. of newly discovered functions showed enzymic diversification occurred early, during the onset of the modern protein world. Most importantly, phylogenetic reconstruction exercises and other evidence suggest strongly that metab. originated in enzymes with the P-loop hydrolase fold in nucleotide metab., probably in pathways linked to the purine metabolic subnetwork. Consequently, the first enzymic takeover of an ancient biochem. or prebiotic chem. was related to the synthesis of nucleotides for the RNA world.
- 17. Höcker, B., Beismann-Driemeyer, S., Hettwer, S., Lustig, A., and Sterner, R. (2001) Dissection of a (βα)8-barrel enzyme into two folded halves. Nat. Struct. Biol. 8 (1), 32–36, DOI: 10.1038/83021
- 18. Höcker, B., Claren, J., and Sterner, R. (2004) Mimicking enzyme evolution by generating new (βα)8-barrels from (βα)4-half-barrels. Proc. Natl. Acad. Sci. U. S. A. 101 (47), 16448–16453, DOI: 10.1073/pnas.0405832101
Cited By
This article is cited by 8 publications.
- Saacnicteh Toledo‐Patiño, Sara Kathrin Goetz, Sooruban Shanmugaratnam, Birte Höcker, José Arcadio Farías‐Rico. Molecular handcraft of a well‐folded protein chimera. FEBS Letters 2024, 598 (11), 1375-1386. https://doi.org/10.1002/1873-3468.14856
- Koya Sakuma, Ryotaro Koike, Motonori Ota. Dual‐wield NTPases: A novel protein family mined from AlphaFold DB. Protein Science 2024, 33 (4). https://doi.org/10.1002/pro.4934
- Léon Schierholz, Charlotte R Brown, Karla Helena-Bueno, Vladimir N Uversky, Robert P Hirt, Jonas Barandun, Sergey V Melnikov. A Conserved Ribosomal Protein Has Entirely Dissimilar Structures in Different Organisms. Molecular Biology and Evolution 2024, 41 (1). https://doi.org/10.1093/molbev/msad254
- Florian Michel, Sergio Romero‐Romero, Birte Höcker. Retracing the evolution of a modern periplasmic binding protein. Protein Science 2023, 32 (11). https://doi.org/10.1002/pro.4793
- Florian Michel, Sooruban Shanmugaratnam, Sergio Romero-Romero, Birte Höcker. Structures of permuted halves of a modern ribose-binding protein. Acta Crystallographica Section D Structural Biology 2023, 79 (1), 40-49. https://doi.org/10.1107/S205979832201186X
- Vijay Jayaraman, Saacnicteh Toledo‐Patiño, Lianet Noda‐García, Paola Laurino. Mechanisms of protein evolution. Protein Science 2022, 31 (7). https://doi.org/10.1002/pro.4362
- Noelia Ferruz, Florian Michel, Francisco Lobos, Steffen Schmidt, Birte Höcker. Fuzzle 2.0: Ligand Binding in Natural Protein Building Blocks. Frontiers in Molecular Biosciences 2021, 8. https://doi.org/10.3389/fmolb.2021.715972
- Sergio Romero-Romero, Sina Kordes, Florian Michel, Birte Höcker. Evolution, folding, and design of TIM barrels and related proteins. Current Opinion in Structural Biology 2021, 68, 94-104. https://doi.org/10.1016/j.sbi.2020.12.007
References
This article references 18 other publications.
- 1Söding, J. and Lupas, A. N. (2003) More than the Sum of Their Parts: On the Evolution of Proteins from Peptides. BioEssays 25 (9), 837– 846, DOI: 10.1002/bies.103211https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD3szpslOltw%253D%253D&md5=ef9e186bca6e3e8484c2d6608ec9885fMore than the sum of their parts: on the evolution of proteins from peptidesSoding Johannes; Lupas Andrei NBioEssays : news and reviews in molecular, cellular and developmental biology (2003), 25 (9), 837-46 ISSN:0265-9247.Despite their seemingly endless diversity, proteins adopt a limited number of structural forms. It has been estimated that 80% of proteins will be found to adopt one of only about 400 folds, most of which are already known. These folds are largely formed by a limited 'vocabulary' of recurring supersecondary structure elements, often by repetition of the same element and, increasingly, elements similar in both structure and sequence are discovered. This suggests that modern proteins evolved by fusion and recombination from a more ancient peptide world and that many of the core folds observed today may contain homologous building blocks. The peptides forming these building blocks would not in themselves have had the ability to fold, but would have emerged as cofactors supporting RNA-based replication and catalysis (the 'RNA world'). Their association into larger structures and eventual fusion into polypeptide chains would have allowed them to become independent of their RNA scaffold, leading to the evolution of a novel type of macromolecule: the folded protein.
- 2Murzin, A.G. P (1998) How Far Divergent Evolution Goes in Proteins. Curr. Opin. Struct. Biol. 8 (3), 380– 387, DOI: 10.1016/S0959-440X(98)80073-02https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK1cXks1egu7s%253D&md5=6fd71121962d77c8c2cb907424b83655How far divergent evolution goes in proteinsMurzin, Alexey G.Current Opinion in Structural Biology (1998), 8 (3), 380-387CODEN: COSBEF; ISSN:0959-440X. (Current Biology Ltd.)A review with 32 refs. In theory, mutations of protein sequences may eventually generate different functions as well as different structures. The observation of such records of protein evolution have been obscured by the dissipation of memory about the ancestors. In the past year, new advances in our understanding of divergent evolution were allowed by new protein structure detns., including the ClpP protease, steroid Δ-isomerase, carboxypeptidase G2, the thrombin inhibitor triabin and the chloroplast Rieske protein. There is strong evidence for their distant homol. with proteins of known structure despite significant functional or structural differences.
- 3Bork, P., Sander, C., and Valencia, A. (1993) Convergent Evolution of Similar Enzymatic Function on Different Protein Folds: The Hexokinase, Ribokinase, and Galactokinase Families of Sugar Kinases. Protein Sci. 2 (1), 31– 40, DOI: 10.1002/pro.55600201043https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK3sXhvVKrsLY%253D&md5=29e6355f725f8bf8db0eea3f04821738Convergent evolution of similar enzymic function on different protein folds: The hexokinase, ribokinase, and galactokinase families of sugar kinasesBork, Peer; Sander, Chris; Valencia, AlfonsoProtein Science (1993), 2 (1), 31-40CODEN: PRCIEI; ISSN:0961-8368.Sugar kinases, that catalyze phosphorylation of sugars, can be divided into at least 3 distinct nonhomologous families. The 1st is the hexokinase family, which contains many prokaryotic and eukaryotic sugar kinases with diverse specificities, including a new member, rhamnokinase from Salmonella typhimurium. The 3-dimensional structure of hexokinase is known and can be used to build models of functionally important regions of other kinases in this family. The 2nd is the ribokinase family, of unknown 3-dimensional structure, and comprises pro- and eukaryotic ribokinases, bacterial fructokinases, the minor 6-phosphofructokinase 2 from Escherichia coli, 6-phosphotagatokinase, 1-phosphofructokinase, and, possibly, inosine-guanosine kinase. The 3rd family, also of unknown 3-dimensional structure, contains several bacterial and yeast galactokinases and eukaryotic mevalonate and phosphomevalonate kinases and may have a substrate binding region in common with homoserine kinases. Each of the 3 families of sugar kinases appears to have a distinct 3-dimensional fold, since conserved sequence patterns are strikingly different for the 3 families. However, each catalyzes chem. equiv. reactions on similar or identical substrates. 
The enzymic function of sugar phosphorylation appears to have evolved independently on the 3 distinct structural frameworks, by convergent evolution. In addn., evolutionary trees reveal that (1) fructokinase specificity has evolved independently in both the hexokinase and ribokinase families and (2) glucose specificity has evolved independently in different branches of the hexokinase family. These are examples of independent Darwinian adaptation of a structure to the same substrate at different evolutionary times. The flexible combination of active sites and 3-dimensional folds obsd. in nature can be exploited by protein engineers in designing and optimizing enzymic function.
- 4Söding, J., Biegert, A., and Lupas, A. N. (2005) The HHpred Interactive Server for Protein Homology Detection and Structure Prediction. Nucleic Acids Res. 33, W244– 248, DOI: 10.1093/nar/gki4084https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD2MzjtVSntg%253D%253D&md5=58a35ae4ec7df2e6b8e6fcbb6fe146d5The HHpred interactive server for protein homology detection and structure predictionSoding Johannes; Biegert Andreas; Lupas Andrei NNucleic acids research (2005), 33 (Web Server issue), W244-8 ISSN:.HHpred is a fast server for remote protein homology detection and structure prediction and is the first to implement pairwise comparison of profile hidden Markov models (HMMs). It allows to search a wide choice of databases, such as the PDB, SCOP, Pfam, SMART, COGs and CDD. It accepts a single query sequence or a multiple alignment as input. Within only a few minutes it returns the search results in a user-friendly format similar to that of PSI-BLAST. Search options include local or global alignment and scoring secondary structure similarity. HHpred can produce pairwise query-template alignments, multiple alignments of the query with a set of templates selected from the search results, as well as 3D structural models that are calculated by the MODELLER software from these alignments. A detailed help facility is available. As a demonstration, we analyze the sequence of SpoVT, a transcriptional regulator from Bacillus subtilis. HHpred can be accessed at http://protevo.eb.tuebingen.mpg.de/hhpred.
- 5Farias-Rico, J. A., Schmidt, S., and Höcker, B. (2014) Evolutionary Relationship of two ancient protein superfolds. Nat. Chem. Biol. 10 (9), 710– 715, DOI: 10.1038/nchembio.15795https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhtFygs7nN&md5=2be43499fad84afa2d2d4eee88131b10Evolutionary relationship of two ancient protein superfoldsFarias-Rico, Jose Arcadio; Schmidt, Steffen; Hoecker, BirteNature Chemical Biology (2014), 10 (9), 710-715CODEN: NCBABT; ISSN:1552-4450. (Nature Publishing Group)Proteins are the mol. machines of the cell that fold into specific three-dimensional structures to fulfill their functions. To improve our understanding of how the structure and function of proteins arises, it is crucial to understand how evolution has generated the structural diversity we observe today. Classically, proteins that adopt different folds are considered to be nonhomologous. However, using state-of-the-art tools for homol. detection, we found evidence of homol. between proteins of two ancient and highly populated protein folds, the (βα)8-barrel and the flavodoxin-like fold. We detected a family of sequences that show intermediate features between both folds and detd. what is to our knowledge the first representative crystal structure of one of its members, giving new insights into the evolutionary link of two of the earliest folds. Our findings contribute to an emergent vision where protein superfolds share common ancestry and encourage further approaches to complete the mapping of structure space onto sequence space.
- 6Fortian, A., Castano, D., Ortega, G., Lain, A., Pons, M., and Millet, O. (2009) Uroporhyrinogen III Synthase Mutations Related to Congenital Erythropoietic Porphyria Indentify Key Helix for Protein Stability. Biochemistry 48 (2), 454– 461, DOI: 10.1021/bi801731q6https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXhsFanurzK&md5=964c53f4854b4ce6ce1d3ebdd39e7382Uroporphyrinogen III Synthase Mutations Related to Congenital Erythropoietic Porphyria Identify a Key Helix for Protein StabilityFortian, Arola; Castano, David; Ortega, Gabriel; Lain, Ana; Pons, Miquel; Millet, OscarBiochemistry (2009), 48 (2), 454-461CODEN: BICHAW; ISSN:0006-2960. (American Chemical Society)In the present study we have investigated deleterious mutants in the uroporphyrinogen III synthase (UROIIIS) that are related to the congenital erythropoietic porphyria (CEP). The 25 missense mutants found in CEP patients have been cloned, expressed, and purified. Their enzymic activities have been measured relative to wild-type UROIIIS activity. All mutants retain measurable activity, consistent with the recessive character of the disease. Most of the mutants with a significant decrease in activity involve residues likely assocd. in binding. However, other mutants are fully active, indicating that different mechanisms may contribute to enzyme misfunction. UROIIIS is a thermolabile enzyme undergoing irreversible denaturation. The unfolding kinetics of wild-type UROIIIS and the suite of mutants have been monitored by CD. This anal. allowed the identification of a helical region in the mol., essential to retain the kinetic stability of the folded conformation. C73R is found in one-third of CEP patients, and Cys73 is part of this helix. The integrated anal. of the enzymic activity and kinetic stability data is used to gain insight in the relationship between defects in UROIIIS sequence and CEP.
- 7Szilagyi, A., Györffy, D., and Zavodszky, P. (2017) Segment Swapping Aided the Evolution of Enzyme Function: The Case of Uroporphyrinogen III Synthase. Proteins: Struct., Funct., Genet. 85 (1), 46– 53, DOI: 10.1002/prot.251907https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhslyjtLnM&md5=f1a0d549c40c051186aa9bcdc741c070Segment swapping aided the evolution of enzyme function: The case of uroporphyrinogen III synthaseSzilagyi, Andras; Gyoerffy, Daniel; Zavodszky, PeterProteins: Structure, Function, and Bioinformatics (2017), 85 (1), 46-53CODEN: PSFBAF; ISSN:1097-0134. (Wiley-Blackwell)In an earlier study, we showed that two-domain segment-swapped proteins can evolve by domain swapping and fusion, resulting in a protein with two linkers connecting its domains. We proposed that a potential evolutionary advantage of this topol. may be the restriction of interdomain motions, which may facilitate domain closure by a hinge-like movement, crucial for the function of many enzymes. Here, we test this hypothesis computationally on uroporphyrinogen III synthase (U3S), a two-domain segment-swapped enzyme essential in porphyrin metab. To compare the interdomain flexibility between the wild-type, segment-swapped enzyme (having two interdomain linkers) and circular permutants of the same enzyme having only one interdomain linker, we performed geometric and mol. dynamics simulations for these species in their ligand-free and ligand-bound forms. We find that in the ligand-free form, interdomain motions in the wild-type enzyme are significantly more restricted than they would be with only one interdomain linker, while the flexibility difference is negligible in the ligand-bound form. We also estd. the entropy costs of ligand binding assocd. with the interdomain motions, and find that the change in domain connectivity due to segment swapping results in a redn. 
of this entropy cost, corresponding to ∼20% of the total ligand binding free energy. In addn., the restriction of interdomain motions may also help the functional domain-closure motion required for catalysis. This suggests that the evolution of the segment-swapped topol. facilitated the evolution of enzyme function for this protein by influencing its dynamic properties. Proteins 2016. © 2016 Wiley Periodicals, Inc.
- 8Alva, V., Nam, S. Z., Söding, J., and Lupas, A. N. (2016) The MPI Bioinformatics Toolkit as an Integrative Platform for Advanced Protein Sequence and Structure Analysis. Nucleic Acids Res. 44 (W1), W410– W415, DOI: 10.1093/nar/gkw3488https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2it77E&md5=05641d8bca799244ae60ef06e5e3d039The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysisAlva, Vikram; Nam, Seung-Zin; Soeding, Johannes; Lupas, Andrei N.Nucleic Acids Research (2016), 44 (W1), W410-W415CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The MPI Bioinformatics Toolkit (http://toolkit. tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic anal. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current prodn.-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400,000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for exptl. scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.
- 9Sander, C. and Schneider, R. (1991) Database of Homology-Derived protein structures and the Structural Meaning of Sequence Alignment. Proteins: Struct., Funct., Genet. 9 (1), 56– 68, DOI: 10.1002/prot.3400901079https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK3MXhtVSlsrw%253D&md5=e8c5c9faa7243a26e38113308cad1ac2Database of homology-derived protein structures and the structural meaning of sequence alignmentSander, Chris; Schneider, ReinhardProteins: Structure, Function, and Genetics (1991), 9 (1), 56-68CODEN: PSFGEY; ISSN:0887-3585.The database of known protein three-dimensional structures can be significantly increased by the use of sequence homol. The database of known sequences, currently at >12,000 proteins, is two orders of magnitude larger than the database of known structures. The currently most powerful method of predicting protein structures is model building by homol. Structural homol. can be inferred from the level of sequence similarity. The threshold of sequence similarity sufficient for structural homol. depends strongly on the length of the alignment. The relation is quantified between sequence similarity, structure similarity, and alignment length by an exhaustive survey of alignments between proteins of known structure and report a homol. threshold curve as a function of alignment length. A database of homol.-derived secondary structure of proteins (HSSP) is produced by aligning to each protein of known structure all sequences deemed homologous on the basis of the threshold curve. For each known protein structure, the derived database contains the aligned sequences, secondary structure, sequence variability, and sequence profile. Tertiary structures of the aligned sequences are implied, but not modeled explicitly. The database effectively increases the no. of known protein structures by a factor of 5 to >1800. 
The results may be useful in assessing the structural significance of matches in sequence database searches, in deriving preferences and patterns for structure prediction, in elucidating the structural role of conserved residues, and in modeling three-dimensional detail by homol.
- 10Houwman, J. A. and van Mierlo, C. P. M. (2017) Folding of proteins with a flavodoxin-like architecture. FEBS J. 284, 3145– 3167, DOI: 10.1111/febs.1407710https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXmvVCgsLw%253D&md5=1d70e4310189d8a2b8fd979101f005acFolding of proteins with a flavodoxin-like architectureHouwman, Joseline A.; van Mierlo, Carlo P. M.FEBS Journal (2017), 284 (19), 3145-3167CODEN: FJEOAC; ISSN:1742-464X. (Wiley-Blackwell)A review. The flavodoxin-like fold is a protein architecture that can be traced back to the universal ancestor of the three kingdoms of life. Many proteins share this α-β parallel topol. and hence it is highly relevant to illuminate how they fold. Here, we review expts. and simulations concerning the folding of flavodoxins and CheY-like proteins, which share the flavodoxin-like fold. These polypeptides tend to temporarily misfold during unassisted folding to their functionally active forms. This susceptibility to frustration is caused by the more rapid formation of an α-helix compared to a β-sheet, particularly when a parallel β-sheet is involved. As a result, flavodoxin-like proteins form intermediates that are off-pathway to native protein and several of these species are molten globules (MGs). Expts. suggest that the off-pathway species are of helical nature and that flavodoxin-like proteins have a nonconserved transition state that dets. the rate of productive folding. The folding of flavodoxin from Azotobacter vinelandii has been investigated extensively, enabling a schematic construction of its folding energy landscape. It is the only flavodoxin-like protein of which cotranslational folding has been probed. New insights that emphasize differences between in vivo and in vitro folding energy landscapes are emerging: the ribosome modulates MG formation in nascent apoflavodoxin and forces this polypeptide toward the native state.
- 11Holm, L. and Laakso, L. M. (2016) DALI Server Update. Nucleic Acids Res. 44 (W1), W351– W355, DOI: 10.1093/nar/gkw35711https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2itrrO&md5=cd2b5fabf6a01a4b35f350342ae7ddc2Dali server updateHolm, Liisa; Laakso, Laura M.Nucleic Acids Research (2016), 44 (W1), W351-W355CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The Dali server is a network service for comparing protein structures in 3D. In favorable cases, comparing 3D structures may reveal biol. interesting similarities that are not detectable by comparing sequences. The Dali server has been running in various places for over 20 years and is used routinely by crystallographers on newly solved structures. The latest update of the server provides enhanced analytics for the study of sequence and structure conservation. The server performs three types of structure comparisons: (i) Protein Data Bank (PDB) search compares one query structure against those in the PDB and returns a list of similar structures; (ii) pairwise comparison compares one query structure against a list of structures specified by the user; and (iii) all against all structure comparison returns a structural similarity matrix, a dendrogram and a multidimensional scaling projection of a set of structures specified by the user. Structural superimpositions are visualized using the Java-free WebGL viewer PV. The structural alignment view is enhanced by sequence similarity searches against Uniprot. The combined structure-sequence alignment information is compressed to a stack of aligned sequence logos. In the stack, each structure is structurally aligned to the query protein and represented by a sequence logo.
- 12Dawson, N. L., Lewis, T. E., Das, S., Lees, J. G., Lee, D., Ashford, P., Orengo, C. A., and Sillitoe, I. (2017) CATH: an Expanded Resource to Predict Protein Function through Structure and Sequence. Nucleic Acids Res. 45 (D1), D289– D295, DOI: 10.1093/nar/gkw109812https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhslWhu7k%253D&md5=0f0181b39280024e816e8baa299859c1CATH: an expanded resource to predict protein function through structure and sequenceDawson, Natalie L.; Lewis, Tony E.; Das, Sayoni; Lees, Jonathan G.; Lee, David; Ashford, Paul; Orengo, Christine A.; Sillitoe, IanNucleic Acids Research (2017), 45 (D1), D289-D295CODEN: NARHAD; ISSN:1362-4962. (Oxford University Press)The latest version of the CATH-Gene3D protein structure classification database has recently been released (version 4.1, http://www.cathdb.info). The resource comprises over 300 000 domain structures and over 53 million protein domains classified into 2737 homologous superfamilies, doubling the no. of predicted protein domains in the previous version. The daily-updated CATH-B, which contains our very latest domain assignment data, provides putative classifications for over 100 000 addnl. protein domains. This article describes developments to the CATH-Gene3D resource over the last two years since the publication in 2015, including: significant increases to our structural and sequence coverage; expansion of the functional families in CATH; building a support vector machine (SVM) to automatically assign domains to superfamilies; improved search facilities to return alignments of query sequences against multiple sequence alignments; the redesign of the web pages and download site.
- 13Chandonia, J. M., Fox, N. K., and Brenner, S. E. (2017) SCOPe:Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database. J. Mol. Biol. 429 (3), 348– 355, DOI: 10.1016/j.jmb.2016.11.02313https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XitVSmsLrF&md5=4b1574a02f8c1895fc860ab576277f2aSCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended DatabaseChandonia, John-Marc; Fox, Naomi K.; Brenner, Steven E.Journal of Molecular Biology (2017), 429 (3), 348-355CODEN: JMOBAK; ISSN:0022-2836. (Elsevier Ltd.)SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the anal. of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now sepd. into their own class, in order to distinguish them from the homol.-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP.
- 14. Szilagyi, A., Zhang, Y., and Zavodszky, P. (2012) Intra-Chain 3D Segment Swapping Spawns the Evolution of New Multidomain Protein Architectures. J. Mol. Biol. 415 (1), 221–235, DOI: 10.1016/j.jmb.2011.10.045
- 15. Cameron, A. D., Olin, B., Ridderström, M., Mannervik, B., and Jones, T. A. (1997) Crystal Structure of Human Glyoxalase I - Evidence for Gene Duplication and 3D Domain Swapping. EMBO J. 16 (12), 3386–3395, DOI: 10.1093/emboj/16.12.3386
- 16. Caetano-Anolles, G., Kim, H. S., and Mittenthal, J. E. (2007) The Origin of Modern Metabolic Networks Inferred from Phylogenomic Analysis of Protein Architecture. Proc. Natl. Acad. Sci. U. S. A. 104 (22), 9358–9363, DOI: 10.1073/pnas.0701214104
- 17. Höcker, B., Beismann-Driemeyer, S., Hettwer, S., Lustig, A., and Sterner, R. (2001) Dissection of a (βα)8-barrel enzyme into two folded halves. Nat. Struct. Biol. 8 (1), 32–36, DOI: 10.1038/83021
- 18. Höcker, B., Claren, J., and Sterner, R. (2004) Mimicking enzyme evolution by generating new (βα)8-barrels from (βα)4-half-barrels. Proc. Natl. Acad. Sci. U. S. A. 101 (47), 16448–16453, DOI: 10.1073/pnas.0405832101
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.biochem.9b00900.
Experimental methods and solution structure statistics (PDF)
The Uniprot ID of hemD is P48246, and the NCBI accession is 4ES6_A. The PDB ID of the NMR structure is 6TH8 and the BMRB ID 34452.