Modeling Protein–Glycan Interactions with HADDOCK
- Anna Ranaudo, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Piazza Della Scienza 1, Milan 20126, Italy; Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584CH, The Netherlands
- Marco Giulini, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584CH, The Netherlands
- Angela Pelissou Ayuso, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584CH, The Netherlands
- Alexandre M. J. J. Bonvin*, Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, Utrecht 3584CH, The Netherlands. *Email: [email protected]
Abstract
The term glycan refers to a broad category of molecules composed of monosaccharide units linked to each other in a variety of ways, whose structural diversity is related to different functions in living organisms. Among other roles, glycans are recognized by proteins to carry information and mediate signaling. Determining the three-dimensional structures of protein–glycan complexes is essential both for understanding the mechanisms in which glycans are involved and for applications such as drug design. In this context, molecular docking is an important complement to experiments. In this study, we show how high ambiguity-driven DOCKing (HADDOCK) can be used efficiently to predict protein–glycan complexes. Using a benchmark of 89 complexes, starting from either their bound or unbound forms and assuming some knowledge of the binding site on the protein, our protocol reaches top 5 success rates of 70% and 40% on the bound and unbound data sets, respectively. We show that the main limiting factor is the complexity of the glycan to be modeled and its associated conformational flexibility.
This publication is licensed under
License Summary*
You are free to share (copy and redistribute) this article in any medium or format and to adapt (remix, transform, and build upon) the material for any purpose, even commercially, within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
Introduction
Methods
Benchmark Data Set Preparation
HADDOCK General Protocol and Scoring Function
Restraints to Drive the Docking
Docking Protocols for Bound and Unbound Data Sets
Bound docking protocol:
1. topoaa: creation of the topologies of the two partners, during which any missing atoms are automatically added.
2. rigidbody: AIR-driven generation of rigid-body models.
3. caprieval: models’ quality analysis (see below).
4. rmsdmatrix: calculation of the RMSD matrix between all the models based on either all the interface residues (when ti-aa AIRs are used) or the protein interface residues and the whole glycan (when tip-ap AIRs are used).
5. clustrmsd: RMSD-based agglomerative hierarchical clustering of the models using the average linkage criterion and a distance cutoff of 2.5 Å. Only clusters containing four or more models are evaluated.
6. caprieval: cluster-based evaluation of the quality of the models.
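The clustering step (average linkage, 2.5 Å distance cutoff, clusters of four or more models) can be illustrated with a small pure-Python sketch. This is not the HADDOCK3 `clustrmsd` implementation: the function name `average_linkage_clusters`, the toy one-dimensional "models", and the use of absolute differences as a stand-in for pairwise RMSD are all assumptions made for illustration.

```python
from itertools import combinations


def average_linkage_clusters(dist, cutoff=2.5, min_size=4):
    """Cluster items described by a symmetric distance matrix.

    The two clusters with the smallest average pairwise distance are
    merged repeatedly until that distance exceeds `cutoff`. Clusters
    with fewer than `min_size` members are then discarded.
    """
    clusters = [[i] for i in range(len(dist))]

    def avg_dist(a, b):
        # Average linkage: mean of all inter-cluster pairwise distances.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        (x, y), d = min(
            (((x, y), avg_dist(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break  # no remaining pair is close enough to merge
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)

    return sorted(sorted(c) for c in clusters if len(c) >= min_size)


# Toy data (hypothetical): 1D positions stand in for docking models and
# absolute differences stand in for pairwise RMSD values in angstroms.
positions = [0.0, 0.5, 1.0, 1.2, 10.0, 10.3, 10.6, 11.0, 30.0]
dist = [[abs(a - b) for b in positions] for a in positions]

clusters = average_linkage_clusters(dist, cutoff=2.5, min_size=4)
print(clusters)
```

In this toy example the isolated item at position 30.0 forms a singleton cluster and is dropped by the minimum-size filter, which mirrors how only clusters of four or more models are evaluated.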
Unbound docking protocol:
1. topoaa: creation of the topologies of the two partners, during which any missing atoms are automatically added.
2. rigidbody: AIR-driven generation of rigid-body docking models (1000 by default) with increased sampling (200 per conformation) when starting from an ensemble of conformations.
3. caprieval: models’ quality analysis (see below).
4. rmsdmatrix: calculation of the RMSD matrix between all the models based on either all the interface residues (when ti-aa AIRs are used) or the protein interface residues and the whole glycan (when tip-ap AIRs are used).
5. clustrmsd: RMSD-based agglomerative hierarchical clustering of the models using the average linkage criterion. Here, 50 (150 when the ensemble of glycan conformations is used) clusters are created.
6. seletopclusts: selection of the top 5 models of the existing clusters.
7. caprieval: cluster-based evaluation of the quality of the models.
8. flexref: semi-flexible refinement through a simulated annealing protocol in torsion angle space in which first side-chains and then side-chains and backbone of interface residues are treated as flexible.
9. caprieval: models’ quality analysis.
10. rmsdmatrix: calculation of the RMSD matrix between all the models, as in point 4.
11. clustrmsd: RMSD-based clustering of the models as in point 5 but here using a distance cutoff of 2.5 Å as in the bound scenario.
12. caprieval: cluster-based evaluation of the quality of the models.
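The selection of the top 5 models of each cluster before refinement can be sketched as follows. This is a minimal illustration, not the `seletopclusts` module itself: the function name `select_top_per_cluster`, the tuple layout, and the toy scores are hypothetical, and the sketch assumes that a lower score means a better model.

```python
from collections import defaultdict


def select_top_per_cluster(models, top_n=5):
    """Return the `top_n` best-scoring model names of every cluster.

    `models` is an iterable of (name, cluster_id, score) tuples.
    Assumption: a lower score is better.
    """
    by_cluster = defaultdict(list)
    for name, cluster_id, score in models:
        by_cluster[cluster_id].append((score, name))

    selected = {}
    for cluster_id, members in by_cluster.items():
        # Sort by score (ascending) and keep the first top_n names.
        selected[cluster_id] = [name for _, name in sorted(members)[:top_n]]
    return selected


# Hypothetical rigid-body models: (name, cluster id, score).
models = [
    ("m1", 1, -42.0), ("m2", 1, -40.5), ("m3", 2, -39.0),
    ("m4", 1, -38.2), ("m5", 1, -37.9), ("m6", 2, -36.0),
    ("m7", 1, -35.1), ("m8", 1, -30.0), ("m9", 2, -28.4),
    ("m10", 1, -25.0),
]

selected = select_top_per_cluster(models, top_n=5)
print(selected)
```

Under this scheme, 50 clusters with 5 selected models each would pass at most 250 models to the refinement stage, and fewer if some clusters contain fewer than 5 models.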
Figure 1. Schematic representation of the docking protocol for the unbound data set. First, rigid-body docking is performed. The models are then clustered based on RMSD. The best-scoring models of each cluster are then subjected to a flexible refinement (interface), and the resulting models are again clustered and analyzed.
Model Quality Assessment
high-quality models: IL-RMSD ≤ 1.0 Å
medium-quality models: IL-RMSD ≤ 2.0 Å
acceptable-quality models: IL-RMSD ≤ 3.0 Å
near acceptable-quality models: IL-RMSD ≤ 4.0 Å
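The thresholds above translate directly into a small classification helper. The sketch below is illustrative only: the function name `classify_il_rmsd` is hypothetical, and the label "incorrect" for models above 4.0 Å is an assumption, since the list does not name that class.

```python
# Quality classes ordered from strictest to loosest IL-RMSD threshold (in angstroms).
THRESHOLDS = [
    ("high", 1.0),
    ("medium", 2.0),
    ("acceptable", 3.0),
    ("near-acceptable", 4.0),
]


def classify_il_rmsd(il_rmsd):
    """Return the best quality class whose threshold the IL-RMSD satisfies."""
    for label, limit in THRESHOLDS:
        if il_rmsd <= limit:
            return label
    # Assumed label for models above the loosest threshold.
    return "incorrect"


for value in (0.8, 1.0, 2.4, 3.7, 6.1):
    print(value, classify_il_rmsd(value))
```

Because the thresholds are nested, a model is assigned to the strictest class it satisfies; for example, an IL-RMSD of exactly 1.0 Å counts as high quality.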
Glycans’ Conformational Sampling
Results and Discussion
Bound Docking Performance and the Impact of the Rigid-Body Scoring Function in the Ranking of Models
Figure 2. Comparison of bound docking success rates obtained with the default (w_vdW = 0.01) and vdW (w_vdW = 1.0) scoring functions (eqs 1 and 2) as a function of the number of top-ranked models (T = 1, 5, 10, 50, 100, and 200) selected using true interface residues of both protein and glycan to define the ambiguous interaction restraints (ti-aa AIRs).
Impact of Glycan Structural Features and AIR Definition on the Bound Docking Performance
Unbound Docking
Figure 3. HADDOCK3’s performance on the unbound data set using the vdW scoring function and tip-ap AIRs (true interface on the protein defined as active and the glycan residues as passive). The success rates (SRs) (Y axis), defined as the percentage of complexes for which acceptable-, medium-, or high-quality models are generated, are calculated on the top 1 (T1) to top 200 (T200) ranked rigid-body models (column “rigid”), T1 to T50 rigid-body clusters, considering the top 5 models of each cluster (column “rigid + clustering”), the T1 to T200 ranked refined models (column “flexref”), and the T1 to T10 refined clusters, considering the top 4 models of each cluster (column “flexref + clustering”). SRs are shown separately for the whole data set (first row) and for the three categories of complexes grouped by glycan size and connectivity: SL-SB (second row), LL (third row), and LB (fourth row).
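The success rate used in this caption, the percentage of complexes with at least one model of acceptable or better quality among the top N ranked models, can be computed as in the sketch below. It covers only the model-based ranking, not the cluster-based variant. The function name `success_rate`, the complex labels, and the IL-RMSD values are made up for illustration and are not results from the paper.

```python
def success_rate(ranked_il_rmsds, top_n, threshold=3.0):
    """Percentage of complexes with a hit among their top_n ranked models.

    `ranked_il_rmsds` maps a complex label to the IL-RMSD values of its
    models, ordered by rank (best-scored first). A hit is a model with
    IL-RMSD <= threshold (3.0 angstroms corresponds to acceptable quality).
    """
    hits = sum(
        any(value <= threshold for value in ranked[:top_n])
        for ranked in ranked_il_rmsds.values()
    )
    return 100.0 * hits / len(ranked_il_rmsds)


# Hypothetical data: four complexes, three ranked models each.
toy = {
    "complex_A": [3.5, 2.8, 5.0],
    "complex_B": [0.9, 4.4, 6.2],
    "complex_C": [6.0, 7.1, 8.3],
    "complex_D": [4.2, 4.1, 1.5],
}

for n in (1, 2, 3):
    print(f"T{n}: {success_rate(toy, n):.1f}%")
```

Tightening `threshold` to 2.0 or 1.0 gives the corresponding rates for medium- and high-quality models.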
Figure 4. Superimposition of the best-scoring flexible refinement models (orange) and the rigid-body models (teal) to the reference structures (gray) for the complexes 1OH4 (LB), 5VX5 (LL), and 1C1L (SL) and the unbound docking scenario carried out with vdW scoring potential and tip-ap AIRs. Oxygen atoms of the glycans are shown in red in all the structures, nitrogen atoms in blue, and hydrogens are not shown. Ranking and IL-RMSD values with respect to the reference structures for the flexref and rigid-body models are shown as well.
Can an Ensemble of Presampled Glycan Conformations Improve the Docking Performance?
Conclusions
Data Availability
The software used in this manuscript (HADDOCK3) is publicly available at https://github.com/haddocking/haddock3. All input data, analysis scripts, and results presented in the paper can be accessed at https://github.com/haddocking/protein-glycans. A tutorial describing the modeling of a protein–glycan complex using HADDOCK3 is hosted at https://bonvinlab.org/education/HADDOCK3/HADDOCK3-protein-glycan. The full runs, including docking models from all modules of a workflow, have been deposited in our lab collection (https://data.sbgrid.org/labs/32/1138) at the SBGRID data repository. (42)
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.4c01372.
Data set used in this study (XLSX)
Details of the preparation of protein and glycan structures; description of HADDOCK3 modules used in this work; details of the glycan conformational sampling protocol; SNFG representation of the glycans; modules and parameters used for bound docking; modules and parameters used for unbound docking; glycan conformational sampling scenarios; example of HADDOCK models satisfying the quality thresholds; HADDOCK3’s performance on the bound data set; glycans’ RMSD to their bound conformations; impact of mdref on glycan conformations; impact of the clustering on glycans’ lowest RMSD; examples of glycan ensembles of conformations; HADDOCK3’s performance with the ensembles of glycans; torsion angle analysis of glycosidic linkages: a comparison to HADDOCK flexible refinement models; and torsion angle analysis of glycosidic linkages: a comparison to HADDOCK short molecular dynamics refinement models (PDF)
Abbreviations
HADDOCK | high ambiguity-driven DOCKing
IL-RMSD | interface-ligand root mean squared deviation
SL | short-linear
SB | short-branched
LL | long-linear
LB | long-branched
AIRs | ambiguous interaction restraints
PDB | Protein Data Bank
References
This article references 42 other publications.
- 1. Gold, V. The IUPAC Compendium of Chemical Terminology; International Union of Pure and Applied Chemistry (IUPAC): Research Triangle Park, NC, 2019.
- 2. Seeberger, P. H. Monosaccharide Diversity. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Eds.; Cold Spring Harbor Laboratory Press: New York, 2022; pp 21–32. DOI: 10.1101/glycobiology.4e.2.
- 3. Lebrilla, C. B.; Liu, J.; Widmalm, G.; Prestegard, J. H. Oligosaccharides and Polysaccharides. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Stanley, P., Hart, G. W., Aebi, M., Mohnen, D., Kinoshita, T., Packer, N. H., Prestegard, J. H., Schnaar, R. L., Seeberger, P. H., Eds.; Cold Spring Harbor Laboratory Press: New York, 2022; pp 33–42. DOI: 10.1101/glycobiology.4e.3.
- 4. Varki, A.; Gagneux, P. Biological Functions of Glycans. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Eds.; Cold Spring Harbor Laboratory Press: New York, 2022. DOI: 10.1101/glycobiology.4e.7.
- 5. Molina, A.; O’Neill, M. A.; Darvill, A. G.; Etzler, M. E.; Mohnen, D.; Hahn, M. G.; Esko, J. D. Free Glycans as Bioactive Molecules. In Essentials of Glycobiology; Varki, A., Cummings, R. D., Esko, J. D., Stanley, P., Hart, G. W., Aebi, M., Mohnen, D., Kinoshita, T., Packer, N. H., Prestegard, J. H., Schnaar, R. L., Seeberger, P. H., Eds.; Cold Spring Harbor Laboratory Press: New York, 2022; pp 539–548. DOI: 10.1101/glycobiology.4e.40.
- 6. Casalino, L.; Gaieb, Z.; Goldsmith, J. A.; Hjorth, C. K.; Dommer, A. C.; Harbison, A. M.; Fogarty, C. A.; Barros, E. P.; Taylor, B. C.; McLellan, J. S.; Fadda, E.; Amaro, R. E. Beyond Shielding: The Roles of Glycans in the SARS-CoV-2 Spike Protein. ACS Cent. Sci. 2020, 6 (10), 1722–1734, DOI: 10.1021/acscentsci.0c01056
- 7. Woods, R. J. Predicting the Structures of Glycans, Glycoproteins, and Their Complexes. Chem. Rev. 2018, 118 (17), 8005–8024, DOI: 10.1021/acs.chemrev.8b00032
- 8. Perez, S.; Makshakova, O. Multifaceted Computational Modeling in Glycoscience. Chem. Rev. 2022, 122 (20), 15914–15970, DOI: 10.1021/acs.chemrev.2c00060
- 9. Nance, M. L.; Labonte, J. W.; Adolf-Bryfogle, J.; Gray, J. J. Development and Evaluation of GlycanDock: A Protein–Glycoligand Docking Refinement Algorithm in Rosetta. J. Phys. Chem. B 2021, 125 (25), 6807–6820, DOI: 10.1021/acs.jpcb.1c00910
- 10. Mottarella, S. E.; Beglov, D.; Beglova, N.; Nugent, M. A.; Kozakov, D.; Vajda, S. Docking Server for the Identification of Heparin Binding Sites on Proteins. J. Chem. Inf. Model. 2014, 54 (7), 2068–2078, DOI: 10.1021/ci500115j
- 11. Nivedha, A. K.; Thieker, D. F.; Makeneni, S.; Hu, H.; Woods, R. J. Vina-Carb: Improving Glycosidic Angles during Carbohydrate Docking. J. Chem. Theory Comput. 2016, 12 (2), 892–901, DOI: 10.1021/acs.jctc.5b00834
- 12. Boittier, E. D.; Burns, J. M.; Gandhi, N. S.; Ferro, V. GlycoTorch Vina: Docking Designed and Tested for Glycosaminoglycans. J. Chem. Inf. Model. 2020, 60 (12), 6328–6343, DOI: 10.1021/acs.jcim.0c00373
- 13. Labonte, J. W.; Adolf-Bryfogle, J.; Schief, W. R.; Gray, J. J. Residue-centric Modeling and Design of Saccharide and Glycoconjugate Structures. J. Comput. Chem. 2017, 38 (5), 276–287, DOI: 10.1002/jcc.24679
- 14. Glashagen, G.; de Vries, S.; Uciechowska-Kaczmarzyk, U.; Samsonov, S. A.; Murail, S.; Tuffery, P.; Zacharias, M. Coarse-grained and Atomic Resolution Biomolecular Docking with the ATTRACT Approach. Proteins: Struct., Funct., Bioinf. 2020, 88 (8), 1018–1028, DOI: 10.1002/prot.25860
- 15. Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596 (7873), 583–589, DOI: 10.1038/s41586-021-03819-2
- 16. Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A. J.; Bambrick, J.; Bodenstein, S. W.; Evans, D. A.; Hung, C.-C.; O’Neill, M.; Reiman, D.; Tunyasuvunakool, K.; Wu, Z.; Žemgulytė, A.; Arvaniti, E.; Beattie, C.; Bertolli, O.; Bridgland, A.; Cherepanov, A.; Congreve, M.; Cowen-Rivers, A. I.; Cowie, A.; Figurnov, M.; Fuchs, F. B.; Gladman, H.; Jain, R.; Khan, Y. A.; Low, C. M. R.; Perlin, K.; Potapenko, A.; Savy, P.; Singh, S.; Stecula, A.; Thillaisundaram, A.; Tong, C.; Yakneen, S.; Zhong, E. D.; Zielinski, M.; Žídek, A.; Bapst, V.; Kohli, P.; Jaderberg, M.; Hassabis, D.; Jumper, J. M. Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3. Nature 2024, 630 (8016), 493–500, DOI: 10.1038/s41586-024-07487-w
- 17. HADDOCK3, Bonvin’s Lab, 2022. https://github.com/haddocking/haddock3.
- 18. Dominguez, C.; Boelens, R.; Bonvin, A. M. J. J. HADDOCK: A Protein–Protein Docking Approach Based on Biochemical or Biophysical Information. J. Am. Chem. Soc. 2003, 125 (7), 1731–1737, DOI: 10.1021/ja026939x
- 19. Wu, A. M.; Singh, T.; Liu, J.-H.; Krzeminski, M.; Russwurm, R.; Siebert, H.-C.; Bonvin, A. M. J. J.; André, S.; Gabius, H.-J. Activity–Structure Correlations in Divergent Lectin Evolution: Fine Specificity of Chicken Galectin CG-14 and Computational Analysis of Flexible Ligand Docking for CG-14 and the Closely Related CG-16. Glycobiology 2007, 17 (2), 165–184, DOI: 10.1093/glycob/cwl062
- 20. Krzeminski, M.; Singh, T.; André, S.; Lensch, M.; Wu, A. M.; Bonvin, A. M. J. J.; Gabius, H.-J. Human Galectin-3 (Mac-2 Antigen): Defining Molecular Switches of Affinity to Natural Glycoproteins, Structural and Dynamic Aspects of Glycan Binding by Flexible Ligand Docking and Putative Regulatory Sequences in the Proximal Promoter Region. Biochim. Biophys. Acta, Gen. Subj. 2011, 1810 (2), 150–161, DOI: 10.1016/j.bbagen.2010.11.001
- 21. Berman, H. M. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235–242, DOI: 10.1093/nar/28.1.235
- 22. Woods Group. GLYCAM Web; Complex Carbohydrate Research Center, University of Georgia: Athens, GA, 2023.
- 23. Kirschner, K. N.; Yongye, A. B.; Tschampel, S. M.; González-Outeiriño, J.; Daniels, C. R.; Foley, B. L.; Woods, R. J. GLYCAM06: A Generalizable Biomolecular Force Field. Carbohydrates. J. Comput. Chem. 2008, 29 (4), 622–655, DOI: 10.1002/jcc.20820
- 24. Varki, A.; Cummings, R. D.; Aebi, M.; Packer, N. H.; Seeberger, P. H.; Esko, J. D.; Stanley, P.; Hart, G.; Darvill, A.; Kinoshita, T.; Prestegard, J. J.; Schnaar, R. L.; Freeze, H. H.; Marth, J. D.; Bertozzi, C. R.; Etzler, M. E.; Frank, M.; Vliegenthart, J. F.; Lütteke, T.; Perez, S.; Bolton, E.; Rudd, P.; Paulson, J.; Kanehisa, M.; Toukach, P.; Aoki-Kinoshita, K. F.; Dell, A.; Narimatsu, H.; York, W.; Taniguchi, N.; Kornfeld, S. Symbol Nomenclature for Graphical Representations of Glycans. Glycobiology 2015, 25 (12), 1323–1324, DOI: 10.1093/glycob/cwv091
- 25. Neelamegham, S.; Aoki-Kinoshita, K.; Bolton, E.; Frank, M.; Lisacek, F.; Lütteke, T.; O’Boyle, N.; Packer, N. H.; Stanley, P.; Toukach, P.; Varki, A.; Woods, R. J. Updates to the Symbol Nomenclature for Glycans guidelines. Glycobiology 2019, 29 (9), 620–624, DOI: 10.1093/glycob/cwz045
- 26. de Vries, S. J.; van Dijk, A. D. J.; Krzeminski, M.; van Dijk, M.; Thureau, A.; Hsu, V.; Wassenaar, T.; Bonvin, A. M. J. J. HADDOCK versus HADDOCK: New Features and Performance of HADDOCK2.0 on the CAPRI Targets. Proteins: Struct., Funct., Bioinf. 2007, 69 (4), 726–733, DOI: 10.1002/prot.21723
- 27. Jorgensen, W. L.; Tirado-Rives, J. The OPLS [Optimized Potentials for Liquid Simulations] Potential Functions for Proteins, Energy Minimizations for Crystals of Cyclic Peptides and Crambin. J. Am. Chem. Soc. 1988, 110 (6), 1657–1666, DOI: 10.1021/ja00214a001
- 28. Fernández-Recio, J.; Totrov, M.; Abagyan, R. Identification of Protein–Protein Interaction Sites from Docking Energy Landscapes. J. Mol. Biol. 2004, 335 (3), 843–865, DOI: 10.1016/j.jmb.2003.10.069
- 29. Basciu, A.; Koukos, P. I.; Malloci, G.; Bonvin, A. M. J. J.; Vargiu, A. V. Coupling Enhanced Sampling of the Apo-Receptor with Template-Based Ligand Conformers Selection: Performance in Pose Prediction in the D3R Grand Challenge 4. J. Comput.-Aided Mol. Des. 2020, 34 (2), 149–162, DOI: 10.1007/s10822-019-00244-6
- 30. Nilges, M. A calculation strategy for the structure determination of symmetric dimers by 1H NMR. Proteins: Struct., Funct., Bioinf. 1993, 17 (3), 297–309, DOI: 10.1002/prot.340170307
- 31. Wallace, A. C.; Laskowski, R. A.; Thornton, J. M. LIGPLOT: A Program to Generate Schematic Diagrams of Protein-Ligand Interactions. Protein Eng., Des. Sel. 1995, 8 (2), 127–134, DOI: 10.1093/protein/8.2.127
- 32. McDonald, I. K.; Thornton, J. M. Satisfying Hydrogen Bonding Potential in Proteins. J. Mol. Biol. 1994, 238 (5), 777–793, DOI: 10.1006/jmbi.1994.1334
- 33. Méndez, R.; Leplae, R.; De Maria, L.; Wodak, S. J. Assessment of Blind Predictions of Protein–Protein Interactions: Current Status of Docking Methods. Proteins: Struct., Funct., Bioinf. 2003, 52 (1), 51–67, DOI: 10.1002/prot.10393
- 34. Giulini, M.; Menichetti, R.; Shell, M. S.; Potestio, R. An Information-Theory-Based Approach for Optimal Model Reduction of Biomolecules. J. Chem. Theory Comput. 2020, 16 (11), 6795–6813, DOI: 10.1021/acs.jctc.0c00676
- 35. Sokal, R. R.; Michener, C. D. A Statistical Method for Evaluating Systematic Relationships; University of Kansas Science Bulletin, 1958; Vol. 38, pp 1409–1438.
- 36. Charitou, V.; van Keulen, S. C.; Bonvin, A. M. J. J. Cyclization and Docking Protocol for Cyclic Peptide–Protein Modeling Using HADDOCK2.4. J. Chem. Theory Comput. 2022, 18 (6), 4027–4040, DOI: 10.1021/acs.jctc.2c00075
- 37. Buchanan, C. J.; Gaunt, B.; Harrison, P. J.; Yang, Y.; Liu, J.; Khan, A.; Giltrap, A. M.; Le Bas, A.; Ward, P. N.; Gupta, K.; Dumoux, M.; Tan, T. K.; Schimaski, L.; Daga, S.; Picchiotti, N.; Baldassarri, M.; Benetti, E.; Fallerini, C.; Fava, F.; Giliberti, A.; Koukos, P. I.; Davy, M. J.; Lakshminarayanan, A.; Xue, X.; Papadakis, G.; Deimel, L. P.; Casablancas-Antràs, V.; Claridge, T. D. W.; Bonvin, A. M. J. J.; Sattentau, Q. J.; Furini, S.; Gori, M.; Huo, J.; Owens, R. J.; Schaffitzel, C.; Berger, I.; Renieri, A.; Naismith, J. H.; Baldwin, A. J.; Davis, B. G. Pathogen-Sugar Interactions Revealed by Universal Saturation Transfer Analysis. Science 2022, 377 (6604), eabm3125, DOI: 10.1126/science.abm3125
- 38. Koukos, P. I.; Réau, M.; Bonvin, A. M. J. J. Shape-Restrained Modeling of Protein–Small-Molecule Complexes with High Ambiguity Driven DOCKing. J. Chem. Inf. Model. 2021, 61 (9), 4807–4818, DOI: 10.1021/acs.jcim.1c00796
- 39. Kerzmann, A.; Neumann, D.; Kohlbacher, O. SLICK – Scoring and Energy Functions for Protein–Carbohydrate Interactions. J. Chem. Inf. Model. 2006, 46 (4), 1635–1642, DOI: 10.1021/ci050422y
- 40. Kerzmann, A.; Fuhrmann, J.; Kohlbacher, O.; Neumann, D. BALLDock/SLICK: A New Method for Protein-Carbohydrate Docking. J. Chem. Inf. Model. 2008, 48 (8), 1616–1625, DOI: 10.1021/ci800103u
- 41. Ives, C.; Singh, O.; D’Andrea, S.; Fogarty, C.; Harbison, A.; Satheesan, A.; Tropea, B.; Fadda, E. Restoring Protein Glycosylation with GlycoShape. bioRxiv 2023, DOI: 10.1101/2023.12.11.571101
- 42. Meyer, P. A.; Socias, S.; Key, J.; Ransey, E.; Tjon, E. C.; Buschiazzo, A.; Lei, M.; Botka, C.; Withrow, J.; Neau, D.; Rajashankar, K.; Anderson, K. S.; Baxter, R. H.; Blacklow, S. C.; Boggon, T. J.; Bonvin, A. M. J. J.; Borek, D.; Brett, T. J.; Caflisch, A.; Chang, C.-I.; Chazin, W. J.; Corbett, K. D.; Cosgrove, M. S.; Crosson, S.; Dhe-Paganon, S.; Di Cera, E.; Drennan, C. L.; Eck, M. J.; Eichman, B. F.; Fan, Q. R.; Ferré-D’Amaré, A. R.; Christopher Fromme, J.; Garcia, K. C.; Gaudet, R.; Gong, P.; Harrison, S. C.; Heldwein, E. E.; Jia, Z.; Keenan, R. J.; Kruse, A. C.; Kvansakul, M.; McLellan, J. S.; Modis, Y.; Nam, Y.; Otwinowski, Z.; Pai, E. F.; Pereira, P. J. B.; Petosa, C.; Raman, C. S.; Rapoport, T. A.; Roll-Mecak, A.; Rosen, M. K.; Rudenko, G.; Schlessinger, J.; Schwartz, T. U.; Shamoo, Y.; Sondermann, H.; Tao, Y. J.; Tolia, N. H.; Tsodikov, O. V.; Westover, K. D.; Wu, H.; Foster, I.; Fraser, J. S.; Maia, F. R. N. C.; Gonen, T.; Kirchhausen, T.; Diederichs, K.; Crosas, M.; Sliz, P. Data Publication with the Structural Biology Data Grid Supports Live Analysis. Nat. Commun. 2016, 7 (1), 10882, DOI: 10.1038/ncomms10882
Figure 1. Schematic representation of the docking protocol for the unbound data set. First, rigid-body docking is performed. The models are then clustered based on RMSD. The best-scoring models of each cluster are then subjected to a flexible refinement (interface), and the resulting models are again clustered and analyzed.
Figure 2. Comparison of bound docking success rates obtained with the default (w_vdW = 0.01) and vdW (w_vdW = 1.0) scoring functions (eqs 1 and 2) as a function of the number of top-ranked models (T = 1, 5, 10, 50, 100, and 200) selected using true interface residues of both protein and glycan to define the ambiguous interaction restraints (ti-aa AIRs).
Figure 3. HADDOCK3’s performance on the unbound data set using the vdW scoring function and tip-ap AIRs (true interface on the protein defined as active and the glycan residues as passive). The success rates (SRs) (Y axis), defined as the percentage of complexes for which acceptable-, medium-, or high-quality models are generated, are calculated on the top 1 (T1) to top 200 (T200) ranked rigid-body models (column “rigid”), T1 to T50 rigid-body clusters, considering the top 5 models of each cluster (column “rigid + clustering”), the T1 to T200 ranked refined models (column “flexref”), and the T1 to T10 refined clusters, considering the top 4 models of each cluster (column “flexref + clustering”). SRs are shown separately for the whole data set (first row) and for the three categories of complexes grouped by glycan size and connectivity: SL-SB (second row), LL (third row), and LB (fourth row).
Figure 4. Superimposition of the best-scoring flexible refinement models (orange) and the rigid-body models (teal) to the reference structures (gray) for the complexes 1OH4 (LB), 5VX5 (LL), and 1C1L (SL) and the unbound docking scenario carried out with vdW scoring potential and tip-ap AIRs. Oxygen atoms of the glycans are shown in red in all the structures, nitrogen atoms in blue, and hydrogens are not shown. Ranking and IL-RMSD values with respect to the reference structures for the flexref and rigid-body models are shown as well.
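The success rates described in the Figure 3 caption (the percentage of complexes with at least one model of acceptable or better quality among the top N ranked models) can be illustrated with a minimal sketch. The function name, the input layout, and the toy data below are hypothetical and used only for illustration; they are not part of HADDOCK3, and the quality classification of each model is assumed to have been done beforehand.

```python
from typing import Dict, Sequence


def top_n_success_rate(hits_by_complex: Dict[str, Sequence[bool]], n: int) -> float:
    """Percentage of complexes with at least one successful model in the top n.

    hits_by_complex maps a complex identifier to a list of booleans ordered by
    rank (best-scoring model first). True means the model at that rank was
    already classified as acceptable quality or better.
    """
    if not hits_by_complex:
        raise ValueError("at least one complex is required")
    successes = sum(1 for ranked in hits_by_complex.values() if any(ranked[:n]))
    return 100.0 * successes / len(hits_by_complex)


# Hypothetical toy data: four complexes, three ranked models each.
toy = {
    "complex_A": [False, True, False],
    "complex_B": [False, False, False],
    "complex_C": [True, False, False],
    "complex_D": [False, False, True],
}

# Success rate as a function of the number of top-ranked models considered.
rates = {n: top_n_success_rate(toy, n) for n in (1, 2, 3)}
print(rates)
```

The same function could be applied to cluster-based rankings by passing one boolean per ranked cluster, set to True when any of the cluster's top models reaches the required quality.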
A final section features a list of 150 tools and databases to help address the many issues of structural glycobioinformatics.
- 9Nance, M. L.; Labonte, J. W.; Adolf-Bryfogle, J.; Gray, J. J. Development and Evaluation of GlycanDock: A Protein–Glycoligand Docking Refinement Algorithm in Rosetta. J. Phys. Chem. B 2021, 125 (25), 6807– 6820, DOI: 10.1021/acs.jpcb.1c00910There is no corresponding record for this reference.
- 10Mottarella, S. E.; Beglov, D.; Beglova, N.; Nugent, M. A.; Kozakov, D.; Vajda, S. Docking Server for the Identification of Heparin Binding Sites on Proteins. J. Chem. Inf. Model. 2014, 54 (7), 2068– 2078, DOI: 10.1021/ci500115j10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhtVKnsLjK&md5=82dedc6c1e29961cbe8eb8a86957e582Docking Server for the Identification of Heparin Binding Sites on ProteinsMottarella, Scott E.; Beglov, Dmitri; Beglova, Natalia; Nugent, Matthew A.; Kozakov, Dima; Vajda, SandorJournal of Chemical Information and Modeling (2014), 54 (7), 2068-2078CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Many proteins of widely differing functionality and structure are capable of binding heparin and heparan sulfate. Since crystg. protein-heparin complexes for structure detn. is generally difficult, computational docking can be a useful approach for understanding specific interactions. Previous studies used programs originally developed for docking small mols. to well-defined pockets, rather than for docking polysaccharides to highly charged shallow crevices that usually bind heparin. The authors have extended the program PIPER and the automated protein-protein docking server ClusPro to heparin docking. Using a mol. mechanics energy function for scoring and the fast Fourier transform correlation approach, the method generates and evaluates close to a billion poses of a heparin tetrasaccharide probe. The docked structures are clustered using pairwise root-mean-square deviations as the distance measure. Clustering of heparin mols. close to each other but having different orientations and selecting the clusters with the highest protein-ligand contacts reliably predicts the heparin binding site. In addn., the centers of the five most populated clusters include structures close to the native orientation of the heparin. 
These structures can provide starting points for further refinement by methods that account for flexibility such as mol. dynamics. The heparin docking method is available as an advanced option of the ClusPro server at http://cluspro.bu.edu/.
- 11Nivedha, A. K.; Thieker, D. F.; Makeneni, S.; Hu, H.; Woods, R. J. Vina-Carb: Improving Glycosidic Angles during Carbohydrate Docking. J. Chem. Theory Comput. 2016, 12 (2), 892– 901, DOI: 10.1021/acs.jctc.5b0083411https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XltlCrsQ%253D%253D&md5=cf018be5775bfaa2ad035ed1b82305b6Vina-Carb: Improving Glycosidic Angles during Carbohydrate DockingNivedha, Anita K.; Thieker, David F.; Hu, Huimin; Woods, Robert J.Journal of Chemical Theory and Computation (2016), 12 (2), 892-901CODEN: JCTCCE; ISSN:1549-9618. (American Chemical Society)Mol. docking programs are primarily designed to align rigid, drug-like fragments into the binding sites of macromols. and frequently display poor performance when applied to flexible carbohydrate mols. A crit. source of flexibility within an oligosaccharide is the glycosidic linkages. Recently, Carbohydrate Intrinsic (CHI) energy functions are reported that attempt to quantify the glycosidic torsion angle preferences. The CHI-energy functions have been incorporated into the AutoDock Vina (ADV) scoring function, subsequently termed Vina-Carb (VC). Two user-adjustable parameters have been introduced, namely, a CHI- energy wt. term (chi_coeff) that affects the magnitude of the CHI-energy penalty and a CHI-cutoff term (chi_cutoff) that negates CHI-energy penalties below a specified value. A data set consisting of 101 protein-carbohydrate complexes and 29 apoprotein structures was used in the development and testing of VC, including antibodies, lectins, and carbohydrate binding modules. Accounting for the intramol. energies of the glycosidic linkages in the oligosaccharides during docking led VC to produce acceptable structures within the top five ranked poses in 74% of the systems tested, compared to a success rate of 55% for ADV. 
An enzyme system was employed to illustrate the potential application of VC to proteins that may distort glycosidic linkages of carbohydrate ligands upon binding. VC represents a significant step toward accurately predicting the structures of protein-carbohydrate complexes. Furthermore, the described approach is conceptually applicable to any class of ligands that populate well-defined conformational states.
- 12Boittier, E. D.; Burns, J. M.; Gandhi, N. S.; Ferro, V. GlycoTorch Vina: Docking Designed and Tested for Glycosaminoglycans. J. Chem. Inf. Model. 2020, 60 (12), 6328– 6343, DOI: 10.1021/acs.jcim.0c00373There is no corresponding record for this reference.
- 13Labonte, J. W.; Adolf-Bryfogle, J.; Schief, W. R.; Gray, J. J. Residue-centric Modeling and Design of Saccharide and Glycoconjugate Structures. J. Comput. Chem. 2017, 38 (5), 276– 287, DOI: 10.1002/jcc.2467913https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhvFKjsLnO&md5=822f98946c2d848b46f06b581beb11b4Residue-centric modeling and design of saccharide and glycoconjugate structuresLabonte, Jason W.; Adolf-Bryfogle, Jared; Schief, William R.; Gray, Jeffrey J.Journal of Computational Chemistry (2017), 38 (5), 276-287CODEN: JCCHDD; ISSN:0192-8651. (John Wiley & Sons, Inc.)The RosettaCarbohydrate framework is a new tool for modeling a wide variety of saccharide and glycoconjugate structures. This report describes the development of the framework and highlights its applications. The framework integrates with established protocols within the Rosetta modeling and design suite, and it handles the vast complexity and variety of carbohydrate mols., including branching and sugar modifications. To address challenges of sampling and scoring, RosettaCarbohydrate can sample glycosidic bonds, side-chain conformations, and ring forms, and it uses a glycan-specific term within its scoring function. Rosetta can work with std. PDB, GLYCAM, and GlycoWorkbench (.gws) file formats. Saccharide residue-specific chem. information is stored internally, permitting glycoengineering and design. Carbohydrate-specific applications described herein include virtual glycosylation, loop-modeling of carbohydrates, and docking of glyco-ligands to antibodies. Benchmarking data are presented and compared to other studies, demonstrating Rosetta's ability to predict glyco-ligand binding. The framework expands the tools available to glycoscientists and engineers.
- 14Glashagen, G.; de Vries, S.; Uciechowska-Kaczmarzyk, U.; Samsonov, S. A.; Murail, S.; Tuffery, P.; Zacharias, M. Coarse-grained and Atomic Resolution Biomolecular Docking with the ATTRACT Approach. Proteins: Struct., Funct., Bioinf. 2020, 88 (8), 1018– 1028, DOI: 10.1002/prot.25860There is no corresponding record for this reference.
- 15Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly Accurate Protein Structure Prediction with AlphaFold. Nature 2021, 596 (7873), 583– 589, DOI: 10.1038/s41586-021-03819-215https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXhvVaktrrL&md5=25964ab1157cd5b74a437333dd86650dHighly accurate protein structure prediction with AlphaFoldJumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Zidek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A. A.; Ballard, Andrew J.; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W.; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, DemisNature (London, United Kingdom) (2021), 596 (7873), 583-589CODEN: NATUAS; ISSN:0028-0836. (Nature Portfolio)Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous exptl. effort, the structures of around 100,000 unique proteins have been detd., but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to det. a single protein structure. 
Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence-the structure prediction component of the 'protein folding problem'-has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of at. accuracy, esp. when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with at. accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Crit. Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with exptl. structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates phys. and biol. knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
- 16Abramson, J.; Adler, J.; Dunger, J.; Evans, R.; Green, T.; Pritzel, A.; Ronneberger, O.; Willmore, L.; Ballard, A. J.; Bambrick, J.; Bodenstein, S. W.; Evans, D. A.; Hung, C.-C.; O’Neill, M.; Reiman, D.; Tunyasuvunakool, K.; Wu, Z.; Žemgulytė, A.; Arvaniti, E.; Beattie, C.; Bertolli, O.; Bridgland, A.; Cherepanov, A.; Congreve, M.; Cowen-Rivers, A. I.; Cowie, A.; Figurnov, M.; Fuchs, F. B.; Gladman, H.; Jain, R.; Khan, Y. A.; Low, C. M. R.; Perlin, K.; Potapenko, A.; Savy, P.; Singh, S.; Stecula, A.; Thillaisundaram, A.; Tong, C.; Yakneen, S.; Zhong, E. D.; Zielinski, M.; Žídek, A.; Bapst, V.; Kohli, P.; Jaderberg, M.; Hassabis, D.; Jumper, J. M. Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3. Nature 2024, 630 (8016), 493– 500, DOI: 10.1038/s41586-024-07487-wThere is no corresponding record for this reference.
- 17HADDOCK3, Bonvin’s Lab , 2022. https://github.com/haddocking/haddock3.There is no corresponding record for this reference.
- 18Dominguez, C.; Boelens, R.; Bonvin, A. M. J. J. HADDOCK: A Protein–Protein Docking Approach Based on Biochemical or Biophysical Information. J. Am. Chem. Soc. 2003, 125 (7), 1731– 1737, DOI: 10.1021/ja026939x18https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXkvFGquw%253D%253D&md5=9cf6fd9c690afb0181579485ae263197HADDOCK: A protein-protein docking approach based on biochemical or biophysical informationDominguez, Cyril; Boelens, Rolf; Bonvin, Alexandre M. J. J.Journal of the American Chemical Society (2003), 125 (7), 1731-1737CODEN: JACSAT; ISSN:0002-7863. (American Chemical Society)The structure detn. of protein-protein complexes is a rather tedious and lengthy process, by both NMR and x-ray crystallog. Several methods based on docking to study protein complexes have also been well developed over the past few years. Most of these approaches are not driven by exptl. data but are based on a combination of energetics and shape complementarity. Here, we present an approach called HADDOCK (High Ambiguity Driven protein-protein Docking) that makes use of biochem. and/or biophys. interaction data such as chem. shift perturbation data resulting from NMR titrn. expts. or mutagenesis data. This information is introduced as Ambiguous Interaction Restraints (AIRs) to drive the docking process. An AIR is defined as an ambiguous distance between all residues shown to be involved in the interaction. The accuracy of our approach is demonstrated with three mol. complexes. For two of these complexes, for which both the complex and the free protein structures have been solved, NMR titrn. data were available. Mutagenesis data were used in the last example. In all cases, the best structures generated by HADDOCK, i.e., the structures with the lowest intermol. energies, were the closest to the published structure of the resp. complexes (within 2.0 Å backbone RMSD).
- 19Wu, A. M.; Singh, T.; Liu, J.-H.; Krzeminski, M.; Russwurm, R.; Siebert, H.-C.; Bonvin, A. M. J. J.; André, S.; Gabius, H.-J. Activity–Structure Correlations in Divergent Lectin Evolution: Fine Specificity of Chicken Galectin CG-14 and Computational Analysis of Flexible Ligand Docking for CG-14 and the Closely Related CG-16. Glycobiology 2007, 17 (2), 165– 184, DOI: 10.1093/glycob/cwl062There is no corresponding record for this reference.
- 20Krzeminski, M.; Singh, T.; André, S.; Lensch, M.; Wu, A. M.; Bonvin, A. M. J. J.; Gabius, H.-J. Human Galectin-3 (Mac-2 Antigen): Defining Molecular Switches of Affinity to Natural Glycoproteins, Structural and Dynamic Aspects of Glycan Binding by Flexible Ligand Docking and Putative Regulatory Sequences in the Proximal Promoter Region. Biochim. Biophys. Acta, Gen. Subj. 2011, 1810 (2), 150– 161, DOI: 10.1016/j.bbagen.2010.11.001There is no corresponding record for this reference.
- 21Berman, H. M. The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1), 235– 242, DOI: 10.1093/nar/28.1.23521https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3cXhvVKjt7w%253D&md5=227fb393f754be2be375ab727bfd05dcThe Protein Data BankBerman, Helen M.; Westbrook, John; Feng, Zukang; Gilliland, Gary; Bhat, T. N.; Weissig, Helge; Shindyalov, Ilya N.; Bourne, Philip E.Nucleic Acids Research (2000), 28 (1), 235-242CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The Protein Data Bank (PDB; http://www.rcsb.org/pdb/)is the single worldwide archive of structural data of biol. macromols. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
- 22Woods Group. Complex Carbohydrate Research Center; GLYCAM Web: University of Georgia, Athens, GA, 2023.There is no corresponding record for this reference.
- 23Kirschner, K. N.; Yongye, A. B.; Tschampel, S. M.; González-Outeiriño, J.; Daniels, C. R.; Foley, B. L.; Woods, R. J. GLYCAM06: A Generalizable Biomolecular Force Field. Carbohydrates. J. Comput. Chem. 2008, 29 (4), 622– 655, DOI: 10.1002/jcc.2082023https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD1c%252FotlSmtQ%253D%253D&md5=fd32f08518d8e3568627c36eaa958efbGLYCAM06: a generalizable biomolecular force field. CarbohydratesKirschner Karl N; Yongye Austin B; Tschampel Sarah M; Gonzalez-Outeirino Jorge; Daniels Charlisa R; Foley B Lachele; Woods Robert JJournal of computational chemistry (2008), 29 (4), 622-55 ISSN:.A new derivation of the GLYCAM06 force field, which removes its previous specificity for carbohydrates, and its dependency on the AMBER force field and parameters, is presented. All pertinent force field terms have been explicitly specified and so no default or generic parameters are employed. The new GLYCAM is no longer limited to any particular class of biomolecules, but is extendible to all molecular classes in the spirit of a small-molecule force field. The torsion terms in the present work were all derived from quantum mechanical data from a collection of minimal molecular fragments and related small molecules. For carbohydrates, there is now a single parameter set applicable to both alpha- and beta-anomers and to all monosaccharide ring sizes and conformations. We demonstrate that deriving dihedral parameters by fitting to QM data for internal rotational energy curves for representative small molecules generally leads to correct rotamer populations in molecular dynamics simulations, and that this approach removes the need for phase corrections in the dihedral terms. However, we note that there are cases where this approach is inadequate. Reported here are the basic components of the new force field as well as an illustration of its extension to carbohydrates. 
In addition to reproducing the gas-phase properties of an array of small test molecules, condensed-phase simulations employing GLYCAM06 are shown to reproduce rotamer populations for key small molecules and representative biopolymer building blocks in explicit water, as well as crystalline lattice properties, such as unit cell dimensions, and vibrational frequencies.
- 24Varki, A.; Cummings, R. D.; Aebi, M.; Packer, N. H.; Seeberger, P. H.; Esko, J. D.; Stanley, P.; Hart, G.; Darvill, A.; Kinoshita, T.; Prestegard, J. J.; Schnaar, R. L.; Freeze, H. H.; Marth, J. D.; Bertozzi, C. R.; Etzler, M. E.; Frank, M.; Vliegenthart, J. F.; Lütteke, T.; Perez, S.; Bolton, E.; Rudd, P.; Paulson, J.; Kanehisa, M.; Toukach, P.; Aoki-Kinoshita, K. F.; Dell, A.; Narimatsu, H.; York, W.; Taniguchi, N.; Kornfeld, S. Symbol Nomenclature for Graphical Representations of Glycans. Glycobiology 2015, 25 (12), 1323– 1324, DOI: 10.1093/glycob/cwv09124https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28Xhs1OqurzK&md5=6d906f3912a24468074ded519e6354c9Symbol nomenclature for graphical representations of glycansVarki, Ajit; Cummings, Richard D.; Aebi, Markus; Packer, Nicole H.; Seeberger, Peter H.; Esko, Jeffrey D.; Stanley, Pamela; Hart, Gerald; Darvill, Alan; Kinoshita, Taroh; Prestegard, James J.; Schnaar, Ronald L.; Freeze, Hudson H.; Marth, Jamey D.; Bertozzi, Carolyn R.; Etzler, Marilynn E.; Frank, Martin; Vliegenthart, Johannes F. G.; Lutteke, Thomas; Perez, Serge; Bolton, Evan; Rudd, Pauline; Paulson, James; Kanehisa, Minoru; Toukach, Philip; Aoki-Kinoshita, Kiyoko F.; Dell, Anne; Narimatsu, Hisashi; York, William; Taniguchi, Naoyuki; Kornfeld, StuartGlycobiology (2015), 25 (12), 1323-1324CODEN: GLYCE3; ISSN:0959-6658. (Oxford University Press)Symbol nomenclature for graphical representations of glycans.
- 25Neelamegham, S.; Aoki-Kinoshita, K.; Bolton, E.; Frank, M.; Lisacek, F.; Lütteke, T.; O’Boyle, N.; Packer, N. H.; Stanley, P.; Toukach, P.; Varki, A.; Woods, R. J. Updates to the Symbol Nomenclature for Glycans guidelines. Glycobiology 2019, 29 (9), 620– 624, DOI: 10.1093/glycob/cwz04525https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXps1Witrk%253D&md5=b7bb24d9cf3e0b8ff3edb88960186d77Updates to the symbol nomenclature for glycans guidelinesNeelamegham, Sriram; Aoki-Kinoshita, Kiyoko; Bolton, Evan; Frank, Martin; Lisacek, Frederique; Lutteke, Thomas; O'Boyle, Noel; Packer, Nicolle H.; Stanley, Pamela; Toukach, Philip; Varki, Ajit; Woods, Robert J.; Darvill, Alan; Dell, Anne; Henrissat, Bernard; Bertozzi, Carolyn; Hart, Gerald; Narimatsu, Hisashi; Freeze, Hudson; Yamada, Issaku; Paulson, James; Prestegard, James; Marth, Jamey; Vliegenthart, Jfg; Etzler, Marilynn; Aebi, Markus; Kanehisa, Minoru; Taniguchi, Naoyuki; Edwards, Nathan; Rudd, Pauline; Seeberger, Peter; Mazumder, Raja; Ranzinger, Rene; Cummings, Richard; Schnaar, Ronald; Perez, Serge; Kornfeld, Stuart; Kinoshita, Taroh; York, William; Knirel, YuriyGlycobiology (2019), 29 (9), 620-624CODEN: GLYCE3; ISSN:1460-2423. (Oxford University Press)A review. The Symbol Nomenclature for Glycans (SNFG) is a community-curated std. for the depiction of monosaccharides and complex glycans using various colored-coded, geometric shapes, along with defined text addns. It is hosted by the National Center for Biotechnol. Information (NCBI) at the NCBI-Glycans Page. Several changes have been made to the SNFG page in the past year to update the rules for depicting glycans using the SNFG, to include more examples of use, particularly for non-mammalian organisms, and to provide guidelines for the depiction of ambiguous glycan structures. This Glycoforum article summarizes these recent changes.
- 26de Vries, S. J.; van Dijk, A. D. J.; Krzeminski, M.; van Dijk, M.; Thureau, A.; Hsu, V.; Wassenaar, T.; Bonvin, A. M. J. J. HADDOCK versus HADDOCK: New Features and Performance of HADDOCK2.0 on the CAPRI Targets. Proteins: Struct., Funct., Bioinf. 2007, 69 (4), 726– 733, DOI: 10.1002/prot.21723There is no corresponding record for this reference.
- 27Jorgensen, W. L.; Tirado-Rives, J. The OPLS [Optimized Potentials for Liquid Simulations] Potential Functions for Proteins, Energy Minimizations for Crystals of Cyclic Peptides and Crambin. J. Am. Chem. Soc. 1988, 110 (6), 1657– 1666, DOI: 10.1021/ja00214a00127https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaL1cXht1yjt7Y%253D&md5=b6c901d8c295b3b37329a7faef527e12The OPLS [optimized potentials for liquid simulations] potential functions for proteins, energy minimizations for crystals of cyclic peptides and crambinJorgensen, William L.; Tirado-Rives, JulianJournal of the American Chemical Society (1988), 110 (6), 1657-66CODEN: JACSAT; ISSN:0002-7863.A complete set of intermol. potential functions was developed for use in computer simulations of proteins in their native environment. Parameters are reported for 25 peptide residues as well as the common neutral and charged terminal groups. The potential functions have the simple Coulomb plus Lennard-Jones form and are compatible with the widely used models for water, TIP4P, TIP3P, and SPC. The parameters were obtained and tested primarily in conjunction with Monte Carlo statistical mechanics simulations of 36 pure org. liqs. and numerous aq. solns. of org. ions representative of subunits in the side chains and backbones of proteins. Bond stretch, angle bend, and torsional terms were adopted from the AMBER united-atom force field. As reported here, further testing involved studies of conformational energy surfaces and optimizations of the crystal structures for 4 cyclic hexapeptides and a cyclic pentapeptide. The av. root mean square deviation from the x-ray structures of the crystals is only 0.17 Å for the at. positions and 3% for the unit cell vols. A more crit. test was then provided by performing energy minimizations for the complete crystal of the protein crambin, including 182 water mols. that were initially placed via a Monte Carlo simulation. 
The resultant root mean square deviation for the non-H atoms is still only 0.17 Å and the variation in the errors for charged, polar, and nonpolar residues is small. Significant improvement is apparent over the AMBER united-atom force field which was previously demonstrated to be superior to many alternatives.
- 28Fernández-Recio, J.; Totrov, M.; Abagyan, R. Identification of Protein–Protein Interaction Sites from Docking Energy Landscapes. J. Mol. Biol. 2004, 335 (3), 843– 865, DOI: 10.1016/j.jmb.2003.10.06928https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXpvVWltLg%253D&md5=8a7687b9ecf2f39625e0511d3ec71cc5Identification of Protein-Protein Interaction Sites from Docking Energy LandscapesFernandez-Recio, Juan; Totrov, Maxim; Abagyan, RubenJournal of Molecular Biology (2004), 335 (3), 843-865CODEN: JMOBAK; ISSN:0022-2836. (Elsevier)Protein recognition is one of the most challenging and intriguing problems in structural biol. Despite all the available structural, sequence and biophys. information about protein-protein complexes, the physico-chem. patterns, if any, that make a protein surface likely to be involved in protein-protein interactions, remain elusive. Here, we apply protein docking simulations and anal. of the interaction energy landscapes to identify protein-protein interaction sites. The new protocol for global docking based on multi-start global energy optimization of an all-atom model of the ligand, with detailed receptor potentials and at. solvation parameters optimized in a training set of 24 complexes, explores the conformational space around the whole receptor without restrictions. The ensembles of the rigid-body docking solns. generated by the simulations were subsequently used to project the docking energy landscapes onto the protein surfaces. We found that highly populated low-energy regions consistently corresponded to actual binding sites. The procedure was validated on a test set of 21 known protein-protein complexes not used in the training set. As much as 81% of the predicted high-propensity patch residues were located correctly in the native interfaces. 
This approach can guide the design of mutations on the surfaces of proteins, provide geometrical details of a possible interaction, and help to annotate protein surfaces in structural proteomics.
- 29Basciu, A.; Koukos, P. I.; Malloci, G.; Bonvin, A. M. J. J.; Vargiu, A. V. Coupling Enhanced Sampling of the Apo-Receptor with Template-Based Ligand Conformers Selection: Performance in Pose Prediction in the D3R Grand Challenge 4. J. Comput. Aided Mol. Des 2020, 34 (2), 149– 162, DOI: 10.1007/s10822-019-00244-629https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXitFGit73J&md5=af95398263b2f4c630dc50847867a518Coupling enhanced sampling of the apo-receptor with template-based ligand conformers selection: performance in pose prediction in the D3R Grand Challenge 4Basciu, Andrea; Koukos, Panagiotis I.; Malloci, Giuliano; Bonvin, Alexandre M. J. J.; Vargiu, Attilio V.Journal of Computer-Aided Molecular Design (2020), 34 (2), 149-162CODEN: JCADEQ; ISSN:0920-654X. (Springer)We report the performance of our newly introduced Ensemble Docking with Enhanced sampling of pocket Shape (EDES) protocol coupled to a template-based algorithm to generate near-native ligand conformations in the 2019 iteration of the Grand Challenge (GC4) organized by the D3R consortium. Using either AutoDock4.2 or HADDOCK2.2 docking programs (each software in two variants of the protocol) our method generated native-like poses among the top 5 submitted for evaluation for most of the 20 targets with similar performances. The protein selected for GC4 was the human beta-site amyloid precursor protein cleaving enzyme 1 (BACE-1), a transmembrane aspartic-acid protease. We identified at least one pose whose heavy-atoms RMSD was less than 2.5 Å from the native conformation for 16 (80%) and 17 (85%) of the 20 targets using AutoDock and HADDOCK, resp. 
Dissecting the possible sources of errors revealed that: (i) our EDES protocol (with minor modifications) was able to sample sub-angstrom conformations for all 20 protein targets, reproducing the correct conformation of the binding site within ∼ 1 Å RMSD; (ii) as already shown by some of us in GC3, even in the presence of near-native protein structures, a proper selection of ligand conformers is crucial for the success of ensemble-docking calcns. Importantly, our approach performed best among the protocols exploiting only structural information of the apo protein to generate conformations of the receptor for ensemble-docking calcns.
- 30Nilges, M. A calculation strategy for the structure determination of symmetric demers by 1H NMR. Proteins: Struct., Funct., Bioinf. 1993, 17 (3), 297– 309, DOI: 10.1002/prot.340170307There is no corresponding record for this reference.
- 31. Wallace, A. C.; Laskowski, R. A.; Thornton, J. M. LIGPLOT: A Program to Generate Schematic Diagrams of Protein-Ligand Interactions. Protein Eng., Des. Sel. 1995, 8 (2), 127–134, DOI: 10.1093/protein/8.2.127
- 32. McDonald, I. K.; Thornton, J. M. Satisfying Hydrogen Bonding Potential in Proteins. J. Mol. Biol. 1994, 238 (5), 777–793, DOI: 10.1006/jmbi.1994.1334
- 33. Méndez, R.; Leplae, R.; De Maria, L.; Wodak, S. J. Assessment of Blind Predictions of Protein–Protein Interactions: Current Status of Docking Methods. Proteins: Struct., Funct., Bioinf. 2003, 52 (1), 51–67, DOI: 10.1002/prot.10393
- 34. Giulini, M.; Menichetti, R.; Shell, M. S.; Potestio, R. An Information-Theory-Based Approach for Optimal Model Reduction of Biomolecules. J. Chem. Theory Comput. 2020, 16 (11), 6795–6813, DOI: 10.1021/acs.jctc.0c00676
- 35. Sokal, R. R.; Michener, C. D. A Statistical Method for Evaluating Systematic Relationships; University of Kansas Science Bulletin, 1958; Vol. 38, pp 1409–1438.
- 36. Charitou, V.; van Keulen, S. C.; Bonvin, A. M. J. J. Cyclization and Docking Protocol for Cyclic Peptide–Protein Modeling Using HADDOCK2.4. J. Chem. Theory Comput. 2022, 18 (6), 4027–4040, DOI: 10.1021/acs.jctc.2c00075
- 37. Buchanan, C. J.; Gaunt, B.; Harrison, P. J.; Yang, Y.; Liu, J.; Khan, A.; Giltrap, A. M.; Le Bas, A.; Ward, P. N.; Gupta, K.; Dumoux, M.; Tan, T. K.; Schimaski, L.; Daga, S.; Picchiotti, N.; Baldassarri, M.; Benetti, E.; Fallerini, C.; Fava, F.; Giliberti, A.; Koukos, P. I.; Davy, M. J.; Lakshminarayanan, A.; Xue, X.; Papadakis, G.; Deimel, L. P.; Casablancas-Antràs, V.; Claridge, T. D. W.; Bonvin, A. M. J. J.; Sattentau, Q. J.; Furini, S.; Gori, M.; Huo, J.; Owens, R. J.; Schaffitzel, C.; Berger, I.; Renieri, A.; Naismith, J. H.; Baldwin, A. J.; Davis, B. G. Pathogen-Sugar Interactions Revealed by Universal Saturation Transfer Analysis. Science 2022, 377 (6604), eabm3125, DOI: 10.1126/science.abm3125
- 38. Koukos, P. I.; Réau, M.; Bonvin, A. M. J. J. Shape-Restrained Modeling of Protein–Small-Molecule Complexes with High Ambiguity Driven DOCKing. J. Chem. Inf. Model. 2021, 61 (9), 4807–4818, DOI: 10.1021/acs.jcim.1c00796
- 39. Kerzmann, A.; Neumann, D.; Kohlbacher, O. SLICK – Scoring and Energy Functions for Protein–Carbohydrate Interactions. J. Chem. Inf. Model. 2006, 46 (4), 1635–1642, DOI: 10.1021/ci050422y
- 40. Kerzmann, A.; Fuhrmann, J.; Kohlbacher, O.; Neumann, D. BALLDock/SLICK: A New Method for Protein-Carbohydrate Docking. J. Chem. Inf. Model. 2008, 48 (8), 1616–1625, DOI: 10.1021/ci800103u
- 41. Ives, C.; Singh, O.; D’Andrea, S.; Fogarty, C.; Harbison, A.; Satheesan, A.; Tropea, B.; Fadda, E. Restoring Protein Glycosylation with GlycoShape. bioRxiv 2023, DOI: 10.1101/2023.12.11.571101
- 42. Meyer, P. A.; Socias, S.; Key, J.; Ransey, E.; Tjon, E. C.; Buschiazzo, A.; Lei, M.; Botka, C.; Withrow, J.; Neau, D.; Rajashankar, K.; Anderson, K. S.; Baxter, R. H.; Blacklow, S. C.; Boggon, T. J.; Bonvin, A. M. J. J.; Borek, D.; Brett, T. J.; Caflisch, A.; Chang, C.-I.; Chazin, W. J.; Corbett, K. D.; Cosgrove, M. S.; Crosson, S.; Dhe-Paganon, S.; Di Cera, E.; Drennan, C. L.; Eck, M. J.; Eichman, B. F.; Fan, Q. R.; Ferré-D’Amaré, A. R.; Christopher Fromme, J.; Garcia, K. C.; Gaudet, R.; Gong, P.; Harrison, S. C.; Heldwein, E. E.; Jia, Z.; Keenan, R. J.; Kruse, A. C.; Kvansakul, M.; McLellan, J. S.; Modis, Y.; Nam, Y.; Otwinowski, Z.; Pai, E. F.; Pereira, P. J. B.; Petosa, C.; Raman, C. S.; Rapoport, T. A.; Roll-Mecak, A.; Rosen, M. K.; Rudenko, G.; Schlessinger, J.; Schwartz, T. U.; Shamoo, Y.; Sondermann, H.; Tao, Y. J.; Tolia, N. H.; Tsodikov, O. V.; Westover, K. D.; Wu, H.; Foster, I.; Fraser, J. S.; Maia, F. R. N. C.; Gonen, T.; Kirchhausen, T.; Diederichs, K.; Crosas, M.; Sliz, P. Data Publication with the Structural Biology Data Grid Supports Live Analysis. Nat. Commun. 2016, 7 (1), 10882, DOI: 10.1038/ncomms10882
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.4c01372.
Data set used in this study (XLSX)
Details of the preparation of protein and glycan structures; description of HADDOCK3 modules used in this work; details of the glycan conformational sampling protocol; SNFG representation of the glycans; modules and parameters used for bound docking; modules and parameters used for unbound docking; glycan conformational sampling scenarios; example of HADDOCK models satisfying the quality thresholds; HADDOCK3’s performance on the bound data set; glycans’ RMSD to their bound conformations; impact of mdref on glycan conformations; impact of the clustering on glycans’ lowest RMSD; examples of glycan ensembles of conformations; HADDOCK3’s performance with the ensembles of glycans; torsion angle analysis of glycosidic linkages: a comparison to HADDOCK flexible refinement models; and torsion angle analysis of glycosidic linkages: a comparison to HADDOCK short molecular dynamics refinement models (PDF)