Training Neural Network Models Using Molecular Dynamics Simulation Results to Efficiently Predict Cyclic Hexapeptide Structural Ensembles
- Tiffani Hui, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Marc L. Descoteaux, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Jiayuan Miao, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Yu-Shan Lin* (Email: [email protected]), Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
Abstract

Cyclic peptides have emerged as a promising class of therapeutics. However, their de novo design remains challenging, and many cyclic peptide drugs are simply natural products or their derivatives. Most cyclic peptides, including the current cyclic peptide drugs, adopt multiple conformations in water. The ability to characterize cyclic peptide structural ensembles would greatly aid their rational design. In a previous pioneering study, our group demonstrated that using molecular dynamics results to train machine learning models can efficiently predict structural ensembles of cyclic pentapeptides. Using this method, which was termed StrEAMM (Structural Ensembles Achieved by Molecular Dynamics and Machine Learning), linear regression models were able to predict the structural ensembles for an independent test set with $R^2$ = 0.94 between the predicted populations for specific structures and the observed populations in molecular dynamics simulations for cyclic pentapeptides. An underlying assumption in these StrEAMM models is that cyclic peptide structural preferences are predominantly influenced by neighboring interactions, namely, interactions between (1,2) and (1,3) residues. Here we demonstrate that for larger cyclic peptides such as cyclic hexapeptides, linear regression models including only (1,2) and (1,3) interactions fail to produce satisfactory predictions ($R^2$ = 0.47); further inclusion of (1,4) interactions leads to moderate improvements ($R^2$ = 0.75). We show that when using convolutional neural networks and graph neural networks to incorporate complex nonlinear interaction patterns, we can achieve $R^2$ = 0.97 and $R^2$ = 0.91 for cyclic pentapeptides and hexapeptides, respectively.
This publication is licensed under
License Summary*
You are free to share (copy and redistribute) this article in any medium or format within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
Non-Commercial (NC): Only non-commercial uses of the work are permitted.
No Derivatives (ND): Derivative works may be created for non-commercial purposes, but sharing is prohibited.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
SPECIAL ISSUE
This article is part of the
1. Introduction
Figure 1. StrEAMM linear regression models are constructed using contributions from neighboring interactions. For example, the natural logarithm of the population of cyclic pentapeptide cyclo-($X_1X_2X_3X_4X_5$) adopting a specific structure $S_1S_2S_3S_4S_5$ is the sum of the interaction weights for each (1,2) and (1,3) neighbor present. $X_i$ is an amino acid; $S_i$ is a structural digit that represents a region of (ϕ, ψ) space (see Figure 2); $w_{S_iS_{i+1}}^{X_iX_{i+1}}$ is the weight for residues $X_iX_{i+1}$ adopting structure $S_iS_{i+1}$; $w_{S_iS_{i+1}S_{i+2}}^{X_i\_X_{i+2}}$ is the weight for residues $X_i$ and $X_{i+2}$ adopting substructure $S_iS_{i+1}S_{i+2}$; and $w_Q^{X_1X_2X_3X_4X_5}$ is related to the partition function, $Q$, for sequence $X_1X_2X_3X_4X_5$ and ensures that all the structures’ populations sum to 1 for a given sequence. This equation was first proposed by Miao et al. (31).
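The additive form described in the Figure 1 caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weight dictionaries `w12` and `w13`, the two-letter structural alphabet `"AB"`, the function names, and all numerical values are hypothetical and are not the fitted StrEAMM parameters. Normalizing over an enumerated list of candidate structures stands in for the partition-function term.

```python
import math
from itertools import product

def log_score(seq, struct, w12, w13):
    """Unnormalized ln(population): sum of (1,2) and (1,3) weights around the ring."""
    n = len(seq)
    total = 0.0
    for i in range(n):
        j, k = (i + 1) % n, (i + 2) % n  # cyclic (1,2) and (1,3) neighbors
        # (1,2) term: residues X_i X_{i+1} adopting S_i S_{i+1}
        total += w12.get((seq[i] + seq[j], struct[i] + struct[j]), 0.0)
        # (1,3) term: residues X_i and X_{i+2} adopting S_i S_{i+1} S_{i+2}
        total += w13.get((seq[i] + "_" + seq[k], struct[i] + struct[j] + struct[k]), 0.0)
    return total

def predict_populations(seq, structures, w12, w13):
    """Normalize exp(score) over the candidate structures so populations sum to 1."""
    scores = {s: log_score(seq, s, w12, w13) for s in structures}
    # Subtracting ln Q plays the role of the sequence-specific normalization weight.
    log_q = math.log(sum(math.exp(v) for v in scores.values()))
    return {s: math.exp(v - log_q) for s, v in scores.items()}

# Toy example: hypothetical structural digits "A" and "B", made-up weights.
structures = ["".join(s) for s in product("AB", repeat=5)]
w12 = {("GG", "AA"): 0.5}
w13 = {("G_G", "ABA"): -0.3}
populations = predict_populations("GGGGG", structures, w12, w13)
```

In this toy setup the all-"A" structure collects five favorable (1,2) weights and no penalties, so it receives the largest population.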
Figure 2. Structural binning maps divide the (ϕ, ψ) space of cyclic pentapeptides into 10 regions and the (ϕ, ψ) space of cyclic hexapeptides into six regions. (A) The (ϕ, ψ) population density of cyclo-(GGGGG) and (B) the resulting binning map when all the grid points are assigned to their closest centroid to form the final binning map. (C) The (ϕ, ψ) population density of cyclo-(GGGGGG) and (D) the resulting binning map using the same binning protocol.
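The closest-centroid assignment mentioned in the Figure 2 caption might look like the sketch below. The centroid coordinates and the use of a periodic Euclidean distance in degrees are assumptions made for illustration; the caption does not specify the distance metric or how the centroids were obtained.

```python
import numpy as np

def periodic_delta(a, b):
    """Smallest signed angular difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def assign_bins(phi_psi, centroids):
    """Return the index of the closest centroid for each (phi, psi) point."""
    pts = np.asarray(phi_psi, dtype=float)[:, None, :]    # shape (n_points, 1, 2)
    cen = np.asarray(centroids, dtype=float)[None, :, :]  # shape (1, n_centroids, 2)
    d2 = (periodic_delta(pts, cen) ** 2).sum(axis=-1)     # squared periodic distance
    return d2.argmin(axis=1)

# Hypothetical centroids and grid points, in degrees.
centroids = [(-75.0, 150.0), (175.0, -175.0), (-90.0, 0.0)]
grid = [(-80.0, 140.0), (-175.0, 175.0), (-100.0, 10.0)]
labels = assign_bins(grid, centroids)
```

The second grid point sits across the ±180° boundary from the second centroid, so it is assigned correctly only because the angular differences are wrapped.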
2. Methods
2.1. Datasets
2.2. Molecular Dynamics Simulations
2.3. Structural Analysis
2.4. Linear Regression Models
2.5. Neural Network Models
2.5.1. Amino Acid Feature Representation and Data Augmentation for Neural Networks
2.5.2. StrEAMM Convolutional Neural Networks
Figure 3. Convolutional neural network and graph neural network architectures. (A) Cartoon example of the cyclic pentapeptide sequence cyclo-(sasFr) represented using a matrix with N = 5 columns and 2048 rows, where the 2048 bits come from the fingerprint encoding of each amino acid. (B) The representations of cyclo-(sasFr) and cyclo-(asFrs) are concatenated such that the (1,2) neighboring residues are spatially close together, enabling the 1D convolutional filter (blue box representing a (4096 × 1) vector of learnable model parameters) to encompass all the fingerprint features that define residues s and a. (C) The top graph represents a cyclic pentapeptide, where the (1,2) edge types are denoted with blue arrows and the (1,3) edge types are denoted with green arrows. The bottom graph represents a cyclic hexapeptide, where in addition to the (1,2) and (1,3) edge types we also can include (1,4) edge types, as denoted with purple arrows. Note: The purple (1,4) edges appear to be double-arrowed, but there are really two unique single-arrowed edges. For example, the forward edge in dark purple that starts at ser1 would be directed to Phe4, and the forward edge in dark purple that starts at Phe4 would be directed to ser1.
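The two input representations described in the Figure 3 caption can be illustrated with NumPy. This is a sketch under assumptions: random bits stand in for real amino acid fingerprints, the function names are hypothetical, and only forward-directed edges are generated because the caption does not state which other edge directions the models use.

```python
import numpy as np

def stack_with_neighbor(features, shift=1):
    """features: (n_bits, N) matrix with one column per residue.
    Returns a (2 * n_bits, N) matrix whose column i holds residue i stacked on
    residue i + shift (cyclically), so one filter column spans both residues."""
    shifted = np.roll(features, -shift, axis=1)  # column i now holds residue i + shift
    return np.concatenate([features, shifted], axis=0)

def forward_edges(n_residues, separations=(1, 2)):
    """Directed edges (source, target, type) around a ring of n_residues.
    A separation d corresponds to the (1, d + 1) edge type."""
    edges = []
    for d in separations:
        for i in range(n_residues):
            edges.append((i, (i + d) % n_residues, f"(1,{d + 1})"))
    return edges

# Placeholder fingerprints: 2048 random bits for each of 5 residues.
rng = np.random.default_rng(0)
features = rng.integers(0, 2, size=(2048, 5))
stacked = stack_with_neighbor(features)               # shape (4096, 5)

penta_edges = forward_edges(5, separations=(1, 2))    # (1,2) and (1,3) edges
hexa_edges = forward_edges(6, separations=(1, 2, 3))  # adds (1,4) edges
```

For the hexapeptide, the (1,4) edges starting at residues 0 and 3 point at each other, which is why they look double-arrowed in a drawing even though they are two distinct directed edges.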
2.5.3. StrEAMM Graph Neural Networks (GNNs)
3. Results and Discussion
3.1. StrEAMM Linear Regression (1,2)+(1,3) Models Cannot Predict the Structural Ensembles of Cyclic Hexapeptides
Figure 4. Performance of StrEAMM linear regression models on the training dataset (top) and test dataset (bottom) for (A) cyclic pentapeptides and (B, C) cyclic hexapeptides. (A) Performance of StrEAMM linear regression model on cyclic pentapeptides using (1,2) and (1,3) interactions. (B) Performance of StrEAMM linear regression model on cyclic hexapeptides using (1,2) and (1,3) interactions. (C) Performance of StrEAMM linear regression model on cyclic hexapeptides using (1,2), (1,3), and (1,4) interactions. The black dashed line represents y = x. $R^2$ is the coefficient of determination. WE is the weighted error, computed from $p_{i,\mathrm{observed}}$ and $p_{i,\mathrm{predicted}}$, the populations observed in MD simulation and predicted by StrEAMM, respectively. Each point on the plot represents the predicted versus the observed percent population in MD for a structure in the structural ensemble of a cyclic peptide. All the structures in the structural ensembles for all the cyclic peptides in the training or test dataset with a predicted or observed percent population in MD of >1% are plotted.
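The point selection and the coefficient of determination described in the Figure 4 caption can be sketched as follows. The population values are invented for illustration, and the 1 − SS_res/SS_tot form of the coefficient of determination is an assumption about how it was evaluated. The weighted error is left out because its defining equation is not reproduced in the caption.

```python
import numpy as np

def select_plotted(observed_pct, predicted_pct, threshold=1.0):
    """Keep structures whose observed OR predicted percent population exceeds threshold."""
    obs = np.asarray(observed_pct, dtype=float)
    pred = np.asarray(predicted_pct, dtype=float)
    mask = (obs > threshold) | (pred > threshold)
    return obs[mask], pred[mask]

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Made-up percent populations for five structures of one peptide.
observed_pct = [40.0, 30.0, 0.5, 2.0, 0.2]
predicted_pct = [38.0, 33.0, 1.5, 0.4, 0.3]
obs_kept, pred_kept = select_plotted(observed_pct, predicted_pct)
r2 = r_squared(obs_kept, pred_kept)
```

Only the last structure is dropped here, since neither its observed nor its predicted population exceeds 1%.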
3.2. StrEAMM Neural Network Models Can Predict Cyclic Pentapeptide Structural Ensembles
3.2.1. StrEAMM CNN Models for Cyclic Pentapeptides
Figure 5. Performance of (A–C) StrEAMM CNN models and (D–F) StrEAMM GNN models on the test dataset for (A, D) cyclic pentapeptides and the models incorporating (1,2) and (1,3) filters/edges, (B, E) cyclic hexapeptides and the models incorporating (1,2) and (1,3) filters/edges, and (C, F) cyclic hexapeptides and the models incorporating (1,2), (1,3), and (1,4) filters/edges. See Figure S7 for the model performances on the corresponding training datasets. All the structures in the structural ensembles for all the cyclic peptides in the training or test dataset with a predicted or observed percent population in MD of >1% are plotted.
Figure 6. Comparison of the StrEAMM linear regression and neural network models’ performances on the (A) cyclic pentapeptide and (B) cyclic hexapeptide test datasets. The coefficient of determination, $R^2$, and weighted error, WE, are shown for each model (the linear regression in red with diagonal slash pattern, CNN in green with dotted pattern, and GNN in blue with vertical line pattern) including different neighboring interactions.
3.2.2. StrEAMM GNN Models for Cyclic Pentapeptides
3.3. StrEAMM Neural Network Models Can Predict Cyclic Hexapeptide Structural Ensembles
3.3.1. StrEAMM CNN Models for Cyclic Hexapeptides
3.3.2. StrEAMM GNN Models for Cyclic Hexapeptides
3.4. Using Alternative Binning Maps Does Not Improve the StrEAMM Model Performance Trained on Either Cyclic Pentapeptides or Cyclic Hexapeptides
3.5. StrEAMM Neural Network Models Can Predict Structural Ensembles of Cyclic Peptide Sequences Containing Amino Acids That Were Absent in the Training Dataset Sequences
Figure 7. The StrEAMM neural network models can predict structural ensembles for cyclic pentapeptides and cyclic hexapeptides that contain amino acids that were absent in the training dataset. (A–D) Performances of the (A, B) CNN and (C, D) GNN models on cyclic pentapeptide and cyclic hexapeptide datasets containing sequences composed of the 37 amino acid library, when the models were trained using only sequences composed of the 15 amino acid library. (E–H) Performances of the (E, F) CNN and (G, H) GNN models on cyclic pentapeptide and cyclic hexapeptide datasets containing sequences composed of the 37 amino acid library, when the models were trained using sequences composed of the 15 amino acid library and “booster” sequences composed of the 37 amino acid library.
4. Conclusions and Future Directions
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.3c00154.
Lists of sequences in the training and test datasets; structural binning maps 2 and 3; hyperparameter tuning schemes; neural network model performances on the training datasets for cyclic pentapeptides and cyclic hexapeptides; performances of linear regression and neural network models including only (1,2) or only (1,3) interactions on training and test datasets for cyclic pentapeptides; performances of linear regression and neural network models including only (1,2), only (1,3), or only (1,4) interactions on training and test datasets for cyclic hexapeptides; comparison of model performances using different binning maps (PDF)
Acknowledgments
This work was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award R01GM124160 (PI: Y.-S.L.). We are grateful for the support from the Tufts Technology Services and for the computing resources at the Tufts Research Cluster. Initial structures for the simulations were built using UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH Grant P41-GM103311.
References
This article references 65 other publications.
- 1Smith, M. C.; Gestwicki, J. E. Features of protein-protein interactions that translate into potent inhibitors: topology, surface area and affinity. Expert Rev. Mol. Med. 2012, 14, e16 DOI: 10.1017/erm.2012.10Google Scholar1https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhsFOjtL7L&md5=a24bcc721f4c6ebfc8fae029c79f6426Features of protein-protein interactions that translate into potent inhibitors: topology, surface area and affinitySmith, Matthew C.; Gestwicki, Jason E.Expert Reviews in Molecular Medicine (2012), 14 (), e16/1-e16/20CODEN: ERMMFS; ISSN:1462-3994. (Cambridge University Press)A review. Protein-protein interactions (PPIs) control the assembly of multi-protein complexes and, thus, these contacts have enormous potential as drug targets. However, the field has produced a mix of both exciting success stories and frustrating challenges. Here, we review known examples and explore how the phys. features of a PPI, such as its affinity, hotspots, off-rates, buried surface area and topol., might influence the chances of success in finding inhibitors. This anal. suggests that concise, tight binding PPIs are most amenable to inhibition. However, it is also clear that emerging tech. methods are expanding the repertoire of 'druggable' protein contacts and increasing the odds against difficult targets. In particular, natural product-like compd. libraries, high throughput screens specifically designed for PPIs and approaches that favor discovery of allosteric inhibitors appear to be attractive routes. The first group of PPI inhibitors has entered clin. trials, further motivating the need to understand the challenges and opportunities in pursuing these types of targets.
- 2Morelli, X.; Bourgeas, R.; Roche, P. Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I). Curr. Opin. Chem. Biol. 2011, 15, 475– 481, DOI: 10.1016/j.cbpa.2011.05.024Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXpvFCltro%253D&md5=efe168c18c73e16044583471b7927751Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I)Morelli, Xavier; Bourgeas, Raphael; Roche, PhilippeCurrent Opinion in Chemical Biology (2011), 15 (4), 475-481CODEN: COCBF4; ISSN:1367-5931. (Elsevier B.V.)A review. Worldwide research efforts have driven recent pharmaceutical successes, and consequently, the emerging role of Protein-Protein Interactions (PPIs) as drug targets has finally been widely embraced by the scientific community. Inhibitors of these Protein-Protein Interactions (2P2Is or i-PPIs) are likely to represent the next generation of highly innovative drugs that will reach the market over the next decade. This review describes up-to-date knowledge on this particular chem. space, with a specific emphasis on a subset of this ensemble. We also address current structural knowledge regarding both protein-protein and protein-inhibitor complexes, i.e., the 2P2I database. Finally, ligand efficiency analyses permit us to relate potency to size and polarity and to discuss the need to co-develop nanoparticle drug delivery systems.
- 3Rezai, T.; Bock, J. E.; Zhou, M. V.; Kalyanaraman, C.; Lokey, R. S.; Jacobson, M. P. Conformational Flexibility, Internal Hydrogen Bonding, and Passive Membrane Permeability: Successful in Silico Prediction of the Relative Permeabilities of Cyclic Peptides. J. Am. Chem. Soc. 2006, 128, 14073– 14080, DOI: 10.1021/ja063076pGoogle Scholar3https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD28XhtVCmur3I&md5=78d4b4082e6a72b288d4b6dac933ca35Conformational Flexibility, Internal Hydrogen Bonding, and Passive Membrane Permeability: Successful in Silico Prediction of the Relative Permeabilities of Cyclic PeptidesRezai, Taha; Bock, Jonathan E.; Zhou, Mai V.; Kalyanaraman, Chakrapani; Lokey, R. Scott; Jacobson, Matthew P.Journal of the American Chemical Society (2006), 128 (43), 14073-14080CODEN: JACSAT; ISSN:0002-7863. (American Chemical Society)We report an atomistic phys. model for the passive membrane permeability of cyclic peptides. The computational modeling was performed in advance of the expts. and did not involve the use of "training data". The model explicitly treats the conformational flexibility of the peptides by extensive conformational sampling in low (membrane) and high (water) dielec. environments. The passive membrane permeabilities of 11 cyclic peptides were obtained exptl. using a parallel artificial membrane permeability assay (PAMPA) and showed a linear correlation with the computational results with R2 = 0.96. In general, the results support the hypothesis, already well established in the literature, that the ability to form internal hydrogen bonds is crit. for passive membrane permeability and can be the distinguishing factor among closely related compds., such as those studied here. However, we have found that the no. of internal hydrogen bonds that can form in the membrane and the solvent-exposed polar surface area correlate more poorly with PAMPA permeability than our model, which quant. ests. 
the solvation free energy losses upon moving from high-dielec. water to the low-dielec. interior of a membrane.
- 4Dougherty, P. G.; Sahni, A.; Pei, D. Understanding Cell Penetration of Cyclic Peptides. Chem. Rev. 2019, 119, 10241– 10287, DOI: 10.1021/acs.chemrev.9b00008Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXptlCksLs%253D&md5=65f68a9b69f3f7f33641b6a91835b1ddUnderstanding Cell Penetration of Cyclic PeptidesDougherty, Patrick G.; Sahni, Ashweta; Pei, DehuaChemical Reviews (Washington, DC, United States) (2019), 119 (17), 10241-10287CODEN: CHREAY; ISSN:0009-2665. (American Chemical Society)A review. Approx. 75% of all disease-relevant human proteins, including those involved in intracellular protein-protein interactions (PPIs), are undruggable with the current drug modalities (i.e., small mols. and biologics). Macrocyclic peptides provide a potential soln. to these undruggable targets because their larger sizes (relative to conventional small mols.) endow them the capability of binding to flat PPI interfaces with antibody-like affinity and specificity. Powerful combinatorial library technologies have been developed to routinely identify cyclic peptides as potent, specific inhibitors against proteins including PPI targets. However, with the exception of a very small set of sequences, the vast majority of cyclic peptides are impermeable to the cell membrane, preventing their application against intracellular targets. This Review examines common structural features that render most cyclic peptides membrane impermeable, as well as the unique features that allow the minority of sequences to enter the cell interior by passive diffusion, endocytosis/endosomal escape, or other mechanisms. We also present the current state of knowledge about the mol. mechanisms of cell penetration, the various strategies for designing cell-permeable, biol. active cyclic peptides against intracellular targets, and the assay methods available to quantify their cell-permeability.
- 5Zhang, H.; Chen, S. Cyclic peptide drugs approved in the last two decades (2001–2021). RSC Chem. Biol. 2022, 3, 18– 31, DOI: 10.1039/D1CB00154JGoogle Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB2M7otVertw%253D%253D&md5=b5c064979a6ed1909cf621cc6bcb225aCyclic peptide drugs approved in the last two decades (2001-2021)Zhang Huiya; Chen ShiyuRSC chemical biology (2022), 3 (1), 18-31 ISSN:.In contrast to the major families of small molecules and antibodies, cyclic peptides, as a family of synthesizable macromolecules, have distinct biochemical and therapeutic properties for pharmaceutical applications. Cyclic peptide-based drugs have increasingly been developed in the past two decades, confirming the common perception that cyclic peptides have high binding affinities and low metabolic toxicity as antibodies, good stability and ease of manufacture as small molecules. Natural peptides were the major source of cyclic peptide drugs in the last century, and cyclic peptides derived from novel screening and cyclization strategies are the new source. In this review, we will discuss and summarize 18 cyclic peptides approved for clinical use in the past two decades to provide a better understanding of cyclic peptide development and to inspire new perspectives. The purpose of the present review is to promote efforts to resolve the challenges in the development of cyclic peptide drugs that are more effective.
- 6Zorzi, A.; Deyle, K.; Heinis, C. Cyclic peptide therapeutics: past, present and future. Curr. Opin. Chem. Biol. 2017, 38, 24– 29, DOI: 10.1016/j.cbpa.2017.02.006Google Scholar6https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXjsFGksLg%253D&md5=4b9581365b92b649dad8c7b22c5bd9c3Cyclic peptide therapeutics: past, present and futureZorzi, Alessandro; Deyle, Kaycie; Heinis, ChristianCurrent Opinion in Chemical Biology (2017), 38 (), 24-29CODEN: COCBF4; ISSN:1367-5931. (Elsevier B.V.)Cyclic peptides combine several favorable properties such as good binding affinity, target selectivity and low toxicity that make them an attractive modality for the development of therapeutics. Over 40 cyclic peptide drugs are currently in clin. use and around one new cyclic peptide drug enters the market every year on av. The vast majority of clin. approved cyclic peptides are derived from natural products, such as antimicrobials or human peptide hormones. New powerful techniques based on rational design and in vitro evolution have enabled the de novo development of cyclic peptide ligands to targets for which nature does not offer solns. A look at the cyclic peptides currently under clin. evaluation shows that several have been developed using such techniques. This new source for cyclic peptide ligands introduces a freshness to the field, and it is likely that de novo developed cyclic peptides will be in clin. use in the near future.
- 7Nguyen, Q. N. N.; Schwochert, J.; Tantillo, D. J.; Lokey, R. S. Using 1H and 13C NMR chemical shifts to determine cyclic peptide conformations: a combined molecular dynamics and quantum mechanics approach. Phys. Chem. Chem. Phys. 2018, 20, 14003– 14012, DOI: 10.1039/C8CP01616JGoogle Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXosl2kt7w%253D&md5=7bb116300b2306d765a4aa6742980c85Using 1H and 13C NMR chemical shifts to determine cyclic peptide conformations: a combined molecular dynamics and quantum mechanics approachNguyen, Q. Nhu N.; Schwochert, Joshua; Tantillo, Dean J.; Lokey, R. ScottPhysical Chemistry Chemical Physics (2018), 20 (20), 14003-14012CODEN: PPCPFQ; ISSN:1463-9076. (Royal Society of Chemistry)Solving conformations of cyclic peptides can provide insight into structure-activity and structure-property relationships, which can help in the design of compds. with improved bioactivity and/or ADME characteristics. The most common approaches for detg. the structures of cyclic peptides are based on NMR-derived distance restraints obtained from NOESY or ROESY cross-peak intensities, and 3J-based dihedral restraints using the Karplus relationship. Unfortunately, these observables are often too weak, sparse, or degenerate to provide unequivocal, high-confidence soln. structures, prompting us to investigate an alternative approach that relies only on 1H and 13C chem. shifts as exptl. observables. This method, which we call conformational anal. from NMR and d.-functional prediction of low-energy ensembles (CANDLE), uses mol. dynamics (MD) simulations to generate conformer families and d. functional theory (DFT) calcns. to predict their 1H and 13C chem. shifts. Iterative conformer searches and DFT energy calcns. on a cyclic peptide-peptoid hybrid yielded Boltzmann ensembles whose predicted chem. shifts matched the exptl. values better than any single conformer. 
For these compds., CANDLE outperformed the classic NOE- and 3J-coupling-based approach by disambiguating similar β-turn types and also enabled the structural elucidation of the minor conformer. Through the use of chem. shifts, in conjunction with DFT and MD calcns., CANDLE can help illuminate conformational ensembles of cyclic peptides in soln.
- 8Ball, K. A.; Wemmer, D. E.; Head-Gordon, T. Comparison of Structure Determination Methods for Intrinsically Disordered Amyloid-β Peptides. J. Phys. Chem. B 2014, 118, 6405– 6416, DOI: 10.1021/jp410275yGoogle Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXlslCjug%253D%253D&md5=9bf46beb4da6f5c32a56b3230c24aac9Comparison of Structure Determination Methods for Intrinsically Disordered Amyloid-β PeptidesBall, K. Aurelia; Wemmer, David E.; Head-Gordon, TeresaJournal of Physical Chemistry B (2014), 118 (24), 6405-6416CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)Intrinsically disordered proteins (IDPs) represent a new frontier in structural biol. since the primary characteristic of IDPs is that structures need to be characterized as diverse ensembles of conformational substates. We compare two general but very different ways of combining NMR spectroscopy with theor. methods to derive structural ensembles for the disease IDPs amyloid-β 1-40 and amyloid-β 1-42, which are assocd. with Alzheimer's Disease. We analyze the performance of de novo mol. dynamics and knowledge-based approaches for generating structural ensembles by assessing their ability to reproduce a range of NMR exptl. observables. In addn. to the comparison of computational methods, we also evaluate the relative value of different types of NMR data for refining or validating the IDP structural ensembles for these important disease peptides.
- 9Fisher, C. K.; Stultz, C. M. Constructing ensembles for intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2011, 21, 426– 431, DOI: 10.1016/j.sbi.2011.04.001Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXnsVyitb4%253D&md5=81a2affd275f291c133e2839543a969bConstructing ensembles for intrinsically disordered proteinsFisher, Charles K.; Stultz, Collin M.Current Opinion in Structural Biology (2011), 21 (3), 426-431CODEN: COSBEF; ISSN:0959-440X. (Elsevier Ltd.)A review. The relatively flat energy landscapes assocd. with intrinsically disordered proteins makes modeling these systems esp. problematic. A comprehensive model for these proteins requires one to build an ensemble consisting of a finite collection of structures, and their corresponding relative stabilities, which adequately capture the range of accessible states of the protein. In this regard, methods that use computational techniques to interpret exptl. data in terms of such ensembles are an essential part of the modeling process. In this review, we critically assess the advantages and limitations of current techniques and discuss new methods for the validation of these ensembles.
- 10Cicero, D. O.; Barbato, G.; Bazzo, R. NMR Analysis of Molecular Flexibility in Solution: A New Method for the Study of Complex Distributions of Rapidly Exchanging Conformations. Application to a 13-Residue Peptide with an 8-Residue Loop. J. Am. Chem. Soc. 1995, 117, 1027– 1033, DOI: 10.1021/ja00108a019Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2MXjtVyrt7s%253D&md5=299cae2efcd4cc43aef0e0c5701579deNMR Analysis of Molecular Flexibility in Solution: A New Method for the Study of Complex Distributions of Rapidly Exchanging Conformations. Application to a 13-Residue Peptide with an 8-Residue LoopCicero, D. O.; Barbato, G.; Bazzo, R.Journal of the American Chemical Society (1995), 117 (3), 1027-33CODEN: JACSAT; ISSN:0002-7863. (American Chemical Society)A new methodol., called NAMFIS (NMR anal. of mol. flexibility in soln.), is described for the anal. of flexible mols. in soln. Once a complete set of conformations is generated and is able to encompass all the possible states of the mol. that are not a priori incompatible with the available exptl. NMR evidence, NAMFIS allows for the examn. of the occurrence and relevance of arbitrary elements of secondary structure, even when extensive conformational averaging defies a detailed exptl. characterization. The anal. is based on the available exptl. NMR data. The method is demonstrated in the conformational anal. of peptide I (R = Lys-Aib-Lys-OH; Aib = α-aminoisobutyric acid, Mhe = 2-amino-6-mercaptohexanoic acid).
- (11) Ge, Y.; Zhang, S.; Erdelyi, M.; Voelz, V. A. Solution-State Preorganization of Cyclic β-Hairpin Ligands Determines Binding Mechanism and Affinities for MDM2. J. Chem. Inf. Model. 2021, 61, 2353–2367, DOI: 10.1021/acs.jcim.1c00029
- (12) Slough, D. P.; McHugh, S. M.; Cummings, A. E.; Dai, P.; Pentelute, B. L.; Kritzer, J. A.; Lin, Y.-S. Designing Well-Structured Cyclic Pentapeptides Based on Sequence-Structure Relationships. J. Phys. Chem. B 2018, 122, 3908–3919, DOI: 10.1021/acs.jpcb.8b01747
- (13) Cummings, A. E.; Miao, J.; Slough, D. P.; McHugh, S. M.; Kritzer, J. A.; Lin, Y.-S. β-Branched Amino Acids Stabilize Specific Conformations of Cyclic Hexapeptides. Biophys. J. 2019, 116, 433–444, DOI: 10.1016/j.bpj.2018.12.015
- (14) Wakefield, A. E.; Wuest, W. M.; Voelz, V. A. Molecular Simulation of Conformational Pre-Organization in Cyclic RGD Peptides. J. Chem. Inf. Model. 2015, 55, 806–813, DOI: 10.1021/ci500768u
- (15) Damjanovic, J.; Miao, J.; Huang, H.; Lin, Y.-S. Elucidating Solution Structures of Cyclic Peptides Using Molecular Dynamics Simulations. Chem. Rev. 2021, 121, 2292–2324, DOI: 10.1021/acs.chemrev.0c01087
- (16) Ono, S.; Naylor, M. R.; Townsend, C. E.; Okumura, C.; Okada, O.; Lokey, R. S. Conformation and Permeability: Cyclic Hexapeptide Diastereomers. J. Chem. Inf. Model. 2019, 59, 2952–2963, DOI: 10.1021/acs.jcim.9b00217
- (17) Wang, S.; König, G.; Roth, H.-J.; Fouché, M.; Rodde, S.; Riniker, S. Effect of Flexibility, Lipophilicity, and the Location of Polar Residues on the Passive Membrane Permeability of a Series of Cyclic Decapeptides. J. Med. Chem. 2021, 64, 12761–12773, DOI: 10.1021/acs.jmedchem.1c00775
- (18) El Tayar, N.; Mark, A. E.; Vallat, P.; Brunne, R. M.; Testa, B.; van Gunsteren, W. F. Solvent-dependent conformation and hydrogen-bonding capacity of cyclosporin A: evidence from partition coefficients and molecular dynamics simulations. J. Med. Chem. 1993, 36, 3757–3764, DOI: 10.1021/jm00076a002
- (19) Merten, C.; Li, F.; Bravo-Rodriguez, K.; Sanchez-Garcia, E.; Xu, Y.; Sander, W. Solvent-induced conformational changes in cyclic peptides: a vibrational circular dichroism study. Phys. Chem. Chem. Phys. 2014, 16, 5627–5633, DOI: 10.1039/C3CP55018D
- (20) Quartararo, J. S.; Eshelman, M. R.; Peraro, L.; Yu, H.; Baleja, J. D.; Lin, Y.-S.; Kritzer, J. A. A bicyclic peptide scaffold promotes phosphotyrosine mimicry and cellular uptake. Bioorg. Med. Chem. 2014, 22, 6387–6391, DOI: 10.1016/j.bmc.2014.09.050
- (21) Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G. R.; Wang, J.; Cong, Q.; Kinch, L. N.; Schaeffer, R. D.; Millán, C.; Park, H.; Adams, C.; Glassman, C. R.; DeGiovanni, A.; Pereira, J. H.; Rodrigues, A. V.; van Dijk, A. A.; Ebrecht, A. C.; Opperman, D. J.; Sagmeister, T.; Buhlheller, C.; Pavkov-Keller, T.; Rathinaswamy, M. K.; Dalwadi, U.; Yip, C. K.; Burke, J. E.; Garcia, K. C.; Grishin, N. V.; Adams, P. D.; Read, R. J.; Baker, D. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871–876, DOI: 10.1126/science.abj8754
- (22) Bryant, P.; Pozzati, G.; Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 2022, 13, 1265, DOI: 10.1038/s41467-022-28865-w
- (23) Bryant, P.; Pozzati, G.; Zhu, W.; Shenoy, A.; Kundrotas, P.; Elofsson, A. Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. Nat. Commun. 2022, 13, 6028, DOI: 10.1038/s41467-022-33729-4
- (24) Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589, DOI: 10.1038/s41586-021-03819-2
- (25) Rettie, S. A.; Campbell, K. V.; Bera, A. K.; Kang, A.; Kozlov, S.; De La Cruz, J.; Adebomi, V.; Zhou, G.; DiMaio, F.; Ovchinnikov, S.; Bhardwaj, G. Cyclic peptide structure prediction and design using AlphaFold. bioRxiv 2023, DOI: 10.1101/2023.02.25.529956v1
- (26) Gang, D.; Kim, D. W.; Park, H. S. Cyclic Peptides: Promising Scaffolds for Biopharmaceuticals. Genes 2018, 9, 557, DOI: 10.3390/genes9110557
- (27) Iacovelli, R.; Bovenberg, R. A. L.; Driessen, A. J. M. Nonribosomal peptide synthetases and their biotechnological potential in Penicillium rubens. J. Ind. Microbiol. Biotechnol. 2021, 48, kuab045, DOI: 10.1093/jimb/kuab045
- (28) Marahiel, M. A. Working outside the protein-synthesis rules: insights into non-ribosomal peptide synthesis. J. Pept. Sci. 2009, 15, 799–807, DOI: 10.1002/psc.1183
- (29) Martínez-Núñez, M. A.; López y López, V. E. Nonribosomal peptides synthetases and their applications in industry. Sustainable Chem. Processes 2016, 4, 13, DOI: 10.1186/s40508-016-0057-6
- (30) Sieber, S. A.; Marahiel, M. A. Learning from Nature’s Drug Factories: Nonribosomal Synthesis of Macrocyclic Peptides. J. Bacteriol. 2003, 185, 7036–7043, DOI: 10.1128/JB.185.24.7036-7043.2003
- (31) Miao, J.; Descoteaux, M. L.; Lin, Y.-S. Structure prediction of cyclic peptides by molecular dynamics + machine learning. Chem. Sci. 2021, 12, 14927–14936, DOI: 10.1039/D1SC05562C
- 32. Jurtz, V. I.; Johansen, A. R.; Nielsen, M.; Almagro Armenteros, J. J.; Nielsen, H.; Sønderby, C. K.; Winther, O.; Sønderby, S. K. An introduction to deep learning on biological sequence data: examples and solutions. Bioinformatics 2017, 33, 3685–3690, DOI: 10.1093/bioinformatics/btx531
- 33. Hou, J.; Adhikari, B.; Cheng, J. DeepSF: deep convolutional neural network for mapping protein sequences to folds. Bioinformatics 2018, 34, 1295–1303, DOI: 10.1093/bioinformatics/btx780
- 34. Cheng, J.; Liu, Y.; Ma, Y. Protein secondary structure prediction based on integration of CNN and LSTM model. J. Vis. Commun. Image Represent. 2020, 71, 102844, DOI: 10.1016/j.jvcir.2020.102844
- 35. Chen, Z.; Min, M. R.; Ning, X. Ranking-Based Convolutional Neural Network Models for Peptide-MHC Class I Binding Prediction. Front. Mol. Biosci. 2021, 8, 634836, DOI: 10.3389/fmolb.2021.634836
- 36. Gelman, S.; Fahlberg, S. A.; Heinzelman, P.; Romero, P. A.; Gitter, A. Neural networks to learn protein sequence–function relationships from deep mutational scanning data. Proc. Natl. Acad. Sci. U.S.A. 2021, 118, e2104878118, DOI: 10.1073/pnas.2104878118
- 37. Hosseinzadeh, P.; Bhardwaj, G.; Mulligan, V. K.; Shortridge, M. D.; Craven, T. W.; Pardo-Avila, F.; Rettie, S. A.; Kim, D. E.; Silva, D.-A.; Ibrahim, Y. M.; Webb, I. K.; Cort, J. R.; Adkins, J. N.; Varani, G.; Baker, D. Comprehensive computational design of ordered peptide macrocycles. Science 2017, 358, 1461–1466, DOI: 10.1126/science.aap7577
- 38. Li, X.; Du, X.; Li, J.; Gao, Y.; Pan, Y.; Shi, J.; Zhou, N.; Xu, B. Introducing d-Amino Acid or Simple Glycoside into Small Peptides to Enable Supramolecular Hydrogelators to Resist Proteolysis. Langmuir 2012, 28, 13512–13517, DOI: 10.1021/la302583a
- 39. Liu, J.; Liu, J.; Chu, L.; Zhang, Y.; Xu, H.; Kong, D.; Yang, Z.; Yang, C.; Ding, D. Self-Assembling Peptide of d-Amino Acids Boosts Selectivity and Antitumor Efficacy of 10-Hydroxycamptothecin. ACS Appl. Mater. Interfaces 2014, 6, 5558–5565, DOI: 10.1021/am406007g
- 40. Piana, S.; Laio, A. A bias-exchange approach to protein folding. J. Phys. Chem. B 2007, 111, 4553–4559, DOI: 10.1021/jp067873l
- 41. Laio, A.; Parrinello, M. Escaping free-energy minima. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12562–12566, DOI: 10.1073/pnas.202427399
- 42. McHugh, S. M.; Rogers, J. R.; Yu, H.; Lin, Y.-S. Insights into How Cyclic Peptides Switch Conformations. J. Chem. Theory Comput. 2016, 12, 2480–2488, DOI: 10.1021/acs.jctc.6b00193
- 43. Sugita, Y.; Kitao, A.; Okamoto, Y. Multidimensional replica-exchange method for free-energy calculations. J. Chem. Phys. 2000, 113, 6042–6051, DOI: 10.1063/1.1308516
- 44. Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. UCSF Chimera-a visualization system for exploratory research and analysis. J. Comput. Chem. 2004, 25, 1605–1612, DOI: 10.1002/jcc.20084
- 45. Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25, DOI: 10.1016/j.softx.2015.06.001
- 46. Zhou, C.-Y.; Jiang, F.; Wu, Y.-D. Residue-Specific Force Field Based on Protein Coil Library. RSFF2: Modification of AMBER ff99SB. J. Phys. Chem. B 2015, 119, 1035–1047, DOI: 10.1021/jp5064676
- 47. Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935, DOI: 10.1063/1.445869
- 48. Jiang, F.; Zhou, C.-Y.; Wu, Y.-D. Residue-Specific Force Field Based on the Protein Coil Library. RSFF1: Modification of OPLS-AA/L. J. Phys. Chem. B 2014, 118, 6983–6998, DOI: 10.1021/jp5017449
- 49. Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins: Struct., Funct., Bioinf. 2010, 78, 1950–1958, DOI: 10.1002/prot.22711
- 50. Geng, H.; Jiang, F.; Wu, Y.-D. Accurate Structure Prediction and Conformational Analysis of Cyclic Peptides with Residue-Specific Force Fields. J. Phys. Chem. Lett. 2016, 7, 1805–1810, DOI: 10.1021/acs.jpclett.6b00452
- 51. Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 2001, 105, 6474–6487, DOI: 10.1021/jp003919d
- 52. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577–8593, DOI: 10.1063/1.470117
- 53. Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604–613, DOI: 10.1016/j.cpc.2013.09.018
- 54. Mu, Y.; Nguyen, P. H.; Stock, G. Energy landscape of a small peptide revealed by dihedral angle principal component analysis. Proteins 2005, 58, 45–52, DOI: 10.1002/prot.20310
- 55. Damas, J. M.; Filipe, L. C.; Campos, S. R.; Lousa, D.; Victor, B. L.; Baptista, A. M.; Soares, C. M. Predicting the Thermodynamics and Kinetics of Helix Formation in a Cyclic Peptide Model. J. Chem. Theory Comput. 2013, 9, 5148–5157, DOI: 10.1021/ct400529k
- 56. Hsueh, S. C. C.; Aina, A.; Plotkin, S. S. Ensemble Generation for Linear and Cyclic Peptides Using a Reservoir Replica Exchange Molecular Dynamics Implementation in GROMACS. J. Phys. Chem. B 2022, 126, 10384–10399, DOI: 10.1021/acs.jpcb.2c05470
- 57. Rodriguez, A.; Laio, A. Clustering by fast search and find of density peaks. Science 2014, 344, 1492–1496, DOI: 10.1126/science.1242072
- 58. Morgan, H. L. The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. J. Chem. Doc. 1965, 5, 107–113, DOI: 10.1021/c160017a018
- 59. Landrum, G. RDKit: Open-Source Cheminformatics Software, 2021. https://www.rdkit.org/.
- 60. Nair, V.; Hinton, G. E. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning (ICML’10), Haifa, Israel, June 21–24, 2010; Fürnkranz, J., Joachims, T., Eds.; Omnipress: Madison, WI, 2010; pp 807–814.
- 61. Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv (Computer Science.Machine Learning), January 30, 2017, 1412.6980, ver. 9. https://arxiv.org/abs/1412.6980 (accessed 2023-03-31).
- 62. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; Desmaison, A.; Kopf, A.; Yang, E.; DeVito, Z.; Raison, M.; Tejani, A.; Chilamkurthy, S.; Steiner, B.; Fang, L.; Bai, J.; Chintala, S. Pytorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, Canada, December 8–14, 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, 2019; pp 8024–8035.
- 63. Fey, M.; Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. arXiv (Computer Science.Machine Learning), April 25, 2019, 1903.02428, ver. 3. https://arxiv.org/abs/1903.02428 (accessed 2023-03-31).
- 64. Prechelt, L. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw. 1998, 11, 761–767, DOI: 10.1016/S0893-6080(98)00010-0
- 65. Schlichtkrull, M.; Kipf, T. N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. arXiv (Statistics.Machine Learning), October 26, 2017, 1703.06103, ver. 4. https://arxiv.org/abs/1703.06103 (accessed 2023-03-31).
Figure 1
Figure 1. StrEAMM linear regression models are constructed using contributions from neighboring interactions. For example, the natural logarithm of the population of cyclic pentapeptide cyclo-($X_1X_2X_3X_4X_5$) adopting a specific structure $S_1S_2S_3S_4S_5$ is the sum of the interaction weights for each (1,2) and (1,3) neighbor present. $X_i$ is an amino acid; $S_i$ is a structural digit that represents a region of (ϕ, ψ) space (see Figure 2); $w_{S_iS_{i+1}}^{X_iX_{i+1}}$ is the weight for residues $X_iX_{i+1}$ adopting structure $S_iS_{i+1}$; $w_{S_iS_{i+1}S_{i+2}}^{X_i\_X_{i+2}}$ is the weight for residues $X_i$ and $X_{i+2}$ adopting substructure $S_iS_{i+1}S_{i+2}$; and $w_Q^{X_1X_2X_3X_4X_5}$ is related to the partition function, $Q$, for sequence $X_1X_2X_3X_4X_5$ and ensures that all the structures’ populations sum to 1 for a given sequence. This equation was first proposed by Miao et al. (31).
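As a rough illustration of the additive form described in this caption, the sketch below sums (1,2) and (1,3) weights around the ring and then normalizes over a set of candidate structures. The function names, the dictionary layout, and the toy weights are all hypothetical. Computing the normalization term explicitly, rather than fitting it as the published models may do, is an assumption made for brevity.

```python
import math

def log_score(seq, struct, w12, w13):
    """Sum hypothetical (1,2) and (1,3) weights around a cyclic peptide."""
    n = len(seq)
    total = 0.0
    for i in range(n):
        j, k = (i + 1) % n, (i + 2) % n  # ring neighbors wrap around
        # (1,2) term: residues i and i+1 adopting structural digits S_i S_{i+1}
        total += w12.get((seq[i] + seq[j], struct[i] + struct[j]), 0.0)
        # (1,3) term: residues i and i+2 adopting S_i S_{i+1} S_{i+2}
        total += w13.get((seq[i] + "_" + seq[k], struct[i] + struct[j] + struct[k]), 0.0)
    return total

def populations(seq, structures, w12, w13):
    """Exponentiate the summed weights and normalize so populations sum to 1."""
    scores = {s: log_score(seq, s, w12, w13) for s in structures}
    log_q = math.log(sum(math.exp(v) for v in scores.values()))
    return {s: math.exp(v - log_q) for s, v in scores.items()}

# Toy example with made-up weights (not values from the paper).
toy_w12 = {("GG", "AA"): 0.2}
toy_pops = populations("GGGGG", ["AAAAA", "BBBBB"], toy_w12, {})
```

Missing weights default to zero here, which is a convenience of the sketch rather than a property of the trained model.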
Figure 2
Figure 2. Structural binning maps divide the (ϕ, ψ) space of cyclic pentapeptides into 10 regions and the (ϕ, ψ) space of cyclic hexapeptides into six regions. (A) The (ϕ, ψ) population density of cyclo-(GGGGG) and (B) the final binning map obtained by assigning every grid point to its closest centroid. (C) The (ϕ, ψ) population density of cyclo-(GGGGGG) and (D) the resulting binning map using the same binning protocol.
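The nearest-centroid assignment mentioned in this caption can be sketched as follows. The centroid coordinates are invented for illustration, and measuring distance with wrapped angular differences (so that 178° and -178° are close) is an assumption about how periodicity would be handled, not a detail taken from the caption.

```python
import numpy as np

def periodic_delta(a, b):
    """Smallest signed difference between angles in degrees, in [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def assign_bins(points, centroids):
    """Return the index of the nearest centroid for each (phi, psi) point."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Pairwise wrapped differences, shape (n_points, n_centroids, 2)
    d = periodic_delta(points[:, None, :], centroids[None, :, :])
    dist2 = (d ** 2).sum(axis=-1)
    return dist2.argmin(axis=1)

# Hypothetical centroids and grid points, in degrees.
toy_centroids = [(-60.0, -45.0), (-175.0, 175.0), (60.0, 45.0)]
toy_points = [(-65.0, -40.0), (178.0, -178.0), (55.0, 50.0)]
toy_labels = assign_bins(toy_points, toy_centroids)
```

The second toy point lies across the ±180° boundary from its centroid, which is the case the wrapped difference is meant to handle.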
Figure 3
Figure 3. Convolutional neural network and graph neural network architectures. (A) Cartoon example of the cyclic pentapeptide sequence cyclo-(sasFr) represented using a matrix with N = 5 columns and 2048 rows, where the 2048 bits come from the fingerprint encoding of each amino acid. (B) The representations of cyclo-(sasFr) and cyclo-(asFrs) are concatenated such that the (1,2) neighboring residues are spatially close together, enabling the 1D convolutional filter (blue box representing a (4096 × 1) vector of learnable model parameters) to encompass all the fingerprint features that define residues s and a. (C) The top graph represents a cyclic pentapeptide, where the (1,2) edge types are denoted with blue arrows and the (1,3) edge types are denoted with green arrows. The bottom graph represents a cyclic hexapeptide, where in addition to the (1,2) and (1,3) edge types we can also include (1,4) edge types, denoted with purple arrows. Note: The purple (1,4) edges appear to be double-arrowed, but each is actually two distinct single-arrowed edges. For example, the forward edge in dark purple that starts at ser1 is directed to Phe4, and the forward edge in dark purple that starts at Phe4 is directed to ser1.
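To make the two encodings in this caption more concrete, the sketch below stacks a residue feature matrix on top of its cyclic rotation, so that each column holds a (1,2) pair, and lists directed forward edges by neighbor type. Both helper names are hypothetical, the 3-bit features stand in for the 2048-bit fingerprints, and only forward edges are generated. The actual models may differ in these details.

```python
import numpy as np

def pair_matrix(x, k=1):
    """Stack each residue's features on top of those of the residue k steps ahead.

    x has shape (n_bits, n_residues); the result has shape (2 * n_bits, n_residues).
    """
    return np.concatenate([x, np.roll(x, -k, axis=1)], axis=0)

def cyclic_edges(n, offsets=(1, 2)):
    """Directed forward edges i -> (i + k) mod n, grouped by neighbor type."""
    return {f"(1,{k + 1})": [(i, (i + k) % n) for i in range(n)] for k in offsets}

# Toy check: 3-bit "fingerprints" for a 5-residue ring.
x = np.arange(15).reshape(3, 5)
stacked = pair_matrix(x)

# Hexapeptide graph with (1,2), (1,3), and (1,4) edge types.
hexa_edges = cyclic_edges(6, offsets=(1, 2, 3))
```

In the hexapeptide case the (1,4) list contains both `(0, 3)` and `(3, 0)`, matching the caption's note that an apparently double-headed arrow is two separate directed edges.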
Figure 4
Figure 4. Performance of StrEAMM linear regression models on the training dataset (top) and test dataset (bottom) for (A) cyclic pentapeptides and (B, C) cyclic hexapeptides. (A) Performance of the StrEAMM linear regression model on cyclic pentapeptides using (1,2) and (1,3) interactions. (B) Performance of the StrEAMM linear regression model on cyclic hexapeptides using (1,2) and (1,3) interactions. (C) Performance of the StrEAMM linear regression model on cyclic hexapeptides using (1,2), (1,3), and (1,4) interactions. The black dashed line represents y = x. $R^2$ is the coefficient of determination. WE is the weighted error, computed from $p_{i,\mathrm{observed}}$ and $p_{i,\mathrm{predicted}}$, the populations observed in MD simulation and predicted by StrEAMM, respectively. Each point on the plot represents the predicted versus the observed percent population in MD for a structure in the structural ensemble of a cyclic peptide. All the structures in the structural ensembles for all the cyclic peptides in the training or test dataset with a predicted or observed percent population in MD of >1% are plotted.
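For readers who want to reproduce this kind of comparison, the sketch below applies the >1% plotting filter described in the caption and then computes a coefficient of determination as 1 − SS_res/SS_tot. The function name is hypothetical, and whether the reported values use this definition or a squared correlation coefficient is not stated here, so treat the choice as an assumption. The weighted error is left out of the sketch.

```python
import numpy as np

def r2_above_threshold(observed, predicted, threshold=1.0):
    """R^2 over structures whose observed or predicted percent population exceeds threshold."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Keep a structure if either population is above the cutoff (in percent).
    keep = (observed > threshold) | (predicted > threshold)
    obs, pred = observed[keep], predicted[keep]
    ss_res = np.sum((obs - pred) ** 2)        # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up percent populations, for illustration only.
toy_r2 = r2_above_threshold([50.0, 30.0, 10.0], [40.0, 30.0, 20.0])
```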
Figure 5
Figure 5. Performance of (A–C) StrEAMM CNN models and (D–F) StrEAMM GNN models on the test dataset for (A, D) cyclic pentapeptides and the models incorporating (1,2) and (1,3) filters/edges, (B, E) cyclic hexapeptides and the models incorporating (1,2) and (1,3) filters/edges, and (C, F) cyclic hexapeptides and the models incorporating (1,2), (1,3), and (1,4) filters/edges. See Figure S7 for the model performances on the corresponding training datasets. All the structures in the structural ensembles for all the cyclic peptides in the training or test dataset with a predicted or observed percent population in MD of >1% are plotted.
Figure 6
Figure 6. Comparison of the StrEAMM linear regression and neural network models’ performances on the (A) cyclic pentapeptide and (B) cyclic hexapeptide test datasets. The coefficient of determination, $R^2$, and weighted error, WE, are shown for each model (linear regression in red with a diagonal slash pattern, CNN in green with a dotted pattern, and GNN in blue with a vertical line pattern) including different neighboring interactions.
Figure 7
Figure 7. The StrEAMM neural network models can predict structural ensembles for cyclic pentapeptides and cyclic hexapeptides that contain amino acids that were absent in the training dataset. (A–D) Performances of the (A, B) CNN and (C, D) GNN models on cyclic pentapeptide and cyclic hexapeptide datasets containing sequences composed of the 37 amino acid library, when the models were trained using only sequences composed of the 15 amino acid library. (E–H) Performances of the (E, F) CNN and (G, H) GNN models on cyclic pentapeptide and cyclic hexapeptide datasets containing sequences composed of the 37 amino acid library, when the models were trained using sequences composed of the 15 amino acid library and “booster” sequences composed of the 37 amino acid library.
References
This article references 65 other publications.
- 1. Smith, M. C.; Gestwicki, J. E. Features of protein-protein interactions that translate into potent inhibitors: topology, surface area and affinity. Expert Rev. Mol. Med. 2012, 14, e16, DOI: 10.1017/erm.2012.10
- 2. Morelli, X.; Bourgeas, R.; Roche, P. Chemical and structural lessons from recent successes in protein-protein interaction inhibition (2P2I). Curr. Opin. Chem. Biol. 2011, 15, 475–481, DOI: 10.1016/j.cbpa.2011.05.024
- 3. Rezai, T.; Bock, J. E.; Zhou, M. V.; Kalyanaraman, C.; Lokey, R. S.; Jacobson, M. P. Conformational Flexibility, Internal Hydrogen Bonding, and Passive Membrane Permeability: Successful in Silico Prediction of the Relative Permeabilities of Cyclic Peptides. J. Am. Chem. Soc. 2006, 128, 14073–14080, DOI: 10.1021/ja063076p
- 4. Dougherty, P. G.; Sahni, A.; Pei, D. Understanding Cell Penetration of Cyclic Peptides. Chem. Rev. 2019, 119, 10241–10287, DOI: 10.1021/acs.chemrev.9b00008
- 5. Zhang, H.; Chen, S. Cyclic peptide drugs approved in the last two decades (2001–2021). RSC Chem. Biol. 2022, 3, 18–31, DOI: 10.1039/D1CB00154J
- 6. Zorzi, A.; Deyle, K.; Heinis, C. Cyclic peptide therapeutics: past, present and future. Curr. Opin. Chem. Biol. 2017, 38, 24–29, DOI: 10.1016/j.cbpa.2017.02.006
- 7. Nguyen, Q. N. N.; Schwochert, J.; Tantillo, D. J.; Lokey, R. S. Using ¹H and ¹³C NMR chemical shifts to determine cyclic peptide conformations: a combined molecular dynamics and quantum mechanics approach. Phys. Chem. Chem. Phys. 2018, 20, 14003–14012, DOI: 10.1039/C8CP01616J
- 8Ball, K. A.; Wemmer, D. E.; Head-Gordon, T. Comparison of Structure Determination Methods for Intrinsically Disordered Amyloid-β Peptides. J. Phys. Chem. B 2014, 118, 6405– 6416, DOI: 10.1021/jp410275yGoogle Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXlslCjug%253D%253D&md5=9bf46beb4da6f5c32a56b3230c24aac9Comparison of Structure Determination Methods for Intrinsically Disordered Amyloid-β PeptidesBall, K. Aurelia; Wemmer, David E.; Head-Gordon, TeresaJournal of Physical Chemistry B (2014), 118 (24), 6405-6416CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)Intrinsically disordered proteins (IDPs) represent a new frontier in structural biol. since the primary characteristic of IDPs is that structures need to be characterized as diverse ensembles of conformational substates. We compare two general but very different ways of combining NMR spectroscopy with theor. methods to derive structural ensembles for the disease IDPs amyloid-β 1-40 and amyloid-β 1-42, which are assocd. with Alzheimer's Disease. We analyze the performance of de novo mol. dynamics and knowledge-based approaches for generating structural ensembles by assessing their ability to reproduce a range of NMR exptl. observables. In addn. to the comparison of computational methods, we also evaluate the relative value of different types of NMR data for refining or validating the IDP structural ensembles for these important disease peptides.
- 9Fisher, C. K.; Stultz, C. M. Constructing ensembles for intrinsically disordered proteins. Curr. Opin. Struct. Biol. 2011, 21, 426– 431, DOI: 10.1016/j.sbi.2011.04.001Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXnsVyitb4%253D&md5=81a2affd275f291c133e2839543a969bConstructing ensembles for intrinsically disordered proteinsFisher, Charles K.; Stultz, Collin M.Current Opinion in Structural Biology (2011), 21 (3), 426-431CODEN: COSBEF; ISSN:0959-440X. (Elsevier Ltd.)A review. The relatively flat energy landscapes assocd. with intrinsically disordered proteins makes modeling these systems esp. problematic. A comprehensive model for these proteins requires one to build an ensemble consisting of a finite collection of structures, and their corresponding relative stabilities, which adequately capture the range of accessible states of the protein. In this regard, methods that use computational techniques to interpret exptl. data in terms of such ensembles are an essential part of the modeling process. In this review, we critically assess the advantages and limitations of current techniques and discuss new methods for the validation of these ensembles.
- 10Cicero, D. O.; Barbato, G.; Bazzo, R. NMR Analysis of Molecular Flexibility in Solution: A New Method for the Study of Complex Distributions of Rapidly Exchanging Conformations. Application to a 13-Residue Peptide with an 8-Residue Loop. J. Am. Chem. Soc. 1995, 117, 1027– 1033, DOI: 10.1021/ja00108a019Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2MXjtVyrt7s%253D&md5=299cae2efcd4cc43aef0e0c5701579deNMR Analysis of Molecular Flexibility in Solution: A New Method for the Study of Complex Distributions of Rapidly Exchanging Conformations. Application to a 13-Residue Peptide with an 8-Residue LoopCicero, D. O.; Barbato, G.; Bazzo, R.Journal of the American Chemical Society (1995), 117 (3), 1027-33CODEN: JACSAT; ISSN:0002-7863. (American Chemical Society)A new methodol., called NAMFIS (NMR anal. of mol. flexibility in soln.), is described for the anal. of flexible mols. in soln. Once a complete set of conformations is generated and is able to encompass all the possible states of the mol. that are not a priori incompatible with the available exptl. NMR evidence, NAMFIS allows for the examn. of the occurrence and relevance of arbitrary elements of secondary structure, even when extensive conformational averaging defies a detailed exptl. characterization. The anal. is based on the available exptl. NMR data. The method is demonstrated in the conformational anal. of peptide I (R = Lys-Aib-Lys-OH; Aib = α-aminoisobutyric acid, Mhe = 2-amino-6-mercaptohexanoic acid).
- 11Ge, Y.; Zhang, S.; Erdelyi, M.; Voelz, V. A. Solution-State Preorganization of Cyclic β-Hairpin Ligands Determines Binding Mechanism and Affinities for MDM2. J. Chem. Inf. Model. 2021, 61, 2353– 2367, DOI: 10.1021/acs.jcim.1c00029Google Scholar11https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXpsFehs7s%253D&md5=5af83b4c86353bc2cce83b4bec566788Solution-State Preorganization of Cyclic β-Hairpin Ligands Determines Binding Mechanism and Affinities for MDM2Ge, Yunhui; Zhang, Si; Erdelyi, Mate; Voelz, Vincent A.Journal of Chemical Information and Modeling (2021), 61 (5), 2353-2367CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Understanding mechanisms of protein folding and binding is crucial to designing their mol. function. Mol. dynamics (MD) simulations and Markov state model (MSM) approaches provide a powerful way to understand complex conformational change that occurs over long time scales. Such dynamics are important for the design of therapeutic peptidomimetic ligands, whose affinity and binding mechanism are dictated by a combination of folding and binding. To examine the role of preorganization in peptide binding to protein targets, the authors performed massively parallel explicit-solvent MD simulations of cyclic β-hairpin ligands designed to mimic the p53 transactivation domain and competitively bind mouse double minute 2 homolog (MDM2). Disrupting the MDM2-p53 interaction is a therapeutic strategy to prevent degrdn. of the p53 tumor suppressor in cancer cells. MSM anal. of over 3 ms of aggregate trajectory data enabled the authors to build a detailed mechanistic model of coupled folding and binding of four cyclic peptides which the authors compare to exptl. binding affinities and rates. The results show a striking relation between the relative preorganization of each ligand in soln. and its affinity for MDM2. 
Specifically, changes in peptide conformational populations predicted by the MSMs suggest that entropy loss upon binding is the main factor influencing affinity. The MSMs also enable detailed examn. of non-native interactions which lead to misfolded states and comparison of structural ensembles with exptl. NMR measurements. In contrast to an MSM study of p53 transactivation domain (TAD) binding to MDM2, MSMs of cyclic β-hairpin binding show a conformational selection mechanism. Finally, the authors make progress toward predicting accurate off rates of cyclic peptides using multi-ensemble Markov models (MEMMs) constructed from unbiased and biased simulated trajectories.
- 12Slough, D. P.; McHugh, S. M.; Cummings, A. E.; Dai, P.; Pentelute, B. L.; Kritzer, J. A.; Lin, Y.-S. Designing Well-Structured Cyclic Pentapeptides Based on Sequence-Structure Relationships. J. Phys. Chem. B 2018, 122, 3908– 3919, DOI: 10.1021/acs.jpcb.8b01747Google Scholar12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXmtVCqsrg%253D&md5=00ab0c1f50ec1ce753262b4d0afd77aaDesigning well-structured cyclic pentapeptides based on sequence-structure relationshipsSlough, Diana P.; McHugh, Sean M.; Cummings, Ashleigh E.; Dai, Peng; Pentelute, Bradley L.; Kritzer, Joshua A.; Lin, Yu-ShanJournal of Physical Chemistry B (2018), 122 (14), 3908-3919CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)Cyclic peptides are a promising class of mols. for unique applications. Unfortunately, cyclic peptide design is severely limited by the difficulty in predicting the conformations they will adopt in soln. In this work, we use explicit-solvent mol. dynamics simulations to design well-structured cyclic peptides by studying their sequence-structure relationships. Crit. to our approach is an enhanced sampling method that exploits the essential transitional motions of cyclic peptides to efficiently sample their conformational space. We simulated a range of cyclic pentapeptides from all-glycine to a library of cyclo-(X1X2AAA) peptides to map their conformational space and det. cooperative effects of neighboring residues. By combining the results from all cyclo-(X1X2AAA) peptides, we developed a scoring function to predict the structural preferences for X1-X2 residues within cyclic pentapeptides. Using this scoring function, we designed a cyclic pentapeptide, cyclo-(GNSRV), predicted to be well structured in aq. soln. Subsequent CD and NMR spectroscopy revealed that this cyclic pentapeptide is indeed well structured in water, with a nuclear Overhauser effect and J-coupling values consistent with the predicted structure.
- 13Cummings, A. E.; Miao, J.; Slough, D. P.; McHugh, S. M.; Kritzer, J. A.; Lin, Y. S. β-Branched Amino Acids Stabilize Specific Conformations of Cyclic Hexapeptides. Biophys. J. 2019, 116, 433– 444, DOI: 10.1016/j.bpj.2018.12.015Google Scholar13https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXhtVaqtbs%253D&md5=b3230ad37a2a0f117aa4f186f41c991eβ-Branched Amino Acids Stabilize Specific Conformations of Cyclic HexapeptidesCummings, Ashleigh E.; Miao, Jiayuan; Slough, Diana P.; McHugh, Sean M.; Kritzer, Joshua A.; Lin, Yu-ShanBiophysical Journal (2019), 116 (3), 433-444CODEN: BIOJAU; ISSN:0006-3495. (Cell Press)Cyclic peptides (CPs) are a promising class of mols. for drug development, particularly as inhibitors of protein-protein interactions. Predicting low-energy structures and global structural ensembles of individual CPs is crit. for the design of bioactive mols., but these are challenging to predict and difficult to verify exptl. In our previous work, we used explicit-solvent mol. dynamics simulations with enhanced sampling methods to predict the global structural ensembles of cyclic hexapeptides contg. different permutations of glycine, alanine, and valine. One peptide, cyclo-(VVGGVG) or P7, was predicted to be unusually well structured. In this work, we synthesized P7, along with a less well-structured control peptide, cyclo-(VVGVGG) or P6, and characterized their global structural ensembles in water using NMR spectroscopy. The NMR data revealed a structural ensemble similar to the prediction for P7 and showed that P6 was indeed much less well-structured than P7. We then simulated and exptl. characterized the global structural ensembles of several P7 analogs and discovered that β-branching at one crit. position within P7 is important for overall structural stability. The simulations allowed deconvolution of thermodn. factors that underlie this structural stabilization. 
Overall, the excellent correlation between simulation and exptl. data indicates that our simulation platform will be a promising approach for designing well-structured CPs and also for understanding the complex interactions that control the conformations of constrained peptides and other macrocycles.
- 14Wakefield, A. E.; Wuest, W. M.; Voelz, V. A. Molecular Simulation of Conformational Pre-Organization in Cyclic RGD Peptides. J. Chem. Inf. Model. 2015, 55, 806– 813, DOI: 10.1021/ci500768uGoogle Scholar14https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXjvVChu74%253D&md5=36c7ee3598117a064bee960c0f8ebd75Molecular Simulation of Conformational Pre-Organization in Cyclic RGD PeptidesWakefield, Amanda E.; Wuest, William M.; Voelz, Vincent A.Journal of Chemical Information and Modeling (2015), 55 (4), 806-813CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)To test the ability of mol. simulations to accurately predict the soln.-state conformational properties of peptidomimetics, we examd. a test set of 18 cyclic RGD peptides selected from the literature, including the anticancer drug candidate cilengitide, whose favorable binding affinity to integrin has been ascribed to its pre-organization in soln. For each design, we performed all-atom replica-exchange mol. dynamics simulations over several microseconds and compared the results to extensive published NMR data. We find excellent agreement with exptl. NOE distance restraints, suggesting that mol. simulation can be a useful tool for the computational design of pre-organized soln.-state structure. Moreover, our anal. of conformational populations ests. that, despite the potential for increased flexibility due to backbone amide isomerizaton, N-methylation provides about 0.5 kcal/mol of reduced conformational entropy to cyclic RGD peptides. The combination of pre-organization and binding-site compatibility explains the strong binding affinity of cilengitide to integrin.
- 15Damjanovic, J.; Miao, J.; Huang, H.; Lin, Y.-S. Elucidating Solution Structures of Cyclic Peptides Using Molecular Dynamics Simulations. Chem. Rev. 2021, 121, 2292– 2324, DOI: 10.1021/acs.chemrev.0c01087Google Scholar64https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXns1Sitg%253D%253D&md5=7a78dc319d70b09c1f2d68091b45e333Elucidating Solution Structures of Cyclic Peptides Using Molecular Dynamics SimulationsDamjanovic, Jovan; Miao, Jiayuan; Huang, He; Lin, Yu-ShanChemical Reviews (Washington, DC, United States) (2021), 121 (4), 2292-2324CODEN: CHREAY; ISSN:0009-2665. (American Chemical Society)A review. Protein-protein interactions are vital to biol. processes, but the shape and size of their interfaces make them hard to target using small mols. Cyclic peptides have shown promise as protein-protein interaction modulators, as they can bind protein surfaces with high affinity and specificity. Dozens of cyclic peptides are already FDA approved, and many more are in various stages of development as immunosuppressants, antibiotics, antivirals, or anticancer drugs. However, most cyclic peptide drugs so far have been natural products or derivs. thereof, with de novo design having proven challenging. A key obstacle is structural characterization: cyclic peptides frequently adopt multiple conformations in soln., which are difficult to resolve using techniques like NMR spectroscopy. The lack of soln. structural information prevents a thorough understanding of cyclic peptides' sequence-structure-function relationship. Here we review recent development and application of mol. dynamics simulations with enhanced sampling to studying the soln. structures of cyclic peptides. We describe novel computational methods capable of sampling cyclic peptides' conformational space and provide examples of computational studies that relate peptides' sequence and structure to biol. activity. We demonstrate that mol. 
dynamics simulations have grown from an explanatory technique to a full-fledged tool for systematic studies at the forefront of cyclic peptide therapeutic design.
- 16Ono, S.; Naylor, M. R.; Townsend, C. E.; Okumura, C.; Okada, O.; Lokey, R. S. Conformation and Permeability: Cyclic Hexapeptide Diastereomers. J. Chem. Inf. Model. 2019, 59, 2952– 2963, DOI: 10.1021/acs.jcim.9b00217Google Scholar65https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXosFGktb8%253D&md5=a4dc1f7d186cbe0614c96d95940d539dConformation and Permeability: Cyclic Hexapeptide DiastereomersOno, Satoshi; Naylor, Matthew R.; Townsend, Chad E.; Okumura, Chieko; Okada, Okimasa; Lokey, R. ScottJournal of Chemical Information and Modeling (2019), 59 (6), 2952-2963CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Conformational ensembles of eight cyclic hexapeptide diastereomers in explicit cyclohexane, chloroform, and water were analyzed by multicanonical mol. dynamics (McMD) simulations. Free-energy landscapes (FELs) for each compd. and solvent were obtained from the mol. shapes and principal component anal. at T = 300 K; detailed anal. of the conformational ensembles and flexibility of the FELs revealed that permeable compds. have different structural profiles even for a single stereoisomeric change. The av. solvent-accessible surface area (SASA) in cyclohexane showed excellent correlation with the cell permeability, whereas this correlation was weaker in chloroform. The av. SASA in water correlated with the aq. soly. The av. polar surface area did not correlate with cell permeability in these solvents. A possible strategy for designing permeable cyclic peptides from FELs obtained from McMD simulations is proposed.
- 17Wang, S.; König, G.; Roth, H.-J.; Fouché, M.; Rodde, S.; Riniker, S. Effect of Flexibility, Lipophilicity, and the Location of Polar Residues on the Passive Membrane Permeability of a Series of Cyclic Decapeptides. J. Med. Chem. 2021, 64, 12761– 12773, DOI: 10.1021/acs.jmedchem.1c00775Google Scholar66https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXhvVSqs7nO&md5=21766e5b60ab466f8be3217e647e3515Effect of Flexibility, Lipophilicity, and the Location of Polar Residues on the Passive Membrane Permeability of a Series of Cyclic DecapeptidesWang, Shuzhe; Konig, Gerhard; Roth, Hans-Jorg; Fouche, Marianne; Rodde, Stephane; Riniker, SereinaJournal of Medicinal Chemistry (2021), 64 (17), 12761-12773CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Cyclic peptides have received increasing attention over the recent years as potential therapeutics for "undruggable" targets. One major obstacle is, however, their often relatively poor bioavailability. Here, we investigate the structure-permeability relationship of 24 cyclic decapeptides that share the same backbone N-methylation pattern but differ in their side chains. The peptides cover a large range of values for passive membrane permeability as well as lipophilicity and soly. To rationalize the obsd. differences in permeability, we extd. for each peptide the population of the membrane-permeable conformation in water from extensive explicit-solvent mol. dynamics simulations and used this as a metric for conformational rigidity or "prefolding.". The insights from the simulations together with lipophilicity measurements highlight the intricate interplay between polarity/lipophilicity and flexibility/rigidity and the possible compensating effects on permeability. The findings allow us to better understand the structure-permeability relationship of cyclic peptides and ext. general guiding principles.
- 18El Tayar, N.; Mark, A. E.; Vallat, P.; Brunne, R. M.; Testa, B.; van Gunsteren, W. F. Solvent-dependent conformation and hydrogen-bonding capacity of cyclosporin A: evidence from partition coefficients and molecular dynamics simulations. J. Med. Chem. 1993, 36, 3757– 3764, DOI: 10.1021/jm00076a002Google Scholar15https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK3sXmslWks78%253D&md5=c16833b2300b4fba6cb627ca1b9e3060Solvent-dependent conformation and hydrogen-bonding capacity of cyclosporin A: evidence from partition coefficients and molecular dynamics simulationsEl Tayar, Nabil; Mark, Alan E.; Vallat, Philippe; Brunne, Roger M.; Testa, Bernard; van Gunsteren, Wilfred F.Journal of Medicinal Chemistry (1993), 36 (24), 3757-64CODEN: JMCMAR; ISSN:0022-2623.The partition coeff. of cyclosporin A (CsA) was measured in octanol/water and heptane/water by centrifugal partition chromatog. By comparison with results from model compds., it was deduced that the hydrogen-bonding capacity of CsA changed dramatically from an apolar solvent (where it is internally H-bonded) to polar solvents (where it exposes its H-bonding groups to the solvent). Mol. dynamics simulations in water and CCl4 support the suggestion that CsA undergoes a solvent-dependent conformational changes and that the interconversion process is slow on the mol. dynamics time scale.
- 19Merten, C.; Li, F.; Bravo-Rodriguez, K.; Sanchez-Garcia, E.; Xu, Y.; Sander, W. Solvent-induced conformational changes in cyclic peptides: a vibrational circular dichroism study. Phys. Chem. Chem. Phys. 2014, 16, 5627– 5633, DOI: 10.1039/C3CP55018DGoogle Scholar16https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXjtVGisrk%253D&md5=baf3123a7b73babb8a50700b4f89e0d8Solvent-induced conformational changes in cyclic peptides: a vibrational circular dichroism studyMerten, Christian; Li, Fee; Bravo-Rodriguez, Kenny; Sanchez-Garcia, Elsa; Xu, Yunjie; Sander, WolframPhysical Chemistry Chemical Physics (2014), 16 (12), 5627-5633CODEN: PPCPFQ; ISSN:1463-9076. (Royal Society of Chemistry)The three-dimensional structure of a peptide is strongly influenced by its solvent environment. In the present study, we study three cyclic tetrapeptides which serve as model peptides for β-turns. They are of the general structure cyclo(Boc-Cys-Pro-X-Cys-OMe) with the amino acid X being either glycine, or L- or D-leucine. Using vibrational CD (VCD) spectroscopy, we confirm previous NMR results which showed that cyclo(Boc-Cys-Pro-D-Leu-Cys-OMe) adopts predominantly a βII turn structure in apolar and polar solvents. Our results for cyclo(Boc-Cys-Pro-Leu-Cys-OMe) (Boc = tert-nutoxycarbonyl) indicate a preference for a βI structure over βII. With increasing solvent polarity, the preference for cyclo(Boc-Cys-Pro-Gly-Cys-OMe) is shifted from βII towards βI. This conformational change goes along with the breaking of an intramol. hydrogen bond which stabilizes the βII conformation. Instead, a hydrogen bond with a solvent mol. can stabilize the βI turn conformation.
- 20Quartararo, J. S.; Eshelman, M. R.; Peraro, L.; Yu, H.; Baleja, J. D.; Lin, Y.-S.; Kritzer, J. A. A bicyclic peptide scaffold promotes phosphotyrosine mimicry and cellular uptake. Bioorg. Med. Chem. 2014, 22, 6387– 6391, DOI: 10.1016/j.bmc.2014.09.050Google Scholar17https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhslaht7nM&md5=c6439ec61bca8cd01b5a428dda8bb46fA bicyclic peptide scaffold promotes phosphotyrosine mimicry and cellular uptakeQuartararo, Justin S.; Eshelman, Matthew R.; Peraro, Leila; Yu, Hongtao; Baleja, James D.; Lin, Yu-Shan; Kritzer, Joshua A.Bioorganic & Medicinal Chemistry (2014), 22 (22), 6387-6391CODEN: BMECEP; ISSN:0968-0896. (Elsevier B.V.)While peptides are promising as probes and therapeutics, targeting intracellular proteins will require greater understanding of highly structured, cell-internalized scaffolds. We recently reported BC1, an 11-residue bicyclic peptide that inhibits the Src homol. 2 (SH2) domain of growth factor receptor-bound protein 2 (Grb2). In this work, we describe the unique structural and cell uptake properties of BC1 and similar cyclic and bicyclic scaffolds. These constrained scaffolds are taken up by mammalian cells despite their net neutral or neg. charges, while unconstrained analogs are not. The mechanism of uptake is shown to be energy-dependent and endocytic, but distinct from that of Tat. The soln. structure of BC1 was investigated by NMR and MD simulations, which revealed discrete water-binding sites on BC1 that reduce exposure of backbone amides to bulk water. This represents an original and potentially general strategy for promoting cell uptake.
- 21Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G. R.; Wang, J.; Cong, Q.; Kinch, L. N.; Schaeffer, R. D.; Millán, C.; Park, H.; Adams, C.; Glassman, C. R.; DeGiovanni, A.; Pereira, J. H.; Rodrigues, A. V.; van Dijk, A. A.; Ebrecht, A. C.; Opperman, D. J.; Sagmeister, T.; Buhlheller, C.; Pavkov-Keller, T.; Rathinaswamy, M. K.; Dalwadi, U.; Yip, C. K.; Burke, J. E.; Garcia, K. C.; Grishin, N. V.; Adams, P. D.; Read, R. J.; Baker, D. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871– 876, DOI: 10.1126/science.abj8754Google Scholar18https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXhvVCku7zM&md5=85214c6497c7e1f9df582ef1b8ffa058Accurate prediction of protein structures and interactions using a three-track neural networkBaek, Minkyung; DiMaio, Frank; Anishchenko, Ivan; Dauparas, Justas; Ovchinnikov, Sergey; Lee, Gyu Rie; Wang, Jue; Cong, Qian; Kinch, Lisa N.; Schaeffer, R. Dustin; Millan, Claudia; Park, Hahnbeom; Adams, Carson; Glassman, Caleb R.; DeGiovanni, Andy; Pereira, Jose H.; Rodrigues, Andria V.; van Dijk, Alberdina A.; Ebrecht, Ana C.; Opperman, Diederik J.; Sagmeister, Theo; Buhlheller, Christoph; Pavkov-Keller, Tea; Rathinaswamy, Manoj K.; Dalwadi, Udit; Yip, Calvin K.; Burke, John E.; Garcia, K. Christopher; Grishin, Nick V.; Adams, Paul D.; Read, Randy J.; Baker, DavidScience (Washington, DC, United States) (2021), 373 (6557), 871-876CODEN: SCIEAS; ISSN:1095-9203. (American Association for the Advancement of Science)DeepMind presented notably accurate predictions at the recent 14th Crit. Assessment of Structure Prediction (CASP14) conference. 
We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid soln. of challenging x-ray crystallog. and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biol. research.
- 22Bryant, P.; Pozzati, G.; Elofsson, A. Improved prediction of protein-protein interactions using AlphaFold2. Nat. Commun. 2022, 13, 1265, DOI: 10.1038/s41467-022-28865-wGoogle Scholar20https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB38XmvVyisb0%253D&md5=b379d632b2e14877f36b3fff2d5e8d3bImproved prediction of protein-protein interactions using AlphaFold2Bryant, Patrick; Pozzati, Gabriele; Elofsson, ArneNature Communications (2022), 13 (1), 1265CODEN: NCAOBW; ISSN:2041-1723. (Nature Portfolio)Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2, has shown unprecedented levels of accuracy in modeling single chain protein structures. Here, we apply AlphaFold2 for the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol together with optimized multiple sequence alignments, generate models with acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces we create a simple function to predict the DockQ score which distinguishes acceptable from incorrect models as well as interacting from non-interacting proteins with state-of-art accuracy. We find that, using the predicted DockQ scores, we can identify 51% of all interacting pairs at 1% FPR.
- 23Bryant, P.; Pozzati, G.; Zhu, W.; Shenoy, A.; Kundrotas, P.; Elofsson, A. Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search. Nat. Commun. 2022, 13, 6028, DOI: 10.1038/s41467-022-33729-4Google Scholar21https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB38Xis1Ciur%252FO&md5=6633ab8a81fdf35be217ea991ab571d9Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree searchBryant, Patrick; Pozzati, Gabriele; Zhu, Wensi; Shenoy, Aditi; Kundrotas, Petras; Elofsson, ArneNature Communications (2022), 13 (1), 6028CODEN: NCAOBW; ISSN:2041-1723. (Nature Portfolio)AlphaFold can predict the structure of single- and multiple-chain proteins with very high accuracy. However, the accuracy decreases with the no. of chains, and the available GPU memory limits the size of protein complexes which can be predicted. Here we show that one can predict the structure of large complexes starting from predictions of subcomponents. We assemble 91 out of 175 complexes with 10-30 chains from predicted subcomponents using Monte Carlo tree search, with a median TM-score of 0.51. There are 30 highly accurate complexes (TM-score ≥0.8, 33% of complete assemblies). We create a scoring function, mpDockQ, that can distinguish if assemblies are complete and predict their accuracy. We find that complexes contg. symmetry are accurately assembled, while asym. complexes remain challenging.
- 24Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583– 589, DOI: 10.1038/s41586-021-03819-2Google Scholar22https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3MXhvVaktrrL&md5=25964ab1157cd5b74a437333dd86650dHighly accurate protein structure prediction with AlphaFoldJumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Zidek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A. A.; Ballard, Andrew J.; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W.; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, DemisNature (London, United Kingdom) (2021), 596 (7873), 583-589CODEN: NATUAS; ISSN:0028-0836. (Nature Portfolio)Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous exptl. effort, the structures of around 100,000 unique proteins have been detd., but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to det. a single protein structure. 
Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence-the structure prediction component of the 'protein folding problem'-has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of at. accuracy, esp. when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with at. accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Crit. Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with exptl. structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates phys. and biol. knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
- 25Rettie, S. A.; Campbell, K. V.; Bera, A. K.; Kang, A.; Kozlov, S.; De La Cruz, J.; Adebomi, V.; Zhou, G.; DiMaio, F.; Ovchinnikov, S.; Bhardwaj, G. Cyclic peptide structure prediction and design using AlphaFold. bioRxiv 2023, DOI: 10.1101/2023.02.25.529956v1
- 26Gang, D.; Kim, D. W.; Park, H. S. Cyclic Peptides: Promising Scaffolds for Biopharmaceuticals. Genes 2018, 9, 557, DOI: 10.3390/genes9110557
- 27Iacovelli, R.; Bovenberg, R. A. L.; Driessen, A. J. M. Nonribosomal peptide synthetases and their biotechnological potential in Penicillium rubens. J. Ind. Microbiol. Biotechnol. 2021, 48, kuab045, DOI: 10.1093/jimb/kuab045
- 28Marahiel, M. A. Working outside the protein-synthesis rules: insights into non-ribosomal peptide synthesis. J. Pept. Sci. 2009, 15, 799–807, DOI: 10.1002/psc.1183
- 29Martínez-Núñez, M. A.; López y López, V. E. Nonribosomal peptides synthetases and their applications in industry. Sustainable Chem. Processes 2016, 4, 13, DOI: 10.1186/s40508-016-0057-6
- 30Sieber, S. A.; Marahiel, M. A. Learning from Nature’s Drug Factories: Nonribosomal Synthesis of Macrocyclic Peptides. J. Bacteriol. 2003, 185, 7036–7043, DOI: 10.1128/JB.185.24.7036-7043.2003
- 31Miao, J.; Descoteaux, M. L.; Lin, Y.-S. Structure prediction of cyclic peptides by molecular dynamics + machine learning. Chem. Sci. 2021, 12, 14927–14936, DOI: 10.1039/D1SC05562C
- 32Jurtz, V. I.; Johansen, A. R.; Nielsen, M.; Almagro Armenteros, J. J.; Nielsen, H.; Sønderby, C. K.; Winther, O.; Sønderby, S. K. An introduction to deep learning on biological sequence data: examples and solutions. Bioinformatics 2017, 33, 3685–3690, DOI: 10.1093/bioinformatics/btx531
- 33Hou, J.; Adhikari, B.; Cheng, J. DeepSF: deep convolutional neural network for mapping protein sequences to folds. Bioinformatics 2018, 34, 1295–1303, DOI: 10.1093/bioinformatics/btx780
- 34Cheng, J.; Liu, Y.; Ma, Y. Protein secondary structure prediction based on integration of CNN and LSTM model. J. Vis. Commun. Image Representation. 2020, 71, 102844, DOI: 10.1016/j.jvcir.2020.102844
- 35Chen, Z.; Min, M. R.; Ning, X. Ranking-Based Convolutional Neural Network Models for Peptide-MHC Class I Binding Prediction. Front. Mol. Biosci. 2021, 8, 634836, DOI: 10.3389/fmolb.2021.634836
- 36Gelman, S.; Fahlberg, S. A.; Heinzelman, P.; Romero, P. A.; Gitter, A. Neural networks to learn protein sequence–function relationships from deep mutational scanning data. Proc. Natl. Acad. Sci. U.S.A. 2021, 118, e2104878118, DOI: 10.1073/pnas.2104878118
- 37Hosseinzadeh, P.; Bhardwaj, G.; Mulligan, V. K.; Shortridge, M. D.; Craven, T. W.; Pardo-Avila, F.; Rettie, S. A.; Kim, D. E.; Silva, D.-A.; Ibrahim, Y. M.; Webb, I. K.; Cort, J. R.; Adkins, J. N.; Varani, G.; Baker, D. Comprehensive computational design of ordered peptide macrocycles. Science 2017, 358, 1461–1466, DOI: 10.1126/science.aap7577
- 38Li, X.; Du, X.; Li, J.; Gao, Y.; Pan, Y.; Shi, J.; Zhou, N.; Xu, B. Introducing d-Amino Acid or Simple Glycoside into Small Peptides to Enable Supramolecular Hydrogelators to Resist Proteolysis. Langmuir 2012, 28, 13512–13517, DOI: 10.1021/la302583a
- 39Liu, J.; Liu, J.; Chu, L.; Zhang, Y.; Xu, H.; Kong, D.; Yang, Z.; Yang, C.; Ding, D. Self-Assembling Peptide of d-Amino Acids Boosts Selectivity and Antitumor Efficacy of 10-Hydroxycamptothecin. ACS Appl. Mater. Interfaces 2014, 6, 5558–5565, DOI: 10.1021/am406007g
- 40Piana, S.; Laio, A. A bias-exchange approach to protein folding. J. Phys. Chem. B 2007, 111, 4553–4559, DOI: 10.1021/jp067873l
- 41Laio, A.; Parrinello, M. Escaping free-energy minima. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 12562–12566, DOI: 10.1073/pnas.202427399
- 42McHugh, S. M.; Rogers, J. R.; Yu, H.; Lin, Y.-S. Insights into How Cyclic Peptides Switch Conformations. J. Chem. Theory Comput. 2016, 12, 2480–2488, DOI: 10.1021/acs.jctc.6b00193
- 43Sugita, Y.; Kitao, A.; Okamoto, Y. Multidimensional replica-exchange method for free-energy calculations. J. Chem. Phys. 2000, 113, 6042–6051, DOI: 10.1063/1.1308516
- 44Pettersen, E. F.; Goddard, T. D.; Huang, C. C.; Couch, G. S.; Greenblatt, D. M.; Meng, E. C.; Ferrin, T. E. UCSF Chimera-a visualization system for exploratory research and analysis. J. Comput. Chem. 2004, 25, 1605–1612, DOI: 10.1002/jcc.20084
- 45Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25, DOI: 10.1016/j.softx.2015.06.001
- 46Zhou, C.-Y.; Jiang, F.; Wu, Y.-D. Residue-Specific Force Field Based on Protein Coil Library. RSFF2: Modification of AMBER ff99SB. J. Phys. Chem. B 2015, 119, 1035–1047, DOI: 10.1021/jp5064676
- 47Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926– 935, DOI: 10.1063/1.445869Google Scholar45https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaL3sXksF2htL4%253D&md5=a1161334e381746be8c9b15a5e56f704Comparison of simple potential functions for simulating liquid waterJorgensen, William L.; Chandrasekhar, Jayaraman; Madura, Jeffry D.; Impey, Roger W.; Klein, Michael L.Journal of Chemical Physics (1983), 79 (2), 926-35CODEN: JCPSA6; ISSN:0021-9606.Classical Monte Carlo simulations were carried out for liq. H2O in the NPT ensemble at 25° and 1 atm using 6 of the simpler intermol. potential functions for the dimer. Comparisons were made with exptl. thermodn. and structural data including the neutron diffraction results of Thiessen and Narten (1982). The computed densities and potential energies agree with expt. except for the original Bernal-Fowler model, which yields an 18% overest. of the d. and poor structural results. The discrepancy may be due to the correction terms needed in processing the neutron data or to an effect uniformly neglected in the computations. Comparisons were made for the self-diffusion coeffs. obtained from mol. dynamics simulations.
- 48Jiang, F.; Zhou, C.-Y.; Wu, Y.-D. Residue-Specific Force Field Based on the Protein Coil Library. RSFF1: Modification of OPLS-AA/L. J. Phys. Chem. B 2014, 118, 6983– 6998, DOI: 10.1021/jp5017449Google Scholar46https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXnslarsrs%253D&md5=65279dd20cb09368a7defaf4b9602bcfResidue-Specific Force Field Based on the Protein Coil Library. RSFF1: Modification of OPLS-AA/LJiang, Fan; Zhou, Chen-Yang; Wu, Yun-DongJournal of Physical Chemistry B (2014), 118 (25), 6983-6998CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)Traditional protein force fields use one set of parameters for most of the 20 amino acids (AAs), allowing transferability of the parameters. However, a significant shortcoming is the difficulty to fit the Ramachandran plots of all AA residues simultaneously, affecting the accuracy of the force field. In this Feature Article, the authors report a new strategy for protein force field parametrization. Backbone and side-chain conformational distributions of all 20 AA residues obtained from protein coil library were used as the target data. The dihedral angle (torsion) potentials and some local nonbonded (1-4/1-5/1-6) interactions in OPLS-AA/L force field were modified such that the target data can be excellently reproduced by mol. dynamics simulations of dipeptides (blocked AAs) in explicit water, resulting in a new force field with AA-specific parameters, RSFF1. An efficient free energy decompn. approach was developed to sep. the corrections on φ and ψ from the two-dimensional Ramachandran plots. RSFF1 is shown to reproduce the exptl. NMR 3J-coupling consts. of AA dipeptides better than other force fields. It has a good balance between α-helical and β-sheet secondary structures. 
It can successfully fold a set of α-helix proteins (Trp-cage and Homeodomain) and β-hairpins (Trpzip-2, GB1 hairpin), which cannot be consistently stabilized by other state-of-the-art force fields. The RSFF1 force field systematically overestimates the melting temp. (and the stability of native state) of these peptides/proteins. It has a potential application in the simulation of protein folding and protein structure refinement.
- 49Lindorff-Larsen, K.; Piana, S.; Palmo, K.; Maragakis, P.; Klepeis, J. L.; Dror, R. O.; Shaw, D. E. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins: Struct., Funct., Bioinf. 2010, 78, 1950– 1958, DOI: 10.1002/prot.22711Google Scholar47https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXkvFegtLo%253D&md5=447a9004026e2b93f0f7beff165daa09Improved side-chain torsion potentials for the Amber ff99SB protein force fieldLindorff-Larsen, Kresten; Piana, Stefano; Palmo, Kim; Maragakis, Paul; Klepeis, John L.; Dror, Ron O.; Shaw, David E.Proteins: Structure, Function, and Bioinformatics (2010), 78 (8), 1950-1958CODEN: PSFBAF ISSN:. (Wiley-Liss, Inc.)Recent advances in hardware and software have enabled increasingly long mol. dynamics (MD) simulations of biomols., exposing certain limitations in the accuracy of the force fields used for such simulations and spurring efforts to refine these force fields. Recent modifications to the Amber and CHARMM protein force fields, for example, have improved the backbone torsion potentials, remedying deficiencies in earlier versions. Here, the authors further advance simulation accuracy by improving the amino acid side-chain torsion potentials of the Amber ff99SB force field. First, the authors used simulations of model alpha-helical systems to identify the four residue types whose rotamer distribution differed the most from expectations based on Protein Data Bank statistics. Second, the authors optimized the side-chain torsion potentials of these residues to match new, high-level quantum-mech. calcns. Finally, the authors used microsecond-timescale MD simulations in explicit solvent to validate the resulting force field against a large set of exptl. NMR measurements that directly probe side-chain conformations. The new force field, which the authors have termed Amber ff99SB-ILDN, exhibits considerably better agreement with the NMR data. 
Proteins 2010. © 2010 Wiley-Liss, Inc.
- 50Geng, H.; Jiang, F.; Wu, Y.-D. Accurate Structure Prediction and Conformational Analysis of Cyclic Peptides with Residue-Specific Force Fields. J. Phys. Chem. Lett. 2016, 7, 1805– 1810, DOI: 10.1021/acs.jpclett.6b00452Google Scholar48https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XntFCiu74%253D&md5=d9c4be040be98396c50ae44a2fc97fbbAccurate Structure Prediction and Conformational Analysis of Cyclic Peptides with Residue-Specific Force FieldsGeng, Hao; Jiang, Fan; Wu, Yun-DongJournal of Physical Chemistry Letters (2016), 7 (10), 1805-1810CODEN: JPCLCD; ISSN:1948-7185. (American Chemical Society)Cyclic peptides (CPs) are promising candidates for drugs, chem. biol. tools, and self-assembling nanomaterials. However, the development of reliable and accurate computational methods for their structure prediction has been challenging. Here, 20 all-trans CPs of 5-12 residues selected from Cambridge Structure Database have been simulated using replica-exchange mol. dynamics with four different force fields. The authors' recently developed residue-specific force fields RSFF1 and RSFF2 can correctly identify the crystal-like conformations of more than half CPs as the most populated conformation. The RSFF2 performs the best, which consistently predicts the crystal structures of 17 out of 20 CPs with RMSD < 1.1 Å. The authors also compared the backbone (φ,ψ) sampling of residues in CPs which those in short linear peptides and in globular proteins. In general, unlike linear peptides, CPs have local conformational free energies and entropies quite similar to globular proteins.
- 51Kaminski, G. A.; Friesner, R. A.; Tirado-Rives, J.; Jorgensen, W. L. Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 2001, 105, 6474– 6487, DOI: 10.1021/jp003919dGoogle Scholar49https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3MXislKhsLk%253D&md5=3ff059626977ee7f6342466f5820f5b7Evaluation and Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate Quantum Chemical Calculations on PeptidesKaminski, George A.; Friesner, Richard A.; Tirado-Rives, Julian; Jorgensen, William L.Journal of Physical Chemistry B (2001), 105 (28), 6474-6487CODEN: JPCBFK; ISSN:1089-5647. (American Chemical Society)We present results of improving the OPLS-AA force field for peptides by means of refitting the key Fourier torsional coeffs. The fitting technique combines using accurate ab initio data as the target, choosing an efficient fitting subspace of the whole potential-energy surface, and detg. wts. for each of the fitting points based on magnitudes of the potential-energy gradient. The av. energy RMS deviation from the LMP2/cc-pVTZ(-f)//HF/6-31G** data is reduced by ∼40% from 0.81 to 0.47 kcal/mol as a result of the fitting for the electrostatically uncharged dipeptides. Transferability of the parameters is demonstrated by using the same alanine dipeptide-fitted backbone torsional parameters for all of the other dipeptides (with the appropriate side-chain refitting) and the alanine tetrapeptide. Parameters of nonbonded interactions have also been refitted for the sulfur-contg. dipeptides (cysteine and methionine), and the validity of the new Coulombic charges and the van der Waals σ's and ε's is proved through reproducing gas-phase energies of complex formation heats of vaporization and densities of pure model liqs. 
Moreover, a novel approach to fitting torsional parameters for electrostatically charged mol. systems has been presented and successfully tested on five dipeptides with charged side chains.
- 52Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577– 8593, DOI: 10.1063/1.470117Google Scholar50https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2MXptlehtrw%253D&md5=092a679dd3bee08da28df41e302383a7A smooth particle mesh Ewald methodEssmann, Ulrich; Perera, Lalith; Berkowitz, Max L.; Darden, Tom; Lee, Hsing; Pedersen, Lee G.Journal of Chemical Physics (1995), 103 (19), 8577-93CODEN: JCPSA6; ISSN:0021-9606. (American Institute of Physics)The previously developed particle mesh Ewald method is reformulated in terms of efficient B-spline interpolation of the structure factors. This reformulation allows a natural extension of the method to potentials of the form 1/rp with p ≥ 1. Furthermore, efficient calcn. of the virial tensor follows. Use of B-splines in the place of Lagrange interpolation leads to analytic gradients as well as a significant improvement in the accuracy. The authors demonstrate that arbitrary accuracy can be achieved, independent of system size N, at a cost that scales as N log(N). For biomol. systems with many thousands of atoms and this method permits the use of Ewald summation at a computational cost comparable to that of a simple truncation method of 10 Å or less.
- 53Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604– 613, DOI: 10.1016/j.cpc.2013.09.018Google Scholar51https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhs1yqs7fJ&md5=292009aab558d0ef1108bb9a5f036c40PLUMED 2: New feathers for an old birdTribello, Gareth A.; Bonomi, Massimiliano; Branduardi, Davide; Camilloni, Carlo; Bussi, GiovanniComputer Physics Communications (2014), 185 (2), 604-613CODEN: CPHCBZ; ISSN:0010-4655. (Elsevier B.V.)Enhancing sampling and analyzing simulations are central issues in mol. simulation. Recently, we introduced PLUMED, an open-source plug-in that provides some of the most popular mol. dynamics (MD) codes with implementations of a variety of different enhanced sampling algorithms and collective variables (CVs). The rapid changes in this field, in particular new directions in enhanced sampling and dimensionality redn. together with new hardware, require a code that is more flexible and more efficient. We therefore present PLUMED 2 here-a complete rewrite of the code in an object-oriented programming language (C++). This new version introduces greater flexibility and greater modularity, which both extends its core capabilities and makes it far easier to add new methods and CVs. It also has a simpler interface with the MD engines and provides a single software library contg. both tools and core facilities. Ultimately, the new code better serves the ever-growing community of users and contributors in coping with the new challenges arising in the field.
- 54Mu, Y.; Nguyen, P. H.; Stock, G. Energy landscape of a small peptide revealed by dihedral angle principal component analysis. Proteins 2005, 58, 45– 52, DOI: 10.1002/prot.20310Google Scholar52https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BD2cnhvFWgtg%253D%253D&md5=47148e133a48df41753ec9daaf3a01f8Energy landscape of a small peptide revealed by dihedral angle principal component analysisMu Yuguang; Nguyen Phuong H; Stock GerhardProteins (2005), 58 (1), 45-52 ISSN:.A 100 ns molecular dynamics simulation of penta-alanine in explicit water is performed to study the reversible folding and unfolding of the peptide. Employing a standard principal component analysis (PCA) using Cartesian coordinates, the resulting free-energy landscape is found to have a single minimum, thus suggesting a simple, relatively smooth free-energy landscape. Introducing a novel PCA based on a transformation of the peptide dihedral angles, it is found, however, that there are numerous free energy minima of comparable energy (less than or approximately 1 kcal/mol), which correspond to well-defined structures with characteristic hydrogen-bonding patterns. That is, the true free-energy landscape is actually quite rugged and its smooth appearance in the Cartesian PCA represents an artifact of the mixing of internal and overall motion. Well-separated minima corresponding to specific conformational structures are also found in the unfolded part of the free energy landscape, revealing that the unfolded state of penta-alanine is structured rather than random. Performing a connectivity analysis, it is shown that neighboring states are connected by low barriers of similar height and that each state typically makes transitions to three or four neighbor states. Several principal pathways for helix nucleation are identified and discussed in some detail.
- 55Damas, J. M.; Filipe, L. C.; Campos, S. R.; Lousa, D.; Victor, B. L.; Baptista, A. M.; Soares, C. M. Predicting the Thermodynamics and Kinetics of Helix Formation in a Cyclic Peptide Model. J. Chem. Theory Comput. 2013, 9, 5148– 5157, DOI: 10.1021/ct400529kGoogle Scholar53https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhsVylsr7E&md5=bb054688bae9b7be6cfe0f1abcd084b4Predicting the thermodynamics and kinetics of helix formation in a cyclic peptide modelDamas, Joao M.; Filipe, Luis C. S.; Campos, Sara R. R.; Lousa, Diana; Victor, Bruno L.; Baptista, Antonio M.; Soares, Claudio M.Journal of Chemical Theory and Computation (2013), 9 (11), 5148-5157CODEN: JCTCCE; ISSN:1549-9618. (American Chemical Society)The peptide, Ac-(cyclo-2,6)-R-[KAAAD]-NH2 (cyc-RKAAAD), is a short cyclic peptide known to adopt a remarkably stable single-turn α-helix in water. Due to its simplicity and the availability of thermodn. and kinetic exptl. data, cyc-RKAAAD poses as an ideal model for evaluating the aptness of current mol. dynamics (MD) simulation methodologies to accurately sample conformations that reproduce exptl. obsd. properties. Here, the authors extensively sampled the conformational space of cyc-RKAAAD using microsecond-timescale MD simulations. The authors characterized the peptide conformational preferences in terms of secondary structure propensities and, using Cartesian-coordinate principal component anal. (cPCA), constructed its free energy landscape, thus obtaining a detailed weighted discrimination between the helical and nonhelical subensembles. The cPCA state discrimination, together with a Markov model built from it, allowed the authors to est. the free energy of unfolding (-0.57 kJ/mol) and the relaxation time (∼0.435 μs) at 298.15 K, which were in excellent agreement with the exptl. reported values. Addnl., the authors presented simulations conducted using 2 enhanced sampling methods: replica-exchange mol. 
dynamics (REMD) and bias-exchange metadynamics (BE-MetaD). The authors compared the free energy landscape obtained by these 2 methods with the results from MD simulations and discussed the sampling and computational gains achieved. Overall, the results obtained attested to the suitability of modern simulation methods to explore the conformational behavior of peptide systems with a high level of realism.
- 56Hsueh, S. C. C.; Aina, A.; Plotkin, S. S. Ensemble Generation for Linear and Cyclic Peptides Using a Reservoir Replica Exchange Molecular Dynamics Implementation in GROMACS. J. Phys. Chem. B 2022, 126, 10384– 10399, DOI: 10.1021/acs.jpcb.2c05470Google Scholar54https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB38XivFWitLzE&md5=95ddb75adb68092863f1e5e4fa2b3ce6Ensemble Generation for Linear and Cyclic Peptides Using a Reservoir Replica Exchange Molecular Dynamics Implementation in GROMACSHsueh, Shawn C. C.; Aina, Adekunle; Plotkin, Steven S.Journal of Physical Chemistry B (2022), 126 (49), 10384-10399CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)The profile of shapes presented by a cyclic peptide modulates its therapeutic efficacy and is represented by the ensemble of its sampled conformations. Although some algorithms excel at creating a diverse ensemble of cyclic peptide conformations, they seldom address the entropic contribution of flexible conformations and often have significant practical difficulty producing an ensemble with converged and reliable thermodn. properties. In this study, an accelerated mol. dynamics (MD) method, namely, reservoir replica exchange MD (R-REMD or Res-REMD), was implemented in GROMACS ver. 4.6.7 and benchmarked on two small cyclic peptide model systems: a cyclized furin cleavage site of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike (cyclo-(CGPRRARSG)) and oxytocin (disulfide-bonded CYIQNCPLG). Addnl., we also benchmarked Res-REMD on alanine dipeptide and Trpzip2 to demonstrate its validity and efficiency over REMD. For Trpzip2, Res-REMD coupled with an umbrella-sampling-derived reservoir generated similar folded fractions as regular REMD but on a much faster time scale. For cyclic peptides, Res-REMD appeared to be marginally faster than REMD in ensemble generation. 
Finally, Res-REMD was more effective in sampling rare events such as trans to cis peptide bond isomerization. We provide a GitHub page with the modified GROMACS source code for running Res-REMD at https://github.com/PlotkinLab/Reservoir-REMD.
- 57Rodriguez, A.; Laio, A. Clustering by fast search and find of density peaks. Science 2014, 344, 1492– 1496, DOI: 10.1126/science.1242072Google Scholar55https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhtVaks7%252FL&md5=be52ecdc50ba56d5bcceb3135cdb87daClustering by fast search and find of density peaksRodriguez, Alex; Laio, AlessandroScience (Washington, DC, United States) (2014), 344 (6191), 1492-1496CODEN: SCIEAS; ISSN:0036-8075. (American Association for the Advancement of Science)Cluster anal. is aimed at classifying elements into categories on the basis of their similarity. Its applications range from astronomy to bioinformatics, bibliometrics, and pattern recognition. We propose an approach based on the idea that cluster centers are characterized by a higher d. than their neighbors and by a relatively large distance from points with higher densities. This idea forms the basis of a clustering procedure in which the no. of clusters arises intuitively, outliers are automatically spotted and excluded from the anal., and clusters are recognized regardless of their shape and of the dimensionality of the space in which they are embedded. We demonstrate the power of the algorithm on several test cases.
- 58Morgan, H. L. The Generation of a Unique Machine Description for Chemical Structures-A Technique Developed at Chemical Abstracts Service. J. Chem. Doc. 1965, 5, 107– 113, DOI: 10.1021/c160017a018Google Scholar56https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaF2MXkt1Omtr0%253D&md5=63dacaaebba9a603360996ca690e56c3Generation of a unique machine description for chemical structures--a technique developed at Chemical Abstracts ServiceMorgan, H. L.Journal of Chemical Documentation (1965), 5 (2), 107-13CODEN: JCHDAN; ISSN:0021-9576.The description employed is a uniquely ordered list of the node symbols of the structure (or graph) in which the value (at. symbol) of each node and its attachment (bonding) to the other nodes of the total structure. When the entire structure has been numbered according to a given set of rules, the connection table is formed by recording the structural relation by a process of successive partial orderings.
- 59Landrum, G. RDKit: Open-Source Cheminformatics Software , 2021. https://www.rdkit.org/.Google ScholarThere is no corresponding record for this reference.
- 60Nair, V.; Hinton, G. E. Rectified Linear Units Improve Restricted Boltzmann Machines. In Proceedings of the 27th International Conference on Machine Learning (ICML’10), Haifa, Israel, June 21–24, 2010; Fürnkranz, J., Joachims, T., Eds.; Omnipress: Madison, WI, 2010; pp 807– 814.Google ScholarThere is no corresponding record for this reference.
- 61Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv (Computer Science.Machine Learning) , January 30, 2017, 1412.6980, ver. 9. https://arxiv.org/abs/1412.6980 (accessed 2023-03-31).Google ScholarThere is no corresponding record for this reference.
- 62Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; Desmaison, A.; Kopf, A.; Yang, E.; DeVito, Z.; Raison, M.; Tejani, A.; Chilamkurthy, S.; Steiner, B.; Fang, L.; Bai, J.; Chintala, S. Pytorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32 (NeurIPS 2019), Vancouver, Canada, December 8–14, 2019; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, 2019; pp 8024– 8035.Google ScholarThere is no corresponding record for this reference.
- 63Fey, M.; Lenssen, J. E. Fast graph representation learning with PyTorch Geometric. arXiv (Computer Science.Machine Learning) , April 25, 2019, 1903.02428, ver. 3. https://arxiv.org/abs/1903.02428 (accessed 2023-03-31).Google ScholarThere is no corresponding record for this reference.
- 64Prechelt, L. Automatic early stopping using cross validation: quantifying the criteria. Neural Netw. 1998, 11, 761– 767, DOI: 10.1016/S0893-6080(98)00010-0Google Scholar62https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC2sbnt1eguw%253D%253D&md5=cc470412bcc35d8792c76b7d53d04386Automatic early stopping using cross validation: quantifying the criteriaPrechelt LutzNeural networks : the official journal of the International Neural Network Society (1998), 11 (4), 761-767 ISSN:.Cross validation can be used to detect when overfitting starts during supervised training of a neural network; training is then stopped before convergence to avoid the overfitting ('early stopping'). The exact criterion used for cross validation based early stopping, however, is chosen in an ad-hoc fashion by most researchers or training is stopped interactively. To aid a more well-founded selection of the stopping criterion, 14 different automatic stopping criteria from three classes were evaluated empirically for their efficiency and effectiveness in 12 different classification and approximation tasks using multi-layer perceptrons with RPROP training. The experiments show that, on average, slower stopping criteria allow for small improvements in generalization (in the order of 4%), but cost about a factor of 4 longer in training time.
- 65Schlichtkrull, M.; Kipf, T. N.; Bloem, P.; van den Berg, R.; Titov, I.; Welling, M. Modeling relational data with graph convolutional networks. arXiv (Statistics.Machine Learning) , October 26, 2017, 1703.06103, ver. 4. https://arxiv.org/abs/1703.06103 (accessed 2023-03-31).Google ScholarThere is no corresponding record for this reference.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jctc.3c00154.
Lists of sequences in the training and test datasets; structural binning maps 2 and 3; hyperparameter tuning schemes; neural network model performances on the training datasets for cyclic pentapeptides and cyclic hexapeptides; performances of linear regression and neural network models including only (1,2) or only (1,3) interactions on training and test datasets for cyclic pentapeptides; performances of linear regression and neural network models including only (1,2), only (1,3), or only (1,4) interactions on training and test datasets for cyclic hexapeptides; comparison of model performances using different binning maps (PDF)