Molecular Dynamics (MD)-Derived Features for Canonical and Noncanonical Amino Acids
- Tiffani Hui, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Maxim Secor, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Minh Ngoc Ho, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Nomindari Bayaraa, Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Yu-Shan Lin* (Email: [email protected]), Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
Abstract
Machine learning (ML) models have become increasingly popular for predicting and designing structures and properties of peptides and proteins. These ML models typically use peptides and proteins containing only canonical amino acids as the training data. Consequently, these models struggle to make accurate predictions for peptides and proteins containing new amino acids that are absent in the training data set (e.g., noncanonical amino acids). One approach to improve the accuracy of the models is to collect more training data with the desired amino acids. However, this strategy is suboptimal as new data may not be easily attainable, and additional time is required to retrain the ML models. Alternatively, the extendibility of the ML models can be improved if the amino acid features used are representative and generalizable to the unseen amino acids. Herein, we develop amino acid features using molecular dynamics (MD) simulation results. Specifically, for a given amino acid, we perform an MD simulation of its dipeptide to create features based on its backbone (ϕ, ψ) distributions and its electrostatic potentials. We demonstrate that these new features enable our ML models to more accurately predict the structural ensembles of cyclic peptides containing amino acids not present in the original training data set. For example, we build ML models to predict cyclic pentapeptide structures, with the training data set containing a library of 15 amino acids and the test data set containing the same 15-amino-acid library or an extended 50-amino-acid library. When using popular features such as Morgan fingerprints and MACCS keys to represent amino acids, the ML models achieve R² = 0.963 for structural predictions of test cyclic pentapeptides containing the same 15-amino-acid library. However, the performance of these models decreases significantly to R² = 0.430 and R² = 0.508, respectively, when tasked to predict the structures of cyclic pentapeptides containing a library of 50 amino acids. On the other hand, the model using our backbone (ϕ, ψ) features outperforms those using Morgan fingerprints and MACCS keys, with R² = 0.700. Overall, instead of having to collect more training data, our new features enable predictions of peptide sequences containing amino acids not originally present in the training data set at the mere cost of performing new dipeptide simulations for the new amino acids.
This publication is licensed under
License Summary*
You are free to share (copy and redistribute) this article in any medium or format within the parameters below:
Creative Commons (CC): This is a Creative Commons license.
Attribution (BY): Credit must be given to the creator.
Non-Commercial (NC): Only non-commercial uses of the work are permitted.
No Derivatives (ND): Derivative works may be created for non-commercial purposes, but sharing is prohibited.
*Disclaimer
This summary highlights only some of the key features and terms of the actual license. It is not a license and has no legal value. Carefully review the actual license before using these materials.
Introduction
Figure 1. Overview of custom features for amino acids. (A) Position-aware side chain (PASC) fingerprints are based on a “heavy-atom walk” along the amino acid side chain. Starting at the Cα atom, a Morgan fingerprint with a radius of 1 is generated (red circle). Morgan fingerprints with a radius of 1 centered at the Cβ atom (orange circle), Cγ atom (yellow circle), etc., are similarly generated to provide the PASC features. (B) MD simulations of amino acid dipeptides are used to generate MD-derived features. For the backbone (BB) features, the (ϕ, ψ) distribution is calculated from the dipeptide simulation and binned in a 2D grid. Then, the resulting 2D probability density is flattened into a 1D vector. For the voxel (VOX) features, the simulation frames are aligned to reference coordinates for C, Cα, and N, where the Cα atom is at the origin, the N atom is at (1.449, 0.000, 0.000), and the C atom is at (−0.523, 1.429, 0.000) (in Å). Then, the frame-averaged molecular electrostatic potential is calculated on a 3D voxel grid and flattened into a 1D vector. See the Methods section for more details.
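The two preprocessing steps described in panel B can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the authors' code: the function names `bb_features` and `to_backbone_frame`, the 12 × 12 grid, and the normalization to unit sum are assumptions made here. The frame construction places Cα at the origin, N on the +x axis, and C in the xy-plane, which matches the orientation of the reference coordinates in the caption; the actual study may instead use a least-squares fit to those coordinates.

```python
import math

def bb_features(phi_psi, n_bins=12):
    """Bin (phi, psi) pairs (degrees) on an n_bins x n_bins grid over
    [-180, 180) and return the normalized density as a flat list.
    The grid size is a hypothetical choice, not taken from the article."""
    width = 360.0 / n_bins
    counts = [0.0] * (n_bins * n_bins)
    for phi, psi in phi_psi:
        # Wrap each angle into [0, 360) and locate its bin; the min()
        # guards against floating-point round-up at the upper edge.
        i = min(int(((phi + 180.0) % 360.0) // width), n_bins - 1)
        j = min(int(((psi + 180.0) % 360.0) // width), n_bins - 1)
        counts[i * n_bins + j] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def _sub(a, b):
    return [a[k] - b[k] for k in range(3)]

def _dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return [x / norm for x in a]

def to_backbone_frame(n, ca, c, points):
    """Express points in a local frame with CA at the origin, N on +x,
    and C in the xy-plane with positive y (an assumed alignment scheme)."""
    ex = _unit(_sub(n, ca))                 # x axis: CA -> N
    ez = _unit(_cross(ex, _sub(c, ca)))     # z axis: normal to the N-CA-C plane
    ey = _cross(ez, ex)                     # y axis completes a right-handed frame
    return [[_dot(_sub(p, ca), e) for e in (ex, ey, ez)] for p in points]
```

Once every frame is expressed in this common backbone frame, a per-frame quantity such as the electrostatic potential can be evaluated on a fixed grid and averaged over frames; that step is not shown here.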
Figure 2. Amino acids included in the training, validation, and test data sets for the StrEAMM models. The training and validation data sets include cyclic peptide sequences from a 15-amino-acid (15-aa) library (black box). The test data sets include sequences containing amino acids in the same 15-aa library, the 37-aa library (blue box), or the 50-aa library (purple box). *For brevity, only the L-forms of chiral amino acids are depicted, but their mirror images are also included in the library.
Methods
Charge Derivation for Noncanonical Amino Acids (ncAAs)
Initial Charge Derivation
Simulated Annealing
Final Charge Derivation
MD Simulations of Amino Acid Dipeptides
Cyclic Pentapeptide and Cyclic Hexapeptide Data Sets
Position-Aware Side Chain (PASC) Fingerprints
Backbone (BB) Features
Electrostatic Potential Voxel (VOX) Features
StrEAMM Neural Network Models
Results and Discussion
Application of FPs, MACCS Keys, and Our New Features for Cyclic Peptide Structural Ensemble Prediction: 15-aa Training Data Set, 15-aa Test Data Set
Figure 3. Performance of different amino acid features on different cyclic pentapeptide test data sets. (A) The models are trained using 3-fold cross-validation. The table reports the average R² (coefficient of determination) and standard deviation across the 3 folds. (B) The performance for one out of the three models from the 3-fold cross-validation is plotted. The predicted population of each structure in the cyclic peptides’ structural ensemble is compared to its population observed in MD simulations. For clarity, only structures in the ensembles for all cyclic peptides in the test data sets with either a predicted or observed (in MD) percent population of >1% are plotted.
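As a rough illustration of the quantities named in this caption, the sketch below computes a coefficient of determination between observed and predicted populations, summarizes it across folds, and applies the >1% plotting filter. The function names are hypothetical, and two details are assumptions rather than facts from the article: R² is taken as 1 − SS_res/SS_tot, and the spread across folds is the sample standard deviation.

```python
from statistics import mean, stdev

def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def summarize_folds(fold_scores):
    """Average and sample standard deviation of per-fold R² values."""
    return mean(fold_scores), stdev(fold_scores)

def plotted_points(observed, predicted, threshold=1.0):
    """Keep (observed, predicted) percent populations where either exceeds
    the threshold, mirroring the >1% filter described for panel B."""
    return [(o, p) for o, p in zip(observed, predicted)
            if o > threshold or p > threshold]
```

For example, `summarize_folds` applied to three per-fold scores returns the pair that would fill one cell of a table like panel A.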
Figure 4. Performance of different amino acid features on different cyclic hexapeptide test data sets. (A) The models are trained using 3-fold cross-validation. The table reports the average R² (coefficient of determination) and standard deviation across the 3 folds. (B) The performance for one out of the three models from the 3-fold cross-validation is plotted. The predicted population of each structure in the cyclic peptides’ structural ensemble is compared to its population observed in MD simulations. For clarity, only structures in the ensembles for all cyclic peptides in the test data sets with either a predicted or observed (in MD) percent population of >1% are plotted.
Application of FPs, MACCS Keys, and Our New Features on Extended Test Data Sets: 15-aa Training Data Set, 37-aa and 50-aa Test Data Sets
Evaluation of the StrEAMM Model Performances on the Extended Test Data Sets Using Combinations of Features
Figure 5. Performance of different combinations of amino acid features on cyclic pentapeptide (top) and hexapeptide (bottom) test data sets containing 15 AAs (left), 37 AAs (middle), and 50 AAs (right). The models are trained using 3-fold cross-validation, and the average R² and standard deviation are reported. The models using a single type of feature (e.g., “BB only”) are represented on the diagonals. The best-performing models for the 37 AA and 50 AA test data sets, based on the average R², are boxed with bold black outlines.
Conclusions
Data Availability
The new features that were used to train the models in this study can be found in the public GitHub repository, https://github.com/thui16/MD_derived_features_for_aas/. The StrEAMM models and MD simulation data used to train these models are under a patent application (please see Competing Interests section) and are not publicly available.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.4c02102.
Example of a (ϕ, ψ) distribution binned using various grid sizes; comparison of backbone (BB) features of amino acids in the different amino acid libraries; comparison of the different normalization schemes applied to the BB and voxel (VOX) features; examples of learning curves from hyperparameter tuning; model performances (reporting R²) on the training and validation data sets for the cyclic pentapeptides; p-values from t-tests comparing different features; model performances (reporting R²) on the training and validation data sets for the cyclic hexapeptides; model performances (reporting weighted error, WE) on the various test data sets for the cyclic pentapeptides and cyclic hexapeptides; model performances (reporting R²) using combinations of features on the training and validation data sets for the cyclic pentapeptides and cyclic hexapeptides (PDF)
Acknowledgments
This work was supported by the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM124160 (PI: Y.-S.L.). We are grateful for the support from the Tufts Technology Services and for the computing resources at the Tufts Research Cluster. Initial structures for the simulations were built using UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH Grant P41-GM103311.
References
This article references 55 other publications.
- 1Wang, L.; Wang, N.; Zhang, W.; Cheng, X.; Yan, Z.; Shao, G.; Wang, X.; Wang, R.; Fu, C. Therapeutic peptides: current applications and future directions. Signal Transduction Targeted Ther. 2022, 7, 48 DOI: 10.1038/s41392-022-00904-4Google Scholar1Therapeutic peptides: current applications and future directionsWang, Lei; Wang, Nanxi; Zhang, Wenping; Cheng, Xurui; Yan, Zhibin; Shao, Gang; Wang, Xi; Wang, Rui; Fu, CaiyunSignal Transduction and Targeted Therapy (2022), 7 (1), 48CODEN: STTTCB; ISSN:2059-3635. (Nature Portfolio)A review. Peptide drug development has made great progress in the last decade thanks to new prodn., modification, and analytic technologies. Peptides have been produced and modified using both chem. and biol. methods, together with novel design and delivery strategies, which have helped to overcome the inherent drawbacks of peptides and have allowed the continued advancement of this field. A wide variety of natural and modified peptides have been obtained and studied, covering multiple therapeutic areas. This review summarizes the efforts and achievements in peptide drug discovery, prodn., and modification, and their current applications. We also discuss the value and challenges assocd. with future developments in therapeutic peptides.
- 2Liu, K.; Li, M.; Li, Y.; Li, Y.; Chen, Z.; Tang, Y.; Yang, M.; Deng, G.; Liu, H. A review of the clinical efficacy of FDA-approved antibody–drug conjugates in human cancers. Mol. Cancer 2024, 23, 62 DOI: 10.1186/s12943-024-01963-7Google ScholarThere is no corresponding record for this reference.
- 3Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583– 589, DOI: 10.1038/s41586-021-03819-2Google Scholar3Highly accurate protein structure prediction with AlphaFoldJumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Zidek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A. A.; Ballard, Andrew J.; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W.; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, DemisNature (London, United Kingdom) (2021), 596 (7873), 583-589CODEN: NATUAS; ISSN:0028-0836. (Nature Portfolio)Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous exptl. effort, the structures of around 100,000 unique proteins have been detd., but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to det. a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. 
Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence-the structure prediction component of the 'protein folding problem'-has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of at. accuracy, esp. when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with at. accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Crit. Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with exptl. structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates phys. and biol. knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
- 4Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G. R.; Wang, J.; Cong, Q.; Kinch, L. N.; Schaeffer, R. D.; Millán, C.; Park, H.; Adams, C.; Glassman, C. R.; DeGiovanni, A.; Pereira, J. H.; Rodrigues, A. V.; van Dijk, A. A.; Ebrecht, A. C.; Opperman, D. J.; Sagmeister, T.; Buhlheller, C.; Pavkov-Keller, T.; Rathinaswamy, M. K.; Dalwadi, U.; Yip, C. K.; Burke, J. E.; Garcia, K. C.; Grishin, N. V.; Adams, P. D.; Read, R. J.; Baker, D. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871– 876, DOI: 10.1126/science.abj8754Google Scholar4Accurate prediction of protein structures and interactions using a three-track neural networkBaek, Minkyung; DiMaio, Frank; Anishchenko, Ivan; Dauparas, Justas; Ovchinnikov, Sergey; Lee, Gyu Rie; Wang, Jue; Cong, Qian; Kinch, Lisa N.; Schaeffer, R. Dustin; Millan, Claudia; Park, Hahnbeom; Adams, Carson; Glassman, Caleb R.; DeGiovanni, Andy; Pereira, Jose H.; Rodrigues, Andria V.; van Dijk, Alberdina A.; Ebrecht, Ana C.; Opperman, Diederik J.; Sagmeister, Theo; Buhlheller, Christoph; Pavkov-Keller, Tea; Rathinaswamy, Manoj K.; Dalwadi, Udit; Yip, Calvin K.; Burke, John E.; Garcia, K. Christopher; Grishin, Nick V.; Adams, Paul D.; Read, Randy J.; Baker, DavidScience (Washington, DC, United States) (2021), 373 (6557), 871-876CODEN: SCIEAS; ISSN:1095-9203. (American Association for the Advancement of Science)DeepMind presented notably accurate predictions at the recent 14th Crit. Assessment of Structure Prediction (CASP14) conference. We explored network architectures that incorporate related ideas and obtained the best performance with a three-track network in which information at the one-dimensional (1D) sequence level, the 2D distance map level, and the 3D coordinate level is successively transformed and integrated. 
The three-track network produces structure predictions with accuracies approaching those of DeepMind in CASP14, enables the rapid soln. of challenging x-ray crystallog. and cryo-electron microscopy structure modeling problems, and provides insights into the functions of proteins of currently unknown structure. The network also enables rapid generation of accurate protein-protein complex models from sequence information alone, short-circuiting traditional approaches that require modeling of individual subunits followed by docking. We make the method available to the scientific community to speed biol. research.
- 5Miao, J.; Descoteaux, M. L.; Lin, Y.-S. Structure prediction of cyclic peptides by molecular dynamics + machine learning. Chem. Sci. 2021, 12, 14927– 14936, DOI: 10.1039/D1SC05562CGoogle Scholar5Structure prediction of cyclic peptides by molecular dynamics + machine learningMiao, Jiayuan; Descoteaux, Marc L.; Lin, Yu-ShanChemical Science (2021), 12 (44), 14927-14936CODEN: CSHCCN; ISSN:2041-6520. (Royal Society of Chemistry)Recent computational methods have made strides in discovering well-structured cyclic peptides that preferentially populate a single conformation. However, many successful cyclic-peptide therapeutics adopt multiple conformations in soln. In fact, the chameleonic properties of some cyclic peptides are likely responsible for their high cell membrane permeability. Thus, we require the ability to predict complete structural ensembles for cyclic peptides, including the majority of cyclic peptides that have broad structural ensembles, to significantly improve our ability to rationally design cyclic-peptide therapeutics. Here, we introduce the idea of using mol. dynamics simulation results to train machine learning models to enable efficient structure prediction for cyclic peptides. Using mol. dynamics simulation results for several hundred cyclic pentapeptides as the training datasets, we developed machine-learning models that can provide mol. dynamics simulation-quality predictions of structural ensembles for all the hundreds of thousands of sequences in the entire sequence space. The prediction for each individual cyclic peptide can be made using less than 1 s of computation time. Even for the most challenging classes of poorly structured cyclic peptides with broad conformational ensembles, our predictions were similar to those one would normally obtain only after running multiple days of explicit-solvent mol. dynamics simulations. The resulting method, termed StrEAMM (Structural Ensembles Achieved by Mol. 
Dynamics and Machine Learning), is the first technique capable of efficiently predicting complete structural ensembles of cyclic peptides without relying on addnl. mol. dynamics simulations, constituting a seven-order-of-magnitude improvement in speed while retaining the same accuracy as explicit-solvent simulations.
- 6Hui, T.; Descoteaux, M. L.; Miao, J.; Lin, Y.-S. Training neural network models using molecular dynamics simulation results to efficiently predict cyclic hexapeptide structural ensembles. J. Chem. Theory Comput. 2023, 19, 4757– 4769, DOI: 10.1021/acs.jctc.3c00154Google ScholarThere is no corresponding record for this reference.
- 7Wan, F.; Kontogiorgos-Heintz, D.; de la Fuente-Nunez, C. Deep generative models for peptide design. Digital Discovery 2022, 1, 195– 208, DOI: 10.1039/D1DD00024AGoogle ScholarThere is no corresponding record for this reference.
- 8Ferguson, A. L.; Ranganathan, R. 100th anniversary of macromolecular science viewpoint: data-driven protein design. ACS Macro Lett. 2021, 10, 327– 340, DOI: 10.1021/acsmacrolett.0c00885Google Scholar8100Th Anniversary of Macromolecular Science Viewpoint: Data-Driven Protein DesignFerguson, Andrew L.; Ranganathan, RamaACS Macro Letters (2021), 10 (3), 327-340CODEN: AMLCCD; ISSN:2161-1653. (American Chemical Society)A review. The design of synthetic proteins with the desired function is a long-standing goal in biomol. science, with broad applications in biochem. engineering, agriculture, medicine, and public health. Rational de novo design and exptl. directed evolution have achieved remarkable successes but are challenged by the requirement to find functional "needles" in the vast "haystack" of protein sequence space. Data-driven models for fitness landscapes provide a predictive map between protein sequence and function and can prospectively identify functional candidates for exptl. testing to greatly improve the efficiency of this search. This Viewpoint reviews the applications of machine learning and, in particular, deep learning as part of data-driven protein engineering platforms. We highlight recent successes, review promising computational methodologies, and provide an outlook on future challenges and opportunities. The article is written for a broad audience comprising both polymer and protein scientists and computer and data scientists interested in an up-to-date review of recent innovations and opportunities in this rapidly evolving field.
- 9Strokach, A.; Kim, P. M. Deep generative modeling for protein design. Curr. Opin. Struct. Biol. 2022, 72, 226– 236, DOI: 10.1016/j.sbi.2021.11.008Google Scholar9Deep generative modeling for protein designStrokach, Alexey; Kim, Philip M.Current Opinion in Structural Biology (2022), 72 (), 226-236CODEN: COSBEF; ISSN:0959-440X. (Elsevier Ltd.)A review. Deep learning approaches have produced substantial breakthroughs in fields such as image classification and natural language processing and are making rapid inroads in the area of protein design. Many generative models of proteins have been developed that encompass all known protein sequences, model specific protein families, or extrapolate the dynamics of individual proteins. Those generative models can learn protein representations that are often more informative of protein structure and function than hand-engineered features. Furthermore, they can be used to quickly propose millions of novel proteins that resemble the native counterparts in terms of expression level, stability, or other attributes. The protein design process can further be guided by discriminative oracles to select candidates with the highest probability of having the desired properties. In this review, we discuss five classes of generative models that have been most successful at modeling proteins and provide a framework for model guided protein design.
- 10Chandra, A.; Tünnermann, L.; Löfstedt, T.; Gratz, R. Transformer-based deep learning for predicting protein properties in the life sciences. eLife 2023, 12, e82819 DOI: 10.7554/eLife.82819Google ScholarThere is no corresponding record for this reference.
- 11Oliva, R.; Chino, M.; Pane, K.; Pistorio, V.; De Santis, A.; Pizzo, E.; D’Errico, G.; Pavone, V.; Lombardi, A.; Del Vecchio, P.; Notomista, E.; Nastri, F.; Petraccone, L. Exploring the role of unnatural amino acids in antimicrobial peptides. Sci. Rep. 2018, 8, 8888 DOI: 10.1038/s41598-018-27231-5Google Scholar11Exploring the role of unnatural amino acids in antimicrobial peptidesOliva Rosario; Chino Marco; De Santis Augusta; D'Errico Gerardino; Pavone Vincenzo; Lombardi Angela; Del Vecchio Pompea; Nastri Flavia; Petraccone Luigi; Pane Katia; Pizzo Elio; Notomista Eugenio; Pistorio ValeriaScientific reports (2018), 8 (1), 8888 ISSN:.Cationic antimicrobial peptides (CAMPs) are a promising alternative to treat multidrug-resistant bacteria, which have developed resistance to all the commonly used antimicrobial, and therefore represent a serious threat to human health. One of the major drawbacks of CAMPs is their sensitivity to proteases, which drastically limits their half-life. Here we describe the design and synthesis of three nine-residue CAMPs, which showed high stability in serum and broad spectrum antimicrobial activity. As for all peptides a very low selectivity between bacterial and eukaryotic cells was observed, we performed a detailed biophysical characterization of the interaction of one of these peptides with liposomes mimicking bacterial and eukaryotic membranes. Our results show a surface binding on the DPPC/DPPG vesicles, coupled with lipid domain formation, and, above a threshold concentration, a deep insertion into the bilayer hydrophobic core. On the contrary, mainly surface binding of the peptide on the DPPC bilayer was observed. These observed differences in the peptide interaction with the two model membranes suggest a divergence in the mechanisms responsible for the antimicrobial activity and for the observed high toxicity toward mammalian cell lines. 
These results could represent an important contribution to unravel some open and unresolved issues in the development of synthetic CAMPs.
- 12Lu, J.; Xu, H.; Xia, J.; Ma, J.; Xu, J.; Li, Y.; Feng, J. D- and unnatural amino acid substituted antimicrobial peptides with improved proteolytic resistance and their proteolytic degradation characteristics. Front. Microbiol. 2020, 11, 563030 DOI: 10.3389/fmicb.2020.563030Google Scholar12D- and Unnatural Amino Acid Substituted Antimicrobial Peptides With Improved Proteolytic Resistance and Their Proteolytic Degradation CharacteristicsLu Jianguang; Xu Hongjiang; Xia Jianghua; Li Yanan; Feng Jun; Lu Jianguang; Ma Jie; Xu Jun; Feng Jun; Xu Hongjiang; Li YananFrontiers in microbiology (2020), 11 (), 563030 ISSN:1664-302X.The transition of antimicrobial peptides (AMPs) from the laboratory to market has been severely hindered by their instability toward proteases in biological systems. In the present study, we synthesized derivatives of the cationic AMP Pep05 (KRLFKKLLKYLRKF) by substituting L-amino acid residues with D- and unnatural amino acids, such as D-lysine, D-arginine, L-2,4-diaminobutanoic acid (Dab), L-2,3-diaminopropionic acid (Dap), L-homoarginine, 4-aminobutanoic acid (Aib), and L-thienylalanine, and evaluated their antimicrobial activities, toxicities, and stabilities toward trypsin, plasma proteases, and secreted bacterial proteases. In addition to measuring changes in the concentration of the intact peptides, LC-MS was used to identify the degradation products of the modified AMPs in the presence of trypsin and plasma proteases to determine degradation pathways and examine whether the amino acid substitutions afforded improved proteolytic resistance. The results revealed that both D- and unnatural amino acids enhanced the stabilities of the peptides toward proteases. 
The derivative DP06, in which all of the L-lysine and L-arginine residues were replaced by D-amino acids, displayed remarkable stability and mild toxicity in vitro but only slight activity and severe toxicity in vivo, indicating a significant difference between the in vivo and in vitro results. Unexpectedly, we found that the incorporation of a single Aib residue at the N-terminus of compound UP09 afforded remarkably enhanced plasma stability and improved activity in vivo. Hence, this derivative may represent a candidate AMP for further optimization, providing a new strategy for the design of novel AMPs with improved bioavailability.
- 13Taechalertpaisarn, J.; Ono, S.; Okada, O.; Johnstone, T. C.; Lokey, R. S. A new amino acid for improving permeability and solubility in macrocyclic peptides through side chain-to-backbone hydrogen bonding. J. Med. Chem. 2022, 65, 5072– 5084, DOI: 10.1021/acs.jmedchem.2c00010Google Scholar13A new amino acid for improving permeability and solubility in macrocyclic peptides through side chain-to-backbone hydrogen bondingTaechalertpaisarn, Jaru; Ono, Satoshi; Okada, Okimasa; Johnstone, Timothy C.; Lokey, R. ScottJournal of Medicinal Chemistry (2022), 65 (6), 5072-5084CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Despite the notoriously poor membrane permeability of peptides, many cyclic peptide natural products show high passive membrane permeability and potently inhibit a variety of "undruggable" intracellular targets. A major impediment to the design of cyclic peptides with good permeability is the high desolvation energy assocd. with the peptide backbone amide NH groups. While several strategies have been proposed to mitigate this deleterious effect, only few studies have used polar side chains to sequester backbone NH groups. We investigated the ability of N,N-pyrrolidinylglutamine (Pye), whose side chain contains a powerful hydrogen-bond-accepting C:O amide group but no hydrogen-bond donors, to sequester exposed backbone NH groups in a series of cyclic hexapeptide diastereomers. Analyses revealed that specific Leu-to-Pye substitutions conferred dramatic improvements in aq. soly. and permeability in a scaffold- and position-dependent manner. Therefore, this approach offers a complementary tool for improving membrane permeability and soly. in cyclic peptides.
- 14. Geurink, P. P.; van der Linden, W. A.; Mirabella, A. C.; Gallastegui, N.; de Bruin, G.; Blom, A. E.; Voges, M. J.; Mock, E. D.; Florea, B. I.; van der Marel, G. A.; Driessen, C.; van der Stelt, M.; Groll, M.; Overkleeft, H. S.; Kisselev, A. F. Incorporation of non-natural amino acids improves cell permeability and potency of specific inhibitors of proteasome trypsin-like sites. J. Med. Chem. 2013, 56, 1262–1275, DOI: 10.1021/jm3016987
- 15. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36, DOI: 10.1021/ci00057a005
- 16. Siani, M. A.; Weininger, D.; Blaney, J. M. CHUCKLES: A method for representing and searching peptide and peptoid sequences on both monomer and atomic levels. J. Chem. Inf. Comput. Sci. 1994, 34, 588–593, DOI: 10.1021/ci00019a017
- 17. Siani, M. A.; Weininger, D.; James, C. A.; Blaney, J. M. CHORTLES: A method for representing oligomeric and template-based mixtures. J. Chem. Inf. Comput. Sci. 1995, 35, 1026–1033, DOI: 10.1021/ci00028a012
- 18. Jensen, J. H.; Hoeg-Jensen, T.; Padkjær, S. B. Building a biochemformatics database. J. Chem. Inf. Model. 2008, 48, 2404–2413, DOI: 10.1021/ci800128b
- 19. Lin, T.-S.; Coley, C. W.; Mochigase, H.; Beech, H. K.; Wang, W.; Wang, Z.; Woods, E.; Craig, S. L.; Johnson, J. A.; Kalow, J. A.; Jensen, K. F.; Olsen, B. D. BigSMILES: A structurally-based line notation for describing macromolecules. ACS Cent. Sci. 2019, 5, 1523–1531, DOI: 10.1021/acscentsci.9b00476
- 20. Zhang, T.; Li, H.; Xi, H.; Stanton, R. V.; Rotstein, S. H. HELM: A hierarchical notation language for complex biomolecule structure representation. J. Chem. Inf. Model. 2012, 52, 2796–2806, DOI: 10.1021/ci3001925
- 21. David, L.; Thakkar, A.; Mercado, R.; Engkvist, O. Molecular representations in AI-driven drug discovery: a review and practical guide. J. Cheminf. 2020, 12, 56, DOI: 10.1186/s13321-020-00460-5
- 22. Nguyen-Vo, T.-H.; Teesdale-Spittle, P.; Harvey, J. E.; Nguyen, B. P. Molecular representations in bio-cheminformatics. Memetic Comput. 2024, 16, 519–536, DOI: 10.1007/s12293-024-00414-6
- 23. Wigh, D. S.; Goodman, J. M.; Lapkin, A. A. A review of molecular representation in the age of machine learning. WIREs Comput. Mol. Sci. 2022, 12, e1603, DOI: 10.1002/wcms.1603
- 24. Sousa, T.; Correia, J.; Pereira, V.; Rocha, M. Generative deep learning for targeted compound design. J. Chem. Inf. Model. 2021, 61, 5343–5361, DOI: 10.1021/acs.jcim.0c01496
- 25. Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 1965, 5, 107–113, DOI: 10.1021/c160017a018
- 26. Durant, J. L.; Leland, B. A.; Henry, D. R.; Nourse, J. G. Reoptimization of MDL Keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273–1280, DOI: 10.1021/ci010132r
- 27. Schissel, C. K.; Mohapatra, S.; Wolfe, J. M.; Fadzen, C. M.; Bellovoda, K.; Wu, C.-L.; Wood, J. A.; Malmberg, A. B.; Loas, A.; Gómez-Bombarelli, R.; Pentelute, B. L. Deep learning to design nuclear-targeting abiotic miniproteins. Nat. Chem. 2021, 13, 992–1000, DOI: 10.1038/s41557-021-00766-3
- 28. Ji, X.; Nielsen, A. L.; Heinis, C. Cyclic peptides for drug development. Angew. Chem., Int. Ed. 2024, 63, e202308251, DOI: 10.1002/anie.202308251
- 29. Costa, L.; Sousa, E.; Fernandes, C. Cyclic peptides in pipeline: what future for these great molecules? Pharmaceuticals 2023, 16, 996, DOI: 10.3390/ph16070996
- 30. Zhang, H.; Chen, S. Cyclic peptide drugs approved in the last two decades (2001–2021). RSC Chem. Biol. 2022, 3, 18–31, DOI: 10.1039/D1CB00154J
- 31. Landrum, G. RDKit: Open-Source Cheminformatics Software. https://www.rdkit.org/.
- 32. The PyMOL Molecular Graphics System, version 3.0; Schrödinger, LLC.
- 33. Case, D. A.; Aktulga, H. M. A.; Belfon, K.; Ben-Shalom, I. Y.; Berryman, J. T.; Brozell, S. R.; Cerutti, D. S.; Cheatham, T. E., III; Cisneros, G. A.; Cruzeiro, V. W. D.; Darden, T. A.; Duke, R. E.; Giambasu, G.; Gilson, M. K.; Gohlke, H.; Goetz, A. W.; Harris, R.; Izadi, S.; Izmailov, S. A.; Kasavajhala, K.; Kaymak, M. C.; King, E.; Kovalenko, A.; Kurtzman, T.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Machado, M.; Man, V.; Manathunga, M.; Merz, K. M.; Miao, Y.; Mikhailovskii, O.; Monard, G.; Nguyen, H.; O’Hearn, K. A.; Onufriev, A.; Pan, F.; Pantano, S.; Qi, R.; Rahnamoun, A.; Roe, D. R.; Roitberg, A.; Sagui, C.; Schott-Verdugo, S.; Shajan, A.; Shen, J.; Simmerling, C. L.; Skrynnikov, N. R.; Smith, J.; Swails, J.; Walker, R. C.; Wang, J.; Wang, J.; Wei, H.; Wolf, R. M.; Wu, X.; Xiong, Y.; Xue, Y.; York, D. M.; Zhao, S.; Kollman, P. A. Amber; University of California: San Francisco, 2022.
- 34. Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 2002, 23, 1623–1641, DOI: 10.1002/jcc.10128
- 35. Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. J. Phys. Chem. 1993, 97, 10269–10280, DOI: 10.1021/j100142a004
- 36. Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712–725, DOI: 10.1002/prot.21123
- 37. Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 2004, 25, 1157–1174, DOI: 10.1002/jcc.20035
- 38Zhou, C.-Y.; Jiang, F.; Wu, Y.-D. Residue-specific force field based on protein coil library. RSFF2: modification of AMBER ff99SB. J. Phys. Chem. B 2015, 119, 1035– 1047, DOI: 10.1021/jp5064676Google Scholar38Residue-Specific Force Field Based on Protein Coil Library. RSFF2: Modification of AMBER ff99SBZhou, Chen-Yang; Jiang, Fan; Wu, Yun-DongJournal of Physical Chemistry B (2015), 119 (3), 1035-1047CODEN: JPCBFK; ISSN:1520-5207. (American Chemical Society)Recently, we developed a residue-specific force field (RSFF1) based on conformational free-energy distributions of the 20 amino acid residues from a protein coil library. Most parameters in RSFF1 were adopted from the OPLS-AA/L force field, but some van der Waals and torsional parameters that effectively affect local conformational preferences were introduced specifically for individual residues to fit the coil library distributions. Here a similar strategy has been applied to modify the Amber ff99SB force field, and a new force field named RSFF2 is developed. It can successfully fold α-helical structures such as polyalanine peptides, Trp-cage miniprotein, and villin headpiece subdomain and β-sheet structures such as Trpzip-2, GB1 β-hairpins, and the WW domain, simultaneously. The properties of various popular force fields in balancing between α-helix and β-sheet are analyzed based on their descriptions of local conformational features of various residues, and the anal. reveals the importance of accurate local free-energy distributions. Unlike the RSFF1, which overestimates the stability of both α-helix and β-sheet, RSFF2 gives melting curves of α-helical peptides and Trp-cage in good agreement with exptl. data. Fitting to the two-state model, RSFF2 gives folding enthalpies and entropies in reasonably good agreement with available exptl. results.
- 39Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19– 25, DOI: 10.1016/j.softx.2015.06.001Google ScholarThere is no corresponding record for this reference.
- 40Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926– 935, DOI: 10.1063/1.445869Google Scholar40Comparison of simple potential functions for simulating liquid waterJorgensen, William L.; Chandrasekhar, Jayaraman; Madura, Jeffry D.; Impey, Roger W.; Klein, Michael L.Journal of Chemical Physics (1983), 79 (2), 926-35CODEN: JCPSA6; ISSN:0021-9606.Classical Monte Carlo simulations were carried out for liq. H2O in the NPT ensemble at 25° and 1 atm using 6 of the simpler intermol. potential functions for the dimer. Comparisons were made with exptl. thermodn. and structural data including the neutron diffraction results of Thiessen and Narten (1982). The computed densities and potential energies agree with expt. except for the original Bernal-Fowler model, which yields an 18% overest. of the d. and poor structural results. The discrepancy may be due to the correction terms needed in processing the neutron data or to an effect uniformly neglected in the computations. Comparisons were made for the self-diffusion coeffs. obtained from mol. dynamics simulations.
- 41Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577– 8593, DOI: 10.1063/1.470117Google Scholar41A smooth particle mesh Ewald methodEssmann, Ulrich; Perera, Lalith; Berkowitz, Max L.; Darden, Tom; Lee, Hsing; Pedersen, Lee G.Journal of Chemical Physics (1995), 103 (19), 8577-93CODEN: JCPSA6; ISSN:0021-9606. (American Institute of Physics)The previously developed particle mesh Ewald method is reformulated in terms of efficient B-spline interpolation of the structure factors. This reformulation allows a natural extension of the method to potentials of the form 1/rp with p ≥ 1. Furthermore, efficient calcn. of the virial tensor follows. Use of B-splines in the place of Lagrange interpolation leads to analytic gradients as well as a significant improvement in the accuracy. The authors demonstrate that arbitrary accuracy can be achieved, independent of system size N, at a cost that scales as N log(N). For biomol. systems with many thousands of atoms and this method permits the use of Ewald summation at a computational cost comparable to that of a simple truncation method of 10 Å or less.
- 42Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; Li, X.; Caricato, M.; Marenich, A. V.; Bloino, J.; Janesko, B. G.; Gomperts, R.; Mennucci, B.; Hratchian, H. P.; Ortiz, J. V.; Izmaylov, A. F.; Sonnenberg, J. L.; Williams; ; Ding, F.; Lipparini, F.; Egidi, F.; Goings, J.; Peng, B.; Petrone, A.; Henderson, T.; Ranasinghe, D.; Zakrzewski, V. G.; Gao, J.; Rega, N.; Zheng, G.; Liang, W.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Throssell, K.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M. J.; Heyd, J. J.; Brothers, E. N.; Kudin, K. N.; Staroverov, V. N.; Keith, T. A.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Millam, J. M.; Klene, M.; Adamo, C.; Cammi, R.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Farkas, O.; Foresman, J. B.; Fox, D. J. Gaussian 16, rev. C.01; Gaussian Inc.: Wallingford, CT, 2016.Google ScholarThere is no corresponding record for this reference.
- 43Vanquelef, E.; Simon, S.; Marquant, G.; Garcia, E.; Klimerak, G.; Delepine, J. C.; Cieplak, P.; Dupradeau, F. Y. R.E.D. Server: a web service for deriving RESP and ESP charges and building force field libraries for new molecules and molecular fragments. Nucleic Acids Res. 2011, 39, W511– 517, DOI: 10.1093/nar/gkr288Google Scholar43R.E.D. Server: a web service for deriving RESP and ESP charges and building force field libraries for new molecules and molecular fragmentsVanquelef, Enguerran; Simon, Sabrina; Marquant, Gaelle; Garcia, Elodie; Klimerak, Geoffroy; Delepine, Jean Charles; Cieplak, Piotr; Dupradeau, Francois-YvesNucleic Acids Research (2011), 39 (Web Server), W511-W517CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)R.E.D. Server is a unique, open web service, designed to derive non-polarizable RESP and ESP charges and to build force field libraries for new mols./mol. fragments. It provides to computational biologists the means to derive rigorously mol. electrostatic potential-based charges embedded in force field libraries that are ready to be used in force field development, charge validation and mol. dynamics simulations. R.E.D. Server interfaces quantum mechanics programs, the RESP program and the latest version of the R.E.D. tools. A two step approach has been developed. The first one consists of prepg. P2N file(s) to rigorously define key elements such as atom names, topol. and chem. equivalencing needed when building a force field library. Then, P2N files are used to derive RESP or ESP charges embedded in force field libraries in the Tripos mol2 format. In complex cases an entire set of force field libraries or force field topol. database is generated. Other features developed in R.E.D. Server include help services, a demonstration, tutorials, frequently asked questions, Jmol-based tools useful to construct PDB input files and parse R.E.D. Server outputs as well as a graphical queuing system allowing any user to check the status of R.E.D. 
Server jobs.
- 44Piana, S.; Laio, A. A bias-exchange approach to protein folding. J. Phys. Chem. B 2007, 111, 4553– 4559, DOI: 10.1021/jp067873lGoogle Scholar44A Bias-Exchange Approach to Protein FoldingPiana, Stefano; Laio, AlessandroJournal of Physical Chemistry B (2007), 111 (17), 4553-4559CODEN: JPCBFK; ISSN:1520-6106. (American Chemical Society)By suitably extending a recent approach [Bussi, G., et al., 2006] the authors introduce a powerful methodol. that allows the parallel reconstruction of the free energy of a system in a virtually unlimited no. of variables. Multiple metadynamics simulations of the same system at the same temp. are performed, biasing each replica with a time-dependent potential constructed in a different set of collective variables. Exchanges between the bias potentials in the different variables are periodically allowed according to a replica exchange scheme. Due to the efficaciously multidimensional nature of the bias the method allows exploring complex free energy landscapes with high efficiency. The usefulness of the method is demonstrated by performing an atomistic simulation in explicit solvent of the folding of a Triptophane cage miniprotein. It is shown that the folding free energy landscape can be fully characterized starting from an extended conformation with use of only 40 ns of simulation on 8 replicas.
- 45Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604– 613, DOI: 10.1016/j.cpc.2013.09.018Google Scholar45PLUMED 2: New feathers for an old birdTribello, Gareth A.; Bonomi, Massimiliano; Branduardi, Davide; Camilloni, Carlo; Bussi, GiovanniComputer Physics Communications (2014), 185 (2), 604-613CODEN: CPHCBZ; ISSN:0010-4655. (Elsevier B.V.)Enhancing sampling and analyzing simulations are central issues in mol. simulation. Recently, we introduced PLUMED, an open-source plug-in that provides some of the most popular mol. dynamics (MD) codes with implementations of a variety of different enhanced sampling algorithms and collective variables (CVs). The rapid changes in this field, in particular new directions in enhanced sampling and dimensionality redn. together with new hardware, require a code that is more flexible and more efficient. We therefore present PLUMED 2 here-a complete rewrite of the code in an object-oriented programming language (C++). This new version introduces greater flexibility and greater modularity, which both extends its core capabilities and makes it far easier to add new methods and CVs. It also has a simpler interface with the MD engines and provides a single software library contg. both tools and core facilities. Ultimately, the new code better serves the ever-growing community of users and contributors in coping with the new challenges arising in the field.
- 46Damas, J. M.; Filipe, L. C.; Campos, S. R.; Lousa, D.; Victor, B. L.; Baptista, A. M.; Soares, C. M. Predicting the thermodynamics and kinetics of helix formation in a cyclic peptide model. J. Chem. Theory Comput. 2013, 9, 5148– 5157, DOI: 10.1021/ct400529kGoogle Scholar46Predicting the thermodynamics and kinetics of helix formation in a cyclic peptide modelDamas, Joao M.; Filipe, Luis C. S.; Campos, Sara R. R.; Lousa, Diana; Victor, Bruno L.; Baptista, Antonio M.; Soares, Claudio M.Journal of Chemical Theory and Computation (2013), 9 (11), 5148-5157CODEN: JCTCCE; ISSN:1549-9618. (American Chemical Society)The peptide, Ac-(cyclo-2,6)-R-[KAAAD]-NH2 (cyc-RKAAAD), is a short cyclic peptide known to adopt a remarkably stable single-turn α-helix in water. Due to its simplicity and the availability of thermodn. and kinetic exptl. data, cyc-RKAAAD poses as an ideal model for evaluating the aptness of current mol. dynamics (MD) simulation methodologies to accurately sample conformations that reproduce exptl. obsd. properties. Here, the authors extensively sampled the conformational space of cyc-RKAAAD using microsecond-timescale MD simulations. The authors characterized the peptide conformational preferences in terms of secondary structure propensities and, using Cartesian-coordinate principal component anal. (cPCA), constructed its free energy landscape, thus obtaining a detailed weighted discrimination between the helical and nonhelical subensembles. The cPCA state discrimination, together with a Markov model built from it, allowed the authors to est. the free energy of unfolding (-0.57 kJ/mol) and the relaxation time (∼0.435 μs) at 298.15 K, which were in excellent agreement with the exptl. reported values. Addnl., the authors presented simulations conducted using 2 enhanced sampling methods: replica-exchange mol. dynamics (REMD) and bias-exchange metadynamics (BE-MetaD). 
The authors compared the free energy landscape obtained by these 2 methods with the results from MD simulations and discussed the sampling and computational gains achieved. Overall, the results obtained attested to the suitability of modern simulation methods to explore the conformational behavior of peptide systems with a high level of realism.
- 47Nair, V.; Hinton, G. E. Rectified Linear Units Improve Restricted Boltzmann Machines, Proceedings of the 27th International Conference on Machine Learning (ICML’10), Haifa, Israel, June 21–24, Fürnkranz, J.; Joachims, T., Eds.; Omnipress: Madison, WI, 2010; pp 807– 814.Google ScholarThere is no corresponding record for this reference.
- 48Prechelt, L. Automatic early stopping using cross validation: quantifying the criteria. Neural Networks 1998, 11, 761– 767, DOI: 10.1016/S0893-6080(98)00010-0Google Scholar48Automatic early stopping using cross validation: quantifying the criteriaPrechelt LutzNeural networks : the official journal of the International Neural Network Society (1998), 11 (4), 761-767 ISSN:.Cross validation can be used to detect when overfitting starts during supervised training of a neural network; training is then stopped before convergence to avoid the overfitting ('early stopping'). The exact criterion used for cross validation based early stopping, however, is chosen in an ad-hoc fashion by most researchers or training is stopped interactively. To aid a more well-founded selection of the stopping criterion, 14 different automatic stopping criteria from three classes were evaluated empirically for their efficiency and effectiveness in 12 different classification and approximation tasks using multi-layer perceptrons with RPROP training. The experiments show that, on average, slower stopping criteria allow for small improvements in generalization (in the order of 4%), but cost about a factor of 4 longer in training time.
- 49Paszke, A. G. S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T. L.; Gimelshein, N.; Antiga, L.; Desmaison, A.; Kopf, A.; Yang, E. D. Z.; Raison, M.; Tejani, A.; Chilamkurthy, S.; Steiner, B. F.; Bai, J.; Chintala, S. An Imperative Style, High-Performance Deep Learning Library, Advances in Neural Information Processing Systems 32 (NeurIPS 2019); Wallach, H.; Larochelle, H.; Beygelzimer, A.; d’Alché-Buc, F.; Fox, E.; Garnett, R., Eds.; Curran Associates: Vancouver, Canada, 2019; pp 8024– 8035.Google ScholarThere is no corresponding record for this reference.
- 50Hayashi, K.; Uehara, S.; Yamamoto, S.; Cary, D. R.; Nishikawa, J.; Ueda, T.; Ozasa, H.; Mihara, K.; Yoshimura, N.; Kawai, T.; Ono, T.; Yamamoto, S.; Fumoto, M.; Mikamiyama, H. Macrocyclic peptides as a novel class of NNMT inhibitors: A SAR study aimed at inhibitory activity in the cell. ACS Med. Chem. Lett. 2021, 12, 1093– 1101, DOI: 10.1021/acsmedchemlett.1c00134Google Scholar50Macrocyclic Peptides as a Novel Class of NNMT Inhibitors: A SAR Study Aimed at Inhibitory Activity in the CellHayashi, Kyohei; Uehara, Shota; Yamamoto, Shiho; Cary, Douglas R.; Nishikawa, Junichi; Ueda, Taichi; Ozasa, Hiroki; Mihara, Kousuke; Yoshimura, Norito; Kawai, Taeko; Ono, Takashi; Yamamoto, Saki; Fumoto, Masataka; Mikamiyama, HidenoriACS Medicinal Chemistry Letters (2021), 12 (7), 1093-1101CODEN: AMCLCT; ISSN:1948-5875. (American Chemical Society)Nicotinamide N-methyltransferase (NNMT), which catalyzes the methylation of nicotinamide, is a cytosolic enzyme that has attracted much attention as a therapeutic target for a variety of diseases. However, despite the considerable interest in this target, reports of NNMT inhibitors have still been limited to date. In this work, utilizing in vitro translated macrocyclic peptide libraries, we identified peptide 1 as a novel class of NNMT inhibitors. Further exploration based on the X-ray cocrystal structures of the peptides with NNMT provided a dramatic improvement in inhibitory activity (peptide 23: IC50 = 0.15 nM). Furthermore, by balance of the peptides' lipophilicity and biol. activity, inhibitory activity against NNMT in cell-based assay was successfully achieved (peptide 26: cell-based IC50 = 770 nM). These findings illuminate the potential of cyclic peptides as a relatively new drug discovery modality even for intracellular targets.
- 51Brousseau, M. E.; Clairmont, K. B.; Spraggon, G.; Flyer, A. N.; Golosov, A. A.; Grosche, P.; Amin, J.; Andre, J.; Burdick, D.; Caplan, S.; Chen, G.; Chopra, R.; Ames, L.; Dubiel, D.; Fan, L.; Gattlen, R.; Kelly-Sullivan, D.; Koch, A. W.; Lewis, I.; Li, J.; Liu, E.; Lubicka, D.; Marzinzik, A.; Nakajima, K.; Nettleton, D.; Ottl, J.; Pan, M.; Patel, T.; Perry, L.; Pickett, S.; Poirier, J.; Reid, P. C.; Pelle, X.; Seepersaud, M.; Subramanian, V.; Vera, V.; Xu, M.; Yang, L.; Yang, Q.; Yu, J.; Zhu, G.; Monovich, L. G. Identification of a PCSK9-LDLR disruptor peptide with in vivo function. Cell Chem. Biol. 2022, 29, 249– 258.e5, DOI: 10.1016/j.chembiol.2021.08.012Google Scholar51Identification of a PCSK9-LDLR disruptor peptide with in vivo functionBrousseau, Margaret E.; Clairmont, Kevin B.; Spraggon, Glen; Flyer, Alec N.; Golosov, Andrei A.; Grosche, Philipp; Amin, Jakal; Andre, Jerome; Burdick, Debra; Caplan, Shari; Chen, Guanjing; Chopra, Raj; Ames, Lisa; Dubiel, Diana; Fan, Li; Gattlen, Raphael; Kelly-Sullivan, Dawn; Koch, Alexander W.; Lewis, Ian; Li, Jingzhou; Liu, Eugene; Lubicka, Danuta; Marzinzik, Andreas; Nakajima, Katsumasa; Nettleton, David; Ottl, Johannes; Pan, Meihui; Patel, Tajesh; Perry, Lauren; Pickett, Stephanie; Poirier, Jennifer; Reid, Patrick C.; Pelle, Xavier; Seepersaud, Mohindra; Subramanian, Vanitha; Vera, Victoria; Xu, Mei; Yang, Lihua; Yang, Qing; Yu, Jinghua; Zhu, Guoming; Monovich, Lauren G.Cell Chemical Biology (2022), 29 (2), 249-258.e5CODEN: CCBEBM; ISSN:2451-9448. (Cell Press)Proprotein convertase subtilisin/kexin type 9 (PCSK9) regulates plasma low-d. lipoprotein cholesterol (LDL-C) levels by promoting hepatic LDL receptor (LDLR) degrdn. Therapeutic antibodies that disrupt PCSK9-LDLR binding reduce LDL-C concns. and cardiovascular disease risk. The epidermal growth factor precursor homol. domain A (EGF-A) of the LDLR serves as a primary contact with PCSK9 via a flat interface, presenting a challenge for identifying small mol. 
PCSK9-LDLR disruptors. We employ an affinity-based screen of 1013in vitro-translated macrocyclic peptides to identify high-affinity PCSK9 ligands that utilize a unique, induced-fit pocket and partially disrupt the PCSK9-LDLR interaction. Structure-based design led to mols. with enhanced function and pharmacokinetic properties (e.g., 13PCSK9i). In mice, 13PCSK9i reduces plasma cholesterol levels and increases hepatic LDLR d. in a dose-dependent manner. 13PCSK9i functions by a unique, allosteric mechanism and is the smallest mol. identified to date with in vivo PCSK9-LDLR disruptor function.
- 52Yoshida, S.; Uehara, S.; Kondo, N.; Takahashi, Y.; Yamamoto, S.; Kameda, A.; Kawagoe, S.; Inoue, N.; Yamada, M.; Yoshimura, N.; Tachibana, Y. Peptide-to-small molecule: a pharmacophore-guided small molecule lead generation strategy from high-affinity macrocyclic peptides. J. Med. Chem. 2022, 65, 10655– 10673, DOI: 10.1021/acs.jmedchem.2c00919Google ScholarThere is no corresponding record for this reference.
- 53Banerjee, R.; Basu, G.; Chène, P.; Roy, S. Aib-based peptide backbone as scaffolds for helical peptide mimics. J. Pept. Res. 2002, 60, 88– 94, DOI: 10.1034/j.1399-3011.2002.201005.xGoogle Scholar53Aib-based peptide backbone as scaffolds for helical peptide mimicsBanerjee, R.; Basu, G.; Chene, P.; Roy, S.Journal of Peptide Research (2002), 60 (2), 88-94CODEN: JPERFA; ISSN:1397-002X. (Blackwell Munksgaard)Helical peptides that can intervene and disrupt therapeutically important protein-protein interactions are attractive drug targets. In order to develop a general strategy for developing such helical peptide mimics, the authors have studied the effect of incorporating α-amino isobutyric acid (Aib), an amino acid with strong preference for helical backbone, as the sole helix promoter in designed peptides. Specifically, the focus is on the hdm2-p53 interaction, which is central to development of many types of cancer. The peptide corresponding to the hdm2 interacting part of p53, helical in bound state but devoid of structure in soln., served as the starting point for peptide design that involved replacement of noninteracting residues by Aib. Incorporation of Aib, while preserving the interacting residues, led to significant increase in helical structure, particularly at the C-terminal region as judged by NMR and CD. The interaction with hdm2 was also found to be enhanced. Most interestingly, trypsin cleavage was found to be retarded by several orders of magnitude. It is concluded that incorporation of Aib is a feasible strategy to create peptide helical mimics with enhanced receptor binding and lower protease cleavage rate.
- 54Karle, I. L. Controls exerted by the Aib residue: helix formation and helix reversal. Pept. Sci. 2001, 60, 351– 365, DOI: 10.1002/1097-0282(2001)60:5<351::AID-BIP10174>3.0.CO;2-UGoogle ScholarThere is no corresponding record for this reference.
- 55Lundberg, S. M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions; Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, December 4–9, 2017; Von Luxburg, U.; Guyon, I.; Bengio, S.; Wallach, H.; Ferugs, R., Eds.; Curan Associates: Redhook, NY, 2017; pp 4768– 4777.Google ScholarThere is no corresponding record for this reference.
Cited By
This article is cited by 1 publication.
- Yanpeng Fang, Duoyang Fan, Bin Feng, Yingli Zhu, Ruyan Xie, Xiaorong Tan, Qianhui Liu, Jie Dong, Wenbin Zeng. Harnessing advanced computational approaches to design novel antimicrobial peptides against intracellular bacterial infections. Bioactive Materials 2025, 50, 510–524. https://doi.org/10.1016/j.bioactmat.2025.04.016
Figure 1. Overview of custom features for amino acids. (A) Position-aware side chain (PASC) fingerprints are based on a “heavy-atom walk” along the amino acid side chain. Starting at the Cα atom, a Morgan fingerprint with a radius of 1 is generated (red circle). Morgan fingerprints with a radius of 1 centered at the Cβ atom (orange circle), Cγ atom (yellow circle), etc., are similarly generated to provide the PASC features. (B) MD simulations of amino acid dipeptides are used to generate MD-derived features. For the backbone (BB) features, the (ϕ, ψ) distribution is calculated from the dipeptide simulation and binned on a 2D grid. The resulting 2D probability density is then flattened into a 1D vector. For the voxel (VOX) features, the simulation frames are aligned to reference coordinates for C, Cα, and N, where the Cα atom is at the origin, the N atom is at (1.449, 0.000, 0.000), and the C atom is at (−0.523, 1.429, 0.000) (in Å). The frame-averaged molecular electrostatic potential is then calculated on a 3D voxel grid and flattened into a 1D vector. See the Methods section for more details.
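As a rough illustration of panel B, the sketch below bins (ϕ, ψ) samples on a 2D grid and flattens the result into a BB-style vector. It also superimposes a single frame onto the N, Cα, and C reference coordinates quoted in the caption, the step that would precede any voxel calculation. The function names, the 12 × 12 grid, the least-squares (Kabsch) fit, and the synthetic inputs are all assumptions made for illustration; the Methods section, not this sketch, defines the actual bin count and alignment procedure.

```python
import numpy as np

# Reference coordinates (in Å) for N, Cα, and C, as listed in the caption.
REF_N_CA_C = np.array([
    [1.449, 0.000, 0.000],   # N
    [0.000, 0.000, 0.000],   # Cα
    [-0.523, 1.429, 0.000],  # C
])


def backbone_features(phi_deg, psi_deg, n_bins=12):
    """Bin (phi, psi) angles in degrees on an n_bins x n_bins grid and
    return the normalized histogram flattened into a 1D vector.
    n_bins=12 is an illustrative choice, not the paper's value."""
    edges = np.linspace(-180.0, 180.0, n_bins + 1)
    counts, _, _ = np.histogram2d(phi_deg, psi_deg, bins=[edges, edges])
    return (counts / counts.sum()).ravel()


def align_frame(coords, backbone_idx, ref=REF_N_CA_C):
    """Rigidly superimpose one frame so that its N, Cα, C atoms
    (rows backbone_idx of coords) best match ref in a least-squares
    sense (Kabsch algorithm). Returns the moved coordinates of all atoms."""
    mobile = coords[backbone_idx]
    mobile_center = mobile.mean(axis=0)
    ref_center = ref.mean(axis=0)
    # Covariance between the centered mobile and reference points.
    cov = (mobile - mobile_center).T @ (ref - ref_center)
    u, _, vt = np.linalg.svd(cov)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (coords - mobile_center) @ rot.T + ref_center


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic angles stand in for a dipeptide trajectory.
    phi = rng.uniform(-180.0, 180.0, size=5000)
    psi = rng.uniform(-180.0, 180.0, size=5000)
    bb = backbone_features(phi, psi)
    print(bb.shape, bb.sum())

    # Synthetic frame: the reference atoms plus one extra atom, translated.
    frame = np.vstack([REF_N_CA_C, [[0.5, -1.2, 0.9]]]) + np.array([3.0, -2.0, 1.0])
    aligned = align_frame(frame, [0, 1, 2])
    print(np.round(aligned[:3], 3))
```

Normalizing the histogram to sum to one keeps the vector comparable across simulations of different lengths; whether the original features are normalized this way is not stated in the caption.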
Figure 2. Amino acids included in the training, validation, and test data sets for the StrEAMM models. The training and validation data sets include cyclic peptide sequences from a 15-amino-acid (15-aa) library (black box). The test data sets include sequences containing amino acids in the same 15-aa library, the 37-aa library (blue box), or the 50-aa library (purple box). *For brevity, only the L-forms of chiral amino acids are depicted, but their mirror images are also included in the library.
Figure 3. Performance of different amino acid features on different cyclic pentapeptide test data sets. (A) The models are trained using 3-fold cross-validation. The table reports the average R2 (coefficient of determination) and standard deviation across the 3 folds. (B) The performance of one of the three models from the 3-fold cross-validation is plotted. The predicted population of each structure in a cyclic peptide’s structural ensemble is compared to its population observed in MD simulations. For clarity, only structures with either a predicted or observed (in MD) population of >1% are plotted, across the ensembles of all cyclic peptides in the test data sets.
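The quantities named in this caption can be sketched as follows, under stated assumptions. The R2 here is the usual coefficient of determination, 1 − SS_res/SS_tot; the caption does not say whether R2 is computed over all structures or only the plotted ones, nor which standard-deviation convention is used. The function names and every number below are made up for illustration.

```python
import numpy as np


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def plotted_mask(observed_pct, predicted_pct, threshold=1.0):
    """True where either the observed or the predicted population (in %)
    exceeds the threshold, mirroring the >1% plotting rule in the caption."""
    observed_pct = np.asarray(observed_pct, dtype=float)
    predicted_pct = np.asarray(predicted_pct, dtype=float)
    return (observed_pct > threshold) | (predicted_pct > threshold)


if __name__ == "__main__":
    # Made-up populations (in %) for five structures of one peptide.
    observed = [50.0, 30.0, 0.5, 19.3, 0.2]
    predicted = [48.0, 32.0, 1.5, 18.1, 0.4]
    print(r_squared(observed, predicted))
    print(plotted_mask(observed, predicted))

    # Made-up per-fold scores; np.std defaults to the population form
    # (ddof=0), which is an assumption about the reported spread.
    fold_r2 = np.array([0.95, 0.96, 0.94])
    print(fold_r2.mean(), fold_r2.std())
```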
Figure 4. Performance of different amino acid features on different cyclic hexapeptide test data sets. (A) The models are trained using 3-fold cross-validation. The table reports the average R2 (coefficient of determination) and standard deviation across the 3 folds. (B) The performance of one of the three models from the 3-fold cross-validation is plotted. The predicted population of each structure in a cyclic peptide’s structural ensemble is compared to its population observed in MD simulations. For clarity, only structures with either a predicted or observed (in MD) population of >1% are plotted, across the ensembles of all cyclic peptides in the test data sets.
Figure 5. Performance of different combinations of amino acid features on cyclic pentapeptide (top) and hexapeptide (bottom) test data sets containing 15 AAs (left), 37 AAs (middle), and 50 AAs (right). The models are trained using 3-fold cross-validation, and the average R2 and standard deviation are reported. The models using a single type of feature (e.g., “BB only”) are represented on the diagonals. The best-performing models for the 37 AA and 50 AA test data sets, based on the average R2, are boxed with bold black outlines.
References
This article references 55 other publications.
- 1Wang, L.; Wang, N.; Zhang, W.; Cheng, X.; Yan, Z.; Shao, G.; Wang, X.; Wang, R.; Fu, C. Therapeutic peptides: current applications and future directions. Signal Transduction Targeted Ther. 2022, 7, 48 DOI: 10.1038/s41392-022-00904-41Therapeutic peptides: current applications and future directionsWang, Lei; Wang, Nanxi; Zhang, Wenping; Cheng, Xurui; Yan, Zhibin; Shao, Gang; Wang, Xi; Wang, Rui; Fu, CaiyunSignal Transduction and Targeted Therapy (2022), 7 (1), 48CODEN: STTTCB; ISSN:2059-3635. (Nature Portfolio)A review. Peptide drug development has made great progress in the last decade thanks to new prodn., modification, and analytic technologies. Peptides have been produced and modified using both chem. and biol. methods, together with novel design and delivery strategies, which have helped to overcome the inherent drawbacks of peptides and have allowed the continued advancement of this field. A wide variety of natural and modified peptides have been obtained and studied, covering multiple therapeutic areas. This review summarizes the efforts and achievements in peptide drug discovery, prodn., and modification, and their current applications. We also discuss the value and challenges assocd. with future developments in therapeutic peptides.
- 2Liu, K.; Li, M.; Li, Y.; Li, Y.; Chen, Z.; Tang, Y.; Yang, M.; Deng, G.; Liu, H. A review of the clinical efficacy of FDA-approved antibody–drug conjugates in human cancers. Mol. Cancer 2024, 23, 62 DOI: 10.1186/s12943-024-01963-7There is no corresponding record for this reference.
- 3Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; Bridgland, A.; Meyer, C.; Kohl, S. A. A.; Ballard, A. J.; Cowie, A.; Romera-Paredes, B.; Nikolov, S.; Jain, R.; Adler, J.; Back, T.; Petersen, S.; Reiman, D.; Clancy, E.; Zielinski, M.; Steinegger, M.; Pacholska, M.; Berghammer, T.; Bodenstein, S.; Silver, D.; Vinyals, O.; Senior, A. W.; Kavukcuoglu, K.; Kohli, P.; Hassabis, D. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583– 589, DOI: 10.1038/s41586-021-03819-23Highly accurate protein structure prediction with AlphaFoldJumper, John; Evans, Richard; Pritzel, Alexander; Green, Tim; Figurnov, Michael; Ronneberger, Olaf; Tunyasuvunakool, Kathryn; Bates, Russ; Zidek, Augustin; Potapenko, Anna; Bridgland, Alex; Meyer, Clemens; Kohl, Simon A. A.; Ballard, Andrew J.; Cowie, Andrew; Romera-Paredes, Bernardino; Nikolov, Stanislav; Jain, Rishub; Adler, Jonas; Back, Trevor; Petersen, Stig; Reiman, David; Clancy, Ellen; Zielinski, Michal; Steinegger, Martin; Pacholska, Michalina; Berghammer, Tamas; Bodenstein, Sebastian; Silver, David; Vinyals, Oriol; Senior, Andrew W.; Kavukcuoglu, Koray; Kohli, Pushmeet; Hassabis, DemisNature (London, United Kingdom) (2021), 596 (7873), 583-589CODEN: NATUAS; ISSN:0028-0836. (Nature Portfolio)Proteins are essential to life, and understanding their structure can facilitate a mechanistic understanding of their function. Through an enormous exptl. effort, the structures of around 100,000 unique proteins have been detd., but this represents a small fraction of the billions of known protein sequences. Structural coverage is bottlenecked by the months to years of painstaking effort required to det. a single protein structure. Accurate computational approaches are needed to address this gap and to enable large-scale structural bioinformatics. 
Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence-the structure prediction component of the 'protein folding problem'-has been an important open research problem for more than 50 years. Despite recent progress, existing methods fall far short of at. accuracy, esp. when no homologous structure is available. Here we provide the first computational method that can regularly predict protein structures with at. accuracy even in cases in which no similar structure is known. We validated an entirely redesigned version of our neural network-based model, AlphaFold, in the challenging 14th Crit. Assessment of protein Structure Prediction (CASP14), demonstrating accuracy competitive with exptl. structures in a majority of cases and greatly outperforming other methods. Underpinning the latest version of AlphaFold is a novel machine learning approach that incorporates phys. and biol. knowledge about protein structure, leveraging multi-sequence alignments, into the design of the deep learning algorithm.
- 4 Baek, M.; DiMaio, F.; Anishchenko, I.; Dauparas, J.; Ovchinnikov, S.; Lee, G. R.; Wang, J.; Cong, Q.; Kinch, L. N.; Schaeffer, R. D.; Millán, C.; Park, H.; Adams, C.; Glassman, C. R.; DeGiovanni, A.; Pereira, J. H.; Rodrigues, A. V.; van Dijk, A. A.; Ebrecht, A. C.; Opperman, D. J.; Sagmeister, T.; Buhlheller, C.; Pavkov-Keller, T.; Rathinaswamy, M. K.; Dalwadi, U.; Yip, C. K.; Burke, J. E.; Garcia, K. C.; Grishin, N. V.; Adams, P. D.; Read, R. J.; Baker, D. Accurate prediction of protein structures and interactions using a three-track neural network. Science 2021, 373, 871–876, DOI: 10.1126/science.abj8754
- 5 Miao, J.; Descoteaux, M. L.; Lin, Y.-S. Structure prediction of cyclic peptides by molecular dynamics + machine learning. Chem. Sci. 2021, 12, 14927–14936, DOI: 10.1039/D1SC05562C
- 6 Hui, T.; Descoteaux, M. L.; Miao, J.; Lin, Y.-S. Training neural network models using molecular dynamics simulation results to efficiently predict cyclic hexapeptide structural ensembles. J. Chem. Theory Comput. 2023, 19, 4757–4769, DOI: 10.1021/acs.jctc.3c00154
- 7 Wan, F.; Kontogiorgos-Heintz, D.; de la Fuente-Nunez, C. Deep generative models for peptide design. Digital Discovery 2022, 1, 195–208, DOI: 10.1039/D1DD00024A
- 8 Ferguson, A. L.; Ranganathan, R. 100th anniversary of macromolecular science viewpoint: data-driven protein design. ACS Macro Lett. 2021, 10, 327–340, DOI: 10.1021/acsmacrolett.0c00885
- 9 Strokach, A.; Kim, P. M. Deep generative modeling for protein design. Curr. Opin. Struct. Biol. 2022, 72, 226–236, DOI: 10.1016/j.sbi.2021.11.008
- 10 Chandra, A.; Tünnermann, L.; Löfstedt, T.; Gratz, R. Transformer-based deep learning for predicting protein properties in the life sciences. eLife 2023, 12, e82819 DOI: 10.7554/eLife.82819
- 11 Oliva, R.; Chino, M.; Pane, K.; Pistorio, V.; De Santis, A.; Pizzo, E.; D’Errico, G.; Pavone, V.; Lombardi, A.; Del Vecchio, P.; Notomista, E.; Nastri, F.; Petraccone, L. Exploring the role of unnatural amino acids in antimicrobial peptides. Sci. Rep. 2018, 8, 8888 DOI: 10.1038/s41598-018-27231-5
- 12 Lu, J.; Xu, H.; Xia, J.; Ma, J.; Xu, J.; Li, Y.; Feng, J. D- and unnatural amino acid substituted antimicrobial peptides with improved proteolytic resistance and their proteolytic degradation characteristics. Front. Microbiol. 2020, 11, 563030 DOI: 10.3389/fmicb.2020.563030
- 13 Taechalertpaisarn, J.; Ono, S.; Okada, O.; Johnstone, T. C.; Lokey, R. S. A new amino acid for improving permeability and solubility in macrocyclic peptides through side chain-to-backbone hydrogen bonding. J. Med. Chem. 2022, 65, 5072–5084, DOI: 10.1021/acs.jmedchem.2c00010
- 14 Geurink, P. P.; van der Linden, W. A.; Mirabella, A. C.; Gallastegui, N.; de Bruin, G.; Blom, A. E.; Voges, M. J.; Mock, E. D.; Florea, B. I.; van der Marel, G. A.; Driessen, C.; van der Stelt, M.; Groll, M.; Overkleeft, H. S.; Kisselev, A. F. Incorporation of non-natural amino acids improves cell permeability and potency of specific inhibitors of proteasome trypsin-like sites. J. Med. Chem. 2013, 56, 1262–1275, DOI: 10.1021/jm3016987
- 15 Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36, DOI: 10.1021/ci00057a005
- 16 Siani, M. A.; Weininger, D.; Blaney, J. M. CHUCKLES: A method for representing and searching peptide and peptoid sequences on both monomer and atomic levels. J. Chem. Inf. Comput. Sci. 1994, 34, 588–593, DOI: 10.1021/ci00019a017
- 17 Siani, M. A.; Weininger, D.; James, C. A.; Blaney, J. M. CHORTLES: A method for representing oligomeric and template-based mixtures. J. Chem. Inf. Comput. Sci. 1995, 35, 1026–1033, DOI: 10.1021/ci00028a012
- 18 Jensen, J. H.; Hoeg-Jensen, T.; Padkjær, S. B. Building a biochemformatics database. J. Chem. Inf. Model. 2008, 48, 2404–2413, DOI: 10.1021/ci800128b
- 19 Lin, T.-S.; Coley, C. W.; Mochigase, H.; Beech, H. K.; Wang, W.; Wang, Z.; Woods, E.; Craig, S. L.; Johnson, J. A.; Kalow, J. A.; Jensen, K. F.; Olsen, B. D. BigSMILES: A structurally-based line notation for describing macromolecules. ACS Cent. Sci. 2019, 5, 1523–1531, DOI: 10.1021/acscentsci.9b00476
- 20 Zhang, T.; Li, H.; Xi, H.; Stanton, R. V.; Rotstein, S. H. HELM: A hierarchical notation language for complex biomolecule structure representation. J. Chem. Inf. Model. 2012, 52, 2796–2806, DOI: 10.1021/ci3001925
- 21 David, L.; Thakkar, A.; Mercado, R.; Engkvist, O. Molecular representations in AI-driven drug discovery: a review and practical guide. J. Cheminf. 2020, 12, 56 DOI: 10.1186/s13321-020-00460-5
- 22 Nguyen-Vo, T.-H.; Teesdale-Spittle, P.; Harvey, J. E.; Nguyen, B. P. Molecular representations in bio-cheminformatics. Memetic Comput. 2024, 16, 519–536, DOI: 10.1007/s12293-024-00414-6
- 23 Wigh, D. S.; Goodman, J. M.; Lapkin, A. A. A review of molecular representation in the age of machine learning. WIREs Comput. Mol. Sci. 2022, 12, e1603 DOI: 10.1002/wcms.1603
- 24 Sousa, T.; Correia, J.; Pereira, V.; Rocha, M. Generative deep learning for targeted compound design. J. Chem. Inf. Model. 2021, 61, 5343–5361, DOI: 10.1021/acs.jcim.0c01496
- 25 Morgan, H. L. The generation of a unique machine description for chemical structures-a technique developed at Chemical Abstracts Service. J. Chem. Doc. 1965, 5, 107–113, DOI: 10.1021/c160017a018
- 26 Durant, J. L.; Leland, B. A.; Henry, D. R.; Nourse, J. G. Reoptimization of MDL Keys for use in drug discovery. J. Chem. Inf. Comput. Sci. 2002, 42, 1273–1280, DOI: 10.1021/ci010132r
- 27 Schissel, C. K.; Mohapatra, S.; Wolfe, J. M.; Fadzen, C. M.; Bellovoda, K.; Wu, C.-L.; Wood, J. A.; Malmberg, A. B.; Loas, A.; Gómez-Bombarelli, R.; Pentelute, B. L. Deep learning to design nuclear-targeting abiotic miniproteins. Nat. Chem. 2021, 13, 992–1000, DOI: 10.1038/s41557-021-00766-3
- 28. Ji, X.; Nielsen, A. L.; Heinis, C. Cyclic peptides for drug development. Angew. Chem., Int. Ed. 2024, 63, e202308251, DOI: 10.1002/anie.202308251
- 29. Costa, L.; Sousa, E.; Fernandes, C. Cyclic peptides in pipeline: what future for these great molecules? Pharmaceuticals 2023, 16, 996, DOI: 10.3390/ph16070996
- 30. Zhang, H.; Chen, S. Cyclic peptide drugs approved in the last two decades (2001–2021). RSC Chem. Biol. 2022, 3, 18–31, DOI: 10.1039/D1CB00154J
- 31. Landrum, G. RDKit: Open-Source Cheminformatics Software. https://www.rdkit.org/.
- 32. The PyMOL Molecular Graphics System, version 3.0; Schrödinger, LLC.
- 33. Case, D. A.; Aktulga, H. M. A.; Belfon, K.; Ben-Shalom, I. Y.; Berryman, J. T.; Brozell, S. R.; Cerutti, D. S.; Cheatham, T. E., III; Cisneros, G. A.; Cruzeiro, V. W. D.; Darden, T. A.; Duke, R. E.; Giambasu, G.; Gilson, M. K.; Gohlke, H.; Goetz, A. W.; Harris, R.; Izadi, S.; Izmailov, S. A.; Kasavajhala, K.; Kaymak, M. C.; King, E.; Kovalenko, A.; Kurtzman, T.; Lee, T. S.; LeGrand, S.; Li, P.; Lin, C.; Liu, J.; Luchko, T.; Luo, R.; Machado, M.; Man, V.; Manathunga, M.; Merz, K. M.; Miao, Y.; Mikhailovskii, O.; Monard, G.; Nguyen, H.; O’Hearn, K. A.; Onufriev, A.; Pan, F.; Pantano, S.; Qi, R.; Rahnamoun, A.; Roe, D. R.; Roitberg, A.; Sagui, C.; Schott-Verdugo, S.; Shajan, A.; Shen, J.; Simmerling, C. L.; Skrynnikov, N. R.; Smith, J.; Swails, J.; Walker, R. C.; Wang, J.; Wang, J.; Wei, H.; Wolf, R. M.; Wu, X.; Xiong, Y.; Xue, Y.; York, D. M.; Zhao, S.; Kollman, P. A. Amber; University of California: San Francisco, 2022.
- 34. Jakalian, A.; Jack, D. B.; Bayly, C. I. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. J. Comput. Chem. 2002, 23, 1623–1641, DOI: 10.1002/jcc.10128
- 35. Bayly, C. I.; Cieplak, P.; Cornell, W.; Kollman, P. A. A well-behaved electrostatic potential based method using charge restraints for deriving atomic charges: the RESP model. J. Phys. Chem. 1993, 97, 10269–10280, DOI: 10.1021/j100142a004
- 36. Hornak, V.; Abel, R.; Okur, A.; Strockbine, B.; Roitberg, A.; Simmerling, C. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins 2006, 65, 712–725, DOI: 10.1002/prot.21123
- 37. Wang, J.; Wolf, R. M.; Caldwell, J. W.; Kollman, P. A.; Case, D. A. Development and testing of a general Amber force field. J. Comput. Chem. 2004, 25, 1157–1174, DOI: 10.1002/jcc.20035
- 38. Zhou, C.-Y.; Jiang, F.; Wu, Y.-D. Residue-specific force field based on protein coil library. RSFF2: modification of AMBER ff99SB. J. Phys. Chem. B 2015, 119, 1035–1047, DOI: 10.1021/jp5064676
- 39. Abraham, M. J.; Murtola, T.; Schulz, R.; Páll, S.; Smith, J. C.; Hess, B.; Lindahl, E. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 2015, 1–2, 19–25, DOI: 10.1016/j.softx.2015.06.001
- 40. Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey, R. W.; Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 1983, 79, 926–935, DOI: 10.1063/1.445869
- 41. Essmann, U.; Perera, L.; Berkowitz, M. L.; Darden, T.; Lee, H.; Pedersen, L. G. A smooth particle mesh Ewald method. J. Chem. Phys. 1995, 103, 8577–8593, DOI: 10.1063/1.470117
- 42. Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Scuseria, G. E.; Robb, M. A.; Cheeseman, J. R.; Scalmani, G.; Barone, V.; Petersson, G. A.; Nakatsuji, H.; Li, X.; Caricato, M.; Marenich, A. V.; Bloino, J.; Janesko, B. G.; Gomperts, R.; Mennucci, B.; Hratchian, H. P.; Ortiz, J. V.; Izmaylov, A. F.; Sonnenberg, J. L.; Williams-Young, D.; Ding, F.; Lipparini, F.; Egidi, F.; Goings, J.; Peng, B.; Petrone, A.; Henderson, T.; Ranasinghe, D.; Zakrzewski, V. G.; Gao, J.; Rega, N.; Zheng, G.; Liang, W.; Hada, M.; Ehara, M.; Toyota, K.; Fukuda, R.; Hasegawa, J.; Ishida, M.; Nakajima, T.; Honda, Y.; Kitao, O.; Nakai, H.; Vreven, T.; Throssell, K.; Montgomery, J. A., Jr.; Peralta, J. E.; Ogliaro, F.; Bearpark, M. J.; Heyd, J. J.; Brothers, E. N.; Kudin, K. N.; Staroverov, V. N.; Keith, T. A.; Kobayashi, R.; Normand, J.; Raghavachari, K.; Rendell, A. P.; Burant, J. C.; Iyengar, S. S.; Tomasi, J.; Cossi, M.; Millam, J. M.; Klene, M.; Adamo, C.; Cammi, R.; Ochterski, J. W.; Martin, R. L.; Morokuma, K.; Farkas, O.; Foresman, J. B.; Fox, D. J. Gaussian 16, rev. C.01; Gaussian Inc.: Wallingford, CT, 2016.
- 43. Vanquelef, E.; Simon, S.; Marquant, G.; Garcia, E.; Klimerak, G.; Delepine, J. C.; Cieplak, P.; Dupradeau, F. Y. R.E.D. Server: a web service for deriving RESP and ESP charges and building force field libraries for new molecules and molecular fragments. Nucleic Acids Res. 2011, 39, W511–W517, DOI: 10.1093/nar/gkr288
- 44. Piana, S.; Laio, A. A bias-exchange approach to protein folding. J. Phys. Chem. B 2007, 111, 4553–4559, DOI: 10.1021/jp067873l
- 45. Tribello, G. A.; Bonomi, M.; Branduardi, D.; Camilloni, C.; Bussi, G. PLUMED 2: New feathers for an old bird. Comput. Phys. Commun. 2014, 185, 604–613, DOI: 10.1016/j.cpc.2013.09.018
- 46. Damas, J. M.; Filipe, L. C.; Campos, S. R.; Lousa, D.; Victor, B. L.; Baptista, A. M.; Soares, C. M. Predicting the thermodynamics and kinetics of helix formation in a cyclic peptide model. J. Chem. Theory Comput. 2013, 9, 5148–5157, DOI: 10.1021/ct400529k
- 47. Nair, V.; Hinton, G. E. Rectified Linear Units Improve Restricted Boltzmann Machines, Proceedings of the 27th International Conference on Machine Learning (ICML’10), Haifa, Israel, June 21–24, Fürnkranz, J.; Joachims, T., Eds.; Omnipress: Madison, WI, 2010; pp 807–814.
- 48. Prechelt, L. Automatic early stopping using cross validation: quantifying the criteria. Neural Networks 1998, 11, 761–767, DOI: 10.1016/S0893-6080(98)00010-0
- 49. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; Desmaison, A.; Kopf, A.; Yang, E.; DeVito, Z.; Raison, M.; Tejani, A.; Chilamkurthy, S.; Steiner, B.; Fang, L.; Bai, J.; Chintala, S. PyTorch: An Imperative Style, High-Performance Deep Learning Library, Advances in Neural Information Processing Systems 32 (NeurIPS 2019); Wallach, H.; Larochelle, H.; Beygelzimer, A.; d’Alché-Buc, F.; Fox, E.; Garnett, R., Eds.; Curran Associates: Vancouver, Canada, 2019; pp 8024–8035.
- 50. Hayashi, K.; Uehara, S.; Yamamoto, S.; Cary, D. R.; Nishikawa, J.; Ueda, T.; Ozasa, H.; Mihara, K.; Yoshimura, N.; Kawai, T.; Ono, T.; Yamamoto, S.; Fumoto, M.; Mikamiyama, H. Macrocyclic peptides as a novel class of NNMT inhibitors: A SAR study aimed at inhibitory activity in the cell. ACS Med. Chem. Lett. 2021, 12, 1093–1101, DOI: 10.1021/acsmedchemlett.1c00134
- 51. Brousseau, M. E.; Clairmont, K. B.; Spraggon, G.; Flyer, A. N.; Golosov, A. A.; Grosche, P.; Amin, J.; Andre, J.; Burdick, D.; Caplan, S.; Chen, G.; Chopra, R.; Ames, L.; Dubiel, D.; Fan, L.; Gattlen, R.; Kelly-Sullivan, D.; Koch, A. W.; Lewis, I.; Li, J.; Liu, E.; Lubicka, D.; Marzinzik, A.; Nakajima, K.; Nettleton, D.; Ottl, J.; Pan, M.; Patel, T.; Perry, L.; Pickett, S.; Poirier, J.; Reid, P. C.; Pelle, X.; Seepersaud, M.; Subramanian, V.; Vera, V.; Xu, M.; Yang, L.; Yang, Q.; Yu, J.; Zhu, G.; Monovich, L. G. Identification of a PCSK9-LDLR disruptor peptide with in vivo function. Cell Chem. Biol. 2022, 29, 249–258.e5, DOI: 10.1016/j.chembiol.2021.08.012
- 52. Yoshida, S.; Uehara, S.; Kondo, N.; Takahashi, Y.; Yamamoto, S.; Kameda, A.; Kawagoe, S.; Inoue, N.; Yamada, M.; Yoshimura, N.; Tachibana, Y. Peptide-to-small molecule: a pharmacophore-guided small molecule lead generation strategy from high-affinity macrocyclic peptides. J. Med. Chem. 2022, 65, 10655–10673, DOI: 10.1021/acs.jmedchem.2c00919
- 53. Banerjee, R.; Basu, G.; Chène, P.; Roy, S. Aib-based peptide backbone as scaffolds for helical peptide mimics. J. Pept. Res. 2002, 60, 88–94, DOI: 10.1034/j.1399-3011.2002.201005.x
- 54. Karle, I. L. Controls exerted by the Aib residue: helix formation and helix reversal. Pept. Sci. 2001, 60, 351–365, DOI: 10.1002/1097-0282(2001)60:5<351::AID-BIP10174>3.0.CO;2-U
- 55. Lundberg, S. M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, December 4–9, 2017; Von Luxburg, U.; Guyon, I.; Bengio, S.; Wallach, H.; Fergus, R., Eds.; Curran Associates: Red Hook, NY, 2017; pp 4768–4777.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.4c02102.
Example of a (ϕ, ψ) distribution binned using various grid sizes; comparison of backbone (BB) features of amino acids in the different amino acid libraries; comparison of the different normalization schemes applied to the BB and voxel (VOX) features; examples of learning curves from hyperparameter tuning; model performances (reporting R2) on the training and validation data sets for the cyclic pentapeptides; p-values from t-tests comparing different features; model performances (reporting R2) on the training and validation data sets for the cyclic hexapeptides; model performances (reporting weighted error, WE) on the various test data sets for the cyclic pentapeptides and cyclic hexapeptides; model performances (reporting R2) using combinations of features on the training and validation data sets for the cyclic pentapeptides and cyclic hexapeptides (PDF)