Viewpoints
In Memory of Maurizio Botta: His Contribution to the Development of Computer-Aided Drug Design
Mattia Mori*, Fabrizio Manetti, Bruno Botta, and Andrea Tafi
Application Notes
ROBOKOP KG and KGB: Integrated Knowledge Graphs from Federated Sources
Chris Bizon*, Steven Cox, James Balhoff, Yaphet Kebede, Patrick Wang, Kenneth Morton, Karamarie Fecho, and Alexander Tropsha
A proliferation of data sources has led to the notional existence of an implicit Knowledge Graph (KG) that contains vast amounts of biological knowledge contributed by distributed Application Programming Interfaces (APIs). However, challenges arise when integrating data across multiple APIs due to incompatible semantic types, identifier schemes, and data formats. We present ROBOKOP KG (http://robokopkg.renci.org), which is a KG that was initially built to support the open biomedical question-answering application, ROBOKOP (Reasoning Over Biomedical Objects linked in Knowledge-Oriented Pathways) (http://robokop.renci.org). Additionally, we present the ROBOKOP Knowledge Graph Builder (KGB), which constructs the KG and provides an extensible framework to handle graph query over and integration of federated data sources.
Machine Learning and Deep Learning
Development of New Methods Needs Proper Evaluation—Benchmarking Sets for Machine Learning Experiments for Class A GPCRs
Damian Leśniak, Sabina Podlewska*, Stanisław Jastrzębski, Igor Sieradzki, Andrzej J. Bojarski, and Jacek Tabor
New computational approaches for virtual screening applications are constantly being developed. However, before a particular tool is used to search for new active compounds, its effectiveness in that type of task must be examined. In this study, we conducted a detailed analysis of various aspects of preparing data sets for such an evaluation. We propose a protocol for fetching data from the ChEMBL database, examine various compound representations in terms of the possible bias resulting from the way they are generated, and define a new metric for comparing the structural similarity of compounds, which is in line with chemical intuition. The newly developed method is also used to evaluate various approaches for dividing the data set into training and test parts, which are also examined in detail as a possible source of bias in the results. Finally, machine learning methods are applied in cross-validation studies of the data sets constructed within the paper, which constitute benchmarks for the assessment of computational methods developed for virtual screening tasks. Additionally, analogous data sets for class A G protein-coupled receptors (100 targets with the highest number of records) were prepared. They are available at http://gmum.net/benchmarks/, together with a script enabling reproduction of all results, which is available at https://github.com/lesniak43/ananas.
Convolutional Neural Networks for the Design and Analysis of Non-Fullerene Acceptors
Shi-Ping Peng and Yi Zhao*
A convolutional neural network (CNN) is employed to construct generative and prediction models for the design and analysis of non-fullerene acceptors (NFAs) in organic solar cells. It is demonstrated that the dilated causal CNN can be trained as a good string-based molecular generation model, and the diversity of the generated NFAs is influenced by the depth of the convolutional layers. In the property prediction model, the features of NFAs are extracted from the string representations by the dilated CNN. Specifically, the attention mechanism is adopted to pool the extracted information, from which the contributions of fragments to molecular properties can be obtained by calculating the corresponding weighted sum. The promising NFAs among the predicted molecules are further verified by quantum chemistry calculations. The proposed generative and prediction models, together with the theoretical calculations, form a complete cycle from molecular generation and property prediction to verification, which offers a strategy for the application of CNNs in material discovery.
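To make the term "dilated causal" concrete, the following is a minimal pure-Python sketch of a single one-dimensional dilated causal convolution. It is an illustration only and not the authors' model: the function name, the zero padding on the left, and the toy weights are assumptions made for this example.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[t] = sum_k weights[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero
    (an assumed padding choice), so y[t] never depends on x[t + 1:].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # skip taps that fall before the sequence start
                acc += w * x[idx]
        out.append(acc)
    return out


# Toy input standing in for an encoded string; not data from the paper.
signal = [1, 2, 3, 4, 5]
print(dilated_causal_conv1d(signal, weights=[1, 1], dilation=2))
```

Stacking such layers with growing dilation widens the receptive field over the string while each output still sees only earlier characters, which is the property a string generator needs.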
Machine Learning Models Based on Molecular Fingerprints and an Extreme Gradient Boosting Method Lead to the Discovery of JAK2 Inhibitors
Minjian Yang, Bingzhong Tao, Chengjuan Chen, Wenqiang Jia, Shaolei Sun, Tiantai Zhang, and Xiaojian Wang*
Developing Janus kinase 2 (JAK2) inhibitors has become a significant focus for small-molecule drug discovery programs in recent years because the inhibition of JAK2 may be an effective approach for the treatment of myeloproliferative neoplasm. Here, based on three different types of fingerprints and Extreme Gradient Boosting (XGBoost) methods, we developed three groups of models, each containing a classification model and a regression model, to accurately acquire highly potent JAK2 kinase inhibitors from the ZINC database. The three classification models resulted in Matthews correlation coefficients of 0.97, 0.94, and 0.97. Docking methods including Glide and AutoDock Vina were employed to evaluate the virtual screening effectiveness of our classification models. The R² values of the three regression models were 0.80, 0.78, and 0.80. Finally, 13 compounds were biologically evaluated, and six of them had IC50 values below 100 nM. Among them, compound 9 showed high activity and selectivity, with an IC50 value of less than 1 nM against JAK2 versus 694 nM against JAK3. The strategy developed may be generally applicable in ligand-based virtual screening campaigns.
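For readers unfamiliar with the Matthews correlation coefficient quoted above, the sketch below computes it from confusion-matrix counts using the standard definition. The function name and the example counts are hypothetical and are not taken from the paper; returning 0.0 when the denominator vanishes is a common convention assumed here.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
print(round(matthews_corrcoef(tp=95, tn=97, fp=3, fn=5), 3))
```

The coefficient ranges from -1 (every prediction wrong) to +1 (every prediction right) and stays informative on imbalanced active/inactive sets, which is why it is often preferred over plain accuracy in virtual screening.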
Machine-Learning-Based Predictive Modeling of Glass Transition Temperatures: A Case of Polyhydroxyalkanoate Homopolymers and Copolymers
Ghanshyam Pilania*, Carl N. Iverson, Turab Lookman, and Babetta L. Marrone
Polyhydroxyalkanoate-based polymers—being ecofriendly, biosynthesizable, and economically viable and possessing a broad range of tunable properties—are currently being actively pursued as promising alternatives for petroleum-based plastics. The vast chemical complexity accessible within this class of polymers gives rise to challenges in the rational discovery of novel polymer chemistries for specific applications. The burgeoning field of polymer informatics addresses this challenge via providing tools and strategies for accelerated property prediction and materials design via surrogate machine-learning models built on reliable past data. In this contribution, we use glass transition temperature Tg as an example target property to demonstrate promise of the data-enabled route to accelerated learning of accurate structure–property mappings in PHA-based polymers. Our analysis uses a data set of experimentally measured Tg values, polymer molecular weights, and a polydispersity index for PHA-based homo- and copolymers that was carefully assembled from the literature. A fingerprinting scheme that captures key properties based on topology, shape, and charge/polarity of specific chemical units or motifs forming the polymer backbone was devised to numerically represent the polymers. A validated statistical learning model is then developed to allow for a mapping of the polymer fingerprints onto the property space in a physically meaningful and reliable manner. Once developed, the model can not only rapidly predict the property of new PHA polymers but also provide uncertainties underlying the predictions. The model is further combined with an evolutionary-algorithm-based search strategy to efficiently identify multicomponent polymer compositions with a prespecified Tg. While the present contribution is focused specifically on Tg, the surrogate model development approach put forward here is general and can, in principle, be extended to a range of other properties.
Chemical Information
Prediction and Interpretable Visualization of Retrosynthetic Reactions Using Graph Convolutional Networks
Shoichi Ishida, Kei Terayama, Ryosuke Kojima, Kiyosei Takasu, and Yasushi Okuno*
Recently, many research groups have been addressing data-driven approaches for (retro)synthetic reaction prediction and retrosynthetic analysis. Although the performances of the data-driven approach have progressed because of recent advances of machine learning and deep learning techniques, problems such as improving capability of reaction prediction and the black-box problem of neural networks persist for practical use by chemists. To spread data-driven approaches to chemists, we focused on two challenges: improvement of retrosynthetic reaction prediction and interpretability of the prediction. In this paper, we propose an interpretable prediction framework using graph convolutional networks (GCN) for retrosynthetic reaction prediction and integrated gradients (IG) for visualization of contributions to the prediction to address these challenges. As a result, from the viewpoint of balanced accuracies, our model showed better performances than the approach using an extended-connectivity fingerprint. Furthermore, IG-based visualization of the GCN prediction successfully highlighted reaction-related atoms.
Quantifying Inter-Residue Contacts through Interaction Energies
Thomas J. Summers, Baty P. Daniel, Qianyi Cheng, and Nathan J. DeYonker*
The validity and accuracy of protein modeling is dependent on constructing models that account for the inter-residue interactions crucial for protein structure and function. Residue interaction networks derived from interatomic van der Waals contacts have previously demonstrated usefulness toward designing protein models, but there has not yet been evidence of a connection between network-predicted interaction strength and quantitative interaction energies. This work evaluates the intraprotein contact networks of five proteins against ab initio interaction energies computed using symmetry-adapted perturbation theory. To more appropriately capture the local chemistry of the protein, we deviate from traditional protein network analysis to redefine the interacting nodes in terms of main chain and side chain functional groups rather than complete amino acids. While there is no simple correspondence between the features of the contact network and actual interaction strength, random forest models constructed from minimal structural, network, and chemical descriptors are capable of accurately predicting interaction energy. The results of this work serve as a foundation for the development and improvement of functional group-based contact networks.
Computational Chemistry
Framework for Inverse Mapping Chemistry-Agnostic Coarse-Grained Simulation Models into Chemistry-Specific Models
Christian Nowak, Mayank Misra, and Fernando A. Escobedo*
Coarse-grained (CG) models have allowed molecular simulations to access large enough time and length scales to elucidate relationships between macroscale properties and microscale molecular interactions. However, an unaddressed inverse-design problem concerns the identification of an optimal chemistry-specific (CS) molecule that the generic CG model represents. This has been addressed here by introducing new tools for automatically generating and refining the mapping of CS-molecule candidates to the constraints of a CG model, based on representative optimization criteria. With these tools, for each CS-molecule from a candidate group, the best mapping of that molecule onto the CG model is found and the fit is assessed by an objective function designed to emphasize matching key properties of the CG model. We apply this methodology to a range of CG models, from small solvent molecules up to block copolymer systems, to show its ability to find optimal candidates and to uncover the underlying length scale of some of the CG models. For instances where the identity of the CG model is known a priori, the methodology identifies the correct AA chemistry. For instances where the identity is unknown and a pool of candidates is provided, the method selects a chemistry that aligns well with physical intuition. The best candidate chemistry is also found to be sensitive to changes to the CG model.
Propagation of Holes and Electrons in Metal–Organic Frameworks
Maximilian Kriebel, Matthias Hennemann, Frank R. Beierlein, Dana D. Medina, Thomas Bein, and Timothy Clark*
Charge transport in two zinc metal–organic frameworks (MOFs) has been investigated using periodic semiempirical molecular orbital calculations with the AM1* Hamiltonian. Restricted Hartree–Fock calculations underestimate the band gap using Koopmans' theorem (ca. 2 eV compared to the experimental value of 2.8 eV). However, it almost doubles when the constraint on the wave function to remain spin-restricted is removed and the energies of the UHF Natural Orbitals are used. Charge-transport simulations using propagation of the electron- or hole-density in imaginary time allow charge-transport paths and mechanisms to be determined. The calculated relative mobilities in the directions of the three crystal axes agree with experimental expectations, but the absolute values are not reliable using the current technique. Hole-mobility along the crystal c-axis (along the metal stacks) is found to be 13 times higher in the zinc MOF with anthracene linker (Zn-ANMOF-74) than in the other directions, whereas the factor is far smaller (1.7) for electron mobility. Directional preferences are far less distinct in the equivalent structure with phenyl linkers (Zn-MOF-74). The imaginary-time simulation technique does not give quantitative mobilities. The simulations reveal a change in mechanism between the different directions: coherent polaron migration is observed along the stacks, but tunneling hops are observed between them.
Multisolvent Similarity Measure of Chinese Herbal Medicine Ingredients for Cold–Hot Nature Identification
Guohui Wei*, Xianjun Fu, and Zhenguo Wang*
Cold–hot nature theory is the core basic theory of traditional Chinese medicine (TCM). “Treating the hot syndrome with cold nature medicine and treating cold syndrome with hot nature medicine” indicates that correct classification of medical properties (cold or hot nature) of Chinese herbal medicines (CHMs) is an important basis for TCM treatment. In this study, we propose a novel multisolvent similarity measure retrieval scheme (MSSMRS) for discriminating CHMs as cold or hot. We explore a multisolvent distance metric learning algorithm to calculate similarity measure of CHM ingredients, and a retrieval scheme for nature identification. First, four solvents (chloroform, distilled water, absolute ethanol, and petroleum ether) are applied to extract ultraviolet (UV) spectrum data of CHM ingredients. Second, we study quantifying the similarity of CHM ingredients to fingerprint similarity. We explore a multisolvent distance metric learning (MSDML) algorithm to measure the similarity of CHM ingredients. MSDML can discover complementary characteristics of different solvent data sets through an optimization algorithm. Finally, a retrieval scheme is designed to analyze the relationship between the CHM ingredients and cold–hot nature. Extensive experimental results demonstrate that CHMs with similar compositions of substances have similar medicinal natures. Experimental evaluations based on the proposed retrieval scheme suggest the effectiveness of MSDML in the identification of the nature of CHMs.
GroScore: Accurate Scoring of Protein–Protein Binding Poses Using Explicit-Solvent Free-Energy Calculations
Jan Walther Perthold and Chris Oostenbrink*
Protein–protein docking algorithms promise a potential relief for the mismatch between the number of experimentally determined complex structures and the number of relevant protein interactions in an organism. To distinguish correctly from wrongly generated poses, it is necessary to score complexes according to their structural similarity to the real complex, which is usually done by computing interaction energies of some sort. Here, we explore the potential of free-energy calculations with statistical-mechanical foundation in the context of molecular dynamics (MD) simulations with explicit solvent to score a large number of complex poses. We introduce an adaptive sampling scheme which ensures that most sampling time is spent on the most promising poses. Our approach is illustrated by scoring of all targets in the CAPRI Score_set, a scoring benchmark set, and three additional CAPRI targets, together consisting of more than 22 000 poses. Our scoring scheme shows a performance that is competitive with the most successful approaches that were previously reported. All necessary scripts to run the automated scoring pipeline are available in the Supporting Information for this paper.
Mechanism of Uncoupled Carbocyclization and Epimerization Catalyzed by Two Non-Heme Iron/α-Ketoglutarate Dependent Enzymes
Hong Li, Wenyou Zhu, and Yongjun Liu*
The non-heme iron/α-ketoglutarate dependent enzymes SnoK and SnoN from Streptomyces nogalater are involved in the biosynthesis of the anthracycline nogalamycin. Although they have similar active sites, SnoK is responsible for carbocyclization whereas SnoN solely catalyzes the hydroxyl epimerization. Herein, we performed docking, molecular simulations, and a series of combined quantum mechanics and molecular mechanics (QM/MM) calculations to illuminate the mechanisms of the two enzymes. The catalytic reactions of both enzymes occur on the quintet state surface. For SnoK, the whole reaction includes two separate hydrogen-abstraction steps and one radical addition, and the latter step is calculated to be rate limiting with an energy barrier of 21.7 kcal/mol. Residue D106 is confirmed to participate in the construction of the hydrogen bond network, which plays a crucial role in positioning the bulky substrate in a specific orientation. Moreover, it is found that SnoN is only responsible for the hydrogen abstraction of the intermediate, and no residue was found to be suitable for donating a hydrogen atom to the substrate radical, which further supports the suggestion based on experiments that either a cellular reductant or another enzyme protein could donate a hydrogen atom to the substrate. Our docking results agree with the previous structural study in that the different roles of the two enzymes are achieved by minor changes in the alignment of the substrates in front of the reactive ferryl-oxo species. This work highlights the reaction mechanisms catalyzed by SnoK and SnoN, which is helpful for engineering the enzymes for the biosynthesis of the anthracycline nogalamycin.
Density Functional Theory Transition-State Modeling for the Prediction of Ames Mutagenicity in 1,4 Michael Acceptors
Piers A. Townsend and Matthew N. Grayson*
Assessing the safety of new chemicals, without introducing the need for animal testing, is a task of great importance. The Ames test, a widely used bioassay to assess mutagenicity, can be an expensive, wasteful process with animal-derived reagents. Existing in silico methods for the prediction of Ames test results are traditionally based on chemical category formation and can lead to false positive predictions. Category formation also neglects the intrinsic chemistry associated with DNA reactivity. Activation energies and HOMO/LUMO energies for thirty 1,4 Michael acceptors were calculated using a model nucleobase and were further used to predict the Ames test result of these compounds. The proposed model builds upon existing work and examines the fundamental toxicant–target interactions using density functional theory transition-state modeling. The results show that Michael acceptors with activation energies <20.7 kcal/mol and LUMO energies < −1.85 eV are likely to act as direct mutagens upon exposure to DNA.
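The two thresholds reported in this abstract can be read as a simple screening rule. The sketch below encodes that reading; the function name and the example values are hypothetical, and combining the two criteria with a logical AND is an interpretation of the abstract's wording rather than a confirmed detail of the authors' model.

```python
ACTIVATION_ENERGY_CUTOFF = 20.7  # kcal/mol, value quoted in the abstract
LUMO_ENERGY_CUTOFF = -1.85       # eV, value quoted in the abstract


def likely_direct_mutagen(activation_energy: float, lumo_energy: float) -> bool:
    """Flag a Michael acceptor as a likely direct mutagen.

    Assumes both criteria must hold at once (activation energy below
    the cutoff AND LUMO energy below the cutoff).
    """
    return (activation_energy < ACTIVATION_ENERGY_CUTOFF
            and lumo_energy < LUMO_ENERGY_CUTOFF)


# Hypothetical compounds for illustration; not data from the paper.
print(likely_direct_mutagen(activation_energy=18.2, lumo_energy=-2.10))
print(likely_direct_mutagen(activation_energy=24.5, lumo_energy=-2.10))
```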
Heteroaryldihydropyrimidines Alter Capsid Assembly By Adjusting the Binding Affinity and Pattern of the Hepatitis B Virus Core Protein
Huihui Liu, Susumu Okazaki, and Wataru Shinoda*
Hepatitis B virus (HBV) infections are a major global health concern, for which heteroaryldihydropyrimidines (HAPs) have been developed. HAPs accelerate and/or result in aberrant capsid assembly; however, their effect on the assembly mechanism is unknown. This study aimed to compare the effects of three representative HAPs on core protein dimer assembly through molecular dynamics simulations and free energy calculations. Molecular docking and equilibrium simulations showed that different HAPs bind at the same binding site and are involved in different interactions. The observed conformational changes in HAPs deter the calculation of binding affinity. Herein, the reduced free energy perturbation/Hamiltonian replica exchange molecular dynamics method was used to enhance sampling during binding affinity calculations, indicating consistency between the binding free energies of HAPs and pEC50. Furthermore, binding pattern analysis revealed that the tetramer could sample flat structures after binding HAPs. The present results suggest a mechanism wherein HAPs accelerate capsid assembly by increasing the binding affinity of dimers, leading to aberrant assembly by altering the binding orientation of dimers.
Computational Biochemistry
Radical Stabilization Energies for Enzyme Engineering: Tackling the Substrate Scope of the Radical Enzyme QueE
Christian J. Suess, Floriane L. Martins, Anna K. Croft, and Christof M. Jäger*
Experimental assessment of catalytic reaction mechanisms and profiles of radical enzymes can be severely challenging due to the reactive nature of the intermediates and sensitivity of cofactors such as iron–sulfur clusters. Here, we present an enzyme-directed computational methodology for the assessment of thermodynamic reaction profiles and screening for radical stabilization energies (RSEs) for the assessment of catalytic turnovers in radical enzymes. We have applied this new screening method to the radical S-adenosylmethionine enzyme 7-carboxy-7-deazaguanine synthase (QueE), following a detailed molecular dynamics (MD) analysis that clarifies the role of both specific enzyme residues and bound Mg2+, Ca2+, or Na+. The MD simulations provided the basis for a statistical approach to sample different conformational outcomes. RSE calculation at the M06-2X/6-31+G* level of theory provided the most computationally cost-effective assessment of enzyme-based energies, facilitated by an initial triage using semiempirical methods. The impact of intermolecular interactions on RSE was clearly established, and application to the assessment of potential alternative substrates (focusing on radical clock type rearrangements) proposes a selection of carbon-substituted analogues that would react to afford cyclopropylcarbinyl radical intermediates as candidates for catalytic turnover by QueE.
Molecular Docking as a Promising Predictive Model for Silver Nanoparticle-Mediated Inhibition of Cytochrome P450 Enzymes
Nootcharin Wasukan, Mayuso Kuno, and Rawiwan Maniratanachote*
Cytochrome P450 (CYP) enzymes are responsible for oxidative metabolisms of a large number of xenobiotics. In this study, we investigated interactions of silver nanoparticles (AgNPs) and silver ions (Ag+) with six CYP isoforms, namely, CYP1A2, CYP2C9, CYP2C19, CYP2D6, CYP2E1, and CYP3A4, within CYP-specific inhibitor-binding pockets by molecular docking and quantum mechanical (QM) calculations. The docking results revealed that the Ag3 cluster, not Ag+, interacted with key amino acids of CYP2C9, CYP2C19, and CYP2D6 within a distance of about 3 Å. Moreover, the QM analysis confirmed that the amino acid residues of these CYP enzymes strongly interacted with the Ag3 cluster, providing more insight into the mechanism of the potential inhibition of CYP enzyme activities. Interestingly, these results are consistent with previous in vitro data indicating that AgNPs inhibited activities of CYP2C and CYP2D in rat liver microsomes. It is suggested that the Ag3 cluster is a minimal unit of AgNPs for in silico modeling. In summary, we demonstrated that molecular docking, together with QM analysis, is a promising tool to predict AgNP-mediated CYP inhibition. These methods are useful for deeper understanding of reaction mechanisms and could be used for other nanomaterials.
Estimation of Protein–Ligand Unbinding Kinetics Using Non-Equilibrium Targeted Molecular Dynamics Simulations
Steffen Wolf*, Marta Amaral, Maryse Lowinski, Francois Vallée, Djordje Musil, Jörn Güldenhaupt, Matthias K. Dreyer, Jörg Bomke, Matthias Frech, Jürgen Schlitter, and Klaus Gerwert
We here report on nonequilibrium targeted molecular dynamics simulations as a tool for the estimation of protein–ligand unbinding kinetics. Correlating simulations with experimental data from SPR kinetics measurements and X-ray crystallography on two small molecule compound libraries bound to the N-terminal domain of the chaperone Hsp90, we show that the mean nonequilibrium work computed in an ensemble of trajectories of enforced ligand unbinding is a promising predictor for ligand unbinding rates. We furthermore investigate the molecular basis determining unbinding rates within the compound libraries. We propose ligand conformational changes and protein–ligand nonbonded interactions to impact on unbinding rates. Ligands may remain longer at the protein if they exhibit strong electrostatic and/or van der Waals interactions with the target. In the case of ligands with a rigid chemical scaffold that exhibit longer residence times, transient electrostatic interactions with the protein appear to facilitate unbinding. Our results imply that understanding the unbinding pathway and the protein–ligand interactions along this path is crucial for the prediction of small molecule ligands with defined unbinding kinetics.
Assessing Peptide Binding to MHC II: An Accurate Semiempirical Quantum Mechanics Based Proposal
Carlos A. Ortiz-Mahecha, Hugo J. Bohórquez, William A. Agudelo, Manuel A. Patarroyo, Manuel E. Patarroyo, and Carlos F. Suárez*
Estimating peptide–major histocompatibility complex (pMHC) binding using structural computational methods has an impact on understanding overall immune function triggering adaptive immune responses in MHC class II molecules. We developed a strategy for optimizing pMHC structure interacting with water molecules and for calculating the binding energy of receptor + ligand systems, such as HLA-DR1 + HA, HLA-DR1 + CLIP, HLA-DR2 + MBP, and HLA-DR3 + CLIP, as well as a monosubstitution panel. Taking pMHC’s structural properties, we assumed that ΔH ≫ −TΔS would generate a linear model for estimating relative free energy change, using three semiempirical quantum methods (PM6, PM7, and FMO-SCC-DFTB3) along with the implicit solvent models, and considering proteins in neutral and charged states. Likewise, we confirmed our approach’s effectiveness in calculating binding energies having high correlation with experimental data and low root-mean-square error (<2 kcal/mol). All in all, our pipeline differentiates weak from strong peptide binders as a reliable method for studying pMHC interactions.
Catalytic Mechanism and Covalent Inhibition of UDP-N-Acetylglucosamine Enolpyruvyl Transferase (MurA): Implications to the Design of Novel Antibacterials
Levente M. Mihalovits, György G. Ferenczy, and György M. Keserű*
UDP-N-acetylglucosamine enolpyruvyl transferase (MurA) catalyzes the first step in the biosynthesis of the bacterial cell wall. This pathway is essential for the growth of bacteria but missing in mammals, which makes MurA an attractive antibacterial target. MurA has a flexible loop whose conformational change is known to be part of the activation mechanism of the enzyme. We have shown that the loop-closed conformation makes the proton transfer from Cys115 to His394 possible by a low-barrier exothermic process. QM/MM MD simulations revealed that the activated thiolate is able to react with phosphoenolpyruvate (PEP), the natural substrate of MurA. The binding free energy profile of several covalent inhibitors with various warheads reacting with the activated Cys115 was calculated by QM/MM MD simulations and confirmed that reaction barrier heights tend to separate active from inactive compounds. Our results give new insight into the catalytic mechanism and covalent inhibition of MurA and suggest that QM/MM MD simulations are able to support ligand design by providing sensible relative free energy barriers for covalent inhibitors with various warheads reacting with thiolate nucleophiles.
Regulatory Mechanics of Constitutive Androstane Receptors: Basal and Ligand-Directed Actions
Bill Pham, Avery Bancroft Arons, Jeremy G. Vincent, Elias J. Fernandez, and Tongye Shen*
Constitutive androstane receptor (CAR) is a nuclear hormone receptor that primarily functions in sensing and metabolizing xenobiotics. The basal activity of this receptor is relatively high, and CAR is deemed active in the absence of ligand. The (over)activation can promote drug toxicity and tumor growth. Thus, therapeutic treatments seek inverse agonists to inhibit or modulate CAR activities. To advance our understanding of the regulatory mechanisms of CAR, we used computational and experimental approaches to elucidate three aspects of CAR activation and inactivation: (1) ligand-dependent actions, (2) ligand-orthologue specificity, and (3) constitutive activity. For ligand-dependent actions, we examined the ligand-bound simulations and identified two sets of ligand-induced contacts promoting CAR activation via coactivator binding (H11–H12 contact) or inactivation via corepressor binding (H4–H11 contact). For orthologue specificity, we addressed a puzzling fact that murine CAR (mCAR) and human CAR (hCAR) respond differently to the same ligand (CITCO), despite their high sequence homology. We found that the helix H7 of hCAR is responsible for a stronger binding of the ligand CITCO compared to mCAR, hence a stronger CITCO-induced activation. For basal activity, we reported computer-generated unliganded CAR structures and critical mutagenesis (mCAR’s V209A and N333D) results of a cell-based transcription assay. Our results reveal that the basal conformation of CAR shares prominent features with the agonist-bound form, and helix HX has an important contribution to the constitutive activity. These findings altogether can be useful for the understanding of constitutively active receptors and the design of drug molecules targeting them.
Insights to the Binding of a Selective Adenosine A3 Receptor Antagonist Using Molecular Dynamic Simulations, MM-PBSA and MM-GBSA Free Energy Calculations, and Mutagenesis
Panagiotis Lagarias, Kerry Barkan, Eva Tzortzini, Margarita Stampelou, Eleni Vrontaki, Graham Ladds*, and Antonios Kolocouris*
Adenosine A3 receptor (A3R) is a promising drug target for cancer and for a number of other conditions such as inflammatory diseases, including asthma and rheumatoid arthritis, glaucoma, chronic obstructive pulmonary disease, and ischemic injury. Currently, there is no experimentally determined structure of A3R. We explored the binding profile of O4-{[3-(2,6-dichlorophenyl)-5-methylisoxazol-4-yl]carbonyl}-2-methyl-1,3-thiazole-4-carbohydroximamide (K18), which is a new specific and competitive antagonist at the orthosteric binding site of A3R. MD simulations and MM-GBSA calculations of the WT A3R in complex with K18, combined with in vitro mutagenic studies, show that in the most plausible binding conformation the dichlorophenyl group of K18 is oriented toward trans-membrane helices (TM) 5 and 6, and they reveal important residues for binding. Further, MM-GBSA calculations distinguish mutations that reduce, maintain, or increase antagonistic activity. Our studies show that selectivity of K18 toward A3R is defined not only by direct interactions with residues within the orthosteric binding area but also by remote residues playing a significant role. Although V169^5.30 is considered to be a selectivity filter for A3R binders, when it was mutated to glutamic acid, K18 maintained antagonistic potency, in agreement with our previous results obtained when investigating the binding profile of agonists. Mutation of the directly interacting residue L90^3.32 in the low region and the remote L264^7.35 in the middle/upper region to alanine increases antagonistic potency, suggesting an empty space in the orthosteric area available for increasing antagonist potency. These results validate the computational model for the description of K18 binding at A3R, which we previously applied to agonists binding to A3R, and support the design of more effective antagonists based on K18.
Nontargeted Parallel Cascade Selection Molecular Dynamics Based on a Nonredundant Selection Rule for Initial Structures Enhances Conformational Sampling of Proteins
Ryuhei Harada *- ,
Vladimir Sladek *- , and
Yasuteru Shigeta *
Nontargeted parallel cascade selection molecular dynamics (nt-PaCS-MD) is a method for enhanced conformational sampling of proteins. To search a broad conformational subspace, nt-PaCS-MD repeats cycles of conformational resampling from relevant initial structures. Generally, the conformational sampling efficiency of nt-PaCS-MD depends on a selection rule for the initial structures. In the original nt-PaCS-MD, the initial structures were selected by referring to structural distributions of protein configurations generated by conformational resampling (multiple short-time MD simulations). However, structural redundancy among the initial structures was neglected across the cycles of conformational resampling, meaning that similar protein configurations might be frequently specified and resampled in every cycle of the original nt-PaCS-MD. To reduce the possibility of resampling from redundant initial structures, we propose an alternative selection rule that accounts for structural similarity among the initial structures. Specifically, a pairwise root-mean-square deviation (RMSD) is defined for all of the initial structures selected in all of the past cycles. Then a set of protein configurations with a larger pairwise RMSD is sequentially specified and resampled in the next cycle, which can be regarded as a history-dependent selection of initial structures that considers a profile of the previously specified initial structures. The present scheme, termed extended nt-PaCS-MD, avoids resampling a set of redundant protein configurations. To check the conformational sampling efficiency of the extended nt-PaCS-MD, we used a middle-sized protein, T4 lysozyme, in explicit water. In this assessment, the extended nt-PaCS-MD identified the open–closed transitions of T4 lysozyme more efficiently than the original nt-PaCS-MD.
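The history-dependent selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the greedy max-min rule, the function names `rmsd` and `select_nonredundant`, and the assumption that structures are already superimposed are all illustrative choices.

```python
import math


def rmsd(a, b):
    """RMSD between two conformations given as equal-length lists of (x, y, z).

    Assumption: the two structures are already superimposed.
    """
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def select_nonredundant(candidates, history, n_select):
    """Greedily pick indices of candidates that are far, in RMSD, from every
    previously selected structure (history) and from each other."""
    chosen = []
    reference = list(history)           # structures used in past cycles
    pool = list(range(len(candidates)))
    while pool and len(chosen) < n_select:
        def novelty(i):
            # Distance to the closest already-used structure.
            return min((rmsd(candidates[i], r) for r in reference),
                       default=float("inf"))
        best = max(pool, key=novelty)   # most dissimilar candidate
        chosen.append(best)
        reference.append(candidates[best])  # the history grows as we select
        pool.remove(best)
    return chosen


# Toy one-atom "structures" to show the behavior.
history = [[(0.0, 0.0, 0.0)]]
candidates = [[(0.0, 0.0, 0.0)], [(0.1, 0.0, 0.0)],
              [(5.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
picked = select_nonredundant(candidates, history, n_select=2)
```

In this toy case the two candidates closest to the history are skipped, which is the redundancy the extended scheme is meant to avoid.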
Do Cholesterol and Sphingomyelin Change the Mechanism of Aβ25–35 Peptide Binding to Zwitterionic Bilayer?
Amy K. Smith - ,
Elias Khayat - ,
Christopher Lockhart - , and
Dmitri K. Klimov *
Using replica exchange with solute tempering all-atom molecular dynamics, we studied the equilibrium binding of Aβ25–35 peptide to the ternary bilayer composed of an equimolar mixture of dimyristoylphosphatidylcholine (DMPC), N-palmitoylsphingomyelin (PSM), and cholesterol. Binding of the same peptide to the pure DMPC bilayer served as a control. Due to significant C-terminal hydrophobic moment, binding to the ternary and DMPC bilayers promotes helical structure in the peptide. For both bilayers a polarized binding profile is observed, in which the N-terminus anchors to the bilayer surface, whereas the C-terminus alternates between unbound and inserted states. Both ternary and DMPC bilayers feature two Aβ25–35 bound states, surface bound, S, and inserted, I, separated by modest free energy barriers. Experimental data are in agreement with our results but indicate that cholesterol impact is Aβ fragment dependent. For Aβ25–35, we predict that its binding mechanism is independent of the inclusion of PSM and cholesterol into the bilayer.
Investigating Reliable Conditions for HEWL as an Amyloid Model in Computational Studies and Drug Interactions
Hamid R. Kalhor *- and
Mohammadparsa Jabbary
A number of conformational diseases in humans have been associated with protein/peptide fibrillation known as amyloid. Although extensive studies have been conducted to understand the molecular basis of amyloid formation, a detailed mechanism is still missing. Experimentally, HEWL (hen egg white lysozyme) has been exploited ubiquitously as a model protein for amyloid fibrillation and drug inhibition. However, computational studies of HEWL fibril formation have been difficult to perform, mainly due to the high stability of lysozymes and the absence of crystal structures of HEWL fibril oligomers. In this study, we have examined various conditions of HEWL amyloid formation computationally; the results indicated that, at a high concentration of ethanol (90%), significant unfolding of the protein was apparent. Higher values for RMSD, solvent accessibility, and solvent diffusion into the core, as well as conversion of native α-helical structures to random coils, were detected in the ethanol solution. REMD (replica exchange molecular dynamics) analysis demonstrated that the presence of ethanol significantly altered the minimum structure of HEWL into partially unfolded states. Unfolding of the protein was observed to start from the C-terminal region, exposing the protein to the solvent. The interaction of a previously known antiamyloid drug (RS-0406) with HEWL was analyzed at a high concentration of ethanol both in silico and in vitro. The results demonstrated that the drug was able to attenuate HEWL unfolding and fibrillation both experimentally and computationally. Computational studies provided detailed interactions explaining the inhibitory effect of the drug in this model. Most importantly, a mechanism of drug inhibition was proposed based on a bridge formed by the drug that stabilized the C-terminus. All in all, a computational model of HEWL amyloid formation was attained which can be employed to assess inhibitory effects of antiamyloid drugs in a reasonable processing time.
Computing the Pathogenicity of Wilson’s Disease ATP7B Mutations: Implications for Disease Prevalence
Ning Tang - ,
Thomas D. Sandahl - ,
Peter Ott - , and
Kasper P. Kepp *
Genetic variations in the gene encoding the copper-transport protein ATP7B are the primary cause of Wilson’s disease. Controversially, clinical prevalence seems much smaller than the prevalence estimated by genetic screening tools, causing fear that many people are undiagnosed, although early diagnosis and treatment are essential. To address this issue, we benchmarked 16 state-of-the-art computational disease-prediction methods against established data of missense ATP7B mutations. Our results show that the quality of the methods varies widely. We show the importance of optimizing the threshold that each method uses to distinguish pathogenic from nonpathogenic mutations against data of clinically confirmed pathogenic and nonpathogenic mutations. We find that most methods use thresholds that predict too many ATP7B mutations to be pathogenic. Thus, our findings explain the current controversy on Wilson’s disease prevalence because meta-analysis and text search methods include many computational estimates that lead to higher disease prevalence than clinically observed. As proteins and diseases differ widely, a one-size-fits-all threshold cannot distinguish pathogenic and nonpathogenic mutations efficiently, as shown here. We also show that amino acid changes with small evolutionary substitution probability, mainly due to amino acid volume, are more associated with the disease, implying a pathological effect on the conformational state of the protein, which could affect copper transport or adenosine triphosphate recognition and hydrolysis. These findings may be a first step toward a more quantitative genotype–phenotype relationship of Wilson’s disease.
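The idea of tuning a predictor's decision threshold against clinically confirmed labels can be sketched as follows. The abstract does not state which criterion was optimized, so the use of Youden's J (sensitivity + specificity - 1), the function name `best_threshold`, and the toy scores are assumptions made only for illustration.

```python
def best_threshold(scores, labels):
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    A variant is called pathogenic when its score >= threshold.
    labels: 1 for clinically confirmed pathogenic, 0 for confirmed nonpathogenic.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both pathogenic and nonpathogenic examples are needed")
    best = (None, -1.0)
    for t in sorted(set(scores)):       # every observed score is a candidate cut
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < t)
        j = tp / pos + tn / neg - 1.0
        if j > best[1]:
            best = (t, j)
    return best


# Hypothetical predictor scores; a generic 0.5 cut would flag the
# nonpathogenic variant scored 0.55 as pathogenic.
scores = [0.2, 0.4, 0.55, 0.6, 0.8, 0.9]
labels = [0, 0, 0, 1, 1, 1]
threshold, j = best_threshold(scores, labels)
```

With these toy data the tuned cut sits above the generic 0.5 default, mirroring the abstract's point that default thresholds can over-predict pathogenicity.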
Pharmaceutical Modeling
Investigation of Crystal Structures in Structure-Based Virtual Screening for Protein Kinase Inhibitors
Xingye Chen - ,
Haichun Liu - ,
Wuchen Xie - ,
Yan Yang - ,
Yuchen Wang - ,
Yuanrong Fan - ,
Yi Hua - ,
Lu Zhu - ,
Junnan Zhao - ,
Tao Lu - ,
Yadong Chen *- , and
Yanmin Zhang *
Protein kinases are important drug targets in several therapeutic areas, and structure-based virtual screening (SBVS) is an important strategy in discovering lead compounds for kinase targets. However, there are multiple crystal structures available for each target, and determining which one is the most favorable is a key step in molecular docking for SBVS due to the ligand induced-fit effect. This work aimed to find the most desirable crystal structures for molecular docking by a comprehensive analysis of the protein kinase database, which covers 190 different kinases from all eight main kinase families. Through an integrated self-docking and cross-docking evaluation, 86 targets were eventually evaluated on a total of 2608 crystal structures. Results showed that molecular docking is highly capable of reproducing the conformation of crystallized ligands. For each target, the most favorable crystal structure was selected, and the AGC family outperformed the other family targets based on RMSD comparison. In addition, RMSD values, GlideScore, and corresponding bioactivity data were compared and demonstrated certain relationships. This work makes it convenient for researchers to directly select the optimal crystal structure in SBVS-based kinase drug design and further validates the effectiveness of molecular docking in drug discovery.
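One way to picture how cross-docking results might be turned into a per-target structure choice is sketched below. The ranking criterion (fraction of ligands reproduced within 2.0 Å, ties broken by mean RMSD), the function name `rank_structures`, and the structure labels are assumptions for illustration; the abstract does not specify the authors' selection rule.

```python
def rank_structures(rmsd_table, cutoff=2.0):
    """Rank receptor structures of one target from cross-docking RMSDs.

    rmsd_table maps a structure label to the RMSD values (in angstroms)
    obtained when each known ligand is docked into that structure.
    Structures are ordered by success fraction (RMSD <= cutoff), highest
    first, then by mean RMSD, lowest first.
    """
    rows = []
    for label, values in rmsd_table.items():
        success = sum(v <= cutoff for v in values) / len(values)
        mean = sum(values) / len(values)
        rows.append((label, success, mean))
    rows.sort(key=lambda r: (-r[1], r[2]))
    return rows


# Hypothetical RMSD values for three crystal structures of one kinase.
table = {
    "struct_A": [0.5, 1.5, 3.0],
    "struct_B": [1.0, 1.0, 1.9],
    "struct_C": [2.5, 4.0, 0.8],
}
ranking = rank_structures(table)
best_label = ranking[0][0]
```

The 2.0 Å cutoff is a commonly used convention for a successfully reproduced pose, exposed here as a parameter so it can be changed.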
Accelerated Structural Prediction of Flexible Protein–Ligand Complexes: The SLICE Method
James M. B. McFarlane *- ,
Katherine D. Krause - , and
Irina Paci *
Using existing and academically available software, we present a new method for the structural prediction of binding events involving flexible protein targets. SLICE (Selective Ligand-Induced Conformational Ensemble) combines opportunistic stochastic jumps of ligand position with standard molecular dynamics to model the induced-fit binding of ligands starting with unbound host coordinates. To induce the structural adaptations of the complex at the binding site, conformational jumps in ligand position are selected in SLICE from structures generated by docking software. Multiple binding trajectories from the docking set are followed using molecular dynamics for a set time to relax the host structure and generate new host poses. A new configurational jump is made on the set of newly generated host poses. The process is then repeated. The method was implemented with AutoDock Vina as the docking method, Vina scores as the selection criterion, and Amber code for molecular dynamics and applied to several test systems. A system consisting of Chromobox protein homologue 8 (CBX8) and its small peptide ligand, H3K9Me3, for which the final (bound) configuration is known, is used for verifying SLICE in the present setup. The setup was also applied to several nonpeptide molecules on known difficult flexible targets exhibiting a large disparity between apo and holo host states. The SLICE simulations provide a promising approach to generate induced-fit configurations compared to existing long (microsecond) classical and accelerated dynamics approaches in all the test systems considered here. However, further optimization of SLICE parameters is required for replicating crystal structure coordinates for some systems. We discuss in the following pages the various SLICE parameters and how they can be optimized for the system at hand.
Mechanism of Small Molecules Inhibiting Activator Protein-1 DNA Binding Probed with Induced Fit Docking and Metadynamics Simulations
Zhou Yin *
Transcription factor activator protein-1 (AP-1) binds to cognate DNA and regulates gene expression. In recent decades, small-molecule inhibitors have been developed for therapeutic applications that block AP-1 binding to DNA. However, the mechanism by which small molecules inhibit AP-1-DNA binding remains elusive. Here, computational studies identified a drug-binding site on the AP-1 Fos/Jun apo structure. Induced fit docking of known inhibitors, together with metadynamics simulations to identify the most plausible binding pose, showed a consensus mode of AP-1/inhibitor interaction. The in silico binding mode of the inhibitors suggests a mechanism of AP-1-DNA binding inhibition, where the inhibitors block the base-contacting residues, preclude access of DNA, and prohibit conformational changes of AP-1 upon DNA binding.
Bioinformatics
A Unified Framework for the Prediction of Small Molecule–MicroRNA Association Based on Cross-Layer Dependency Inference on Multilayered Networks
Chun-Chun Wang - and
Xing Chen *
MicroRNAs (miRNAs) play a key role in many critical biological processes and are involved in the occurrence and development of complex human diseases. Many studies have demonstrated that discovering the associations between small molecules (SMs) and miRNAs will facilitate the design of miRNA-targeted therapeutic strategies for complex human diseases. This work presents a computational model of cross-layer dependency inference on multilayered networks for small molecule–miRNA association prediction (CLDISMMA), which constructs multilayered networks composed of SMs, miRNAs, and diseases. It utilizes the within-layer topology and the known cross-layer associations to infer latent representations of all layers for SM–miRNA association prediction. In CLDISMMA, the novelties lie in introducing disease information for SM–miRNA association prediction and utilizing a regularized optimization model to describe the SM–miRNA association prediction problem. To evaluate the performance of CLDISMMA, global leave-one-out cross validation (LOOCV) and miRNA-fixed and SM-fixed local LOOCV were implemented on two data sets. On data set 1, CLDISMMA achieved AUCs of 0.9889, 0.9886, and 0.7755, respectively. The corresponding AUCs were 0.8726, 0.8798, and 0.7021 on data set 2. In addition, CLDISMMA obtained average AUCs of 0.9887 and 0.8647 on data sets 1 and 2 under 100 repetitions of 5-fold cross validation. Furthermore, we employed CLDISMMA to predict SM–miRNA associations based on data set 1, and 21 out of the top 50 predicted associations were confirmed by experimental reports. In the case study for new SMs, 5-fluorouracil and 5-aza-2′-deoxycytidine, 40 and 30 miRNAs, respectively, were verified to be associated with them among the top 50 miRNAs predicted by CLDISMMA.
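The AUC values reported in this abstract summarize how well held-out known associations are ranked above unknown pairs. As a minimal sketch of that metric only (not of CLDISMMA itself), the AUC can be computed as the probability that a positive scores higher than a negative, counting ties as one half; the function name `auc` and the toy scores are assumptions.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: predicted association scores.
    labels: 1 for a held-out known association, 0 for an unknown pair.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both positive and negative examples are needed")
    # Count positive-negative pairs ranked correctly; ties count one half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical scores pooled from leave-one-out rounds.
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
labels = [1, 0, 1, 0, 0]
value = auc(scores, labels)
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based formula would be preferable for large data sets.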
Molecular Interaction between Distal C-Terminal Domain of the CB1 Cannabinoid Receptor and Cannabinoid Receptor Interacting Proteins (CRIP1a/CRIP1b)
Pratishtha Singh - ,
Anjali Ganjiwale - ,
Allyn C. Howlett - , and
Sudha M. Cowsik *
We have investigated the structure of the distal C-terminal domain of the CB1 cannabinoid receptor (CB1R) to study its interactions with CRIP1a and CRIP1b using computational techniques. The amino acid sequence from the distal C-terminal domain of CB1R (G417-L472) was found to be unique, as it does not share sequence similarity with other protein structures, so the structure was predicted using ab initio modeling. The computed model of the distal C-terminal region of CB1R has a helical region between positions 441 and 455. CRIP1a and CRIP1b were modeled using Rho-GDI 2 as a template. The three-dimensional model of the distal C-terminal domain of CB1R was docked with both CRIP1a and CRIP1b to study the crucial interactions between CB1R and CRIP1a/b. The last nine residues of CB1R (S464TDTSAEAL472) are known to be a CRIP1a/b binding site. The majority of the key interactions were identified in this region, but notable interactions were also observed beyond these nine residues. The multiple interactions between Thr418 (CB1R) and Asn61 (CRIP1a) as well as Asp430 (CB1R) and Lys76 (CRIP1a) indicate their importance in the CB1R–CRIP1a interaction. In the case of CRIP1b, multiple hydrogen bond interactions between Asn437 (CB1R) and Glu77 (CRIP1b) were observed. These interactions can be critical for CB1R’s interaction with CRIP1a/b, and targeting them in further experimental studies can advance knowledge of CRIP1a/b functionality.
Errata
Correction to Analyzing Learned Molecular Representations for Property Prediction
Kevin Yang *- ,
Kyle Swanson *- ,
Wengong Jin - ,
Connor Coley - ,
Philipp Eiden - ,
Hua Gao - ,
Angel Guzman-Perez - ,
Timothy Hopper - ,
Brian Kelley - ,
Miriam Mathea - ,
Andrew Palmer - ,
Volker Settels - ,
Tommi Jaakkola - ,
Klavs Jensen - , and
Regina Barzilay
This publication is Open Access under the license indicated. Learn More
Mastheads
Issue Editorial Masthead
Issue Publication Information