3D-e-Chem-VM: Structural Cheminformatics Research Infrastructure in a Freely Available Virtual Machine
- Ross McGuire
- Stefan Verhoeven
- Márton Vass
- Gerrit Vriend
- Iwan J. P. de Esch
- Scott J. Lusher
- Rob Leurs
- Lars Ridder
- Albert J. Kooistra
- Tina Ritschel
- Chris de Graaf
Abstract
3D-e-Chem-VM is an open source, freely available Virtual Machine (http://3d-e-chem.github.io/3D-e-Chem-VM/) that integrates cheminformatics and bioinformatics tools for the analysis of protein–ligand interaction data. 3D-e-Chem-VM consists of software libraries, and database and workflow tools that can analyze and combine small molecule and protein structural information in a graphical programming environment. New chemical and biological data analytics tools and workflows have been developed for the efficient exploitation of structural and pharmacological protein–ligand interaction data from proteomewide databases (e.g., ChEMBLdb and PDB), as well as customized information systems focused on, e.g., G protein-coupled receptors (GPCRdb) and protein kinases (KLIFS). The integrated structural cheminformatics research infrastructure compiled in the 3D-e-Chem-VM enables the design of new approaches in virtual ligand screening (Chemdb4VS), ligand-based metabolism prediction (SyGMa), and structure-based protein binding site comparison and bioisosteric replacement for ligand design (KRIPOdb).
Introduction
3D-e-Chem-VM
GPCRdb Nodes
GPCRDB Protein Families: Extraction of protein family information, including the protein names and classifications of all GPCRs in the four-level hierarchy defined by GPCRdb (class, ligand type, subfamily, subtype).
GPCRDB Protein Information: Retrieval of source, species, and sequence data from UniProt identifiers or protein family identifiers.
GPCRDB Protein Residues: Retrieval of residues and numbering schemes. This node retrieves all residues of the specified protein with secondary structure annotation, UniProt numbering, and GPCR residue numbering. (47)
GPCRDB Structures of a Protein: Retrieval of experimental GPCR structures with literature references, PDB codes, and ligands.
GPCRDB Mutations of a Protein: Retrieval of single point mutations in GPCRs, including the sequence position, mutation, ligand, assay type, mutation effect, protein expression information, and publication reference.
GPCRDB Structure–Ligand Interactions: Returns the sequence numbers of amino acid residues interacting with ligands in the specified PDB entry. The interaction type is annotated in the output table.
GPCRDB Protein Similarity: Returns the sequence identity and similarity of a query receptor versus a set of receptors, based on the full sequence or a specified set of residues.
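The identity and similarity values described for the Protein Similarity node can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and a hypothetical grouping of amino acids into conservative classes; the scoring scheme actually used by GPCRdb is not described here and may differ. The function name and the `positions` argument (the "specified set of residues") are illustrative only.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical conservative-substitution classes (an assumption for this
# sketch, not the scoring scheme used by GPCRdb).
SIMILAR_GROUPS = [set("AVLIM"), set("FYW"), set("KRH"), set("DE"), set("STNQ")]


def _similar(a: str, b: str) -> bool:
    """Return True if two residues are identical or share a class."""
    if a == b:
        return True
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def identity_and_similarity(
    query: str,
    target: str,
    positions: Optional[Iterable[int]] = None,
) -> Tuple[float, float]:
    """Percent identity and similarity of two aligned sequences.

    positions: optional 0-based alignment columns to restrict the
    comparison to (e.g. binding site residues); default is all columns.
    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(query) != len(target):
        raise ValueError("sequences must be aligned to the same length")
    columns = list(positions) if positions is not None else range(len(query))
    pairs = [
        (query[i], target[i])
        for i in columns
        if query[i] != "-" and target[i] != "-"
    ]
    if not pairs:
        return 0.0, 0.0
    identical = sum(a == b for a, b in pairs)
    similar = sum(_similar(a, b) for a, b in pairs)
    return 100.0 * identical / len(pairs), 100.0 * similar / len(pairs)
```

Passing a list of alignment columns restricts the comparison to those residues, which mirrors the choice between the full sequence and a residue subset.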
Figure 1. KNIME workflows to exploit cheminformatics and bioinformatics information on GPCRs (GPCRdb nodes) and protein kinases (KLIFS nodes). In the GPCRdb workflow, KNIME nodes extract and combine protein information, sequence, alternative numbering schemes, mutagenesis data, and experimental structures for a selected receptor from GPCRdb. The lower branch of the workflow returns all sequence identities and similarities of the TM domain for the selected receptors and can be used for further structural chemogenomics analyses (44) using, e.g., structural and structure-based sequence alignments of the ligand binding site residues of crystallized aminergic receptors (available in the VM as a PyMOL session). In the KLIFS workflow, KNIME nodes enable the integrated analysis of structural kinase–ligand interactions from all structures for a specific kinase in KLIFS (human MAPK in the example). Kinase–ligand complexes with a specific hydrogen bond interaction pattern between the ligand and residues in the hinge region of the kinase (stacked bar chart) are selected for an all-against-all comparison of their structural kinase–ligand interaction fingerprints (heat map). The ligands from the selected structures are compared, and the ligand pair with the lowest chemical similarity and a high interaction fingerprint similarity is retrieved from KLIFS for binding mode comparison. Meta nodes in the workflows in panels A and B are indicated with a star (*). The full workflows are provided in the Supporting Information, Figures S2 and S3.
KLIFS Nodes
KLIFS Information Nodes
Kinase ID Mapper: Maps a user-supplied set of kinase names (names according to Manning et al. (48)), HGNC gene symbols, or UniProt accession codes to a KLIFS kinase ID. The output also contains all related kinase information present within KLIFS (see “Kinase Information Retriever”).
Kinase Information Retriever: Returns a table comprising the KLIFS kinase ID, kinase name, HGNC symbol, kinase group, kinase family, kinase class, species, full name, UniProt accession code, IUPHAR ID, and the amino acid sequence of the pocket based on the KLIFS pocket definition using a consistent alignment of 85 residues.
KLIFS Interactions Nodes
Interaction Fingerprint Decomposer: Decomposes a protein–ligand interaction fingerprint (IFP) (49) into a human-readable table with annotated interactions for each structure. This node can optionally add the sequence number and the KLIFS residue position (29) for each pocket residue to the table.
Interaction Fingerprint Retriever: Retrieval of the interaction fingerprint of specific kinase–ligand complexes from KLIFS. The fingerprint has been corrected for gaps and missing residues within the KLIFS pocket, thereby enabling all-against-all comparisons.
Interaction Types Retriever: Retrieves the different interaction types for each bit position of the interaction fingerprint method and can be used in combination with the interaction fingerprint decomposer to identify which kinase–ligand interactions are present in a given set of kinase structures.
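As a rough illustration of what the three interaction nodes do together, the sketch below decomposes a fingerprint bit string into (pocket position, interaction type) rows and compares fingerprints all-against-all with the Tanimoto coefficient. It assumes seven bits per pocket residue in the order listed in `INTERACTION_TYPES`; that layout and the labels are assumptions made for this example, and in a real workflow the mapping should be taken from the Interaction Types Retriever. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed bit layout: 7 interaction types per pocket residue, in this order.
# In practice, take this mapping from the Interaction Types Retriever node.
INTERACTION_TYPES = [
    "hydrophobic contact",
    "aromatic face-to-face",
    "aromatic edge-to-face",
    "H-bond donor (protein)",
    "H-bond acceptor (protein)",
    "ionic (protein cation)",
    "ionic (protein anion)",
]
BITS_PER_RESIDUE = len(INTERACTION_TYPES)


def decompose_ifp(ifp: str) -> List[Tuple[int, str]]:
    """Turn a '0'/'1' fingerprint into (pocket position, interaction) rows.

    Pocket positions are 1-based indices into the aligned pocket.
    """
    if set(ifp) - {"0", "1"}:
        raise ValueError("fingerprint must contain only '0' and '1'")
    if len(ifp) % BITS_PER_RESIDUE != 0:
        raise ValueError("fingerprint length must be a multiple of 7")
    rows = []
    for bit_index, bit in enumerate(ifp):
        if bit == "1":
            position = bit_index // BITS_PER_RESIDUE + 1
            interaction = INTERACTION_TYPES[bit_index % BITS_PER_RESIDUE]
            rows.append((position, interaction))
    return rows


def tanimoto(a: str, b: str) -> float:
    """Tanimoto coefficient of two equal-length bit strings."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    on_a = {i for i, bit in enumerate(a) if bit == "1"}
    on_b = {i for i, bit in enumerate(b) if bit == "1"}
    union = on_a | on_b
    if not union:
        return 0.0
    return len(on_a & on_b) / len(union)


def all_against_all(ifps: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise Tanimoto similarity for every unordered pair of structures."""
    ids = sorted(ifps)
    return {
        (p, q): tanimoto(ifps[p], ifps[q])
        for i, p in enumerate(ids)
        for q in ids[i + 1:]
    }
```

Because the retrieved fingerprints are gap-corrected to a common length, the bitwise comparison in `tanimoto` is meaningful across structures of different kinases.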
KLIFS Ligands Nodes
Ligands Overview Retriever: Retrieval of ligand IDs, three-letter PDB-codes, names, molecular structures (SMILES), and InChIKeys for all ligands from (a specific set of) kinase-ligand complexes present within KLIFS.
KLIFS Structures Nodes
Structures Overview Retriever: Retrieves a list of all corresponding structures within KLIFS based on a user-supplied set of KLIFS kinase or ligand IDs (e.g., from a specific kinase family). The node returns the structure ID, kinase name, kinase ID, PDB-code, and all other structural annotation data within KLIFS (e.g., pocket sequence, resolution, quality, ligands, DFG conformation, targeted subpockets, waters). (29)
Structures PDB Mapper: Maps a set of PDB-codes to structure IDs from KLIFS and provides all related structural information from KLIFS.
Structures Retriever (MOL2): Retrieves a set of structures from KLIFS in MOL2 format (optionally the full complex, the protein, the pocket, or the ligand), based on a user-supplied set of structure IDs. As output, the node provides a table of structures aligned on the basis of the KLIFS pocket definition.
KRIPOdb and KRIPO Nodes
Similar Fragments: Retrieval of ligand fragments that share a similar subpocket with the query fragment, based on a specified similarity matrix (local HDF5 file or web service URL), similarity threshold, and maximum number of fragment hits.
Fragment Information: Retrieval of the chemical structures of the fragment, the full ligand, and the associated PDB based on the fragment identifier.
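The selection logic of the Similar Fragments node (similarity threshold plus a maximum number of hits) can be sketched as follows. An in-memory dictionary of pairwise scores stands in for the similarity matrix, which in the node comes from a local HDF5 file or a web service; the function name, argument names, and fragment identifiers are hypothetical.

```python
from typing import Dict, List, Tuple


def similar_fragments(
    similarities: Dict[Tuple[str, str], float],
    query: str,
    cutoff: float,
    limit: int,
) -> List[Tuple[str, float]]:
    """Return up to `limit` fragments with similarity >= cutoff to `query`.

    similarities: pairwise scores keyed by (fragment_a, fragment_b); each
    pair is stored once, so both key orders are checked.
    Hits are sorted by decreasing score, then by fragment identifier.
    """
    hits = []
    for (frag_a, frag_b), score in similarities.items():
        if score < cutoff:
            continue
        if frag_a == query:
            hits.append((frag_b, score))
        elif frag_b == query:
            hits.append((frag_a, score))
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits[:limit]
```

The identifiers returned by such a query would then be passed to the Fragment Information node to look up the fragment structure, the full ligand, and the associated PDB entry.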
Figure 2. KRIPO binding site similarity based bioisosteric replacement and SyGMa metabolite prediction workflows. Ligands in KRIPOdb that share a chemical (sub)structure with a specified molecule (doxepin in the example) are identified and defined as query fragment(s). Ligand (fragment) binding site hits that share pharmacophore fingerprint similarity with the binding site(s) associated with the query fragment(s) (e.g., the doxepin binding site of the histamine H1 receptor) are identified and ranked according to Tanimoto similarity score. The occurrence of protein targets in the top hit list is analyzed. The pharmacophore overlay underlying the similarity value of an example hit is shown (histamine methyltransferase, PDB ID: 2aot; available in the VM as a PyMOL session). The full workflow is provided in the Supporting Information (Figure S4). In the SyGMa workflow, SMILES strings of clozapine and dasatinib are converted into RDKit molecules for the prediction of metabolites using the SyGMa Metabolites node, and the predictions are filtered based on a SyGMa_score threshold of 0.1. The two tables are subsections of the resulting table, showing the top ranked metabolites of clozapine and dasatinib, consistent with experimental metabolism data. (51, 52) Meta nodes are indicated with a star (*).
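The score filtering step of the SyGMa workflow can be pictured with a small sketch that works on an already computed table of predictions. It does not call SyGMa or RDKit; the record type and function name are hypothetical, and treating the 0.1 threshold as inclusive is an assumption.

```python
from typing import Iterable, List, NamedTuple


class Metabolite(NamedTuple):
    parent: str         # name of the parent compound
    metabolite: str     # identifier or SMILES of the predicted metabolite
    sygma_score: float  # score assigned to the prediction


def filter_and_rank(
    rows: Iterable[Metabolite], min_score: float = 0.1
) -> List[Metabolite]:
    """Keep predictions with score >= min_score, ranked per parent.

    The result is grouped by parent name and, within each parent,
    ordered from the highest to the lowest score.
    """
    kept = [row for row in rows if row.sygma_score >= min_score]
    return sorted(kept, key=lambda row: (row.parent, -row.sygma_score))
```

Applied to a table with one row per predicted metabolite, this yields per-parent ranked lists of the kind shown in the two table excerpts of the figure.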
SyGMa Node
3D-e-Chem Workflow Application Example 1: Kinase Interaction Pattern Analysis
3D-e-Chem Workflow Application Example 2: GPCR-Kinase Cross-Reactivity Prediction
Figure 3. Schematic diagram of possible interactions of the 3D-e-Chem-VM virtual machine elements: KLIFS and GPCRdb web service connector nodes, KRIPOdb, KRIPO, and SyGMa nodes, and the Chemdb4VS workflow (full workflow presented in the Supporting Information, Figure S6) integrated in a GPCR-kinase cross-reactivity prediction workflow.
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00686.
Figures presenting the full versions of the GPCRdb, KLIFS, KRIPO, SyGMa, Chemdb4VS, and GPCR-kinase cross-reactivity prediction example KNIME workflows (PDF)
Acknowledgment
We thank Vignir Isberg, Christian Munk, and David Gloriam from the University of Copenhagen for useful discussions on the development of the GPCRdb KNIME nodes.
References
This article references 60 other publications.
- 1Hu, Y.; Bajorath, J. Learning from ’big data’: compounds and targets Drug Discovery Today 2014, 19, 357– 60 DOI: 10.1016/j.drudis.2014.02.004Google ScholarThere is no corresponding record for this reference.
- 2Lusher, S. J.; McGuire, R.; van Schaik, R. C.; Nicholson, C. D.; de Vlieg, J. Data-driven medicinal chemistry in the era of big data Drug Discovery Today 2014, 19, 859– 68 DOI: 10.1016/j.drudis.2013.12.004Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXivVKhtw%253D%253D&md5=9de50fcde3985e05470544da2261d35cData-driven medicinal chemistry in the era of big dataLusher, Scott J.; McGuire, Ross; van Schaik, Rene C.; Nicholson, C. David; de Vlieg, JacobDrug Discovery Today (2014), 19 (7), 859-868CODEN: DDTOFS; ISSN:1359-6446. (Elsevier Ltd.)Science, and the way we undertake research, is changing. The increasing rate of data generation across all scientific disciplines is providing incredible opportunities for data-driven research, with the potential to transform our current practices. The exploitation of so-called 'big data' will enable us to undertake research projects never previously possible but should also stimulate a re-evaluation of all our data practices. Data-driven medicinal chem. approaches have the potential to improve decision making in drug discovery projects, providing that all researchers embrace the role of 'data scientist' and uncover the meaningful relationships and patterns in available data.
- 3RDKit. http://www.rdkit.org.Google ScholarThere is no corresponding record for this reference.
- 4Steinbeck, C. C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. The Chemistry Development Kit J. Chem. Inf. Comput. Sci. 2003, 43, 493– 500 DOI: 10.1021/ci025584yGoogle Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXhtVaktbg%253D&md5=afc8fd10783af301c73a8183727230bfThe Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and BioinformaticsSteinbeck, Christoph; Han, Yongquan; Kuhn, Stefan; Horlacher, Oliver; Luttmann, Edgar; Willighagen, EgonJournal of Chemical Information and Computer Sciences (2003), 43 (2), 493-500CODEN: JCISD8; ISSN:0095-2338. (American Chemical Society)The Chem. Development Kit (CDK) is a freely available open-source Java library for Structural Chemo- and Bioinformatics. Its architecture and capabilities as well as the development as an open-source project by a team of international collaborators from academic and industrial institutions is described. The CDK provides methods for many common tasks in mol. informatics, including 2D and 3D rendering of chem. structures, I/O routines, SMILES parsing and generation, ring searches, isomorphism checking, structure diagram generation, etc. Application scenarios as well as access information for interested users and potential contributors are given.
- 5Jmol. http://jmol.sourceforge.net/.Google ScholarThere is no corresponding record for this reference.
- 6Pymol. https://www.pymol.org/.Google ScholarThere is no corresponding record for this reference.
- 7ChemAxon. https://www.chemaxon.com/.Google ScholarThere is no corresponding record for this reference.
- 8Indigo. http://lifescience.opensource.epam.com/indigo/.Google ScholarThere is no corresponding record for this reference.
- 9O’Boyle, N.; Banck, M.; James, C.; Morley, C.; Vandermeersch, T.; Hutchison, G. Open babel: an open chemical toolbox J. Cheminf. 2011, 3, 33 DOI: 10.1186/1758-2946-3-33Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXhsVWjurbF&md5=74e4f19b7f87417f916d57f7abcfb761Open Babel: an open chemical toolboxO'Boyle, Noel M.; Banck, Michael; James, Craig A.; Morley, Chris; Vandermeersch, Tim; Hutchison, Geoffrey R.Journal of Cheminformatics (2011), 3 (), 33CODEN: JCOHB3; ISSN:1758-2946. (Chemistry Central Ltd.)Background: A frequent problem in computational modeling is the interconversion of chem. structures between different formats. While std. interchange formats exist (for example, Chem. Markup Language) and de facto stds. have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chem. data, differences in the data stored by different formats (0D vs. 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chem. toolbox that speaks the many languages of chem. data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chem. and mol. data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a soln. to the proliferation of multiple chem. file formats. 
In addn., it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chem. data in areas such as org. chem., drug design, materials science, and computational chem. It is freely available under an open-source license.
- 10Beisken, S.; Meinl, T.; Wiswedel, B.; de Figueiredo, L. F.; Berthold, M.; Steinbeck, C. KNIME-CDK: Workflow-driven cheminformatics BMC Bioinf. 2013, 14, 257 DOI: 10.1186/1471-2105-14-257Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXlsVCktb8%253D&md5=d6a769e0c88b6d0d79bed69b1ea210caKNIME-CDK: workflow-driven cheminformaticsBeisken, Stephan; Meinl, Thorsten; Wiswedel, Bernd; de Figueiredo, Luis F.; Berthold, Michael; Steinbeck, ChristophBMC Bioinformatics (2013), 14 (), 257/1-257/4, 4 pp.CODEN: BBMIC4; ISSN:1471-2105. (BioMed Central Ltd.)A review. Background: Cheminformaticians have to routinely process and analyze libraries of small mols. Among other things, that includes the standardization of mols., calcn. of various descriptors, visualisation of mol. structures, and downstream anal. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the anal. process. Results: KNIME-CDK comprises functions for mol. conversion to/from common formats, generation of signatures, fingerprints, and mol. properties. It is based on the Chem. Development Toolkit and uses the Chem. Markup Language for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chem. classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small mols. of biol. interest. Conclusions: KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chem. 
Development Toolkit and allows for efficient cross-vendor structural cheminformatics. Its ease-of-use and modularity enables researchers to automate routine tasks and data anal., bringing complimentary cheminformatics functionality to the workflow environment.
- 11Murrell, D. S.; Cortes-Ciriano, I.; van Westen, G. J.; Stott, I. P.; Bender, A.; Malliavin, T. E.; Glen, R. C. Chemically Aware Model Builder (camb): an R package for property and bioactivity modelling of small molecules J. Cheminf. 2015, 7, 45 DOI: 10.1186/s13321-015-0086-2Google Scholar11https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XotlWgsLw%253D&md5=69d17c8379c1268eb9eba6ee6f24d4fbChemically Aware Model Builder (camb): an R package for property and bioactivity modelling of small moleculesMurrell, Daniel S.; Cortes-Ciriano, Isidro; van Westen, Gerard J. P.; Stott, Ian P.; Bender, Andreas; Malliavin, Therese E.; Glen, Robert C.Journal of Cheminformatics (2015), 7 (), 45/1-45/10CODEN: JCOHB3; ISSN:1758-2946. (Chemistry Central Ltd.)Background: In silico predictive models have proved to be valuable for the optimization of compd. potency, selectivity and safety profiles in the drug discovery process. Results:camb is an R package that provides an environment for the rapid generation of quant. Structure-Property and Structure-Activity models for small mols. (including QSAR, QSPR, QSAM, PCM) and is aimed at both advanced and beginner R users. camb's capabilities include the standardisation of chem. structure representation, computation of 905 one-dimensional and 14 fingerprint type descriptors for small mols., 8 types of amino acid descriptors, 13 whole protein sequence descriptors, filtering methods for feature selection, generation of predictive models (using an interface to the R package caret), as well as techniques to create model ensembles using techniques from the R package caretEnsemble. Results can be visualised through high-quality, customisable plots (R package ggplot2). Conclusions: Overall, camb constitutes an open-source framework to perform the following steps: (1) compd. standardisation, (2) mol. 
and protein descriptor calcn., (3) descriptor pre-processing and model training, visualisation and validation, and (4) bioactivity/property prediction for new mols. camb aims to speed model generation, in order to provide reproducibility and tests of robustness. QSPR and proteochemometric case studies are included which demonstrate camb's application.
- 12Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. Datawarrior: An Open-Source Program for Chemistry Aware Data Visualization and Analysis J. Chem. Inf. Model. 2015, 55, 460– 473 DOI: 10.1021/ci500588jGoogle Scholar12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXktFWnuw%253D%253D&md5=5c849901b5cb4549d870d81f5eeaca0aDataWarrior: An Open-Source Program For Chemistry Aware Data Visualization And AnalysisSander, Thomas; Freyss, Joel; von Korff, Modest; Rufener, ChristianJournal of Chemical Information and Modeling (2015), 55 (2), 460-473CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Drug discovery projects in the pharmaceutical industry accumulate thousands of chem. structures and ten-thousands of data points from a dozen or more biol. and pharmacol. assays. A sufficient interpretation of the data requires understanding which mol. families are present, which structural motifs correlate with measured properties, and which tiny structural changes cause large property changes. Data visualization and anal. software with sufficient chem. intelligence to support chemists in this task is rare. In an attempt to contribute to filling the gap, we released our inhouse developed chem. aware data anal. program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. Exemplarily, a new unsupervised, 2-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chem. or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chem. space, activity landscapes, and activity cliffs.
- 13R Core Team. R: A language and environment for statistical computing; R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/.Google ScholarThere is no corresponding record for this reference.
- 14Python. http://www.python.org.Google ScholarThere is no corresponding record for this reference.
- 15Java. https://www.oracle.com/java/index.html.Google ScholarThere is no corresponding record for this reference.
- 16Berthold, M. R.; Cebron, N.; Dill, F.; Gabriel, T. R.; Kötter, T.; Meinl, T.; Ohl, P.; Sieb, C.; Thiel, K.; Wiswedel, B. KNIME: The Konstanz Information Miner. In Data Analysis, Machine Learning and Applications; Springer Berlin Heidelberg, 2007; pp 319– 326.Google ScholarThere is no corresponding record for this reference.
- 17Mazanetz, M. P.; Marmon, R. J.; Reisser, C. B.; Morao, I. Drug Discovery Applications for KNIME: An Open Source Data Mining Platform Curr. Top. Med. Chem. 2012, 12, 1965– 1979 DOI: 10.2174/156802612804910331Google Scholar17https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXjt1Whtr4%253D&md5=772876af38009c19b1ae8736ed35dcdaDrug discovery applications for KNIME: an open source data mining platformMazanetz, Michael P.; Marmon, Robert J.; Reisser, Catherine B. T.; Morao, InakiCurrent Topics in Medicinal Chemistry (Sharjah, United Arab Emirates) (2012), 12 (18), 1965-1979CODEN: CTMCCL; ISSN:1568-0266. (Bentham Science Publishers Ltd.)A review. Technol. advances in high-throughput screening methods, combinatorial chem. and the design of virtual libraries have evolved in the pursuit of challenging drug targets. Over the last two decades a vast amt. of data has been generated within these fields and as a consequence data mining methods have been developed to ext. key pieces of information from these large data pools. Much of this data is now available in the public domain. This has been helpful in the arena of drug discovery for both academic groups and for small to medium sized enterprises which previously would not have had access to such data resources. Com. data mining software is sometimes prohibitively expensive and the alternate open source data mining software is gaining momentum in both academia and in industrial applications as the costs of research and development continue to rise. KNIME, the Konstanz Information Miner, has emerged as a leader in open source data mining tools. KNIME provides an integrated soln. for the data mining requirements across the drug discovery pipeline through a visual assembly of data workflows drawing from an extensive repository of tools. This review will examine KNIME as an open source data mining tool and its applications in drug discovery.
- 18KNIME Cheminformatics Extensions. https://tech.knime.org/cheminformatics-extensions.Google ScholarThere is no corresponding record for this reference.
- 19Bento, A. P.; Gaulton, A.; Hersey, A.; Bellis, L. J.; Chambers, J.; Davies, M.; Krüger, F. A.; Light, Y.; Mak, L.; McGlinchey, S.; Nowotka, M.; Papadatos, G.; Santos, R.; Overington, J. P. The ChEMBL Bioactivity Database: An Update Nucleic Acids Res. 2014, 42, D1083– 1090 DOI: 10.1093/nar/gkt1031Google Scholar19https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXoslWl&md5=31b832d03d56ea3065d7aa29618362bcThe ChEMBL bioactivity database: an updateBento, A. Patricia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J.; Chambers, Jon; Davies, Mark; Krueger, Felix A.; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P.Nucleic Acids Research (2014), 42 (D1), D1083-D1090CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compds. from research stages through clin. development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a no. of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addn. to the web-based interface, data downloads and web services.
- 20Kim, S.; Thiessen, P. A.; Bolton, E. E.; Chen, J.; Fu, G.; Gindulyte, A.; Han, L.; He, J.; He, S.; Shoemaker, B. A.; Wang, J.; Yu, B.; Zhang, J.; Bryant, S. H. PubChem Substance and Compound databases Nucleic Acids Res. 2016, 44, D1202– 1213 DOI: 10.1093/nar/gkv951Google Scholar20https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2gu7bE&md5=1ba53f15667506b761d05f0f02313892PubChem substance and compound databasesKim, Sunghwan; Thiessen, Paul A.; Bolton, Evan E.; Chen, Jie; Fu, Gang; Gindulyte, Asta; Han, Lianyi; He, Jane; He, Siqian; Shoemaker, Benjamin A.; Wang, Jiyao; Yu, Bo; Zhang, Jian; Bryant, Stephen H.Nucleic Acids Research (2016), 44 (D1), D1202-D1213CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public repository for information on chem. substances and their biol. activities, launched in 2004 as a component of the Mol. Libraries Roadmap Initiatives of the US National Institutes of Health (NIH). For the past 11 years, PubChem has grown to a sizable system, serving as a chem. information resource for the scientific research community. PubChem consists of three inter-linked databases, Substance, Compd. and BioAssay. The Substance database contains chem. information deposited by individual data contributors to PubChem, and the Compd. database stores unique chem. structures extd. from the Substance database. Biol. activity data of chem. substances tested in assay expts. are contained in the BioAssay database. This paper provides an overview of the PubChem Substance and Compd. databases, including data sources and contents, data organization, data submission using PubChem Upload, chem. structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access. It also gives a brief description of PubChem3D, a resource derived from theor. three-dimensional structures of compds. 
in PubChem, as well as PubChemRDF, Resource Description Framework (RDF)-formatted PubChem data for data sharing, anal. and integration with information contained in other databases.
- 21Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities Nucleic Acids Res. 2007, 35, D198– D201 DOI: 10.1093/nar/gkl999Google Scholar21https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2sXivFKktg%253D%253D&md5=0ccb20d9b9178a624d4829b5909e7ff8BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinitiesLiu, Tiqing; Lin, Yuhmei; Wen, Xin; Jorissen, Robert N.; Gilson, Michael K.Nucleic Acids Research (2007), 35 (Database Iss), D198-D201CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)BindingDB is a publicly accessible database currently contg. ∼20 000 exptl. detd. binding affinities of protein-ligand complexes, for 110 protein targets including isoforms and mutational variants, and ∼11 000 small mol. ligands. The data are extd. from the scientific literature, data collection focusing on proteins that are drug-targets or candidate drug-targets and for which structural data are present in the Protein Data Bank. The BindingDB website supports a range of query types, including searches by chem. structure, substructure and similarity; protein sequence; ligand and protein names; affinity ranges and mol. wt. Data sets generated by BindingDB queries can be downloaded in the form of annotated SDfiles for further anal., or used as the basis for virtual screening of a compd. database uploaded by the user. The data in BindingDB are linked both to structural data in the PDB via PDB IDs and chem. and sequence searches, and to the literature in PubMed via PubMed IDs.
- 22Berman, H. M.; W, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank Nucleic Acids Res. 2000, 28, 235– 242 DOI: 10.1093/nar/28.1.235Google Scholar22https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3cXhvVKjt7w%253D&md5=227fb393f754be2be375ab727bfd05dcThe Protein Data BankBerman, Helen M.; Westbrook, John; Feng, Zukang; Gilliland, Gary; Bhat, T. N.; Weissig, Helge; Shindyalov, Ilya N.; Bourne, Philip E.Nucleic Acids Research (2000), 28 (1), 235-242CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The Protein Data Bank (PDB; http://www.rcsb.org/pdb/)is the single worldwide archive of structural data of biol. macromols. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.
- 23. Papadatos, G.; van Westen, G. J.; Croset, S.; Santos, R.; Trubian, S.; Overington, J. P. A document classifier for medicinal chemistry publications trained on the ChEMBL corpus. J. Cheminf. 2014, 6, 40. DOI: 10.1186/s13321-014-0040-8
- 24. Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; Mons, B. Open PHACTS: semantic interoperability for drug discovery. Drug Discovery Today 2012, 17, 1188–1198. DOI: 10.1016/j.drudis.2012.05.016
- 25. Stierand, K.; Harder, T.; Marek, T.; Hilbig, M.; Lemmen, C.; Rarey, M. The Internet as Scientific Knowledge Base: Navigating the Chem-Bio Space. Mol. Inf. 2012, 31, 543–546. DOI: 10.1002/minf.201200037
- 26. Carrascosa, M. C.; Massaguer, O. L.; Mestres, J. PharmaTrek: A Semantic Web Explorer for Open Innovation in Multitarget Drug Discovery. Mol. Inf. 2012, 31, 537–541. DOI: 10.1002/minf.201200070
- 27. Isberg, V.; Mordalski, S.; Munk, C.; Rataj, K.; Harpsøe, K.; Hauser, A. S.; Vroling, B.; Bojarski, A. J.; Vriend, G.; Gloriam, D. E. GPCRDB: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016, 44, D356–D364. DOI: 10.1093/nar/gkv1178
- 28. van Linden, O. P.; Kooistra, A. J.; Leurs, R.; de Esch, I. J.; de Graaf, C. KLIFS: a knowledge-based structural database to navigate kinase–ligand interaction space. J. Med. Chem. 2014, 57, 249–277. DOI: 10.1021/jm400378w
- 29. Kooistra, A. J.; Kanev, G. K.; van Linden, O. P.; Leurs, R.; de Esch, I. J.; de Graaf, C. KLIFS: a structural kinase-ligand interaction database. Nucleic Acids Res. 2016, 44, D365–371. DOI: 10.1093/nar/gkv1082
- 30. Wood, D. J.; de Vlieg, J.; Wagener, M.; Ritschel, T. Pharmacophore fingerprint-based approach to binding site subpocket similarity and its application to bioisostere replacement. J. Chem. Inf. Model. 2012, 52, 2031–2043. DOI: 10.1021/ci3000776
- 31. Ridder, L.; Wagener, M. SyGMa: combining expert knowledge and empirical scoring in the prediction of metabolites. ChemMedChem 2008, 3, 821–32. DOI: 10.1002/cmdc.200700312
- 32. PostgreSQL. https://www.postgresql.org/
- 33. Ochoa, R.; Davies, M.; Papadatos, G.; Atkinson, F.; Overington, J. P. myChEMBL: a virtual machine implementation of open data and cheminformatics tools. Bioinformatics 2014, 30, 298–300. DOI: 10.1093/bioinformatics/btt666
- 34. https://www.vagrantup.com/
- 35. https://atlas.hashicorp.com/boxes/search
- 36. https://www.packer.io/
- 37. https://www.virtualbox.org/
- 38. http://www.ansible.com
- 39. Travis-CI. https://travis-ci.org/
- 40. http://www.eclipse.org/tycho/
- 41. KNIME Developer Guide. https://tech.knime.org/developer-guide
- 42. Le Guilloux, V.; Schmidtke, P.; Tuffery, P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinf. 2009, 10, 168. DOI: 10.1186/1471-2105-10-168
- 43. OPS-KNIME. https://github.com/openphacts/OPS-Knime
- 44. Kooistra, A. J.; Kuhne, S.; de Esch, I. J.; Leurs, R.; de Graaf, C. A structural chemogenomics analysis of aminergic GPCRs: lessons for histamine receptor ligand design. Br. J. Pharmacol. 2013, 170, 101–26. DOI: 10.1111/bph.12248
- 45. Vass, M.; Kooistra, A. J.; Ritschel, T.; Leurs, R.; de Esch, I. J.; de Graaf, C. Molecular interaction fingerprint approaches for GPCR drug discovery. Curr. Opin. Pharmacol. 2016, 30, 59–68. DOI: 10.1016/j.coph.2016.07.007
- 46. http://swagger.io/swagger-codegen
- 47. Isberg, V.; de Graaf, C.; Bortolato, A.; Cherezov, V.; Katritch, V.; Marshall, F. H.; Mordalski, S.; Pin, J. P.; Stevens, R. C.; Vriend, G.; Gloriam, D. E. Generic GPCR residue numbers - aligning topology maps while minding the gaps. Trends Pharmacol. Sci. 2015, 36, 22–31. DOI: 10.1016/j.tips.2014.11.001
- 48. Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. The protein kinase complement of the human genome. Science 2002, 298, 1912–1934. DOI: 10.1126/science.1075762
- 49. Marcou, G.; Rognan, D. Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. J. Chem. Inf. Model. 2007, 47, 195–207. DOI: 10.1021/ci600342e
- 50. Fligner, M. A.; Verducci, J. S.; Blower, P. E. A modification of the Jaccard–Tanimoto similarity index for diverse selection of chemical compounds using binary strings. Technometrics 2002, 44, 110–119. DOI: 10.1198/004017002317375064
- 51. Nijmeijer, S.; Vischer, H. F.; Rudebeck, A. F.; Fleurbaaij, F.; Falck, D.; Leurs, R.; Niessen, W. M.; Kool, J. Development of a profiling strategy for metabolic mixtures by combining chromatography and mass spectrometry with cell-based GPCR signaling. J. Biomol. Screening 2012, 17, 1329–38. DOI: 10.1177/1087057112451922
- 52. Wang, L.; Christopher, L. J.; Cui, D.; Li, W.; Iyer, R.; Humphreys, W. G.; Zhang, D. Identification of the human enzymes involved in the oxidative metabolism of dasatinib: an effective approach for determining metabolite formation kinetics. Drug Metab. Dispos. 2008, 36, 1828–39. DOI: 10.1124/dmd.107.020255
- 53. Rogers, D.; Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 2010, 50, 742–54. DOI: 10.1021/ci100050t
- 54. Kooistra, A. J.; Vischer, H. F.; McNaught-Flores, D.; Leurs, R.; de Esch, I. J.; de Graaf, C. Function-specific virtual screening for GPCR ligands using a combined scoring method. Sci. Rep. 2016, 6, 28288. DOI: 10.1038/srep28288
- 55Astolfi, A.; Iraci, N.; Manfroni, G.; Barreca, M. L.; Cecchetti, V. A Comprehensive Structural Overview of p38alpha MAPK in Complex with Type I Inhibitors ChemMedChem 2015, 10, 957– 69 DOI: 10.1002/cmdc.201500030Google ScholarThere is no corresponding record for this reference.
- 56Lin, X.; Huang, X. P.; Chen, G.; Whaley, R.; Peng, S.; Wang, Y.; Zhang, G.; Wang, S. X.; Wang, S.; Roth, B. L.; Huang, N. Life beyond kinases: structure-based discovery of sorafenib as nanomolar antagonist of 5-HT receptors J. Med. Chem. 2012, 55, 5749– 59 DOI: 10.1021/jm300338mGoogle Scholar56https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XosV2qsrg%253D&md5=52497e84e7aa648ac226a166cb07f0e3Life Beyond Kinases: Structure-Based Discovery of Sorafenib as Nanomolar Antagonist of 5-HT ReceptorsLin, Xingyu; Huang, Xi-Ping; Chen, Gang; Whaley, Ryan; Peng, Shiming; Wang, Yanli; Zhang, Guoliang; Wang, Simon X.; Wang, Shaohui; Roth, Bryan L.; Huang, NiuJournal of Medicinal Chemistry (2012), 55 (12), 5749-5759CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Of great interest in recent years has been computationally predicting the novel polypharmacol. of drug mols. Here, we applied an "induced-fit" protocol to improve the homol. models of 5-HT2A receptor, and we assessed the quality of these models in retrospective virtual screening. Subsequently, we computationally screened the FDA approved drug mols. against the best induced-fit 5-HT2A models and chose six top scoring hits for exptl. assays. Surprisingly, one well-known kinase inhibitor, sorafenib, has shown unexpected promiscuous 5-HTRs binding affinities, Ki = 1959, 56, and 417 nM against 5-HT2A, 5-HT2B, and 5-HT2C, resp. Our preliminary SAR exploration supports the predicted binding mode and further suggests sorafenib to be a novel lead compd. for 5HTR ligand discovery. Although it has been well-known that sorafenib produces anticancer effects through targeting multiple kinases, carefully designed exptl. studies are desirable to fully understand whether its "off-target" 5-HTR binding activities contribute to its therapeutic efficacy or otherwise undesirable side effects.
- 57
DRUGMATRIX: Adenosine A2A radioligand binding assay (ligand: AB-MECA) CHEMBL1909214.
There is no corresponding record for this reference. - 58Dombroski, M. A.; Letavic, M. A.; McClure, K. F.; Barberia, J. T.; Carty, T. J.; Cortina, S. R.; Csiki, C.; Dipesa, A. J.; Elliott, N. C.; Gabel, C. A.; Jordan, C. K.; Labasi, J. M.; Martin, W. H.; Peese, K. M.; Stock, I. A.; Svensson, L.; Sweeney, F. J.; Yu, C. H. Benzimidazolone p38 inhibitors Bioorg. Med. Chem. Lett. 2004, 14, 919– 23 DOI: 10.1016/j.bmcl.2003.12.023Google ScholarThere is no corresponding record for this reference.
- 59Yang, B.; Hird, A. W.; Russell, D. J.; Fauber, B. P.; Dakin, L. A.; Zheng, X.; Su, Q.; Godin, R.; Brassil, P.; Devereaux, E.; Janetka, J. W. Discovery of novel hedgehog antagonists from cell-based screening: Isosteric modification of p38 bisamides as potent inhibitors of SMO Bioorg. Med. Chem. Lett. 2012, 22, 4907– 11 DOI: 10.1016/j.bmcl.2012.04.104Google Scholar59https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XoslSltbY%253D&md5=75b328a6a6fce78a5e9c84c60c33e180Discovery of novel hedgehog antagonists from cell-based screening: Isosteric modification of p38 bisamides as potent inhibitors of SMOYang, Bin; Hird, Alexander W.; Russell, Daniel John; Fauber, Benjamin P.; Dakin, Les A.; Zheng, Xiaolan; Su, Qibin; Godin, Robert; Brassil, Patrick; Devereaux, Erik; Janetka, James W.Bioorganic & Medicinal Chemistry Letters (2012), 22 (14), 4907-4911CODEN: BMCLE8; ISSN:0960-894X. (Elsevier B.V.)Cell-based subset screening of compds. using a Gli transcription factor reporter cell assay and shh stimulated cell differentiation assay identified a series of bisamide compds. as hedgehog pathway inhibitors with good potency. Using a ligand-based optimization strategy, heteroaryl groups were utilized as conformationally restricted amide isosteres replacing one of the amides which significantly increased their potency against SMO and the hedgehog pathway while decreasing activity against p38α kinase. We report herein the identification of advanced lead compds. such as imidazole 11c and 11f encompassing good p38α selectivity, low nanomolar potency in both cell assays, excellent physiochem. properties and in vivo pharmacokinetics.
- 60Peters, J. U. Polypharmacology - foe or friend? J. Med. Chem. 2013, 56, 8955– 71 DOI: 10.1021/jm400856tGoogle Scholar60https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXht1amtrnO&md5=f4aeb6efddd4bfdf4e94656303323cbaPolypharmacology - Foe or Friend?Peters, Jens-UweJournal of Medicinal Chemistry (2013), 56 (22), 8955-8971CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)A review. Polypharmacol. describes the activity of compds. at multiple targets. Current research focuses on two aspects of polypharmacol.: (1) unintended polypharmacol. can lead to adverse effects; (2) polypharmacol. across several disease-relevant targets can improve therapeutic efficacy, prevent drug resistance, or reduce therapeutic-target-related adverse effects. This perspective reviews these interconnected aspects of polypharmacol. The first part discusses the relevance of polypharmacol. for the safety of drugs, the mitigation of safety risks, and methods to identify polypharmacol. compds. early in the drug discovery process. The second part discusses the advantages of polypharmacol. in the treatment of multigenic diseases and infections, and opportunities for drug discovery and drug repurposing. This perspective aims to provide a balanced view on polypharmacol., which can compromise the safety of drugs, but can also confer superior efficacy.
Cited By
This article is cited by 22 publications.
- Tom Dekker, Mathilde A. C. H. Janssen, Christina Sutherland, Rene W. M. Aben, Hans W. Scheeren, Daniel Blanco-Ania, Floris P. J. T. Rutjes, Maikel Wijtmans, Iwan J. P. de Esch. An Automated, Open-Source Workflow for the Generation of (3D) Fragment Libraries. ACS Medicinal Chemistry Letters 2023, 14 (5), 583-590. https://doi.org/10.1021/acsmedchemlett.2c00503
- Filip Miljković, Raquel Rodríguez-Pérez, Jürgen Bajorath. Machine Learning Models for Accurate Prediction of Kinase Inhibitors with Different Binding Modes. Journal of Medicinal Chemistry 2020, 63 (16), 8738-8748. https://doi.org/10.1021/acs.jmedchem.9b00867
- Dominique Sydow, Michele Wichmann, Jaime Rodríguez-Guerra, Daria Goldmann, Gregory Landrum, Andrea Volkamer. TeachOpenCADD-KNIME: A Teaching Platform for Computer-Aided Drug Design Using KNIME Workflows. Journal of Chemical Information and Modeling 2019, 59 (10), 4083-4086. https://doi.org/10.1021/acs.jcim.9b00662
- Márton Vass, Sabina Podlewska, Iwan J. P. de Esch, Andrzej J. Bojarski, Rob Leurs, Albert J. Kooistra, Chris de Graaf. Aminergic GPCR–Ligand Interactions: A Chemical and Structural Map of Receptor Mutation Data. Journal of Medicinal Chemistry 2019, 62 (8), 3784-3839. https://doi.org/10.1021/acs.jmedchem.8b00836
- Filip Miljković and Jürgen Bajorath. Exploring Selectivity of Multikinase Inhibitors across the Human Kinome. ACS Omega 2018, 3 (1), 1147-1153. https://doi.org/10.1021/acsomega.7b01960
- Charlotte A. Hoogstraten, Jan B. Koenderink, Carolijn E. van Straaten, Tom Scheer-Weijers, Jan A.M. Smeitink, Tom J.J. Schirris, Frans G.M. Russel. Pyruvate dehydrogenase is a potential mitochondrial off-target for gentamicin based on in silico predictions and in vitro inhibition studies. Toxicology in Vitro 2024, 95, 105740. https://doi.org/10.1016/j.tiv.2023.105740
- Tomoki Yonezawa, Tsuyoshi Esaki, Kazuyoshi Ikeda. Benchmark of 3D conformer generation and molecular property calculation for medium-sized molecules. Chem-Bio Informatics Journal 2022, 22 (0), 38-45. https://doi.org/10.1273/cbij.22.38
- Dominique Sydow, Jaime Rodríguez-Guerra, Andrea Volkamer. OpenCADD-KLIFS: A Python package to fetch kinase data from the KLIFS database. Journal of Open Source Software 2022, 7 (70), 3951. https://doi.org/10.21105/joss.03951
- Georgi K Kanev, Chris de Graaf, Bart A Westerman, Iwan J P de Esch, Albert J Kooistra. KLIFS: an overhaul after the first 5 years of supporting kinase research. Nucleic Acids Research 2021, 49 (D1), D562-D569. https://doi.org/10.1093/nar/gkaa895
- Nalini Schaduangrat, Samuel Lampa, Saw Simeon, Matthew Paul Gleeson, Ola Spjuth, Chanin Nantasenamat. Towards reproducible computational drug discovery. Journal of Cheminformatics 2020, 12 (1). https://doi.org/10.1186/s13321-020-0408-x
- Michael P. Mazanetz, Charlotte H.F. Goode, Ewa I. Chudyk. Ligand- and Structure-Based Drug Design and Optimization using KNIME. Current Medicinal Chemistry 2020, 27 (38), 6458-6479. https://doi.org/10.2174/0929867326666190409141016
- Antreas Afantitis, Andreas Tsoumanis, Georgia Melagraki. Enalos Suite of Tools: Enhancing Cheminformatics and Nanoinformatics through KNIME. Current Medicinal Chemistry 2020, 27 (38), 6523-6535. https://doi.org/10.2174/0929867327666200727114410
- Anuraj Nayarisseri. Experimental and Computational Approaches to Improve Binding Affinity in Chemical Biology and Drug Discovery. Current Topics in Medicinal Chemistry 2020, 20 (19), 1651-1660. https://doi.org/10.2174/156802662019200701164759
- Babs Briels, Chris de Graaf, Andreas Bender. Structural Chemogenomics. 2020, 53-77. https://doi.org/10.1002/9781118681121.ch3
- Magdalena Galster, Marius Löppenberg, Fabian Galla, Frederik Börgel, Oriana Agoglitta, Johannes Kirchmair, Ralph Holl. Phenylethylene glycol-derived LpxC inhibitors with diverse Zn2+-binding groups. Tetrahedron 2019, 75 (4), 486-509. https://doi.org/10.1016/j.tet.2018.12.011
- Christiane Ehrt, Tobias Brinkjost, Oliver Koch. A benchmark driven guide to binding site comparison: An exhaustive evaluation using tailor-made data sets (ProSPECCTs). PLOS Computational Biology 2018, 14 (11), e1006483. https://doi.org/10.1371/journal.pcbi.1006483
- Márton Vass, Albert J. Kooistra, Dehua Yang, Raymond C. Stevens, Ming-Wei Wang, Chris de Graaf. Chemical Diversity in the G Protein-Coupled Receptor Superfamily. Trends in Pharmacological Sciences 2018, 39 (5), 494-512. https://doi.org/10.1016/j.tips.2018.02.004
- Fleur M. Ferguson, Nathanael S. Gray. Kinase inhibitors: the road ahead. Nature Reviews Drug Discovery 2018, 17 (5), 353-377. https://doi.org/10.1038/nrd.2018.21
- Albert J. Kooistra, Márton Vass, Ross McGuire, Rob Leurs, Iwan J. P. de Esch, Gert Vriend, Stefan Verhoeven, Chris de Graaf. 3D‐e‐Chem: Structural Cheminformatics Workflows for Computer‐Aided Drug Discovery. ChemMedChem 2018, 13 (6), 614-626. https://doi.org/10.1002/cmdc.201700754
- Márton Vass, Albert J. Kooistra, Stefan Verhoeven, David Gloriam, Iwan J. P. de Esch, Chris de Graaf. A Structural Framework for GPCR Chemogenomics: What’s In a Residue Number? 2018, 73-113. https://doi.org/10.1007/978-1-4939-7465-8_4
- Albert J. Kooistra, Andrea Volkamer. Kinase-Centric Computational Drug Development. 2017, 197-236. https://doi.org/10.1016/bs.armc.2017.08.001
- Mariana González-Medina, J. Jesús Naveja, Norberto Sánchez-Cruz, José L. Medina-Franco. Open chemoinformatic resources to explore the structure, properties and chemical space of molecules. RSC Advances 2017, 7 (85), 54153-54163. https://doi.org/10.1039/C7RA11831G
Figure 1
Figure 1. KNIME workflows to exploit cheminformatics and bioinformatics information on GPCRs (GPCRdb nodes) and protein kinases (KLIFS nodes). In the GPCRdb workflow, KNIME nodes are used to extract and combine protein information, sequence, alternative numbering schemes, mutagenesis data, and experimental structures for a selected receptor from GPCRdb. The lower branch of the workflow returns all sequence identities and similarities of the TM domain for the selected receptors and can be used for further structural chemogenomics analyses (44) using, e.g., structural and structure-based sequence alignments of the ligand binding site residues of crystallized aminergic receptors (available in the VM as a PyMOL session). In the KLIFS workflow, KNIME nodes enable the integrated analysis of structural kinase–ligand interactions from all structures for a specific kinase in KLIFS (human MAPK in the example). Kinase–ligand complexes with a specific hydrogen bond interaction pattern between the ligand and residues in the hinge region of the kinase (stacked bar chart) are selected for an all-against-all comparison of their structural kinase–ligand interaction fingerprints (heat map). The ligands from the selected structures are compared, and the ligand pair that combines the lowest chemical similarity with a high interaction fingerprint similarity is retrieved from KLIFS for binding mode comparison. Meta nodes in the workflows in panels A and B are indicated with a star (*). The full workflows are provided in the Supporting Information, Figures S2 and S3.
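The pair selection described in the KLIFS workflow can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: it assumes that both the interaction fingerprints and the ligand fingerprints are compared with the Tanimoto coefficient, it represents each fingerprint as a set of on-bit indices, and the function names, the 0.8 cutoff for "high" interaction fingerprint similarity, and the toy data are all hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def pick_pair(ifps, ligand_fps, min_ifp_sim=0.8):
    """All-against-all comparison of structures.

    Among pairs whose interaction fingerprint (IFP) similarity is at least
    min_ifp_sim (hypothetical cutoff), return the pair with the lowest ligand
    (chemical) similarity as (id1, id2, ifp_sim, chem_sim), or None.
    """
    best = None
    for i, j in combinations(sorted(ifps), 2):
        ifp_sim = tanimoto(ifps[i], ifps[j])
        if ifp_sim < min_ifp_sim:
            continue  # binding modes too different to be of interest
        chem_sim = tanimoto(ligand_fps[i], ligand_fps[j])
        if best is None or chem_sim < best[3]:
            best = (i, j, ifp_sim, chem_sim)
    return best


# Hypothetical toy data: structure id -> set of on-bits
ifps = {"s1": {1, 2, 3, 4, 5}, "s2": {1, 2, 3, 4, 5}, "s3": {1, 2, 3, 4}, "s4": {7, 8}}
ligand_fps = {"s1": {10, 11, 12, 13}, "s2": {10, 11, 12, 14}, "s3": {20, 21, 13}, "s4": {10, 11}}

pair = pick_pair(ifps, ligand_fps)
print(pair[0], pair[1])
```

With these toy fingerprints, structures s2 and s3 share most interaction bits while their ligands share none, so they are the pair returned for binding mode comparison.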
Figure 2
Figure 2. KRIPO binding site similarity based bioisosteric replacement and SyGMa metabolite prediction workflows. Ligands in KRIPOdb that share a chemical (sub)structure with a specified molecule (doxepin in the example) are identified and defined as query fragment(s). Ligand (fragment) binding site hits that share pharmacophore fingerprint similarity with the binding site(s) associated with the query fragment(s) (e.g., the doxepin binding site of the histamine H1 receptor) are identified and ranked according to Tanimoto similarity score. The occurrence of protein targets in the top hit list is analyzed. The pharmacophore overlay underlying the similarity value is shown for an example hit (histamine methyltransferase, PDB ID: 2aot; available in the VM as a PyMOL session). The full workflow is provided in the Supporting Information (Figure S4). In the SyGMa workflow, SMILES strings of clozapine and dasatinib are converted into RDKit molecules for the prediction of metabolites using the SyGMa Metabolites node, and the predictions are filtered based on a SyGMa_score threshold of 0.1. The two tables are subsections of the resulting table, showing the top ranked metabolites of clozapine and dasatinib, consistent with experimental metabolism data. (51, 52) Meta nodes are indicated with a star (*).
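The score-based filtering step of the SyGMa workflow amounts to a threshold followed by a sort. The sketch below illustrates this on plain Python dictionaries; it does not call SyGMa or KNIME, the function name and record layout are hypothetical, the metabolite labels and scores are made-up placeholders rather than real predictions, and keeping scores equal to the 0.1 threshold is an assumption.

```python
def filter_and_rank(metabolites, threshold=0.1):
    """Keep predictions with score >= threshold, ranked from highest to lowest score."""
    kept = [m for m in metabolites if m["score"] >= threshold]
    return sorted(kept, key=lambda m: m["score"], reverse=True)


# Hypothetical placeholder predictions (not real SyGMa output)
predicted = [
    {"parent": "P1", "metabolite": "M1", "score": 0.05},
    {"parent": "P1", "metabolite": "M2", "score": 0.42},
    {"parent": "P1", "metabolite": "M3", "score": 0.10},
    {"parent": "P1", "metabolite": "M4", "score": 0.27},
]

ranked = filter_and_rank(predicted)
print([m["metabolite"] for m in ranked])
```

Raising the threshold shortens the table to the most probable metabolites; lowering it trades precision for coverage.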
Figure 3
Figure 3. Schematic diagram of possible interactions of the 3D-e-Chem-VM virtual machine elements: KLIFS and GPCRdb web service connector nodes, KRIPOdb, KRIPO, and SyGMa nodes, and the Chemdb4VS workflow (full workflow presented in the Supporting Information, Figure S6) integrated in a GPCR-kinase cross-reactivity prediction workflow.
References
This article references 60 other publications.
- 1Hu, Y.; Bajorath, J. Learning from ’big data’: compounds and targets Drug Discovery Today 2014, 19, 357– 60 DOI: 10.1016/j.drudis.2014.02.004There is no corresponding record for this reference.
- 2Lusher, S. J.; McGuire, R.; van Schaik, R. C.; Nicholson, C. D.; de Vlieg, J. Data-driven medicinal chemistry in the era of big data Drug Discovery Today 2014, 19, 859– 68 DOI: 10.1016/j.drudis.2013.12.0042https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXivVKhtw%253D%253D&md5=9de50fcde3985e05470544da2261d35cData-driven medicinal chemistry in the era of big dataLusher, Scott J.; McGuire, Ross; van Schaik, Rene C.; Nicholson, C. David; de Vlieg, JacobDrug Discovery Today (2014), 19 (7), 859-868CODEN: DDTOFS; ISSN:1359-6446. (Elsevier Ltd.)Science, and the way we undertake research, is changing. The increasing rate of data generation across all scientific disciplines is providing incredible opportunities for data-driven research, with the potential to transform our current practices. The exploitation of so-called 'big data' will enable us to undertake research projects never previously possible but should also stimulate a re-evaluation of all our data practices. Data-driven medicinal chem. approaches have the potential to improve decision making in drug discovery projects, providing that all researchers embrace the role of 'data scientist' and uncover the meaningful relationships and patterns in available data.
- 3RDKit. http://www.rdkit.org.There is no corresponding record for this reference.
- 4Steinbeck, C. C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. The Chemistry Development Kit J. Chem. Inf. Comput. Sci. 2003, 43, 493– 500 DOI: 10.1021/ci025584y4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXhtVaktbg%253D&md5=afc8fd10783af301c73a8183727230bfThe Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and BioinformaticsSteinbeck, Christoph; Han, Yongquan; Kuhn, Stefan; Horlacher, Oliver; Luttmann, Edgar; Willighagen, EgonJournal of Chemical Information and Computer Sciences (2003), 43 (2), 493-500CODEN: JCISD8; ISSN:0095-2338. (American Chemical Society)The Chem. Development Kit (CDK) is a freely available open-source Java library for Structural Chemo- and Bioinformatics. Its architecture and capabilities as well as the development as an open-source project by a team of international collaborators from academic and industrial institutions is described. The CDK provides methods for many common tasks in mol. informatics, including 2D and 3D rendering of chem. structures, I/O routines, SMILES parsing and generation, ring searches, isomorphism checking, structure diagram generation, etc. Application scenarios as well as access information for interested users and potential contributors are given.
- 5Jmol. http://jmol.sourceforge.net/.There is no corresponding record for this reference.
- 6Pymol. https://www.pymol.org/.There is no corresponding record for this reference.
- 7ChemAxon. https://www.chemaxon.com/.There is no corresponding record for this reference.
- 8Indigo. http://lifescience.opensource.epam.com/indigo/.There is no corresponding record for this reference.
- 9O’Boyle, N.; Banck, M.; James, C.; Morley, C.; Vandermeersch, T.; Hutchison, G. Open babel: an open chemical toolbox J. Cheminf. 2011, 3, 33 DOI: 10.1186/1758-2946-3-339https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXhsVWjurbF&md5=74e4f19b7f87417f916d57f7abcfb761Open Babel: an open chemical toolboxO'Boyle, Noel M.; Banck, Michael; James, Craig A.; Morley, Chris; Vandermeersch, Tim; Hutchison, Geoffrey R.Journal of Cheminformatics (2011), 3 (), 33CODEN: JCOHB3; ISSN:1758-2946. (Chemistry Central Ltd.)Background: A frequent problem in computational modeling is the interconversion of chem. structures between different formats. While std. interchange formats exist (for example, Chem. Markup Language) and de facto stds. have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chem. data, differences in the data stored by different formats (0D vs. 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chem. toolbox that speaks the many languages of chem. data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chem. and mol. data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a soln. to the proliferation of multiple chem. file formats. 
In addn., it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chem. data in areas such as org. chem., drug design, materials science, and computational chem. It is freely available under an open-source license.
- 10Beisken, S.; Meinl, T.; Wiswedel, B.; de Figueiredo, L. F.; Berthold, M.; Steinbeck, C. KNIME-CDK: Workflow-driven cheminformatics BMC Bioinf. 2013, 14, 257 DOI: 10.1186/1471-2105-14-25710https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXlsVCktb8%253D&md5=d6a769e0c88b6d0d79bed69b1ea210caKNIME-CDK: workflow-driven cheminformaticsBeisken, Stephan; Meinl, Thorsten; Wiswedel, Bernd; de Figueiredo, Luis F.; Berthold, Michael; Steinbeck, ChristophBMC Bioinformatics (2013), 14 (), 257/1-257/4, 4 pp.CODEN: BBMIC4; ISSN:1471-2105. (BioMed Central Ltd.)A review. Background: Cheminformaticians have to routinely process and analyze libraries of small mols. Among other things, that includes the standardization of mols., calcn. of various descriptors, visualisation of mol. structures, and downstream anal. For this purpose, scientific workflow platforms such as the Konstanz Information Miner can be used if provided with the right plug-in. A workflow-based cheminformatics tool provides the advantage of ease-of-use and interoperability between complementary cheminformatics packages within the same framework, hence facilitating the anal. process. Results: KNIME-CDK comprises functions for mol. conversion to/from common formats, generation of signatures, fingerprints, and mol. properties. It is based on the Chem. Development Toolkit and uses the Chem. Markup Language for persistence. A comparison with the cheminformatics plug-in RDKit shows that KNIME-CDK supports a similar range of chem. classes and adds new functionality to the framework. We describe the design and integration of the plug-in, and demonstrate the usage of the nodes on ChEBI, a library of small mols. of biol. interest. Conclusions: KNIME-CDK is an open-source plug-in for the Konstanz Information Miner, a free workflow platform. KNIME-CDK is built on top of the open-source Chem. 
Development Toolkit and allows for efficient cross-vendor structural cheminformatics. Its ease-of-use and modularity enables researchers to automate routine tasks and data anal., bringing complimentary cheminformatics functionality to the workflow environment.
- 11Murrell, D. S.; Cortes-Ciriano, I.; van Westen, G. J.; Stott, I. P.; Bender, A.; Malliavin, T. E.; Glen, R. C. Chemically Aware Model Builder (camb): an R package for property and bioactivity modelling of small molecules J. Cheminf. 2015, 7, 45 DOI: 10.1186/s13321-015-0086-211https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XotlWgsLw%253D&md5=69d17c8379c1268eb9eba6ee6f24d4fbChemically Aware Model Builder (camb): an R package for property and bioactivity modelling of small moleculesMurrell, Daniel S.; Cortes-Ciriano, Isidro; van Westen, Gerard J. P.; Stott, Ian P.; Bender, Andreas; Malliavin, Therese E.; Glen, Robert C.Journal of Cheminformatics (2015), 7 (), 45/1-45/10CODEN: JCOHB3; ISSN:1758-2946. (Chemistry Central Ltd.)Background: In silico predictive models have proved to be valuable for the optimization of compd. potency, selectivity and safety profiles in the drug discovery process. Results:camb is an R package that provides an environment for the rapid generation of quant. Structure-Property and Structure-Activity models for small mols. (including QSAR, QSPR, QSAM, PCM) and is aimed at both advanced and beginner R users. camb's capabilities include the standardisation of chem. structure representation, computation of 905 one-dimensional and 14 fingerprint type descriptors for small mols., 8 types of amino acid descriptors, 13 whole protein sequence descriptors, filtering methods for feature selection, generation of predictive models (using an interface to the R package caret), as well as techniques to create model ensembles using techniques from the R package caretEnsemble. Results can be visualised through high-quality, customisable plots (R package ggplot2). Conclusions: Overall, camb constitutes an open-source framework to perform the following steps: (1) compd. standardisation, (2) mol. 
and protein descriptor calcn., (3) descriptor pre-processing and model training, visualisation and validation, and (4) bioactivity/property prediction for new mols. camb aims to speed model generation, in order to provide reproducibility and tests of robustness. QSPR and proteochemometric case studies are included which demonstrate camb's application.
- 12Sander, T.; Freyss, J.; von Korff, M.; Rufener, C. Datawarrior: An Open-Source Program for Chemistry Aware Data Visualization and Analysis J. Chem. Inf. Model. 2015, 55, 460– 473 DOI: 10.1021/ci500588j12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXktFWnuw%253D%253D&md5=5c849901b5cb4549d870d81f5eeaca0aDataWarrior: An Open-Source Program For Chemistry Aware Data Visualization And AnalysisSander, Thomas; Freyss, Joel; von Korff, Modest; Rufener, ChristianJournal of Chemical Information and Modeling (2015), 55 (2), 460-473CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Drug discovery projects in the pharmaceutical industry accumulate thousands of chem. structures and ten-thousands of data points from a dozen or more biol. and pharmacol. assays. A sufficient interpretation of the data requires understanding which mol. families are present, which structural motifs correlate with measured properties, and which tiny structural changes cause large property changes. Data visualization and anal. software with sufficient chem. intelligence to support chemists in this task is rare. In an attempt to contribute to filling the gap, we released our inhouse developed chem. aware data anal. program DataWarrior for free public use. This paper gives an overview of DataWarrior's functionality and architecture. Exemplarily, a new unsupervised, 2-dimensional scaling algorithm is presented, which employs vector-based or nonvector-based descriptors to visualize the chem. or pharmacophore space of even large data sets. DataWarrior uses this method to interactively explore chem. space, activity landscapes, and activity cliffs.
- 13R Core Team. R: A language and environment for statistical computing; R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/.There is no corresponding record for this reference.
- 14Python. http://www.python.org.There is no corresponding record for this reference.
- 15Java. https://www.oracle.com/java/index.html.There is no corresponding record for this reference.
- 16Berthold, M. R.; Cebron, N.; Dill, F.; Gabriel, T. R.; Kötter, T.; Meinl, T.; Ohl, P.; Sieb, C.; Thiel, K.; Wiswedel, B. KNIME: The Konstanz Information Miner. In Data Analysis, Machine Learning and Applications; Springer Berlin Heidelberg, 2007; pp 319– 326.There is no corresponding record for this reference.
- 17Mazanetz, M. P.; Marmon, R. J.; Reisser, C. B.; Morao, I. Drug Discovery Applications for KNIME: An Open Source Data Mining Platform Curr. Top. Med. Chem. 2012, 12, 1965– 1979 DOI: 10.2174/15680261280491033117https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXjt1Whtr4%253D&md5=772876af38009c19b1ae8736ed35dcdaDrug discovery applications for KNIME: an open source data mining platformMazanetz, Michael P.; Marmon, Robert J.; Reisser, Catherine B. T.; Morao, InakiCurrent Topics in Medicinal Chemistry (Sharjah, United Arab Emirates) (2012), 12 (18), 1965-1979CODEN: CTMCCL; ISSN:1568-0266. (Bentham Science Publishers Ltd.)A review. Technol. advances in high-throughput screening methods, combinatorial chem. and the design of virtual libraries have evolved in the pursuit of challenging drug targets. Over the last two decades a vast amt. of data has been generated within these fields and as a consequence data mining methods have been developed to ext. key pieces of information from these large data pools. Much of this data is now available in the public domain. This has been helpful in the arena of drug discovery for both academic groups and for small to medium sized enterprises which previously would not have had access to such data resources. Com. data mining software is sometimes prohibitively expensive and the alternate open source data mining software is gaining momentum in both academia and in industrial applications as the costs of research and development continue to rise. KNIME, the Konstanz Information Miner, has emerged as a leader in open source data mining tools. KNIME provides an integrated soln. for the data mining requirements across the drug discovery pipeline through a visual assembly of data workflows drawing from an extensive repository of tools. This review will examine KNIME as an open source data mining tool and its applications in drug discovery.
- 18KNIME Cheminformatics Extensions. https://tech.knime.org/cheminformatics-extensions.There is no corresponding record for this reference.
- 19Bento, A. P.; Gaulton, A.; Hersey, A.; Bellis, L. J.; Chambers, J.; Davies, M.; Krüger, F. A.; Light, Y.; Mak, L.; McGlinchey, S.; Nowotka, M.; Papadatos, G.; Santos, R.; Overington, J. P. The ChEMBL Bioactivity Database: An Update Nucleic Acids Res. 2014, 42, D1083– 1090 DOI: 10.1093/nar/gkt103119https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXoslWl&md5=31b832d03d56ea3065d7aa29618362bcThe ChEMBL bioactivity database: an updateBento, A. Patricia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J.; Chambers, Jon; Davies, Mark; Krueger, Felix A.; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P.Nucleic Acids Research (2014), 42 (D1), D1083-D1090CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compds. from research stages through clin. development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a no. of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addn. to the web-based interface, data downloads and web services.
- 20. Kim, S.; Thiessen, P. A.; Bolton, E. E.; Chen, J.; Fu, G.; Gindulyte, A.; Han, L.; He, J.; He, S.; Shoemaker, B. A.; Wang, J.; Yu, B.; Zhang, J.; Bryant, S. H. PubChem Substance and Compound databases. Nucleic Acids Res. 2016, 44, D1202–D1213. DOI: 10.1093/nar/gkv951
- 21. Liu, T.; Lin, Y.; Wen, X.; Jorissen, R. N.; Gilson, M. K. BindingDB: a web-accessible database of experimentally determined protein–ligand binding affinities. Nucleic Acids Res. 2007, 35, D198–D201. DOI: 10.1093/nar/gkl999
- 22. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242. DOI: 10.1093/nar/28.1.235
- 23. Papadatos, G.; van Westen, G. J.; Croset, S.; Santos, R.; Trubian, S.; Overington, J. P. A document classifier for medicinal chemistry publications trained on the ChEMBL corpus. J. Cheminf. 2014, 6, 40. DOI: 10.1186/s13321-014-0040-8
- 24. Williams, A. J.; Harland, L.; Groth, P.; Pettifer, S.; Chichester, C.; Willighagen, E. L.; Evelo, C. T.; Blomberg, N.; Ecker, G.; Goble, C.; Mons, B. Open PHACTS: semantic interoperability for drug discovery. Drug Discovery Today 2012, 17, 1188–1198. DOI: 10.1016/j.drudis.2012.05.016
- 25. Stierand, K.; Harder, T.; Marek, T.; Hilbig, M.; Lemmen, C.; Rarey, M. The Internet as Scientific Knowledge Base: Navigating the Chem-Bio Space. Mol. Inf. 2012, 31, 543–546. DOI: 10.1002/minf.201200037
- 26. Carrascosa, M. C.; Massaguer, O. L.; Mestres, J. PharmaTrek: A Semantic Web Explorer for Open Innovation in Multitarget Drug Discovery. Mol. Inf. 2012, 31, 537–541. DOI: 10.1002/minf.201200070
- 27. Isberg, V.; Mordalski, S.; Munk, C.; Rataj, K.; Harpsøe, K.; Hauser, A. S.; Vroling, B.; Bojarski, A. J.; Vriend, G.; Gloriam, D. E. GPCRDB: an information system for G protein-coupled receptors. Nucleic Acids Res. 2016, 44, D356–D364. DOI: 10.1093/nar/gkv1178
- 28. van Linden, O. P.; Kooistra, A. J.; Leurs, R.; de Esch, I. J.; de Graaf, C. KLIFS: a knowledge-based structural database to navigate kinase–ligand interaction space. J. Med. Chem. 2014, 57, 249–277. DOI: 10.1021/jm400378w
- 29. Kooistra, A. J.; Kanev, G. K.; van Linden, O. P.; Leurs, R.; de Esch, I. J.; de Graaf, C. KLIFS: a structural kinase-ligand interaction database. Nucleic Acids Res. 2016, 44, D365–D371. DOI: 10.1093/nar/gkv1082
- 30. Wood, D. J.; de Vlieg, J.; Wagener, M.; Ritschel, T. Pharmacophore fingerprint-based approach to binding site subpocket similarity and its application to bioisostere replacement. J. Chem. Inf. Model. 2012, 52, 2031–2043. DOI: 10.1021/ci3000776
- 31. Ridder, L.; Wagener, M. SyGMa: combining expert knowledge and empirical scoring in the prediction of metabolites. ChemMedChem 2008, 3, 821–832. DOI: 10.1002/cmdc.200700312
- 32. Postgresql. https://www.postgresql.org/.
- 33. Ochoa, R.; Davies, M.; Papadatos, G.; Atkinson, F.; Overington, J. P. myChEMBL: a virtual machine implementation of open data and cheminformatics tools. Bioinformatics 2014, 30, 298–300. DOI: 10.1093/bioinformatics/btt666
- 34. https://www.vagrantup.com/.
- 35. https://atlas.hashicorp.com/boxes/search.
- 36. https://www.packer.io/.
- 37. https://www.virtualbox.org/.
- 38. http://www.ansible.com.
- 39. Travis-CI. https://travis-ci.org/.
- 40. http://www.eclipse.org/tycho/.
- 41. KNIME Developer Guide. https://tech.knime.org/developer-guide.
- 42. Le Guilloux, V.; Schmidtke, P.; Tuffery, P. Fpocket: an open source platform for ligand pocket detection. BMC Bioinf. 2009, 10, 168. DOI: 10.1186/1471-2105-10-168
- 43. OPS-KNIME. https://github.com/openphacts/OPS-Knime.
- 44. Kooistra, A. J.; Kuhne, S.; de Esch, I. J.; Leurs, R.; de Graaf, C. A structural chemogenomics analysis of aminergic GPCRs: lessons for histamine receptor ligand design. Br. J. Pharmacol. 2013, 170, 101–126. DOI: 10.1111/bph.12248
- 45. Vass, M.; Kooistra, A. J.; Ritschel, T.; Leurs, R.; de Esch, I. J.; de Graaf, C. Molecular interaction fingerprint approaches for GPCR drug discovery. Curr. Opin. Pharmacol. 2016, 30, 59–68. DOI: 10.1016/j.coph.2016.07.007
- 46. http://swagger.io/swagger-codegen.
- 47. Isberg, V.; de Graaf, C.; Bortolato, A.; Cherezov, V.; Katritch, V.; Marshall, F. H.; Mordalski, S.; Pin, J. P.; Stevens, R. C.; Vriend, G.; Gloriam, D. E. Generic GPCR residue numbers - aligning topology maps while minding the gaps. Trends Pharmacol. Sci. 2015, 36, 22–31. DOI: 10.1016/j.tips.2014.11.001
- 48. Manning, G.; Whyte, D. B.; Martinez, R.; Hunter, T.; Sudarsanam, S. The protein kinase complement of the human genome. Science 2002, 298, 1912–1934. DOI: 10.1126/science.1075762
- 49. Marcou, G.; Rognan, D. Optimizing fragment and scaffold docking by use of molecular interaction fingerprints. J. Chem. Inf. Model. 2007, 47, 195–207. DOI: 10.1021/ci600342e
- 50. Fligner, M. A.; Verducci, J. S.; Blower, P. E. A modification of the Jaccard–Tanimoto similarity index for diverse selection of chemical compounds using binary strings. Technometrics 2002, 44, 110–119. DOI: 10.1198/004017002317375064
- 51. Nijmeijer, S.; Vischer, H. F.; Rudebeck, A. F.; Fleurbaaij, F.; Falck, D.; Leurs, R.; Niessen, W. M.; Kool, J. Development of a profiling strategy for metabolic mixtures by combining chromatography and mass spectrometry with cell-based GPCR signaling. J. Biomol. Screening 2012, 17, 1329–38. DOI: 10.1177/1087057112451922
- 52. Wang, L.; Christopher, L. J.; Cui, D.; Li, W.; Iyer, R.; Humphreys, W. G.; Zhang, D. Identification of the human enzymes involved in the oxidative metabolism of dasatinib: an effective approach for determining metabolite formation kinetics. Drug Metab. Dispos. 2008, 36, 1828–39. DOI: 10.1124/dmd.107.020255
- 53. Rogers, D.; Hahn, M. Extended-connectivity fingerprints. J. Chem. Inf. Model. 2010, 50, 742–54. DOI: 10.1021/ci100050t
- 54. Kooistra, A. J.; Vischer, H. F.; McNaught-Flores, D.; Leurs, R.; de Esch, I. J.; de Graaf, C. Function-specific virtual screening for GPCR ligands using a combined scoring method. Sci. Rep. 2016, 6, 28288. DOI: 10.1038/srep28288
- 55. Astolfi, A.; Iraci, N.; Manfroni, G.; Barreca, M. L.; Cecchetti, V. A Comprehensive Structural Overview of p38alpha MAPK in Complex with Type I Inhibitors. ChemMedChem 2015, 10, 957–69. DOI: 10.1002/cmdc.201500030
- 56. Lin, X.; Huang, X. P.; Chen, G.; Whaley, R.; Peng, S.; Wang, Y.; Zhang, G.; Wang, S. X.; Wang, S.; Roth, B. L.; Huang, N. Life beyond kinases: structure-based discovery of sorafenib as nanomolar antagonist of 5-HT receptors. J. Med. Chem. 2012, 55, 5749–59. DOI: 10.1021/jm300338m
- 57. DRUGMATRIX: Adenosine A2A radioligand binding assay (ligand: AB-MECA) CHEMBL1909214.
- 58. Dombroski, M. A.; Letavic, M. A.; McClure, K. F.; Barberia, J. T.; Carty, T. J.; Cortina, S. R.; Csiki, C.; Dipesa, A. J.; Elliott, N. C.; Gabel, C. A.; Jordan, C. K.; Labasi, J. M.; Martin, W. H.; Peese, K. M.; Stock, I. A.; Svensson, L.; Sweeney, F. J.; Yu, C. H. Benzimidazolone p38 inhibitors. Bioorg. Med. Chem. Lett. 2004, 14, 919–23. DOI: 10.1016/j.bmcl.2003.12.023
- 59. Yang, B.; Hird, A. W.; Russell, D. J.; Fauber, B. P.; Dakin, L. A.; Zheng, X.; Su, Q.; Godin, R.; Brassil, P.; Devereaux, E.; Janetka, J. W. Discovery of novel hedgehog antagonists from cell-based screening: Isosteric modification of p38 bisamides as potent inhibitors of SMO. Bioorg. Med. Chem. Lett. 2012, 22, 4907–11. DOI: 10.1016/j.bmcl.2012.04.104
- 60. Peters, J. U. Polypharmacology - foe or friend? J. Med. Chem. 2013, 56, 8955–71. DOI: 10.1021/jm400856t
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.6b00686.
Figures presenting the full versions of the GPCRdb, KLIFS, KRIPO, SyGMa, Chemdb4VS, and GPCR-kinase cross-reactivity prediction example KNIME workflows (PDF)