Directory of Useful Decoys, Enhanced (DUD-E): Better Ligands and Decoys for Better Benchmarking
Abstract

A key metric to assess molecular docking remains ligand enrichment against challenging decoys. Whereas the directory of useful decoys (DUD) has been widely used, clear areas for optimization have emerged. Here we describe an improved benchmarking set that includes more diverse targets such as GPCRs and ion channels, totaling 102 proteins with 22886 clustered ligands drawn from ChEMBL, each with 50 property-matched decoys drawn from ZINC. To ensure chemotype diversity, we cluster each target’s ligands by their Bemis–Murcko atomic frameworks. We add net charge to the matched physicochemical properties and include only the most dissimilar decoys, by topology, from the ligands. An online automated tool (http://decoys.docking.org) generates these improved matched decoys for user-supplied ligands. We test this data set by docking all 102 targets, using the results to improve the balance between ligand desolvation and electrostatics in DOCK 3.6. The complete DUD-E benchmarking set is freely available at http://dude.docking.org.
Introduction
Results
Figure 1. DUD-E target classification. Number of the 102 targets that belong to eight broad protein categories.
statistic | total | ChEMBL | manual |
---|---|---|---|
no. targets | 102 | 101 | 1 |
statistic | total | average | minimum | maximum |
---|---|---|---|---|
no. raw ligands | 66695 | 653.9 | 40 | 3090 |
no. clustered ligands | 22886 | 224.4 | 40 | 592 |
no. experimental decoys | 9219 | 90.4 | 1 | 1070 |
no. clustered ligands unique charge states | 28377 | 278.2 | 46 | 1030 |
no. computational decoys | 1411214 | 13835 | 2300 | 51500 |
target class | gene ID | description | total ligands | clustered ligands | experimental decoys | matched decoys | PDB | LogAUC (%) | ROC EF1 | AUC (%) |
---|---|---|---|---|---|---|---|---|---|---|
cytochrome P450 | CP2C9 | cytochrome P450 2C9 | 145 | 120 | 176 | 7450 | 1R9O | 7 | 3 | 60 |
cytochrome P450 | CP3A4 | cytochrome P450 3A4 | 302 | 170 | 267 | 11800 | 3NXU | 7 | 2 | 63 |
GPCR | AA2AR | adenosine A2a receptor | 3057 | 482 | 192 | 31550 | 3EML | 28 | 22 | 83 |
GPCR | ADRB1 | β-1 adrenergic receptor | 648 | 247 | 69 | 15850 | 2VT4 | 19 | 11 | 76 |
GPCR | CXCR4 | C-X-C chemokine receptor type 4 | 40 | 40 | 14 | 3406 | 3ODU | 36 | 18 | 90 |
ion channel | GRIA2 | glutamate receptor ionotropic AMPA 2 | 476 | 158 | 201 | 11845 | 3KGC | 23 | 23 | 71 |
ion channel | GRIK1 | glutamate receptor ionotropic kainate 1 | 136 | 101 | 235 | 6550 | 1VSO | 35 | 27 | 86 |
kinase | AKT1 | serine/threonine-protein kinase AKT | 585 | 293 | 53 | 16450 | 3CQW | 27 | 29 | 72 |
kinase | MK10 | c-Jun N-terminal kinase 3 | 199 | 104 | 23 | 6600 | 2ZDT | 24 | 11 | 82 |
kinase | MK14 | MAP kinase p38 α | 2205 | 578 | 73 | 35850 | 2QD9 | 17 | 10 | 74 |
miscellaneous | KIF11 | kinesin-like protein 1 | 272 | 116 | 29 | 6850 | 3CJO | 34 | 35 | 77 |
miscellaneous | XIAP | inhibitor of apoptosis protein 3 | 100 | 100 | 7 | 5150 | 3HL5 | 52 | 55 | 88 |
nuclear receptor | ESR1 | estrogen receptor α | 1297 | 383 | 136 | 20685 | 1SJ0 | 18 | 15 | 67 |
nuclear receptor | MCR | mineralocorticoid receptor | 201 | 94 | 2 | 5150 | 2AA2 | –4 | 2 | 36 |
nuclear receptor | THB | thyroid hormone receptor β-1 | 246 | 103 | 29 | 7450 | 1Q4X | 36 | 38 | 79 |
nuclear receptor | PPARD | peroxisome proliferator-activated receptor δ | 699 | 240 | 79 | 12250 | 2ZNP | 32 | 20 | 89 |
other enzymes | FNTA | protein farnesyltransferase type I α | 1430 | 592 | 132 | 51500 | 3E37 | 16 | 7 | 76 |
other enzymes | HDAC8 | histone deacetylase 8 | 309 | 170 | 73 | 10450 | 3F07 | 29 | 24 | 80 |
other enzymes | HIVINT | HIV type 1 integrase | 167 | 100 | 268 | 6650 | 3NF7 | 8 | 2 | 64 |
other enzymes | KITH | thymidine kinase | 57 | 57 | 68 | 2850 | 2B8T | 15 | 0 | 80 |
other enzymes | PARP1 | poly (ADP-ribose)polymerase-1 | 1031 | 508 | 12 | 30050 | 3L3M | 25 | 21 | 79 |
other enzymes | PUR2 | GAR transformylase | 50 | 50 | 12 | 2700 | 1NJS | 51 | 50 | 92 |
protease | DPP4 | dipeptidyl peptidase IV | 1939 | 533 | 167 | 40950 | 2I78 | 41 | 41 | 87 |
protease | FA10 | coagulation factor X | 3090 | 537 | 176 | 28325 | 3KL6 | 39 | 36 | 87 |
protease | LKHA4 | leukotriene A4 hydrolase | 343 | 171 | 21 | 9450 | 3CHP | 18 | 4 | 82 |
protease | MMP13 | matrix metallo-proteinase 13 | 1632 | 572 | 26 | 37200 | 830C | 12 | 5 | 71 |
Figure 2. Ligand clustering. (A) The seventh largest Murcko cluster of kinesin-like protein 1 (KIF11), showing both the scaffold (left) and all seven member ligands. (B) Number of ligands in each of the 70 KIF11 Bemis–Murcko atomic frameworks. We removed lower-affinity compounds from over-represented clusters (above the line), while retaining 100 ligands. (C) Number of adenosine A2A receptor (AA2AR) Murcko clusters plotted against affinity threshold. Fewer than 600 clusters are present using a 30 nM affinity threshold.
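The trimming described in panel (B) can be illustrated with a minimal Python sketch. It assumes each ligand already carries a scaffold key (for example a framework string produced by a cheminformatics toolkit) and an affinity in nM. The function name `cluster_and_trim`, the `min_ligands` parameter, and the rule of raising a per-cluster cap until enough ligands are retained are illustrative assumptions, not the authors' published procedure.

```python
from collections import defaultdict


def cluster_and_trim(ligands, min_ligands=100):
    """Group ligands by scaffold and keep the best binders per cluster.

    ligands: iterable of (ligand_id, scaffold_key, affinity_nM) tuples.
    Returns (cap, kept_ids), where cap is the number of ligands allowed
    per cluster. The cap-raising rule is an assumption for illustration.
    """
    clusters = defaultdict(list)
    for lig_id, scaffold, affinity_nm in ligands:
        clusters[scaffold].append((affinity_nm, lig_id))
    for members in clusters.values():
        members.sort()  # lowest nM (highest affinity) first

    total = sum(len(members) for members in clusters.values())
    goal = min(min_ligands, total)  # cannot keep more than we have
    cap = 1
    while True:
        # Keep the `cap` highest-affinity ligands of every cluster, so
        # over-represented clusters lose their weakest members first.
        kept = [lig for members in clusters.values() for _, lig in members[:cap]]
        if len(kept) >= goal:
            return cap, kept
        cap += 1
```

With a cap of 1 this reduces to one representative per framework. Raising the cap only when too few ligands survive keeps the set diverse while still reaching the requested size.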
Figure 3. Decoy generation. (A) Three key “warhead” groups from factor Xa (FA10), glycinamide ribonucleotide transformylase (PUR2), and thymidine kinase (KITH). (B) Fraction of warheads remaining is plotted against the dissimilarity method. The dissimilarity methods consist of a fingerprint (Daylight or ECFP4) and either a hard cutoff or a fraction of the most dissimilar decoys to be retained. (C) Property distributions of estrogen receptor α (ESR1) for both the 383 ligands (blue) and the 20685 property-matched decoys (red).
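The two dissimilarity options named in panel (B), a hard similarity cutoff and retention of the most dissimilar fraction, can be sketched as follows. This is a simplified illustration that treats a fingerprint as a plain set of on-bit indices. The function names and the `cutoff` and `fraction` parameters are hypothetical, and real Daylight or ECFP4 fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(a, b):
    """Tanimoto coefficient (Tc) between two fingerprints given as bit sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity(decoy_fp, ligand_fps):
    """Highest Tc between one decoy candidate and any ligand."""
    return max((tanimoto(decoy_fp, fp) for fp in ligand_fps), default=0.0)


def filter_by_cutoff(decoys, ligand_fps, cutoff):
    """Hard cutoff: keep decoys whose maximum Tc to any ligand is below cutoff.

    decoys: list of (name, fingerprint) pairs.
    """
    return [name for name, fp in decoys if max_similarity(fp, ligand_fps) < cutoff]


def keep_most_dissimilar(decoys, ligand_fps, fraction):
    """Fractional rule: keep the given fraction of decoys with the lowest
    maximum Tc to any ligand (at least one decoy if any are supplied)."""
    scored = sorted(decoys, key=lambda d: max_similarity(d[1], ligand_fps))
    n_keep = max(1, int(len(scored) * fraction)) if scored else 0
    return [name for name, _ in scored[:n_keep]]
```

A hard cutoff can leave some ligands with very few decoys, whereas the fractional rule always returns a predictable share of the candidate pool. That trade-off is one plausible reason to compare the two approaches.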
incremental change | all original | new style decoys | switch to new ligands | switch target preparation |
---|---|---|---|---|
decoys | DUD | DUD-E | DUD-E | DUD-E |
ligands | DUD | DUD | DUD-E | DUD-E |
receptor preparation | DUD | DUD | DUD | DUD-E |
average LogAUC (a) | 14.8 | 19.7 | 16.4 | 22.8 |
(a) Over the 37 common targets (target-by-target data in Supporting Information Table S5).
Figure 4. Retrospective enrichment comparing ligand desolvation and electrostatics methods. Docking results over DUD-E as measured by LogAUC. “None” has no ligand desolvation term, “SEV” uses solvent-excluded volume ligand desolvation, “Thin” employs a thin low-dielectric layer in the electrostatic calculations.
Figure 5. Representative ROC plots. ROC plots using no desolvation (None), solvent-excluded volume ligand desolvation (SEV), the thin low-dielectric layer (Thin), or a drug-like background that consists of all ChEMBL12 ligands with affinities better than 10 μM (Drug-like). The black dotted line represents the results expected from docking ligands randomly. LogAUC percentages are reported in the legend text.
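The enrichment metrics reported here (AUC, ROC EF1, and LogAUC) can be computed from a ranked hit list, as in the hedged sketch below. It assumes the common adjusted-LogAUC convention: the area under the ROC curve on a log10 false-positive axis from 0.1% to 100%, normalized to 1, minus the area expected for random ranking (about 0.1446). Under that convention random docking scores 0 and worse-than-random ranking is negative. The text above does not spell out these definitions, so they, and the function names, are assumptions. Multiply the results by 100 to compare with the percentages in the tables.

```python
import math


def roc_heights(labels):
    """labels: 1 for ligand, 0 for decoy, ordered best score first.

    Returns, for each decoy in rank order, the fraction of ligands ranked
    ahead of it (the ROC staircase height over that decoy's FPR interval).
    Assumes at least one ligand and one decoy.
    """
    n_lig = sum(labels)
    heights, seen = [], 0
    for lab in labels:
        if lab:
            seen += 1
        else:
            heights.append(seen / n_lig)
    return heights


def auc(labels):
    """Area under the linear ROC curve (0 to 1)."""
    h = roc_heights(labels)
    return sum(h) / len(h)


def roc_ef(labels, fpr=0.01):
    """ROC enrichment: true-positive rate at the given FPR, divided by that FPR."""
    h = roc_heights(labels)
    k = min(int(fpr * len(h)), len(h) - 1)
    return h[k] / fpr


def log_auc(labels, lam=0.001, adjusted=True):
    """Area under the ROC curve with a log10 FPR axis from lam to 1."""
    h = roc_heights(labels)
    n_dec = len(h)
    area = 0.0
    for k, height in enumerate(h):
        lo, hi = max(k / n_dec, lam), (k + 1) / n_dec
        if hi > lo:  # skip intervals that lie entirely below lam
            area += height * (math.log10(hi) - math.log10(lo))
    area /= math.log10(1.0 / lam)
    # Area of the random (diagonal) curve on the same axis.
    random_area = (1.0 - lam) / math.log(1.0 / lam)
    return area - random_area if adjusted else area
```

Because the log axis stretches the earliest part of the ranked list, LogAUC rewards ligands found in the top fraction of a percent far more than linear AUC does.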
Mineralocorticoid Receptor (MCR)
Figure 6. Representative docking poses. The crystallographic ligand was rebuilt and docked from scratch. (A–F) The crystal pose (magenta) is compared to the resulting docked pose (green). In (C), more ligand conformations are generated and the redocked pose is also shown (tan). Key hydrogen bonds are shown by black dotted lines, and the partially transparent protein surface is colored by atom type.
Thyroid Hormone Receptor β1 (THB)
Serine/Threonine-Protein Kinase AKT (AKT1)
Discussion
Balancing Ligands and Decoys for Enrichment
Online Tools for Automated Generation of Further Ligand and Decoy Sets
Applications to Docking Optimization and Testing
Methods
ChEMBL and RCSB PDB Data Extraction
Target Selection Docking
Target Preparation
Ligand Preparation
Ligand Clustering
Automated Decoy Generation
Original DUD Comparison
Docking Calculations
Docking Metrics
Supporting Information
A figure shows the DUD-E workflows, tables provide detailed target-by-target data, and tab-delimited text files provide the raw data. This material is available free of charge via the Internet at http://pubs.acs.org.
Acknowledgment
Supported by NIH grant GM71896 (to J.J.I. and B.K.S.). We thank Andrew Good for discussions that initiated DUD-E. We thank Teague Sterling for website development and Sunil Koovakkat for DOCK bugfixes. We are grateful to the commercial software vendors who support ZINC and the decoy generation toolchain: Molinspiration (Bratislava, Slovakia) for mib, OpenEye Scientific Software (Santa Fe, NM) for OEChem, Omega, and QuacPac, Molecular Networks (Erlangen, Germany) for Corina, Accelrys (San Diego, CA) for Pipeline Pilot, and ChemAxon (Budapest, Hungary) for cxcalc. We thank Oliv Eidam, Matthew Merski, and Nir London for reading this manuscript.
Abbreviations
DUD | Directory of Useful Decoys |
DUD-E | Directory of Useful Decoys—Enhanced |
EF1 | enrichment factor at 1% of ROC curve |
PH | pleckstrin homology |
ROC | receiver operating characteristic |
SEV | solvent-excluded volume |
Tc | Tanimoto coefficient |
- 16 Christofferson, A. J.; Huang, N. How to benchmark methods for structure-based virtual screening of large compound libraries. In Computational Drug Discovery and Design (Methods in Molecular Biology); 2011/12/21 ed.; Baron, R., Ed.; Springer Protocols: New York, 2012; Vol. 819, Chapter 13, pp 187–195.
- 17 Verdonk, M. L.; Berdini, V.; Hartshorn, M. J.; Mooij, W. T.; Murray, C. W.; Taylor, R. D.; Watson, P. Virtual screening using protein–ligand docking: avoiding artificial enrichment. J. Chem. Inf. Comput. Sci. 2004, 44, 793–806.
- 18 Kuntz, I. D.; Chen, K.; Sharp, K. A.; Kollman, P. A. The maximal affinity of ligands. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 9997–10002.
- 19 Fan, H.; Irwin, J. J.; Webb, B. M.; Klebe, G.; Shoichet, B. K.; Sali, A. Molecular Docking Screens Using Comparative Models of Proteins. J. Chem. Inf. Model. 2009, 49, 2512–2527.
- 20 Repasky, M. P.; Murphy, R. B.; Banks, J. L.; Greenwood, J. R.; Tubert-Brohman, I.; Bhat, S.; Friesner, R. A. Docking performance of the glide program as evaluated on the Astex and DUD datasets: a complete set of glide SP results and selected results for a new scoring function integrating WaterMap and glide. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9575-9.
- 21 Brozell, S. R.; Mukherjee, S.; Balius, T. E.; Roe, D. R.; Case, D. A.; Rizzo, R. C. Evaluation of DOCK 6 as a pose generation and database enrichment tool. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9565-y.
- 22 Neves, M. A.; Totrov, M.; Abagyan, R. Docking and scoring with ICM: the benchmarking results and strategies for improvement. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9547-0.
- 23 Spitzer, R.; Jain, A. N. Surflex-Dock: docking benchmarks and real-world application. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-011-9533-y.
- 24 Schneider, N.; Hindle, S.; Lange, G.; Klein, R.; Albrecht, J.; Briem, H.; Beyer, K.; Claussen, H.; Gastreich, M.; Lemmen, C.; Rarey, M. Substantial improvements in large-scale redocking and screening using the novel HYDE scoring function. J. Comput.-Aided Mol. Des. 2011, DOI: 10.1007/s10822-011-9531-0.
- 25 Liebeschuetz, J. W.; Cole, J. C.; Korb, O. Pose prediction and virtual screening performance of GOLD scoring functions in a standardized test. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9551-4.
- 26 Novikov, F. N.; Stroylov, V. S.; Zeifman, A. A.; Stroganov, O. V.; Kulkov, V.; Chilov, G. G. Lead Finder docking and virtual screening evaluation with Astex and DUD test sets. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9549-y.
- 27 Good, A. C.; Oprea, T. I. Optimization of CAMD techniques 3. Virtual screening enrichment studies: a help or hindrance in tool selection? J. Comput.-Aided Mol. Des. 2008, 22, 169–178.
- 28 Mackey, M. D.; Melville, J. L. Better than random? The chemotype enrichment problem. J. Chem. Inf. Model. 2009, 49, 1154–1162.
- 29 Hawkins, P. C.; Warren, G. L.; Skillman, A. G.; Nicholls, A. How to do an evaluation: pitfalls and traps. J. Comput.-Aided Mol. Des. 2008, 22, 179–190.
- 30 Irwin, J. J. Community benchmarks for virtual screening. J. Comput.-Aided Mol. Des. 2008, 22, 193–199.
- 31 Mysinger, M. M.; Shoichet, B. K. Rapid context-dependent ligand desolvation in molecular docking. J. Chem. Inf. Model. 2010, 50, 1561–1573.
- 32 Vogel, S. M.; Bauer, M. R.; Boeckler, F. M. DEKOIS: demanding evaluation kits for objective in silico screening - a versatile tool for benchmarking docking programs and scoring functions. J. Chem. Inf. Model. 2011, 51, 2650–2665.
- 33 Wallach, I.; Lilien, R. Virtual decoy sets for molecular docking benchmarks. J. Chem. Inf. Model. 2011, 51, 196–202.
- 34 Gatica, E. A.; Cavasotto, C. N. Ligand and decoy sets for docking to G protein-coupled receptors. J. Chem. Inf. Model. 2012, 52, 1–6.
- 35 Cereto-Massague, A.; Guasch, L.; Valls, C.; Mulero, M.; Pujadas, G.; Garcia-Vallve, S. DecoyFinder: an easy-to-use python GUI application for building target-specific decoy sets. Bioinformatics 2012, 28, 1661–1662.
- 36 Rohrer, S. G.; Baumann, K. Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J. Chem. Inf. Model. 2009, 49, 169–184.
- 37 Ripphausen, P.; Wassermann, A. M.; Bajorath, J. REPROVIS-DB: a benchmark system for ligand-based virtual screening derived from reproducible prospective applications. J. Chem. Inf. Model. 2011, 51, 2467–2473.
- 38 Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107.
- 39 Bemis, G. W.; Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887–2893.
- 40 Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242.
- 41 Apweiler, R.; Bairoch, A.; Wu, C. H.; Barker, W. C.; Boeckmann, B.; Ferro, S.; Gasteiger, E.; Huang, H.; Lopez, R.; Magrane, M.; Martin, M. J.; Natale, D. A.; O’Donovan, C.; Redaschi, N.; Yeh, L. S. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 2004, 32, D115–D119.
- 42Irwin, J. J.; Shoichet, B. K.; Mysinger, M. M.; Huang, N.; Colizzi, F.; Wassam, P.; Cao, Y. Automated docking screens: a feasibility study J. Med. Chem. 2009, 52, 5712– 5720Google Scholar42https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXhtVOjurjN&md5=f302db1f77b87f26d1fa2fece55bca96Automated Docking Screens: A Feasibility StudyIrwin, John J.; Shoichet, Brian K.; Mysinger, Michael M.; Huang, Niu; Colizzi, Francesco; Wassam, Pascal; Cao, YiqunJournal of Medicinal Chemistry (2009), 52 (18), 5712-5720CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Mol. docking is the most practical approach to leverage protein structure for ligand discovery, but the technique retains important liabilities that make it challenging to deploy on a large scale. We have therefore created an expert system, DOCK Blaster, to investigate the feasibility of full automation. The method requires a PDB code, sometimes with a ligand structure, and from that alone can launch a full screen of large libraries. A crit. feature is self-assessment, which ests. the anticipated reliability of the automated screening results using pose fidelity and enrichment. Against common benchmarks, DOCK Blaster recapitulates the crystal ligand pose within 2 Å rmsd 50-60% of the time; inferior to an expert, but respectable. Half the time the ligand also ranked among the top 5% of 100 phys. matched decoys chosen on the fly. Further tests were undertaken culminating in a study of 7755 eligible PDB structures. In 1398 cases, the redocked ligand ranked in the top 5% of 100 property-matched decoys while also posing within 2 Å rmsd, suggesting that unsupervised prospective docking is viable. DOCK Blaster is available at http://blaster.docking.org.
- 43Powers, R. A.; Morandi, F.; Shoichet, B. K. Structure-based discovery of a novel, noncovalent inhibitor of AmpC beta-lactamase Structure 2002, 10, 1013– 1023Google Scholar43https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD38XlsVyjsrk%253D&md5=5f539484d08191f0c65626493fc1b262Structure-Based Discovery of a Novel, Noncovalent Inhibitor of AmpC β-LactamasePowers, Rachel A.; Morandi, Federica; Shoichet, Brian K.Structure (Cambridge, MA, United States) (2002), 10 (7), 1013-1023CODEN: STRUE6; ISSN:0969-2126. (Cell Press)β-Lactamases are the most widespread resistance mechanisms to β-lactam antibiotics, and there is a pressing need for novel, non-β-lactam drugs. A database of over 200,000 compds. was docked to the active site of AmpC β-lactamase to identify potential inhibitors. Fifty-six compds. were tested, and three had Ki values of 650 μM or better. The best of these, 3-[(4-chloroanilino)sulfonyl]thiophene-2-carboxylic acid, was a competitive noncovalent inhibitor (Ki = 26 μM), which also reversed resistance to β-lactams in bacteria expressing AmpC. The structure of AmpC in complex with this compd. was detd. by x-ray crystallog. to 1.94 A and reveals that the inhibitor interacts with key active-site residues in sites targeted in the docking calcn. Indeed, the exptl. detd. conformation of the inhibitor closely resembles the prediction. The structure of the enzyme-inhibitor complex presents an opportunity to improve binding affinity in a novel series of inhibitors discovered by structure-based methods.
- 44. Carlsson, J.; Yoo, L.; Gao, Z. G.; Irwin, J. J.; Shoichet, B. K.; Jacobson, K. A. Structure-based discovery of A2A adenosine receptor ligands. J. Med. Chem. 2010, 53, 3748–3755.
- 45. Carlsson, J.; Coleman, R. G.; Setola, V.; Irwin, J. J.; Fan, H.; Schlessinger, A.; Sali, A.; Roth, B. L.; Shoichet, B. K. Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nat. Chem. Biol. 2011, 7, 769–778.
- 46. Irwin, J. J.; Shoichet, B. K. ZINC - a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005, 45, 177–182.
- 47. Velankar, S.; McNeil, P.; Mittard-Runte, V.; Suarez, A.; Barrell, D.; Apweiler, R.; Henrick, K. E-MSD: an integrated data resource for bioinformatics. Nucleic Acids Res. 2005, 33, D262–265.
- 48. Hawkins, P. C.; Skillman, A. G.; Nicholls, A. Comparison of shape-matching and docking as virtual screening tools. J. Med. Chem. 2007, 50, 74–82.
- 49. Teotico, D. G.; Babaoglu, K.; Rocklin, G. J.; Ferreira, R. S.; Giannetti, A. M.; Shoichet, B. K. Docking for fragment inhibitors of AmpC beta-lactamase. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 7455–7460.
- 50. Tondi, D.; Morandi, F.; Bonnet, R.; Costi, M. P.; Shoichet, B. K. Structure-based optimization of a non-beta-lactam lead results in inhibitors that do not up-regulate beta-lactamase expression in cell culture. J. Am. Chem. Soc. 2005, 127, 4632–4639.
- 51. Graves, A. P.; Brenk, R.; Shoichet, B. K. Decoys for docking. J. Med. Chem. 2005, 48, 3714–3728.
- 52. Hawkins, P. C.; Skillman, A. G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer generation with OMEGA: algorithm and validation using high quality structures from the Protein Data Bank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584.
- 53. Jain, A. N. Bias, reporting, and sharing: computational evaluations of docking methods. J. Comput.-Aided Mol. Des. 2008, 22, 201–212.
- Chao Yang, Yingkai Zhang. Delta Machine Learning to Improve Scoring-Ranking-Screening Performances of Protein–Ligand Scoring Functions. Journal of Chemical Information and Modeling 2022, 62
(11)
, 2696-2712. https://doi.org/10.1021/acs.jcim.2c00485
- Xujun Zhang, Chao Shen, Ben Liao, Dejun Jiang, Jike Wang, Zhenxing Wu, Hongyan Du, Tianyue Wang, Wenbo Huo, Lei Xu, Dongsheng Cao, Chang-Yu Hsieh, Tingjun Hou. TocoDecoy: A New Approach to Design Unbiased Datasets for Training and Benchmarking Machine-Learning Scoring Functions. Journal of Medicinal Chemistry 2022, 65
(11)
, 7918-7932. https://doi.org/10.1021/acs.jmedchem.2c00460
- Weixin Xie, Fanhao Wang, Yibo Li, Luhua Lai, Jianfeng Pei. Advances and Challenges in De Novo Drug Design Using Three-Dimensional Deep Generative Models. Journal of Chemical Information and Modeling 2022, 62
(10)
, 2269-2279. https://doi.org/10.1021/acs.jcim.2c00042
- Nemanja Djokovic, Dusan Ruzic, Minna Rahnasto-Rilla, Tatjana Srdic-Rajic, Maija Lahtela-Kakkonen, Katarina Nikolic. Expanding the Accessible Chemical Space of SIRT2 Inhibitors through Exploration of Binding Pocket Dynamics. Journal of Chemical Information and Modeling 2022, 62
(10)
, 2571-2585. https://doi.org/10.1021/acs.jcim.2c00241
- Haoqi Wang, Nirmitee Mulgaonkar, Lisa M. Pérez, Sandun Fernando. ELIXIR-A: An Interactive Visualization Tool for Multi-Target Pharmacophore Refinement. ACS Omega 2022, 7
(15)
, 12707-12715. https://doi.org/10.1021/acsomega.1c07144
- Nabeel Ahmad, Anamika Singh, Akshita Gupta, Pradeep Pant, Tej P. Singh, Sujata Sharma, Pradeep Sharma. Discovery of the Lead Molecules Targeting the First Step of the Histidine Biosynthesis Pathway of Acinetobacter baumannii. Journal of Chemical Information and Modeling 2022, 62
(7)
, 1744-1759. https://doi.org/10.1021/acs.jcim.1c01421
- Tomomi Shimazaki, Masanori Tachikawa. Collaborative Approach between Explainable Artificial Intelligence and Simplified Chemical Interactions to Explore Active Ligands for Cyclin-Dependent Kinase 2. ACS Omega 2022, 7
(12)
, 10372-10381. https://doi.org/10.1021/acsomega.1c06976
- C. Johan van der Westhuizen, André Stander, Darren L. Riley, Jenny-Lee Panayides. Discovery of Novel Acetylcholinesterase Inhibitors by Virtual Screening, In Vitro Screening, and Molecular Dynamics Simulations. Journal of Chemical Information and Modeling 2022, 62
(6)
, 1550-1572. https://doi.org/10.1021/acs.jcim.1c01443
- Janez Konc, Samo Lešnik, Blaž Škrlj, Matej Sova, Matic Proj, Damijan Knez, Stanislav Gobec, Dušanka Janežič. ProBiS-Dock: A Hybrid Multitemplate Homology Flexible Docking Algorithm Enabled by Protein Binding Site Comparison. Journal of Chemical Information and Modeling 2022, 62
(6)
, 1573-1584. https://doi.org/10.1021/acs.jcim.1c01176
- Giovanni Bolcato, Esther Heid, Jonas Boström. On the Value of Using 3D Shape and Electrostatic Similarities in Deep Generative Methods. Journal of Chemical Information and Modeling 2022, 62
(6)
, 1388-1398. https://doi.org/10.1021/acs.jcim.1c01535
- Anat Levit Kaplan, Ryan T. Strachan, Joao M. Braz, Veronica Craik, Samuel Slocum, Thomas Mangano, Vanessa Amabo, Henry O’Donnell, Parnian Lak, Allan I. Basbaum, Bryan L. Roth, Brian K. Shoichet. Structure-Based Design of a Chemical Probe Set for the 5-HT5A Serotonin Receptor. Journal of Medicinal Chemistry 2022, 65
(5)
, 4201-4217. https://doi.org/10.1021/acs.jmedchem.1c02031
- Eugene Lin, Chieh-Hsin Lin, Hsien-Yuan Lane. De Novo Peptide and Protein Design Using Generative Adversarial Networks: An Update. Journal of Chemical Information and Modeling 2022, 62
(4)
, 761-774. https://doi.org/10.1021/acs.jcim.1c01361
- Sami T. Kurkinen, Jukka V. Lehtonen, Olli T. Pentikäinen, Pekka A. Postila. Optimization of Cavity-Based Negative Images to Boost Docking Enrichment in Virtual Screening. Journal of Chemical Information and Modeling 2022, 62
(4)
, 1100-1112. https://doi.org/10.1021/acs.jcim.1c01145
- Fabio Begnini, Stefan Geschwindner, Patrik Johansson, Lisa Wissler, Richard J. Lewis, Emma Danelius, Andreas Luttens, Pierre Matricon, Jens Carlsson, Stijn Lenders, Beate König, Anna Friedel, Peter Sjö, Stefan Schiesser, Jan Kihlberg. Importance of Binding Site Hydration and Flexibility Revealed When Optimizing a Macrocyclic Inhibitor of the Keap1–Nrf2 Protein–Protein Interaction. Journal of Medicinal Chemistry 2022, 65
(4)
, 3473-3517. https://doi.org/10.1021/acs.jmedchem.1c01975
- Andreas Luttens, Hjalmar Gullberg, Eldar Abdurakhmanov, Duy Duc Vo, Dario Akaberi, Vladimir O. Talibov, Natalia Nekhotiaeva, Laura Vangeel, Steven De Jonghe, Dirk Jochmans, Janina Krambrich, Ali Tas, Bo Lundgren, Ylva Gravenfors, Alexander J. Craig, Yoseph Atilaw, Anja Sandström, Lindon W. K. Moodie, Åke Lundkvist, Martijn J. van Hemert, Johan Neyts, Johan Lennerstrand, Jan Kihlberg, Kristian Sandberg, U. Helena Danielson, Jens Carlsson. Ultralarge Virtual Screening Identifies SARS-CoV-2 Main Protease Inhibitors with Broad-Spectrum Activity against Coronaviruses. Journal of the American Chemical Society 2022, 144
(7)
, 2905-2920. https://doi.org/10.1021/jacs.1c08402
- Dongping Li, Kexin Jiang, Dan Teng, Zengrui Wu, Weihua Li, Yun Tang, Rui Wang, Guixia Liu. Discovery of New Estrogen-Related Receptor α Agonists via a Combination Strategy Based on Shape Screening and Ensemble Docking. Journal of Chemical Information and Modeling 2022, 62
(3)
, 486-497. https://doi.org/10.1021/acs.jcim.1c00662
- Damien Geslin, Alban Lepailleur, Jean-Luc Manguin, Nhat-Vinh Vo, Jean-Luc Lamotte, Bertrand Cuissart, Ronan Bureau. Deciphering a Pharmacophore Network: A Case Study Using BCR-ABL Data. Journal of Chemical Information and Modeling 2022, 62
(3)
, 678-691. https://doi.org/10.1021/acs.jcim.1c00427
- Wenyi Zhang, Jing Huang. EViS: An Enhanced Virtual Screening Approach Based on Pocket–Ligand Similarity. Journal of Chemical Information and Modeling 2022, 62
(3)
, 498-510. https://doi.org/10.1021/acs.jcim.1c00944
- Iván Felsztyna, Marcos A. Villarreal, Daniel A. García, Virginia Miguel. Insect RDL Receptor Models for Virtual Screening: Impact of the Template Conformational State in Pentameric Ligand-Gated Ion Channels. ACS Omega 2022, 7
(2)
, 1988-2001. https://doi.org/10.1021/acsomega.1c05465
- Ilenia Giangreco, Abhik Mukhopadhyay, Jason C. Cole. Validation of a Field-Based Ligand Screener Using a Novel Benchmarking Data Set for Assessing 3D-Based Virtual Screening Methods. Journal of Chemical Information and Modeling 2021, 61
(12)
, 5841-5852. https://doi.org/10.1021/acs.jcim.1c00866
- Dejun Jiang, Chang-Yu Hsieh, Zhenxing Wu, Yu Kang, Jike Wang, Ercheng Wang, Ben Liao, Chao Shen, Lei Xu, Jian Wu, Dongsheng Cao, Tingjun Hou. InteractionGraphNet: A Novel and Efficient Deep Graph Representation Learning Framework for Accurate Protein–Ligand Interaction Predictions. Journal of Medicinal Chemistry 2021, 64
(24)
, 18209-18232. https://doi.org/10.1021/acs.jmedchem.1c01830
- Yujin Wu, Charles L. Brooks III. Flexible CDOCKER: Hybrid Searching Algorithm and Scoring Function with Side Chain Conformational Entropy. Journal of Chemical Information and Modeling 2021, 61
(11)
, 5535-5549. https://doi.org/10.1021/acs.jcim.1c01078
- Hugo Guterres, Sang-Jun Park, Yiwei Cao, Wonpil Im. CHARMM-GUI Ligand Designer for Template-Based Virtual Ligand Design in a Binding Site. Journal of Chemical Information and Modeling 2021, 61
(11)
, 5336-5342. https://doi.org/10.1021/acs.jcim.1c01156
- Anantha Krishnan Dhanabalan, Mamangam Subaraja, Kuppusamy Palanichamy, Devadasan Velmurugan, Krishnasamy Gunasekaran. Identification of a Chlorogenic Ester as a Monoamine Oxidase (MAO-B) Inhibitor by Integrating “Traditional and Machine Learning” Virtual Screening and In Vitro as well as In Vivo Validation: A Lead against Neurodegenerative Disorders?. ACS Chemical Neuroscience 2021, 12
(19)
, 3690-3707. https://doi.org/10.1021/acschemneuro.1c00430
- Shuo Gu, Matthew S. Smith, Ying Yang, John J. Irwin, Brian K. Shoichet. Ligand Strain Energy in Large Library Docking. Journal of Chemical Information and Modeling 2021, 61
(9)
, 4331-4341. https://doi.org/10.1021/acs.jcim.1c00368
- Panagiotis I. Koukos, Manon Réau, Alexandre M. J. J. Bonvin. Shape-Restrained Modeling of Protein–Small-Molecule Complexes with High Ambiguity Driven DOCKing. Journal of Chemical Information and Modeling 2021, 61
(9)
, 4807-4818. https://doi.org/10.1021/acs.jcim.1c00796
- Shuoyan Tan, Xiaoqing Gong, Huanxiang Liu, Xiaojun Yao. Virtual Screening and Biological Activity Evaluation of New Potent Inhibitors Targeting LRRK2 Kinase Domain. ACS Chemical Neuroscience 2021, 12
(17)
, 3214-3224. https://doi.org/10.1021/acschemneuro.1c00399
- Felix Musil, Andrea Grisafi, Albert P. Bartók, Christoph Ortner, Gábor Csányi, Michele Ceriotti. Physics-Inspired Structural Representations for Molecules and Materials. Chemical Reviews 2021, 121
(16)
, 9759-9815. https://doi.org/10.1021/acs.chemrev.1c00021
- Jerome Eberhardt, Diogo Santos-Martins, Andreas F. Tillack, Stefano Forli. AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. Journal of Chemical Information and Modeling 2021, 61
(8)
, 3891-3898. https://doi.org/10.1021/acs.jcim.1c00203
- Janez Konc, Samo Lešnik, Blaž Škrlj, Dušanka Janežič. ProBiS-Dock Database: A Web Server and Interactive Web Repository of Small Ligand–Protein Binding Sites for Drug Design. Journal of Chemical Information and Modeling 2021, 61
(8)
, 4097-4107. https://doi.org/10.1021/acs.jcim.1c00454
- Hugo Guterres, Sang-Jun Park, Han Zhang, Wonpil Im. CHARMM-GUI LBS Finder & Refiner for Ligand Binding Site Prediction and Refinement. Journal of Chemical Information and Modeling 2021, 61
(8)
, 3744-3751. https://doi.org/10.1021/acs.jcim.1c00561
- SahaIshikaGraduate Student Researcherishikasaha@g.
ucla. eduHarranPatrick G.D.J. & J.M. Cram Chair in Organic Chemistryharran@chem. ucla. eduDr. Jonathan Bohmann, Department of Pharmaceuticals and Bioengineering, Southwest Research Institute, Ryan Gumpper, Postdoctoral Researcher, University of North Carolina at Chapel Hill. Virtual Screening for Chemists. 2021https://doi.org/10.1021/acsinfocus.7e5001 - Biao Ma, Kei Terayama, Shigeyuki Matsumoto, Yuta Isaka, Yoko Sasakura, Hiroaki Iwata, Mitsugu Araki, Yasushi Okuno. Structure-Based de Novo Molecular Generator Combined with Artificial Intelligence and Docking Simulations. Journal of Chemical Information and Modeling 2021, 61
(7)
, 3304-3313. https://doi.org/10.1021/acs.jcim.1c00679
- Wei Chen, Guanxing Chen, Lu Zhao, Calvin Yu-Chian Chen. Predicting Drug–Target Interactions with Deep-Embedding Learning of Graphs and Sequences. The Journal of Physical Chemistry A 2021, 125
(25)
, 5633-5642. https://doi.org/10.1021/acs.jpca.1c02419
- Alzbeta Tuerkova, Orsolya Ungvári, Réka Laczkó-Rigó, Erzsébet Mernyák, Gergely Szakács, Csilla Özvegy-Laczka, Barbara Zdrazil. Data-Driven Ensemble Docking to Map Molecular Interactions of Steroid Analogs with Hepatic Organic Anion Transporting Polypeptides. Journal of Chemical Information and Modeling 2021, 61
(6)
, 3109-3127. https://doi.org/10.1021/acs.jcim.1c00362
- Lijuan Yang, Guanghui Yang, Xiaolong Chen, Qiong Yang, Xiaojun Yao, Zhitong Bing, Yuzhen Niu, Liang Huang, Lei Yang. Deep Scoring Neural Network Replacing the Scoring Function Components to Improve the Performance of Structure-Based Molecular Docking. ACS Chemical Neuroscience 2021, 12
(12)
, 2133-2142. https://doi.org/10.1021/acschemneuro.1c00110
- Raghuram Srinivas, Niraj Verma, Elfi Kraka, Eric C. Larson. Deep Learning-Based Ligand Design Using Shared Latent Implicit Fingerprints from Collaborative Filtering. Journal of Chemical Information and Modeling 2021, 61
(5)
, 2159-2174. https://doi.org/10.1021/acs.jcim.0c01355
- Francois Berenger, Ashutosh Kumar, Kam Y. J. Zhang, Yoshihiro Yamanishi. Lean-Docking: Exploiting Ligands’ Predicted Docking Scores to Accelerate Molecular Docking. Journal of Chemical Information and Modeling 2021, 61
(5)
, 2341-2352. https://doi.org/10.1021/acs.jcim.0c01452
- Hongyi Zhou, Hongnan Cao, Jeffrey Skolnick. FRAGSITE: A Fragment-Based Approach for Virtual Ligand Screening. Journal of Chemical Information and Modeling 2021, 61
(4)
, 2074-2089. https://doi.org/10.1021/acs.jcim.0c01160
- Joanna Zarnecka, Iva Lukac, Stephen J. Messham, Alhusein Hussin, Francesco Coppola, Steven J. Enoch, Alexander G. Dossetter, Edward J. Griffen, Andrew G. Leach. Mapping Ligand-Shape Space for Protein–Ligand Systems: Distinguishing Key-in-Lock and Hand-in-Glove Proteins. Journal of Chemical Information and Modeling 2021, 61
(4)
, 1859-1874. https://doi.org/10.1021/acs.jcim.1c00089
- Chao Li, Jun Sun, Vasile Palade. MSLDOCK: Multi-Swarm Optimization for Flexible Ligand Docking and Virtual Screening. Journal of Chemical Information and Modeling 2021, 61
(3)
, 1500-1515. https://doi.org/10.1021/acs.jcim.0c01358
- Reed M. Stein, Ying Yang, Trent E. Balius, Matt J. O’Meara, Jiankun Lyu, Jennifer Young, Khanh Tang, Brian K. Shoichet, John J. Irwin. Property-Unmatched Decoys in Docking Benchmarks. Journal of Chemical Information and Modeling 2021, 61
(2)
, 699-714. https://doi.org/10.1021/acs.jcim.0c00598
- Robert Schmidt, Florian Krull, Anna Lina Heinzke, Matthias Rarey. Disconnected Maximum Common Substructures under Constraints. Journal of Chemical Information and Modeling 2021, 61
(1)
, 167-178. https://doi.org/10.1021/acs.jcim.0c00741
- Katherine J. Schultz, Sean M. Colby, Vivian S. Lin, Aaron T. Wright, Ryan S. Renslow. Ligand- and Structure-Based Analysis of Deep Learning-Generated Potential α2a Adrenoceptor Agonists. Journal of Chemical Information and Modeling 2021, 61
(1)
, 481-492. https://doi.org/10.1021/acs.jcim.0c01019
- Hugo Guterres, Sang-Jun Park, Wei Jiang, Wonpil Im. Ligand-Binding-Site Refinement to Generate Reliable Holo Protein Structure Conformations from Apo Structures. Journal of Chemical Information and Modeling 2021, 61
(1)
, 535-546. https://doi.org/10.1021/acs.jcim.0c01354
Figure 2
Figure 2. Ligand clustering. (A) The seventh largest Murcko cluster of kinesin-like protein 1 (KIF11), showing both the scaffold (left) and all seven member ligands. (B) Number of ligands in each of the 70 KIF11 Bemis–Murcko atomic frameworks. We removed lower affinity compounds from over-represented clusters (above the line), while retaining 100 ligands. (C) Number of adenosine A2A receptor (AA2AR) Murcko clusters plotted against affinity threshold. Fewer than 600 clusters are present using a 30 nM affinity threshold.
Figure 3
Figure 3. Decoy generation. (A) Three key “warhead” groups from factor Xa (FA10), glycinamide ribonucleotide transformylase (PUR2), and thymidine kinase (KITH). (B) Fraction of warheads remaining is plotted against the dissimilarity method. The dissimilarity methods consist of a fingerprint (Daylight or ECFP4) and either a hard cutoff or a fraction of the most dissimilar decoys to be retained. (C) Property distributions of estrogen receptor α (ESR1) for both the 383 ligands (blue) and the 20685 property-matched decoys (red).
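The two dissimilarity strategies compared in panel B can be illustrated with a minimal sketch. It assumes each fingerprint is already available as a set of "on" bit indices (a stand-in for ECFP4 or Daylight bits), scores each candidate decoy by its maximum Tanimoto similarity to any ligand, and then applies either a hard cutoff or keeps a fixed fraction of the most dissimilar decoys. The function names, the toy fingerprints, and the use of maximum similarity as the ranking key are illustrative assumptions, not the published pipeline.

```python
from typing import Dict, FrozenSet, Iterable, List

# Stand-in for the "on" bits of a topological fingerprint (assumption).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two bit sets (0.0 if both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity(decoy: Fingerprint, ligands: Iterable[Fingerprint]) -> float:
    """Highest Tanimoto similarity between one decoy and any ligand."""
    return max((tanimoto(decoy, lig) for lig in ligands), default=0.0)


def filter_by_cutoff(decoys: Dict[str, Fingerprint],
                     ligands: Iterable[Fingerprint],
                     cutoff: float) -> List[str]:
    """Hard cutoff: keep decoys whose maximum similarity is below `cutoff`."""
    ligands = list(ligands)
    return [name for name, fp in decoys.items()
            if max_similarity(fp, ligands) < cutoff]


def filter_by_fraction(decoys: Dict[str, Fingerprint],
                       ligands: Iterable[Fingerprint],
                       fraction: float) -> List[str]:
    """Fraction: keep the `fraction` of decoys that are most dissimilar."""
    ligands = list(ligands)
    # Sort by ascending maximum similarity; the name breaks ties deterministically.
    ranked = sorted(decoys,
                    key=lambda name: (max_similarity(decoys[name], ligands), name))
    return ranked[:int(len(ranked) * fraction)]


# Toy data (hypothetical bit sets, not real fingerprints).
toy_ligands = [frozenset({1, 2, 3, 4})]
toy_decoys = {
    "d1": frozenset({1, 2, 3, 5}),   # Tc = 3/5
    "d2": frozenset({7, 8, 9}),      # Tc = 0
    "d3": frozenset({1, 7, 8}),      # Tc = 1/6
    "d4": frozenset({3, 4, 9, 10}),  # Tc = 2/6
}
```

A hard cutoff can leave very few decoys for a target whose ligands resemble much of the database, whereas retaining a fraction always yields the same number of decoys per ligand set; that difference is the trade-off the panel explores.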
Figure 4
Figure 4. Retrospective enrichment comparing ligand desolvation and electrostatics methods. Docking results over DUD-E as measured by LogAUC. “None” has no ligand desolvation term, “SEV” uses solvent-excluded volume ligand desolvation, and “Thin” employs a thin low-dielectric layer in the electrostatic calculations.
Figure 5
Figure 5. Representative ROC plots. ROC plots using no desolvation (None), solvent-excluded volume ligand desolvation (SEV), the thin low-dielectric layer (Thin), or a drug-like background that consists of all ChEMBL12 ligands with affinities better than 10 μM (Drug-like). The black dotted line represents the results expected from docking ligands randomly. LogAUC percentages are reported in the legend text.
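As a rough illustration of the quantities behind these plots, the sketch below builds a step-wise ROC curve from ranked docking scores and integrates it both linearly (AUC) and on a logarithmic false-positive axis (LogAUC). It assumes that lower scores rank better, that ties can be ignored, and that the logarithmic integral runs from a false-positive rate of λ = 0.001 to 1 and is normalized by log10(1/λ); the bound λ and all function names are assumptions made for this example. Under these assumptions a random ranking (TPR = FPR) gives (1 − λ)/ln(1/λ) ≈ 14.46%. The function returns the unadjusted value and does not subtract that baseline.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false-positive rate, true-positive rate)


def roc_points(scores: Sequence[float], is_ligand: Sequence[bool]) -> List[Point]:
    """Step-wise ROC curve; lower score = better rank (assumption), ties ignored."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n_lig = sum(is_ligand)
    n_dec = len(is_ligand) - n_lig
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if is_ligand[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_dec, tp / n_lig))
    return points


def auc(points: Sequence[Point]) -> float:
    """Linear area under the ROC curve (trapezoid rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def log_auc(points: Sequence[Point], lam: float = 0.001) -> float:
    """Area under the ROC curve on a log10 FPR axis from `lam` to 1, normalized to [0, 1]."""
    area = 0.0
    for (x0, y0), (x1, _y1) in zip(points, points[1:]):
        # Vertical steps and segments entirely below `lam` contribute nothing.
        if x1 == x0 or x1 <= lam:
            continue
        lower = max(x0, lam)
        # On a horizontal step the height is constant, so y0 is the segment height.
        area += y0 * (math.log10(x1) - math.log10(lower))
    return area / math.log10(1.0 / lam)


# Toy ranking: ligand, decoy, ligand, decoy (hypothetical scores).
toy_points = roc_points([-10.0, -9.0, -8.0, -7.0], [True, False, True, False])
```

Because the logarithmic axis stretches the region of low false-positive rates, a ranking that places ligands very early gains far more LogAUC than AUC, which is why the two metrics can order methods differently.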
Figure 6
Figure 6. Representative docking poses. The crystallographic ligand was rebuilt and docked from scratch. (A–F) The crystal pose (magenta) is compared to the resulting docked pose (green). In (C), more ligand conformations are generated and the redocked pose is also shown (tan). Key hydrogen bonds are shown by black dotted lines, and the partially transparent protein surface is colored by atom type.
References
This article references 53 other publications.
- 1Kitchen, D. B.; Decornez, H.; Furr, J. R.; Bajorath, J. Docking and scoring in virtual screening for drug discovery: methods and applications Nature Rev. Drug Discovery 2004, 3, 935– 949Google Scholar1https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2cXptFemtrg%253D&md5=875a2b37a4299181509a1922b11dbd2fDocking and scoring in virtual screening for drug discovery: methods and applicationsKitchen, Douglas B.; Decornez, Helene; Furr, John R.; Bajorath, JuergenNature Reviews Drug Discovery (2004), 3 (11), 935-949CODEN: NRDDAG; ISSN:1474-1776. (Nature Publishing Group)A review. Computational approaches that 'dock' small mols. into the structures of macromol. targets and 'score' their potential complementarity to binding sites are widely used in hit identification and lead optimization. Indeed, there are now a no. of drugs whose development was heavily influenced by or based on structure-based design and screening strategies, such as HIV protease inhibitors. Nevertheless, there remain significant challenges in the application of these approaches, in particular in relation to current scoring schemes. Here, we review key concepts and specific features of small-mol.-protein docking methods, highlight selected applications and discuss recent advances that aim to address the acknowledged limitations of established approaches.
- 2Kolb, P.; Rosenbaum, D. M.; Irwin, J. J.; Fung, J. J.; Kobilka, B. K.; Shoichet, B. K. Structure-based discovery of beta(2)-adrenergic receptor ligands Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 6843– 6848Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXlsV2qsro%253D&md5=4d1a4cb2aa3925aa4c99c6b0496417a7Structure-based discovery of β2-adrenergic receptor ligandsKolb, Peter; Rosenbaum, Daniel M.; Irwin, John J.; Fung, Juan Jose; Kobilka, Brian K.; Shoichet, Brian K.Proceedings of the National Academy of Sciences of the United States of America (2009), 106 (16), 6843-6848CODEN: PNASA6; ISSN:0027-8424. (National Academy of Sciences)Aminergic G protein-coupled receptors (GPCRs) have been a major focus of pharmaceutical research for many years. Due partly to the lack of reliable receptor structures, drug discovery efforts have been largely ligand-based. The recently detd. X-ray structure of the β2-adrenergic receptor offers an opportunity to investigate the advantages and limitations inherent in a structure-based approach to ligand discovery against this and related GPCR targets. Approx. 1 million com. available, "lead-like" mols. were docked against the β2-adrenergic receptor structure. On testing of 25 high-ranking mols., 6 were active with binding affinities <4 μM, with the best mol. binding with a Ki of 9 nM (95% confidence interval 7-10 nM). Five of these mols. were inverse agonists. The high hit rate, the high affinity of the most potent mol., the discovery of unprecedented chemotypes among the new inhibitors, and the apparent bias toward inverse agonists among the docking hits, have implications for structure-based approaches against GPCRs that recognize small org. mols.
- 3Mysinger, M. M.; Weiss, D. R.; Ziarek, J. J.; Gravel, S.; Doak, A. K.; Karpiak, J.; Heveker, N.; Shoichet, B. K.; Volkman, B. F. Structure-based ligand discovery for the protein–protein interface of chemokine receptor CXCR4 Proc. Natl. Acad. Sci. U.S.A. 2012, 109, 5517– 5522Google ScholarThere is no corresponding record for this reference.
- 4Gruneberg, S.; Stubbs, M. T.; Klebe, G. Successful virtual screening for novel inhibitors of human carbonic anhydrase: strategy and experimental confirmation J. Med. Chem. 2002, 45, 3588– 3602Google ScholarThere is no corresponding record for this reference.
- 5Jain, A. N.; Nicholls, A. Recommendations for evaluation of computational methods J. Comput.-Aided Mol. Des. 2008, 22, 133– 139Google Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXjsFOnsb0%253D&md5=9576e150079dcbf6e4bfea5070082016Recommendations for evaluation of computational methodsJain, Ajay N.; Nicholls, AnthonyJournal of Computer-Aided Molecular Design (2008), 22 (3-4), 133-139CODEN: JCADEQ; ISSN:0920-654X. (Springer)A review. The field of computational chem., particularly as applied to drug design, has become increasingly important in terms of the practical application of predictive modeling to pharmaceutical research and development. Tools for exploiting protein structures or sets of ligands known to bind particular targets can be used for binding-mode prediction, virtual screening, and prediction of activity. A serious weakness within the field is a lack of stds. with respect to quant. evaluation of methods, data set prepn., and data set sharing. Our goal should be to report new methods or comparative evaluations of methods in a manner that supports decision making for practical applications. Here we propose a modest beginning, with recommendations for requirements on statistical reporting, requirements for data sharing, and best practices for benchmark prepn. and usage.
- 6Babaoglu, K.; Simeonov, A.; Irwin, J. J.; Nelson, M. E.; Feng, B.; Thomas, C. J.; Cancian, L.; Costi, M. P.; Maltby, D. A.; Jadhav, A.; Inglese, J.; Austin, C. P.; Shoichet, B. K. Comprehensive mechanistic analysis of hits from high-throughput and docking screens against beta-lactamase J. Med. Chem. 2008, 51, 2502– 2511Google Scholar6https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXjtFSksb0%253D&md5=7736429aeafc6b17a5d0182e24df0dd4Comprehensive Mechanistic Analysis of Hits from High-Throughput and Docking Screens against β-LactamaseBabaoglu, Kerim; Simeonov, Anton; Irwin, John J.; Nelson, Michael E.; Feng, Brian; Thomas, Craig J.; Cancian, Laura; Costi, M. Paola; Maltby, David A.; Jadhav, Ajit; Inglese, James; Austin, Christopher P.; Shoichet, Brian K.Journal of Medicinal Chemistry (2008), 51 (8), 2502-2511CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)High-throughput screening (HTS) is widely used in drug discovery. Esp. for screens of unbiased libraries, false positives can dominate "hit lists"; their origins are much debated. Here we det. the mechanism of every active hit from a screen of 70,563 unbiased mols. against β-lactamase using quant. HTS (qHTS). Of the 1274 initial inhibitors, 95% were detergent-sensitive and were classified as aggregators. Among the 70 remaining were 25 potent, covalent-acting β-lactams. Mass spectra, counter-screens, and crystallog. identified 12 as promiscuous covalent inhibitors. The remaining 33 were either aggregators or irreproducible. No specific reversible inhibitors were found. We turned to mol. docking to prioritize mols. from the same library for testing at higher concns. Of 16 tested, 2 were modest inhibitors. Subsequent X-ray structures corresponded to the docking prediction. Analog synthesis improved affinity to 8 μM. These results suggest that it may be the phys. behavior of org. mols., not their reactivity, that accounts for most screening artifacts. 
Structure-based methods may prioritize weak-but-novel chemotypes in unbiased library screens.
- 7Ferreira, R. S.; Simeonov, A.; Jadhav, A.; Eidam, O.; Mott, B. T.; Keiser, M. J.; McKerrow, J. H.; Maloney, D. J.; Irwin, J. J.; Shoichet, B. K. Complementarity between a docking and a high-throughput screen in discovering new cruzain inhibitors J. Med. Chem. 2010, 53, 4891– 4905Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXntlWrurk%253D&md5=0661663ff0802bf981a1997eeca9fa05Complementarity Between a Docking and a High-Throughput Screen in Discovering New Cruzain InhibitorsFerreira, Rafaela S.; Simeonov, Anton; Jadhav, Ajit; Eidam, Oliv; Mott, Bryan T.; Keiser, Michael J.; McKerrow, James H.; Maloney, David J.; Irwin, John J.; Shoichet, Brian K.Journal of Medicinal Chemistry (2010), 53 (13), 4891-4905CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Virtual and high-throughput screens (HTS) should have complementary strengths and weaknesses, but studies that prospectively and comprehensively compare them are rare. We undertook a parallel docking and HTS screen of 197861 compds. against cruzain, a thiol protease target for Chagas disease, looking for reversible, competitive inhibitors. On workup, 99% of the hits were eliminated as false positives, yielding 146 well-behaved, competitive ligands. These fell into five chemotypes: two were prioritized by scoring among the top 0.1% of the docking-ranked library, two were prioritized by behavior in the HTS and by clustering, and one chemotype was prioritized by both approaches. Detn. of an inhibitor/cruzain crystal structure and comparison of the high-scoring docking hits to expt. illuminated the origins of docking false-negatives and false-positives. Prioritizing mols. that are both predicted by docking and are HTS-active yields well-behaved mols., relatively unobscured by the false-positives to which both techniques are individually prone.
- 8Gohlke, H.; Klebe, G. Approaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptors Angew. Chem., Int. Ed. Engl. 2002, 41, 2644– 2676Google Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD38Xmt1entbw%253D&md5=d2e5e2490fe8909e5899a1a89630223fApproaches to the description and prediction of the binding affinity of small-molecule ligands to macromolecular receptorsGohlke, Holger; Klebe, GerhardAngewandte Chemie, International Edition (2002), 41 (15), 2644-2676CODEN: ACIEF5; ISSN:1433-7851. (Wiley-VCH Verlag GmbH)A review. The influence of a xenobiotic compd. on an organism is usually summarized by the expression biol. activity. If a controlled, therapeutically relevant, and regulatory action is obsd. the compd. has potential as a drug, otherwise its toxicity on the biol. system is of interest. However, what do we understand by the biol. activity. In principle, the overall effect on an organism has to be considered. However, because of the complexity of the interrelated processes involved, as a simplification primarily the "main action" on the organism is taken into consideration. On the mol. level, biol. activity corresponds to the binding of a (lowmol. wt.) compd. to a macromol. receptor, usually a protein. Enzymic reactions or signal-transduction cascades are thereby influenced with respect to their function for the organism. We regard this binding as a process under equil. conditions; thus, binding can be described as an assocn. or dissocn. process. Accordingly, biol. activity is expressed as the affinity of both partners for each other, as a thermodn. equil. quantity. How well do we understand these terms and how well are they theor. predictable today. The holy grail of rational drug design is the prediction of the biol. activity of a compd. 
The processes involving ligand binding are extremely complicated, both ligand and protein are flexible mols., and the energy inventory between the bound and unbound states must be considered in aq. soln. How sophisticated and reliable are our exptl. approaches to obtaining the necessary insight. The present review summarizes our current understanding of the binding affinity of a small-mol. ligand to a protein. Both theor. and empirical approaches for predicting binding affinity, starting from the three-dimensional structure of a protein-ligand complex, will be described and compared. Exptl. methods, primarily microcalorimetry, will be discussed. As a perspective, our own knowledge-based approach towards affinity prediction and exptl. data on factorizing binding contributions to protein-ligand binding will be presented.
- 9Enyedy, I. J.; Egan, W. J. Can we use docking and scoring for hit-to-lead optimization? J. Comput.-Aided Mol. Des. 2008, 22, 161– 168Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXjsFOnsbk%253D&md5=24adfd0e8bb71c0e7b32a9208e7ab4d1Can we use docking and scoring for hit-to-lead optimization?Enyedy, Istvan J.; Egan, William J.Journal of Computer-Aided Molecular Design (2008), 22 (3-4), 161-168CODEN: JCADEQ; ISSN:0920-654X. (Springer)Docking and scoring is currently one of the tools used for hit finding and hit-to-lead optimization when structural information about the target is known. Docking scores have been found useful for optimizing ligand binding to reproduce exptl. obsd. binding modes. The question is, can docking and scoring be used reliably for hit-to-lead optimization To illustrate the challenges of scoring for hit-to-lead optimization, the relationship of docking scores with exptl. detd. IC50 values measured inhouse were tested. The influences of the particular target, crystal structure, and the precision of the scoring function on the ability to differentiate between actives and inactives were analyzed by calcg. the area under the curve of receiver operator characteristic curves for docking scores. It was found that for the test sets considered, MW and sometimes ClogP were as useful as GlideScores and no significant difference was obsd. between SP and XP scores for differentiating between actives and inactives. Interpretation by an expert is still required to successfully utilize docking and scoring in hit-to-lead optimization.
- 10Stahl, M.; Rarey, M. Detailed analysis of scoring functions for virtual screening J. Med. Chem. 2001, 44, 1035– 1042Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3MXhsFKitro%253D&md5=9fe0ea156ae846868e710115e1d71500Detailed Analysis of Scoring Functions for Virtual ScreeningStahl, Martin; Rarey, MatthiasJournal of Medicinal Chemistry (2001), 44 (7), 1035-1042CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)The authors present a comprehensive study of the performance of fast scoring functions for library docking using the program FlexX as the docking engine. Four scoring functions, among them two recently developed knowledge-based potentials, are evaluated on seven target proteins whose binding sites represent a wide range of size, form, and polarity. The results of these calcns. give valuable insight into strengths and weaknesses of current scoring functions. Furthermore, it is shown that a well-chosen combination of two of the tested scoring functions leads to a new, robust scoring scheme with superior performance in virtual screening.
- 11. Bissantz, C.; Folkers, G.; Rognan, D. Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J. Med. Chem. 2000, 43, 4759–4767.
- 12. Pham, T. A.; Jain, A. N. Parameter estimation for scoring protein–ligand interactions using negative training data. J. Med. Chem. 2006, 49, 5856–5868.
- 13. Kellenberger, E.; Rodrigo, J.; Muller, P.; Rognan, D. Comparative evaluation of eight docking tools for docking and virtual screening accuracy. Proteins 2004, 57, 225–242.
- 14. Ferrara, P.; Gohlke, H.; Price, D. J.; Klebe, G.; Brooks, C. L., III. Assessing scoring functions for protein–ligand interactions. J. Med. Chem. 2004, 47, 3032–3047.
- 15. Huang, N.; Shoichet, B. K.; Irwin, J. J. Benchmarking sets for molecular docking. J. Med. Chem. 2006, 49, 6789–6801.
- 16. Christofferson, A. J.; Huang, N. How to benchmark methods for structure-based virtual screening of large compound libraries. In Computational Drug Discovery and Design (Methods in Molecular Biology); 2011/12/21 ed.; Baron, R., Ed.; Springer Protocols: New York, 2012; Vol. 819, Chapter 13, pp 187–195.
- 17. Verdonk, M. L.; Berdini, V.; Hartshorn, M. J.; Mooij, W. T.; Murray, C. W.; Taylor, R. D.; Watson, P. Virtual screening using protein–ligand docking: avoiding artificial enrichment. J. Chem. Inf. Comput. Sci. 2004, 44, 793–806.
- 18. Kuntz, I. D.; Chen, K.; Sharp, K. A.; Kollman, P. A. The maximal affinity of ligands. Proc. Natl. Acad. Sci. U.S.A. 1999, 96, 9997–10002.
- 19. Fan, H.; Irwin, J. J.; Webb, B. M.; Klebe, G.; Shoichet, B. K.; Sali, A. Molecular docking screens using comparative models of proteins. J. Chem. Inf. Model. 2009, 49, 2512–2527.
- 20. Repasky, M. P.; Murphy, R. B.; Banks, J. L.; Greenwood, J. R.; Tubert-Brohman, I.; Bhat, S.; Friesner, R. A. Docking performance of the glide program as evaluated on the Astex and DUD datasets: a complete set of glide SP results and selected results for a new scoring function integrating WaterMap and glide. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9575-9.
- 21. Brozell, S. R.; Mukherjee, S.; Balius, T. E.; Roe, D. R.; Case, D. A.; Rizzo, R. C. Evaluation of DOCK 6 as a pose generation and database enrichment tool. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9565-y.
- 22. Neves, M. A.; Totrov, M.; Abagyan, R. Docking and scoring with ICM: the benchmarking results and strategies for improvement. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9547-0.
- 23. Spitzer, R.; Jain, A. N. Surflex-Dock: docking benchmarks and real-world application. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-011-9533-y.
- 24. Schneider, N.; Hindle, S.; Lange, G.; Klein, R.; Albrecht, J.; Briem, H.; Beyer, K.; Claussen, H.; Gastreich, M.; Lemmen, C.; Rarey, M. Substantial improvements in large-scale redocking and screening using the novel HYDE scoring function. J. Comput.-Aided Mol. Des. 2011, DOI: 10.1007/s10822-011-9531-0.
- 25. Liebeschuetz, J. W.; Cole, J. C.; Korb, O. Pose prediction and virtual screening performance of GOLD scoring functions in a standardized test. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9551-4.
- 26. Novikov, F. N.; Stroylov, V. S.; Zeifman, A. A.; Stroganov, O. V.; Kulkov, V.; Chilov, G. G. Lead Finder docking and virtual screening evaluation with Astex and DUD test sets. J. Comput.-Aided Mol. Des. 2012, DOI: 10.1007/s10822-012-9549-y.
- 27. Good, A. C.; Oprea, T. I. Optimization of CAMD techniques 3. Virtual screening enrichment studies: a help or hindrance in tool selection? J. Comput.-Aided Mol. Des. 2008, 22, 169–178.
- 28. Mackey, M. D.; Melville, J. L. Better than random? The chemotype enrichment problem. J. Chem. Inf. Model. 2009, 49, 1154–1162.
- 29. Hawkins, P. C.; Warren, G. L.; Skillman, A. G.; Nicholls, A. How to do an evaluation: pitfalls and traps. J. Comput.-Aided Mol. Des. 2008, 22, 179–190.
- 30. Irwin, J. J. Community benchmarks for virtual screening. J. Comput.-Aided Mol. Des. 2008, 22, 193–199.
- 31. Mysinger, M. M.; Shoichet, B. K. Rapid context-dependent ligand desolvation in molecular docking. J. Chem. Inf. Model. 2010, 50, 1561–1573.
- 32. Vogel, S. M.; Bauer, M. R.; Boeckler, F. M. DEKOIS: demanding evaluation kits for objective in silico screening - a versatile tool for benchmarking docking programs and scoring functions. J. Chem. Inf. Model. 2011, 51, 2650–2665.
- 33. Wallach, I.; Lilien, R. Virtual decoy sets for molecular docking benchmarks. J. Chem. Inf. Model. 2011, 51, 196–202.
- 34. Gatica, E. A.; Cavasotto, C. N. Ligand and decoy sets for docking to G protein-coupled receptors. J. Chem. Inf. Model. 2012, 52, 1–6.
- 35. Cereto-Massague, A.; Guasch, L.; Valls, C.; Mulero, M.; Pujadas, G.; Garcia-Vallve, S. DecoyFinder: an easy-to-use python GUI application for building target-specific decoy sets. Bioinformatics 2012, 28, 1661–1662.
- 36. Rohrer, S. G.; Baumann, K. Maximum unbiased validation (MUV) data sets for virtual screening based on PubChem bioactivity data. J. Chem. Inf. Model. 2009, 49, 169–184.
- 37. Ripphausen, P.; Wassermann, A. M.; Bajorath, J. REPROVIS-DB: a benchmark system for ligand-based virtual screening derived from reproducible prospective applications. J. Chem. Inf. Model. 2011, 51, 2467–2473.
- 38. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107.
- 39. Bemis, G. W.; Murcko, M. A. The properties of known drugs. 1. Molecular frameworks. J. Med. Chem. 1996, 39, 2887–2893.
- 40. Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. The Protein Data Bank. Nucleic Acids Res. 2000, 28, 235–242.
- 41. Apweiler, R.; Bairoch, A.; Wu, C. H.; Barker, W. C.; Boeckmann, B.; Ferro, S.; Gasteiger, E.; Huang, H.; Lopez, R.; Magrane, M.; Martin, M. J.; Natale, D. A.; O’Donovan, C.; Redaschi, N.; Yeh, L. S. UniProt: The Universal Protein Knowledgebase. Nucleic Acids Res. 2004, 32, D115–D119.
- 42. Irwin, J. J.; Shoichet, B. K.; Mysinger, M. M.; Huang, N.; Colizzi, F.; Wassam, P.; Cao, Y. Automated docking screens: a feasibility study. J. Med. Chem. 2009, 52, 5712–5720.
- 43. Powers, R. A.; Morandi, F.; Shoichet, B. K. Structure-based discovery of a novel, noncovalent inhibitor of AmpC beta-lactamase. Structure 2002, 10, 1013–1023.
- 44. Carlsson, J.; Yoo, L.; Gao, Z. G.; Irwin, J. J.; Shoichet, B. K.; Jacobson, K. A. Structure-based discovery of A2A adenosine receptor ligands. J. Med. Chem. 2010, 53, 3748–3755.
- 45. Carlsson, J.; Coleman, R. G.; Setola, V.; Irwin, J. J.; Fan, H.; Schlessinger, A.; Sali, A.; Roth, B. L.; Shoichet, B. K. Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nat. Chem. Biol. 2011, 7, 769–778.
- 46. Irwin, J. J.; Shoichet, B. K. ZINC—a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model. 2005, 45, 177–182.
- 47. Velankar, S.; McNeil, P.; Mittard-Runte, V.; Suarez, A.; Barrell, D.; Apweiler, R.; Henrick, K. E-MSD: an integrated data resource for bioinformatics. Nucleic Acids Res. 2005, 33, D262–D265.
- 48. Hawkins, P. C.; Skillman, A. G.; Nicholls, A. Comparison of shape-matching and docking as virtual screening tools. J. Med. Chem. 2007, 50, 74–82.
- 49. Teotico, D. G.; Babaoglu, K.; Rocklin, G. J.; Ferreira, R. S.; Giannetti, A. M.; Shoichet, B. K. Docking for fragment inhibitors of AmpC beta-lactamase. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 7455–7460.
- 50. Tondi, D.; Morandi, F.; Bonnet, R.; Costi, M. P.; Shoichet, B. K. Structure-based optimization of a non-beta-lactam lead results in inhibitors that do not up-regulate beta-lactamase expression in cell culture. J. Am. Chem. Soc. 2005, 127, 4632–4639.
- 51. Graves, A. P.; Brenk, R.; Shoichet, B. K. Decoys for docking. J. Med. Chem. 2005, 48, 3714–3728.
- 52. Hawkins, P. C.; Skillman, A. G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer generation with OMEGA: algorithm and validation using high quality structures from the Protein Data Bank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584.
- 53. Jain, A. N. Bias, reporting, and sharing: computational evaluations of docking methods. J. Comput.-Aided Mol. Des. 2008, 22, 201–212.
Supporting Information
A figure shows the DUD-E workflows, tables provide detailed target-by-target data, and tab-delimited text files provide the raw data. This material is available free of charge via the Internet at http://pubs.acs.org.