OptiSim: An Extended Dissimilarity Selection Method for Finding Diverse Representative Subsets

Robert D. Clark
Tripos, Inc., 1699 South Hanley Road, St. Louis, Missouri 63144
J. Chem. Inf. Comput. Sci., 1997, 37 (6), pp 1181–1188
DOI: 10.1021/ci970282v
Publication Date (Web): November 24, 1997
Copyright © 1997 American Chemical Society

Abstract

Compound selection methods currently available to chemists are based on maximum or minimum dissimilarity selection or on hierarchical clustering. Optimizable K-Dissimilarity Selection (OptiSim) is a novel and efficient stochastic selection algorithm which includes maximum and minimum dissimilarity-based selection as special cases. By adjusting the subsample size parameter K, it is possible to adjust the balance between representativeness and diversity in the compounds selected. The OptiSim algorithm is described, along with some analytical tools for comparing it to other selection methods. Such comparisons indicate that OptiSim can mimic the representativeness of selections based on hierarchical clustering and, at least in some cases, improve upon them.
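
The role of the subsample size K can be illustrated with a minimal sketch. The abstract names only K; the exclusion threshold, the recycling of unselected candidates, and all names below (`optisim_sketch`, `threshold`, `dist`) are assumptions made for illustration, not the author's implementation. With K = 1 the sketch reduces to a minimum dissimilarity style of selection (a random pick subject to the threshold), and with K equal to the pool size it reduces to maximum dissimilarity selection.

```python
import random


def optisim_sketch(points, n_select, k, dist, threshold=0.0, seed=0):
    """Illustrative K-subsample dissimilarity selection (hypothetical helper).

    Returns a list of up to n_select indices into points.
    Assumed details, not taken from the abstract: the exclusion
    threshold and the recycling of candidates that lose a round.
    """
    rng = random.Random(seed)
    pool = list(range(len(points)))
    rng.shuffle(pool)
    selected = [pool.pop()]  # random starting item
    recycle = []             # candidates that lost a round

    while len(selected) < n_select:
        subsample = []
        # Draw up to k candidates that are not too close to the selection.
        while len(subsample) < k and pool:
            cand = pool.pop()
            score = min(dist(points[cand], points[s]) for s in selected)
            if score >= threshold:
                subsample.append((score, cand))
            # Candidates below the threshold are discarded for good.

        if not subsample:
            if not recycle:
                break  # nothing left that satisfies the threshold
            pool, recycle = recycle, []
            rng.shuffle(pool)
            continue

        # Keep the candidate farthest from its nearest selected neighbour.
        best_score, best = max(subsample)
        selected.append(best)
        recycle.extend(c for _, c in subsample if c != best)

    return selected


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 10.0]
    distance = lambda a, b: abs(a - b)
    print(optisim_sketch(data, n_select=2, k=1, dist=distance))
    print(optisim_sketch(data, n_select=2, k=len(data), dist=distance))
```

Small K leaves the choice mostly to the random draw, so the selection tends to follow the density of the data (representativeness); large K makes each pick the most distant candidate available (diversity).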

Citing Articles

This article has been cited by 29 ACS Journal articles (5 most recent appear below).

  • Maximum-Score Diversity Selection for Early Drug Discovery

    Thorsten Meinl, Claude Ostermann, and Michael R. Berthold
    Journal of Chemical Information and Modeling 2011, 51 (2), 237-247

    Diversity selection is a common task in early drug discovery. One drawback of current approaches is that usually only the structural diversity is taken into account, therefore, activity information is ignored. In this article, we present a modified ...

  • Discovery of Novel GSK-3β Inhibitors with Potent in Vitro and in Vivo Activities and Excellent Brain Permeability Using Combined Ligand- and Structure-Based Virtual Screening

    Mohammad A. Khanfar, Ronald A. Hill, Amal Kaddoumi, and Khalid A. El Sayed
    Journal of Medicinal Chemistry 2010, 53 (24), 8534-8545

    Dysregulation of glycogen synthase kinase (GSK-3β) is implicated in the pathophysiology of many diseases, including type-2 diabetes, stroke, Alzheimer’s, and others. A multistage virtual screening strategy designed so as to overcome known caveats arising ...

  • Maximum Unbiased Validation (MUV) Data Sets for Virtual Screening Based on PubChem Bioactivity Data

    Sebastian G. Rohrer and Knut Baumann
    Journal of Chemical Information and Modeling 2009, 49 (2), 169-184

    Refined nearest neighbor analysis was recently introduced for the analysis of virtual screening benchmark data sets. It constitutes a technique from the field of spatial statistics and provides a mathematical framework for the nonparametric analysis of ...

  • Data Mining a Small Molecule Drug Screening Representative Subset from NIH PubChem

    Xiang-Qun Xie and Jian-Zhong Chen
    Journal of Chemical Information and Modeling 2008, 48 (3), 465-475

    PubChem is a scientific showcase of the NIH Roadmap Initiatives. It is a compound repository created to facilitate information exchange and data sharing among the NIH Roadmap-funded Molecular Library Screening Center Network (MLSCN) and the scientific ...

  • A Scalable Approach to Combinatorial Library Design for Drug Discovery

    Puneet Sharma, Srinivasa Salapaka, and Carolyn Beck
    Journal of Chemical Information and Modeling 2008, 48 (1), 27-41

    In this paper, we propose an algorithm for the design of lead generation libraries required in combinatorial drug discovery. This algorithm addresses simultaneously the two key criteria of diversity and representativeness of compounds in the resulting ...

History

  • Published In Issue November 24, 1997
  • Received April 7, 1997
