New Diversity Calculations Algorithms Used for Compound Selection

Sergei V. Trepalin,* Vadim A. Gerasimenko, Andrey V. Kozyukov, Nikolay Ph. Savchuk, and Andrey A. Ivaschenko
Institute of Physiologically Active Compounds, 142432 Chernogolovka, Moscow Region, Russia, and ChemDiv, Inc. 11575 Sorrento Valley Road, #210, San Diego, California 92121
J. Chem. Inf. Comput. Sci., 2002, 42 (2), pp 249–258
DOI: 10.1021/ci0100649
Publication Date (Web): February 1, 2002
Copyright © 2002 American Chemical Society
* Corresponding author. Phone: (096)524-8269; e-mail: Sergey_Trepalin@chemdiv.com; trep@ism.ac.ru.

Abstract

Several modifications were introduced into the previously described Centroid diversity sorting algorithm, which uses the cosine similarity metric. The modified algorithm is suitable for working with large databases on personal computers: diversity sorting of a database of more than one million records takes less than 9 h (Pentium III, 800 MHz). The problem of selecting new compounds to add to an existing collection so as to maximize the diversity of that collection is also examined. Finally, the article describes a new algorithm for the selection of heterocyclic compounds.
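
To make the idea of centroid-based diversity sorting with cosine similarity concrete, here is a minimal sketch in Python. It is not the authors' modified algorithm, whose details the abstract does not give. It only illustrates the general centroid idea under stated assumptions: each record is a numeric descriptor vector, and at every step the record with the smallest summed cosine similarity to the records already picked is taken next. The names `normalize`, `diversity_order`, and `start` are hypothetical and chosen for illustration only.

```python
import math


def normalize(v):
    """Scale a descriptor vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def diversity_order(vectors, start=0):
    """Greedy ordering (illustrative assumption, not the published method):
    each step picks the record whose summed cosine similarity to all
    records already picked is smallest."""
    unit = [normalize(v) for v in vectors]
    order = [start]
    # Running sum of the unit vectors picked so far (the unnormalized centroid).
    centroid = list(unit[start])
    remaining = set(range(len(unit))) - {start}
    while remaining:
        # For unit vectors, dot(u, centroid) equals the sum of the cosines
        # between u and every picked record, so one dot product per candidate
        # replaces a loop over all picked records.
        best = min(
            sorted(remaining),  # sorted so that ties go to the lowest index
            key=lambda i: sum(a * b for a, b in zip(unit[i], centroid)),
        )
        order.append(best)
        remaining.remove(best)
        centroid = [c + u for c, u in zip(centroid, unit[best])]
    return order


if __name__ == "__main__":
    toy = [[1, 0], [0.9, 0.1], [0, 1], [1, 1]]
    print(diversity_order(toy))
```

In this toy run the record orthogonal to the starting one is picked second, and the near-duplicate of the starting record comes before the diagonal record, which is similar to both records already picked. Keeping a running centroid means each step costs one dot product per remaining record, which is the property that makes this family of methods practical for large databases.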

History

  • Published In Issue March 25, 2002
  • Received June 30, 2001
