J. Chem. Inf. Comput. Sci., 38 (6), 1079-1086, 1998. 10.1021/ci980107u S0095-2338(98)00107-3
Web Release Date: October 2, 1998

Copyright © 1998 American Chemical Society

Balancing Representativeness Against Diversity using Optimizable K-Dissimilarity and Hierarchical Clustering

Robert D. Clark* and William J. Langton

Tripos, Inc., 1699 South Hanley Road, St. Louis, Missouri 63144

Received June 6, 1998

Abstract:

When assessing the pharmacological potential of large libraries of compounds, it is often useful to start by determining the biochemical activities of some subset thereof. This is so whether the compounds in question have in fact already been synthesized or exist solely as virtual libraries. A suitable subset for this task must be structurally diverse, so as to minimize redundant testing, but must also be representative, so that valuable subgroups do not get overlooked. These two needs are intrinsically in conflict, with gains in one necessarily coming at the expense of the other. Results obtained using optimizable K-dissimilarity selection and clustering are described and compared with those obtained using more traditional agglomerative hierarchical clustering techniques.
