Web Release Date: October 2,
Balancing Representativeness Against Diversity using Optimizable K-Dissimilarity and Hierarchical Clustering
Received June 6, 1998
Abstract: When assessing the pharmacological potential of large libraries of compounds, it is often useful to start by
determining the biochemical activities of some subset thereof. This is so whether the compounds in question
have in fact already been synthesized or exist solely as virtual libraries. A suitable subset for this task must
be structurally diverse, so as to minimize redundant testing, but must also be representative, so that valuable
subgroups do not get overlooked. These two needs are intrinsically in conflict, with gains in one necessarily
coming at the expense of the other. Results obtained using optimizable K-dissimilarity selection and clustering
are described and compared with those obtained using more traditional agglomerative hierarchical clustering
techniques.