Molecular Similarity Searching Using Atom Environments, Information-Based Feature Selection, and a Naïve Bayesian Classifier

Andreas Bender,* Hamse Y. Mussa, and Robert C. Glen
Unilever Centre for Molecular Science Informatics, Chemistry Department, University of Cambridge, Cambridge CB2 1EW, United Kingdom
Stephan Reiling
Aventis Pharmaceuticals, 1041 Route 202-206, Bridgewater, New Jersey 08807
J. Chem. Inf. Comput. Sci., 2004, 44 (1), pp 170–178
DOI: 10.1021/ci034207y
Publication Date (Web): December 24, 2003
Copyright © 2004 American Chemical Society
* Corresponding author phone: +44 (1223) 763 073; fax: +44 (1223) 763 076; e-mail: ab454@cam.ac.uk.

Abstract

A novel technique for similarity searching is introduced. Molecules are represented by atom environments, which are fed into an information-gain-based feature selection. A naïve Bayesian classifier is then employed for compound classification. The new method is tested by its ability to retrieve five sets of active molecules seeded in the MDL Drug Data Report (MDDR). In comparison experiments, the algorithm outperforms all current retrieval methods assessed here using two- and three-dimensional descriptors and offers insight into the significance of structural components for binding.
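The pipeline outlined in the abstract (binary structural features, information-gain ranking, naïve Bayesian scoring) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each molecule is already reduced to a set of feature strings standing in for atom environments, and the function names, the add-one smoothing, and the choice to score only features present in the query are all illustrative assumptions.

```python
import math


def entropy(pos, neg):
    """Shannon entropy in bits of a two-class split with the given counts."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(feature, actives, inactives):
    """Entropy reduction obtained by splitting the training set on
    presence or absence of `feature`."""
    a_with = sum(feature in m for m in actives)
    i_with = sum(feature in m for m in inactives)
    a_without = len(actives) - a_with
    i_without = len(inactives) - i_with
    n = len(actives) + len(inactives)
    h_before = entropy(len(actives), len(inactives))
    h_after = ((a_with + i_with) / n) * entropy(a_with, i_with) \
        + ((a_without + i_without) / n) * entropy(a_without, i_without)
    return h_before - h_after


def select_features(actives, inactives, k):
    """Return the k features with the highest information gain.
    Ties are broken alphabetically so the result is deterministic."""
    features = set().union(*actives, *inactives)
    ranked = sorted(
        features,
        key=lambda f: (-information_gain(f, actives, inactives), f),
    )
    return ranked[:k]


def log_odds_score(molecule, features, actives, inactives):
    """Naive Bayes log-odds of active versus inactive, summed over the
    selected features present in `molecule`. Add-one smoothing (an
    assumption of this sketch) keeps the ratio finite."""
    score = 0.0
    for f in features:
        if f in molecule:
            p_active = (sum(f in m for m in actives) + 1) / (len(actives) + 2)
            p_inactive = (sum(f in m for m in inactives) + 1) / (len(inactives) + 2)
            score += math.log(p_active / p_inactive)
    return score
```

Ranking a database by `log_odds_score` in descending order would place the molecules most likely to be active at the top. The per-feature log ratios also show which features push a molecule toward the active class, which is the kind of interpretability the abstract refers to.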

History

  • Published In Issue January 26, 2004
  • Received September 15, 2003
