Virtual Screening of Molecular Databases Using a Support Vector Machine

Robert N. Jorissen and Michael K. Gilson*
Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, 9600 Gudelsky Drive, Rockville, Maryland 20850
J. Chem. Inf. Model., 2005, 45 (3), pp 549–561
DOI: 10.1021/ci049641u
Publication Date (Web): April 16, 2005
Copyright © 2005 American Chemical Society
* Corresponding author e-mail: gilson@umbi.edu.edu.

Abstract

The Support Vector Machine (SVM) is an algorithm that derives a model for classifying data into two categories and that has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives, using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.
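To make the distinction between classification and enrichment concrete, the following minimal sketch ranks molecules by a real-valued score and measures how strongly the top of the ranked list is enriched in actives. It is an illustration only, not the authors' method: the abstract does not specify how the standard SVM function is modified, so the scores here are hypothetical stand-ins for any SVM-style decision value, and the function name `enrichment_factor`, the example data, and the 30% cutoff are all assumptions.

```python
def enrichment_factor(scores, labels, top_fraction):
    """Active rate in the top-ranked fraction divided by the overall active rate.

    scores: real-valued ranking scores (higher = more likely active).
    labels: 1 for a known active, 0 for an inactive.
    top_fraction: fraction of the ranked list to inspect, e.g. 0.3.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal in length")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active is required")

    # Rank by score instead of thresholding it into two classes.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(n_total * top_fraction))
    actives_in_top = sum(label for _, label in ranked[:n_top])

    return (actives_in_top / n_top) / (n_actives / n_total)


# Hypothetical scores and activity labels for ten molecules (assumed data).
scores = [2.1, -0.3, 1.4, -1.2, 0.2, -0.8, -1.5, 0.9, -2.0, -0.1]
labels = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]

ef = enrichment_factor(scores, labels, top_fraction=0.3)
print(f"Enrichment factor in the top 30%: {ef:.2f}")
```

An enrichment factor of 1 corresponds to random selection; larger values mean that the highest-ranked molecules contain proportionally more actives than the database as a whole.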

History

  • Published In Issue May 23, 2005
  • Received November 30, 2004
