Decision Forest: Combining the Predictions of Multiple Independent Decision Tree Models

Weida Tong,* Huixiao Hong, Hong Fang, Qian Xie, and Roger Perkins
Center for Toxicoinformatics, Division of Biometry and Risk Assessment, National Center for Toxicological Research, Jefferson, Arkansas 72079, and Northrop Grumman Information Technology, Jefferson, Arkansas 72079
J. Chem. Inf. Comput. Sci., 2003, 43 (2), pp 525–531
DOI: 10.1021/ci020058s
Publication Date (Web): February 4, 2003
Copyright Not subject to U.S. Copyright. Published 2003 American Chemical Society
* Corresponding author. Phone: (870)543-7142; fax: (870)543-7662; e-mail: wtong@nctr.fda.gov. Address: NCTR, 3900 NCTR Road, HFT 20, Jefferson, AR 72079.


Abstract

Techniques for combining the results of multiple classification models into a single prediction have been investigated for many years. In earlier applications, the models to be combined were developed by altering the training set. These so-called resampling techniques, however, risk reducing the predictivity of the individual models and/or overfitting the noise in the data, which might make the composite model predict more poorly than the individual models. In this paper, we suggest a novel approach, named Decision Forest, that combines multiple Decision Tree models, each developed using a unique set of descriptors. When models of similar predictive quality are combined using the Decision Forest method, the quality of the combined model is consistently and significantly improved over that of the individual models in both the training and testing steps. An example is presented for predicting the binding affinity of 232 chemicals to the estrogen receptor.
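The core idea in the abstract, several trees that each draw on their own set of descriptors and are then combined into one prediction, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: one-split "stumps" stand in for full Decision Tree models, the combination rule is a plain average of each model's predicted probability, the function names (`fit_stump`, `fit_forest`, `predict_proba`, `predict`) are hypothetical, and the six-row data set is invented.

```python
from statistics import mean

def fit_stump(X, y, features):
    """Fit a one-split tree using only the descriptor indices in `features`.

    Returns (feature, threshold, p_left, p_right), where p_left and p_right
    are the fractions of class-1 training samples on each side of the split.
    """
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            p_left, p_right = mean(left), mean(right)
            errors = (sum((p_left >= 0.5) != bool(l) for l in left)
                      + sum((p_right >= 0.5) != bool(l) for l in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, p_left, p_right)
    if best is None:
        raise ValueError("no usable split among the given descriptors")
    return best[1:]

def fit_forest(X, y, feature_groups):
    """Fit one model per descriptor group; groups are assumed to be disjoint."""
    return [fit_stump(X, y, group) for group in feature_groups]

def predict_proba(forest, row):
    """Assumed combination rule: average the class-1 probability of each model."""
    return mean(pl if row[f] <= t else pr for f, t, pl, pr in forest)

def predict(forest, row, cutoff=0.5):
    return int(predict_proba(forest, row) >= cutoff)

# Invented toy data: 6 "chemicals", 4 descriptors, binary activity label.
X = [
    [0.1, 5.0, 1.0, 0.2],
    [0.2, 4.0, 1.2, 0.1],
    [0.3, 6.0, 0.9, 0.3],
    [0.8, 5.5, 3.0, 0.9],
    [0.9, 4.5, 3.1, 0.8],
    [0.7, 5.2, 2.9, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

# Each model sees its own descriptors: the first {0, 1}, the second {2, 3}.
forest = fit_forest(X, y, feature_groups=[[0, 1], [2, 3]])
train_predictions = [predict(forest, row) for row in X]
```

Because the descriptor groups do not overlap, the two models cannot split on the same descriptor. A sample on which they disagree receives an intermediate averaged probability, so `predict_proba` can also serve as a rough confidence score.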

History

  • Published In Issue March 24, 2003
  • Received September 16, 2002
