Combinatorial QSAR Modeling of Chemical Toxicants Tested against Tetrahymena pyriformis

Hao Zhu, Alexander Tropsha*, Denis Fourches, Alexandre Varnek, Ester Papa§, Paola Gramatica§, Tomas Öberg, Phuong Dao, Artem Cherkasov and Igor V. Tetko#
Laboratory for Molecular Modeling, Division of Medicinal Chemistry and Natural Products, and Carolina Exploratory Center for Cheminformatics Research, School of Pharmacy, CB 7360, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, Laboratories of Chemoinformatics, Institute of Chemistry, Louis Pasteur University, Strasbourg, France, QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Structural and Functional Biology, University of Insubria, Varese, Italy, School of Pure and Applied Natural Sciences, University of Kalmar, SE-391 82 Kalmar, Sweden, Division of Infectious Diseases, Faculty of Medicine, University of British Columbia, 2733 Heather Street, Vancouver, British Columbia, V5Z 3J5, Canada, Helmholtz Center Munich - German Research Center for Environmental Health, Institute for Bioinformatics, Neuherberg, D-85764, Germany, and Institute of Bioorganic & Petrochemistry, Murmanskaya 1, Kyiv-94, 02660, Ukraine
J. Chem. Inf. Model., 2008, 48 (4), pp 766–784
DOI: 10.1021/ci700443v
Publication Date (Web): March 1, 2008
Copyright © 2008 American Chemical Society

University of North Carolina at Chapel Hill.
* Corresponding author. Phone: +1 (919) 966-2955. Fax: +1 (919) 966-0204. E-mail: alex_tropsha@unc.edu.
Louis Pasteur University.
§ University of Insubria.
University of Kalmar.
University of British Columbia.
# German Research Center for Environmental Health, Institute for Bioinformatics.
Institute of Bioorganic & Petrochemistry.

Abstract

Selecting the most rigorous quantitative structure−activity relationship (QSAR) approaches is of great importance in the development of robust and predictive models of chemical toxicity. To address this issue in a systematic way, we have formed an international virtual collaboratory consisting of six independent groups with shared interests in computational chemical toxicology. We have compiled an aqueous toxicity data set containing 983 unique compounds tested in the same laboratory over a decade against Tetrahymena pyriformis. A modeling set including 644 compounds was selected randomly from the original set and distributed to all groups that used their own QSAR tools for model development. The remaining 339 compounds in the original set (external set I) as well as 110 additional compounds (external set II) published recently by the same laboratory (after this computational study was already in progress) were used as two independent validation sets to assess the external predictive power of individual models. In total, our virtual collaboratory has developed 15 different types of QSAR models of aquatic toxicity for the training set. The internal prediction accuracy for the modeling set ranged from 0.76 to 0.93 as measured by the leave-one-out cross-validation correlation coefficient ($Q^2_{\mathrm{abs}}$). The prediction accuracy for the external validation sets I and II ranged from 0.71 to 0.85 (linear regression coefficient $R^2_{\mathrm{abs,I}}$) and from 0.38 to 0.83 (linear regression coefficient $R^2_{\mathrm{abs,II}}$), respectively. The use of an applicability domain threshold implemented in most models generally improved the external prediction accuracy but at the same time led to a decrease in chemical space coverage. Finally, several consensus models were developed by averaging the predicted aquatic toxicity for every compound using all 15 models, with or without taking into account their respective applicability domains. We find that consensus models afford higher prediction accuracy for the external validation data sets with the highest space coverage as compared to individual constituent models. Our studies prove the power of a collaborative and consensual approach to QSAR model development. The best validated models of aquatic toxicity developed by our collaboratory (both individual and consensus) can be used as reliable computational predictors of aquatic toxicity and are available from any of the participating laboratories.
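
The consensus step described in the abstract (averaging each compound's predicted toxicity over the individual models, with or without applicability domains) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `consensus_predict`, `coverage`, and `r2_abs` are hypothetical, the applicability domain is assumed to be available as a per-model true/false flag for each compound, and the accuracy measure is assumed to be the usual coefficient of determination, 1 - SS_res/SS_tot, since the abstract does not spell out the formula.

```python
from statistics import mean


def consensus_predict(predictions, in_domain=None):
    """Average per-compound predictions across models.

    predictions: list of per-model lists; predictions[m][i] is the value
                 predicted by model m for compound i.
    in_domain:   optional list of per-model boolean lists (assumed format);
                 when given, model m contributes to compound i only if
                 in_domain[m][i] is True.
    Returns one consensus value per compound, or None when no model
    covers that compound.
    """
    n_compounds = len(predictions[0])
    consensus = []
    for i in range(n_compounds):
        votes = [
            model[i]
            for m, model in enumerate(predictions)
            if in_domain is None or in_domain[m][i]
        ]
        consensus.append(mean(votes) if votes else None)
    return consensus


def coverage(consensus):
    """Fraction of compounds that received a consensus prediction."""
    return sum(value is not None for value in consensus) / len(consensus)


def r2_abs(observed, predicted):
    """Coefficient of determination over the covered compounds only.

    Assumed definition: 1 - SS_res / SS_tot, with the mean taken over
    the observed values of the covered compounds.
    """
    pairs = [(o, p) for o, p in zip(observed, predicted) if p is not None]
    obs_mean = mean(o for o, _ in pairs)
    ss_res = sum((o - p) ** 2 for o, p in pairs)
    ss_tot = sum((o - obs_mean) ** 2 for o, _ in pairs)
    return 1.0 - ss_res / ss_tot
```

Calling `consensus_predict(predictions)` averages over all models, while passing `in_domain` restricts each average to the models whose applicability domain contains the compound. Comparing `coverage` and `r2_abs` for the two variants exposes the trade-off the abstract reports: applying the domain restriction can raise accuracy while lowering chemical space coverage.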

History

  • Published in Issue: April 28, 2008
  • Article ASAP: March 01, 2008
  • Received: November 28, 2007
