J. Chem. Inf. Model., 47 (6), 2063-2076, 2007. 10.1021/ci700141x S1549-9596(70)00141-9
Web Release Date: October 4, 2007

Copyright © 2007 American Chemical Society

Chemical Data Mining of the NCI Human Tumor Cell Line Database

Huijun Wang, Jonathan Klinginsmith, Xiao Dong, Adam C. Lee, Rajarshi Guha, Yuqing Wu, Gordon M. Crippen, and David J. Wild*

Indiana University School of Informatics and Chemical Informatics and Cyberinfrastructure Collaboratory, 901 East Tenth Street, Bloomington, Indiana 47408, and College of Pharmacy, University of Michigan, 428 Church Street, Ann Arbor, Michigan 48109-1065

Received April 20, 2007

Abstract:

The NCI Developmental Therapeutics Program Human Tumor Cell Line data set is a publicly available database that contains cellular assay screening data for over 40 000 compounds tested in 60 human tumor cell lines. The database also contains microarray assay gene expression data for the cell lines, and so it provides an excellent information resource, particularly for testing data mining methods that bridge chemical, biological, and genomic information. In this paper we describe a formal knowledge discovery approach to characterizing and data mining this set, and we report the results of some of our initial experiments in mining the set from a chemoinformatics perspective.

