Research Article
The Problem of Overfitting
History
- Published In Issue January 26, 2004
- Received October 30, 2003
to check that it is plausible that its predictions will carry over to fresh data not used in the model fitting exercise. There are two standard ways of doing this
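The idea of checking a fitted model against data that played no part in the fitting can be illustrated with a simple hold-out split. What follows is a minimal Python sketch, not taken from the article and not necessarily one of the approaches it goes on to describe. The function names `fit_line`, `mse`, and `holdout_check`, the 30% test fraction, and the fixed seed are all assumptions chosen for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def mse(a, b, xs, ys):
    """Mean squared prediction error of the line y = a + b*x on (xs, ys)."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def holdout_check(xs, ys, test_fraction=0.3, seed=0):
    """Fit on a random subset and score on the held-out remainder.

    Returns (train_mse, test_mse). The test fraction and seed are
    illustrative assumptions, not values from the article.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, int(round(test_fraction * len(xs))))
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    train_x = [xs[i] for i in train_idx]
    train_y = [ys[i] for i in train_idx]
    test_x = [xs[i] for i in test_idx]
    test_y = [ys[i] for i in test_idx]

    # The model only ever sees the training portion.
    a, b = fit_line(train_x, train_y)
    return mse(a, b, train_x, train_y), mse(a, b, test_x, test_y)
```

A held-out error that is much larger than the training error suggests that the fit may not carry over to fresh data. In this sketch the held-out points are never passed to `fit_line`, which is what makes the second number an honest check.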






