
Similarity to Molecules in the Training Set Is a Good Discriminator for Prediction Accuracy in QSAR

Molecular Systems Department, RY50S-100 Merck Research Laboratories, Rahway, New Jersey 07065, and Molecular Systems Department, WP53F-301 Merck Research Laboratories, West Point, Pennsylvania 19486
Cite this: J. Chem. Inf. Comput. Sci. 2004, 44, 6, 1912–1928
Publication Date (Web):October 7, 2004
Copyright © 2004 American Chemical Society



    How well can a QSAR model predict the activity of a molecule not in the training set used to create the model? A set of retrospective cross-validation experiments using 20 diverse in-house activity sets was carried out to find a good discriminator of prediction accuracy, as measured by the root-mean-square difference between observed and predicted activity. Among the measures we tested, two seem useful: the similarity of the molecule to be predicted to the nearest molecule in the training set and/or the number of neighbors in the training set, where neighbors are those more similar than a user-chosen cutoff. The molecules with the highest similarity and/or the most neighbors are the best predicted. This trend holds true for narrow training sets and, to a lesser degree, for many diverse training sets, and it does not depend on which QSAR method or descriptor is used. One may define the similarity using a descriptor different from that used for the QSAR model. The similarity dependence for diverse training sets is somewhat unexpected. It appears to be greater for those data sets where the association of similar activities with similar structures (as encoded in the Patterson plot) is stronger. We propose a way to estimate the reliability of the prediction of an arbitrary chemical structure on a given QSAR model, given the training set from which the model was derived.
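
    The two measures named in the abstract, and the idea of reading an expected error off the similarity, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each molecule is represented as a set of integer feature identifiers and uses Tanimoto similarity; the abstract does not specify the descriptor or the similarity function, and the function names, the cutoff value, and the equal-width binning scheme are all hypothetical choices made for illustration.

```python
from math import sqrt


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two feature sets (assumed similarity measure)."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_measures(query, training, cutoff=0.7):
    """Return (similarity to the nearest training molecule, neighbor count).

    Neighbors are training molecules more similar to the query than `cutoff`;
    the default of 0.7 is an arbitrary illustrative value.
    """
    sims = [tanimoto(query, t) for t in training]
    nearest = max(sims, default=0.0)
    neighbors = sum(1 for s in sims if s > cutoff)
    return nearest, neighbors


def rmse_by_similarity_bin(records, n_bins=5):
    """Bin (nearest_similarity, observed, predicted) records from cross-validation
    into equal-width similarity bins and return {bin_index: RMSE} for the
    non-empty bins."""
    squared_errors = {}
    for sim, observed, predicted in records:
        bin_index = min(int(sim * n_bins), n_bins - 1)
        squared_errors.setdefault(bin_index, []).append((observed - predicted) ** 2)
    return {b: sqrt(sum(v) / len(v)) for b, v in sorted(squared_errors.items())}


def expected_error(nearest, bin_rmse, n_bins=5):
    """Look up the cross-validated RMSE for the bin that a new molecule's
    nearest-neighbor similarity falls into; None if that bin had no data."""
    bin_index = min(int(nearest * n_bins), n_bins - 1)
    return bin_rmse.get(bin_index)


if __name__ == "__main__":
    # Toy feature sets standing in for real molecular descriptors.
    training = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5}), frozenset({7, 8, 9})]
    query = frozenset({1, 2, 3, 5})
    nearest, neighbors = similarity_measures(query, training, cutoff=0.5)
    print("nearest similarity:", nearest, "neighbors:", neighbors)

    # Toy (similarity, observed, predicted) triples from a held-out fold.
    records = [(0.90, 5.0, 5.1), (0.95, 6.0, 5.9), (0.30, 5.0, 6.0), (0.25, 4.0, 5.0)]
    bin_rmse = rmse_by_similarity_bin(records)
    print("RMSE per similarity bin:", bin_rmse)
    print("expected error for the query:", expected_error(nearest, bin_rmse))
```

    In this sketch, the table of RMSE per similarity bin is built once from retrospective cross-validation and then serves as a lookup for any new structure. The similarity can be computed with a descriptor other than the one the QSAR model uses, consistent with the abstract's remark.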



     Corresponding author e-mail:  [email protected].

     RY50S-100 Merck Research Laboratories.

     WP53F-301 Merck Research Laboratories.

    Cited By

    This article is cited by 231 publications.

