Multiobjective Optimization in Quantitative Structure−Activity Relationships: Deriving Accurate and Interpretable QSARs

Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom, Department of Automatic Control and Systems Engineering, University of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom, and GlaxoSmithKline, Gunnels Wood Road, Stevenage SG1 2NY, United Kingdom
Cite this: J. Med. Chem. 2002, 45, 23, 5069–5080
Publication Date (Web):October 15, 2002
Copyright © 2002 American Chemical Society

    Deriving quantitative structure−activity relationship (QSAR) models that are accurate, reliable, and easily interpretable is a difficult task. In this study, two new methods have been developed that aim to find useful QSAR models that represent an appropriate balance between model accuracy and complexity. Both methods are based on genetic programming (GP). The first method, referred to as genetic QSAR (or GPQSAR), uses a penalty function to control model complexity. GPQSAR is designed to derive a single linear model that represents an appropriate balance between the variance and the number of descriptors selected for the model. The second method, referred to as multiobjective genetic QSAR (MoQSAR), is based on multiobjective GP and represents a new way of thinking of QSAR. Specifically, QSAR is considered as a multiobjective optimization problem that comprises a number of competitive objectives. Typical objectives include model fitting, the total number of terms, and the occurrence of nonlinear terms. MoQSAR results in a family of equivalent QSAR models where each QSAR represents a different tradeoff in the objectives. A practical consideration often overlooked in QSAR studies is the need for the model to promote an understanding of the biochemical response under investigation. To accomplish this, chemically intuitive descriptors are needed but do not always give rise to statistically robust models. This problem is addressed by the addition of a further objective, called chemical desirability, that aims to reward models that consist of descriptors that are easily interpretable by chemists. GPQSAR and MoQSAR have been tested on various data sets including the Selwood data set and two different solubility data sets. 
The study demonstrates that the MoQSAR method is able to find models that are at least as good as models derived using standard statistical approaches and also yields models that allow a medicinal chemist to trade statistical robustness for chemical interpretability.
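The two ideas in the abstract, a penalty on model complexity and a family of tradeoff models, can be illustrated with a minimal sketch. The abstract does not give the form of the GPQSAR penalty function or the details of the multiobjective ranking used in MoQSAR, so everything below is an assumption made for illustration: the `CandidateModel` class, the linear per-term penalty, the standard Pareto dominance test, and the example scores are all hypothetical and are not the authors' implementation or data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CandidateModel:
    """Hypothetical QSAR candidate scored on three objectives."""
    name: str
    fit_error: float     # model-fitting error; lower is better
    n_terms: int         # number of descriptor terms; lower is better
    desirability: float  # chemical desirability in [0, 1]; higher is better


def penalized_fitness(m: CandidateModel, penalty_per_term: float = 0.1) -> float:
    """Single-objective score: fit error plus an assumed linear complexity penalty."""
    return m.fit_error + penalty_per_term * m.n_terms


def objectives(m: CandidateModel) -> Tuple[float, float, float]:
    """Express every objective as 'lower is better' (desirability is negated)."""
    return (m.fit_error, float(m.n_terms), -m.desirability)


def dominates(a: CandidateModel, b: CandidateModel) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    oa, ob = objectives(a), objectives(b)
    no_worse = all(x <= y for x, y in zip(oa, ob))
    strictly_better = any(x < y for x, y in zip(oa, ob))
    return no_worse and strictly_better


def pareto_front(models: List[CandidateModel]) -> List[CandidateModel]:
    """Keep the models that no other model dominates, in input order."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


# Made-up scores, for illustration only.
MODELS = [
    CandidateModel("A", fit_error=0.20, n_terms=5, desirability=0.4),
    CandidateModel("B", fit_error=0.30, n_terms=3, desirability=0.9),
    CandidateModel("C", fit_error=0.35, n_terms=4, desirability=0.5),
    CandidateModel("D", fit_error=0.55, n_terms=1, desirability=0.7),
]

if __name__ == "__main__":
    best = min(MODELS, key=penalized_fitness)
    print("penalty-based choice:", best.name)
    print("Pareto family:", [m.name for m in pareto_front(MODELS)])
```

In this toy example the penalty-based score picks one model, while the Pareto filter keeps every model that represents a distinct tradeoff and discards only the candidate that another model beats on all three objectives. The choice among the surviving models is left to the chemist, which is the kind of tradeoff between statistical robustness and interpretability that the abstract describes.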

     Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield.


     To whom correspondence should be addressed. Tel: +44 1142 222 652. Fax: +44 1142 780 300. E-mail: [email protected].

     Department of Automatic Control and Systems Engineering, University of Sheffield.


