Stereo Signature Molecular Descriptor
Abstract

We present an algorithm to compute molecular graph descriptors considering the stereochemistry of the molecular structure based on our previously introduced signature molecular descriptor. The algorithm can generate two types of descriptors, one which is compliant with the Cahn–Ingold–Prelog priority rules, including complex stereochemistry structures such as fullerenes, and a computationally efficient one based on our previous definition of a directed acyclic graph that is augmented to a chiral molecular graph. The performance of the algorithm in terms of speed as a canonicalizer as well as in modeling and predicting bioactivity is evaluated, showing an overall better performance than other molecular descriptors, which is particularly relevant in modeling stereoselective biochemical reactions. The complete source code of the stereo signature molecular descriptor is available for download under an open-source license at http://molsig.sourceforge.net.