Recommender Systems in Antiviral Drug Discovery
- Ekaterina A. Sosnina* (Email: [email protected]. Phone: +79260256249.) Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow 143026, Russia; Institute of Physiologically Active Compounds, RAS, Severniy pr. 1, Chernogolovka 142432, Russia
- Sergey Sosnin. Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow 143026, Russia; Syntelly LLC, Skolkovo Innovation Center, Bolshoy Boulevard 30, Moscow 121205, Russia
- Anastasia A. Nikitina. Department of Chemistry, Lomonosov Moscow State University, Leninskie Gory 1 bd. 3, Moscow 119991, Russia; FSBSI “Chumakov FSC R&D IBP RAS”, Poselok Instituta Poliomielita 8 bd. 1, Poselenie Moskovsky, Moscow 108819, Russia
- Ivan Nazarov. Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow 143026, Russia
- Dmitry I. Osolodkin. FSBSI “Chumakov FSC R&D IBP RAS”, Poselok Instituta Poliomielita 8 bd. 1, Poselenie Moskovsky, Moscow 108819, Russia; Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Trubetskaya Ulitsa 8, Moscow 119991, Russia
- Maxim V. Fedorov. Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Bolshoy Boulevard 30/1, Moscow 143026, Russia; Syntelly LLC, Skolkovo Innovation Center, Bolshoy Boulevard 30, Moscow 121205, Russia; Physics John Anderson Building, University of Strathclyde, 107 Rottenrow East, Glasgow G4 0NG, U.K.
Abstract
Recommender systems (RSs), which underwent rapid development and had an enormous impact on e-commerce, have the potential to become useful tools for drug discovery. In this paper, we applied RS methods for the prediction of the antiviral activity class (active/inactive) for compounds extracted from ChEMBL. Two main RS approaches were applied: collaborative filtering (Surprise implementation) and content-based filtering (sparse-group inductive matrix completion (SGIMC) method). The effectiveness of RS approaches was investigated for prediction of antiviral activity classes (“interactions”) for compounds and viruses, for which some of their interactions with other viruses or compounds are known, and for prediction of interaction profiles for new compounds. Both approaches achieved relatively good prediction quality for binary classification of individual interactions and compound profiles, as quantified by cross-validation and external validation receiver operating characteristic (ROC) score >0.9. Thus, even simple recommender systems may serve as an effective tool in antiviral drug discovery.
1. Introduction
2. Materials and Methods
2.1. Recommender System Approaches
2.1.1. Collaborative Filtering
k-Nearest-neighbor (kNN)-based algorithms are implemented in the knns.KNNBasic class. They identify neighbors of compounds and viruses from the similarity of their interaction profiles. We used cosine similarity and mean-squared difference (MSD) as similarity metrics.
Clustering algorithms are implemented in the co_clustering.CoClustering class. They identify the neighborhood by grouping compounds or viruses into coclusters, simultaneously clustering the columns and rows of a matrix, and generate predictions based on the average interaction values.
Matrix factorization algorithms are represented by the singular value decomposition (matrix_factorization.SVD) and non-negative matrix factorization (matrix_factorization.NMF) methods. They decompose the interaction matrix into latent variables, which are then used to complete the missing interaction values.
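The neighborhood idea described above can be illustrated with a small pure-Python sketch. It is not the Surprise code: it only mimics a basic kNN predictor under the assumption that similarity is computed over the viruses two compounds share and that the prediction is a similarity-weighted average of the neighbors' known classes. The compound and virus identifiers, the toy values, and the function names are all hypothetical.

```python
from math import sqrt

# Toy interaction profiles: compound -> {virus: class (1 = active, 0 = inactive)}.
# Identifiers and values are hypothetical, for illustration only.
profiles = {
    "cpd_A": {"virus_1": 1, "virus_2": 0, "virus_3": 1},
    "cpd_B": {"virus_1": 1, "virus_2": 0},
    "cpd_C": {"virus_1": 0, "virus_2": 1, "virus_3": 0},
}


def msd_similarity(p, q):
    """Return 1 / (MSD + 1), with MSD taken over the viruses common to p and q."""
    common = p.keys() & q.keys()
    if not common:
        return 0.0
    msd = sum((p[v] - q[v]) ** 2 for v in common) / len(common)
    return 1.0 / (msd + 1.0)


def cosine_similarity(p, q):
    """Cosine of the two profiles restricted to their common viruses."""
    common = p.keys() & q.keys()
    num = sum(p[v] * q[v] for v in common)
    den = sqrt(sum(p[v] ** 2 for v in common)) * sqrt(sum(q[v] ** 2 for v in common))
    return num / den if den else 0.0


def knn_predict(profiles, compound, virus, k=2, sim=msd_similarity):
    """Similarity-weighted average of the k most similar compounds tested on `virus`."""
    neighbors = [
        (sim(profiles[compound], prof), prof[virus])
        for name, prof in profiles.items()
        if name != compound and virus in prof
    ]
    neighbors = sorted(neighbors, key=lambda t: t[0], reverse=True)[:k]
    total = sum(s for s, _ in neighbors)
    if total == 0:
        return None  # no informative neighbor
    return sum(s * r for s, r in neighbors) / total


# Compound-based prediction of the unknown cpd_B / virus_3 interaction.
score_msd = knn_predict(profiles, "cpd_B", "virus_3", sim=msd_similarity)
score_cos = knn_predict(profiles, "cpd_B", "virus_3", sim=cosine_similarity)
print(score_msd, score_cos)
```

Swapping the roles of compounds and viruses in `profiles` gives the virus-based variant of the same scheme.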
2.1.2. Content-Based Filtering









2.2. Data Preparation
Figure 1. Scheme of data preparation.
interaction data sets | DB_main | DB_ext_points | DB_ext_CS |
---|---|---|---|
number of compounds | 247 994 | 447 | 10 730 |
number of viral species | 158 | 43 | 56 |
number of interactions | 400 281 | 659 | 10 730 |
active/inactive class ratio | 9/1 | 3/1 | 4/1 |
minimum/average/maximum number of interactions with viruses for each compound | 1/1.61/36 | 1/1.47/12 | 1/1/1 |
minimum/average/maximum number of interactions with compounds for each virus | 1/2533.42/85 823 | 1/15.33/155 | 1/195.09/2621 |
sparsity | 1.02% | 3.43% | 1.82% |

virus feature sets | DB_v.main | DB_v.ext_points | DB_v.ext_CS |
---|---|---|---|
number of species features | 74 | 74 | 74 |

compound feature sets | DB_c.main/DB_c.50d/DB_c.25d/DB_c.10d/DB_c.8/DB_c.1 | DB_c.ext_points | DB_c.ext_CS |
---|---|---|---|
number of compound features | 2016/1008/504/202/8/1 | 2016 | 2016 |
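The averages and sparsity values in the interaction table can be cross-checked from the three counts. The sketch below assumes that "sparsity" means the share of filled cells in the compound × virus matrix, which is a reading of the table rather than a definition given in this text; the function name is hypothetical.

```python
def fill_stats(n_compounds, n_species, n_interactions):
    """Derive per-row/per-column averages and the filled-cell share of the matrix."""
    cells = n_compounds * n_species
    return {
        "sparsity_percent": 100.0 * n_interactions / cells,
        "avg_per_compound": n_interactions / n_compounds,
        "avg_per_virus": n_interactions / n_species,
    }


# Counts copied from the DB_main and DB_ext_points columns of the table.
main = fill_stats(247_994, 158, 400_281)
ext_points = fill_stats(447, 43, 659)

for name, stats in (("DB_main", main), ("DB_ext_points", ext_points)):
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

Under this assumption the DB_main and DB_ext_points columns are reproduced to the printed precision. The DB_ext_CS column is not: 10 730 interactions over 56 species give an average of about 191.6 and a fill of about 1.79%, whereas the reported 195.09 and 1.82% correspond to 55 species. The text here does not explain the difference.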
2.2.1. Compound Features
2.2.2. Viral Species Features
2.3. Prediction Scenarios
Figure 2. Addressed challenges: (a) prediction of point compound–virus interactions, (b) compoundwise CS prediction, and (c) specieswise CS prediction. Matrix of interactions, green; matrix of species features, pink; matrix of compound features, yellow; and unknown compound–virus interactions, white.
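The three scenarios in Figure 2 differ only in which cells of the interaction matrix are hidden from training. A minimal sketch of the corresponding splits follows, assuming interactions are stored as (compound, virus, class) triples; the toy data, the function names, and the rule of returning test rows with an unseen compound or virus to the training set are illustrative assumptions, not the authors' protocol.

```python
import random

# Hypothetical (compound, virus, class) triples.
interactions = [
    ("c1", "v1", 1), ("c1", "v2", 0),
    ("c2", "v1", 1), ("c2", "v3", 1),
    ("c3", "v2", 0), ("c3", "v3", 1),
    ("c4", "v1", 0), ("c4", "v3", 1),
]


def point_split(data, test_fraction=0.25, seed=0):
    """Hide random cells; keep in test only rows whose compound and virus stay known."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    known_compounds = {row[0] for row in train}
    known_viruses = {row[1] for row in train}
    kept, moved = [], []
    for row in test:
        is_known = row[0] in known_compounds and row[1] in known_viruses
        (kept if is_known else moved).append(row)
    return train + moved, kept


def cold_start_split(data, held_out, axis):
    """Hide whole profiles: axis=0 is compoundwise CS, axis=1 is specieswise CS."""
    train = [row for row in data if row[axis] not in held_out]
    test = [row for row in data if row[axis] in held_out]
    return train, test


point_train, point_test = point_split(interactions)
cmp_train, cmp_test = cold_start_split(interactions, {"c4"}, axis=0)
sp_train, sp_test = cold_start_split(interactions, {"v3"}, axis=1)
print(len(point_test), len(cmp_test), len(sp_test))
```

In the cold-start cases nothing about the held-out compound or virus remains in the training interactions, so a prediction has to rely on its features.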
2.3.1. Influence of the Number of Features
2.4. Evaluation and Metrics


3. Results and Discussion
- prediction of point interactions for known compounds and viruses,
- compoundwise CS prediction (prediction of interaction profiles for new compounds),
- specieswise CS prediction (prediction of interaction profiles for new viruses), and
- prediction of compound–virus interactions with a reduced number of compound features.
3.1. Prediction of Point Interactions
3.1.1. Collaborative Filtering
Figure 3. Violin plot of ROC AUC values for viral species in cross-validation (blue) and external validation (red). Dotted lines inside the violins represent the quartiles of the distribution.
Surprise methods | cross-validation ROC AUC ± SD | cross-validation mean ROC AUC ± SD | cross-validation median ROC AUC | external validation ROC AUC | external validation mean ROC AUC ± SD | external validation median ROC AUC |
---|---|---|---|---|---|---|
knns.KNNBasic msd (virus-based) | 0.808 ± 0.004 | 0.8 ± 0.3 | 0.86 | 0.603 | 0.7 ± 0.3 | 0.72 |
knns.KNNBasic msd (compound-based) | 0.888 ± 0.004 | 0.8 ± 0.3 | 0.83 | 0.888 | 0.8 ± 0.3 | 0.83 |
knns.KNNBasic cosine (virus-based) | 0.806 ± 0.005 | 0.8 ± 0.3 | 0.86 | 0.606 | 0.7 ± 0.3 | 0.75 |
knns.KNNBasic cosine (compound-based) | 0.872 ± 0.004 | 0.7 ± 0.3 | 0.79 | 0.872 | 0.7 ± 0.3 | 0.75 |
co_clustering.CoClustering | 0.863 ± 0.01 | 0.7 ± 0.3 | 0.81 | 0.702 | 0.7 ± 0.3 | 0.76 |
matrix_factorization.SVD | 0.939 ± 0.003 | 0.8 ± 0.3 | 0.88 | 0.764 | 0.7 ± 0.3 | 0.78 |
matrix_factorization.NMF | 0.939 ± 0.003 | 0.8 ± 0.3 | 0.88 | 0.709 | 0.7 ± 0.3 | 0.68 |
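The table reports three kinds of scores: a pooled ROC AUC and the mean and median of ROC AUC values. The sketch below assumes that the mean and median are taken over per-species ROC AUC values, as the Figure 3 caption suggests, and that species with only one class present are skipped. Both points are assumptions. The authors' own metric code is in File SI2 and is not reproduced here, and the records and function names below are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def roc_auc(labels, scores):
    """Probability that a random active outscores a random inactive (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_species_auc(records):
    """Group (virus, label, score) records by virus and score each group separately."""
    by_virus = defaultdict(lambda: ([], []))
    for virus, label, score in records:
        by_virus[virus][0].append(label)
        by_virus[virus][1].append(score)
    aucs = {v: roc_auc(ys, ss) for v, (ys, ss) in by_virus.items()}
    return {v: a for v, a in aucs.items() if a is not None}


# Hypothetical predictions: (virus, true class, predicted score).
records = [
    ("v1", 1, 0.9), ("v1", 0, 0.2), ("v1", 1, 0.6), ("v1", 0, 0.7),
    ("v2", 1, 0.8), ("v2", 0, 0.1),
    ("v3", 1, 0.5),  # single-class species, skipped in the per-species scores
]

pooled = roc_auc([y for _, y, _ in records], [s for _, _, s in records])
species = per_species_auc(records)
print(pooled, mean(species.values()), median(species.values()))
```

The pooled score is dominated by the species with the most interactions, while the mean and median weight every species equally, which is one way to read the gap between the first and the other two columns of each validation block.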
3.1.2. CBF Prediction of Point Interactions with SGIMC
Figure 4. Guided grid search of the C_lasso, C_ridge, and C_group coefficients for interaction prediction for known compounds and viral species based on (a) ROC AUC, (b) mean ROC AUC, and (c) median ROC AUC. Rank = 10, number of iterations = 70.
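The role of the three coefficients tuned in Figure 4 can be read against a generic inductive matrix completion objective. The form below is only a sketch: it assumes a rank-k bilinear model over compound and species features with an elementwise (lasso), a row-wise (group), and a Frobenius (ridge) penalty on both factor matrices. The symbols X, Y, W, H, Ω, and ℓ are notation introduced here, and the exact SGIMC formulation and scaling of the penalties may differ.

```latex
\min_{W \in \mathbb{R}^{d_c \times k},\; H \in \mathbb{R}^{d_v \times k}}
\sum_{(i,j) \in \Omega} \ell\!\left(R_{ij},\; x_i^{\top} W H^{\top} y_j\right)
+ \sum_{A \in \{W,\, H\}} \left(
    C_{\mathrm{lasso}} \lVert A \rVert_{1}
  + C_{\mathrm{group}} \sum_{r} \lVert A_{r,:} \rVert_{2}
  + \frac{C_{\mathrm{ridge}}}{2} \lVert A \rVert_{F}^{2}
\right)
```

Here R is the interaction matrix with observed entries Ω, x_i is the feature vector of compound i (row i of X, d_c features), y_j is the feature vector of viral species j (row j of Y, d_v features), k is the rank, and ℓ is a classification loss. In this reading the group term acts on whole rows of W and H, so a sufficiently large C_group removes individual features from the model altogether.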
3.2. Cold-Start Prediction with CBF
Figure 5. Violin plots of ROC AUC values for viral species: (a) prediction of point compound–virus interactions, (b) compoundwise CS prediction, and (c) specieswise CS prediction. The prediction was assessed in cross-validation (light blue and coral) and external validation (dark blue, red, and green). Lines depict the dependence of median ROC AUC scores on the number of iterations. Dotted lines inside the violins represent the quartiles of the distribution. Rank = 10, C_lasso = 0.0, C_group = 0.0, and C_ridge = 120.0.
Figure 6. Dependence of the median ROC AUC score for point interaction prediction on the number of iterations in cross-validation with the original feature matrices (red), a unit vector for species (blue), and a unit vector for compounds (green). Rank = 10, C_lasso = 0.0, C_group = 0.0, and C_ridge = 120.0. Error bars represent the SD.
3.3. Influence of the Number of Features
Figure 7. Dependence of mean ROC AUC (a) and median ROC AUC (b) for models with a different number of compound features on the number of iterations. Rank = 10, C_lasso = 0.0, C_group = 0.0, and C_ridge = 120.0. Compound feature matrices: DB_c.main (red ★), DB_c.50d (blue ■), DB_c.25d (magenta ◆), DB_c.10d (green ×), DB_c.8 (orange •), and DB_c.1 (light blue ▲). Error bars represent the SD.
Figure 8. Influence of the C_group regularization coefficient in cross-validation for point interaction prediction on the mean/median ROC AUC at 70 (a) and 10 (b) iterations. Continuous and dashed red lines indicate the mean and median ROC AUC, and continuous and dashed blue lines indicate the mean and median number of zeroed features. Shaded areas represent the corresponding standard deviations. The black dash-dotted line shows median ROC AUC with 50% of compound features. C_lasso = 0.0, C_ridge = 120.0, and rank = 10.
4. Conclusions
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.0c00857.
File SI1: PDF file with a description of the data set preparation, a table of the models’ hyperparameters, and the best hyperparameters with their prediction assessment (PDF)
File SI2: Python code snippet for the metric calculation
File SI3: gzipped tarball with the data sets (DB_main.csv: data set of compound–virus interactions; DB_c_main.csv: data set of compound features; DB_v_main.csv: data set of virus features; DB_ext.csv: test data set with compound–virus interactions; DB_c_ext.csv: test data set with compound features; DB_ext_comp.csv: data set with compound labels for point and CS test prediction); the file is located on Zenodo (doi: 10.5281/zenodo.3831446)
Acknowledgments
The authors acknowledge the usage of the Skoltech CDISE HPC clusters Arkuda and Zhores for obtaining the results presented in this manuscript. The authors are thankful to Maxim Panov and Evgeny Frolov from the Center for Computational and Data-Intensive Science and Engineering, Skoltech, for fruitful discussions. The reported study was funded by the Russian Foundation of Basic Research (according to the research project no. 19-33-90290, MVF and EAS—computational experiments and results assessment) and the State research funding for FSBSI “Chumakov FSC R&D IBP RAS” (topic no. 0837-2019-0002, AAN and DIO—database curation and data assessment).
Appendix: Terminological Issues in the RS Field
Acronyms | |
---|---|
RS | recommender system |
CF | collaborative filtering |
CBF | content-based filtering |
CS | cold-start |
SGIMC | sparse-group inductive matrix completion |
SD | standard deviation |
References
This article references 73 other publications.
- 1Caruana, R. Multitask Learning. Mach. Learn. 1997, 28, 41– 75, DOI: 10.1023/A:1007379606734Google ScholarThere is no corresponding record for this reference.
- 2Lipinski, C. F.; Maltarollo, V. G.; Oliveira, P. R.; da Silva, A. B. F.; Honorio, K. M. Advances and Perspectives in Applying Deep Learning for Drug Design and Discovery. Front. Robot. AI 2019, 6, 108, DOI: 10.3389/frobt.2019.00108Google Scholar2https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB3srlslKgug%253D%253D&md5=e9e8d47b720ff86f94bedb65eb82ab18Advances and Perspectives in Applying Deep Learning for Drug Design and DiscoveryLipinski Celio F; da Silva Alberico B F; Maltarollo Vinicius G; Oliveira Patricia R; Honorio Kathia Maria; Honorio Kathia MariaFrontiers in robotics and AI (2019), 6 (), 108 ISSN:.Discovering (or planning) a new drug candidate involves many parameters, which makes this process slow, costly, and leading to failures at the end in some cases. In the last decades, we have witnessed a revolution in the computational area (hardware, software, large-scale computing, etc.), as well as an explosion in data generation (big data), which raises the need for more sophisticated algorithms to analyze this myriad of data. In this scenario, we can highlight the potentialities of artificial intelligence (AI) or computational intelligence (CI) as a powerful tool to analyze medicinal chemistry data. According to IEEE, computational intelligence involves the theory, the design, the application, and the development of biologically and linguistically motivated computational paradigms. In addition, CI encompasses three main methodologies: neural networks (NN), fuzzy systems, and evolutionary computation. In particular, artificial neural networks have been successfully applied in medicinal chemistry studies. A branch of the NN area that has attracted a lot of attention refers to deep learning (DL) due to its generalization power and ability to extract features from data. 
Therefore, in this mini-review we will briefly outline the present scope, advances, and challenges related to the use of DL in drug design and discovery, describing successful studies involving quantitative structure-activity relationships (QSAR) and virtual screening (VS) of databases containing thousands of compounds.
- 3Norinder, U.; Svensson, F. Multitask Modeling with Confidence Using Matrix Factorization and Conformal Prediction. J. Chem. Inf. Model. 2019, 59, 1598– 1604, DOI: 10.1021/acs.jcim.9b00027Google Scholar3https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXlslejsrY%253D&md5=40d6e0601087c68202722a50f8328376Multitask Modeling with Confidence Using Matrix Factorization and Conformal PredictionNorinder, Ulf; Svensson, FredrikJournal of Chemical Information and Modeling (2019), 59 (4), 1598-1604CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Multitask prediction of bioactivities is often faced with challenges relating to the sparsity of data and imbalance between different labels. The authors propose class conditional (Mondrian) conformal predictors using underlying Macau models as a novel approach for large scale bioactivity prediction. This approach handles both high degrees of missing data and label imbalances while still producing high quality predictive models. When applied to ten assay end points from PubChem, the models generated valid models with an efficiency of 74.0-80.1% at the 80% confidence level with similar performance both for the minority and majority class. Also when deleting progressively larger portions of the available data (0-80%) the performance of the models remained robust with only minor deterioration (redn. in efficiency between 5 and 10%). Compared to using Macau without conformal prediction the method presented here significantly improves the performance on imbalanced data sets.
- 4Zubatyuk, R.; Smith, J. S.; Leszczynski, J.; Isayev, O. Accurate and transferable multitask prediction of chemical properties with an atoms-in-molecules neural network. Sci. Adv. 2019, 5, eaav6490 DOI: 10.1126/sciadv.aav6490Google ScholarThere is no corresponding record for this reference.
- 5Ramsundar, B.; Liu, B.; Wu, Z.; Verras, A.; Tudor, M.; Sheridan, R. P.; Pande, V. Is Multitask Deep Learning Practical for Pharma. J. Chem. Inf. Model. 2017, 57, 2068– 2076, DOI: 10.1021/acs.jcim.7b00146Google Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtFCisr7O&md5=3d27a01e20455d4ee4c869b7cc91717dIs Multitask Deep Learning Practical for Pharma?Ramsundar, Bharath; Liu, Bowen; Wu, Zhenqin; Verras, Andreas; Tudor, Matthew; Sheridan, Robert P.; Pande, VijayJournal of Chemical Information and Modeling (2017), 57 (8), 2068-2076CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Multitask deep learning has emerged as a powerful tool for computational drug discovery. However, despite a no. of preliminary studies, multitask deep networks have yet to be widely deployed in the pharmaceutical and biotech industries. This lack of acceptance stems from both software difficulties and lack of understanding of the robustness of multitask deep networks. Our work aims to resolve both of these barriers to adoption. We introduce a high-quality open-source implementation of multitask deep networks as part of the DeepChem open-source platform. Our implementation enables simple python scripts to construct, fit, and evaluate sophisticated deep models. We use our implementation to analyze the performance of multitask deep networks and related deep models on four collections of pharmaceutical data (three of which have not previously been analyzed in the literature). We split these data sets into train/valid/test using time and neighbor splits to test multitask deep learning performance under challenging conditions. Our results demonstrate that multitask deep networks are surprisingly robust and can offer strong improvement over random forests. Our anal. and open-source implementation in DeepChem provide an argument that multitask deep networks are ready for widespread use in com. drug discovery.
- 6Sosnin, S.; Vashurina, M.; Withnall, M.; Karpov, P.; Fedorov, M.; Tetko, I. A Survey of Multi-task Learning Methods in Chemoinformatics. Mol. Inf. 2018, 615– 621Google ScholarThere is no corresponding record for this reference.
- 7Sosnin, S.; Karlov, D.; Tetko, I. V.; Fedorov, M. V. Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space. J. Chem. Inf. Model. 2019, 1062– 1072, DOI: 10.1021/acs.jcim.8b00685Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXis1SgtLfP&md5=33e06f11e654074570f6a142979cbb65Comparative Study of Multitask Toxicity Modeling on a Broad Chemical SpaceSosnin, Sergey; Karlov, Dmitry; Tetko, Igor V.; Fedorov, Maxim V.Journal of Chemical Information and Modeling (2019), 59 (3), 1062-1072CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Acute toxicity is one of the most challenging properties to predict purely with computational methods due to its direct relationship to biol. interactions. Moreover, toxicity can be represented by different endpoints: it can be measured for different species using different types of administration, etc., and it is questionable if the knowledge transfer between endpoints is possible. We performed a comparative study of prediction multi-task toxicity for a broad chem. space using different descriptors and modeling algorithms and applied multi-task learning for a large toxicity dataset extd. from the Registry of Toxic Effects of Chem. Substances (RTECS). We demonstrated that multi-task modeling provides significant improvement over single-output models and other machine learning methods. Our research reveals that multi-task learning can be very useful to improve the quality of acute toxicity modeling and raises a discussion about the usage of multi-task approaches for regulation purposes.
- 8van Westen, G. J. P.; Wegner, J. K.; IJzerman, A. P.; van Vlijmen, H. W. T.; Bender, A. Proteochemometric Modeling as a Tool to Design Selective Compounds and for Extrapolating to Novel Targets. MedChemComm 2011, 2, 16– 30, DOI: 10.1039/C0MD00165AGoogle Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXksFGltQ%253D%253D&md5=5edd71d2091121325d7e47439d28fd57Proteochemometric modeling as a tool to design selective compounds and for extrapolating to novel targetsvan Westen, Gerard J. P.; Wegner, Joerg K.; IJzerman, Adriaan P.; van Vlijmen, Herman W. T.; Bender, A.MedChemComm (2011), 2 (1), 16-30CODEN: MCCEAY; ISSN:2040-2503. (Royal Society of Chemistry)A review. 'Proteochemometric modeling' is a bioactivity modeling technique founded on the description of both small mols. (the ligands), and proteins (the targets). By combining those two elements of a ligand - target interaction proteochemometrics techniques model the interaction complex or the full ligand - target interaction space, and they are able to quantify the similarity between both ligands and targets simultaneously. Consequently, proteochemometric models or complex based models, can be considered an extension of QSAR models, which are ligand based. As proteochemometric models are able to incorporate target information they outperform conventional QSAR models when extrapolating from the activities of known ligands on known targets to novel targets. Vice versa, proteochemometrics can be used to virtually screen for selective compds. that are solely active on a single member of a subfamily of targets, as well as to select compds. with a desired bioactivity profile - a topic particularly relevant with concepts such as 'ligand polypharmacol.' in mind. Here we illustrate the concept of proteochemometrics and provide a review of relevant methodol. publications in the field. 
We give an overview of the target families proteochemometrics modeling has previously been applied to, and introduce some novel application areas of the modeling technique. We conclude that proteochemometrics is a promising technique in preclin. drug research that allows merging data sets that were previously considered sep., with the potential to extrapolate more reliably both in ligand as well as target space.
- 9Cortés-Ciriano, I.; Ain, Q. U.; Subramanian, V.; Lenselink, E. B.; Méndez-Lucio, O.; IJzerman, A. P.; Wohlfahrt, G.; Prusis, P.; Malliavin, T. E.; van Westen, G. J. P.; Bender, A. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. MedChemComm 2015, 6, 24– 50, DOI: 10.1039/C4MD00216DGoogle Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXhslagsLrE&md5=b3e82734e224ae4990218b6473ef7859Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospectsCortes-Ciriano, Isidro; Ain, Qurrat Ul; Subramanian, Vigneshwari; Lenselink, Eelke B.; Mendez-Lucio, Oscar; IJzerman, Adriaan P.; Wohlfahrt, Gerd; Prusis, Peteris; Malliavin, Therese E.; van Westen, Gerard J. P.; Bender, AndreasMedChemComm (2015), 6 (1), 24-50CODEN: MCCEAY; ISSN:2040-2503. (Royal Society of Chemistry)A review. Proteochemometric (PCM) modeling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously. Hence it has been found to be particularly useful when exploring the selectivity and promiscuity of ligands on different proteins. In this review, we will firstly provide a brief introduction to the main concepts of PCM for readers new to the field. The next part focuses on recent tech. advances, including the application of support vector machines (SVMs) using different kernel functions, random forests, Gaussian processes and collaborative filtering. The subsequent section will then describe some novel practical applications of PCM in the medicinal chem. field, including studies on GPCRs, kinases, viral proteins (e.g. from HIV) and epigenetic targets such as histone deacetylases. 
Finally, we will conclude by summarizing novel developments in PCM, which we expect to gain further importance in the future. These developments include adding three-dimensional protein target information, application of PCM to the prediction of binding energies, and application of the concept in the fields of pharmacogenomics and toxicogenomics. This review is an update to a related publication in 2011 and it mainly focuses on developments in the field since then.
- 10Schaduangrat, N.; Anuwongcharoen, N.; Phanus-umporn, C.; Sriwanichpoom, N.; Wikberg, J. E.; Nantasenamat, C. Silico Drug Design; Elsevier, 2019; pp 281– 302.Google ScholarThere is no corresponding record for this reference.
- 11Alves, V. M.; Golbraikh, A.; Capuzzi, S. J.; Liu, K.; Lam, W. I.; Korn, D. R.; Pozefsky, D.; Andrade, C. H.; Muratov, E. N.; Tropsha, A. Multi-Descriptor Read Across (MuDRA): A Simple and Transparent Approach for Developing Accurate Quantitative Structure-Activity Relationship Models. J. Chem. Inf. Model. 2018, 58, 1214– 1223, DOI: 10.1021/acs.jcim.8b00124Google Scholar11https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhtVags7rK&md5=4a8174226c05b72d6f7b6856e0ffa855Multi-Descriptor Read Across (MuDRA): A Simple and Transparent Approach for Developing Accurate Quantitative Structure-Activity Relationship ModelsAlves, Vinicius M.; Golbraikh, Alexander; Capuzzi, Stephen J.; Liu, Kammy; Lam, Wai In; Korn, Daniel Robert; Pozefsky, Diane; Andrade, Carolina Horta; Muratov, Eugene N.; Tropsha, AlexanderJournal of Chemical Information and Modeling (2018), 58 (6), 1214-1223CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Multiple approaches to quant. structure-activity relationship (QSAR) modeling using various statistical or machine learning techniques and different types of chem. descriptors have been developed over the years. Oftentimes models are used in consensus to make more accurate predictions at the expense of model interpretation. We propose a simple, fast, and reliable method termed Multi-Descriptor Read Across (MuDRA) for developing both accurate and interpretable models. The method is conceptually related to the well-known kNN approach but uses different types of chem. descriptors simultaneously for similarity assessment. To benchmark the new method, we have built MuDRA models for six different end points (Ames mutagenicity, aquatic toxicity, hepatotoxicity, hERG liability, skin sensitization, and endocrine disruption) and compared the results with those generated with conventional consensus QSAR modeling. 
We find that models built with MuDRA show consistently high external accuracy similar to that of conventional QSAR models. However, MuDRA models excel in terms of transparency, interpretability, and computational efficiency. We posit that due to its methodol. simplicity and reliable predictive accuracy, MuDRA provides a powerful alternative to a much more complex consensus QSAR modeling. MuDRA is implemented and freely available at the Chembench web portal (https://chembench.mml.unc.edu/mudra).
- 12Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B., Eds.; Springer US: Boston, MA, 2015.Google ScholarThere is no corresponding record for this reference.
- 13Waegeman, W.; Dembczyński, K.; Hüllermeier, E. Multi-target prediction: a unifying view on problems and methods. Data Min. Knowl. Discovery 2019, 33, 293– 324, DOI: 10.1007/s10618-018-0595-5Google ScholarThere is no corresponding record for this reference.
- 14Bennett, J.; Elkan, C.; Liu, B.; Smyth, P.; Tikk, D. KDD Cup and Workshop 2007. SIGKDD Explor. Newsl. 2007, 9, 51– 52, DOI: 10.1145/1345448.1345459Google ScholarThere is no corresponding record for this reference.
- 15Amatriain, X.; Basilico, J. Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B., Eds.; Springer US: Boston, MA, 2015; pp 385– 419.Google ScholarThere is no corresponding record for this reference.
- 16Thorat, P. B.; Goudar, R. M.; Barve, S. Survey on Collaborative Filtering, Content-based Filtering and Hybrid Recommendation System. Int. J. Comput. Appl. 2015, 110, 31– 36Google ScholarThere is no corresponding record for this reference.
- 17Aggarwal, C. C. Recommender Systems: The Textbook, 1st ed.; Springer Publishing Company, Inc., 2016.Google ScholarThere is no corresponding record for this reference.
- 18Sanghavi, B.; Rathod, R.; Mistry, D. M. Recommender Systems-Comparison of Content-based Filtering and Collaborative Filtering. Int. J. Curr. Eng. Technol. 2014, 4, 3131– 3133Google ScholarThere is no corresponding record for this reference.
- 19Aggarwal, P.; Tomar, V.; Kathuria, A. Comparing Content Based and Collaborative Filtering in Recommender Systems. Int. J. New Technol. Res. 2017, 3, 3Google ScholarThere is no corresponding record for this reference.
- 20Ariff, N. M.; Bakar, M. A. A.; Rahim, N. F. In Comparison Between Content-based and Collaborative Filtering Recommendation System for Movie Suggestions, AIP Conference Proceedings; AIP Publishing LLC: Kuala Lumpur, Malaysia, 2018; p 020057.Google ScholarThere is no corresponding record for this reference.
- 21Su, X.; Khoshgoftaar, T. M. A Survey of Collaborative Filtering Techniques. Adv. Artif. Intell. 2009, 2009, 1– 19, DOI: 10.1155/2009/421425Google ScholarThere is no corresponding record for this reference.
- 22Nilashi, M.; Bagherifard, K.; Ibrahim, O.; Alizadeh, H.; Nojeem, L. A.; Roozegar, N. Collaborative Filtering Recommender Systems. Res. J. Appl. Sci., Eng. Technol. 2013, 5, 4168– 4182, DOI: 10.19026/rjaset.5.4644Google ScholarThere is no corresponding record for this reference.
- 23Sharma, M.; Mann, S. A Survey of Recommender Systems: Approaches and Limitations. Int. J. Innov. Sci. Eng. Technol. 2013, 1– 9Google ScholarThere is no corresponding record for this reference.
- 24Cacheda, F.; Carneiro, V.; Fernández, D.; Formoso, V. Comparison of collaborative filtering algorithms: Limitations of current techniques and proposals for scalable, high-performance recommender systems. ACM Trans. Web 2011, 5, 1– 33, DOI: 10.1145/1921591.1921593Google ScholarThere is no corresponding record for this reference.
- 25Kumar Bokde, D.; Girase, S.; Mukhopadhyay, D. Matrix Factorization Model in Collaborative Filtering Algorithms: A Survey. Procedia Comput. Sci. 2015, 49, 136– 146Google ScholarThere is no corresponding record for this reference.
- 26Pazzani, M. J.; Billsus, D. The Adaptive Web; Brusilovsky, P.; Kobsa, A.; Nejdl, W., Eds.; Springer: Berlin, Heidelberg, 2007; Vol. 4321, pp 325– 341.Google ScholarThere is no corresponding record for this reference.
- 27Lops, P.; de Gemmis, M.; Semeraro, G. Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B.; Kantor, P. B., Eds.; Springer US: Boston, MA, 2011; pp 73– 105.Google ScholarThere is no corresponding record for this reference.
- 28Zhang, W.; Zou, H.; Luo, L.; Liu, Q.; Wu, W.; Xiao, W. Predicting potential side effects of drugs by recommender methods and ensemble learning. Neurocomputing 2016, 173, 979– 987, DOI: 10.1016/j.neucom.2015.08.054Google ScholarThere is no corresponding record for this reference.
- 29Fan, J.; Yang, J.; Jiang, Z. Prediction of Central Nervous System Side Effects Through Drug Permeability to Blood-Brain Barrier and Recommendation Algorithm. J. Comput. Biol. 2018, 25, 1– 9, DOI: 10.1089/cmb.2017.0149Google ScholarThere is no corresponding record for this reference.
- 30Wang, H.; Gu, Q.; Wei, J.; Cao, Z.; Liu, Q. Mining drug-disease relationships as a complement to medical genetics-based drug repositioning: Where a recommendation system meets genome-wide association studies. Clin. Pharmacol. Ther. 2015, 97, 451– 454, DOI: 10.1002/cpt.82Google Scholar30https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC2MrksVWgsQ%253D%253D&md5=19103bff034f45685bfc5aed59126da9Mining drug-disease relationships as a complement to medical genetics-based drug repositioning: Where a recommendation system meets genome-wide association studiesWang H; Gu Q; Wei J; Cao Z; Liu QClinical pharmacology and therapeutics (2015), 97 (5), 451-4 ISSN:.A novel recommendation-based drug repositioning strategy is presented to simultaneously determine novel drug indications and side effects in one integrated framework. This strategy provides a complementary method to medical genetics-based drug repositioning, which reduces the occurrence of false positives in medical genetics-based drug repositioning, resulting in a ranked list of new candidate indications and/or side effects with different confidence levels. Several new drug indications and side effects are reported with high prediction confidences.
- 31. Hao, W.; Hai-ping, W.; Xin-dong, W.; Qi, L. Mining Drug-Disease Relationships: a Recommendation System. Chin. Pharmacol. Bull. 2015, 31, 1770–1774
- 32. Yang, J.; Li, Z.; Fan, X.; Cheng, Y. Drug-Disease Association and Drug-Repositioning Predictions in Complex Diseases Using Causal Inference-Probabilistic Matrix Factorization. J. Chem. Inf. Model. 2014, 54, 2562–2569, DOI: 10.1021/ci500340n
- 33. Galeano, D.; Paccanaro, A. A Recommender System Approach for Predicting Drug Side Effects. In International Joint Conference on Neural Networks (IJCNN); IEEE, 2018; pp 1–7.
- 34. Qiu, H.; Mao, K.-T.; Shi, J.-Y.; Huang, H.; Chen, Z.; Dong, K.; Yiu, S.-M. Predicting and Understanding Comprehensive Drug-Drug Interactions via Semi-nonnegative Matrix Factorization. BMC Syst. Biol. 2018, 101–110
- 35. Shi, J.-Y.; Huang, H.; Li, J.-X.; Lei, P.; Zhang, Y.-N.; Yiu, S.-M. Predicting Comprehensive Drug-Drug Interactions for New Drugs via Triple Matrix Factorization. Bioinf. Biomed. Eng. 2017, 2018, 108–117
- 36. Yamada, M.; Lian, W.; Goyal, A.; Chen, J.; Wimalawarne, K.; Khan, S. A.; Kaski, S.; Mamitsuka, H.; Chang, Y. In Convex Factorization Machine for Toxicogenomics Prediction, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), pp 1215–1224.
- 37. Bhat, S.; Aishwarya, K. In Item-Based Hybrid Recommender System For Newly Marketed Pharmaceutical Drugs, 2013 International Conference on Advances in Computing, Mysore, 2013; pp 2107–2111.
- 38. Huang, Z.; Lu, X.; Duan, H.; Zhao, C. Collaboration-based Medical Knowledge Recommendation. Artif. Intell. Med. 2012, 55, 13–24, DOI: 10.1016/j.artmed.2011.10.002
- 39. Ma, J.; Zhang, R.; Yuan, Y.; Zhao, Z. Using Hybrid Similarity-Based Collaborative Filtering Method for Compound Activity Prediction. In International Conference on Intelligent Computing; Springer: Cham, 2018; pp 51–72.
- 40. Simm, J.; Arany, A.; Zakeri, P.; Haber, T.; Wegner, J. K.; Chupakhin, V.; Ceulemans, H.; Moreau, Y. In Macau: Scalable Bayesian Factorization with High-Dimensional Side Information Using MCMC, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing, 2017; pp 2107–2111.
- 41. de León, A.; Chen, B.; Gillet, V. J. Effect of Missing Data on Multitask Prediction Methods. J. Cheminf. 2018, 10, 26 DOI: 10.1186/s13321-018-0281-z
- 42. Hasan, S.; Duncan, G. T.; Neill, D. B.; Padman, R. In Towards a Collaborative Filtering Approach to Medication Reconciliation, AMIA Annual Symposium Proceedings, 2008; pp 288–292.
- 43. Hasan, S.; Duncan, G. T.; Neill, D. B.; Padman, R. Automatic Detection of Omissions in Medication Lists. J. Am. Med. Inform. Assoc. 2011, 18, 449–458, DOI: 10.1136/amiajnl-2011-000106
- 44. Huang, Z.; Lu, X.; Duan, H.; Zhao, C. Collaboration-based Medical Knowledge Recommendation. Artif. Intell. Med. 2012, 55, 13–24, DOI: 10.1016/j.artmed.2011.10.002
- 45. Nikitina, A. A.; Orlov, A. A.; Kozlovskaya, L. I.; Palyulin, V. A.; Osolodkin, D. I. Enhanced Taxonomy Annotation of Antiviral Activity Data from ChEMBL. Database 2019, 2019, bay139 DOI: 10.1093/database/bay139
- 46. Seley-Radtke, K. L.; Yates, M. K. The Evolution of Nucleoside Analogue Antivirals: A Review for Chemists and Non-chemists. Part 1: Early Structural Modifications to the Nucleoside Scaffold. Antiviral Res. 2018, 154, 66–86, DOI: 10.1016/j.antiviral.2018.04.004
- 47. Yates, M. K.; Seley-Radtke, K. L. The Evolution of Antiviral Nucleoside Analogues: A Review for Chemists and Non-chemists. Part II: Complex Modifications to the Nucleoside Scaffold. Antiviral Res. 2019, 162, 5–21, DOI: 10.1016/j.antiviral.2018.11.016
- 48. Li, G.; De Clercq, E. Therapeutic options for the 2019 novel coronavirus (2019-nCoV). Nat. Rev. Drug Discov. 2020, 19, 149–150, DOI: 10.1038/d41573-020-00016-0
- 49. Grčar, M.; Mladenič, D.; Fortuna, B.; Grobelnik, M. Advances in Web Mining and Web Usage Analysis; Hutchison, D.; Kanade, T.; Kittler, J.; Kleinberg, J. M.; Mattern, F.; Mitchell, J. C.; Naor, M.; Nierstrasz, O.; Pandu Rangan, C.; Steffen, B., Eds.; Springer: Berlin, Heidelberg, 2006; Vol. 4198, pp 58–76.
- 50. Guo, M. User Modeling, Adaptation, and Personalization; Hutchison, D.; Kanade, T.; Kittler, J.; Kleinberg, J. M.; Mattern, F.; Mitchell, J. C.; Naor, M.; Nierstrasz, O.; Pandu Rangan, C.; Steffen, B., Eds.; Springer: Berlin, Heidelberg, 2012; Vol. 7379, pp 361–364.
- 51. Hug, N. Surprise, a Python Library for Recommender Systems. http://surpriselib.com (accessed December 1, 2018).
- 52. Nazarov, I.; Shirokikh, B.; Burkina, M.; Fedonin, G.; Panov, M. Sparse Group Inductive Matrix Completion, 2018. arXiv preprint arXiv:1804.10653. https://arxiv.org/abs/1804.10653.
- 53. Davies, M.; Nowotka, M.; Papadatos, G.; Dedman, N.; Gaulton, A.; Atkinson, F.; Bellis, L.; Overington, J. P. ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015, 43, W612–W620, DOI: 10.1093/nar/gkv352
- 54. Kode srl, Dragon (Software for Molecular Descriptor Calculation), version 7.0.8, 2017. https://chm.kode-solutions.net.
- 55. ICTV Master Species List, v.1. https://talk.ictvonline.org/files/master-species-lists/m/msl/5945 (accessed July 1, 2018).
- 56. Muhammad, U.; Uzairu, A.; Ebuka Arthur, D. Review on: quantitative structure activity relationship (QSAR) modeling. J. Anal. Pharm. 2018, 7, 240–242, DOI: 10.15406/japlr.2018.07.00232
- 57. Siontis, G. C.; Tzoulaki, I.; Castaldi, P. J.; Ioannidis, J. P. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J. Clin. Epidemiol. 2015, 68, 25–34, DOI: 10.1016/j.jclinepi.2014.09.007
- 58. Raschka, S. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, 2018. arXiv preprint arXiv:1811.12808. https://arxiv.org/abs/1811.12808.
- 59. Brown, J. B. Classifiers and their Metrics Quantified. Mol. Inform. 2018, 37, 1–11, DOI: 10.1002/minf.201700127
- 60. Ramsundar, B.; Kearnes, S.; Riley, P.; Webster, D.; Konerding, D.; Pande, V. Massively Multitask Networks for Drug Discovery, 2015. arXiv preprint arXiv:1502.02072. https://arxiv.org/abs/1502.02072.
- 61. Rücker, C.; Rücker, G.; Meringer, M. y-Randomization and Its Variants in QSPR/QSAR. J. Chem. Inf. Model. 2007, 47, 2345–2357, DOI: 10.1021/ci700157b
- 62. Kovatcheva, A.; Golbraikh, A.; Oloff, S.; Feng, J.; Zheng, W.; Tropsha, A. QSAR Modeling of Datasets with Enantioselective Compounds using Chirality Sensitive Molecular Descriptors. SAR QSAR Environ. Res. 2005, 16, 93–102, DOI: 10.1080/10629360412331319844
- 63. de Cerqueira Lima, P.; Golbraikh, A.; Oloff, S.; Xiao, Y.; Tropsha, A. Combinatorial QSAR Modeling of P-Glycoprotein Substrates. J. Chem. Inf. Model. 2006, 46, 1245–1254, DOI: 10.1021/ci0504317
- 64. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2018, 12, e0177678 DOI: 10.1016/j.aci.2018.08.003
- 65. Ozsoy, M. G.; Özyer, T.; Polat, F.; Alhajj, R. Realizing Drug Repositioning by Adapting a Recommendation System to Handle the Process. BMC Bioinform. 2018, 19, 263–266
- 66. Yang, J.; Li, Z.; Fan, X.; Cheng, Y. Drug-Disease Association and Drug-Repositioning Predictions in Complex Diseases Using Causal Inference-Probabilistic Matrix Factorization. J. Chem. Inf. Model. 2014, 54, 2562–2569, DOI: 10.1021/ci500340n
- 67. Ding, H.; Takigawa, I.; Mamitsuka, H.; Zhu, S. Similarity-based Machine Learning Methods for Predicting Drug-Target Interactions: a Brief Review. Brief. Bioinform. 2014, 15, 734–747, DOI: 10.1093/bib/bbt056
- 68. Martin, E. J.; Polyakov, V. R.; Zhu, X.-W.; Mukherjee, P.; Tian, L.; Liu, X. All-Assay-Max2 pQSAR: Activity Predictions as Accurate as 4-concentration IC50s for 8558 Novartis Assays. J. Chem. Inf. Model. 2019, 59, 4450–4459, DOI: 10.1021/acs.jcim.9b00375
- 69. Koohi, A. In Prediction of Drug-Target Interactions Using Popular Collaborative Filtering Methods, 2013 IEEE International Workshop on Genomic Signal Processing and Statistics, 2013; pp 58–61.
- 70. Peska, L.; Buza, K.; Koller, J. Drug-Target Interaction Prediction: A Bayesian Ranking Approach. Comput. Methods Programs Biomed. 2017, 152, 15–21, DOI: 10.1016/j.cmpb.2017.09.003
- 71. Ezzat, A.; Zhao, P.; Wu, M.; Li, X.-L.; Kwoh, C.-K. Drug-Target Interaction Prediction with Graph Regularized Matrix Factorization. IEEE/ACM Trans. Comput. Biol. Bioinform. 2017, 14, 646–656, DOI: 10.1109/TCBB.2016.2530062
- 72. Digital Science. Data Sourced from Dimensions, an Inter-linked Research Information System Provided by Digital Science. https://www.dimensions.ai (accessed October 1, 2018).
- 73. Hook, D. W.; Porter, S. J.; Herzog, C. Dimensions: Building Context for Search and Evaluation. Front. Res. Metrics Anal. 2018, 3, 23, DOI: 10.3389/frma.2018.00023
Cited By
This article is cited by 16 publications.
- Diba Behnoudfar, Cory M. Simon, Joshua Schrier. Data-Driven Imputation of Miscibility of Aqueous Solutions via Graph-Regularized Logistic Matrix Factorization. The Journal of Physical Chemistry B 2023, 127 (37), 7964-7973. https://doi.org/10.1021/acs.jpcb.3c03789
- Arni Sturluson, Ali Raza, Grant D. McConachie, Daniel W. Siderius, Xiaoli Z. Fern, Cory M. Simon. Recommendation System to Predict Missing Adsorption Properties of Nanoporous Materials. Chemistry of Materials 2021, 33 (18), 7203-7216. https://doi.org/10.1021/acs.chemmater.1c01201
- Monica Rahma Fauziah, Dina Tri Utari. Content-based filtering for drug recommendation systems: Exploring drug characteristics for personalized treatment recommendations. 2025, 040012. https://doi.org/10.1063/5.0236682
- Holli-Joi Martin, Cleber C. Melo-Filho, Alexey V. Zakharov, Eugene Muratov, Alexander Tropsha. On the importance of data curation for knowledge mining in antiviral research. Science Progress 2025, 108 (1). https://doi.org/10.1177/00368504241301535
- Mariam Zomorodi, Ismail Ghodsollahee, Jennifer H Martin, Nicholas J Talley, Vahid Salari, Paweł Pławiak, Kazem Rahimi, U.R. Acharya. RECOMED: A comprehensive pharmaceutical recommendation system. Artificial Intelligence in Medicine 2024, 157, 102981. https://doi.org/10.1016/j.artmed.2024.102981
- Matilde Pato, Márcia Barros, Francisco M. Couto. Survey on Recommender Systems for Biomedical Items in Life and Health Sciences. ACM Computing Surveys 2024, 56 (6), 1-32. https://doi.org/10.1145/3639047
- Alexandrina S. Volobueva, Anton A. Shetnev, Mikhail G. Mikhalski, Valeria A. Panova, Darina D. Barkhatova, Ekaterina D. Korshunova, Sergey A. Ivanovskiy, Vladimir V. Zarubaev, Sergey V. Baykov. Benzocaine-N-acylindoline conjugates: synthesis and antiviral activity against Coxsackievirus B3. Medicinal Chemistry Research 2024, 33 (3), 464-475. https://doi.org/10.1007/s00044-024-03191-6
- Alexandrina Volobueva, Anton Shetnev, Mikhail Mikhalski, Valeria Panova, Darina Barkhatova, Ekaterina Korshunova, Sergey Ivanovskii, Vladimir Zarubaev, Sergey Baykov. Benzocaine-N-acylindoline Conjugates: Synthesis and Antiviral Activity Against Coxsackievirus B3. 2023. https://doi.org/10.21203/rs.3.rs-3447939/v1
- Fatemeh Vakili, Zahra Vakili, Mehrdad Kargari, Mehran Ghaffari. Drug Recommender System Based on Collaborative Filtering for Multiple Sclerosis Patients. 2023, 305-310. https://doi.org/10.1109/ICWR57742.2023.10139214
- Huijun Li, Lin Zou, Jamal Alzobair Hammad Kowah, Dongqiong He, Zifan Liu, Xuejie Ding, Hao Wen, Lisheng Wang, Mingqing Yuan, Xu Liu. A compact review of progress and prospects of deep learning in drug discovery. Journal of Molecular Modeling 2023, 29 (4). https://doi.org/10.1007/s00894-023-05492-w
- Ekaterina A. Sosnina, Sergey Sosnin, Maxim V. Fedorov. Improvement of multi-task learning by data enrichment: application for drug discovery. Journal of Computer-Aided Molecular Design 2023, 37 (4), 183-200. https://doi.org/10.1007/s10822-023-00500-w
- Deepa D, T. Dhiliphan Rajkumar. Navigating the Healthcare Landscape with Recommendation Systems: A Survey of Current Applications and Potential Impact. 2023, 818-823. https://doi.org/10.1109/ICCMC56507.2023.10083785
- Theresa Olubukola Omodunbi, Grace Egbi Alilu, Rhoda Nsikanabasi Ikono. Drug Recommender Systems: A Review of State-of-the-Art Algorithms. 2022, 1-8. https://doi.org/10.1109/ITED56637.2022.10051591
- Yaqi Zhang, Gancheng Zhu, Kewei Li, Fei Li, Lan Huang, Meiyu Duan, Fengfeng Zhou. HLAB: learning the BiLSTM features from the ProtBert-encoded proteins for the class I HLA-peptide binding prediction. Briefings in Bioinformatics 2022, 23 (5). https://doi.org/10.1093/bib/bbac173
- Marcia Barros, Andre Moitinho, Francisco M. Couto. Hybrid semantic recommender system for chemical compounds in large-scale datasets. Journal of Cheminformatics 2021, 13 (1). https://doi.org/10.1186/s13321-021-00495-2
- Moe Elbadawi, Simon Gaisford, Abdul W. Basit. Advanced machine-learning techniques in drug discovery. Drug Discovery Today 2021, 26 (3), 769-777. https://doi.org/10.1016/j.drudis.2020.12.003
- Olga A. Tarasova, Anastasia V. Rudik, Sergey M. Ivanov, Alexey A. Lagunin, Vladimir V. Poroikov, Dmitry A. Filimonov. Machine Learning Methods in Antiviral Drug Discovery. 2021, 245-279. https://doi.org/10.1007/7355_2021_121
Figure 1. Scheme of data preparation.
Figure 2. Addressed challenges: (a) prediction of point compound–virus interactions, (b) compoundwise CS prediction, and (c) specieswise CS prediction. Matrix of interactions, green; matrix of species features, pink; matrix of compound features, yellow; and unknown compound–virus interactions, white.
Figure 3. Violin plot of ROC AUC values for viral species in cross-validation (blue) and external validation (red). Dotted lines inside the violins represent the quartiles of the distribution.
Figure 4. Guided grid search of Classo, Cridge, and Cgroup coefficients for interaction prediction for known compounds and viral species based on (a) ROC AUC, (b) mean ROC AUC, and (c) median ROC AUC. Rank = 10, number of iterations = 70.
Figure 5. Violin plots of ROC AUC values for viral species: (a) prediction of point compound–virus interactions, (b) compoundwise CS prediction, and (c) specieswise CS prediction. The prediction was assessed in cross-validation (light blue and coral) and external validation (dark blue, red, and green). Lines depict the dependence of median ROC AUC scores on the number of iterations. Dotted lines inside the violins represent the quartiles of the distribution. Rank = 10, Classo = 0.0, Cgroup = 0.0, and Cridge = 120.0.
Figure 6. Dependence of the median ROC AUC score for point interaction prediction on the number of iterations through cross-validation with original feature matrices (red), unit vector for species (blue), and unit vector for compounds (green). Rank = 10, Classo = 0.0, Cgroup = 0.0, and Cridge = 120.0. Error bars represent the SD.
Figure 7. Dependence of mean ROC AUC (a) and median ROC AUC (b) for models with a different number of compound features on the number of iterations. Rank = 10, Classo = 0.0, Cgroup = 0.0, and Cridge = 120.0. Compound feature matrices: DB_c.main (red ★), DB_c.50d (blue ■), DB_c.25d (magenta ◆), DB_c.10d (green ×), DB_c.8 (orange •), and DB_c.1 (light blue ▲). Error bars represent the SD.
Figure 8. Influence of the Cgroup regularization coefficient in cross-validation for point interaction prediction on the mean/median ROC AUC at 70 (a) and 10 (b) iterations. Continuous and dashed red lines indicate the mean and median ROC AUC, and continuous and dashed blue lines indicate the mean and median number of zeroed features. Shaded areas represent the corresponding standard deviations. The black dash-dotted line shows median ROC AUC with 50% of compound features. Classo = 0.0, Cridge = 120.0, and rank = 10.
- 10. Schaduangrat, N.; Anuwongcharoen, N.; Phanus-umporn, C.; Sriwanichpoom, N.; Wikberg, J. E.; Nantasenamat, C. In Silico Drug Design; Elsevier, 2019; pp 281–302.
- 11. Alves, V. M.; Golbraikh, A.; Capuzzi, S. J.; Liu, K.; Lam, W. I.; Korn, D. R.; Pozefsky, D.; Andrade, C. H.; Muratov, E. N.; Tropsha, A. Multi-Descriptor Read Across (MuDRA): A Simple and Transparent Approach for Developing Accurate Quantitative Structure-Activity Relationship Models. J. Chem. Inf. Model. 2018, 58, 1214–1223, DOI: 10.1021/acs.jcim.8b00124
- 12. Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B., Eds.; Springer US: Boston, MA, 2015.
- 13. Waegeman, W.; Dembczyński, K.; Hüllermeier, E. Multi-target prediction: a unifying view on problems and methods. Data Min. Knowl. Discovery 2019, 33, 293–324, DOI: 10.1007/s10618-018-0595-5
- 14. Bennett, J.; Elkan, C.; Liu, B.; Smyth, P.; Tikk, D. KDD Cup and Workshop 2007. SIGKDD Explor. Newsl. 2007, 9, 51–52, DOI: 10.1145/1345448.1345459
- 15. Amatriain, X.; Basilico, J. Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B., Eds.; Springer US: Boston, MA, 2015; pp 385–419.
- 16. Thorat, P. B.; Goudar, R. M.; Barve, S. Survey on Collaborative Filtering, Content-based Filtering and Hybrid Recommendation System. Int. J. Comput. Appl. 2015, 110, 31–36
- 17. Aggarwal, C. C. Recommender Systems: The Textbook, 1st ed.; Springer Publishing Company, Inc., 2016.
- 18. Sanghavi, B.; Rathod, R.; Mistry, D. M. Recommender Systems-Comparison of Content-based Filtering and Collaborative Filtering. Int. J. Curr. Eng. Technol. 2014, 4, 3131–3133
- 19. Aggarwal, P.; Tomar, V.; Kathuria, A. Comparing Content Based and Collaborative Filtering in Recommender Systems. Int. J. New Technol. Res. 2017, 3, 3
- 20. Ariff, N. M.; Bakar, M. A. A.; Rahim, N. F. In Comparison Between Content-based and Collaborative Filtering Recommendation System for Movie Suggestions, AIP Conference Proceedings; AIP Publishing LLC: Kuala Lumpur, Malaysia, 2018; p 020057.
- 21. Su, X.; Khoshgoftaar, T. M. A Survey of Collaborative Filtering Techniques. Adv. Artif. Intell. 2009, 2009, 1–19, DOI: 10.1155/2009/421425
- 22. Nilashi, M.; Bagherifard, K.; Ibrahim, O.; Alizadeh, H.; Nojeem, L. A.; Roozegar, N. Collaborative Filtering Recommender Systems. Res. J. Appl. Sci., Eng. Technol. 2013, 5, 4168–4182, DOI: 10.19026/rjaset.5.4644
- 23. Sharma, M.; Mann, S. A Survey of Recommender Systems: Approaches and Limitations. Int. J. Innov. Sci. Eng. Technol. 2013, 1–9
- 24. Cacheda, F.; Carneiro, V.; Fernández, D.; Formoso, V. Comparison of collaborative filtering algorithms: Limitations of current techniques and proposals for scalable, high-performance recommender systems. ACM Trans. Web 2011, 5, 1–33, DOI: 10.1145/1921591.1921593
- 25. Kumar Bokde, D.; Girase, S.; Mukhopadhyay, D. Matrix Factorization Model in Collaborative Filtering Algorithms: A Survey. Procedia Comput. Sci. 2015, 49, 136–146
- 26. Pazzani, M. J.; Billsus, D. The Adaptive Web; Brusilovsky, P.; Kobsa, A.; Nejdl, W., Eds.; Springer: Berlin, Heidelberg, 2007; Vol. 4321, pp 325–341.
- 27. Lops, P.; de Gemmis, M.; Semeraro, G. Recommender Systems Handbook; Ricci, F.; Rokach, L.; Shapira, B.; Kantor, P. B., Eds.; Springer US: Boston, MA, 2011; pp 73–105.
- 28. Zhang, W.; Zou, H.; Luo, L.; Liu, Q.; Wu, W.; Xiao, W. Predicting potential side effects of drugs by recommender methods and ensemble learning. Neurocomputing 2016, 173, 979–987, DOI: 10.1016/j.neucom.2015.08.054
- 29. Fan, J.; Yang, J.; Jiang, Z. Prediction of Central Nervous System Side Effects Through Drug Permeability to Blood-Brain Barrier and Recommendation Algorithm. J. Comput. Biol. 2018, 25, 1–9, DOI: 10.1089/cmb.2017.0149
- 30. Wang, H.; Gu, Q.; Wei, J.; Cao, Z.; Liu, Q. Mining drug-disease relationships as a complement to medical genetics-based drug repositioning: Where a recommendation system meets genome-wide association studies. Clin. Pharmacol. Ther. 2015, 97, 451–454, DOI: 10.1002/cpt.82
- 31. Hao, W.; Hai-ping, W.; Xin-dong, W.; Qi, L. Mining Drug-Disease Relationships: a Recommendation System. Chin. Pharmacol. Bull. 2015, 31, 1770–1774
- 32. Yang, J.; Li, Z.; Fan, X.; Cheng, Y. Drug-Disease Association and Drug-Repositioning Predictions in Complex Diseases Using Causal Inference-Probabilistic Matrix Factorization. J. Chem. Inf. Model. 2014, 54, 2562–2569, DOI: 10.1021/ci500340n
- 33. Galeano, D.; Paccanaro, A. A Recommender System Approach for Predicting Drug Side Effects. In International Joint Conference on Neural Networks (IJCNN); IEEE, 2018; pp 1–7.
- 34. Qiu, H.; Mao, K.-T.; Shi, J.-Y.; Huang, H.; Chen, Z.; Dong, K.; Yiu, S.-M. Predicting and Understanding Comprehensive Drug-Drug Interactions via Semi-nonnegative Matrix Factorization. BMC Syst. Biol. 2018, 101–110
- 35. Shi, J.-Y.; Huang, H.; Li, J.-X.; Lei, P.; Zhang, Y.-N.; Yiu, S.-M. Predicting Comprehensive Drug-Drug Interactions for New Drugs via Triple Matrix Factorization. Bioinf. Biomed. Eng. 2017, 2018, 108–117
- 36. Yamada, M.; Lian, W.; Goyal, A.; Chen, J.; Wimalawarne, K.; Khan, S. A.; Kaski, S.; Mamitsuka, H.; Chang, Y. In Convex Factorization Machine for Toxicogenomics Prediction, Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’17), pp 1215–1224.
- 37. Bhat, S.; Aishwarya, K. In Item-Based Hybrid Recommender System For Newly Marketed Pharmaceutical Drugs, 2013 International Conference on Advances in Computing, Mysore, 2013; pp 2107–2111.
- 38. Huang, Z.; Lu, X.; Duan, H.; Zhao, C. Collaboration-based Medical Knowledge Recommendation. Artif. Intell. Med. 2012, 55, 13–24, DOI: 10.1016/j.artmed.2011.10.002
- 39. Ma, J.; Zhang, R.; Yuan, Y.; Zhao, Z. Using Hybrid Similarity-Based Collaborative Filtering Method for Compound Activity Prediction. In International Conference on Intelligent Computing; Springer: Cham, 2018; pp 51–72.
- 40. Simm, J.; Arany, A.; Zakeri, P.; Haber, T.; Wegner, J. K.; Chupakhin, V.; Ceulemans, H.; Moreau, Y. In Macau: Scalable Bayesian Factorization with High-Dimensional Side Information Using MCMC, 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing, 2017; pp 2107–2111.
- 41. de León, A.; Chen, B.; Gillet, V. J. Effect of Missing Data on Multitask Prediction Methods. J. Cheminf. 2018, 10, 26, DOI: 10.1186/s13321-018-0281-z
- 42. Hasan, S.; Duncan, G. T.; Neill, D. B.; Padman, R. In Towards a Collaborative Filtering Approach to Medication Reconciliation, AMIA Annual Symposium Proceedings, 2008; pp 288–292.
- 43. Hasan, S.; Duncan, G. T.; Neill, D. B.; Padman, R. Automatic Detection of Omissions in Medication Lists. J. Am. Med. Inform. Assoc. 2011, 18, 449–458, DOI: 10.1136/amiajnl-2011-000106
- 44. Huang, Z.; Lu, X.; Duan, H.; Zhao, C. Collaboration-based Medical Knowledge Recommendation. Artif. Intell. Med. 2012, 55, 13–24, DOI: 10.1016/j.artmed.2011.10.002
- 45. Nikitina, A. A.; Orlov, A. A.; Kozlovskaya, L. I.; Palyulin, V. A.; Osolodkin, D. I. Enhanced Taxonomy Annotation of Antiviral Activity Data from ChEMBL. Database 2019, 2019, bay139, DOI: 10.1093/database/bay139
- 46. Seley-Radtke, K. L.; Yates, M. K. The Evolution of Nucleoside Analogue Antivirals: A Review for Chemists and Non-chemists. Part 1: Early Structural Modifications to the Nucleoside Scaffold. Antiviral Res. 2018, 154, 66–86, DOI: 10.1016/j.antiviral.2018.04.004
- 47. Yates, M. K.; Seley-Radtke, K. L. The Evolution of Antiviral Nucleoside Analogues: A Review for Chemists and Non-chemists. Part II: Complex Modifications to the Nucleoside Scaffold. Antiviral Res. 2019, 162, 5–21, DOI: 10.1016/j.antiviral.2018.11.016
- 48. Li, G.; De Clercq, E. Therapeutic options for the 2019 novel coronavirus (2019-nCoV). Nat. Rev. Drug Discov. 2020, 19, 149–150, DOI: 10.1038/d41573-020-00016-0
- 49. Grčar, M.; Mladenič, D.; Fortuna, B.; Grobelnik, M. Advances in Web Mining and Web Usage Analysis; Hutchison, D.; Kanade, T.; Kittler, J.; Kleinberg, J. M.; Mattern, F.; Mitchell, J. C.; Naor, M.; Nierstrasz, O.; Pandu Rangan, C.; Steffen, B., Eds.; Springer: Berlin, Heidelberg, 2006; Vol. 4198, pp 58–76.
- 50. Guo, M. User Modeling, Adaptation, and Personalization; Hutchison, D.; Kanade, T.; Kittler, J.; Kleinberg, J. M.; Mattern, F.; Mitchell, J. C.; Naor, M.; Nierstrasz, O.; Pandu Rangan, C.; Steffen, B., Eds.; Springer: Berlin, Heidelberg, 2012; Vol. 7379, pp 361–364.
- 51. Hug, N. Surprise, a Python Library for Recommender Systems. http://surpriselib.com (accessed December 1, 2018).
- 52. Nazarov, I.; Shirokikh, B.; Burkina, M.; Fedonin, G.; Panov, M. Sparse Group Inductive Matrix Completion, 2018. arXiv preprint arXiv:1804.10653. https://arxiv.org/abs/1804.10653.
- 53. Davies, M.; Nowotka, M.; Papadatos, G.; Dedman, N.; Gaulton, A.; Atkinson, F.; Bellis, L.; Overington, J. P. ChEMBL web services: streamlining access to drug discovery data and utilities. Nucleic Acids Res. 2015, 43, W612–W620, DOI: 10.1093/nar/gkv352
- 54. Kode srl, Dragon (Software for Molecular Descriptor Calculation), version 7.0.8, 2017. https://chm.kode-solutions.net.
- 55. ICTV Master Species List, v.1. https://talk.ictvonline.org/files/master-species-lists/m/msl/5945 (accessed July 1, 2018).
- 56. Muhammad, U.; Uzairu, A.; Ebuka Arthur, D. Review on: quantitative structure activity relationship (QSAR) modeling. J. Anal. Pharm. 2018, 7, 240–242, DOI: 10.15406/japlr.2018.07.00232
- 57. Siontis, G. C.; Tzoulaki, I.; Castaldi, P. J.; Ioannidis, J. P. External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination. J. Clin. Epidemiol. 2015, 68, 25–34, DOI: 10.1016/j.jclinepi.2014.09.007
- 58. Raschka, S. Model Evaluation, Model Selection, and Algorithm Selection in Machine Learning, 2018. arXiv preprint arXiv:1811.12808. https://arxiv.org/abs/1811.12808.
- 59. Brown, J. B. Classifiers and their Metrics Quantified. Mol. Inform. 2018, 37, 1–11, DOI: 10.1002/minf.201700127
- 60. Ramsundar, B.; Kearnes, S.; Riley, P.; Webster, D.; Konerding, D.; Pande, V. Massively Multitask Networks for Drug Discovery, 2015. arXiv preprint arXiv:1502.02072. https://arxiv.org/abs/1502.02072.
- 61. Rücker, C.; Rücker, G.; Meringer, M. y-Randomization and Its Variants in QSPR/QSAR. J. Chem. Inf. Model. 2007, 47, 2345–2357, DOI: 10.1021/ci700157b
- 62. Kovatcheva, A.; Golbraikh, A.; Oloff, S.; Feng, J.; Zheng, W.; Tropsha, A. QSAR Modeling of Datasets with Enantioselective Compounds using Chirality Sensitive Molecular Descriptors. SAR QSAR Environ. Res. 2005, 16, 93–102, DOI: 10.1080/10629360412331319844
- 63. de Cerqueira Lima, P.; Golbraikh, A.; Oloff, S.; Xiao, Y.; Tropsha, A. Combinatorial QSAR Modeling of P-Glycoprotein Substrates. J. Chem. Inf. Model. 2006, 46, 1245–1254, DOI: 10.1021/ci0504317
- 64. Tharwat, A. Classification assessment methods. Appl. Comput. Inform. 2018, 12, e0177678 DOI: 10.1016/j.aci.2018.08.003
- 65. Ozsoy, M. G.; Özyer, T.; Polat, F.; Alhajj, R. Realizing Drug Repositioning by Adapting a Recommendation System to Handle the Process. BMC Bioinform. 2018, 19, 263–266
- 66. Yang, J.; Li, Z.; Fan, X.; Cheng, Y. Drug-Disease Association and Drug-Repositioning Predictions in Complex Diseases Using Causal Inference-Probabilistic Matrix Factorization. J. Chem. Inf. Model. 2014, 54, 2562–2569, DOI: 10.1021/ci500340n
- 67. Ding, H.; Takigawa, I.; Mamitsuka, H.; Zhu, S. Similarity-based Machine Learning Methods for Predicting Drug-Target Interactions: a Brief Review. Brief. Bioinform. 2014, 15, 734–747, DOI: 10.1093/bib/bbt056
- 68. Martin, E. J.; Polyakov, V. R.; Zhu, X.-W.; Mukherjee, P.; Tian, L.; Liu, X. All-Assay-Max2 pQSAR: Activity Predictions as Accurate as 4-concentration IC50s for 8558 Novartis Assays. J. Chem. Inf. Model. 2019, 59, 4450–4459, DOI: 10.1021/acs.jcim.9b00375
- 69. Koohi, A. In Prediction of Drug-Target Interactions Using Popular Collaborative Filtering Methods, 2013 IEEE International Workshop on Genomic Signal Processing and Statistics, 2013; pp 58–61.
- 70. Peska, L.; Buza, K.; Koller, J. Drug-Target Interaction Prediction: A Bayesian Ranking Approach. Comput. Methods Programs Biomed. 2017, 152, 15–21, DOI: 10.1016/j.cmpb.2017.09.003
- 71. Ezzat, A.; Zhao, P.; Wu, M.; Li, X.-L.; Kwoh, C.-K. Drug-Target Interaction Prediction with Graph Regularized Matrix Factorization. IEEE/ACM Trans. Comput. Biol. Bioinform. 2017, 14, 646–656, DOI: 10.1109/TCBB.2016.2530062
- 72. Science, D. Data Sourced from Dimensions, an Inter-linked Research Information System Provided by Digital Science. https://www.dimensions.ai (accessed October 1, 2018).
- 73. Hook, D. W.; Porter, S. J.; Herzog, C. Dimensions: Building Context for Search and Evaluation. Front. Res. Metrics Anal. 2018, 3, 23, DOI: 10.3389/frma.2018.00023
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acsomega.0c00857.
File SI1: description of data set preparation, table of the models' hyperparameters, and the best hyperparameters and prediction assessment (PDF)
File SI2: Python code snippet for the metric calculation
File SI3: gzipped tarball containing the data sets: DB_main.csv (compound–virus interactions), DB_c_main.csv (compound features), DB_v_main.csv (virus features), DB_ext.csv (test set of compound–virus interactions), DB_c_ext.csv (test set of compound features), and DB_ext_comp.csv (compound labels for point and CS test prediction); the file is hosted on Zenodo (doi: 10.5281/zenodo.3831446)