Toward Explainable Anticancer Compound Sensitivity Prediction via Multimodal Attention-Based Convolutional Encoders
- Matteo Manica
- Ali Oskooei
- Jannis Born (IBM Research, 8803 Zürich, Switzerland; ETH Zürich, 8092 Zürich, Switzerland; University of Zürich, 8006 Zürich, Switzerland)
- Vigneshwari Subramanian
- Julio Sáez-Rodríguez
- María Rodríguez Martínez* (IBM Research, 8803 Zürich, Switzerland; *E-mail: [email protected])
Abstract

In line with recent advances in neural drug design and sensitivity prediction, we propose a novel architecture for interpretable prediction of anticancer compound sensitivity using a multimodal attention-based convolutional encoder. Our model is based on the three key pillars of drug sensitivity: compounds’ structure in the form of a SMILES sequence, gene expression profiles of tumors, and prior knowledge on intracellular interactions from protein–protein interaction networks. We demonstrate that our multiscale convolutional attention-based encoder significantly outperforms a baseline model trained on Morgan fingerprints and a selection of encoders based on SMILES, as well as the previously reported state-of-the-art for multimodal drug sensitivity prediction (R² = 0.86 and RMSE = 0.89). Moreover, the explainability of our approach is demonstrated by a thorough analysis of the attention weights. We show that the attended genes are significantly enriched in apoptotic processes and that the drug attention is strongly correlated with a standard chemical structure similarity index. Finally, we report a case study of two receptor tyrosine kinase (RTK) inhibitors acting on a leukemia cell line, showcasing the ability of the model to focus on informative genes and submolecular regions of the two compounds. The demonstrated generalizability and the interpretability of our model testify to its potential for in silico prediction of anticancer compound efficacy on unseen cancer cells, positioning it as a valid solution for the development of personalized therapies as well as for the evaluation of candidate compounds in de novo drug design.
- drug sensitivity prediction
- computational systems biology
- deep learning
- machine learning
- drug discovery
- multiscale
- multimodal
- attention
- CNN
- RNN
- explainability
- interpretability
- molecular networks
- molecular fingerprints
- GDSC
- SMILES
- gene expression
- drug sensitivity
- anticancer compounds
- IC50
- EC50
- lead discovery
- personalized medicine
- precision medicine
1. Introduction
1.1. Motivation
1.2. Related Work
1.3. Scope of the Presented Work

Figure 1. Multimodal end-to-end architecture of the proposed encoders. General framework for the explored architectures. Each model ingests a cell–compound pair and makes an IC50 drug sensitivity prediction. Cells are represented by the gene expression values of a subset of 2128 genes, selected according to a network propagation procedure. Compounds are represented by their SMILES string (except for the baseline model, which uses 512-bit fingerprints). The gene vector is fed into an attention-based gene encoder that assigns higher weights to the most informative genes. To encode the SMILES strings, several neural architectures are compared (see section 2 for details) and used in combination with the gene expression encoder to predict drug sensitivity.
2. Methods
2.1. Data
2.2. Network Propagation


2.3. Model Architectures
Deep Baseline (DNN)
Commonalities of SMILES Encoders


Figure 2. Key layers employed throughout the SMILES encoder. (A) SMILES Embedding (SE): An embedding layer transforms raw SMILES strings into a sequence of vectors in an embedding space. (B) Gene attention (GA): An attention-based gene expression encoder generates attention weights that are in turn applied to the input gene subset via a dot product. (C) Contextual attention (CA): A contextual attention layer ingests the SMILES encoding of a compound (either raw or the output of another encoder, e.g., a CNN or RNN) and the genes from a cell to compute an attention distribution (α_i) over all tokens of the SMILES encoding, in the context of the genetic profile of the cell. The attention-filtered molecule represents the most informative molecular substructures for IC50 prediction, given the gene expression of a cell.
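The gene attention (GA) and contextual attention (CA) layers of Figure 2 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the scoring functions (a linear layer followed by a softmax for GA, an additive tanh score for CA), the parameter names `w_gene`, `w_token`, `w_context`, and `v`, and the toy dimensions are all hypothetical. The "dot product" in the caption is interpreted here as an element-wise weighting of the genes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def gene_attention(genes, w_gene):
    """Return attention weights over genes and the attention-filtered genes.

    w_gene is a hypothetical learned square matrix (one row per gene).
    """
    alpha = softmax(matvec(w_gene, genes))
    filtered = [a * g for a, g in zip(alpha, genes)]
    return alpha, filtered


def contextual_attention(tokens, genes, w_token, w_context, v):
    """Return attention weights over SMILES tokens and the filtered molecule.

    tokens: one embedding vector per SMILES token.
    genes: gene expression vector acting as context.
    w_token, w_context, v: hypothetical learned parameters of an additive
    attention score u_i = v . tanh(w_token s_i + w_context g).
    """
    context = matvec(w_context, genes)  # shared across all tokens
    scores = []
    for token in tokens:
        projected = matvec(w_token, token)
        hidden = [math.tanh(t + c) for t, c in zip(projected, context)]
        scores.append(sum(vi * h for vi, h in zip(v, hidden)))
    alpha = softmax(scores)
    dim = len(tokens[0])
    # Attention-weighted sum of the token embeddings.
    molecule = [sum(a * tok[d] for a, tok in zip(alpha, tokens)) for d in range(dim)]
    return alpha, molecule


# Toy example: 3 genes, 4 SMILES tokens with 2-dimensional embeddings.
genes = [0.5, -1.0, 2.0]
w_gene = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gene_alpha, filtered_genes = gene_attention(genes, w_gene)

tokens = [[0.1, 0.3], [0.7, -0.2], [0.0, 0.5], [-0.4, 0.9]]
w_token = [[0.5, -0.1], [0.2, 0.3]]
w_context = [[0.1, 0.0, 0.2], [0.0, -0.3, 0.1]]
v = [1.0, -1.0]
token_alpha, molecule = contextual_attention(
    tokens, filtered_genes, w_token, w_context, v
)
```

Both attention vectors sum to one, so they can be read as distributions over genes and over SMILES tokens, which is what makes the weights inspectable after training.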
Bidirectional Recurrent (bRNN)
Stacked Convolutional Encoder (SCNN)
Self-Attention (SA)



Contextual-Attention (CA)

Multiscale Convolutional Attention (MCA)

Figure 3. Model architecture of the multiscale convolutional attentive (MCA) encoder. The MCA model employed three parallel channels of convolutions over the SMILES sequence with kernel sizes K and one residual channel operating directly on the token level. Each channel applied a separate gene attention layer before the (convolved) SMILES and the filtered genes were fed to a multihead of four contextual-attention layers. The outputs of these 16 layers were concatenated and passed through a stack of dense layers to produce an IC50 prediction. For CA, GA, and SE, see Figure 2.
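The channel layout described in this caption can be illustrated with a short structural sketch. It is not the authors' model: the kernel sizes (3, 5, 7), the mean filter standing in for learned convolution kernels, and all function names are assumptions made for illustration. Only the layout of three convolutional channels plus one residual channel, each followed by four contextual-attention heads, is taken from the caption.

```python
def windowed_mean(tokens, kernel_size):
    """Stand-in for a learned 1D convolution: average embeddings in a window.

    tokens: list of embedding vectors, one per SMILES token.
    The output has the same length as the input (windows shrink at the edges).
    """
    half = kernel_size // 2
    dim = len(tokens[0])
    out = []
    for i in range(len(tokens)):
        window = tokens[max(0, i - half): i + half + 1]
        out.append([sum(vec[d] for vec in window) / len(window) for d in range(dim)])
    return out


def mca_channels(tokens, kernel_sizes=(3, 5, 7)):
    """Build the residual channel plus one convolved channel per kernel size."""
    channels = {"residual": tokens}
    for k in kernel_sizes:
        channels[f"conv_k{k}"] = windowed_mean(tokens, k)
    return channels


def attention_slots(channels, heads_per_channel=4):
    """List every (channel, head) pair whose output would be concatenated."""
    return [(name, head) for name in channels for head in range(heads_per_channel)]


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0], [2.0, 2.0]]
channels = mca_channels(tokens)
slots = attention_slots(channels)
print(len(channels), len(slots))  # prints "4 16"
```

Larger windows summarize longer stretches of the SMILES string, so the parallel channels expose substructures of different sizes to the attention heads while the residual channel keeps the token-level view.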
2.4. Model Evaluation
Strict Split
Lenient Split
2.5. Training Procedure
3. Results
3.1. Model Performance Comparison on Strict Split
encoder type | drug structure | standardized RMSE (median ± IQR) |
---|---|---|
deep baseline (DNN) | fingerprints | 0.122 ± 0.010 |
bidirectional recurrent (bRNN) | SMILES | 0.119 ± 0.011 |
stacked convolutional (SCNN) | SMILES | 0.130 ± 0.006 |
self-attention (SA) | SMILES | 0.112* ± 0.009 |
contextual attention (CA) | SMILES | 0.110* ± 0.007 |
multiscale convolutional attentive (MCA) | SMILES | 0.109* ± 0.009 |
MCA (prediction averaging) | SMILES | 0.104** ± 0.005 |
The median and the IQR of the RMSE between predicted and true IC50 values on the test data of all 25 folds are reported. Attention-based models outperform all other models, including the model trained on fingerprints, by a statistically significant margin (* indicates p < 0.01 compared with the DNN encoder; ** indicates p < 0.01 compared with the MCA encoder).
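As an illustration of how the summary statistic in the table can be computed, the sketch below derives an RMSE and then the median and interquartile range across folds. The fold values are made up for the example, the helper names are hypothetical, and the quartile convention (`method="inclusive"`) is an assumption. The source does not state which quartile convention or which standardization of the IC50 values was used.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root-mean-square error between two equally long sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def median_and_iqr(values):
    """Median and interquartile range (Q3 - Q1) of per-fold scores."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return statistics.median(values), q3 - q1


# Made-up per-fold RMSE values, for illustration only.
fold_rmse = [0.10, 0.11, 0.12, 0.13, 0.14]
median, iqr = median_and_iqr(fold_rmse)
print(f"{median:.3f} ± {iqr:.3f}")
```

Reporting the median and IQR rather than the mean and standard deviation makes the summary less sensitive to a single unusually easy or hard fold.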
3.2. Model Validation on Lenient Data Split

Figure 4. Test performance of MCA on lenient splitting. Scatter plot of correlation between true and predicted drug sensitivity by a late-fusion model ensemble of all five folds. The model was fitted in log space.
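Late fusion, interpreted here as averaging the predictions of the individual fold models before scoring them, can be sketched as follows. This is an illustration under assumptions: the equal-weight average, the helper names, and the toy numbers are hypothetical, and the predictions are assumed to already be in the same log-transformed IC50 space as the targets.

```python
def late_fusion(predictions_per_model):
    """Average the predictions of several models, sample by sample."""
    n_models = len(predictions_per_model)
    return [sum(sample) / n_models for sample in zip(*predictions_per_model)]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy log-IC50 values: two models that err in opposite directions.
y_true = [2.0, 2.0, 3.0]
model_a = [1.0, 2.0, 4.0]
model_b = [3.0, 2.0, 2.0]
ensemble = late_fusion([model_a, model_b])
```

In this toy case the individual errors cancel, so the ensemble scores higher than either model alone. Real ensembles benefit only to the extent that the errors of their members are not perfectly correlated.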
3.3. Attention Analysis
Drug Structure Attention
Gene Attention

A Case Study: Two TK Inhibitors

Figure 5. Neural attention on molecules and genes. The molecular attention maps at the top demonstrate how the model’s attention shifts when the thiazole group is replaced by a piperazine group. The change in attention across the two molecules is particularly concentrated around the affected rings, suggesting that these functional groups play an important role in the mechanism of action of these tyrosine kinase inhibitors when they act on a chronic myelogenous leukemia (CML) cell line. The gene attention plot at the bottom depicts the most attended genes of the CML cell line, all of which can be linked to leukemia (see text for details).
4. Discussion
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.molpharmaceut.9b00520.
Details of data splits, CNV inclusion, and comparisons with other regression models; the best trained model in compressed format; the processed data following both the strict split and the lenient split strategies; and a list of genes selected via network propagation (PDF)
Acknowledgments
The authors would like to thank Dr. Maria Gabrani for her continuous support and useful discussions. The projects leading to this publication have received funding from the European Union’s Horizon 2020 research and innovation program under grant agreements no. 668858 and no. 826121.
References
This article references 73 other publications.
- 1Goh, G. B.; Hodas, N. O.; Siegel, C.; Vishnu, A. SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties. arXiv:1712.02034 [stat.ML] , arXiv preprint, 2017. https://arxiv.org/abs/1712.02034.Google ScholarThere is no corresponding record for this reference.
- 2Petrova, E. Innovation and marketing in the pharmaceutical industry; Springer, 2014; pp 19– 81.
- 3Lloyd, I.; Shimmings, A.; Scrip, P. S. Pharma R&D Annual Review 2018. https://pharmaintelligence.informa.com/resources/product-content/pharma-rd-annual-review-2018 (accessed June 25, 2018).Google ScholarThere is no corresponding record for this reference.
- 4Hargrave-Thomas, E.; Yu, B.; Reynisson, J. Serendipity in anticancer drug discovery. World J. Clin. Oncol. 2012, 3 (1), 1, DOI: 10.5306/wjco.v3.i1.1[Crossref], [PubMed], [CAS], Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC387ks12ktA%253D%253D&md5=63ec57cb90808eea53404cb6eb31948dSerendipity in anticancer drug discoveryHargrave-Thomas Emily; Yu Bo; Reynisson JohannesWorld journal of clinical oncology (2012), 3 (1), 1-6 ISSN:.It was found that the discovery of 5.8% (84/1437) of all drugs on the market involved serendipity. Of these drugs, 31 (2.2%) were discovered following an incident in the laboratory and 53 (3.7%) were discovered in a clinical setting. In addition, 263 (18.3%) of the pharmaceuticals in clinical use today are chemical derivatives of the drugs discovered with the aid of serendipity. Therefore, in total, 24.1% (347/1437) of marketed drugs can be directly traced to serendipitous events confirming the importance of this elusive phenomenon. In the case of anticancer drugs, 35.2% (31/88) can be attributed to a serendipitous event, which is somewhat larger than for all drugs. The therapeutic field that has benefited the most from serendipity are central nervous system active drugs reflecting the difficulty in designing compounds to pass the blood-brain-barrier and the lack of laboratory-based assays for many of the diseases of the mind.
- 5Geeleher, P.; Cox, N. J.; Huang, R. S. Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. Genome Biol. 2016, 17, 190, DOI: 10.1186/s13059-016-1050-9[Crossref], [PubMed], [CAS], Google Scholar5https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhvFSjtbzM&md5=273bee29c0dea9d5f6786404a3632598Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical modelsGeeleher, Paul; Cox, Nancy J.; Huang, R. StephanieGenome Biology (2016), 17 (), 190/1-190/11CODEN: GNBLFW; ISSN:1474-760X. (BioMed Central Ltd.)We show that variability in general levels of drug sensitivity in pre-clin. cancer models confounds biomarker discovery. However, using a very large panel of cell lines, each treated with many drugs, we could est. a general level of sensitivity to all drugs in each cell line. By conditioning on this variable, biomarkers were identified that were more likely to be effective in clin. trials than those identified using a conventional uncorrected approach. We find that differences in general levels of drug sensitivity are driven by biol. relevant processes. We developed a gene expression based method that can be used to correct for this confounder in future studies.
- 6De Niz, C.; Rahman, R.; Zhao, X.; Pal, R. Algorithms for drug sensitivity prediction. Algorithms 2016, 9, 77, DOI: 10.3390/a9040077
- 7Ali, M.; Aittokallio, T. Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys. Rev. 2019, 11, 31, DOI: 10.1007/s12551-018-0446-z[Crossref], [PubMed], [CAS], Google Scholar7https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhsV2jtbnN&md5=dc79eb74938d74572272b65387f947adMachine learning and feature selection for drug response prediction in precision oncology applicationsAli, Mehreen; Aittokallio, TeroBiophysical Reviews (2019), 11 (1), 31-39CODEN: BRIECG; ISSN:1867-2450. (Springer)A review. In-depth modeling of the complex interplay among multiple omics data measured from cancer cell lines or patient tumors is providing new opportunities toward identification of tailored therapies for individual cancer patients. Supervised machine learning algorithms are increasingly being applied to the omics profiles as they enable integrative analyses among the high-dimensional data sets, as well as personalized predictions of therapy responses using multi-omics panels of response-predictive biomarkers identified through feature selection and cross-validation. However, tech. variability and frequent missingness in input "big data" require the application of dedicated data preprocessing pipelines that often lead to some loss of information and compressed view of the biol. signal. We describe here the state-of-the-art machine learning methods for anti-cancer drug response modeling and prediction and give our perspective on further opportunities to make better use of high-dimensional multi-omics profiles along with knowledge about cancer pathways targeted by anti-cancer compds. when predicting their phenotypic responses.
- 8Costello, J. C. A community effort to assess and improve drug sensitivity prediction algorithms. Nat. Biotechnol. 2014, 32, 1202, DOI: 10.1038/nbt.2877[Crossref], [PubMed], [CAS], Google Scholar8https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2cXovFCgtLs%253D&md5=e0810989daeffc47400e0ecd3229911aA community effort to assess and improve drug sensitivity prediction algorithmsCostello, James C.; Heiser, Laura M.; Georgii, Elisabeth; Gonen, Mehmet; Menden, Michael P.; Wang, Nicholas J.; Bansal, Mukesh; Ammad-ud-din, Muhammad; Hintsanen, Petteri; Khan, Suleiman A.; Mpindi, John-Patrick; Kallioniemi, Olli; Honkela, Antti; Aittokallio, Tero; Wennerberg, Krister; Abbuehl, Jean-Paul; Allen, Jeffrey; Altman, Russ B.; Balcome, Shawn; Battle, Alexis; Bender, Andreas; Berger, Bonnie; Bernard, Jonathan; Bhattacharjee, Madhuchhanda; Bhuvaneshwar, Krithika; Bieberich, Andrew A.; Boehm, Fred; Califano, Andrea; Chan, Christina; Chen, Beibei; Chen, Ting-Huei; Choi, Jaejoon; Coelho, Luis Pedro; Cokelaer, Thomas; Collins, James C.; Creighton, Chad J.; Cui, Jike; Dampier, Will; Davisson, V. 
Jo; De Baets, Bernard; Deshpande, Raamesh; Di Camillo, Barbara; Dundar, Murat; Duren, Zhana; Ertel, Adam; Fan, Haoyang; Fang, Hongbin; Gallahan, Dan; Gauba, Robinder; Gottlieb, Assaf; Grau, Michael; Gray, Joe W.; Gusev, Yuriy; Ha, Min Jin; Han, Leng; Harris, Michael; Henderson, Nicholas; Hejase, Hussein A.; Homicsko, Krisztian; Hou, Jack P.; Hwang, Woochang; Ijzerman, Adriaan P.; Karacali, Bilge; Kaski, Samuel; Keles, Sunduz; Kendziorski, Christina; Kim, Junho; Kim, Min; Kim, Youngchul; Knowles, David A.; Koller, Daphne; Lee, Junehawk; Lee, Jae K.; Lenselink, Eelke B.; Li, Biao; Li, Bin; Li, Jun; Liang, Han; Ma, Jian; Madhavan, Subha; Mooney, Sean; Myers, Chad L.; Newton, Michael A.; Overington, John P.; Pal, Ranadip; Peng, Jian; Pestell, Richard; Prill, Robert J.; Qiu, Peng; Rajwa, Bartek; Sadanandam, Anguraj; Saez-Rodriguez, Julio; Sambo, Francesco; Shin, Hyunjin; Singer, Dinah; Song, Jiuzhou; Song, Lei; Sridhar, Arvind; Stock, Michiel; Stolovitzky, Gustavo; Sun, Wei; Ta, Tram; Tadesse, Mahlet; Tan, Ming; Tang, Hao; Theodorescu, Dan; Toffolo, Gianna Maria; Tozeren, Aydin; Trepicchio, William; Varoquaux, Nelle; Vert, Jean-Philippe; Waegeman, Willem; Walter, Thomas; Wan, Qian; Wang, Difei; Wang, Nicholas J.; Wang, Wen; Wang, Yong; Wang, Zhishi; Wegner, Joerg K.; Wu, Tongtong; Xia, Tian; Xiao, Guanghua; Xie, Yang; Xu, Yanxun; Yang, Jichen; Yuan, Yuan; Zhang, Shihua; Zhang, Xiang-Sun; Zhao, Junfei; Zuo, Chandler; van Vlijmen, Herman W. T.; van Westen, Gerard J. P.; Collins, James J.Nature Biotechnology (2014), 32 (12), 1202-1212CODEN: NABIF9; ISSN:1087-0156. (Nature Publishing Group)Predicting the best treatment strategy from genomic information is a core goal of precision medicine. Here we focus on predicting drug response based on a cohort of genomic, epigenomic and proteomic profiling data sets measured in human breast cancer cell lines. 
Through a collaborative effort between the National Cancer Institute (NCI) and the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we analyzed a total of 44 drug sensitivity prediction algorithms. The top-performing approaches modeled nonlinear relationships and incorporated biol. pathway information. We found that gene expression microarrays consistently provided the best predictive power of the individual profiling data sets; however, performance was increased by including multiple, independent data sets. We discuss the innovations underlying the top-performing methodol., Bayesian multitask MKL, and we provide detailed descriptions of all methods. This study establishes benchmarks for drug sensitivity prediction and identifies approaches that can be leveraged for the development of new methods.
- 9Garnett, M. J. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012, 483, 570, DOI: 10.1038/nature11005[Crossref], [PubMed], [CAS], Google Scholar9https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XmtVektL4%253D&md5=540fc631fa43b55d0338fff4e46782b5Systematic identification of genomic markers of drug sensitivity in cancer cellsGarnett, Mathew J.; Edelman, Elena J.; Heidorn, Sonja J.; Greenman, Chris D.; Dastur, Anahita; Lau, King Wai; Greninger, Patricia; Thompson, I. Richard; Luo, Xi; Soares, Jorge; Liu, Qingsong; Iorio, Francesco; Surdez, Didier; Chen, Li; Milano, Randy J.; Bignell, Graham R.; Tam, Ah T.; Davies, Helen; Stevenson, Jesse A.; Barthorpe, Syd; Lutz, Stephen R.; Kogera, Fiona; Lawrence, Karl; McLaren-Douglas, Anne; Mitropoulos, Xeni; Mironenko, Tatiana; Thi, Helen; Richardson, Laura; Zhou, Wenjun; Jewitt, Frances; Zhang, Tinghu; O'Brien, Patrick; Boisvert, Jessica L.; Price, Stacey; Hur, Wooyoung; Yang, Wanjuan; Deng, Xianming; Butler, Adam; Choi, Hwan Geun; Chang, Jae Won; Baselga, Jose; Stamenkovic, Ivan; Engelman, Jeffrey A.; Sharma, Sreenath V.; Delattre, Olivier; Saez-Rodriguez, Julio; Gray, Nathanael S.; Settleman, Jeffrey; Futreal, P. Andrew; Haber, Daniel A.; Stratton, Michael R.; Ramaswamy, Sridhar; McDermott, Ultan; Benes, Cyril H.Nature (London, United Kingdom) (2012), 483 (7391), 570-575CODEN: NATUAS; ISSN:0028-0836. (Nature Publishing Group)Clin. responses to anticancer therapies are often restricted to a subset of patients. In some cases, mutated cancer genes are potent biomarkers for responses to targeted agents. Here, to uncover new biomarkers of sensitivity and resistance to cancer therapeutics, we screened a panel of several hundred cancer cell lines-which represent much of the tissue-type and genetic diversity of human cancers-with 130 drugs under clin. and preclin. investigation. 
In aggregate, we found that mutated cancer genes were assocd. with cellular response to most currently available cancer drugs. Classic oncogene addiction paradigms were modified by addnl. tissue-specific or expression biomarkers, and some frequently mutated genes were assocd. with sensitivity to a broad range of therapeutic agents. Unexpected relationships were revealed, including the marked sensitivity of Ewing's sarcoma cells harbouring the EWS (also known as EWSR1)-FLI1 gene translocation to poly(ADP-ribose) polymerase (PARP) inhibitors. By linking drug activity to the functional complexity of cancer genomes, systematic pharmacogenomic profiling in cancer cell lines provides a powerful biomarker discovery platform to guide rational cancer therapeutic strategies.
- 10Kalamara, A.; Tobalina, L.; Saez-Rodriguez, J. How to find the right drug for each patient? Advances and challenges in pharmacogenomics. Curr. Opin. Syst. Biol. 2018, 10, 53, DOI: 10.1016/j.coisb.2018.07.001[Crossref], [PubMed], [CAS], Google Scholar10https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB3MfislyltQ%253D%253D&md5=f18195017b040f14eda271b84cd11675How to find the right drug for each patient? Advances and challenges in pharmacogenomicsKalamara Angeliki; Tobalina Luis; Saez-Rodriguez Julio; Saez-Rodriguez Julio; Saez-Rodriguez JulioCurrent opinion in systems biology (2018), 10 (), 53-62 ISSN:2452-3100.Cancer is a highly heterogeneous disease with complex underlying biology. For these reasons, effective cancer treatment is still a challenge. Nowadays, it is clear that a cancer therapy that fits all the cases cannot be found, and as a result the design of therapies tailored to the patient's molecular characteristics is needed. Pharmacogenomics aims to study the relationship between an individual's genotype and drug response. Scientists use different biological models, ranging from cell lines to mouse models, as proxies for patients for preclinical and translational studies. The rapid development of "-omics" technologies is increasing the amount of features that can be measured in these models, expanding the possibilities of finding predictive biomarkers of drug response. Finding these relationships requires diverse computational approaches ranging from machine learning to dynamic modeling. Despite major advances, we are still far from being able to precisely predict drug efficacy in cancer models, let alone directly on patients. We believe that the new experimental techniques and computational approaches covered in this review will bring us closer to this goal.
- 11Yang, W. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2012, 41, D955– D961, DOI: 10.1093/nar/gks1111
- 12Tan, M. Prediction of anti-cancer drug response by kernelized multi-task learning. Artificial intelligence in medicine 2016, 73, 70– 77, DOI: 10.1016/j.artmed.2016.09.004[Crossref], [PubMed], [CAS], Google Scholar12https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC2sjktl2itQ%253D%253D&md5=ee6b2a8fb0e7221cdffc3b20180fb444Prediction of anti-cancer drug response by kernelized multi-task learningTan MehmetArtificial intelligence in medicine (2016), 73 (), 70-77 ISSN:.MOTIVATION: Chemotherapy or targeted therapy are two of the main treatment options for many types of cancer. Due to the heterogeneous nature of cancer, the success of the therapeutic agents differs among patients. In this sense, determination of chemotherapeutic response of the malign cells is essential for establishing a personalized treatment protocol and designing new drugs. With the recent technological advances in producing large amounts of pharmacogenomic data, in silico methods have become important tools to achieve this aim. OBJECTIVE: Data produced by using cancer cell lines provide a test bed for machine learning algorithms that try to predict the response of cancer cells to different agents. The potential use of these algorithms in drug discovery/repositioning and personalized treatments motivated us in this study to work on predicting drug response by exploiting the recent pharmacogenomic databases. We aim to improve the prediction of drug response of cancer cell lines. METHODS: We propose to use a method that employs multi-task learning to improve learning by transfer, and kernels to extract non-linear relationships to predict drug response. RESULTS: The method outperforms three state-of-the-art algorithms on three anti-cancer drug screen datasets. We achieved a mean squared error of 3.305 and 0.501 on two different large scale screen data sets. On a recent challenge dataset, we obtained an error of 0.556. 
We report the methodological comparison results as well as the performance of the proposed algorithm on each single drug. CONCLUSION: The results show that the proposed method is a strong candidate to predict drug response of cancer cell lines in silico for pre-clinical studies. The source code of the algorithm and data used can be obtained from http://mtan.etu.edu.tr/Supplementary/kMTrace/.
- 13Tan, M.; Özgül, O. F.; Bardak, B.; Ekşioğlu, I.; Sabuncuoğlu, S. Drug response prediction by ensemble learning and drug-induced gene expression signatures. arXiv:1802.03800 , arXiv preprint, 2018. https://arxiv.org/abs/1802.03800.Google ScholarThere is no corresponding record for this reference.
- 14Turki, T.; Wei, Z. A link prediction approach to cancer drug sensitivity prediction. BMC Syst. Biol. 2017, 11, 94, DOI: 10.1186/s12918-017-0463-8[Crossref], [PubMed], [CAS], Google Scholar14https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXitVOjtLvL&md5=cbd2e9c5352ec42c796c4c508ad65c86A link prediction approach to cancer drug sensitivity predictionTurki, Turki; Wei, ZhiBMC Systems Biology (2017), 11 (Suppl.5), 94/1-94/14CODEN: BSBMCC; ISSN:1752-0509. (BioMed Central Ltd.)Background: Predicting the response to a drug for cancer disease patients based on genomic information is an important problem in modern clin. oncol. This problem occurs in part because many available drug sensitivity prediction algorithms do not consider better quality cancer cell lines and the adoption of new feature representations; both lead to the accurate prediction of drug responses. By predicting accurate drug responses to cancer, oncologists gain a more complete understanding of the effective treatments for each patient, which is a core goal in precision medicine. Results: In this paper, we model cancer drug sensitivity as a link prediction, which is shown to be an effective technique. We evaluate our proposed link prediction algorithms and compare them with an existing drug sensitivity prediction approach based on clin. trial data. The exptl. results based on the clin. trial data show the stability of our link prediction algorithms, which yield the highest area under the ROC curve (AUC) and are statistically significant. Conclusions: We propose a link prediction approach to obtain new feature representation. Compared with an existing approach, the results show that incorporating the new feature representation to the link prediction algorithms has significantly improved the performance.
- 15Menden, M. P. Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties. PLoS One 2013, 8, e61318 DOI: 10.1371/journal.pone.0061318[Crossref], [PubMed], [CAS], Google Scholar15https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXnsVKhsr8%253D&md5=bf41b75d43be68936a1c2a043b3e0b6aMachine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical propertiesMenden, Michael P.; Iorio, Francesco; Garnett, Mathew; McDermott, Ultan; Benes, Cyril H.; Ballester, Pedro J.; Saez-Rodriguez, JulioPLoS One (2013), 8 (4), e61318CODEN: POLNCL; ISSN:1932-6203. (Public Library of Science)Predicting the response of a specific cancer to a therapy is a major goal in modern oncol. that should ultimately lead to a personalised treatment. High-throughput screenings of potentially active compds. against a panel of genomically heterogeneous cancer cell lines have unveiled multiple relationships between genomic alterations and drug responses. Various computational approaches have been proposed to predict sensitivity based on genomic features, while others have used the chem. properties of the drugs to ascertain their effect. In an effort to integrate these complementary approaches, we developed machine learning models to predict the response of cancer cell lines to drug treatment, quantified through IC50 values, based on both the genomic features of the cell lines and the chem. properties of the considered drugs. Models predicted IC50 values in a 8-fold cross-validation and an independent blind test with coeff. of detn. R2 of 0.72 and 0.64 resp. Furthermore, models were able to predict with comparable accuracy (R2 of 0.61) IC50s of cell lines from a tissue not used in the training stage. Our in silico models can be used to optimize the exptl. design of drug-cell screenings by estg. a large proportion of missing IC50 values rather than exptl. measuring them. 
The implications of our results go beyond virtual drug screening design: potentially thousands of drugs could be probed in silico to systematically test their potential efficacy as anti-tumor agents based on their structure, thus providing a computational framework to identify new drug repositioning opportunities as well as ultimately be useful for personalized medicine by linking the genomic traits of patients to drug sensitivity.
- 16Ammad-Ud-Din, M. Integrative and personalized QSAR analysis in cancer by kernelized Bayesian matrix factorization. J. Chem. Inf. Model. 2014, 54, 2347– 2359, DOI: 10.1021/ci500152b[ACS Full Text
], [CAS], Google Scholar
- (17) Zhang, N. Predicting anticancer drug responses using a dual-layer integrated cell line-drug network model. PLoS Comput. Biol. 2015, 11, e1004498, DOI: 10.1371/journal.pcbi.1004498
- (18) Wang, Y.; Fang, J.; Chen, S. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties. Sci. Rep. 2016, 6, 32679, DOI: 10.1038/srep32679
- (19) Ding, M. Q.; Chen, L.; Cooper, G. F.; Young, J. D.; Lu, X. Precision oncology beyond targeted therapy: Combining omics data with machine learning matches the majority of cancer cells to effective therapeutics. Mol. Cancer Res. 2018, 16, 269–278, DOI: 10.1158/1541-7786.MCR-17-0378
- (20) Wang, L.; Li, X.; Zhang, L.; Gao, Q. Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer 2017, 17, 513, DOI: 10.1186/s12885-017-3500-5
- (21) Yuan, H.; Paskov, I.; Paskov, H.; González, A. J.; Leslie, C. S. Multitask learning improves prediction of cancer drug sensitivity. Sci. Rep. 2016, 6, 31619, DOI: 10.1038/srep31619
- (22) Stanfield, Z.; Coşkun, M.; Koyutürk, M. Drug response prediction as a link prediction problem. Sci. Rep. 2017, 7, 40321, DOI: 10.1038/srep40321
- (23) Liu, H.; Zhao, Y.; Zhang, L.; Chen, X. Anti-cancer drug response prediction using neighbor-based collaborative filtering with global effect removal. Mol. Ther.--Nucleic Acids 2018, 13, 303–311, DOI: 10.1016/j.omtn.2018.09.011
- (24) Zhang, L.; Chen, X.; Guan, N.-N.; Liu, H.; Li, J.-Q. A hybrid interpolation weighted collaborative filtering method for anti-cancer drug response prediction. Front. Pharmacol. 2018, 9, 1017, DOI: 10.3389/fphar.2018.01017
- (25) Oskooei, A.; Manica, M.; Mathis, R.; Martínez, M. R. Network-based Biased Tree Ensembles (NetBiTE) for Drug Sensitivity Prediction and Drug Sensitivity Biomarker Identification in Cancer. arXiv:1808.06603 [q-bio.QM], arXiv preprint, 2018. https://arxiv.org/abs/1808.06603
- (26) Zhang, F.; Wang, M.; Xi, J.; Yang, J.; Li, A. A novel heterogeneous network-based method for drug response prediction in cancer cell lines. Sci. Rep. 2018, 8, 3355, DOI: 10.1038/s41598-018-21622-4
- (27) Cereto-Massagué, A. Molecular fingerprint similarity search in virtual screening. Methods 2015, 71, 58–63, DOI: 10.1016/j.ymeth.2014.08.005
- (28) Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T. The rise of deep learning in drug discovery. Drug Discovery Today 2018, 23, 1241–1250, DOI: 10.1016/j.drudis.2018.01.039
- (29) Grapov, D.; Fahrmann, J.; Wanichthanarak, K.; Khoomrung, S. Rise of deep learning for genomic, proteomic, and metabolomic data integration in precision medicine. Omics: a journal of integrative biology 2018, 22, 630–636, DOI: 10.1089/omi.2018.0097
- (30) Wu, Z. MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 2018, 9, 513–530, DOI: 10.1039/C7SC02664A
- (31) Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 [cs.CL], arXiv preprint, 2014. https://arxiv.org/abs/1409.0473
- (32) Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Comput. Sci. 1988, 28, 31–36, DOI: 10.1021/ci00057a005
- (33) Jastrzębski, S.; Leśniak, D.; Czarnecki, W. M. Learning to SMILE (S). arXiv:1602.06289 [cs.CL], arXiv preprint, 2016. https://arxiv.org/abs/1602.06289
- (34) Schwaller, P. Molecular transformer for chemical reaction prediction and uncertainty estimation. arXiv:1811.02633 [physics.chem-ph], arXiv preprint, 2018. https://arxiv.org/abs/1811.02633
- (35) Bjerrum, E. J. SMILES enumeration as data augmentation for neural network modeling of molecules. arXiv:1703.07076 [cs.LG], arXiv preprint, 2017. https://arxiv.org/abs/1703.07076
- (36) Segler, M. H.; Kogej, T.; Tyrchan, C.; Waller, M. P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 2018, 4, 120–131, DOI: 10.1021/acscentsci.7b00512
- (37) Bai, S.; Kolter, J. Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv:1803.01271 [cs.LG], arXiv preprint, 2018. https://arxiv.org/abs/1803.01271
- (38) Kimber, T. B.; Engelke, S.; Tetko, I. V.; Bruno, E.; Godin, G. Synergy Effect between Convolutional Neural Networks and the Multiplicity of SMILES for Improvement of Molecular Prediction. arXiv:1812.04439 [cs.LG], arXiv preprint, 2018. https://arxiv.org/abs/1812.04439
- (39) Chang, Y.; Park, H.; Yang, H.-J.; Lee, S.; Lee, K.-Y.; Kim, T. S.; Jung, J.; Shin, J.-M. Cancer Drug Response Profile scan (CDRscan): A Deep Learning Model That Predicts Drug Effectiveness from Cancer Genomic Signature. Sci. Rep. 2018, 8, 8857, DOI: 10.1038/s41598-018-27214-6
- (40) Yang, M.; Simm, J.; Lam, C. C.; Zakeri, P.; van Westen, G. J. P.; Moreau, Y.; Saez-Rodriguez, J. Linking drug target and pathway activation for effective therapy using multi-task learning. Sci. Rep. 2018, 8, 8322, DOI: 10.1038/s41598-018-25947-y
- 41Oskooei, A. PaccMann: Prediction of anticancer compound sensitivity with multi-modal attentionbased neural networks. arXiv:1811.06802 [cs.LG] , arXiv preprint, 2018. https://arxiv.org/abs/1811.06802.Google ScholarThere is no corresponding record for this reference.
- 42Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742– 754, DOI: 10.1021/ci100050t[ACS Full Text
], [CAS], Google Scholar
42https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXlt1Onsbg%253D&md5=cd6c736cd7a3d280b67f5316acce8006Extended-Connectivity FingerprintsRogers, David; Hahn, MathewJournal of Chemical Information and Modeling (2010), 50 (5), 742-754CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Extended-connectivity fingerprints (ECFPs) are a novel class of topol. fingerprints for mol. characterization. Historically, topol. fingerprints were developed for substructure and similarity searching. ECFPs were developed specifically for structure-activity modeling. ECFPs are circular fingerprints with a no. of useful qualities: they can be very rapidly calcd.; they are not predefined and can represent an essentially infinite no. of different mol. features (including stereochem. information); their features represent the presence of particular substructures, allowing easier interpretation of anal. results; and the ECFP algorithm can be tailored to generate different types of circular fingerprints, optimized for different uses. While the use of ECFPs has been widely adopted and validated, a description of their implementation has not previously been presented in the literature. - 43Iorio, F. A landscape of pharmacogenomic interactions in cancer. 
Cell 2016, 166, 740– 754, DOI: 10.1016/j.cell.2016.06.017[Crossref], [PubMed], [CAS], Google Scholar43https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhtFCiu73L&md5=ed2bc85f4e6b304829d3077190384272A Landscape of pharmacogenomic interactions in cancerIorio, Francesco; Knijnenburg, Theo A.; Vis, Daniel J.; Bignell, Graham R.; Menden, Michael P.; Schubert, Michael; Aben, Nanne; Goncalves, Emanuel; Barthorpe, Syd; Lightfoot, Howard; Cokelaer, Thomas; Greninger, Patricia; van Dyk, Ewald; Chang, Han; de Silva, Heshani; Heyn, Holger; Deng, Xianming; Egan, Regina K.; Liu, Qingsong; Mironenko, Tatiana; Mitropoulos, Xeni; Richardson, Laura; Wang, Jinhua; Zhang, Tinghu; Moran, Sebastian; Sayols, Sergi; Soleimani, Maryam; Tamborero, David; Lopez-Bigas, Nuria; Ross-Macdonald, Petra; Esteller, Manel; Gray, Nathanael S.; Haber, Daniel A.; Stratton, Michael R.; Benes, Cyril H.; Wessels, Lodewyk F. A.; Saez-Rodriguez, Julio; McDermott, Ultan; Garnett, Mathew J.Cell (Cambridge, MA, United States) (2016), 166 (3), 740-754CODEN: CELLB5; ISSN:0092-8674. (Cell Press)Systematic studies of cancer genomes have provided unprecedented insights into the mol. nature of cancer. Using this information to guide the development and application of therapies in the clinic is challenging. Here, we report how cancer-driven alterations identified in 11,289 tumors from 29 tissues (integrating somatic mutations, copy no. alterations, DNA methylation, and gene expression) can be mapped onto 1001 molecularly annotated human cancer cell lines and correlated with sensitivity to 265 drugs. We find that cell lines faithfully recapitulate oncogenic alterations identified in tumors, find that many of these assoc. with drug sensitivity/resistance, and highlight the importance of tissue lineage in mediating drug response. 
Logic-based modeling uncovers combinations of alterations that sensitize to drugs, while machine learning demonstrates the relative importance of different data types in predicting drug response. Our anal. and datasets are rich resources to link genotypes with cellular phenotypes and to identify therapeutic options for selected cancer sub-populations.
- 44Szklarczyk, D. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015, 43, D447– D452, DOI: 10.1093/nar/gku1003[Crossref], [PubMed], [CAS], Google Scholar44https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtVymt7bE&md5=47f29e29c4093189bfbefadc9e3a93c4STRING v10: protein-protein interaction networks, integrated over the tree of lifeSzklarczyk, Damian; Franceschini, Andrea; Wyder, Stefan; Forslund, Kristoffer; Heller, Davide; Huerta-Cepas, Jaime; Simonovic, Milan; Roth, Alexander; Santos, Alberto; Tsafou, Kalliopi P.; Kuhn, Michael; Bork, Peer; Jensen, Lars J.; von Mering, ChristianNucleic Acids Research (2015), 43 (D1), D447-D452CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The many functional partnerships and interactions that occur between proteins are at the core of cellular processing and their systematic characterization helps to provide context in mol. systems biol. However, known and predicted interactions are scattered over multiple resources, and the available data exhibit notable differences in terms of quality and completeness. The STRING database aims to provide a crit. assessment and integration of protein-protein interactions, including direct (phys.) as well as indirect (functional) assocns. The new version 10.0 of STRING covers more than 2000 organisms, which has necessitated novel, scalable algorithms for transferring interaction information between organisms. For this purpose, we have introduced hierarchical and self-consistent orthol. annotations for all interacting proteins, grouping the proteins into families at various levels of phylogenetic resoln. Further improvements in version 10.0 include a completely redesigned prediction pipeline for inferring protein-protein assocns. from coexpression data, an API interface for the R computing environment and improved statistical anal. for enrichment tests in user-provided networks.
- 45Hofree, M.; Shen, J. P.; Carter, H.; Gross, A.; Ideker, T. Network-based stratification of tumor mutations. Nat. Methods 2013, 10, 1108, DOI: 10.1038/nmeth.2651[Crossref], [PubMed], [CAS], Google Scholar45https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhsVeqsrvM&md5=4a305ef2d3b1521e344d8895e77efa49Network-based stratification of tumor mutationsHofree, Matan; Shen, John P.; Carter, Hannah; Gross, Andrew; Ideker, TreyNature Methods (2013), 10 (11), 1108-1115CODEN: NMAEA3; ISSN:1548-7091. (Nature Publishing Group)Many forms of cancer have multiple subtypes with different causes and clin. outcomes. Somatic tumor genome sequences provide a rich new source of data for uncovering these subtypes but have proven difficult to compare, as two tumors rarely share the same mutations. Here we introduce network-based stratification (NBS), a method to integrate somatic tumor genomes with gene networks. This approach allows for stratification of cancer into informative subtypes by clustering together patients with mutations in similar network regions. We demonstrate NBS in ovarian, uterine and lung cancer cohorts from The Cancer Genome Atlas. For each tissue, NBS identifies subtypes that are predictive of clin. outcomes such as patient survival, response to therapy or tumor histol. We identify network regions characteristic of each subtype and show how mutation-derived subtypes can be used to train an mRNA expression signature, which provides similar information in the absence of DNA sequence.
- 46Unterthiner, T.; et al. Deep learning as an opportunity in virtual screening. Proceedings of the Deep Learning Workshop at NIPS , 2014 1 9Google ScholarThere is no corresponding record for this reference.
- 47Schwaller, P.; Gaudin, T.; Lanyi, D.; Bekas, C.; Laino, T. Found in Translation: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem. Sci. 2018, 9, 6091– 6098, DOI: 10.1039/C8SC02339E[Crossref], [PubMed], [CAS], Google Scholar47https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhtFyjtb%252FE&md5=c4e3b675f45ba7710534ee39f247a036"Found in Translation": predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence modelsSchwaller, Philippe; Gaudin, Theophile; Lanyi, David; Bekas, Costas; Laino, TeodoroChemical Science (2018), 9 (28), 6091-6098CODEN: CSHCCN; ISSN:2041-6520. (Royal Society of Chemistry)A review. There is an intuitive analogy of an org. chemist's understanding of a compd. and a language speaker's understanding of a word. Based on this analogy, it is possible to introduce the basic concepts and analyze potential impacts of linguistic anal. to the world of org. chem. In this work, we cast the reaction prediction task as a translation problem by introducing a template-free sequence-to-sequence model, trained end-to-end and fully data-driven. We propose a tokenization, which is arbitrarily extensible with reaction information. Using an attention-based model borrowed from human language translation, we improve the state-of-the-art solns. in reaction prediction on the top-1 accuracy by achieving 80.3% without relying on auxiliary knowledge, such as reaction templates or explicit at. features. Also, a top-1 accuracy of 65.4% is reached on a larger and noisier dataset.
- 48Cho, K.; Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078 [cs.CL] , arXiv preprint, 2014. https://arxiv.org/abs/1406.1078.Google ScholarThere is no corresponding record for this reference.
- 49Koprowski, R.; Foster, K. R. Machine learning and medicine: book review and commentary. BioMed. Eng. 2018, 17, 17, DOI: 10.1186/s12938-018-0449-9[Crossref], [CAS], Google Scholar49https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC1MvmsVSjsQ%253D%253D&md5=617175afc306a0acf8b4e317e83a8fdfMachine learning and medicine: book review and commentaryKoprowski Robert; Foster Kenneth RBiomedical engineering online (2018), 17 (1), 17 ISSN:.This article is a review of the book "Master machine learning algorithms, discover how they work and implement them from scratch" (ISBN: not available, 37 USD, 163 pages) edited by Jason Brownlee published by the Author, edition, v1.10 http://MachineLearningMastery.com . An accompanying commentary discusses some of the issues that are involved with use of machine learning and data mining techniques to develop predictive models for diagnosis or prognosis of disease, and to call attention to additional requirements for developing diagnostic and prognostic algorithms that are generally useful in medicine. Appendix provides examples that illustrate potential problems with machine learning that are not addressed in the reviewed book.
- 50Yang, Z. Hierarchical attention networks for document classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016, 1480– 1489, DOI: 10.18653/v1/N16-1174
- 51Gupta, A.; Kumar, B. S.; Negi, A. S. Current status on development of steroids as anticancer agents. J. Steroid Biochem. Mol. Biol. 2013, 137, 242– 270, DOI: 10.1016/j.jsbmb.2013.05.011[Crossref], [PubMed], [CAS], Google Scholar51https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXpslKqsL8%253D&md5=506f07cacd1d3a2ad658ea59c6bc0610Current status on development of steroids as anticancer agentsGupta, Atul; Sathish Kumar, B.; Negi, Arvind S.Journal of Steroid Biochemistry and Molecular Biology (2013), 137 (), 242-270CODEN: JSBBEZ; ISSN:0960-0760. (Elsevier Ltd.)A review. Steroids are important biodynamic agents. Their affinities for various nuclear receptors have been an interesting feature to utilize them for drug development particularly for receptor mediated diseases. Steroid biochem. and its crucial role in human physiol., has attained importance among the researchers. Recent years have seen an extensive focus on modification of steroids. The rational modifications of perhydrocyclopentanophenanthrene nucleus of steroids have yielded several important anticancer lead mols. Exemestane, SR 16157, Fulvestrant and 2-methoxyestradiol are some of the successful leads emerged on steroidal pharmacophores. The present review is an update on some of the steroidal leads obtained during past 25 years. Various steroid based enzyme inhibitors, antiestrogens, cytotoxic conjugates and steroidal cytotoxic mols. of natural as well as synthetic origin have been highlighted.
- 52Vaswani, A.; et al. Attention is all you need. Advances in Neural Information Processing Systems 30 , NIPS 2017; pp 5998– 6008.Google ScholarThere is no corresponding record for this reference.
- 53Li, V.; Maki, A. Feature Contraction: New ConvNet Regularization in Image Classification. BMVC 2018.Google ScholarThere is no corresponding record for this reference.
- 54Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 [cs.LG] , arXiv preprint, 2014. https://arxiv.org/abs/1412.6980.Google ScholarThere is no corresponding record for this reference.
- 55Jiao, Q.; Bi, L.; Ren, Y.; Song, S.; Wang, Q.; Wang, Y.-s. Advances in studies of tyrosine kinase inhibitors and their acquired resistance. Mol. Cancer 2018, 17, 36, DOI: 10.1186/s12943-018-0801-5[Crossref], [PubMed], [CAS], Google Scholar55https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXitF2nt7vM&md5=79e82708697c0359c3279ea6121af8feAdvances in studies of tyrosine kinase inhibitors and their acquired resistanceJiao, Qinlian; Bi, Lei; Ren, Yidan; Song, Shuliang; Wang, Qin; Wang, Yun-shanMolecular Cancer (2018), 17 (), 36/1-36/12CODEN: MCOACG; ISSN:1476-4598. (BioMed Central Ltd.)Protein tyrosine kinase (PTK) is one of the major signaling enzymes in the process of cell signal transduction, which catalyzes the transfer of ATP-γ-phosphate to the tyrosine residues of the substrate protein, making it phosphorylation, regulating cell growth, differentiation, death and a series of physiol. and biochem. processes. Abnormal expression of PTK usually leads to cell proliferation disorders, and is closely related to tumor invasion, metastasis and tumor angiogenesis. At present, a variety of PTKs have been used as targets in the screening of anti-tumor drugs. Tyrosine kinase inhibitors (TKIs) compete with ATP for the ATP binding site of PTK and reduce tyrosine kinase phosphorylation, thereby inhibiting cancer cell proliferation. TKI has made great progress in the treatment of cancer, but the attendant acquired acquired resistance is still inevitable, restricting the treatment of cancer. In this paper, we summarize the role of PTK in cancer, TKI treatment of tumor pathways and TKI acquired resistance mechanisms, which provide some ref. for further research on TKI treatment of tumors.
- 56Finlay, S. Multiple classifier architectures and their application to credit risk assessment. European Journal of Operational Research 2011, 210, 368– 378, DOI: 10.1016/j.ejor.2010.09.029
- 57Tanimoto, T. T. Elementary mathematical theory of classification and prediction. IBM Technical Report , 1958.Google ScholarThere is no corresponding record for this reference.
- 58Bajusz, D.; Rácz, A.; Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations?. J. Cheminf. 2015, 7, 20, DOI: 10.1186/s13321-015-0069-3
- 59Chen, E. Y. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinf. 2013, 14, 128, DOI: 10.1186/1471-2105-14-128[Crossref], [PubMed], [CAS], Google Scholar59https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC3srlt1Kksw%253D%253D&md5=28be1b884f451b1b78defc3708b4b62fEnrichr: interactive and collaborative HTML5 gene list enrichment analysis toolChen Edward Y; Tan Christopher M; Kou Yan; Duan Qiaonan; Wang Zichen; Meirelles Gabriela Vaz; Clark Neil R; Ma'ayan AviBMC bioinformatics (2013), 14 (), 128 ISSN:.BACKGROUND: System-wide profiling of genes and proteins in mammalian cells produce lists of differentially expressed genes/proteins that need to be further analyzed for their collective functions in order to extract new knowledge. Once unbiased lists of genes or proteins are generated from such experiments, these lists are used as input for computing enrichment with existing lists created from prior knowledge organized into gene-set libraries. While many enrichment analysis tools and gene-set libraries databases have been developed, there is still room for improvement. RESULTS: Here, we present Enrichr, an integrative web-based and mobile software application that includes new gene-set libraries, an alternative approach to rank enriched terms, and various interactive visualization approaches to display enrichment results using the JavaScript library, Data Driven Documents (D3). The software can also be embedded into any tool that performs gene list analysis. We applied Enrichr to analyze nine cancer cell lines by comparing their enrichment signatures to the enrichment signatures of matched normal tissues. We observed a common pattern of up regulation of the polycomb group PRC2 and enrichment for the histone mark H3K27me3 in many cancer cell lines, as well as alterations in Toll-like receptor and interlukin signaling in K562 cells when compared with normal myeloid CD33+ cells. 
Such analyses provide global visualization of critical differences between normal tissues and cancer cell lines but can be applied to many other scenarios. CONCLUSIONS: Enrichr is an easy to use intuitive enrichment analysis web-based tool providing various types of visualization summaries of collective functions of gene lists. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr.
- 60Kuleshov, M. V. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016, 44, W90– W97, DOI: 10.1093/nar/gkw377[Crossref], [PubMed], [CAS], Google Scholar60https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhtV2itrfF&md5=09239af53827a888f7328cf362e5c4b6Enrichr: a comprehensive gene set enrichment analysis web server 2016 updateKuleshov, Maxim V.; Jones, Matthew R.; Rouillard, Andrew D.; Fernandez, Nicolas F.; Duan, Qiaonan; Wang, Zichen; Koplev, Simon; Jenkins, Sherry L.; Jagodnik, Kathleen M.; Lachmann, Alexander; McDermott, Michael G.; Monteiro, Caroline D.; Gundersen, Gregory W.; Ma'ayan, AviNucleic Acids Research (2016), 44 (W1), W90-W97CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)Enrichment anal. is a popular method for analyzing gene sets generated by genome-wide expts. Here we present a significant update to one of the tools in this domain called Enrichr. Enrichr currently contains a large collection of diverse gene set libraries available for anal. and download. In total, Enrichr currently contains 180 184 annotated gene sets from 102 gene set libraries. New features have been added to Enrichr including the ability to submit fuzzy sets, upload BED files, improved application programming interface and visualization of the results as cluster grams. Overall, Enrichr is a comprehensive resource for curated gene sets and a search engine that accumulates biol. knowledge for further biol. discoveries.
- 61Mi, H. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 2017, 45, D183– D189, DOI: 10.1093/nar/gkw1138[Crossref], [PubMed], [CAS], Google Scholar61https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhslWhsLw%253D&md5=0ddda80a1174008dd6b96ff32bd47a29PANTHER version 11: expanded annotation data from gene ontology and reactome pathways, and data analysis tool enhancementsMi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D.Nucleic Acids Research (2017), 45 (D1), D183-D189CODEN: NARHAD; ISSN:1362-4962. (Oxford University Press)The PANTHER database (Protein Anal. THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics expts. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the anal. tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontol. Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment anal. using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. 
The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of Evalue statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution.
- 62Kim, H.-G.; Hwang, S.-Y.; Aaronson, S. A.; Mandinova, A.; Lee, S. W. DDR1 receptor tyrosine kinase promotes prosurvival pathway through Notch1 activation. J. Biol. Chem. 2011, 286, 17672– 17681, DOI: 10.1074/jbc.M111.236612[Crossref], [PubMed], [CAS], Google Scholar62https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXmtVGjt7c%253D&md5=b71db3a01690a02e8fb2ac8d18067491DDR1 Receptor Tyrosine Kinase Promotes Prosurvival Pathway through Notch1 ActivationKim, Hyung-Gu; Hwang, So-Young; Aaronson, Stuart A.; Mandinova, Anna; Lee, Sam W.Journal of Biological Chemistry (2011), 286 (20), 17672-17681CODEN: JBCHA3; ISSN:0021-9258. (American Society for Biochemistry and Molecular Biology)DDR1 (discoidin domain receptor tyrosine kinase 1) kinase s highly expressed in a variety of human cancers and occasionally mutated in lung cancer and leukemia. It is now clear that aberrant signaling through the DDR1 receptor is closely assocd. with various steps of tumorigenesis, although little is known about the mol. mechanism(s) underlying the role of DDR1 in cancer. Besides the role of DDR1 in tumorigenesis, we previously identified DDR1 kinase as a transcriptional target of tumor suppressor p53. DDR1 is functionally activated as detd. by its tyrosine phosphorylation, in response to p53-dependent DNA damage. In this study, we report the characterization of the Notch1 protein as an interacting partner of DDR1 receptor, as detd. by tandem affinity protein purifn. Upon ligand-mediated DDR1 kinase activation, Notch1 was activated, bound to DDR1, and activated canonical Notch1 targets, including Hes1 and Hey2. 
Moreover, DDR1 ligand (collagen I) treatment significantly increased the active form of Notch1 receptor in the nuclear fraction, whereas DDR1 knockdown cells show little or no increase of the active form of Notch1 in the nuclear fraction, suggesting a novel intracellular mechanism underlying autocrine activation of wild-type Notch signaling through DDR1. DDR1 activation suppressed genotoxic-mediated cell death, whereas Notch1 inhibition by a γ-secretase inhibitor, DAPT, enhanced cell death in response to stress. Moreover, the DDR1 knockdown cancer cells showed the reduced transformed phenotypes in vitro and in vivo xenograft studies. The results suggest that DDR1 exerts prosurvival effect, at least in part, through the functional interaction with Notch1.
- 63Barisione, G. Heterogeneous expression of the collagen receptor DDR1 in chronic lymphocytic leukaemia and correlation with progression. Blood cancer journal 2017, 7, e513, DOI: 10.1038/bcj.2016.121
- 64Pandzic, T.; Larsson, J.; He, L.; Kundu, S.; Ban, K.; Akhtar-Ali, M.; Hellstrom, A. R.; Schuh, A.; Clifford, R.; Blakemore, S. J.; Strefford, J. C.; Baumann, T.; Lopez-Guillermo, A.; Campo, E.; Ljungstrom, V.; Mansouri, L.; Rosenquist, R.; Sjoblom, T.; Hellstrom, M. Transposon mutagenesis reveals fludarabine-resistance mechanisms in chronic lymphocytic leukemia. Clin. Cancer Res. 2016, 22, 6217, DOI: 10.1158/1078-0432.CCR-15-2903[Crossref], [PubMed], [CAS], Google Scholar64https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XitFSmsbfF&md5=4a99008657214a95e3382e58d96b75bbTransposon Mutagenesis Reveals Fludarabine Resistance Mechanisms in Chronic Lymphocytic LeukemiaPandzic, Tatjana; Larsson, Jimmy; He, Liqun; Kundu, Snehangshu; Ban, Kenneth; Akhtar-Ali, Muhammad; Hellstrom, Anders R.; Schuh, Anna; Clifford, Ruth; Blakemore, Stuart J.; Strefford, Jonathan C.; Baumann, Tycho; Lopez-Guillermo, Armando; Campo, Elias; Ljungstrom, Viktor; Mansouri, Larry; Rosenquist, Richard; Sjoblom, Tobias; Hellstrom, MatsClinical Cancer Research (2016), 22 (24), 6217-6227CODEN: CCREF4; ISSN:1078-0432. (American Association for Cancer Research)Purpose: To identify resistance mechanisms for the chemotherapeutic drug fludarabine in chronic lymphocytic leukemia (CLL), as innate and acquired resistance to fludarabine-based chemotherapy represents a major challenge for long-term disease control. Exptl. Design: We used piggyBac transposon-mediated mutagenesis, combined with next-generation sequencing, to identify genes that confer resistance to fludarabine in a human CLL cell line. Results: In total, this screen identified 782 genes with transposon integrations in fludarabine-resistant pools of cells. One of the identified genes is a known resistance mediator DCK (deoxycytidine kinase), which encodes an enzyme that is essential for the phosphorylation of the prodrug to the active metabolite. 
BMP2K, a gene not previously linked to CLL, was also identified as a modulator of response to fludarabine. In addn., 10 of 782 transposon-targeted genes had previously been implicated in treatment resistance based on somatic mutations seen in patients refractory to fludarabine-based therapy. Functional characterization of these genes supported a significant role for ARID5B and BRAF in fludarabine sensitivity. Finally, pathway anal. of transposon-targeted genes and RNA-seq profiling of fludarabine-resistant cells suggested deregulated MAPK signaling as involved in mediating drug resistance in CLL. Conclusions: To our knowledge, this is the first forward genetic screen for chemotherapy resistance in CLL. The screen pinpointed novel genes and pathways involved in fludarabine resistance along with previously known resistance mechanisms. Transposon screens can therefore aid interpretation of cancer genome sequencing data in the identification of genes modifying sensitivity to chemotherapy. Clin Cancer Res; 22(24); 6217-27. ©2016 AACR.
- 65Schmidt, H. H. Deregulation of the carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11) gene in a B-cell chronic lymphocytic leukemia with at (12; 14)(q23; q32). Oncogene 2004, 23, 6991, DOI: 10.1038/sj.onc.1207934[Crossref], [PubMed], [CAS], Google Scholar65https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2cXnsFait7g%253D&md5=741370eab10e5686c70de8ee397c2763Deregulation of the carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11) gene in a B-cell chronic lymphocytic leukemia with a t(12;14)(q23;q32)Schmidt, Helmut H.; Dyomin, Vadim G.; Palanisamy, Nallasivam; Itoyama, Takahiro; Nanjangud, Gouri; Pirc-Danoewinata, Hendrati; Haas, Oskar A.; Chaganti, R. S. K.Oncogene (2004), 23 (41), 6991-6996CODEN: ONCNES; ISSN:0950-9232. (Nature Publishing Group)The t(12;14)(q23;q32) breakpoints in a case of B-cell chronic lymphocytic leukemia (B-CLL) were mapped by fluorescence in situ hybridization (FISH) and Southern blot anal. and cloned using an IGH switch-γ probe. The translocation affected a productively rearranged IGH allele and the carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11) locus at 12q23, with a reciprocal break in intron 2 of the CHST11 gene. CHST11 belongs to the HNK1 family of Golgi-assocd. sulfotransferases, a group of glycosaminoglycan-modifying enzymes, and is expressed mainly in the hematopoietic lineage. Northern Blot anal. of tumor RNA using CHST11-specific probes showed expression of two CHST11 forms of abnormal size. 5'- And 3'-Rapid Amplification of cDNA Ends (RACE) revealed IGH/CHST11 as well as CHST11/IGH fusion RNAs expressed from the der(14) and der(12) chromosomes. Both fusion species contained open reading frames making possible the translation of two truncated forms of CHST11 protein. The biol. 
consequence of t(12;14)(q23;q32) in this case presumably is a disturbance of the cellular distribution of CHST11 leading to deregulation of a chondroitin-sulfate-dependent pathway specific to the hematopoietic lineage.
- 66Renema, N.; Navet, B.; Heymann, M.-F.; Lezot, F.; Heymann, D. RANK–RANKL signalling in cancer. Biosci. Rep. 2016, 36, e00366 DOI: 10.1042/BSR20160150[Crossref], [PubMed], [CAS], Google Scholar66https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXhs1OntLs%253D&md5=f3e6fecc5a11f177114fbd433a51e5b9RANK-RANKL signalling in cancerRenema, Nathalie; Navet, Benjamin; Heymann, Marie-Francoise; Lezot, Frederic; Heymann, DominiqueBioscience Reports (2016), 36 (4), e00366/1-e00366/17CODEN: BRPTDT; ISSN:0144-8463. (Portland Press Ltd.)Oncogenic events combined with a favorable environment are the two main factors in the oncol. process. The tumor microenvironment is composed of a complex, interconnected network of protagonists, including sol. factors such as cytokines, extracellular matrix components, interacting with fibroblasts, endothelial cells, immune cells and various specific cell types depending on the location of the cancer cells (e.g. pulmonary epithelium, osteoblasts). This diversity defines specific "niches" (e.g. vascular, immune, bone niches) involved in tumor growth and the metastatic process. These actors communicate together by direct intercellular communications and/or in an autocrine/paracrine/endocrine manner involving cytokines and growth factors. Among these glycoproteins, RANKL (receptor activator nuclear factor-κB ligand) and its receptor RANK (receptor activator nuclear factor), members of the TNF and TNFR superfamilies, have stimulated the interest of the scientific community. RANK is frequently expressed by cancer cells in contrast with RANKL which is frequently detected in the tumor microenvironment and together they participate in every step in cancer development. Their activities are markedly regulated by osteoprotegerin (OPG, a sol. decoy receptor) and its ligands, and by LGR4, a membrane receptor able to bind RANKL. 
Figure 1. Multimodal end-to-end architecture of the proposed encoders: the general framework shared by all explored architectures. Each model ingests a cell–compound pair and predicts the IC50 drug sensitivity. Cells are represented by the gene expression values of a subset of 2128 genes, selected by a network propagation procedure. Compounds are represented by their SMILES strings (except in the baseline model, which uses 512-bit fingerprints). The gene vector is fed into an attention-based gene encoder that assigns higher weights to the most informative genes. To encode the SMILES strings, several neural architectures are compared (for details, see section 2), each used in combination with the gene expression encoder to predict drug sensitivity.
Figure 2. Key layers employed throughout the SMILES encoder. (A) SMILES embedding (SE): An embedding layer transforms raw SMILES strings into a sequence of vectors in an embedding space. (B) Gene attention (GA): An attention-based gene expression encoder generates attention weights that are in turn applied to the input gene subset via a dot product. (C) Contextual attention (CA): A contextual attention layer ingests the SMILES encoding of a compound (either raw or the output of another encoder, e.g., a CNN or an RNN) and the genes of a cell to compute an attention distribution (αi) over all tokens of the SMILES encoding, in the context of the genetic profile of the cell. The attention-filtered molecule represents the most informative molecular substructures for IC50 prediction, given the gene expression of a cell.
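The two attention layers in Figure 2 can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the additive (tanh) scoring function, the parameter names `W`, `b`, `W_s`, `W_g`, and `v`, and the reading of the caption's "dot product" as an element-wise reweighting of the genes are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def gene_attention(genes, W, b):
    """Gene attention (GA): one weight per gene, applied back to the input.

    W (n_genes x n_genes) and b (n_genes) are hypothetical learned parameters.
    """
    scores = [s + b_i for s, b_i in zip(matvec(W, genes), b)]
    weights = softmax(scores)
    filtered = [w * g for w, g in zip(weights, genes)]
    return filtered, weights

def contextual_attention(tokens, genes, W_s, W_g, v):
    """Contextual attention (CA): attention over SMILES tokens given the genes.

    tokens: T token encodings of length H; W_s: A x H; W_g: A x G; v: length A.
    The additive scoring form is an assumption, not taken from the caption.
    """
    gene_context = matvec(W_g, genes)  # shared across all tokens
    scores = []
    for token in tokens:
        projected = matvec(W_s, token)
        hidden = [math.tanh(p + c) for p, c in zip(projected, gene_context)]
        scores.append(sum(v_a * h for v_a, h in zip(v, hidden)))
    alpha = softmax(scores)  # attention distribution over the T tokens
    width = len(tokens[0])
    pooled = [sum(a * tok[h] for a, tok in zip(alpha, tokens)) for h in range(width)]
    return pooled, alpha
```

Because the gene context enters every token score, the same compound can receive a different attention distribution for each cell line, which is the behavior the caption describes.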
Figure 3. Model architecture of the multiscale convolutional attentive (MCA) encoder. The MCA model employed three parallel channels of convolutions over the SMILES sequence with kernel sizes K and one residual channel operating directly on the token level. Each channel applied a separate gene attention layer before the (convolved) SMILES and the filtered genes were fed to a multihead of four contextual-attention layers. The outputs of these 16 layers (four channels with four heads each) were concatenated and passed through a stack of dense layers to produce an IC50 prediction. For CA, GA, and SE, see Figure 2.
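The multiscale part of the MCA encoder can be sketched in pure Python as follows. This is a simplified illustration under stated assumptions: the kernel sizes, the zero padding, the ReLU activation, and the helper names `conv1d_same`, `multiscale_channels`, and `count_attention_outputs` are hypothetical and not taken from the caption, which gives the kernel sizes only as K.

```python
def conv1d_same(seq, kernel):
    """1D convolution over a token sequence with zero padding and ReLU.

    seq: T token vectors of length E.
    kernel: K taps (K odd), each an F x E weight matrix.
    Returns T vectors of length F, so the sequence length is preserved.
    """
    n_taps, n_filters, emb = len(kernel), len(kernel[0]), len(seq[0])
    pad = n_taps // 2
    zero = [0.0] * emb
    padded = [zero] * pad + list(seq) + [zero] * pad
    out = []
    for t in range(len(seq)):
        row = []
        for f in range(n_filters):
            acc = 0.0
            for k in range(n_taps):
                acc += sum(w * x for w, x in zip(kernel[k][f], padded[t + k]))
            row.append(max(acc, 0.0))  # ReLU is an assumption
        out.append(row)
    return out

def multiscale_channels(seq, kernels):
    """Residual token-level channel plus one convolved channel per kernel."""
    return [list(seq)] + [conv1d_same(seq, k) for k in kernels]

def count_attention_outputs(n_channels, n_heads=4):
    """Each channel feeds n_heads contextual-attention layers."""
    return n_channels * n_heads
```

With three convolutional kernels the function returns four channels, and `count_attention_outputs(4)` gives the 16 attention outputs named in the caption. Larger kernels let a channel summarize longer substructures of the SMILES string, while the residual channel keeps single-token resolution.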
Figure 4. Test performance of MCA on lenient splitting. Scatter plot of correlation between true and predicted drug sensitivity by a late-fusion model ensemble of all five folds. The model was fitted in log space.
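The evaluation behind Figure 4 can be sketched as follows, assuming that late fusion means averaging the per-fold predictions and that all values are log-transformed IC50s. Both points, as well as the function names, are assumptions for illustration; the caption does not specify how the five fold models were combined.

```python
import math

def late_fusion(fold_predictions):
    """Average the predictions of several fold models, sample by sample."""
    n_folds = len(fold_predictions)
    return [sum(per_sample) / n_folds for per_sample in zip(*fold_predictions)]

def rmse(y_true, y_pred):
    """Root-mean-square error between true and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(y_true))

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)
```

Given one list of test-set predictions per fold, `late_fusion` yields the ensemble prediction, which can then be compared with the measured values through `rmse` and `pearson`.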
Figure 5. Neural attention on molecules and genes. The molecular attention maps at the top show how the model's attention shifts when the thiazole group is replaced by a piperazine group. The change in attention across the two molecules is concentrated around the affected rings, suggesting that these functional groups play an important role in the mechanism of action of these tyrosine kinase inhibitors when they act on a chronic myelogenous leukemia (CML) cell line. The gene attention plot at the bottom depicts the most attended genes of the CML cell line, all of which can be linked to leukemia (for details, see text).
References
ARTICLE SECTIONSThis article references 73 other publications.
- 1Goh, G. B.; Hodas, N. O.; Siegel, C.; Vishnu, A. SMILES2Vec: An Interpretable General-Purpose Deep Neural Network for Predicting Chemical Properties. arXiv:1712.02034 [stat.ML] , arXiv preprint, 2017. https://arxiv.org/abs/1712.02034.Google ScholarThere is no corresponding record for this reference.
- 2Petrova, E. Innovation and marketing in the pharmaceutical industry; Springer, 2014; pp 19– 81.
- 3Lloyd, I.; Shimmings, A.; Scrip, P. S. Pharma R&D Annual Review 2018. https://pharmaintelligence.informa.com/resources/product-content/pharma-rd-annual-review-2018 (accessed June 25, 2018).Google ScholarThere is no corresponding record for this reference.
- 4Hargrave-Thomas, E.; Yu, B.; Reynisson, J. Serendipity in anticancer drug discovery. World J. Clin. Oncol. 2012, 3 (1), 1, DOI: 10.5306/wjco.v3.i1.1[Crossref], [PubMed], [CAS], Google Scholar4https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BC387ks12ktA%253D%253D&md5=63ec57cb90808eea53404cb6eb31948dSerendipity in anticancer drug discoveryHargrave-Thomas Emily; Yu Bo; Reynisson JohannesWorld journal of clinical oncology (2012), 3 (1), 1-6 ISSN:.It was found that the discovery of 5.8% (84/1437) of all drugs on the market involved serendipity. Of these drugs, 31 (2.2%) were discovered following an incident in the laboratory and 53 (3.7%) were discovered in a clinical setting. In addition, 263 (18.3%) of the pharmaceuticals in clinical use today are chemical derivatives of the drugs discovered with the aid of serendipity. Therefore, in total, 24.1% (347/1437) of marketed drugs can be directly traced to serendipitous events confirming the importance of this elusive phenomenon. In the case of anticancer drugs, 35.2% (31/88) can be attributed to a serendipitous event, which is somewhat larger than for all drugs. The therapeutic field that has benefited the most from serendipity are central nervous system active drugs reflecting the difficulty in designing compounds to pass the blood-brain-barrier and the lack of laboratory-based assays for many of the diseases of the mind.
- 5. Geeleher, P.; Cox, N. J.; Huang, R. S. Cancer biomarker discovery is improved by accounting for variability in general levels of drug sensitivity in pre-clinical models. Genome Biol. 2016, 17, 190, DOI: 10.1186/s13059-016-1050-9
- 6. De Niz, C.; Rahman, R.; Zhao, X.; Pal, R. Algorithms for drug sensitivity prediction. Algorithms 2016, 9, 77, DOI: 10.3390/a9040077
- 7. Ali, M.; Aittokallio, T. Machine learning and feature selection for drug response prediction in precision oncology applications. Biophys. Rev. 2019, 11, 31, DOI: 10.1007/s12551-018-0446-z
- 8. Costello, J. C. A community effort to assess and improve drug sensitivity prediction algorithms. Nat. Biotechnol. 2014, 32, 1202, DOI: 10.1038/nbt.2877
- 9. Garnett, M. J. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature 2012, 483, 570, DOI: 10.1038/nature11005
- 10. Kalamara, A.; Tobalina, L.; Saez-Rodriguez, J. How to find the right drug for each patient? Advances and challenges in pharmacogenomics. Curr. Opin. Syst. Biol. 2018, 10, 53, DOI: 10.1016/j.coisb.2018.07.001
- 11. Yang, W. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2012, 41, D955–D961, DOI: 10.1093/nar/gks1111
- 12. Tan, M. Prediction of anti-cancer drug response by kernelized multi-task learning. Artificial intelligence in medicine 2016, 73, 70–77, DOI: 10.1016/j.artmed.2016.09.004
- 13. Tan, M.; Özgül, O. F.; Bardak, B.; Ekşioğlu, I.; Sabuncuoğlu, S. Drug response prediction by ensemble learning and drug-induced gene expression signatures. arXiv:1802.03800, arXiv preprint, 2018. https://arxiv.org/abs/1802.03800.
- 14. Turki, T.; Wei, Z. A link prediction approach to cancer drug sensitivity prediction. BMC Syst. Biol. 2017, 11, 94, DOI: 10.1186/s12918-017-0463-8
- 15. Menden, M. P. Machine learning prediction of cancer cell sensitivity to drugs based on genomic and chemical properties. PLoS One 2013, 8, e61318, DOI: 10.1371/journal.pone.0061318
- 16. Ammad-Ud-Din, M. Integrative and personalized QSAR analysis in cancer by kernelized Bayesian matrix factorization. J. Chem. Inf. Model. 2014, 54, 2347–2359, DOI: 10.1021/ci500152b
- 17. Zhang, N. Predicting anticancer drug responses using a dual-layer integrated cell line-drug network model. PLoS Comput. Biol. 2015, 11, e1004498, DOI: 10.1371/journal.pcbi.1004498
- 18. Wang, Y.; Fang, J.; Chen, S. Inferences of drug responses in cancer cells from cancer genomic features and compound chemical and therapeutic properties. Sci. Rep. 2016, 6, 32679, DOI: 10.1038/srep32679
- 19. Ding, M. Q.; Chen, L.; Cooper, G. F.; Young, J. D.; Lu, X. Precision oncology beyond targeted therapy: Combining omics data with machine learning matches the majority of cancer cells to effective therapeutics. Mol. Cancer Res. 2018, 16, 269–278, DOI: 10.1158/1541-7786.MCR-17-0378
- 20. Wang, L.; Li, X.; Zhang, L.; Gao, Q. Improved anticancer drug response prediction in cell lines using matrix factorization with similarity regularization. BMC Cancer 2017, 17, 513, DOI: 10.1186/s12885-017-3500-5
- 21. Yuan, H.; Paskov, I.; Paskov, H.; González, A. J.; Leslie, C. S. Multitask learning improves prediction of cancer drug sensitivity. Sci. Rep. 2016, 6, 31619, DOI: 10.1038/srep31619
- 22. Stanfield, Z.; Coşkun, M.; Koyutürk, M. Drug response prediction as a link prediction problem. Sci. Rep. 2017, 7, 40321, DOI: 10.1038/srep40321
- 23. Liu, H.; Zhao, Y.; Zhang, L.; Chen, X. Anti-cancer drug response prediction using neighbor-based collaborative filtering with global effect removal. Mol. Ther.--Nucleic Acids 2018, 13, 303–311, DOI: 10.1016/j.omtn.2018.09.011
- 24. Zhang, L.; Chen, X.; Guan, N.-N.; Liu, H.; Li, J.-Q. A hybrid interpolation weighted collaborative filtering method for anti-cancer drug response prediction. Front. Pharmacol. 2018, 9, 01017, DOI: 10.3389/fphar.2018.01017
- 25. Oskooei, A.; Manica, M.; Mathis, R.; Martínez, M. R. Network-based Biased Tree Ensembles (NetBiTE) for Drug Sensitivity Prediction and Drug Sensitivity Biomarker Identification in Cancer. arXiv:1808.06603 [q-bio.QM], arXiv preprint, 2018. https://arxiv.org/abs/1808.06603
- 26. Zhang, F.; Wang, M.; Xi, J.; Yang, J.; Li, A. A novel heterogeneous network-based method for drug response prediction in cancer cell lines. Sci. Rep. 2018, 8, 3355, DOI: 10.1038/s41598-018-21622-4
- 27. Cereto-Massagué, A. Molecular fingerprint similarity search in virtual screening. Methods 2015, 71, 58–63, DOI: 10.1016/j.ymeth.2014.08.005
- 28. Chen, H.; Engkvist, O.; Wang, Y.; Olivecrona, M.; Blaschke, T. The rise of deep learning in drug discovery. Drug Discovery Today 2018, 23, 1241, DOI: 10.1016/j.drudis.2018.01.039
- 29. Grapov, D.; Fahrmann, J.; Wanichthanarak, K.; Khoomrung, S. Rise of deep learning for genomic, proteomic, and metabolomic data integration in precision medicine. Omics: a journal of integrative biology 2018, 22, 630–636, DOI: 10.1089/omi.2018.0097
- 30. Wu, Z. MoleculeNet: a benchmark for molecular machine learning. Chem. Sci. 2018, 9, 513–530, DOI: 10.1039/C7SC02664A
- 31. Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv:1409.0473 [cs.CL], arXiv preprint, 2014. https://arxiv.org/abs/1409.0473
- 32. Weininger, D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. J. Chem. Inf. Model. 1988, 28, 31–36, DOI: 10.1021/ci00057a005
- 33. Jastrzębski, S.; Leśniak, D.; Czarnecki, W. M. Learning to SMILE (S). arXiv:1602.06289 [cs.CL], arXiv preprint, 2016. https://arxiv.org/abs/1602.06289
- 34. Schwaller, P.; et al. Molecular transformer for chemical reaction prediction and uncertainty estimation. arXiv:1811.02633 [physics.chem-ph], arXiv preprint, 2018. https://arxiv.org/abs/1811.02633
- 35. Bjerrum, E. J. SMILES enumeration as data augmentation for neural network modeling of molecules. arXiv:1703.07076 [cs.LG], arXiv preprint, 2017. https://arxiv.org/abs/1703.07076
- 36. Segler, M. H.; Kogej, T.; Tyrchan, C.; Waller, M. P. Generating focused molecule libraries for drug discovery with recurrent neural networks. ACS Cent. Sci. 2018, 4, 120–131, DOI: 10.1021/acscentsci.7b00512
- 37. Bai, S.; Kolter, J. Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv:1803.01271 [cs.LG], arXiv preprint, 2018. https://arxiv.org/abs/1803.01271
- 38. Kimber, T. B.; Engelke, S.; Tetko, I. V.; Bruno, E.; Godin, G. Synergy Effect between Convolutional Neural Networks and the Multiplicity of SMILES for Improvement of Molecular Prediction. arXiv:1812.04439 [cs.LG], arXiv preprint, 2018. https://arxiv.org/abs/1812.04439
- 39. Chang, Y.; Park, H.; Yang, H.-J.; Lee, S.; Lee, K.-Y.; Kim, T. S.; Jung, J.; Shin, J.-M. Cancer Drug Response Profile scan (CDRscan): A Deep Learning Model That Predicts Drug Effectiveness from Cancer Genomic Signature. Sci. Rep. 2018, 8, 8857, DOI: 10.1038/s41598-018-27214-6
- 40. Yang, M.; Simm, J.; Lam, C. C.; Zakeri, P.; van Westen, G. J. P.; Moreau, Y.; Saez-Rodriguez, J. Linking drug target and pathway activation for effective therapy using multi-task learning. Sci. Rep. 2018, 8, 8322, DOI: 10.1038/s41598-018-25947-y
- 41. Oskooei, A. PaccMann: Prediction of anticancer compound sensitivity with multi-modal attention-based neural networks. arXiv:1811.06802 [cs.LG], arXiv preprint, 2018. https://arxiv.org/abs/1811.06802
- 42. Rogers, D.; Hahn, M. Extended-Connectivity Fingerprints. J. Chem. Inf. Model. 2010, 50, 742–754, DOI: 10.1021/ci100050t
- 43. Iorio, F. A landscape of pharmacogenomic interactions in cancer. Cell 2016, 166, 740–754, DOI: 10.1016/j.cell.2016.06.017
- 44. Szklarczyk, D. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015, 43, D447–D452, DOI: 10.1093/nar/gku1003
- 45. Hofree, M.; Shen, J. P.; Carter, H.; Gross, A.; Ideker, T. Network-based stratification of tumor mutations. Nat. Methods 2013, 10, 1108, DOI: 10.1038/nmeth.2651
- 46. Unterthiner, T.; et al. Deep learning as an opportunity in virtual screening. Proceedings of the Deep Learning Workshop at NIPS, 2014, 1–9
- 47. Schwaller, P.; Gaudin, T.; Lanyi, D.; Bekas, C.; Laino, T. Found in Translation: predicting outcomes of complex organic chemistry reactions using neural sequence-to-sequence models. Chem. Sci. 2018, 9, 6091–6098, DOI: 10.1039/C8SC02339E
- 48Cho, K.; et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078 [cs.CL], arXiv preprint, 2014. https://arxiv.org/abs/1406.1078.
- 49Koprowski, R.; Foster, K. R. Machine learning and medicine: book review and commentary. BioMed. Eng. OnLine 2018, 17, 17, DOI: 10.1186/s12938-018-0449-9
- 50Yang, Z. Hierarchical attention networks for document classification. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2016, 1480–1489, DOI: 10.18653/v1/N16-1174
- 51Gupta, A.; Kumar, B. S.; Negi, A. S. Current status on development of steroids as anticancer agents. J. Steroid Biochem. Mol. Biol. 2013, 137, 242–270, DOI: 10.1016/j.jsbmb.2013.05.011
- 52Vaswani, A.; et al. Attention is all you need. Advances in Neural Information Processing Systems 30, NIPS 2017; pp 5998–6008.
- 53Li, V.; Maki, A. Feature Contraction: New ConvNet Regularization in Image Classification. BMVC 2018.
- 54Kingma, D. P.; Ba, J. Adam: A method for stochastic optimization. arXiv:1412.6980 [cs.LG], arXiv preprint, 2014. https://arxiv.org/abs/1412.6980.
- 55Jiao, Q.; Bi, L.; Ren, Y.; Song, S.; Wang, Q.; Wang, Y.-s. Advances in studies of tyrosine kinase inhibitors and their acquired resistance. Mol. Cancer 2018, 17, 36, DOI: 10.1186/s12943-018-0801-5
- 56Finlay, S. Multiple classifier architectures and their application to credit risk assessment. Eur. J. Oper. Res. 2011, 210, 368–378, DOI: 10.1016/j.ejor.2010.09.029
- 57Tanimoto, T. T. Elementary mathematical theory of classification and prediction. IBM Technical Report, 1958.
- 58Bajusz, D.; Rácz, A.; Héberger, K. Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminf. 2015, 7, 20, DOI: 10.1186/s13321-015-0069-3
- 59Chen, E. Y.; et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinf. 2013, 14, 128, DOI: 10.1186/1471-2105-14-128
- 60Kuleshov, M. V.; et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016, 44, W90–W97, DOI: 10.1093/nar/gkw377
- 61Mi, H.; et al. PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements. Nucleic Acids Res. 2017, 45, D183–D189, DOI: 10.1093/nar/gkw1138
- 62Kim, H.-G.; Hwang, S.-Y.; Aaronson, S. A.; Mandinova, A.; Lee, S. W. DDR1 receptor tyrosine kinase promotes prosurvival pathway through Notch1 activation. J. Biol. Chem. 2011, 286, 17672–17681, DOI: 10.1074/jbc.M111.236612
- 63Barisione, G. Heterogeneous expression of the collagen receptor DDR1 in chronic lymphocytic leukaemia and correlation with progression. Blood Cancer J. 2017, 7, e513, DOI: 10.1038/bcj.2016.121
- 64Pandzic, T.; Larsson, J.; He, L.; Kundu, S.; Ban, K.; Akhtar-Ali, M.; Hellstrom, A. R.; Schuh, A.; Clifford, R.; Blakemore, S. J.; Strefford, J. C.; Baumann, T.; Lopez-Guillermo, A.; Campo, E.; Ljungstrom, V.; Mansouri, L.; Rosenquist, R.; Sjoblom, T.; Hellstrom, M. Transposon mutagenesis reveals fludarabine-resistance mechanisms in chronic lymphocytic leukemia. Clin. Cancer Res. 2016, 22, 6217–6227, DOI: 10.1158/1078-0432.CCR-15-2903
- 65Schmidt, H. H.; et al. Deregulation of the carbohydrate (chondroitin 4) sulfotransferase 11 (CHST11) gene in a B-cell chronic lymphocytic leukemia with a t(12;14)(q23;q32). Oncogene 2004, 23, 6991–6996, DOI: 10.1038/sj.onc.1207934
- 66Renema, N.; Navet, B.; Heymann, M.-F.; Lezot, F.; Heymann, D. RANK–RANKL signalling in cancer. Biosci. Rep. 2016, 36, e00366, DOI: 10.1042/BSR20160150
- 67Heltemes-Harris, L. M.; et al. Ebf1 or Pax5 haploinsufficiency synergizes with STAT5 activation to initiate acute lymphoblastic leukemia. J. Exp. Med. 2011, 208, 1135–1149, DOI: 10.1084/jem.20101947
- 68Rainer, J.; et al. Research resource: transcriptional response to glucocorticoids in childhood acute lymphoblastic leukemia. Mol. Endocrinol. 2012, 26, 178–193, DOI: 10.1210/me.2011-1213
- 69Zhang, J. D.; Hatje, K.; Sturm, G.; Broger, C.; Ebeling, M.; Burtin, M.; Terzi, F.; Pomposiello, S. I.; Badi, L. Detect tissue heterogeneity in gene expression data with BioQC. BMC Genomics 2017, 18, 277, DOI: 10.1186/s12864-017-3661-2
- 70Blaschke, T.; Olivecrona, M.; Engkvist, O.; Bajorath, J.; Chen, H. Application of generative autoencoder in de novo molecular design. arXiv:1711.07839 [cs.LG], arXiv preprint, 2017. https://arxiv.org/abs/1711.07839.
- 71Kadurin, A.; Aliper, A.; Kazennov, A.; Mamoshina, P.; Vanhaelen, Q.; Khrabrov, K.; Zhavoronkov, A. The cornucopia of meaningful leads: Applying deep adversarial autoencoders for new molecule development in oncology. Oncotarget 2017, 8, 10883–10890, DOI: 10.18632/oncotarget.14073
- 72Popova, M.; Isayev, O.; Tropsha, A. Deep reinforcement learning for de novo drug design. Sci. Adv. 2018, 4, eaap7885, DOI: 10.1126/sciadv.aap7885
- 73Kim, S.; Chen, J.; Cheng, T.; Gindulyte, A.; He, J.; He, S.; Li, Q.; Shoemaker, B. A.; Thiessen, P. A.; Yu, B.; Zaslavsky, L.; Zhang, J.; Bolton, E. E. PubChem 2019 update: improved access to chemical data. Nucleic Acids Res. 2019, 47, D1102–D1109, DOI: 10.1093/nar/gky1033
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.molpharmaceut.9b00520.
Details of data splits, CNV inclusion, and comparisons with other regression models; the best trained model in compressed format; the processed data following both the strict split and the lenient split strategies; and a list of genes selected via network propagation (PDF)