Repositioning of 8565 Existing Drugs for COVID-19
- Kaifu Gao, Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Duc Duy Nguyen, Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
- Jiahui Chen, Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Rui Wang, Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Guo-Wei Wei* ([email protected]), Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
Abstract
The coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected over 7.1 million people and led to over 0.4 million deaths. Currently, there is no specific anti-SARS-CoV-2 medication. New drug discovery typically takes more than 10 years. Drug repositioning becomes one of the most feasible approaches for combating COVID-19. This work curates the largest available experimental data set for SARS-CoV-2 or SARS-CoV 3CL (main) protease inhibitors. On the basis of this data set, we develop validated machine learning models with relatively low root-mean-square error to screen 1553 FDA-approved drugs as well as another 7012 investigational or off-market drugs in DrugBank. We found that many existing drugs might be potentially potent to SARS-CoV-2. The druggability of many potent SARS-CoV-2 3CL protease inhibitors is analyzed. This work offers a foundation for further experimental studies of COVID-19 drug repositioning.
Note
This article is made available via the ACS COVID-19 subset for unrestricted RESEARCH re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) appeared in Wuhan, China, in late December 2019 and has rapidly spread around the world. By June 11, 2020, over 7.1 million individuals were infected, and more than 408 000 fatalities had been reported. Currently, there is no specific antiviral drug for this epidemic. It is worth noting that recently, an experimental drug, Remdesivir, has been recognized as a promising anti-SARS-CoV-2 drug. However, the high experimental value of IC50 (11.41 μM) (1) indicates that it must be used in a large dose in treating COVID-19, which is subject to side effects.
Considering the severity of this widespread dissemination and health threats, panicked patients misled by media flocked to pharmacies for Chinese medicine herbs, which were reported to “inhibit” SARS-CoV-2, despite no clinical evidence supporting the claim. Although there is also no evidence for Chloroquine’s claimed curing effect, some desperate people take it as “prophylactic” for COVID-19. Many researchers are engaged in developing anti-SARS-CoV-2 drugs. (2,3) However, new drug discovery is a long, costly, and rigorous scientific process. A more effective approach is to search for anti-SARS-CoV-2 therapies from existing drug databases.
Drug repositioning (also known as drug repurposing), which concerns the investigation of existing drugs for new therapeutic indications, has emerged as a successful strategy for drug discovery because of the reduced costs and expedited approval procedures. (4−6) Several successful examples reveal its great value in practice: Nelfinavir, initially developed to treat human immunodeficiency virus (HIV) infection, is now being used for cancer treatments. Amantadine was first designed to treat influenza caused by the type A influenza virus and is now used for Parkinson’s disease. (7) In recent years, the rapid growth of drug-related data sets, as well as open data initiatives, has led to new developments in computational drug repositioning, particularly structure-based drug repositioning (SBDR). Machine learning, network analysis, and text mining and semantic inference are three major computational approaches commonly applied in drug repositioning. (8) The rapid accumulation of genetic and structural databases (https://www.rcsb.org/ and https://www.ncbi.nlm.nih.gov/genbank/), the development of low-dimensional mathematical representations of complex biomolecular structures, (9) and the availability of advanced deep learning algorithms have made machine learning-based drug repositioning a promising approach. (8) Because of the urgent need for anti-SARS-CoV-2 drugs, computational drug repositioning is one of the most feasible strategies for discovering SARS-CoV-2 drugs.
In SBDR, one needs to select one or a few effective targets. Studies show that the SARS-CoV-2 genome is very close to that of the severe acute respiratory syndrome coronavirus (SARS-CoV). (10) The sequence identities of SARS-CoV-2 3CL protease, RNA polymerase, and the spike protein with the corresponding SARS-CoV proteins are 96.08%, 96%, and 76%, respectively (11) (see Figure S1). We, therefore, hypothesize that a potent SARS-CoV 3CL protease inhibitor is also a potent SARS-CoV-2 3CL protease inhibitor. Unfortunately, there is no effective SARS therapy at present. Nevertheless, the X-ray crystal structures of both SARS-CoV and SARS-CoV-2 3CL proteases have been reported. (12,13) Additionally, the binding affinities of SARS-CoV or SARS-CoV-2 3CL protease inhibitors from single-protein experiments are available in various databases or the original literature. Moreover, DrugBank contains about 1600 drugs approved by the U.S. Food and Drug Administration (FDA) as well as more than 7000 investigational or off-market drugs. (14) The aforementioned information provides a sound basis for developing an SBDR machine learning model for SARS-CoV-2 3CL protease inhibition. It is worth clarifying that SBDR machine learning models are driven by data and do not explicitly form the energy terms related to biophysical characteristics such as electrostatics and hydrogen bonding. Instead, these biophysical interactions are implicitly encoded in the fingerprints, and their impacts on the binding affinity are regulated by machine learning scoring functions.
In response to the pressing need for anti-SARS-CoV-2 medications, we have carefully collected 314 binding affinities for SARS-CoV or SARS-CoV-2 3CL protease inhibitors, which is the largest set available to date for this system. Machine learning models are built on these data points.
Unlike most earlier COVID-19 drug repositioning works that did not provide a target-specific cross-validation test, we have carefully optimized our machine learning model with a 10-fold cross-validation test on SARS-CoV-2 3CL protease inhibitors. We achieve a Pearson correlation coefficient of 0.78 and a root-mean-square error (RMSE) of 0.79 kcal/mol on the test sets of 10-fold cross validation tasks, which is much better than that of similar machine learning models for standard training sets in the PDBbind database (around 1.9 kcal/mol). (15) We systematically evaluate the binding affinities (BAs) of 1553 FDA-approved drugs as well as 7012 investigational or off-market drugs in the DrugBank by our 2D-fingerprint-based machine learning model. In addition, a three-dimensional (3D) pose predictor named MathPose (16) is also applied to predict the 3D binding poses. With these models, we report the top 20 potential anti-SARS-CoV-2 3CL inhibitors from the FDA-approved drugs and another top 20 from investigational or off-market drugs. We also discuss the druggability of some potent inhibitors in our training set. The information provides timely guidance for the further development of anti-SARS-CoV-2 drugs.
With the SARS-CoV-2 3CL protease as the target, we predict the binding affinities of 1553 FDA-approved drugs using our machine learning predictor. Given these predicted affinities, the top 20 potential SARS-CoV-2 inhibitors from the FDA-approved drugs are shown in Table 1. We also supply the corresponding IC50 (μM) derived from the binding affinity X (kcal/mol) via the following conversion: $\mathrm{IC}_{50} = 10^{X/1.3633} \times 10^{6}$, where $10^{X/1.3633}$ is the concentration in mol/L and the factor $10^{6}$ converts it to μM (for example, X = −9.02 kcal/mol gives about 0.24 μM). A complete list of the predicted values for 1553 FDA-approved drugs is given in the Supporting Tables (FDA_approved) in Supporting Information.
DrugID | name | brand name | predicted binding affinity (kcal/mol) | IC50 (μM) |
---|---|---|---|---|
DB01123 | Proflavine | Bayer Pessaries, Molca, Septicide | –8.37 | 0.72 |
DB01243 | Chloroxine | Capitrol | –8.24 | 0.89 |
DB08998 | Demexiptiline | Deparon, Tinoran | –8.14 | 1.06 |
DB00544 | Fluorouracil | Adrucil | –8.11 | 1.11 |
DB03209 | Oteracil | Teysuno | –8.09 | 1.16 |
DB13222 | Tilbroquinol | Intetrix | –8.08 | 1.18 |
DB01136 | Carvedilol | Coreg | –8.06 | 1.22 |
DB01033 | Mercaptopurine | Purinethol | –8.04 | 1.26 |
DB08903 | Bedaquiline | Sirturo | –8.02 | 1.29 |
DB00257 | Clotrimazole | Canesten | –8.00 | 1.35 |
DB00878 | Chlorhexidine | Betasept, Biopatch | –8.00 | 1.35 |
DB00666 | Nafarelin | Synarel | –8.00 | 1.35 |
DB01213 | Fomepizole | Antizol | –7.98 | 1.39 |
DB01656 | Roflumilast | Daxas, Daliresp | –7.97 | 1.41 |
DB00676 | Benzyl benzoate | Ascabin, Ascabiol, Ascarbin, Tenutex | –7.96 | 1.45 |
DB06663 | Pasireotide | Signifor | –7.95 | 1.47 |
DB08983 | Etofibrate | Lipo Merz Retard, Liposec | –7.94 | 1.48 |
DB06791 | Lanreotide | Somatuline | –7.94 | 1.48 |
DB00027 | Gramicidin D | Neosporin Ophthalmic | –7.94 | 1.48 |
DB00730 | Thiabendazole | Mintezol, Tresaderm, and Arbotect | –7.93 | 1.51 |
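The IC50 column above can be approximately reproduced from the binding-affinity column. The following is a minimal sketch, assuming that 10^(X/1.3633) is a concentration in mol/L and that a factor of 10^6 converts it to μM; this reading is an assumption checked against the tabulated values, and the function name is hypothetical. Small differences from the table (for example 0.73 versus 0.72 μM for Proflavine) are expected because the tabulated affinities are rounded to two decimals.

```python
def binding_affinity_to_ic50_um(ba_kcal_per_mol: float) -> float:
    """Convert a binding affinity X (kcal/mol) to an IC50 in micromolar.

    Assumption: 10**(X / 1.3633) is a molar concentration, so multiplying
    by 1e6 expresses it in micromolar.
    """
    return 10 ** (ba_kcal_per_mol / 1.3633) * 1e6


if __name__ == "__main__":
    # Binding affinities taken from the tables in this article.
    for name, ba in [("Proflavine", -8.37), ("Chloroxine", -8.24), ("Debio-1347", -9.02)]:
        print(f"{name}: {binding_affinity_to_ic50_um(ba):.2f} uM")
```

Because the relation is exponential, a difference of 1.3633 kcal/mol in binding affinity corresponds to a 10-fold change in IC50.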
We briefly describe the top 10 predicted potential anti-SARS-CoV-2 drugs from the FDA-approved set. The most potent one is Proflavine, an acriflavine derivative. It is a disinfectant that is bacteriostatic against many Gram-positive bacteria. Proflavine is toxic and carcinogenic in mammals, so it is used only as a surface disinfectant or for treating superficial wounds. In the context of SARS-CoV-2, this drug might be used to clean skin or SARS-CoV-2-contaminated materials, offering an extra layer of protection. The second drug is Chloroxine, also an antibacterial drug, which is used in infectious diarrhea, disorders of the intestinal microflora, giardiasis, and inflammatory bowel disease. It is notable that this drug belongs to the same family as Chloroquine, which was once considered as an anti-SARS-CoV-2 drug. However, according to our prediction, Chloroquine is not effective for SARS-CoV-2 3CL protease inhibition (BA: −6.92 kcal/mol). The third one, Demexiptiline, a tricyclic antidepressant, acts primarily as a norepinephrine reuptake inhibitor. The next one, Fluorouracil, is a medication used to treat cancer. By injection into a vein, it is used for colon cancer, esophageal cancer, stomach cancer, pancreatic cancer, breast cancer, and cervical cancer. The fifth drug, Oteracil, is an adjunct to antineoplastic therapy, used to reduce the toxic side effects associated with chemotherapy. The next one, Tilbroquinol, is a medication used in the treatment of intestinal amoebiasis. The seventh drug, Carvedilol, is a medication used to treat high blood pressure, congestive heart failure, and left ventricular dysfunction. The eighth drug, Mercaptopurine, is a medication used for cancer and autoimmune diseases. Specifically, it treats acute lymphocytic leukemia, chronic myeloid leukemia, Crohn’s disease, and ulcerative colitis. The ninth drug, Bedaquiline, is a medication used to treat active tuberculosis, specifically multidrug-resistant tuberculosis, in combination with other tuberculosis medications. The tenth drug, Clotrimazole, is an antifungal medication used to treat vaginal yeast infections, oral thrush, diaper rash, pityriasis versicolor, and types of ringworm including athlete’s foot and jock itch.
Using our validated machine learning model, we present the binding affinity prediction and ranking of 7012 investigational or off-market drugs. We list the top 20 from the investigational or off-market drugs in Table 2. A complete list of the predicted values can be found in the Supporting Tables (Other_drugs) in Supporting Information.
DrugID | name | predicted BA (kcal/mol) | IC50 (μM) |
---|---|---|---|
DB12903 | Debio-1347 | –9.02 | 0.24 |
DB07959 | 3-(1H-benzimidazol-2-yl)-1H-indazole | –9.01 | 0.24 |
DB07301 | 9H-carbazole | –8.96 | 0.27 |
DB07620 | 2-[(2,4-dichloro-5-methylphenyl)sulfonyl]-1,3-dinitro-5-(trifluoromethyl)benzene | –8.89 | 0.30 |
DB08036 | 6,7,12,13-tetrahydro-5H-indolo[2,3-a]pyrrolo[3,4-c]carbazol-5-one | –8.89 | 0.30 |
DB08440 | N-1,10-phenanthrolin-5-ylacetamide | –8.83 | 0.33 |
DB01767 | Hemi-Babim | –8.80 | 0.35 |
DB06828 | 5-[2-(1H-pyrrol-1-yl)ethoxy]-1H-indole | –8.73 | 0.39 |
DB14914 | Flortaucipir F-18 | –8.69 | 0.42 |
DB15033 | Flortaucipir | –8.69 | 0.42 |
DB13534 | Gedocarnil | –8.67 | 0.44 |
DB02365 | 1,10-Phenanthroline | –8.64 | 0.45 |
DB09473 | Indium In-111 oxyquinoline | –8.64 | 0.45 |
DB08512 | 6-amino-2-[(1-naphthylmethyl)amino]-3,7-dihydro-8H-imidazo[4,5-g]quinazolin-8-one | –8.60 | 0.48 |
DB01876 | Bis(5-Amidino-2-Benzimidazolyl)Methanone | –8.60 | 0.49 |
DB07919 | 7-methoxy-1-methyl-9H-β-carboline | –8.59 | 0.49 |
DB02089 | CP-526423 | –8.59 | 0.50 |
DB07837 | [4-(5-naphthalen-2-yl-1H-pyrrolo[2,3-b]pyridin-3-yl)phenyl]acetic acid | –8.53 | 0.55 |
DB08073 | (2S)-1-(1H-indol-3-yl)-3-{[5-(3-methyl-1H-indazol-5-yl)pyridin-3-yl]oxy}propan-2-amine | –8.53 | 0.55 |
DB08267 | 6-amino-4-(2-phenylethyl)-1,7-dihydro-8H-imidazo[4,5-g]quinazolin-8-one | –8.52 | 0.56 |
The prediction of binding poses is another important task in drug discovery. The goal of pose prediction is to determine the binding conformations of small-molecule ligands at the appropriate target binding site. The availability of binding poses enables researchers to understand the molecular mechanism of protein–drug interactions and elucidate fundamental biochemical processes. For example, protein–ligand pose and binding affinity predictions are major tasks in D3R Grand Challenges. (16) Molecular docking is one of the most frequently used methods for pose prediction. In this work, utilizing MathPose developed in recent work, (16) we predict and analyze the binding poses of our predicted top 3 FDA-approved drugs and predicted top 3 investigational or off-market drugs. More details on MathPose are given below. The predicted poses are described in the next section.
The first-ranking candidate from the FDA-approved drugs is Proflavine (see Figure 1a), with a predicted binding affinity to the SARS-CoV-2 3CL protease of −8.37 kcal/mol. The predicted binding pose using our MathPose (16) is illustrated in Figure 1b. It reveals that there are two hydrogen bonds formed between the drug and the SARS-CoV-2 3CL protease. The first one is between one amino of Proflavine and the O atom in the main chain of the residue Glu166 of the protease. The second one is between the other amino of the drug and the five-member ring in the side chain of the residue His41 of the protease. As a result, the binding affinity is promising.
The predicted second-best drug is Chloroxine (see Figure 1c). Its predicted binding affinity is −8.24 kcal/mol. Between the drug and the protease, there are two hydrogen bonds (see Figure 1d): One is formed by the H atom of the hydroxy of the drug with the main-chain O atom of the residue Leu141. The other one is between the hydroxy O atom of the drug and the amino in the main chain of Cys145.
The third one, Demexiptiline (see Figure 1e), has a predicted binding affinity of −8.14 kcal/mol. The hydrogen bonds between this drug and the protease are formed by the H atom of the amino on the tail of the drug with the side-chain O atom of Ser144. Hydrophobic interactions also play a critical role in the binding.
It is interesting to analyze the binding affinities of the existing drugs developed as protease inhibitors. Table 3 shows their predicted binding affinities. The predicted values by a recent study (17) are given in parentheses, and it appears that these values are overestimated. Notably, the current protease inhibitors do not have a substantial effect on the SARS-CoV-2 3CL protease. A possible reason is that SARS-CoV-2 3CL protease is genetically and structurally different from most other known proteases.
drug | predicted BA (kcal/mol) | IC50 (μM) | drug | predicted BA (kcal/mol) | IC50 (μM) |
---|---|---|---|---|---|
Remikiren | –7.42 | 3.57 | Moexipril | –6.55 | 15.63 |
Candoxatril | –7.22 | 5.05 | Trandolapril | –6.54 | 17.70 |
Darunavir | –7.16 | 5.55 | Lopinavir | –6.50 | 16.92 |
Isofluorophate | –7.09 | 6.28 | Spirapril | –6.49 | 17.16 |
Atazanavir | –7.03 (−9.57) | 6.96 | Dabigatran etexilate | –6.46 | 17.96 |
Argatroban | –7.02 | 6.98 | Apixaban | –6.44 | 18.84 |
Sitagliptin | –6.93 | 8.22 | Tipranavir | –6.39 | 20.36 |
Fosamprenavir | –6.92 | 8.26 | Lisinopril | –6.35 | 21.87 |
Quinapril | –6.91 | 8.45 | Perindopril | –6.34 | 22.10 |
Amprenavir | –6.82 | 9.83 | Cilazapril | –6.31 | 23.36 |
Benazepril | –6.81 | 10.05 | Ritonavir | –6.26 (−8.47) | 25.50 |
Rivaroxaban | –6.74 | 11.21 | Ximelagatran | –6.24 | 26.14 |
Fosinopril | –6.74 | 11.28 | Vildagliptin | –6.15 | 30.38 |
Telaprevir | –6.73 | 11.54 | Cilastatin | –6.15 | 30.40 |
Captopril | –6.72 | 11.68 | Indinavir | –6.11 | 32.91 |
Ramipril | –6.66 | 12.84 | Saxagliptin | –6.07 | 35.27 |
Enalapril | –6.66 | 12.93 | Nelfinavir | –6.05 | 36.23 |
Alogliptin | –6.62 | 13.90 | Boceprevir | –6.00 | 39.16 |
Linagliptin | –6.58 | 14.73 | Simeprevir | –5.77 (−8.29) | 58.25 |
Saquinavir | –6.56 | 15.26 | Ecabet | –5.71 | 64.15 |
Numbers in parentheses are predictions from the literature. (17)
In this section, we compare our predicted binding affinities with the corresponding experimental values for some existing drugs outside our training set. Table 4 lists our predictions along with the experimental values of these drugs. These experimental data are extracted from the recent literature. (1,18,19) The RMSE between the experimental and predicted values is 0.87 kcal/mol, showing good agreement. It is worth noting that all these data were obtained from cell-culture experiments, which leads to discrepancies when comparing them with our results, which are tailored only to the inhibition of the SARS-CoV-2 3CL protease. For example, the target of Remdesivir is the RNA-dependent RNA polymerase rather than the 3CL protease.
drug | experiment | prediction | drug | experiment | prediction |
---|---|---|---|---|---|
Remdesivir | –6.74 (1) | –6.29 | Perhexiline | –7.08 (1) | –6.67 |
Chloroquine | –7.00 (1) | –6.92 | Loperamide | –6.86 (1) | –6.98 |
Lopinavir | –6.87 (1) | –6.51 | Mefloquine | –7.31 (1) | –6.89 |
Niclosamide | –8.93 (1) | –7.66 | Amodiaquine | –7.21 (1) | –6.93 |
Proscillaridin | –7.75 (1) | –6.50 | Phenazopyridine | –6.21 (1) | –7.51 |
Penfluridol | –7.23 (1) | –6.54 | Clomiphene | –7.19 (1) | –7.12 |
Toremifene | –7.42 (1) | –7.20 | Digoxin | –9.16 (1) | –7.00 |
Hexachlorophene | –8.24 (1) | –7.37 | Thioridazine | –7.05 (1) | –6.96 |
Salinomycin | –9.02 (1) | –7.00 | Pyronaridine | –6.13 (1) | –6.68 |
Ciclesonide | –7.31 (1) | –7.04 | Ceritinib | –7.56 (1) | –6.77 |
Osimertinib | –7.48 (1) | –6.62 | Lusutrombopag | –7.39 (1) | –6.78 |
Gilteritinib | –7.05 (1) | –5.57 | Berbamine | –6.96 (1) | –6.87 |
Ivacaftor | –7.07 (1) | –6.74 | Mequitazine | –7.00 (1) | –6.41 |
Dronedarone | –7.37 (1) | –6.19 | Eltrombopag | –6.93 (1) | –6.17 |
Fluphenazine | –7.08 (18) | –6.29 | Benztropine | –6.63 (18) | –6.94 |
Chlorpromazine | –7.50 (18) | –7.00 | Terconazole | –6.71 (18) | –7.18 |
Simeprevir | –6.67 (19) | –5.77 | Boceprevir | –7.34 (19) | –6.00 |
Narlaprevir | –7.14 (19) | –6.38 |
All numbers are in kcal/mol.
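The reported RMSE of 0.87 kcal/mol can be checked directly against the table above. The sketch below transcribes the 35 experiment/prediction pairs and computes the root-mean-square error; the variable and function names are illustrative and not from the original work.

```python
import math

# (drug, experimental BA, predicted BA) in kcal/mol, transcribed from the table above.
TABLE_4 = [
    ("Remdesivir", -6.74, -6.29), ("Perhexiline", -7.08, -6.67),
    ("Chloroquine", -7.00, -6.92), ("Loperamide", -6.86, -6.98),
    ("Lopinavir", -6.87, -6.51), ("Mefloquine", -7.31, -6.89),
    ("Niclosamide", -8.93, -7.66), ("Amodiaquine", -7.21, -6.93),
    ("Proscillaridin", -7.75, -6.50), ("Phenazopyridine", -6.21, -7.51),
    ("Penfluridol", -7.23, -6.54), ("Clomiphene", -7.19, -7.12),
    ("Toremifene", -7.42, -7.20), ("Digoxin", -9.16, -7.00),
    ("Hexachlorophene", -8.24, -7.37), ("Thioridazine", -7.05, -6.96),
    ("Salinomycin", -9.02, -7.00), ("Pyronaridine", -6.13, -6.68),
    ("Ciclesonide", -7.31, -7.04), ("Ceritinib", -7.56, -6.77),
    ("Osimertinib", -7.48, -6.62), ("Lusutrombopag", -7.39, -6.78),
    ("Gilteritinib", -7.05, -5.57), ("Berbamine", -6.96, -6.87),
    ("Ivacaftor", -7.07, -6.74), ("Mequitazine", -7.00, -6.41),
    ("Dronedarone", -7.37, -6.19), ("Eltrombopag", -6.93, -6.17),
    ("Fluphenazine", -7.08, -6.29), ("Benztropine", -6.63, -6.94),
    ("Chlorpromazine", -7.50, -7.00), ("Terconazole", -6.71, -7.18),
    ("Simeprevir", -6.67, -5.77), ("Boceprevir", -7.34, -6.00),
    ("Narlaprevir", -7.14, -6.38),
]


def rmse(rows):
    """Root-mean-square error between experimental and predicted values."""
    squared_errors = [(exp - pred) ** 2 for _, exp, pred in rows]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    print(f"n = {len(TABLE_4)}, RMSE = {rmse(TABLE_4):.2f} kcal/mol")
    # Rank by absolute error to see which drugs dominate the discrepancy.
    worst = sorted(TABLE_4, key=lambda r: abs(r[1] - r[2]), reverse=True)[:3]
    print("largest deviations:", [name for name, _, _ in worst])
```

The largest deviations come from a few compounds with very strong cell-culture activity (Digoxin and Salinomycin), which is consistent with the caveat that cell-culture potency need not arise from 3CL protease inhibition.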
Among the investigational or off-market drugs, the top-ranking candidate is Debio-1347 (see Figure 2a). Its binding affinity with the SARS-CoV-2 3CL protease is predicted to be −9.02 kcal/mol. The MathPose-predicted pose is illustrated in Figure 2b. It indicates that a hydrogen bond network formed between the drug and the protease leads to the moderately high binding affinity. This network consists of two hydrogen bonds: the first is between one N atom in the pyrazole of the drug and the main-chain amino of the residue Glu166 of the protease; the second is between one N atom in the 1H-1,3-benzodiazole of the drug and the main-chain amino of the residue Gly143 of the protease.
The second-best investigational drug is 3-(1H-benzimidazol-2-yl)-1H-indazole (Figure 2c) with a predicted binding affinity of −9.01 kcal/mol. Figure 2d reveals that the drug forms two hydrogen bonds with the protease. One is between one N atom in the 1H-1,3-benzodiazole of the drug and the main-chain O atom of the residue Glu166 of the protease. The other is between one N atom in the 1H-indazole of the drug and the main-chain O atom of the residue His164 of the protease.
The third one, 9H-carbazole (see Figure 2e), also has a promising predicted affinity of −8.96 kcal/mol. As one can see from Figure 2f, a strong hydrogen bond is formed between the N atom of the drug and the main-chain O atom of the residue His164 of the protease. The hydrophobic interactions play an essential role in the binding as well.
Note that in our training set collected from the existing experimental data, 21 samples have binding affinity values lower than −9 kcal/mol. Table 5 provides a list of the top 20 SARS-CoV/SARS-CoV-2 3CL-protease inhibitors with their experimental binding affinities and estimated druggable properties. Moreover, 4 of these 21 samples have 3D experimental structures available. Although these inhibitors are not on the market yet, they serve as good starting points for the design of anti-SARS-CoV-2 drugs. A full list of our training compounds is given in the Supporting Tables (Training set) in Supporting Information.
ID | binding affinity (kcal/mol) | IC50 (μM) | synthesizability (SAS) | log P | log S |
---|---|---|---|---|---|
CHEMBL497141 | –11.08 | 0.01 | 2.4 | 2.18 | –3.65 |
PDB ID 2zu4 | –10.12 | 0.04 | 4.04 | 2.35 | –3.53 |
CHEMBL222234 | –9.95 | 0.05 | 2.26 | 2.66 | –3.59 |
CHEMBL2442057 | –9.94 | 0.05 | 2.26 | 5.39 | –6.22 |
CHEMBL213054 | –9.92 | 0.05 | 4.2 | 3.15 | –3.81 |
CHEMBL212080 | –9.87 | 0.06 | 4.25 | 3.15 | –3.76 |
CHEMBL222840 | –9.85 | 0.06 | 2.23 | 2.55 | –3.37 |
CHEMBL398437 | –9.85 | 0.06 | 2.29 | 4.12 | –5.39 |
CHEMBL222769 | –9.82 | 0.06 | 2.16 | 4.87 | –5.73 |
PDB ID 3avz | –9.80 | 0.07 | 4.65 | –1.35 | –2.33 |
CHEMBL225515 | –9.80 | 0.07 | 2.22 | 3.44 | –4.28 |
CHEMBL1929019 | –9.80 | 0.07 | 4.23 | –0.77 | –2.41 |
CHEMBL222893 | –9.57 | 0.10 | 2.21 | 4.17 | –5.01 |
PDB ID 2zu5 | –9.56 | 0.10 | 4.27 | 3.79 | –4.39 |
PDB ID 3atw | –9.55 | 0.10 | 4.63 | –0.46 | –2.47 |
CHEMBL334399 | –9.50 | 0.11 | 2.20 | 3.06 | –4.17 |
CHEMBL253905 | –9.43 | 0.12 | 2.43 | 4.78 | –5.45 |
CHEMBL403932 | –9.42 | 0.12 | 1.94 | 4.11 | –4.97 |
CHEMBL254103 | –9.25 | 0.16 | 2.10 | 2.35 | –3.34 |
CHEMBL426898 | –9.23 | 0.17 | 2.17 | 3.70 | –4.72 |
Among the SARS-CoV/SARS-CoV-2 3CL-protease complexes with 3D experimental structures available, the one with the PDB ID 2zu4 (20) is the most potent, with a binding affinity of −10.12 kcal/mol. This high binding affinity is due to a strong hydrogen bond network between the inhibitor and the protease, which consists of as many as 7 hydrogen bonds. These 7 hydrogen bonds are formed between the inhibitor and the protease residues Gln189, Gly143, His163, His164, and Glu166.
The second-best 3D-experimental structure is the one with the PDB ID 3avz, (21) and its binding affinity is −9.80 kcal/mol. A hydrogen bond network, including 7 bonds, plays an essential role in this strong binding. This network is between the inhibitor and protease residues Gln192, Thr190, His164, His163, Glu166, and Gly143.
The PDB ID of the third one is 2zu5 (20) with a binding affinity of −9.56 kcal/mol. A strong hydrogen bond network with 7 bonds can also be found in the structure. The protease residues in the network are Glu166, Phe148, His163, His164, Gly143, and Gln189.
Because His163, His164, and Glu166 appear in the hydrogen bond networks of all three structures, these three residues are likely critical to inhibitor binding.
The partition coefficient (log P), aqueous solubility (log S), and synthesizability are also critical medicinal chemistry properties for deciding whether a compound can be a drug. Synthesizability is expressed here in terms of the synthetic accessibility score (SAS), for which 1 indicates the easiest and 10 the hardest to synthesize. Here, we first calculate the log P’s, log S’s, and SASs of the 1553 FDA-approved drugs (see the Supporting Tables (FDA_approved) in Supporting Information); we then investigate whether the three properties of the inhibitors in the top three 3D experimental structures (Figure 3) are in the preferred ranges of the FDA-approved drugs.
According to the log P distribution of the FDA-approved drugs in the Supporting Tables (FDA_approved) in Supporting Information, the log P interval with a large population of the FDA-approved drugs is between −0.14 and 4.96. The log P values of the top 3 inhibitors (2zu4, 3avz, and 2zu5) are 2.35, −1.35, and 3.79, respectively, as listed in Table 5.
The log S distribution reveals that the preferred range of log S is between −5.12 and 1.76. The log S values of the top 3 inhibitors are −3.53, −2.33, and −4.39.
In the SAS distribution, most of the FDA-approved drugs have SASs between 1.84 and 3.94. The SAS values of the top 3 inhibitors are 4.04, 4.65, and 4.27.
In summary, for the inhibitor in the first ranking PDB structure 2zu4, its log P and log S are quite good for a drug. The SAS is a little higher, but it is still not too difficult to synthesize: 344 of the 1553 FDA-approved drugs have larger SASs than this inhibitor, and 56 of them even have SASs over 6.
Similarly, for the 3avz and 2zu5 inhibitors, their log S’s are very promising. Some of the log P’s and SASs are out of the preferred ranges, but many FDA-approved drugs still have worse log P’s and SASs. As a result, these top 3 inhibitors, especially the first one, could be good starting points for developing anti-SARS-CoV-2 drugs. Obviously, their toxicity will be a major concern for any further development.
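The range comparison in the preceding paragraphs can be sketched as a simple interval check. The preferred ranges are those quoted in the text, and the property values are the Table 5 entries for the three PDB structures; the dictionary layout and function name are illustrative assumptions.

```python
# Preferred ranges of FDA-approved drugs, as quoted in the text.
PREFERRED_RANGES = {
    "logP": (-0.14, 4.96),
    "logS": (-5.12, 1.76),
    "SAS": (1.84, 3.94),
}

# Property values for the three inhibitors, taken from Table 5.
INHIBITORS = {
    "2zu4": {"logP": 2.35, "logS": -3.53, "SAS": 4.04},
    "3avz": {"logP": -1.35, "logS": -2.33, "SAS": 4.65},
    "2zu5": {"logP": 3.79, "logS": -4.39, "SAS": 4.27},
}


def properties_out_of_range(props):
    """Return the property names whose values fall outside the preferred range."""
    flagged = []
    for name, (low, high) in PREFERRED_RANGES.items():
        if not low <= props[name] <= high:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    for pdb_id, props in INHIBITORS.items():
        print(pdb_id, "outside preferred range:", properties_out_of_range(props) or "none")
```

With these inputs, all three log S values fall inside the preferred range, while every SAS exceeds the upper bound of 3.94.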
We collect the training set from single-protein experimental data of SARS/SARS-CoV-2 3CL protease in public databases or the related literature.
ChEMBL is a manually curated database of bioactive molecules. (22) Currently, ChEMBL contains more than 2 million compounds only in the SMILES string format. In ChEMBL, we find 277 SARS-CoV or SARS-CoV-2 3CL protease inhibitors with reported Kd/IC50 from single-protein experiments.
Another database is PDBbind. The PDBbind database includes all the protein–ligand complexes with the crystal structures deposited in the Protein Data Bank (PDB) and their binding affinities in the form of Kd, Ki, or IC50 reported in the literature. (23) The newest PDBbind v2019 consists of 17 679 complexes as well as the binding affinities. We find another 30 inhibitors in the PDBbind v2019.
Additionally, binding affinities for four other SARS-CoV 3CL protease inhibitors and three other SARS-CoV-2 3CL protease inhibitors are extracted from refs (24) and (13), respectively. Therefore, we collected 314 SARS-CoV/SARS-CoV-2 3CL protease inhibitors with available experimental binding affinities.
The binding affinity range in this set is from −3.68 kcal/mol to −11.08 kcal/mol. The distribution is depicted in Figure S3. The top 20 inhibitors in the training set are summarized in Table 5.
DrugBank (www.drugbank.ca) (14) is a richly annotated, freely accessible online database that integrates massive drug, drug target, drug action, and drug interaction information about FDA-approved drugs as well as investigational or off-market drugs. Because of the high quality and sufficient information contained in it, the DrugBank has become one of the most popular reference drug resources used all over the world. In the current work, we extract 1553 FDA-approved drugs and 7012 investigational or off-market drugs from DrugBank and evaluate their binding affinities to the SARS-CoV-2 3CL protease.
In this work, the log P and synthesizability values are calculated by RDKit (http://www.rdkit.org); the synthesizability in RDKit is reported in terms of the synthetic accessibility score (1 means the easiest, and 10 means the hardest). The log S values are obtained via ALOGPS 2.1. (25)
The 3D binding poses in this work are predicted by the MathPose, a 3D pose predictor which converts SMILES strings into 3D poses with references of target molecules. It was the top performer in D3R Grand Challenge 4 in predicting the poses of 24 beta-secretase 1 (BACE) binders. (16) For one SMILES string, around 1000 3D structures can be generated by a common docking software tool, i.e., GLIDE. Moreover, a selected set of known complexes is redocked by the three docking software packages mentioned above to generate 100 decoy complexes per input ligand as a machine learning training set. All of those structures are optimized by a minimization component in GLIDE with the OPLS3 force field. (26) The machine learning labels will be the calculated root mean squared deviations (RMSDs) between the decoy and native structures for this training set. Furthermore, MathDL models (16) are set up and applied to select the top-ranked pose for the given ligand.
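The RMSD labels mentioned above can be illustrated with a short calculation. This is a minimal sketch rather than the MathPose implementation: it assumes both poses list the same atoms in the same order and share the receptor coordinate frame, so no superposition is applied, and the coordinates are hypothetical.

```python
import math


def pose_rmsd(decoy, native):
    """RMSD between two poses given as equal-length lists of (x, y, z) tuples.

    Assumes identical atom ordering and a shared receptor frame,
    so the coordinates are compared without any alignment step.
    """
    if len(decoy) != len(native) or not decoy:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum((a - b) ** 2 for p, q in zip(decoy, native) for a, b in zip(p, q))
    return math.sqrt(squared / len(decoy))


# Hypothetical three-atom ligand; coordinates in angstroms, for illustration only.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
decoys = {
    "decoy_a": [(0.1, 0.0, 0.0), (1.6, 0.0, 0.0), (1.6, 1.5, 0.0)],
    "decoy_b": [(2.0, 0.0, 0.0), (3.5, 0.0, 0.0), (3.5, 1.5, 0.0)],
}

# Each decoy receives its RMSD to the native pose as the regression label.
labels = {name: pose_rmsd(pose, native) for name, pose in decoys.items()}

if __name__ == "__main__":
    for name, value in sorted(labels.items(), key=lambda item: item[1]):
        print(f"{name}: RMSD = {value:.2f} A")
```

A model trained on such labels can then rank candidate poses for a new ligand by predicted RMSD and keep the lowest one.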
In the current work, we develop a machine learning model for predicting the binding affinities of SARS-CoV-2 inhibitors. Our current model is classified as a ligand-based approach, the most popular framework in computer-aided drug design owing to its simplicity in data preparation while still delivering satisfactory performance. (27−30) Because the size of the training set in our current case study is only 314, we apply the gradient-boosting decision tree (GBDT) model because of its accuracy for handling small data sets. This GBDT predictor is constructed using the gradient boosting regressor module in scikit-learn (version 0.20.1).
The 2D fingerprints of compounds are used as the input features to our GBDT predictor. A previous study shows that the consensus of ECFP4, Estate1, and Estate2 fingerprints performs the best on binding-affinity prediction tasks. (31) In this work, we also make use of this consensus. The 2D fingerprints are calculated from SMILES strings using RDKit software (version 2018.09.3) (http://www.rdkit.org).
We validate the performance of our machine learning predictor for the 314 inhibitors in the SARS-CoV-2 BA training set. We use 10-fold cross-validation, which is carried out using 50 random splittings. In Table S1 we show that our machine learning predictor is trained with the average Pearson correlation coefficient (Rp) of 0.997, the Kendall’s τ (τ) of 0.972, and RMSE of 0.095 kcal/mol. These metrics are based on the averaged values across 10 folds, and these results indicate our model is well-trained. Their averaged test performances across the 10 folds of the whole SARS-CoV-2 BA set are found to be Rp = 0.777, τ = 0.586, and RMSE = 0.792 kcal/mol. These results endorse the reliability of our model in the binding affinity prediction of SARS-CoV-2 inhibitors.
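The validation protocol above (10-fold cross-validation repeated over 50 random splits, scored by Rp, Kendall's τ, and RMSE) can be sketched in plain Python. The model fitting itself is omitted, all function names are hypothetical, and the τ shown is the simple tau-a variant without tie correction, which may differ from the variant used in the original work.

```python
import math
import random
from itertools import combinations


def repeated_kfold(n_samples, n_folds=10, n_repeats=50, seed=0):
    """Yield (train_idx, test_idx) for n_repeats random n_folds-fold splits."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        for k in range(n_folds):
            test = order[k::n_folds]  # every n_folds-th shuffled index
            held_out = set(test)
            train = [i for i in order if i not in held_out]
            yield train, test


def pearson(x, y):
    """Pearson correlation coefficient Rp."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    splits = list(repeated_kfold(314))
    print(len(splits), "train/test splits for 314 compounds")
```

In use, a regressor would be fitted on each training index set and the three metrics computed on the matching test set, then averaged over all splits.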
The current pneumonia outbreak caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved into a global pandemic.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpclett.0c01579.
3CL protease sequence identity and 3D structure similarity analysis, machine learning details, the MathDL model, and the list of nonpolar 3CL protease binding site residues (PDF)
Tables of experimental binding affinities for 314 SARS-CoV-2 3CL protease inhibitors, the predicted binding affinities of 1553 FDA-approved drugs, and 7012 investigational or off-market drugs (XLSX)
Acknowledgments
This work was supported in part by NIH Grant GM126189; NSF Grants DMS-1721024, DMS-1761320, and IIS-1900473; Michigan Economic Development Corporation; Bristol-Myers Squibb; and Pfizer. The authors thank the IBM TJ Watson Research Center, the COVID-19 High Performance Computing Consortium, NVIDIA, and MSU HPCC for computational assistance supporting this work.
References
This article references 31 other publications.
- 1. Jeon, S.; Ko, M.; Lee, J.; Choi, I.; Byun, S. Y.; Park, S.; Shum, D.; Kim, S. Identification of antiviral drug candidates against SARS-CoV-2 from FDA-approved drugs. Antimicrob. Agents Chemother. 2020, DOI: 10.1128/AAC.00819-20
- 2. MacIntyre, C. R. Wuhan novel coronavirus 2019nCoV–update January 27th 2020. Glob. Biosecur. 2019, 1, 1, DOI: 10.31646/gbio.51
- 3. Xu, Z.; Peng, C.; Shi, Y.; Zhu, Z.; Mu, K.; Wang, X.; Zhu, W. Nelfinavir was predicted to be a potential inhibitor of 2019-nCoV main protease by an integrative approach combining homology modelling, molecular docking and binding free energy calculation. bioRxiv 2020.
- 4. Brown, A. S.; Patel, C. J. A standard database for drug repositioning. Sci. Data 2017, 4, 1–7, DOI: 10.1038/sdata.2017.29
- 5. Amelio, I.; Gostev, M.; Knight, R.; Willis, A.; Melino, G.; Antonov, A. DRUGSURV: a resource for repositioning of approved and experimental drugs in oncology based on patient survival information. Cell Death Dis. 2014, 5, e1051, DOI: 10.1038/cddis.2014.9
- 6. Jin, G.; Wong, S. T. Toward better drug repositioning: prioritizing and integrating existing methods into efficient pipelines. Drug Discovery Today 2014, 19, 637–644, DOI: 10.1016/j.drudis.2013.11.005
- 7. Patwardhan, B.; Chaguturu, R. Innovative Approaches in Drug Discovery: Ethnopharmacology, Systems Biology and Holistic Targeting; Academic Press, 2016.
- 8. Li, J.; Zheng, S.; Chen, B.; Butte, A. J.; Swamidass, S. J.; Lu, Z. A survey of current trends in computational drug repositioning. Briefings Bioinf. 2016, 17, 2–12, DOI: 10.1093/bib/bbv020
- 9. Cang, Z.; Mu, L.; Wei, G.-W. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput. Biol. 2018, 14, e1005929, DOI: 10.1371/journal.pcbi.1005929
- 10. Gralinski, L. E.; Menachery, V. D. Return of the Coronavirus: 2019-nCoV. Viruses 2020, 12, 135, DOI: 10.3390/v12020135
- 11. Xu, X.; Chen, P.; Wang, J.; Feng, J.; Zhou, H.; Li, X.; Zhong, W.; Hao, P. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Sci. China: Life Sci. 2020, 63, 457–460, DOI: 10.1007/s11427-020-1637-5
- 12. Lee, T.-W.; Cherney, M. M.; Huitema, C.; Liu, J.; James, K. E.; Powers, J. C.; Eltis, L. D.; James, M. N. Crystal structures of the main peptidase from the SARS coronavirus inhibited by a substrate-like aza-peptide epoxide. J. Mol. Biol. 2005, 353, 1137–1151, DOI: 10.1016/j.jmb.2005.09.004
- 13. Zhang, L.; Lin, D.; Sun, X.; Curth, U.; Drosten, C.; Sauerhering, L.; Becker, S.; Rox, K.; Hilgenfeld, R. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science 2020, 368, 409–412, DOI: 10.1126/science.abb3405
- 14. Wishart, D. S.; Feunang, Y. D.; Guo, A. C.; Lo, E. J.; Marcu, A.; Grant, J. R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082, DOI: 10.1093/nar/gkx1037
- 15. Li, H.; Sze, K.-H.; Lu, G.; Ballester, P. J. Machine-learning scoring functions for structure-based drug lead optimization. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2020, e1465, DOI: 10.1002/wcms.1465
- 16. Nguyen, D. D.; Gao, K.; Wang, M.; Wei, G.-W. MathDL: Mathematical deep learning for D3R Grand Challenge 4. J. Comput.-Aided Mol. Des. 2020, 34, 131–147, DOI: 10.1007/s10822-019-00237-5
- 17. Beck, B. R.; Shin, B.; Choi, Y.; Park, S.; Kang, K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput. Struct. Biotechnol. J. 2020, 18, 784–790, DOI: 10.1016/j.csbj.2020.03.025
- 18. Weston, S.; Haupt, R.; Logue, J.; Matthews, K.; Frieman, M. FDA approved drugs with broad anti-coronaviral activity inhibit SARS-CoV-2 in vitro. bioRxiv 2020, DOI: 10.1101/2020.03.25.008482
- 19. Ma, C.; Hurst, B.; Hu, Y.; Szeto, T.; Tarbet, B.; Wang, J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020, DOI: 10.1038/s41422-020-0356-z
- 20. Lee, C.-C.; Kuo, C.-J.; Ko, T.-P.; Hsu, M.-F.; Tsui, Y.-C.; Chang, S.-C.; Yang, S.; Chen, S.-J.; Chen, H.-C.; Hsu, M.-C. Structural basis of inhibition specificities of 3C and 3C-like proteases by zinc-coordinating and peptidomimetic compounds. J. Biol. Chem. 2009, 284, 7646–7655, DOI: 10.1074/jbc.M807947200
- 21. Akaji, K.; Konno, H.; Mitsui, H.; Teruya, K.; Shimamoto, Y.; Hattori, Y.; Ozaki, T.; Kusunoki, M.; Sanjoh, A. Structure-based design, synthesis, and evaluation of peptide-mimetic SARS 3CL protease inhibitors. J. Med. Chem. 2011, 54, 7962–7973, DOI: 10.1021/jm200870n
- 22. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107, DOI: 10.1093/nar/gkr777
- 23. Wang, R.; Fang, X.; Lu, Y.; Wang, S. The PDBbind database: Collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J. Med. Chem. 2004, 47, 2977–2980, DOI: 10.1021/jm030580l
- 24. Bacha, U.; Barrila, J.; Gabelli, S. B.; Kiso, Y.; Mario Amzel, L.; Freire, E. Development of Broad-Spectrum Halomethyl Ketone Inhibitors Against Coronavirus Main Protease 3CLpro. Chem. Biol. Drug Des. 2008, 72, 34–49, DOI: 10.1111/j.1747-0285.2008.00679.x
- 25. Tetko, I. V.; Gasteiger, J.; Todeschini, R.; Mauri, A.; Livingstone, D.; Ertl, P.; Palyulin, V. A.; Radchenko, E. V.; Zefirov, N. S.; Makarenko, A. S. Virtual computational chemistry laboratory–design and description. J. Comput.-Aided Mol. Des. 2005, 19, 453–463, DOI: 10.1007/s10822-005-8694-y
- 26. Harder, E.; Damm, W.; Maple, J.; Wu, C.; Reboul, M.; Xiang, J. Y.; Wang, L.; Lupyan, D.; Dahlgren, M. K.; Knight, J. L. OPLS3: a force field providing broad coverage of drug-like small molecules and proteins. J. Chem. Theory Comput. 2016, 12, 281–296, DOI: 10.1021/acs.jctc.5b00864
- 27Soufan, O.; Ba-alawi, W.; Magana-Mora, A.; Essack, M.; Bajic, V. B. DPubChem: a web tool for QSAR modeling and high-throughput virtual screening. Sci. Rep. 2018, 8, 1– 10, DOI: 10.1038/s41598-018-27495-xGoogle Scholar27https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhvVCjtr3O&md5=df8cd91e6d51518519f9f12efb7a5d91DPubChem: a web tool for QSAR modeling and high-throughput virtual screeningSoufan, Othman; Ba-alawi, Wail; Magana-Mora, Arturo; Essack, Magbubah; Bajic, Vladimir B.Scientific Reports (2018), 8 (1), 1-10CODEN: SRCEC3; ISSN:2045-2322. (Nature Research)High-throughput screening (HTS) performs the exptl. testing of a large no. of chem. compds. aiming to identify those active in the considered assay. Alternatively, faster and cheaper methods of large-scale virtual screening are performed computationally through quant. structure-activity relationship (QSAR) models. However, the vast amt. of available HTS heterogeneous data and the imbalanced ratio of active to inactive compds. in an assay make this a challenging problem. Although different QSAR models have been proposed, they have certain limitations, e.g., high false pos. rates, complicated user interface, and limited utilization options. Therefore, we developed DPubChem, a novel web tool for deriving QSAR models that implement the state-of-the-art machine-learning techniques to enhance the precision of the models and enable efficient analyses of expts. from PubChem BioAssay database. DPubChem also has a simple interface that provides various options to users. DPubChem predicted active compds. for 300 datasets with an av. geometric mean and F1 score of 76.68% and 76.53%, resp. Furthermore, DPubChem builds interaction networks that highlight novel predicted links between chem. compds. and biol. assays. Using such a network, DPubChem successfully suggested a novel drug for the Niemann-Pick type C disease. 
DPubChem is freely available at www.cbrc.kaust.edu.sa/dpubchem.
- 28Peón, A.; Li, H.; Ghislat, G.; Leung, K.-S.; Wong, M.-H.; Lu, G.; Ballester, P. J. MolTarPred: a web tool for comprehensive target prediction with reliability estimation. Chem. Biol. Drug Des. 2019, 94, 1390– 1401, DOI: 10.1111/cbdd.13516Google Scholar28https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXnslyhsLk%253D&md5=9b0ec49666c6da84849ad7f9ac2e3f1eMolTarPred: A web tool for comprehensive target prediction with reliability estimationPeon, Antonio; Li, Hongjian; Ghislat, Ghita; Leung, Kwong-Sak; Wong, Man-Hon; Lu, Gang; Ballester, Pedro J.Chemical Biology & Drug Design (2019), 94 (1), 1390-1401CODEN: CBDDAL; ISSN:1747-0277. (Wiley-Blackwell)Mol. target prediction can provide a starting point to understand the efficacy and side effects of phenotypic screening hits. Unfortunately, the vast majority of in silico target prediction methods are not available as web tools. Furthermore, these are limited in the no. of targets that can be predicted, do not est. which target predictions are more reliable and/or lack comprehensive retrospective validations. We present MolTarPred ( ), a user-friendly web tool for predicting protein targets of small org. compds. It is powered by a large knowledge base comprising 607,659 compds. and 4,553 macromol. targets collected from the ChEMBL database. In about 1 min, the predicted targets for the supplied mol. will be listed in a table. The chem. structures of the query mol. and the most similar compds. annotated with the predicted target will also be shown to permit visual inspection and comparison. Practical examples of the use of MolTarPred are showcased. MolTarPred is a new resource for scientists that require a more complete knowledge of the polypharmacol. of a mol. The introduction of a reliability score constitutes an attractive functionality of MolTarPred, as it permits focusing exptl. 
confirmatory tests on the most reliable predictions, which leads to higher prospective hit rates.
- 29Sheridan, R. P.; Wang, W. M.; Liaw, A.; Ma, J.; Gifford, E. M. Extreme gradient boosting as a method for quantitative structure–activity relationships. J. Chem. Inf. Model. 2016, 56, 2353– 2360, DOI: 10.1021/acs.jcim.6b00591Google Scholar29https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhvFCgs73E&md5=b6c38759f87da65d52bb9e325240709fExtreme Gradient Boosting as a Method for Quantitative Structure-Activity RelationshipsSheridan, Robert P.; Wang, Wei Min; Liaw, Andy; Ma, Junshui; Gifford, Eric M.Journal of Chemical Information and Modeling (2016), 56 (12), 2353-2360CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)In the pharmaceutical industry it is common to generate many QSAR models from training sets contg. a large no. of mols. and a large no. of descriptors. The best QSAR methods are those that can generate the most accurate predictions but that are not overly expensive computationally. In this paper the authors compare extreme gradient boosting (XGBoost) to random forest and single-task deep neural nets on 30 inhouse data sets. While XGBoost has many adjustable parameters, the authors can define a set of std. parameters at which XGBoost makes predictions, on the av., better than those of random forest and almost as good as those of deep neural nets. The biggest strength of XGBoost is its speed. Whereas efficient use of random forest requires generating each tree in parallel on a cluster, and deep neural nets are usually run on GPUs, XGBoost can be run on a single cluster CPU in less than a third of the wall-clock time of either of the other methods.
- 30Sidorov, P.; Naulaerts, S.; Ariey-Bonnet, J.; Pasquier, E.; Ballester, P. Predicting synergism of cancer drug combinations using NCI-ALMANAC data. Front. Chem. 2019, 7, 509, DOI: 10.3389/fchem.2019.00509Google Scholar30https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB3MvkslOgsA%253D%253D&md5=e6fe7f361245c397ab3c3a57ee652c8bPredicting Synergism of Cancer Drug Combinations Using NCI-ALMANAC DataSidorov Pavel; Naulaerts Stefan; Ariey-Bonnet Jeremy; Pasquier Eddy; Ballester Pedro J; Naulaerts StefanFrontiers in chemistry (2019), 7 (), 509 ISSN:2296-2646.Drug combinations are of great interest for cancer treatment. Unfortunately, the discovery of synergistic combinations by purely experimental means is only feasible on small sets of drugs. In silico modeling methods can substantially widen this search by providing tools able to predict which of all possible combinations in a large compound library are synergistic. Here we investigate to which extent drug combination synergy can be predicted by exploiting the largest available dataset to date (NCI-ALMANAC, with over 290,000 synergy determinations). Each cell line is modeled using primarily two machine learning techniques, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), on the datasets provided by NCI-ALMANAC. This large-scale predictive modeling study comprises more than 5,000 pair-wise drug combinations, 60 cell lines, 4 types of models, and 5 types of chemical features. The application of a powerful, yet uncommonly used, RF-specific technique for reliability prediction is also investigated. The evaluation of these models shows that it is possible to predict the synergy of unseen drug combinations with high accuracy (Pearson correlations between 0.43 and 0.86 depending on the considered cell line, with XGBoost providing slightly better predictions than RF). 
We have also found that restricting to the most reliable synergy predictions results in at least 2-fold error decrease with respect to employing the best learning algorithm without any reliability estimation. Alkylating agents, tyrosine kinase inhibitors and topoisomerase inhibitors are the drugs whose synergy with other partner drugs are better predicted by the models. Despite its leading size, NCI-ALMANAC comprises an extremely small part of all conceivable combinations. Given their accuracy and reliability estimation, the developed models should drastically reduce the number of required in vitro tests by predicting in silico which of the considered combinations are likely to be synergistic.
- 31Gao, K.; Nguyen, D. D.; Sresht, V.; Mathiowetz, A. M.; Tu, M.; Wei, G.-W. Are 2D fingerprints still valuable for drug discovery?. Phys. Chem. Chem. Phys. 2020, 22, 8373– 8390, DOI: 10.1039/D0CP00305KGoogle Scholar31https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXlsVSlurw%253D&md5=dac8cf3cb16a45012daeed4df362e135Are 2D fingerprints still valuable for drug discovery?Gao, Kaifu; Nguyen, Duc Duy; Sresht, Vishnu; Mathiowetz, Alan M.; Tu, Meihua; Wei, Guo-WeiPhysical Chemistry Chemical Physics (2020), 22 (16), 8373-8390CODEN: PPCPFQ; ISSN:1463-9076. (Royal Society of Chemistry)Recently, mol. fingerprints extd. from three-dimensional (3D) structures using advanced mathematics, such as algebraic topol., differential geometry, and graph theory have been paired with efficient machine learning, esp. deep learning algorithms to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets assocd. with four typical problems, namely protein-ligand binding, toxicity, soly. and partition coeff. to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms including random forest, gradient boosted decision tree, single-task deep neural network and multitask deep neural network are employed to construct efficient 2D-fingerprint based models. Addnl., appropriate consensus models are built to further enhance the performance of 2D-fingerprint-based methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, soly., partition coeff. and protein-ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D fingerprint-based methods in complex-based protein-ligand binding affinity predictions.
Cited By
This article is cited by 78 publications.
- Maria del Mar Villanueva Guzman, Natalie J. LoMascolo, Delaina May, Caroline E. Thomas, Samantha P. Stacey, Bryan C. Mounce. Rapid Screening to Identify Antivirals against Persistent and Acute Coxsackievirus B3 Infection. ACS Infectious Diseases 2024, 10
(12)
, 4222-4232. https://doi.org/10.1021/acsinfecdis.4c00532
- Shuang Guo, Yuwei Liu, Yue Sun, Hanxiao Zhou, Yue Gao, Peng Wang, Hui Zhi, Yakun Zhang, Jing Gan, Shangwei Ning. Metabolic-Related Gene Prognostic Index for Predicting Prognosis, Immunotherapy Response, and Candidate Drugs in Ovarian Cancer. Journal of Chemical Information and Modeling 2024, 64
(3)
, 1066-1080. https://doi.org/10.1021/acs.jcim.3c01473
- Yanling Wu, Kun Li, Menglong Li, Xuemei Pu, Yanzhi Guo. Attention Mechanism-Based Graph Neural Network Model for Effective Activity Prediction of SARS-CoV-2 Main Protease Inhibitors: Application to Drug Repurposing as Potential COVID-19 Therapy. Journal of Chemical Information and Modeling 2023, 63
(22)
, 7011-7031. https://doi.org/10.1021/acs.jcim.3c01280
- Danyang Xiong, Xiaoyu Zhao, Song Luo, John Z. H. Zhang, Lili Duan. Molecular Mechanism of the Non-Covalent Orally Targeted SARS-CoV-2 Mpro Inhibitor S-217622 and Computational Assessment of Its Effectiveness against Mainstream Variants. The Journal of Physical Chemistry Letters 2022, 13
(38)
, 8893-8901. https://doi.org/10.1021/acs.jpclett.2c02428
- Kaifu Gao, Rui Wang, Jiahui Chen, Limei Cheng, Jaclyn Frishcosy, Yuta Huzumi, Yuchi Qiu, Tom Schluckbier, Xiaoqi Wei, Guo-Wei Wei. Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2. Chemical Reviews 2022, 122
(13)
, 11287-11368. https://doi.org/10.1021/acs.chemrev.1c00965
- Trung Hai Nguyen, Phuong-Thao Tran, Ngoc Quynh Anh Pham, Van-Hai Hoang, Dinh Minh Hiep, Son Tung Ngo. Identifying Possible AChE Inhibitors from Drug-like Molecules via Machine Learning and Experimental Studies. ACS Omega 2022, 7
(24)
, 20673-20682. https://doi.org/10.1021/acsomega.2c00908
- Kaifu Gao, Rui Wang, Jiahui Chen, Jetze J. Tepe, Faqing Huang, Guo-Wei Wei. Perspectives on SARS-CoV-2 Main Protease Inhibitors. Journal of Medicinal Chemistry 2021, 64
(23)
, 16922-16955. https://doi.org/10.1021/acs.jmedchem.1c00409
- Kaifu Gao, Dong Chen, Alfred J. Robison, Guo-Wei Wei. Proteome-Informed Machine Learning Studies of Cocaine Addiction. The Journal of Physical Chemistry Letters 2021, 12
(45)
, 11122-11134. https://doi.org/10.1021/acs.jpclett.1c03133
- Hazem Mslati, Francesco Gentile, Carl Perez, Artem Cherkasov. Comprehensive Consensus Analysis of SARS-CoV-2 Drug Repurposing Campaigns. Journal of Chemical Information and Modeling 2021, 61
(8)
, 3771-3788. https://doi.org/10.1021/acs.jcim.1c00384
- Son Tung Ngo, Nguyen Minh Tam, Minh Quan Pham, Trung Hai Nguyen. Benchmark of Popular Free Energy Approaches Revealing the Inhibitors Binding to SARS-CoV-2 Mpro. Journal of Chemical Information and Modeling 2021, 61
(5)
, 2302-2312. https://doi.org/10.1021/acs.jcim.1c00159
- Oleg V. Prezhdo (Executive Editor, The Journal of Physical Chemistry Letters). Advancing Physical Chemistry with Machine Learning. The Journal of Physical Chemistry Letters 2020, 11
(22)
, 9656-9658. https://doi.org/10.1021/acs.jpclett.0c03130
- Saroj Kumar Panda, Pratyush Pani, Parth Sarthi Sen Gupta, Nimai Mahanandia, Malay Kumar Rana. Computational Assessment of Clinical Drugs against SARS‐CoV‐2: Foreseeing Molecular Mechanisms and Potent Mpro Inhibitors. ChemPhysChem 2025, 26
(2)
https://doi.org/10.1002/cphc.202400814
- Quynh Mai Thai, Trung Hai Nguyen, George Binh Lenon, Huong Thi Thu Phung, Jim-Tong Horng, Phuong-Thao Tran, Son Tung Ngo. Estimating AChE inhibitors from MCE database by machine learning and atomistic calculations. Journal of Molecular Graphics and Modelling 2024, 64 , 108906. https://doi.org/10.1016/j.jmgm.2024.108906
- Quynh Mai Thai, Minh Quan Pham, Phuong-Thao Tran, Trung Hai Nguyen, Son Tung Ngo. Searching for potential acetylcholinesterase inhibitors: a combined approach of multi-step similarity search, machine learning and molecular dynamics simulations. Royal Society Open Science 2024, 11
(10)
https://doi.org/10.1098/rsos.240546
- Amin Gasmi, Sadaf Noor, Alain Menzel, Nataliia Khanyk, Yuliya Semenova, Roman Lysiuk, Nataliya Beley, Liliia Bolibrukh, Asma Gasmi Benahmed, Olha Storchylo, Geir Bjørklund. Potential Drugs in COVID-19 Management. Current Medicinal Chemistry 2024, 31
(22)
, 3245-3264. https://doi.org/10.2174/0929867331666230717154101
- Trung Hai Nguyen, Quynh Mai Thai, Minh Quan Pham, Pham Thi Hong Minh, Huong Thi Thu Phung. Machine learning combines atomistic simulations to predict SARS-CoV-2 Mpro inhibitors from natural compounds. Molecular Diversity 2024, 28
(2)
, 553-561. https://doi.org/10.1007/s11030-023-10601-1
- Shagufta Quazi, Sampa Karmakar Singh, Rudra Prasad Saha, Arpita Das, Manoj Kumar Singh. ARTIFICIAL INTELLIGENCE IN TACKLING CORONAVIRUS AND FUTURE PANDEMICS. Journal of Experimental Biology and Agricultural Sciences 2024, 12
(1)
, 124-137. https://doi.org/10.18006/2024.12(1).124.137
- Yuan Luo, Bradley J. Nelson. Accelerating Iterated Persistent Homology Computations with Warm Starts. Computational Geometry 2024, , 102089. https://doi.org/10.1016/j.comgeo.2024.102089
- Huiwen Wang. Prediction of protein–ligand binding affinity via deep learning models. Briefings in Bioinformatics 2024, 25
(2)
https://doi.org/10.1093/bib/bbae081
- Jie Dong, Mihayl Varbanov, Stéphanie Philippot, Fanny Vreken, Wen-bin Zeng, Vincent Blay. Ligand-based discovery of coronavirus main protease inhibitors using MACAW molecular embeddings. Journal of Enzyme Inhibition and Medicinal Chemistry 2023, 38
(1)
, 24-35. https://doi.org/10.1080/14756366.2022.2132486
- Lili Duan, Bolin Tang, Song Luo, Danyang Xiong, Qihang Wang, Xiaole Xu, John Z. H. Zhang. Entropy driven cooperativity effect in multi-site drug optimization targeting SARS-CoV-2 papain-like protease. Cellular and Molecular Life Sciences 2023, 80
(11)
https://doi.org/10.1007/s00018-023-04985-4
- Naser Zaeri. Artificial intelligence and machine learning responses to COVID-19 related inquiries. Journal of Medical Engineering & Technology 2023, 47
(6)
, 301-320. https://doi.org/10.1080/03091902.2024.2321846
- Marim Elkashlan, Rahaf M. Ahmad, Malak Hajar, Fatma Al Jasmi, Juan Manuel Corchado, Nurul Athirah Nasarudin, Mohd Saberi Mohamad. A review of SARS-CoV-2 drug repurposing: databases and machine learning models. Frontiers in Pharmacology 2023, 14 https://doi.org/10.3389/fphar.2023.1182465
- Nguyen Minh Tam, Linh Hoang Tran, Quan V. Vo, Minh Quan Pham, Huong Thi Thu Phung. Designing Potential Inhibitors of SARS-CoV-2 Mpro Using Deep Learning and Steered Molecular Dynamic Simulations. Journal of Computational Biophysics and Chemistry 2023, 22
(05)
, 525-540. https://doi.org/10.1142/S2737416523500242
- Hongsong Feng, Jian Jiang, Guo-Wei Wei. Machine-learning repurposing of DrugBank compounds for opioid use disorder. Computers in Biology and Medicine 2023, 160 , 106921. https://doi.org/10.1016/j.compbiomed.2023.106921
- Hongsong Feng, Rana Elladki, Jian Jiang, Guo-Wei Wei. Machine-learning analysis of opioid use disorder informed by MOR, DOR, KOR, NOR and ZOR-based interactome networks. Computers in Biology and Medicine 2023, 157 , 106745. https://doi.org/10.1016/j.compbiomed.2023.106745
- Surojit Banerjee, Debadri Banerjee, Anupama Singh, Sumit Kumar, Deep Pooja, Veerma Ram, Hitesh Kulhari, Vikas Anand Saharan. A Clinical Insight on New Discovered Molecules and Repurposed Drugs for the Treatment of COVID-19. Vaccines 2023, 11
(2)
, 332. https://doi.org/10.3390/vaccines11020332
- Hongsong Feng, Guo-Wei Wei. Virtual screening of DrugBank database for hERG blockers using topological Laplacian-assisted AI models. Computers in Biology and Medicine 2023, 153 , 106491. https://doi.org/10.1016/j.compbiomed.2022.106491
- Sisir Nandi, Sarfaraz Ahmed, Aaruni Saxena, Anil Kumar Saxena. Exploring the Pathoprofiles of SARS-COV-2 Infected Human Gut–Lungs Microbiome Crosstalks. 2023, 217-235. https://doi.org/10.1007/978-981-99-1463-0_12
- Priyanka Sharma, Tushar Joshi, Shalini Mathpal, Sushma Tamta, Subhash Chandra. Computational approaches for drug discovery against COVID-19. 2023, 321-337. https://doi.org/10.1016/B978-0-323-91794-0.00024-X
- Trung Hai Nguyen, Nguyen Minh Tam, Mai Van Tuan, Peng Zhan, Van V. Vu, Duong Tuan Quang, Son Tung Ngo. Searching for potential inhibitors of SARS-COV-2 main protease using supervised learning and perturbation calculations. Chemical Physics 2023, 564 , 111709. https://doi.org/10.1016/j.chemphys.2022.111709
- Son Tung Ngo, Trung Hai Nguyen, Nguyen Thanh Tung, Van V. Vu, Minh Quan Pham, Binh Khanh Mai. Characterizing the ligand-binding affinity toward SARS-CoV-2 Mpro
via
physics- and knowledge-based approaches. Physical Chemistry Chemical Physics 2022, 24
(48)
, 29266-29278. https://doi.org/10.1039/D2CP04476E
- Mohammed Ghalib Enayathullah, Yash Parekh, Sarena Banu, Sushma Ram, Ramakrishnan Nagaraj, Bokara Kiran Kumar, Mohammed M. Idris. Gramicidin S and melittin: potential anti-viral therapeutic peptides to treat SARS-CoV-2 infection. Scientific Reports 2022, 12
(1)
https://doi.org/10.1038/s41598-022-07341-x
- Vinicius S. Nunes, Diego F. S. Paschoal, Luiz Antônio S. Costa, Hélio F. Dos Santos. Antivirals virtual screening to SARS-CoV-2 non-structural proteins. Journal of Biomolecular Structure and Dynamics 2022, 40
(19)
, 8989-9003. https://doi.org/10.1080/07391102.2021.1921033
- M. Sadegh Saberian, Kathleen P. Moriarty, Andrea D. Olmstead, Christian Hallgrimson, Francois Jean, Ivan R. Nabi, Maxwell W. Libbrecht, Ghassan Hamarneh. DEEMD: Drug Efficacy Estimation Against SARS-CoV-2 Based on Cell Morphology With Deep Multiple Instance Learning. IEEE Transactions on Medical Imaging 2022, 41
(11)
, 3128-3145. https://doi.org/10.1109/TMI.2022.3178523
- Hassan Hashemi, Shiva Ghareghani, Nasrin Nasimi, Mohammad Shahbazi, Zahra Derakhshan, Samuel Asumadu Sarkodie. Health Consequences of Overexposure to Disinfectants and Self-Medication against SARS-CoV-2: A Cautionary Tale Review. Sustainability 2022, 14
(20)
, 13614. https://doi.org/10.3390/su142013614
- Sanjay Kumar, Geethu S Kumar, Subhrangsu Sundar Maitra, Petr Malý, Shiv Bharadwaj, Pradeep Sharma, Vivek Dhar Dwivedi. Viral informatics: bioinformatics-based solution for managing viral infections. Briefings in Bioinformatics 2022, 23
(5)
https://doi.org/10.1093/bib/bbac326
- Quynh Mai Thai, T. Ngoc Han Pham, Dinh Minh Hiep, Minh Quan Pham, Phuong-Thao Tran, Trung Hai Nguyen, Son Tung Ngo. Searching for AChE inhibitors from natural compounds by using machine learning and atomistic simulations. Journal of Molecular Graphics and Modelling 2022, 115 , 108230. https://doi.org/10.1016/j.jmgm.2022.108230
- Faheem Ahmed, Afaque Manzoor Soomro, Abdul Rahim Chethikkattuveli Salih, Anupama Samantasinghar, Arun Asif, In Suk Kang, Kyung Hyun Choi. A comprehensive review of artificial intelligence and network based approaches to drug repurposing in Covid-19. Biomedicine & Pharmacotherapy 2022, 153 , 113350. https://doi.org/10.1016/j.biopha.2022.113350
- Yi Cong, Toshinori Endo. Multi-Omics and Artificial Intelligence-Guided Drug Repositioning: Prospects, Challenges, and Lessons Learned from COVID-19. OMICS: A Journal of Integrative Biology 2022, 26
(7)
, 361-371. https://doi.org/10.1089/omi.2022.0068
- Haoran Peng, Cuiling Ding, Liangliang Jiang, Wanda Tang, Yan Liu, Lanjuan Zhao, Zhigang Yi, Hao Ren, Chong Li, Yanhua He, Xu Zheng, Hailin Tang, Zhihui Chen, Zhongtian Qi, Ping Zhao. Discovery of potential anti-SARS-CoV-2 drugs based on large-scale screening in vitro and effect evaluation in vivo. Science China Life Sciences 2022, 65
(6)
, 1181-1197. https://doi.org/10.1007/s11427-021-2031-7
- Huiwen Wang, Haoquan Liu, Shangbo Ning, Chengwei Zeng, Yunjie Zhao. DLSSAffinity: protein–ligand binding affinity prediction
via
a deep learning model. Physical Chemistry Chemical Physics 2022, 24
(17)
, 10124-10133. https://doi.org/10.1039/D1CP05558E
- Rohoullah Firouzi, Mitra Ashouri, Mohammad Hossein Karimi‐Jafari. Structural insights into the substrate‐binding site of main protease for the structure‐based COVID‐19 drug discovery. Proteins: Structure, Function, and Bioinformatics 2022, 90
(5)
, 1090-1101. https://doi.org/10.1002/prot.26318
- Matjaž Simončič, Miha Lukšič, Maksym Druchok. Machine learning assessment of the binding region as a tool for more efficient computational receptor-ligand docking. Journal of Molecular Liquids 2022, 353 , 118759. https://doi.org/10.1016/j.molliq.2022.118759
- Yasin Panahi, Masoomeh Dadkhah, Sahand Talei, Zahra Gharari, Vahid Asghariazar, Arash Abdolmaleki, Somayeh Matin, Soheila Molaei. Can Anti-Parasitic Drugs Help Control COVID-19?. Future Virology 2022, 17
(5)
, 315-339. https://doi.org/10.2217/fvl-2021-0160
- Tymofii Nikolaienko, Oleksandr Gurbych, Maksym Druchok. Complex machine learning model needs complex testing: Examining predictability of molecular binding affinity by a graph neural network. Journal of Computational Chemistry 2022, 43
(10)
, 728-739. https://doi.org/10.1002/jcc.26831
- Martina Veit-Acosta, Walter Filgueira de Azevedo Junior. Computational Prediction of Binding Affinity for CDK2-ligand Complexes.
A Protein Target for Cancer Drug Discovery. Current Medicinal Chemistry 2022, 29
(14)
, 2438-2455. https://doi.org/10.2174/0929867328666210806105810
- Justin Airas, Catherine A. Bayas, Abdellah N'Ait Ousidi, Moulay Youssef Ait Itto, Aziz Auhmani, Mohamed Loubidi, M'hamed Esseffar, Julie A. Pollock, Carol A. Parish. Investigating novel thiazolyl-indazole derivatives as scaffolds for SARS-CoV-2 MPro inhibitors. European Journal of Medicinal Chemistry Reports 2022, 4 , 100034. https://doi.org/10.1016/j.ejmcr.2022.100034
- Luca Falciola, Massimo Barbieri. Searching and Analyzing Patent-relevant COVID-19 Information. World Patent Information 2022, 68 , 102094. https://doi.org/10.1016/j.wpi.2022.102094
- Domenico Iacopetta, Jessica Ceramella, Alessia Catalano, Carmela Saturnino, Michele Pellegrino, Annaluisa Mariconda, Pasquale Longo, Maria Stefania Sinicropi, Stefano Aquaro. COVID-19 at a Glance: An Up-to-Date Overview on Variants, Drug Design and Therapies. Viruses 2022, 14
(3)
, 573. https://doi.org/10.3390/v14030573
- Lakshmi Narasimha Gunturu, Girirajasekhar Dornadula. Internet of Health Things (IoHT): The Significance of Virtual Tools Aiding to Overcome Novel Coronavirus (COVID-19) Pandemic. 2022, 23-43. https://doi.org/10.1007/978-981-16-3783-4_2
- Chandana Mohanty, Chiluka Vinod, Sarbari Acharya, Nikita Mahapatra. COVID-19 Drug Repositioning: Present Status and Prospects. 2022, 645-671. https://doi.org/10.1007/978-3-030-72834-2_19
- Illya Aronskyy, Yosef Masoudi-Sobhanzadeh, Antonio Cappuccio, Elena Zaslavsky. Advances in the computational landscape for repurposed drugs against COVID-19. Drug Discovery Today 2021, 26
(12)
, 2800-2815. https://doi.org/10.1016/j.drudis.2021.07.026
- Nguyen Minh Tam, Duc-Hung Pham, Dinh Minh Hiep, Phuong-Thao Tran, Duong Tuan Quang, Son Tung Ngo. Searching and designing potential inhibitors for SARS-CoV-2 Mpro from natural sources using atomistic and deep-learning calculations. RSC Advances 2021, 11
(61)
, 38495-38504. https://doi.org/10.1039/D1RA06534C
- Prem Prakash Sharma, Meenakshi Bansal, Aaftaab Sethi, Poonam, Lindomar Pena, Vijay Kumar Goel, Maria Grishina, Shubhra Chaturvedi, Dhruv Kumar, Brijesh Rathi. Computational methods directed towards drug repurposing for COVID-19: advantages and limitations. RSC Advances 2021, 11
(57)
, 36181-36198. https://doi.org/10.1039/D1RA05320E
- Tuan Xu, Wei Zheng, Ruili Huang. High‐throughput screening assays for SARS‐CoV‐2 drug development: Current status and future directions. Drug Discovery Today 2021, 26
(10)
, 2439-2444. https://doi.org/10.1016/j.drudis.2021.05.012
- Lian Wang, Yonggang Zhang, Dongguang Wang, Xiang Tong, Tao Liu, Shijie Zhang, Jizhen Huang, Li Zhang, Lingmin Chen, Hong Fan, Mike Clarke. Artificial Intelligence for COVID-19: A Systematic Review. Frontiers in Medicine 2021, 8 https://doi.org/10.3389/fmed.2021.704256
- Stephen J Goodswen, Joel L N Barratt, Paul J Kennedy, Alexa Kaufer, Larissa Calarco, John T Ellis. Machine learning and applications in microbiology. FEMS Microbiology Reviews 2021, 45
(5)
https://doi.org/10.1093/femsre/fuab015
- Maksym Druchok, Dzvenymyra Yarish, Sofiya Garkot, Tymofii Nikolaienko, Oleksandr Gurbych. Ensembling machine learning models to boost molecular affinity prediction. Computational Biology and Chemistry 2021, 93 , 107529. https://doi.org/10.1016/j.compbiolchem.2021.107529
- Jacek Haneczok, Marcin Delijewski. Machine learning enabled identification of potential SARS-CoV-2 3CLpro inhibitors based on fixed molecular fingerprints and Graph-CNN neural representations. Journal of Biomedical Informatics 2021, 119 , 103821. https://doi.org/10.1016/j.jbi.2021.103821
- Nguyen Minh Tam, Minh Quan Pham, Nguyen Xuan Ha, Pham Cam Nam, Huong Thi Thu Phung. Computational estimation of potential inhibitors from known drugs against the main protease of SARS-CoV-2. RSC Advances 2021, 11
(28)
, 17478-17486. https://doi.org/10.1039/D1RA02529E
- Tanuj Sharma, Mohammed Abohashrh, Mohammad Hassan Baig, Jae-June Dong, Mohammad Mahtab Alam, Irfan Ahmad, Safia Irfan. Screening of drug databank against WT and mutant main protease of SARS-CoV-2: Towards finding potential compound for repurposing against COVID-19. Saudi Journal of Biological Sciences 2021, 28
(5)
, 3152-3159. https://doi.org/10.1016/j.sjbs.2021.02.059
- Magdi E. A. Zaki, Sami A. Al-Hussain, Vijay H. Masand, Siddhartha Akasapu, Sumit O. Bajaj, Nahed N. E. El-Sayed, Arabinda Ghosh, Israa Lewaa. Identification of Anti-SARS-CoV-2 Compounds from Food Using QSAR-Based Virtual Screening, Molecular Docking, and Molecular Dynamics Simulation Analysis. Pharmaceuticals 2021, 14
(4)
, 357. https://doi.org/10.3390/ph14040357
- Marcin Delijewski, Jacek Haneczok. AI drug discovery screening for COVID-19 reveals zafirlukast as a repurposing candidate. Medicine in Drug Discovery 2021, 9 , 100077. https://doi.org/10.1016/j.medidd.2020.100077
- Tatsuo Kanda, Reina Sasaki, Ryota Masuzaki, Mitsuhiko Moriyama. Artificial intelligence and machine learning could support drug development for hepatitis A virus internal ribosomal entry sites. Artificial Intelligence in Gastroenterology 2021, 2
(1)
, 1-9. https://doi.org/10.35712/aig.v2.i1.1
- Osvaldo Yañez, Manuel Isaías Osorio, Eugenio Uriarte, Carlos Areche, William Tiznado, José M. Pérez-Donoso, Olimpo García-Beltrán, Fernando González-Nilo. In Silico Study of Coumarins and Quinolines Derivatives as Potent Inhibitors of SARS-CoV-2 Main Protease. Frontiers in Chemistry 2021, 8 https://doi.org/10.3389/fchem.2020.595097
- Sebastián A. Cuesta, José R. Mora, Edgar A. Márquez. In Silico Screening of the DrugBank Database to Search for Possible Drugs against SARS-CoV-2. Molecules 2021, 26
(4)
, 1100. https://doi.org/10.3390/molecules26041100
- Nguyen Minh Tam, Pham Cam Nam, Duong Tuan Quang, Nguyen Thanh Tung, Van V. Vu, Son Tung Ngo. Binding of inhibitors to the monomeric and dimeric SARS-CoV-2 Mpro. RSC Advances 2021, 11
(5)
, 2926-2934. https://doi.org/10.1039/D0RA09858B
- Ivonne Buitrón-González, Giovanny Aguilera-Durán, Antonio Romo-Mancillas. In-silico drug repurposing study: Amprenavir, enalaprilat, and plerixafor, potential drugs for destabilizing the SARS-CoV-2 S-protein-angiotensin-converting enzyme 2 complex. Results in Chemistry 2021, 3 , 100094. https://doi.org/10.1016/j.rechem.2020.100094
- Daisuke Kuroda, Kouhei Tsumoto. Microsecond molecular dynamics suggest that a non-synonymous mutation, frequently observed in patients with mild symptoms in Tokyo, alters dynamics of the SARS-CoV-2 main protease. Biophysics and Physicobiology 2021, 18 (0), 215-222. https://doi.org/10.2142/biophysico.bppb-v18.022
- Atsushi Hijikata, Clara Shionyu, Setsu Nakae, Masafumi Shionyu, Motonori Ota, Shigehiko Kanaya, Tsuyoshi Shirai. Current status of structure-based drug repurposing against COVID-19 by targeting SARS-CoV-2 proteins. Biophysics and Physicobiology 2021, 18 (0), 226-240. https://doi.org/10.2142/biophysico.bppb-v18.025
- Hafsa Bareen Syeda, Mahanazuddin Syed, Kevin Wayne Sexton, Shorabuddin Syed, Salma Begum, Farhanuddin Syed, Fred Prior, Feliciano Yu Jr. Role of Machine Learning Techniques to Tackle the COVID-19 Crisis: Systematic Review. JMIR Medical Informatics 2021, 9 (1), e23811. https://doi.org/10.2196/23811
- Wenhan Guo, Yixin Xie, Alan E Lopez-Hernandez, Shengjie Sun, Lin Li. Electrostatic features for nucleocapsid proteins of SARS-CoV and SARS-CoV-2. Mathematical Biosciences and Engineering 2021, 18 (3), 2372-2383. https://doi.org/10.3934/mbe.2021120
- Kartikay Prasad, Vijay Kumar. Artificial intelligence-driven drug repurposing and structural biology for SARS-CoV-2. Current Research in Pharmacology and Drug Discovery 2021, 2, 100042. https://doi.org/10.1016/j.crphar.2021.100042
- Marzieh Masjoudi, Armin Aslani, Somayyeh Khazaeian, Azita Fathnezhad-Kazemi. Explaining the experience of prenatal care and investigating the association between psychological factors with self-care in pregnant women during COVID-19 pandemic: a mixed method study protocol. Reproductive Health 2020, 17 (1). https://doi.org/10.1186/s12978-020-00949-0
- Minh Quan Pham, Khanh B. Vu, T. Ngoc Han Pham, Le Thi Thuy Huong, Linh Hoang Tran, Nguyen Thanh Tung, Van V. Vu, Trung Hai Nguyen, Son Tung Ngo. Rapid prediction of possible inhibitors for SARS-CoV-2 main protease using docking and FPL simulations. RSC Advances 2020, 10 (53), 31991-31996. https://doi.org/10.1039/D0RA06212J
- Qingxin Li, CongBao Kang. Progress in Developing Inhibitors of SARS-CoV-2 3C-Like Protease. Microorganisms 2020, 8 (8), 1250. https://doi.org/10.3390/microorganisms8081250
- Luca Falciola, Massimo Barbieri. Searching and Analyzing Patent-Relevant Information for Evaluating COVID-19 Innovation. SSRN Electronic Journal 2020, 182. https://doi.org/10.2139/ssrn.3771756
References
This article references 31 other publications.
- (1) Jeon, S.; Ko, M.; Lee, J.; Choi, I.; Byun, S. Y.; Park, S.; Shum, D.; Kim, S. Identification of antiviral drug candidates against SARS-CoV-2 from FDA-approved drugs. Antimicrob. Agents Chemother. 2020, DOI: 10.1128/AAC.00819-20
- (2) MacIntyre, C. R. Wuhan novel coronavirus 2019nCoV–update January 27th 2020. Glob. Biosecur. 2019, 1, 1, DOI: 10.31646/gbio.51
- (3) Xu, Z.; Peng, C.; Shi, Y.; Zhu, Z.; Mu, K.; Wang, X.; Zhu, W. Nelfinavir was predicted to be a potential inhibitor of 2019-nCoV main protease by an integrative approach combining homology modelling, molecular docking and binding free energy calculation. bioRxiv 2020.
- (4) Brown, A. S.; Patel, C. J. A standard database for drug repositioning. Sci. Data 2017, 4, 1–7, DOI: 10.1038/sdata.2017.29
- (5) Amelio, I.; Gostev, M.; Knight, R.; Willis, A.; Melino, G.; Antonov, A. DRUGSURV: a resource for repositioning of approved and experimental drugs in oncology based on patient survival information. Cell Death Dis. 2014, 5, e1051, DOI: 10.1038/cddis.2014.95
- (6) Jin, G.; Wong, S. T. Toward better drug repositioning: prioritizing and integrating existing methods into efficient pipelines. Drug Discovery Today 2014, 19, 637–644, DOI: 10.1016/j.drudis.2013.11.005
- (7) Patwardhan, B.; Chaguturu, R. Innovative Approaches in Drug Discovery: Ethnopharmacology, Systems Biology and Holistic Targeting; Academic Press, 2016.
- (8) Li, J.; Zheng, S.; Chen, B.; Butte, A. J.; Swamidass, S. J.; Lu, Z. A survey of current trends in computational drug repositioning. Briefings Bioinf. 2016, 17, 2–12, DOI: 10.1093/bib/bbv020
- (9) Cang, Z.; Mu, L.; Wei, G.-W. Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening. PLoS Comput. Biol. 2018, 14, e1005929, DOI: 10.1371/journal.pcbi.1005929
- (10) Gralinski, L. E.; Menachery, V. D. Return of the Coronavirus: 2019-nCoV. Viruses 2020, 12, 135, DOI: 10.3390/v12020135
- (11) Xu, X.; Chen, P.; Wang, J.; Feng, J.; Zhou, H.; Li, X.; Zhong, W.; Hao, P. Evolution of the novel coronavirus from the ongoing Wuhan outbreak and modeling of its spike protein for risk of human transmission. Sci. China: Life Sci. 2020, 63, 457–460, DOI: 10.1007/s11427-020-1637-5
- (12) Lee, T.-W.; Cherney, M. M.; Huitema, C.; Liu, J.; James, K. E.; Powers, J. C.; Eltis, L. D.; James, M. N. Crystal structures of the main peptidase from the SARS coronavirus inhibited by a substrate-like aza-peptide epoxide. J. Mol. Biol. 2005, 353, 1137–1151, DOI: 10.1016/j.jmb.2005.09.004
- (13) Zhang, L.; Lin, D.; Sun, X.; Curth, U.; Drosten, C.; Sauerhering, L.; Becker, S.; Rox, K.; Hilgenfeld, R. Crystal structure of SARS-CoV-2 main protease provides a basis for design of improved α-ketoamide inhibitors. Science 2020, 368, 409–412, DOI: 10.1126/science.abb3405
- (14) Wishart, D. S.; Feunang, Y. D.; Guo, A. C.; Lo, E. J.; Marcu, A.; Grant, J. R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082, DOI: 10.1093/nar/gkx1037
- (15) Li, H.; Sze, K.-H.; Lu, G.; Ballester, P. J. Machine-learning scoring functions for structure-based drug lead optimization. Wiley Interdiscip. Rev.: Comput. Mol. Sci. 2020, e1465, DOI: 10.1002/wcms.1465
- (16) Nguyen, D. D.; Gao, K.; Wang, M.; Wei, G.-W. MathDL: mathematical deep learning for D3R Grand Challenge 4. J. Comput.-Aided Mol. Des. 2020, 34, 131–147, DOI: 10.1007/s10822-019-00237-5
- (17) Beck, B. R.; Shin, B.; Choi, Y.; Park, S.; Kang, K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput. Struct. Biotechnol. J. 2020, 18, 784–790, DOI: 10.1016/j.csbj.2020.03.025
- (18) Weston, S.; Haupt, R.; Logue, J.; Matthews, K.; Frieman, M. FDA approved drugs with broad anti-coronaviral activity inhibit SARS-CoV-2 in vitro. bioRxiv 2020, DOI: 10.1101/2020.03.25.008482
- (19) Ma, C.; Hurst, B.; Hu, Y.; Szeto, T.; Tarbet, B.; Wang, J. Boceprevir, GC-376, and calpain inhibitors II, XII inhibit SARS-CoV-2 viral replication by targeting the viral main protease. Cell Res. 2020, DOI: 10.1038/s41422-020-0356-z
- (20) Lee, C.-C.; Kuo, C.-J.; Ko, T.-P.; Hsu, M.-F.; Tsui, Y.-C.; Chang, S.-C.; Yang, S.; Chen, S.-J.; Chen, H.-C.; Hsu, M.-C. Structural basis of inhibition specificities of 3C and 3C-like proteases by zinc-coordinating and peptidomimetic compounds. J. Biol. Chem. 2009, 284, 7646–7655, DOI: 10.1074/jbc.M807947200
- (21) Akaji, K.; Konno, H.; Mitsui, H.; Teruya, K.; Shimamoto, Y.; Hattori, Y.; Ozaki, T.; Kusunoki, M.; Sanjoh, A. Structure-based design, synthesis, and evaluation of peptide-mimetic SARS 3CL protease inhibitors. J. Med. Chem. 2011, 54, 7962–7973, DOI: 10.1021/jm200870n
- (22) Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107, DOI: 10.1093/nar/gkr777
- (23) Wang, R.; Fang, X.; Lu, Y.; Wang, S. The PDBbind database: Collection of binding affinities for protein-ligand complexes with known three-dimensional structures. J. Med. Chem. 2004, 47, 2977–2980, DOI: 10.1021/jm030580l
- (24) Bacha, U.; Barrila, J.; Gabelli, S. B.; Kiso, Y.; Mario Amzel, L.; Freire, E. Development of Broad-Spectrum Halomethyl Ketone Inhibitors Against Coronavirus Main Protease 3CLpro. Chem. Biol. Drug Des. 2008, 72, 34–49, DOI: 10.1111/j.1747-0285.2008.00679.x
- 25Tetko, I. V.; Gasteiger, J.; Todeschini, R.; Mauri, A.; Livingstone, D.; Ertl, P.; Palyulin, V. A.; Radchenko, E. V.; Zefirov, N. S.; Makarenko, A. S. Virtual computational chemistry laboratory–design and description. J. Comput.-Aided Mol. Des. 2005, 19, 453– 463, DOI: 10.1007/s10822-005-8694-y25https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2MXhtFaht77F&md5=6e48f916c58c1e772ade43fa8e4b4b1aVirtual computational chemistry laboratory - design and descriptionTetko, Igor V.; Gasteiger, Johann; Todeschini, Roberto; Mauri, Andrea; Livingstone, David; Ertl, Peter; Palyulin, Vladimir A.; Radchenko, Eugene V.; Zefirov, Nikolay S.; Makarenko, Alexander S.; Tanchuk, Vsevolod Yu.; Prokopenko, Volodymyr V.Journal of Computer-Aided Molecular Design (2005), 19 (6), 453-463CODEN: JCADEQ; ISSN:0920-654X. (Springer)Internet technol. offers an excellent opportunity for the development of tools by the cooperative effort of various groups and institutions. We have developed a multi-platform software system, Virtual Computational Chem. Lab., http://www.vcclab.org, allowing the computational chemist to perform a comprehensive series of mol. indexes/properties calcns. and data anal. The implemented software is based on a three-tier architecture that is one of the std. technologies to provide client-server services on the Internet. The developed software includes several popular programs, including the indexes generation program, DRAGON, a 3D structure generator, CORINA, a program to predict lipophilicity and aq. soly. of chems., ALOGPS and others. All these programs are running at the host institutes located in five countries over Europe. In this article we review the main features and statistics of the developed system that can be used as a prototype for academic and industry models.
- 26Harder, E.; Damm, W.; Maple, J.; Wu, C.; Reboul, M.; Xiang, J. Y.; Wang, L.; Lupyan, D.; Dahlgren, M. K.; Knight, J. L. OPLS3: a force field providing broad coverage of drug-like small molecules and proteins. J. Chem. Theory Comput. 2016, 12, 281– 296, DOI: 10.1021/acs.jctc.5b0086426https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2MXhvVCjtbfE&md5=42663f8cfa84b80a67132bbb13b9b7ceOPLS3: A Force Field Providing Broad Coverage of Drug-like Small Molecules and ProteinsHarder, Edward; Damm, Wolfgang; Maple, Jon; Wu, Chuanjie; Reboul, Mark; Xiang, Jin Yu; Wang, Lingle; Lupyan, Dmitry; Dahlgren, Markus K.; Knight, Jennifer L.; Kaus, Joseph W.; Cerutti, David S.; Krilov, Goran; Jorgensen, William L.; Abel, Robert; Friesner, Richard A.Journal of Chemical Theory and Computation (2016), 12 (1), 281-296CODEN: JCTCCE; ISSN:1549-9618. (American Chemical Society)The parametrization and validation of the OPLS3 force field for small mols. and proteins are reported. Enhancements with respect to the previous version (OPLS2.1) include the addn. of off-atom charge sites to represent halogen bonding and aryl nitrogen lone pairs as well as a complete refit of peptide dihedral parameters to better model the native structure of proteins. To adequately cover medicinal chem. space, OPLS3 employs over an order of magnitude more ref. data and assocd. parameter types relative to other commonly used small mol. force fields (e.g., MMFF and OPLS_2005). As a consequence, OPLS3 achieves a high level of accuracy across performance benchmarks that assess small mol. conformational propensities and solvation. The newly fitted peptide dihedrals lead to significant improvements in the representation of secondary structure elements in simulated peptides and native structure stability over a no. of proteins. Together, the improvements made to both the small mol. 
and protein force field lead to a high level of accuracy in predicting protein-ligand binding measured over a wide range of targets and ligands (less than 1 kcal/mol RMS error) representing a 30% improvement over earlier variants of the OPLS force field.
- 27Soufan, O.; Ba-alawi, W.; Magana-Mora, A.; Essack, M.; Bajic, V. B. DPubChem: a web tool for QSAR modeling and high-throughput virtual screening. Sci. Rep. 2018, 8, 1– 10, DOI: 10.1038/s41598-018-27495-x27https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1cXhvVCjtr3O&md5=df8cd91e6d51518519f9f12efb7a5d91DPubChem: a web tool for QSAR modeling and high-throughput virtual screeningSoufan, Othman; Ba-alawi, Wail; Magana-Mora, Arturo; Essack, Magbubah; Bajic, Vladimir B.Scientific Reports (2018), 8 (1), 1-10CODEN: SRCEC3; ISSN:2045-2322. (Nature Research)High-throughput screening (HTS) performs the exptl. testing of a large no. of chem. compds. aiming to identify those active in the considered assay. Alternatively, faster and cheaper methods of large-scale virtual screening are performed computationally through quant. structure-activity relationship (QSAR) models. However, the vast amt. of available HTS heterogeneous data and the imbalanced ratio of active to inactive compds. in an assay make this a challenging problem. Although different QSAR models have been proposed, they have certain limitations, e.g., high false pos. rates, complicated user interface, and limited utilization options. Therefore, we developed DPubChem, a novel web tool for deriving QSAR models that implement the state-of-the-art machine-learning techniques to enhance the precision of the models and enable efficient analyses of expts. from PubChem BioAssay database. DPubChem also has a simple interface that provides various options to users. DPubChem predicted active compds. for 300 datasets with an av. geometric mean and F1 score of 76.68% and 76.53%, resp. Furthermore, DPubChem builds interaction networks that highlight novel predicted links between chem. compds. and biol. assays. Using such a network, DPubChem successfully suggested a novel drug for the Niemann-Pick type C disease. DPubChem is freely available at www.cbrc.kaust.edu.sa/dpubchem.
- 28Peón, A.; Li, H.; Ghislat, G.; Leung, K.-S.; Wong, M.-H.; Lu, G.; Ballester, P. J. MolTarPred: a web tool for comprehensive target prediction with reliability estimation. Chem. Biol. Drug Des. 2019, 94, 1390– 1401, DOI: 10.1111/cbdd.1351628https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC1MXnslyhsLk%253D&md5=9b0ec49666c6da84849ad7f9ac2e3f1eMolTarPred: A web tool for comprehensive target prediction with reliability estimationPeon, Antonio; Li, Hongjian; Ghislat, Ghita; Leung, Kwong-Sak; Wong, Man-Hon; Lu, Gang; Ballester, Pedro J.Chemical Biology & Drug Design (2019), 94 (1), 1390-1401CODEN: CBDDAL; ISSN:1747-0277. (Wiley-Blackwell)Mol. target prediction can provide a starting point to understand the efficacy and side effects of phenotypic screening hits. Unfortunately, the vast majority of in silico target prediction methods are not available as web tools. Furthermore, these are limited in the no. of targets that can be predicted, do not est. which target predictions are more reliable and/or lack comprehensive retrospective validations. We present MolTarPred ( ), a user-friendly web tool for predicting protein targets of small org. compds. It is powered by a large knowledge base comprising 607,659 compds. and 4,553 macromol. targets collected from the ChEMBL database. In about 1 min, the predicted targets for the supplied mol. will be listed in a table. The chem. structures of the query mol. and the most similar compds. annotated with the predicted target will also be shown to permit visual inspection and comparison. Practical examples of the use of MolTarPred are showcased. MolTarPred is a new resource for scientists that require a more complete knowledge of the polypharmacol. of a mol. The introduction of a reliability score constitutes an attractive functionality of MolTarPred, as it permits focusing exptl. confirmatory tests on the most reliable predictions, which leads to higher prospective hit rates.
- 29Sheridan, R. P.; Wang, W. M.; Liaw, A.; Ma, J.; Gifford, E. M. Extreme gradient boosting as a method for quantitative structure–activity relationships. J. Chem. Inf. Model. 2016, 56, 2353– 2360, DOI: 10.1021/acs.jcim.6b0059129https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28XhvFCgs73E&md5=b6c38759f87da65d52bb9e325240709fExtreme Gradient Boosting as a Method for Quantitative Structure-Activity RelationshipsSheridan, Robert P.; Wang, Wei Min; Liaw, Andy; Ma, Junshui; Gifford, Eric M.Journal of Chemical Information and Modeling (2016), 56 (12), 2353-2360CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)In the pharmaceutical industry it is common to generate many QSAR models from training sets contg. a large no. of mols. and a large no. of descriptors. The best QSAR methods are those that can generate the most accurate predictions but that are not overly expensive computationally. In this paper the authors compare extreme gradient boosting (XGBoost) to random forest and single-task deep neural nets on 30 inhouse data sets. While XGBoost has many adjustable parameters, the authors can define a set of std. parameters at which XGBoost makes predictions, on the av., better than those of random forest and almost as good as those of deep neural nets. The biggest strength of XGBoost is its speed. Whereas efficient use of random forest requires generating each tree in parallel on a cluster, and deep neural nets are usually run on GPUs, XGBoost can be run on a single cluster CPU in less than a third of the wall-clock time of either of the other methods.
- 30Sidorov, P.; Naulaerts, S.; Ariey-Bonnet, J.; Pasquier, E.; Ballester, P. Predicting synergism of cancer drug combinations using NCI-ALMANAC data. Front. Chem. 2019, 7, 509, DOI: 10.3389/fchem.2019.0050930https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A280%3ADC%252BB3MvkslOgsA%253D%253D&md5=e6fe7f361245c397ab3c3a57ee652c8bPredicting Synergism of Cancer Drug Combinations Using NCI-ALMANAC DataSidorov Pavel; Naulaerts Stefan; Ariey-Bonnet Jeremy; Pasquier Eddy; Ballester Pedro J; Naulaerts StefanFrontiers in chemistry (2019), 7 (), 509 ISSN:2296-2646.Drug combinations are of great interest for cancer treatment. Unfortunately, the discovery of synergistic combinations by purely experimental means is only feasible on small sets of drugs. In silico modeling methods can substantially widen this search by providing tools able to predict which of all possible combinations in a large compound library are synergistic. Here we investigate to which extent drug combination synergy can be predicted by exploiting the largest available dataset to date (NCI-ALMANAC, with over 290,000 synergy determinations). Each cell line is modeled using primarily two machine learning techniques, Random Forest (RF) and Extreme Gradient Boosting (XGBoost), on the datasets provided by NCI-ALMANAC. This large-scale predictive modeling study comprises more than 5,000 pair-wise drug combinations, 60 cell lines, 4 types of models, and 5 types of chemical features. The application of a powerful, yet uncommonly used, RF-specific technique for reliability prediction is also investigated. The evaluation of these models shows that it is possible to predict the synergy of unseen drug combinations with high accuracy (Pearson correlations between 0.43 and 0.86 depending on the considered cell line, with XGBoost providing slightly better predictions than RF). 
We have also found that restricting to the most reliable synergy predictions results in at least 2-fold error decrease with respect to employing the best learning algorithm without any reliability estimation. Alkylating agents, tyrosine kinase inhibitors and topoisomerase inhibitors are the drugs whose synergy with other partner drugs are better predicted by the models. Despite its leading size, NCI-ALMANAC comprises an extremely small part of all conceivable combinations. Given their accuracy and reliability estimation, the developed models should drastically reduce the number of required in vitro tests by predicting in silico which of the considered combinations are likely to be synergistic.
- 31Gao, K.; Nguyen, D. D.; Sresht, V.; Mathiowetz, A. M.; Tu, M.; Wei, G.-W. Are 2D fingerprints still valuable for drug discovery?. Phys. Chem. Chem. Phys. 2020, 22, 8373– 8390, DOI: 10.1039/D0CP00305K31https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BB3cXlsVSlurw%253D&md5=dac8cf3cb16a45012daeed4df362e135Are 2D fingerprints still valuable for drug discovery?Gao, Kaifu; Nguyen, Duc Duy; Sresht, Vishnu; Mathiowetz, Alan M.; Tu, Meihua; Wei, Guo-WeiPhysical Chemistry Chemical Physics (2020), 22 (16), 8373-8390CODEN: PPCPFQ; ISSN:1463-9076. (Royal Society of Chemistry)Recently, mol. fingerprints extd. from three-dimensional (3D) structures using advanced mathematics, such as algebraic topol., differential geometry, and graph theory have been paired with efficient machine learning, esp. deep learning algorithms to outperform other methods in drug discovery applications and competitions. This raises the question of whether classical 2D fingerprints are still valuable in computer-aided drug discovery. This work considers 23 datasets assocd. with four typical problems, namely protein-ligand binding, toxicity, soly. and partition coeff. to assess the performance of eight 2D fingerprints. Advanced machine learning algorithms including random forest, gradient boosted decision tree, single-task deep neural network and multitask deep neural network are employed to construct efficient 2D-fingerprint based models. Addnl., appropriate consensus models are built to further enhance the performance of 2D-fingerprint-based methods. It is demonstrated that 2D-fingerprint-based models perform as well as the state-of-the-art 3D structure-based models for the predictions of toxicity, soly., partition coeff. and protein-ligand binding affinity based on only ligand information. However, 3D structure-based models outperform 2D fingerprint-based methods in complex-based protein-ligand binding affinity predictions.
Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jpclett.0c01579.
3CL protease sequence identity and 3D structure similarity analysis, machine learning details, the MathDL model, and the list of nonpolar 3CL protease binding site residues (PDF)
Tables of experimental binding affinities for 314 SARS-CoV-2 3CL protease inhibitors, the predicted binding affinities of 1553 FDA-approved drugs, and 7012 investigational or off-market drugs (XLSX)