FAME 2: Simple and Effective Machine Learning Model of Cytochrome P450 Regioselectivity
Abstract

We report on the further development of FAst MEtabolizer (FAME; J. Chem. Inf. Model. 2013, 53, 2896–2907), a collection of random forest models for the prediction of sites of metabolism (SoMs) of xenobiotics. A broad set of descriptors was explored, from simple 2D descriptors such as those used in FAME, to quantum chemical descriptors employed in some of the most accurate models for SoM prediction currently available. In line with the original FAME approach, our objective was to keep things simple and to come up with accurate and robust models that are based on a small number of 2D descriptors. We found that circular descriptions of atoms and their environments with such descriptors in combination with an extremely randomized trees algorithm can yield models that perform as well as more complex approaches. Thorough evaluation experiments on an independent test set showed that the best of these models obtained a Matthews correlation coefficient, area under the receiver operating characteristic curve, and Top-2 accuracy of 0.57, 0.91, and 94.1%, respectively. Models for the prediction of isoform-specific regioselectivity of CYP 3A4, 2D6, and 2C9 were also developed and showed competitive performance. The best models have been integrated into a newly developed software package (FAME 2), which is available free of charge from the authors.
Introduction
Results
Figure 1. Overview of the data preparation, model building, and model evaluation workflow.
CYP isoformᵃ | totalᵇ | training setᶜ | test setᵈ | uncorrelated test setᵉ |
---|---|---|---|---|
global | 678 | 542 | 136 | 71 |
3A4 | 473 | 378 | 95 | 52 |
2D6 | 269 | 215 | 54 | 32 |
2C9 | 225 | 180 | 45 | 32 |
ᵃ The code of the isoform for which a specific model was built. The joint global model, which includes all isoforms in the Zaretzki data set, is marked as “global”.
ᵇ Total number of compounds that had SoMs annotated for the particular isoform in the data set.
ᶜ Compounds randomly selected for training and hyperparameter optimization.
ᵈ Compounds randomly selected for testing.
ᵉ Compounds that remained in the uncorrelated test set after the similarity filter was applied.
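The training and test set sizes in the table are consistent with a random 80/20 split in which the training size is rounded down. The ratio is inferred from the counts and is not stated in this excerpt, so the following is only a minimal sketch under that assumption. The function name `split_train_test`, the seed, and the placeholder compound IDs are hypothetical.

```python
import random


def split_train_test(compound_ids, train_fraction=0.8, seed=0):
    """Randomly split compound IDs into a training list and a test list."""
    ids = list(compound_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    n_train = int(train_fraction * len(ids))  # training size rounded down
    return ids[:n_train], ids[n_train:]


# Totals are taken from the table; integer IDs stand in for real compounds.
totals = {"global": 678, "3A4": 473, "2D6": 269, "2C9": 225}
sizes = {}
for isoform, total in totals.items():
    train_ids, test_ids = split_train_test(range(total))
    sizes[isoform] = (len(train_ids), len(test_ids))
```

With these totals, `sizes` reproduces the training and test set columns of the table.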
Model Evaluation Measures
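As an illustration of two of the measures named in the abstract, the sketch below computes the Matthews correlation coefficient from atom-level binary labels and a Top-k accuracy over molecules. It assumes the common convention that a molecule counts as correctly predicted when at least one annotated SoM is among its k highest-scored atoms; this excerpt does not define the measure, so that convention, the tie handling, and all function names are assumptions.

```python
import math


def matthews_corrcoef(y_true, y_pred):
    """MCC for binary labels; returns 0.0 when the denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def top_k_hit(scores, is_som, k=2):
    """True if at least one annotated SoM is among the k highest-scored atoms."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return any(is_som[i] for i in ranked[:k])


def top_k_accuracy(molecules, k=2):
    """Fraction of molecules with a Top-k hit; each item is (scores, labels)."""
    return sum(top_k_hit(s, l, k) for s, l in molecules) / len(molecules)


# Toy data: atom-level labels and predictions (1 = SoM, 0 = not a SoM).
mcc = matthews_corrcoef([1, 0, 0, 1, 0, 0], [1, 0, 0, 0, 1, 0])

# Two toy molecules: per-atom scores and per-atom SoM annotations.
toy_molecules = [
    ([0.9, 0.1, 0.4], [0, 0, 1]),
    ([0.2, 0.8, 0.7, 0.1], [1, 0, 0, 0]),
]
top2 = top_k_accuracy(toy_molecules, k=2)
```

Note that MCC is computed over atoms while Top-k accuracy is computed over molecules, so the two measures are not directly comparable.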

Model Performance as a Function of Descriptor Types and Model Parameters
class | groupᵃ | description |
---|---|---|
2Dᵇ | CDK | This group comprises 15 basic 2D descriptors implemented in CDK. These are the original 15 descriptors considered in FAME, and they encode various properties of individual atoms in the molecule (Table S1). |
2D | circCDK | This contains circular descriptors derived from the 15 CDK descriptors. In other words, it is equal to the CDK set enriched by the circular versions of the CDK descriptors of bond depth 1–6. |
2D | CDK + ATF | This is a combination of the 15 basic CDK descriptors and the circular atom type fingerprints. The fingerprints represent occurrence counts of various atom types within a certain distance from the considered atom. |
2D | circCDK + ATF | This group includes atom type circular fingerprints in addition to all descriptors from the circCDK set. |
QC-enhancedᶜ | CDK + QC | This is the simplest quantum chemistry (QC) enhanced set. It represents the complete set of noncircular atomic descriptors used in the current study (Table S1 and Table 3). |
QC-enhanced | CDK + circQC | This combination comprises the 15 basic CDK descriptors that are combined with circular versions of the 10 quantum chemical descriptors. In other words, this set contains the CDK + QC set and the circular versions of the QC descriptors for bond depths of 1–6. |
QC-enhanced | circCDK + ATF + circQC | This set combines all descriptors investigated in the current study. |
ᵃ Codes representing the descriptor groups explored in the current study.
ᵇ Simple 2D descriptor sets that only use topological information and their circular counterparts.
ᶜ Descriptor sets enhanced by quantum chemical descriptors.
name | description |
---|---|
piS(r) | atom self-polarizability (47, 48) |
De(r) | electrophilic delocalizabilities of an atom (47, 49) |
Dn(r) | nucleophilic delocalizabilities of an atom (47, 49) |
s-Pop | population of the s-orbital (i.e., formal electron density of the s-orbital) |
p-Pop | population of the p-orbitals (i.e., formal electron density of the p-orbitals) |
NumOfElecs | number of valence electrons localized on the atom (i.e., electron density) |
NetCharge | net charge (i.e., formal number of valence electrons − NumOfElecs) |
valence | sum of bonds for an atom (i.e., molecular orbital valency) (50) |
mull_charge | partial charge of an atom determined by Mulliken (51, 52) |
mull_pop | electron population of an atom determined by Mulliken (51, 52) |
Figure 2. Example showing how circular atom type fingerprints and descriptors of up to bond depth 3 were calculated for a single atom. In this example, neighboring atoms up to three bonds away from the sp3 hybridized oxygen were encoded. Two examples of data matrix parts corresponding to neighbors encoded in bond depths 0 and 3 are shown. In both the fingerprints and descriptors, atoms were grouped by their atom type (highlighted in blue in the column names) in each bond depth (highlighted in orange). In the case of the fingerprints, the occurrences of each atom type were noted, while for real-valued descriptors, the average value of the basic descriptor among the grouped atoms was calculated and assigned to the corresponding atom type. Therefore, for the sigmaElectronegativity descriptor in this example, the third layer would contain 8.35 (8.35/1 = 8.35) for the sp3 hybridized nitrogen type and 8.31 ((8.31 + 8.31)/2 = 8.31) for the aromatic carbon type. Consequently, what can be derived from this matrix fragment is that there are two aromatic carbons (which are topologically identical) within three bonds of the oxygen atom that have a sigmaElectronegativity equal to 8.31 each.
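The grouping described in the caption can be sketched in a few lines: atoms are collected by topological distance from the atom of interest, grouped by atom type within each bond depth, and then either counted (fingerprint) or averaged (real-valued descriptor). The toy molecular graph, the atom type labels, and all descriptor values other than 8.31 and 8.35 are hypothetical, and the code is not the FAME 2 implementation.

```python
from collections import defaultdict, deque


def bond_distances(adjacency, start, max_depth):
    """Breadth-first search: topological distance of each atom from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        if dist[atom] == max_depth:
            continue  # do not expand beyond the maximum bond depth
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def circular_features(adjacency, atom_types, values, center, max_depth):
    """Per (bond depth, atom type): occurrence count and mean descriptor value."""
    groups = defaultdict(list)
    for atom, depth in bond_distances(adjacency, center, max_depth).items():
        groups[(depth, atom_types[atom])].append(values[atom])
    counts = {key: len(vals) for key, vals in groups.items()}
    means = {key: sum(vals) / len(vals) for key, vals in groups.items()}
    return counts, means


# Hypothetical toy graph: atom 0 is the sp3 oxygen of interest.
adjacency = {0: [1], 1: [0, 2, 5], 2: [1, 3, 4], 3: [2], 4: [2], 5: [1, 6], 6: [5]}
atom_types = {0: "O.3", 1: "C.3", 2: "C.ar", 3: "C.ar", 4: "C.ar", 5: "C.3", 6: "N.3"}
# sigmaElectronegativity-like values; only 8.31 and 8.35 come from the caption.
values = {0: 9.0, 1: 8.0, 2: 8.2, 3: 8.31, 4: 8.31, 5: 7.9, 6: 8.35}

counts, means = circular_features(adjacency, atom_types, values, center=0, max_depth=3)
```

Here `counts[(3, "C.ar")]` is 2 and `means[(3, "C.ar")]` is 8.31, mirroring the worked numbers in the caption. A full data matrix would also need a fixed column order and a fill value for atom types that do not occur at a given depth.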
Optimized Model Parameters
model parameter | explored values |
---|---|
decision_threshold | 0.2, 0.4, 0.5, 0.6 |
class_weight | balanced, balanced_subsample, none |
max_features | sqrt, 0.3, 0.6, 0.9 |
max_features_ANOVA | 100, 150, 200, 250, 300, 350, 400 |
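For orientation, the explored values in the table can be enumerated as a search grid. The sketch below assumes that the values were combined exhaustively, which the table itself does not state, and it maps the table entry "none" to Python's `None`. The names `param_grid` and `iter_settings` are hypothetical.

```python
from itertools import product

# Values copied from the table; "none" is represented as Python None.
param_grid = {
    "decision_threshold": [0.2, 0.4, 0.5, 0.6],
    "class_weight": ["balanced", "balanced_subsample", None],
    "max_features": ["sqrt", 0.3, 0.6, 0.9],
    "max_features_ANOVA": [100, 150, 200, 250, 300, 350, 400],
}


def iter_settings(grid):
    """Yield one dict per combination of parameter values (full Cartesian grid)."""
    names = list(grid)
    for combo in product(*(grid[name] for name in names)):
        yield dict(zip(names, combo))


settings = list(iter_settings(param_grid))
n_settings = len(settings)  # 4 * 3 * 4 * 7 combinations
```

Under the exhaustive-grid assumption this gives 336 settings per descriptor set, each of which would be scored by cross-validation on the training set.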
Decision Threshold and Class Weights
Maximum Number of Considered Features per Split
ANOVA F-Test
Influence of Descriptors on Model Performance During Training
Figure 3. Visualization of the relationship between maximum bond depth of circular descriptors and model performance on the (a) training set (measured as the mean MCC across the 10 folds in cross-validation of the optimized model), (b) independent test set, (c) uncorrelated test set, and (d) full Zaretzki data set (measured as the mean MCC across the 10 folds in cross-validation of the optimized model). Models that do not use circular descriptors are shown to have a bond depth equal to 0.
Model Performance on an Independent Test Set

Model performance is indicated by a color gradient, ranging from dark red (worst performance among all models) via white to dark green (best performance among all models).
Model code according to the descriptor set (Table 2) and maximum bond depth (indicated by a number in brackets) used.
Value of the given performance metric calculated using predictions of 136 compounds in the independent test set.
Figure 4. Consensus matrix combining results from parametric and nonparametric posthoc tests. The pairs of models for which the null hypothesis (the equality of means for t test or similar performance ranking for Nemenyi test) was rejected at an α level of 0.05 by both the parametric and nonparametric method are highlighted in teal.
Figure 5. Point plot which represents the expected performance of each model by indicating 95% confidence intervals for mean MCC. The confidence intervals were obtained from a statistical analysis of predictions of 100 random subsamples (50 molecules each, sampled with replacement) of the original test set.
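The resampling scheme in the caption (100 subsamples of 50 molecules each, drawn with replacement) can be sketched as follows. The caption does not say how the interval was derived from the subsample results, so the normal approximation for the mean used here is an assumption. The `statistic` callback, the function name, and the per-molecule placeholder scores are also hypothetical; in the study the statistic would be the MCC computed from the predictions for the sampled molecules.

```python
import math
import random
import statistics


def subsample_mean_ci(items, statistic, n_subsamples=100, subsample_size=50,
                      seed=0, z=1.96):
    """Mean of a statistic over random subsamples, with an approximate 95% CI."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_subsamples):
        # Draw molecules with replacement, as described in the caption.
        sample = [rng.choice(items) for _ in range(subsample_size)]
        results.append(statistic(sample))
    mean = statistics.mean(results)
    # Assumption: normal approximation for the standard error of the mean.
    half_width = z * statistics.stdev(results) / math.sqrt(n_subsamples)
    return mean, mean - half_width, mean + half_width


# Placeholder: one score per molecule for a 136-compound test set.
scores = [i / 135 for i in range(136)]
mean_value, lower, upper = subsample_mean_ci(scores, statistics.mean)
```

Intervals obtained this way describe the expected mean performance on 50-molecule samples of the test set, which is why non-overlapping intervals between two models are informative in the point plot.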
Model Performance in a Structurally Uncorrelated Test Set
Figure 6. Histogram of maximum Tanimoto similarities (a) between compounds in the training set and the independent test set (136 molecules) and (b) between compounds within the uncorrelated test set (71 molecules).
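The quantity plotted in these histograms can be sketched as below: for each compound, the highest Tanimoto similarity to any compound in a reference set, or to any other compound in the same set. Fingerprints are represented here as sets of on-bit indices, which is an assumption, as are the function names. The fingerprint type and the similarity cutoff of the filter are not given in this excerpt and are therefore not modeled.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def max_similarity(query_fps, reference_fps):
    """For each query fingerprint, the highest similarity to any reference."""
    return [max(tanimoto(q, r) for r in reference_fps) for q in query_fps]


def max_similarity_within(fps):
    """For each fingerprint, the highest similarity to any other in the set."""
    result = []
    for i, fp in enumerate(fps):
        others = [tanimoto(fp, other) for j, other in enumerate(fps) if j != i]
        result.append(max(others) if others else 0.0)
    return result


# Toy fingerprints (sets of on-bit indices), purely illustrative.
train_fps = [{1, 2, 3, 4}, {5, 6}]
test_fps = [{1, 2, 3}, {7, 8}]
to_training = max_similarity(test_fps, train_fps)           # as in panel (a)
within_set = max_similarity_within([{1, 2}, {2, 3}, {9}])   # as in panel (b)
```

A histogram of `to_training` corresponds to panel (a) and a histogram of `within_set` to panel (b).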
Comparison with Other Methods
Performance Comparison with the Finkelmann Models
Performance Comparison with Xenosite
Isoform-Specific Models


Model performance is indicated by a color gradient, ranging from dark red (worst performance among all models) via white to dark green (best performance among all models).
Model code according to the descriptor set (Table 2) and maximum bond depth (indicated by a number in brackets) used.
Isoform code.
Average value of the given performance metric across ten cross-validation folds on the training set.
Value of the given performance metric as calculated using predictions on the independent test set.
Value of the given performance metric as calculated using predictions on the uncorrelated test set.
Computation Time
Software Package Description
Figure 7. Example of prediction visualization by FAME 2.
Conclusions
Methods
Data Set
Descriptors
Descriptor Selection
Value Imputation


Model Building
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00250.
Additional figures and tables with calculated CDK descriptors, hyperparameter optimization results, model validation results, and performance of the random forest algorithm (PDF)
Acknowledgment
Access to computing and storage facilities owned by parties and projects contributing to the National Grid Infrastructure MetaCentrum provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures” (CESNET LM2015042), is greatly appreciated. OpenEye Scientific Software is acknowledged for providing licenses for OMEGA via the Free Academic Licensing Program.
AM1 | Austin Model 1 |
ATF | atom type fingerprints |
AUC | area under the ROC curve |
CYP | cytochrome P450 enzyme |
DFT | density functional theory |
MCC | Matthews correlation coefficient |
NDDO | Neglect of Differential Diatomic Overlap |
QC | quantum chemistry |
SOM | site of metabolism |
References
This article references 67 other publications.
- 1 Kirchmair, J.; Howlett, A.; Peironcely, J. E.; Murrell, D. S.; Williamson, M. J.; Adams, S. E.; Hankemeier, T.; van Buren, L.; Duchateau, G.; Klaffke, W.; Glen, R. C. How Do Metabolites Differ from Their Parent Molecules and How Are They Excreted? J. Chem. Inf. Model. 2013, 53, 354–367. DOI: 10.1021/ci300487z
- 2 Testa, B.; Pedretti, A.; Vistoli, G. Reactions and Enzymes in the Metabolism of Drugs and Other Xenobiotics. Drug Discovery Today 2012, 17, 549–560. DOI: 10.1016/j.drudis.2012.01.017
- 3 Kirchmair, J.; Göller, A. H.; Lang, D.; Kunze, J.; Testa, B.; Wilson, I. D.; Glen, R. C.; Schneider, G. Predicting Drug Metabolism: Experiment and/or Computation? Nat. Rev. Drug Discovery 2015, 14, 387–404. DOI: 10.1038/nrd4581
- 4 Campagna-Slater, V.; Pottel, J.; Therrien, E.; Cantin, L.-D.; Moitessier, N. Development of a Computational Tool to Rival Experts in the Prediction of Sites of Metabolism of Xenobiotics by P450s. J. Chem. Inf. Model. 2012, 52, 2471–2483. DOI: 10.1021/ci3003073
- 5 Crivori, P.; Poggesi, I. Computational Approaches for Predicting CYP-Related Metabolism Properties in the Screening of New Drugs. Eur. J. Med. Chem. 2006, 41, 795–808. DOI: 10.1016/j.ejmech.2006.03.003
- 6 Tarcsay, Á.; Keserű, G. M. In Silico Site of Metabolism Prediction of Cytochrome P450-Mediated Biotransformations. Expert Opin. Drug Metab. Toxicol. 2011, 7, 299–312. DOI: 10.1517/17425255.2011.553599
- 7 Kirchmair, J.; Williamson, M. J.; Tyzack, J. D.; Tan, L.; Bond, P. J.; Bender, A.; Glen, R. C. Computational Prediction of Metabolism: Sites, Products, SAR, P450 Enzyme Dynamics, and Mechanisms. J. Chem. Inf. Model. 2012, 52, 617–648. DOI: 10.1021/ci200542m
- 8 Raunio, H.; Kuusisto, M.; Juvonen, R. O.; Pentikäinen, O. T. Modeling of Interactions between Xenobiotics and Cytochrome P450 (CYP) Enzymes. Front. Pharmacol. 2015, 6, 123. DOI: 10.3389/fphar.2015.00123
- 9 Bezhentsev, V. M.; Tarasova, O. A.; Dmitriev, A. V.; Rudik, A. V.; Lagunin, A. A.; Filimonov, D. A.; Poroikov, V. V. Computer-Aided Prediction of Xenobiotic Metabolism in the Human Body. Russ. Chem. Rev. 2016, 85, 854. DOI: 10.1070/RCR4614
- 10 Rydberg, P. Reactivity-Based Approaches and Machine Learning Methods for Predicting the Sites of Cytochrome P450-Mediated Metabolism. In Drug Metabolism Prediction; Kirchmair, J., Ed.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, 2014; pp 265–292.
- 11 Rydberg, P.; Olsen, L. Predicting Drug Metabolism by Cytochrome P450 2C9: Comparison with the 2D6 and 3A4 Isoforms. ChemMedChem 2012, 7, 1202–1209. DOI: 10.1002/cmdc.201200160
- 12 Darvas, F. Predicting Metabolic Pathways by Logic Programming. J. Mol. Graphics 1988, 6, 80–86. DOI: 10.1016/0263-7855(88)85004-5
- 13 Klopman, G.; Dimayuga, M.; Talafous, J. META. 1. A Program for the Evaluation of Metabolic Transformation of Chemicals. J. Chem. Inf. Model. 1994, 34, 1320–1325. DOI: 10.1021/ci00022a014
- 14 Talafous, J.; Sayre, L. M.; Mieyal, J. J.; Klopman, G. META. 2. A Dictionary Model of Mammalian Xenobiotic Metabolism. J. Chem. Inf. Model. 1994, 34, 1326–1333. DOI: 10.1021/ci00022a015
- 15 Greene, N.; Judson, P. N.; Langowski, J. J.; Marchant, C. A. Knowledge-Based Expert Systems for Toxicity and Metabolism Prediction: DEREK, StAR and METEOR. SAR QSAR Environ. Res. 1999, 10, 299–314. DOI: 10.1080/10629369908039182
- 16Hou, B. K.; Wackett, L. P.; Ellis, L. B. M. Microbial Pathway Prediction: A Functional Group Approach J. Chem. Inf. Comput. Sci. 2003, 43, 1051– 1057 DOI: 10.1021/ci034018f[ACS Full Text
], [CAS], Google Scholar
16https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXis1Kgt74%253D&md5=c7a314ccbde95b8ceef6af0276469840Microbial pathway prediction: a functional group approachHou, Bo Kyeng; Wackett, Lawrence P.; Ellis, Lynda B. M.Journal of Chemical Information and Computer Sciences (2003), 43 (3), 1051-1057CODEN: JCISD8; ISSN:0095-2338. (American Chemical Society)We have developed a system to predict microbial catabolism, using the University of Minnesota Biocatalysis/Biodegrdn. Database (UM-BBD, http://umbbd.ahc.umn.edu/) as a knowledge base. The present system, available on the Web (http://umbbd.ahc.umn.edu/predict/), can predict biodegrdn. of most of the major aliph. and arom. org. functional groups contg. C, H, N, O, and halogens. It can duplicate at least one known biodegrdn. pathway for 60% of the compds. in a 84-member validation set; most pathways that did not completely duplicate known metab. could plausibly occur in nature. Users are encouraged, and have begun, to submit addnl. biotransformation rules and comment on existing rules; the system will further develop under the direction of the scientific community. - 17Hatzimanikatis, V.; Li, C.; Ionita, J. A.; Henry, C. S.; Jankowski, M. D.; Broadbelt, L. J. Exploring the Diversity of Complex Metabolic Networks Bioinformatics 2005, 21, 1603– 1609 DOI: 10.1093/bioinformatics/bti213[Crossref], [PubMed], [CAS], Google Scholar17https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2MXjtlGhuro%253D&md5=343be8db9eada86fcba34ae047529decExploring the diversity of complex metabolic networksHatzimanikatis, Vassily; Li, Chunhui; Ionita, Justin A.; Henry, Christopher S.; Jankowski, Matthew D.; Broadbelt, Linda J.Bioinformatics (2005), 21 (8), 1603-1609CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Motivation: Metab., the network of chem. reactions that make life possible, is one of the most complex processes in nature. 
We describe here the development of a computational approach for the identification of every possible biochem. reaction from a given set of enzyme reaction rules that allows the de novo synthesis of metabolic pathways composed of these reactions, and the evaluation of these novel pathways with respect to their thermodn. properties. Results: We applied this framework to the anal. of the arom. amino acid pathways and discovered almost 75,000 novel biochem. routes from chorismate to phenylalanine, more than 350,000 from chorismate to tyrosine, but only 13 from chorismate to tryptophan. Thermodn. anal. of these pathways suggests that the native pathways are thermodynamically more favorable than the alternative possible pathways. The pathways generated involve compds. that exist in biol. databases, as well as compds. that exist in chem. databases and novel compds., suggesting novel biochem. routes for these compds. and the existence of biochem. compds. that remain to be discovered or synthesized through enzyme and pathway engineering.
- 18Ridder, L.; Wagener, M. SyGMa: Combining Expert Knowledge and Empirical Scoring in the Prediction of Metabolites ChemMedChem 2008, 3, 821– 832 DOI: 10.1002/cmdc.200700312[Crossref], [PubMed], [CAS], Google Scholar18https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXms1emtLw%253D&md5=92ed84b2e97af5ad9a50beef151fa5dbSyGMa: combining expert knowledge and empirical scoring in the prediction of metabolitesRidder, Lars; Wagener, MarkusChemMedChem (2008), 3 (5), 821-832CODEN: CHEMGX; ISSN:1860-7179. (Wiley-VCH Verlag GmbH & Co. KGaA)Predictions of potential metabolites based on chem. structure are becoming increasingly important in drug discovery to guide medicinal chem. efforts that address metabolic issues and to support exptl. metabolite screening and identification. Herein we present a novel rule-based method, SyGMa (Systematic Generation of potential Metabolites), to predict the potential metabolites of a given parent structure. A set of reaction rules covering a broad range of phase 1 and phase 2 metab. has been derived from metabolic reactions reported in the Metabolite Database to occur in humans. An empirical probability score is assigned to each rule representing the fraction of correctly predicted metabolites in the training database. This score is used to refine the rules and to rank predicted metabolites. The current rule set of SyGMa covers approx. 70% of biotransformation reactions obsd. in humans. Evaluation of the rule-based predictions demonstrated a significant enrichment of true metabolites in the top of the ranking list: while in total, 68% of all obsd. metabolites in an independent test set were reproduced by SyGMa, a large part, 30% of the obsd. metabolites, were identified among the top three predictions. From a subset of cytochrome P 450 specific metabolites, 84% were reproduced overall, with 66% in the top three predicted phase 1 metabolites. A similarity anal. 
of the reactions present in the database was performed to obtain an overview of the metabolic reactions predicted by SyGMa and to support ongoing efforts to extend the rules. Specific examples demonstrate the use of SyGMa in exptl. metabolite identification and the application of SyGMa to suggest chem. modifications that improve the metabolic stability of compds.
- 19Gao, J.; Ellis, L. B. M.; Wackett, L. P. The University of Minnesota Pathway Prediction System: Multi-Level Prediction and Visualization Nucleic Acids Res. 2011, 39, W406– W411 DOI: 10.1093/nar/gkr200[Crossref], [PubMed], [CAS], Google Scholar19https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXosVOmsLs%253D&md5=4b27083f37d70ab33ba63c1e27a2959bThe University of Minnesota Pathway Prediction System: multi-level prediction and visualizationGao, Junfeng; Ellis, Lynda B. M.; Wackett, Lawrence P.Nucleic Acids Research (2011), 39 (Web Server), W406-W411CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The University of Minnesota Pathway Prediction System (UM-PPS, http://umbbd.msi.umn.edu/predict/) is a rule-based system that predicts microbial catabolism of org. compds. Currently, its knowledge base contains 250 biotransformation rules and five types of metabolic logic entities. The original UM-PPS predicted up to two prediction levels at a time. Users had to choose a predicted product to continue the prediction. This approach provided a limited view of prediction results and heavily relied on manual intervention. The new UM-PPS produces a multi-level prediction within an acceptable time frame, and allows users to view prediction alternatives much more easily as a directed acyclic graph.
- 20Mu, F.; Unkefer, C. J.; Unkefer, P. J.; Hlavacek, W. S. Prediction of Metabolic Reactions Based on Atomic and Molecular Properties of Small-Molecule Compounds Bioinformatics 2011, 27, 1537– 1545 DOI: 10.1093/bioinformatics/btr177[Crossref], [PubMed], [CAS], Google Scholar20https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXmvVajsLs%253D&md5=97709401a9db367e98b2278a9d601816Prediction of metabolic reactions based on atomic and molecular properties of small-molecule compoundsMu, Fangping; Unkefer, Clifford J.; Unkefer, Pat J.; Hlavacek, William S.Bioinformatics (2011), 27 (11), 1537-1545CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Our knowledge of the metabolites in cells and their reactions is far from complete as revealed by metabolomic measurements that detect many more small mols. than are documented in metabolic databases. Here, we develop an approach for predicting the reactivity of small-mol. metabolites in enzyme-catalyzed reactions that combines expert knowledge, computational chem. and machine learning. We classified 4843 reactions documented in the KEGG database, from all six Enzyme Commission classes (EC 1-6), into 80 reaction classes, each of which is marked by a characteristic functional group transformation. Reaction centers and surrounding local structures in substrates and products of these reactions were represented using SMARTS. We found that each of the SMARTS-defined chem. substructures is widely distributed among metabolites, but only a fraction of the functional groups in these substructures are reactive. Using at. properties of atoms in a putative reaction center and mol. properties as features, we trained support vector machine (SVM) classifiers to discriminate between functional groups that are reactive and non-reactive. Classifier accuracy was assessed by cross-validation anal. A typical sensitivity [TP/(TP+FN)] or specificity [TN/(TN+FP)] is ≈0.8. 
Our results suggest that metabolic reactivity of small-mol. compds. can be predicted with reasonable accuracy based on the presence of a potentially reactive functional group and the chem. features of its local environment. The classifiers presented here can be used to predict reactions via a web site (http://cellsignaling.lanl.gov/Reactivity/). Contact: [email protected].
- 21Yousofshahi, M.; Manteiga, S.; Wu, C.; Lee, K.; Hassoun, S. PROXIMAL: A Method for Prediction of Xenobiotic Metabolism BMC Syst. Biol. 2015, 9, 94 DOI: 10.1186/s12918-015-0241-4[Crossref], [PubMed], [CAS], Google Scholar21https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXjtVyltbo%253D&md5=131eaa3ba02457ab89a73d1481d38ea1PROXIMAL: a method for Prediction of Xenobiotic MetabolismYousofshahi, Mona; Manteiga, Sara; Wu, Charmian; Lee, Kyongbum; Hassoun, SohaBMC Systems Biology (2015), 9 (), 94/1-94/17CODEN: BSBMCC; ISSN:1752-0509. (BioMed Central Ltd.)Background: Contamination of the environment with bioactive chems. has emerged as a potential public health risk. These substances that may cause distress or disease in humans can be found in air, water and food supplies. An open question is whether these chems. transform into potentially more active or toxic derivs. via xenobiotic metabolizing enzymes expressed in the body. We present a new prediction tool, which we call PROXIMAL (Prediction of Xenobiotic Metab.) for identifying possible transformation products of xenobiotic chems. in the liver. Using reaction data from DrugBank and KEGG, PROXIMAL builds look-up tables that catalog the sites and types of structural modifications performed by Phase I and Phase II enzymes. Given a compd. of interest, PROXIMAL searches for substructures that match the sites cataloged in the look-up tables, applies the corresponding modifications to generate a panel of possible transformation products, and ranks the products based on the activity and abundance of the enzymes involved. Results: PROXIMAL generates transformations that are specific for the chem. of interest by analyzing the chem.'s substructures. We evaluate the accuracy of PROXIMAL's predictions through case studies on two environmental chems. with suspected endocrine disrupting activity, bisphenol A (BPA) and 4-chlorobiphenyl (PCB3). 
Comparisons with published reports confirm 5 out of 7 and 17 out of 26 of the predicted derivs. for BPA and PCB3, resp. We also compare biotransformation predictions generated by PROXIMAL with those generated by METEOR and Metaprint2D-react, two other prediction tools. Conclusions: PROXIMAL can predict transformations of chems. that contain substructures recognizable by human liver enzymes. It also has the ability to rank the predicted metabolites based on the activity and abundance of enzymes involved in xenobiotic transformation.
- 22Sun, H.; Scott, D. O. Structure-Based Drug Metabolism Predictions for Drug Design Chem. Biol. Drug Des. 2010, 75, 3– 17 DOI: 10.1111/j.1747-0285.2009.00899.x[Crossref], [PubMed], [CAS], Google Scholar22https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXhs1WhsbfO&md5=52666dc1d960035e9def127cb1963860Structure-based drug metabolism predictions for drug designSun, Hao; Scott, Dennis O.Chemical Biology & Drug Design (2010), 75 (1), 3-17CODEN: CBDDAL; ISSN:1747-0277. (Wiley-Blackwell)A review. Significant progress has been made in structure-based drug design by pharmaceutical companies at different stages of drug discovery such as identifying new hits, enhancing mol. binding affinity in hit-to-lead, and reducing toxicities in lead optimization. Drug metab. is a major consideration for modifying drug clearance and also a primary source for drug metabolite-induced toxicity. With major cytochrome P 450 structures identified and characterized recently, structure-based drug metab. prediction becomes increasingly attractive. In silico methods based on mol. and quantum mechanics such as docking, mol. dynamics and ab initio chem. reactivity calcns. bring us closer to understand drug metab. and predict drug-drug interactions. In this study, we review important progress in drug metab. and common in silico techniques adopted to predict drug regioselectivity, stereoselectivity, reactive metabolites, induction, inhibition and mechanism-based inactivation, as well as their implementation in hit-to-lead drug discovery.
- 23Kingsley, L. J.; Wilson, G. L.; Essex, M. E.; Lill, M. A. Combining Structure- and Ligand-Based Approaches to Improve Site of Metabolism Prediction in CYP2C9 Substrates Pharm. Res. 2015, 32, 986– 1001 DOI: 10.1007/s11095-014-1511-3
- 24Sheridan, R. P.; Korzekwa, K. R.; Torres, R. A.; Walker, M. J. Empirical Regioselectivity Models for Human Cytochromes P450 3A4, 2D6, and 2C9 J. Med. Chem. 2007, 50, 3173– 3184 DOI: 10.1021/jm0613471[ACS Full Text
], [CAS], Google Scholar
24https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2sXmsFGnsr0%253D&md5=645bd5ddeb0459636bf6067e0194dd34Empirical Regioselectivity Models for Human Cytochromes P450 3A4, 2D6, and 2C9Sheridan, Robert P.; Korzekwa, Kenneth R.; Torres, Rhonda A.; Walker, Matthew J.Journal of Medicinal Chemistry (2007), 50 (14), 3173-3184CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Cytochromes P 450 3A4, 2D6, and 2C9 metabolize a large fraction of drugs. Knowing where these enzymes will preferentially oxidize a mol., the regioselectivity, allows medicinal chemists to plan how best to block its metab. The authors present QSAR-based regioselectivity models for these enzymes calibrated against compiled literature data of drugs and drug-like compds. These models are purely empirical and use only the structures of the substrates, in contrast to those models that simulate a specific mechanism like hydrogen radical abstraction, and/or use explicit models of active sites. The authors most predictive models use three substructure descriptors and two phys. property descriptors. Descriptor importance from the random forest QSAR method show that other factors than the immediate chem. environment and the accessibility of the hydrogen affect regioselectivity in all three isoforms. The cross-validated predictions of the models are compared to predictions from the authors earlier mechanistic model (Singh et al. J. Med. Chem. 2003, 46, 1330-1336) and predictions from MetaSite (Cruciani et al. J. Med. Chem. 2005, 48, 6970-6979). - 25Rydberg, P.; Gloriam, D. E.; Zaretzki, J.; Breneman, C.; Olsen, L. SMARTCyp: A 2D Method for Prediction of Cytochrome P450-Mediated Drug Metabolism ACS Med. Chem. Lett. 2010, 1, 96– 100 DOI: 10.1021/ml100016x[ACS Full Text
], [CAS], Google Scholar
25https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXjtFOqsL0%253D&md5=4761de933c2c3cef5da2ffb29ac076d0SMARTCyp: A 2D Method for Prediction of Cytochrome P450-Mediated Drug MetabolismRydberg, Patrik; Gloriam, David E.; Zaretzki, Jed; Breneman, Curt; Olsen, LarsACS Medicinal Chemistry Letters (2010), 1 (3), 96-100CODEN: AMCLCT; ISSN:1948-5875. (American Chemical Society)SMARTCyp is an in silico method that predicts the sites of cytochrome P 450-mediated metab. of druglike mols. The method is foremost a reactivity model, and as such, it shows a preference for predicting sites that are metabolized by the cytochrome P 450 3A4 isoform. SMARTCyp predicts the site of metab. directly from the 2D structure of a mol., without requiring calcn. of electronic properties or generation of 3D structures. This is a major advantage, because it makes SMARTCyp very fast. Other advantages are that exptl. data are not a prerequisite to create the model, and it can easily be integrated with other methods to create models for other cytochrome P 450 isoforms. Benchmarking tests on a database of 394 3A4 substrates show that SMARTCyp successfully identifies at least one metabolic site in the top two ranked positions 76% of the time. SMARTCyp is available for download at http://www.farma.ku.dk/P 450. - 26Rydberg, P.; Gloriam, D. E.; Olsen, L. The SMARTCyp Cytochrome P450 Metabolism Prediction Server Bioinformatics 2010, 26, 2988– 2989 DOI: 10.1093/bioinformatics/btq584[Crossref], [PubMed], [CAS], Google Scholar26https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhsVKjtL3I&md5=a8689591d976ca80d0ae6a3e15b32636The SMARTCyp cytochrome P450 metabolism prediction serverRydberg, Patrik; Gloriam, David E.; Olsen, LarsBioinformatics (2010), 26 (23), 2988-2989CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)The SMARTCyp server is the first web application for site of metab. 
prediction of cytochrome P 450-mediated drug metab.
- 27Rydberg, P.; Rostkowski, M.; Gloriam, D. E.; Olsen, L. The Contribution of Atom Accessibility to Site of Metabolism Models for Cytochromes P450 Mol. Pharmaceutics 2013, 10, 1216– 1223 DOI: 10.1021/mp3005116[ACS Full Text
], [CAS], Google Scholar
27https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhtFyjs7c%253D&md5=206ed0792c118423008bca0e6a793096The Contribution of Atom Accessibility to Site of Metabolism Models for Cytochromes P450Rydberg, Patrik; Rostkowski, Michal; Gloriam, David E.; Olsen, LarsMolecular Pharmaceutics (2013), 10 (4), 1216-1223CODEN: MPOHBP; ISSN:1543-8384. (American Chemical Society)Three different types of atom accessibility descriptors are investigated in relation to site of metab. predictions. To enable the integration of local accessibility we have constructed 2DSASA, a method for the calcn. of the at. solvent accessible surface area that is independent of 3D coordinates. The method was implemented in the SMARTCyp site of metab. prediction models and improved the results by up to 4 percentage points for nine cytochrome P 450 isoforms. The final models are made available at http://www.farma.ku.dk/smartcyp. - 28Zaretzki, J.; Bergeron, C.; Rydberg, P.; Huang, T.-W.; Bennett, K. P.; Breneman, C. M. RS-Predictor: A New Tool for Predicting Sites of Cytochrome P450-Mediated Metabolism Applied to CYP 3A4 J. Chem. Inf. Model. 2011, 51, 1667– 1689 DOI: 10.1021/ci2000488[ACS Full Text
], [CAS], Google Scholar
28https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXnsVertLs%253D&md5=3d863c866d45fc1c8e90d033a16ed6a1RS-Predictor: A New Tool for Predicting Sites of Cytochrome P450-Mediated Metabolism Applied to CYP 3A4Zaretzki, Jed; Bergeron, Charles; Rydberg, Patrik; Huang, Tao-wei; Bennett, Kristin P.; Breneman, Curt M.Journal of Chemical Information and Modeling (2011), 51 (7), 1667-1689CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)This article describes RegioSelectivity-Predictor (RS-Predictor), a new in silico method for generating predictive models of P 450-mediated metab. for drug-like compds. Within this method, potential sites of metab. (SOMs) are represented as "metabolophores": A concept that describes the hierarchical combination of topol. and quantum chem. descriptors needed to represent the reactivity of potential metabolic reaction sites. RS-Predictor modeling involves the use of metabolophore descriptors together with multiple-instance ranking (MIRank) to generate an optimized descriptor wt. vector that encodes regioselectivity trends across all cases in a training set. The resulting pathway-independent (O-dealkylation vs. N-oxidn. vs. Csp3 hydroxylation, etc.), isoenzyme-specific regioselectivity model may be used to predict potential metabolic liabilities. In the present work, cross-validated RS-Predictor models were generated for a set of 394 substrates of CYP 3A4 as a proof-of-principle for the method. Rank aggregation was then employed to merge independently generated predictions for each substrate into a single consensus prediction. The resulting consensus RS-Predictor models were shown to reliably identify at least one obsd. site of metab. in the top two rank-positions on 78% of the substrates. Comparisons between RS-Predictor and previously described regioselectivity prediction methods reveal new insights into how in silico metabolite prediction methods should be compared. 
- 29Zaretzki, J.; Rydberg, P.; Bergeron, C.; Bennett, K. P.; Olsen, L.; Breneman, C. M. RS-Predictor Models Augmented with SMARTCyp Reactivities: Robust Metabolic Regioselectivity Predictions for Nine CYP Isozymes J. Chem. Inf. Model. 2012, 52, 1637– 1659 DOI: 10.1021/ci300009z[ACS Full Text
], [CAS], Google Scholar
29https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XlvFCrsLk%253D&md5=99d070a003355b0caecdad0945077de7RS-Predictor Models Augmented with SMARTCyp Reactivities: Robust Metabolic Regioselectivity Predictions for Nine CYP IsozymesZaretzki, Jed; Rydberg, Patrik; Bergeron, Charles; Bennett, Kristin P.; Olsen, Lars; Breneman, Curt M.Journal of Chemical Information and Modeling (2012), 52 (6), 1637-1659CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)RS-Predictor is a tool for creating pathway-independent, isoenzyme-specific, site of metab. (SOM) prediction models using any set of known cytochrome P 450 (CYP) substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study, we report on the versatility the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isoenzymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1, and 3A4, the largest publicly accessible collection of P 450 ligands and metabolites released to date. A comprehensive investigation into the importance of different descriptor classes for identifying the regioselectivity mediated by each isoenzyme is made through the generation of multiple independent RS-Predictor models for each set of isoenzyme substrates. Two of these models include a d. functional theory (DFT) reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the com. 
regioselectivity prediction methods distributed by Optibrium and Schroedinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2 (83.0%), 2A6 (85.7%), 2B6 (82.1%), 2C19 (86.2%), 2C8 (83.8%), 2C9 (84.5%), 2D6 (85.9%), 2E1 (82.8%), 3A4 (82.3%), and merged (86.0%). Comprehensive datamining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into mol. features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs. - 30Zaretzki, J.; Bergeron, C.; Huang, T.-W.; Rydberg, P.; Swamidass, S. J.; Breneman, C. M. RS-WebPredictor: A Server for Predicting CYP-Mediated Sites of Metabolism on Drug-like Molecules Bioinformatics 2013, 29, 497– 498 DOI: 10.1093/bioinformatics/bts705[Crossref], [PubMed], [CAS], Google Scholar30https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXis1Ojtrw%253D&md5=bb01bdc036c5d338984efd663bd5ed78RS-WebPredictor: a server for predicting CYP-mediated sites of metabolism on drug-like moleculesZaretzki, Jed; Bergeron, Charles; Huang, Tao-wei; Rydberg, Patrik; Swamidass, S. Joshua; Breneman, Curt M.Bioinformatics (2013), 29 (4), 497-498CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Summary: Regioselectivity-WebPredictor (RS-WebPredictor) is a server that predicts isoenzyme-specific cytochrome P 450 (CYP)-mediated sites of metab. (SOMs) on drug-like mols. Predictions may be made for the promiscuous 2C9, 2D6 and 3A4 CYP isoenzymes, as well as CYPs 1A2, 2A6, 2B6, 2C8, 2C19 and 2E1. RS-WebPredictor is the first freely accessible server that predicts the regioselectivity of the last six isoenzymes. Server execution time is fast, taking on av. 2s to encode a submitted mol. and 1s to apply a given model, allowing for high-throughput use in lead optimization projects.
- 31Zaretzki, J.; Matlock, M.; Swamidass, S. J. XenoSite: Accurately Predicting CYP-Mediated Sites of Metabolism with Neural Networks J. Chem. Inf. Model. 2013, 53, 3373– 3383 DOI: 10.1021/ci400518g[ACS Full Text
], [CAS], Google Scholar
31https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhslKks77E&md5=1de6321b1500c9571db548006767e708XenoSite: Accurately Predicting CYP-Mediated Sites of Metabolism with Neural NetworksZaretzki, Jed; Matlock, Matthew; Swamidass, S. JoshuaJournal of Chemical Information and Modeling (2013), 53 (12), 3373-3383CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Understanding how xenobiotic mols. are metabolized is important because it influences the safety, efficacy, and dose of medicines and how they can be modified to improve these properties. The cytochrome P450s (CYPs) are proteins responsible for metabolizing 90% of drugs on the market, and many computational methods can predict which at. sites of a mol.-sites of metab. (SOMs)-are modified during CYP-mediated metab. This study improves on prior methods of predicting CYP-mediated SOMs by using new descriptors and machine learning based on neural networks. The new method, XenoSite, is faster to train and more accurate by as much as 4% or 5% for some isoenzymes. Furthermore, some "incorrect" predictions made by XenoSite were subsequently validated as correct predictions by revaluation of the source literature. Moreover, XenoSite output is interpretable as a probability, which reflects both the confidence of the model that a particular atom is metabolized and the statistical likelihood that its prediction for that atom is correct. - 32Matlock, M. K.; Hughes, T. B.; Swamidass, S. J. 
XenoSite Server: A Web-Available Site of Metabolism Prediction Tool Bioinformatics 2015, 31, 1136– 1137 DOI: 10.1093/bioinformatics/btu761[Crossref], [PubMed], [CAS], Google Scholar32https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28Xht1Gntb7E&md5=8d7928ef60f9ef495f7cf776aadc4d0eXenoSite server: a web-available site of metabolism prediction toolMatlock, Matthew K.; Hughes, Tyler B.; Joshua, Swamidass S.Bioinformatics (2015), 31 (7), 1136-1137CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Summary: Cytochrome P 450 enzymes (P450s) are metabolic enzymes that process the majority of FDA-approved, small-mol. drugs. Understanding how these enzymes modify mol. structure is key to the development of safe, effective drugs. XenoSite server is an online implementation of the XenoSite, a recently published computational model for P 450 metab. XenoSite predicts which at. sites of a mol.-sites of metab. (SOMs)-are modified by P450s. XenoSite server accepts input in common chem. file formats including SDF and SMILES and provides tools for visualizing the likelihood that each at. site is a site of metab. for a variety of important P450s, as well as a flat file download of SOM predictions.
- 33Kirchmair, J.; Williamson, M. J.; Afzal, A. M.; Tyzack, J. D.; Choy, A. P. K.; Howlett, A.; Rydberg, P.; Glen, R. C. FAst MEtabolizer (FAME): A Rapid and Accurate Predictor of Sites of Metabolism in Multiple Species by Endogenous Enzymes J. Chem. Inf. Model. 2013, 53, 2896– 2907 DOI: 10.1021/ci400503s[ACS Full Text
- 34. Tyzack, J. D.; Hunt, P. A.; Segall, M. D. Predicting Regioselectivity and Lability of Cytochrome P450 Metabolism Using Quantum Mechanical Simulations. J. Chem. Inf. Model. 2016, 56, 2180–2193. DOI: 10.1021/acs.jcim.6b00233
- 35. He, S.-B.; Li, M.-M.; Zhang, B.-X.; Ye, X.-T.; Du, R.-F.; Wang, Y.; Qiao, Y.-J. Construction of Metabolism Prediction Models for CYP450 3A4, 2D6, and 2C9 Based on Microsomal Metabolic Reaction System. Int. J. Mol. Sci. 2016, 17, E1686. DOI: 10.3390/ijms17101686
- 36. Finkelmann, A. R.; Göller, A. H.; Schneider, G. Site of Metabolism Prediction Based on Ab Initio Derived Atom Representations. ChemMedChem 2017, 12, 606–612. DOI: 10.1002/cmdc.201700097
- 37. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely Randomized Trees. Mach. Learn. 2006, 63, 3–42. DOI: 10.1007/s10994-006-6226-1
- 38. Tyzack, J. D.; Williamson, M. J.; Torella, R.; Glen, R. C. Prediction of Cytochrome P450 Xenobiotic Metabolism: Tethered Docking and Reactivity Derived from Ligand Molecular Orbital Analysis. J. Chem. Inf. Model. 2013, 53, 1294–1305. DOI: 10.1021/ci400058s
- 39. Huang, T.-W.; Zaretzki, J.; Bergeron, C.; Bennett, K. P.; Breneman, C. M. DR-Predictor: Incorporating Flexible Docking with Specialized Electronic Reactivity and Machine Learning Techniques to Predict CYP-Mediated Sites of Metabolism. J. Chem. Inf. Model. 2013, 53, 3352–3366. DOI: 10.1021/ci4004688
- 40. Zaretzki, J. M.; Browning, M. R.; Hughes, T. B.; Swamidass, S. J. Extending P450 Site-of-Metabolism Models with Region-Resolution Data. Bioinformatics 2015, 31, 1966–1973. DOI: 10.1093/bioinformatics/btv100
- 41. Powers, D. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. J. Mach. Learn. Technol. 2011, 2, 37–63.
- 42. Adams, S. E. Molecular Similarity and Xenobiotic Metabolism; University of Cambridge, 2010.
- 43. Boyer, S.; Arnby, C. H.; Carlsson, L.; Smith, J.; Stein, V.; Glen, R. C. Reaction Site Mapping of Xenobiotic Biotransformations. J. Chem. Inf. Model. 2007, 47, 583–590. DOI: 10.1021/ci600376q
- 44. Tyzack, J. D.; Mussa, H. Y.; Williamson, M. J.; Kirchmair, J.; Glen, R. C. Cytochrome P450 Site of Metabolism Prediction from 2D Topological Fingerprints Using GPU Accelerated Probabilistic Classifiers. J. Cheminf. 2014, 6, 29. DOI: 10.1186/1758-2946-6-29
- 45. Stewart, J. J. MOPAC: A Semiempirical Molecular Orbital Program. J. Comput.-Aided Mol. Des. 1990, 4, 1–105. DOI: 10.1007/BF00128336
- 46. MOPAC2016. http://openmopac.net/home.html (accessed Apr 7, 2017).
- 47. Schüürmann, G. Quantitative Structure-Property Relationships for the Polarizability, Solvatochromic Parameters and Lipophilicity. Quant. Struct.-Act. Relat. 1990, 9, 326–333. DOI: 10.1002/qsar.19900090406
- 48. Coulson, C. A.; Longuet-Higgins, H. C. The Electronic Structure of Conjugated Systems. II. Unsaturated Hydrocarbons and Their Hetero-Derivatives. Proc. R. Soc. London, Ser. A 1947, 192, 16–32. DOI: 10.1098/rspa.1947.0136
- 49. Fukui, K.; Kato, H.; Yonezawa, T. A New Quantum-Mechanical Reactivity Index for Saturated Compounds. Bull. Chem. Soc. Jpn. 1961, 34, 1111–1115. DOI: 10.1246/bcsj.34.1111
- 50. Gopinathan, M. S.; Siddarth, P.; Ravimohan, C. Valency and Molecular Structure. Theor. Chim. Acta 1986, 70, 303–322. DOI: 10.1007/BF00534237
- 51. Mulliken, R. S. Electronic Population Analysis on LCAO-MO Molecular Wave Functions. I. J. Chem. Phys. 1955, 23, 1833–1840. DOI: 10.1063/1.1740588
- 52. Mulliken, R. S. Criteria for the Construction of Good Self-Consistent-Field Molecular Orbital Wave Functions, and the Significance of LCAO-MO Population Analysis. J. Chem. Phys. 1962, 36, 3428–3439. DOI: 10.1063/1.1732476
- 53. Holm, S. A Simple Sequentially Rejective Multiple Test Procedure. Scand. Stat. Theory Appl. 1979, 6, 65–70.
- 54. Friedman, M. A Comparison of Alternative Tests of Significance for the Problem of m Rankings. Ann. Math. Stat. 1940, 11, 86–92. DOI: 10.1214/aoms/1177731944
- 55. Friedman, M. A Correction: The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1939, 34, 109–109. DOI: 10.2307/2279169
- 56. Friedman, M. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1937, 32, 675–701. DOI: 10.1080/01621459.1937.10503522
- 57. Shapiro, S. S.; Wilk, M. B. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 1965, 52, 591–611. DOI: 10.1093/biomet/52.3-4.591
- 58. Mauchly, J. W. Significance Test for Sphericity of a Normal N-Variate Distribution. Ann. Math. Stat. 1940, 11, 204–209. DOI: 10.1214/aoms/1177731915
- 59. Greenhouse, S. W.; Geisser, S. On Methods in the Analysis of Profile Data. Psychometrika 1959, 24, 95–112. DOI: 10.1007/BF02289823
- 60. Huynh, H.; Feldt, L. S. Estimation of the Box Correction for Degrees of Freedom from Sample Data in Randomized Block and Split-Plot Designs. J. Educ. Behav. Stat. 1976, 1, 69–82. DOI: 10.3102/10769986001001069
- 61. de Bruyn Kops, C.; Friedrich, N.-O.; Kirchmair, J. Alignment-Based Prediction of Sites of Metabolism. J. Chem. Inf. Model. 2017, 57 (6), 1258–1264.
- 62. OMEGA, version 2.5.1.4; OpenEye Scientific Software: Santa Fe, NM, 2011; https://www.eyesopen.com (accessed Apr 7, 2017).
- 63. Hawkins, P. C. D.; Skillman, A. G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer Generation with OMEGA: Algorithm and Validation Using High Quality Structures from the Protein Databank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584. DOI: 10.1021/ci100031x
- 64. RDKit 2016.03.4. https://github.com/rdkit/rdkit/releases/tag/Release_2016_03_4 (accessed Apr 7, 2017).
- 65. Steinbeck, C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 2003, 43, 493–500. DOI: 10.1021/ci025584y
- 66. Chemistry Development Kit 1.4.19. https://github.com/cdk/cdk/releases/tag/cdk-1.4.19 (accessed Apr 7, 2017).
- 67. scikit-learn 0.18. http://scikit-learn.org/0.18/documentation.html (accessed Apr 7, 2017).
Figure 2
Figure 2. Example showing how circular atom type fingerprints and descriptors of up to bond depth 3 were calculated for a single atom. In this example, neighboring atoms up to three bonds away from the sp3 hybridized oxygen were encoded. Two examples of data matrix parts corresponding to neighbors encoded in bond depths 0 and 3 are shown. In both the fingerprints and descriptors, atoms were grouped by their atom type (highlighted in blue in the column names) in each bond depth (highlighted in orange). In the case of the fingerprints, the occurrences of each atom type were noted, while for real-valued descriptors, the average value of the basic descriptor among the grouped atoms was calculated and assigned to the corresponding atom type. Therefore, for the sigmaElectronegativity descriptor in this example, the third layer would contain 8.35 (8.35/1 = 8.35) for the sp3 hybridized nitrogen type and 8.31 ((8.31 + 8.31)/2 = 8.31) for the aromatic carbon type. Consequently, what can be derived from this matrix fragment is that there are two aromatic carbons (which are topologically identical) within three bonds of the oxygen atom that have a sigmaElectronegativity equal to 8.31 each.
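The grouping described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the FAME 2 implementation: the toy graph, the SYBYL-style atom type labels (`O.3`, `C.3`, `N.3`, `C.ar`), the function names, and every descriptor value other than 8.35 and 8.31 are assumptions made up for the example.

```python
from collections import defaultdict, deque


def bond_depths(adjacency, center):
    """Breadth-first search: number of bonds from `center` to each reachable atom."""
    depth = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in depth:
                depth[neighbor] = depth[atom] + 1
                queue.append(neighbor)
    return depth


def circular_features(adjacency, atom_types, values, center, max_depth):
    """Return {(atom_type, depth): (count, mean_value)} for atoms within `max_depth` bonds.

    The count is the fingerprint entry; the mean is the circular descriptor entry.
    """
    grouped = defaultdict(list)
    for atom, d in bond_depths(adjacency, center).items():
        if d <= max_depth:
            grouped[(atom_types[atom], d)].append(values[atom])
    return {key: (len(vals), sum(vals) / len(vals)) for key, vals in grouped.items()}


# Hypothetical toy graph: atom 0 is the sp3 oxygen used as the center.
adjacency = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1, 5, 6], 5: [4], 6: [4]}
atom_types = {0: "O.3", 1: "C.3", 2: "C.3", 3: "N.3", 4: "C.ar", 5: "C.ar", 6: "C.ar"}
# Only 8.35 and 8.31 come from the caption; the other values are placeholders.
sigma_en = {0: 10.0, 1: 8.0, 2: 8.0, 3: 8.35, 4: 8.2, 5: 8.31, 6: 8.31}

features = circular_features(adjacency, atom_types, sigma_en, center=0, max_depth=3)
print(features[("N.3", 3)])   # one sp3 nitrogen three bonds away
print(features[("C.ar", 3)])  # two aromatic carbons three bonds away
```

Keying the result by `(atom_type, depth)` mirrors the column layout in the figure, where each column combines an atom type with a bond depth.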
Figure 3
Figure 3. Visualization of the relationship between maximum bond depth of circular descriptors and model performance on the (a) training set (measured as the mean MCC across the 10 folds in cross-validation of the optimized model), (b) independent test set, (c) uncorrelated test set, and (d) full Zaretzki data set (measured as the mean MCC across the 10 folds in cross-validation of the optimized model). Models that do not use circular descriptors are shown to have a bond depth equal to 0.
Figure 4
Figure 4. Consensus matrix combining results from parametric and nonparametric post hoc tests. Pairs of models for which the null hypothesis (equality of means for the t test, or similar performance ranking for the Nemenyi test) was rejected at an α level of 0.05 by both the parametric and the nonparametric method are highlighted in teal.
Figure 5
Figure 5. Point plot which represents the expected performance of each model by indicating 95% confidence intervals for mean MCC. The confidence intervals were obtained from a statistical analysis of predictions of 100 random subsamples (50 molecules each, sampled with replacement) of the original test set.
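The subsampling procedure in this caption can be illustrated as follows. The caption does not say how the interval was derived from the 100 subsample scores, so the normal-approximation interval (mean ± 1.96 standard errors) is an assumption, as are pooling atom-level predictions within each subsample, the function names, and the toy data layout.

```python
import math
import random
import statistics


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (returns 0.0 if undefined)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def subsampled_mcc_interval(molecules, n_subsamples=100, sample_size=50, seed=0):
    """Mean MCC and an approximate 95% interval over random subsamples.

    `molecules` is a list of (atom_labels, atom_predictions) pairs, one per molecule.
    Molecules are drawn with replacement; atoms of the drawn molecules are pooled.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_subsamples):
        sample = rng.choices(molecules, k=sample_size)
        y_true = [t for labels, _ in sample for t in labels]
        y_pred = [p for _, preds in sample for p in preds]
        scores.append(mcc(y_true, y_pred))
    mean = statistics.mean(scores)
    # Assumed interval: normal approximation around the mean of the subsample scores.
    half_width = 1.96 * statistics.stdev(scores) / math.sqrt(n_subsamples)
    return mean, mean - half_width, mean + half_width


# Hypothetical toy data: each tuple is (true SoM labels, predicted labels) per atom.
toy_molecules = [([1, 0, 0], [1, 0, 0]), ([0, 1, 0], [0, 0, 1]), ([0, 1], [0, 1])]
mean_mcc, lower, upper = subsampled_mcc_interval(toy_molecules)
print(f"mean MCC = {mean_mcc:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```

Fixing the seed keeps the subsamples reproducible, which matters when intervals for several models are compared on the same plot.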
Figure 6
Figure 6. Histogram of maximum Tanimoto similarities (a) between compounds in the training set and the independent test set (136 molecules) and (b) between compounds within the uncorrelated test set (71 molecules).
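The quantity plotted in this histogram can be computed as sketched below. Fingerprints are represented as Python sets of on-bit indices; the fingerprint type used by the authors is not stated in this caption, so the toy bit sets and the function names are purely illustrative.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def max_similarity_to_reference(query_fps, reference_fps):
    """Panel (a): for each query compound, its highest similarity to any reference compound."""
    return [max(tanimoto(q, r) for r in reference_fps) for q in query_fps]


def max_similarity_within(fps):
    """Panel (b): for each compound, its highest similarity to any other compound in the set."""
    return [
        max(tanimoto(fp, other) for j, other in enumerate(fps) if j != i)
        for i, fp in enumerate(fps)
    ]


# Hypothetical fingerprints (sets of on-bits), not taken from the paper.
training_fps = [{1, 2, 3, 4}, {5, 6}]
test_fps = [{1, 2, 3}, {7, 8}]

to_training = max_similarity_to_reference(test_fps, training_fps)
within_test = max_similarity_within(test_fps)
print(to_training, within_test)
```

The resulting lists are what would be binned to draw each histogram; `max_similarity_within` needs at least two compounds.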
Figure 7
Figure 7. Example of prediction visualization by FAME 2.
References
This article references 67 other publications.
- 1. Kirchmair, J.; Howlett, A.; Peironcely, J. E.; Murrell, D. S.; Williamson, M. J.; Adams, S. E.; Hankemeier, T.; van Buren, L.; Duchateau, G.; Klaffke, W.; Glen, R. C. How Do Metabolites Differ from Their Parent Molecules and How Are They Excreted? J. Chem. Inf. Model. 2013, 53, 354–367. DOI: 10.1021/ci300487z
- 2. Testa, B.; Pedretti, A.; Vistoli, G. Reactions and Enzymes in the Metabolism of Drugs and Other Xenobiotics. Drug Discovery Today 2012, 17, 549–560. DOI: 10.1016/j.drudis.2012.01.017
- 3. Kirchmair, J.; Göller, A. H.; Lang, D.; Kunze, J.; Testa, B.; Wilson, I. D.; Glen, R. C.; Schneider, G. Predicting Drug Metabolism: Experiment and/or Computation? Nat. Rev. Drug Discovery 2015, 14, 387–404. DOI: 10.1038/nrd4581
- 4. Campagna-Slater, V.; Pottel, J.; Therrien, E.; Cantin, L.-D.; Moitessier, N. Development of a Computational Tool to Rival Experts in the Prediction of Sites of Metabolism of Xenobiotics by P450s. J. Chem. Inf. Model. 2012, 52, 2471–2483. DOI: 10.1021/ci3003073
- 5. Crivori, P.; Poggesi, I. Computational Approaches for Predicting CYP-Related Metabolism Properties in the Screening of New Drugs. Eur. J. Med. Chem. 2006, 41, 795–808. DOI: 10.1016/j.ejmech.2006.03.003
- 6. Tarcsay, Á.; Keserű, G. M. In Silico Site of Metabolism Prediction of Cytochrome P450-Mediated Biotransformations. Expert Opin. Drug Metab. Toxicol. 2011, 7, 299–312. DOI: 10.1517/17425255.2011.553599
- 7. Kirchmair, J.; Williamson, M. J.; Tyzack, J. D.; Tan, L.; Bond, P. J.; Bender, A.; Glen, R. C. Computational Prediction of Metabolism: Sites, Products, SAR, P450 Enzyme Dynamics, and Mechanisms. J. Chem. Inf. Model. 2012, 52, 617–648. DOI: 10.1021/ci200542m
- 8. Raunio, H.; Kuusisto, M.; Juvonen, R. O.; Pentikäinen, O. T. Modeling of Interactions between Xenobiotics and Cytochrome P450 (CYP) Enzymes. Front. Pharmacol. 2015, 6, 123. DOI: 10.3389/fphar.2015.00123
- 9. Bezhentsev, V. M.; Tarasova, O. A.; Dmitriev, A. V.; Rudik, A. V.; Lagunin, A. A.; Filimonov, D. A.; Poroikov, V. V. Computer-Aided Prediction of Xenobiotic Metabolism in the Human Body. Russ. Chem. Rev. 2016, 85, 854. DOI: 10.1070/RCR4614
- 10. Rydberg, P. Reactivity-Based Approaches and Machine Learning Methods for Predicting the Sites of Cytochrome P450-Mediated Metabolism. In Drug Metabolism Prediction; Kirchmair, J., Ed.; Wiley-VCH Verlag GmbH & Co. KGaA: Weinheim, 2014; pp 265–292.
- 11. Rydberg, P.; Olsen, L. Predicting Drug Metabolism by Cytochrome P450 2C9: Comparison with the 2D6 and 3A4 Isoforms. ChemMedChem 2012, 7, 1202–1209. DOI: 10.1002/cmdc.201200160
- 12. Darvas, F. Predicting Metabolic Pathways by Logic Programming. J. Mol. Graphics 1988, 6, 80–86. DOI: 10.1016/0263-7855(88)85004-5
- 13Klopman, G.; Dimayuga, M.; Talafous, J. META. 1. A Program for the Evaluation of Metabolic Transformation of Chemicals J. Chem. Inf. Model. 1994, 34, 1320– 1325 DOI: 10.1021/ci00022a014[ACS Full Text
], [CAS], Google Scholar
13https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK2cXmsVyitrw%253D&md5=9f145c8b540affcaf4a67b6306b991aaMETA. 1. A Program for the Evaluation of Metabolic Transformation of ChemicalsKlopman, Gilles; Dimayuga, Mario; Talafous, JosephJournal of Chemical Information and Computer Sciences (1994), 34 (6), 1320-5CODEN: JCISD8; ISSN:0095-2338.A new metab. program, META, is introduced. In this paper, the basic principles on which the program operates are described. META is an expert system, capable of predicting the sites of potential enzymic attack and the nature of the chems. formed by such metabolic transformations. It operates from dictionaries of transformation operators, created by experts to represent known metabolic paths. - 14Talafous, J.; Sayre, L. M.; Mieyal, J. J.; Klopman, G. META. 2. A Dictionary Model of Mammalian Xenobiotic Metabolism J. Chem. Inf. Model. 1994, 34, 1326– 1333 DOI: 10.1021/ci00022a015
- 15Greene, N.; Judson, P. N.; Langowski, J. J.; Marchant, C. A. Knowledge-Based Expert Systems for Toxicity and Metabolism Prediction: DEREK, StAR and METEOR SAR QSAR Environ. Res. 1999, 10, 299– 314 DOI: 10.1080/10629369908039182[Crossref], [PubMed], [CAS], Google Scholar15https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADyaK1MXlvFCgsbc%253D&md5=5988de2a40351b9416bc21a9688a3a6dKnowledge-based expert systems for toxicity and metabolism prediction: DEREK, StAR and METEORGreene, N.; Judson, P. N.; Langowski, J. J.; Marchant, C. A.SAR and QSAR in Environmental Research (1999), 10 (2-3), 299-314, 2 platesCODEN: SQERED; ISSN:1062-936X. (Gordon & Breach Science Publishers)It has long been recognized that the ability to predict the metabolic fate of a chem. substance and the potential toxicity of either the parent compd. or its metabolites are important in novel drug design. The popularity of using computer models as an aid in this area has grown considerably in recent years. LHASA Limited has been developing knowledge-based expert systems for toxicity and metab. prediction in collaboration with industry and regulatory authorities. These systems, DEREK, StAR, and METEOR, use rules to describe the relationship between chem. structure and either toxicity in the case of DEREK and StAR or metabolic fate in the case of METEOR. The rule refinement process for DEREK often involves assessing the predictions for a novel set of compds. and comparing them to their biol. assay results as a measure of the system's performance. For example, 266 non-congeneric chems. from the National Toxicol. Program database have been processed through the DEREK mutagenicity knowledge base and the predictions compared to their Salmonella typhimurium mutagenicity data. Initially, 81 of 114 mutagens (71%) and 117 of 152 non-mutagens (77%) were correctly identified. Following further knowledge base development, the no. 
of correctly identified mutagens has increased to 96 (84%). Further work on improving the predictive capabilities of DEREK, StAR, and METEOR is in progress.
- 16Hou, B. K.; Wackett, L. P.; Ellis, L. B. M. Microbial Pathway Prediction: A Functional Group Approach J. Chem. Inf. Comput. Sci. 2003, 43, 1051– 1057 DOI: 10.1021/ci034018f[ACS Full Text
], [CAS], Google Scholar
16https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD3sXis1Kgt74%253D&md5=c7a314ccbde95b8ceef6af0276469840Microbial pathway prediction: a functional group approachHou, Bo Kyeng; Wackett, Lawrence P.; Ellis, Lynda B. M.Journal of Chemical Information and Computer Sciences (2003), 43 (3), 1051-1057CODEN: JCISD8; ISSN:0095-2338. (American Chemical Society)We have developed a system to predict microbial catabolism, using the University of Minnesota Biocatalysis/Biodegrdn. Database (UM-BBD, http://umbbd.ahc.umn.edu/) as a knowledge base. The present system, available on the Web (http://umbbd.ahc.umn.edu/predict/), can predict biodegrdn. of most of the major aliph. and arom. org. functional groups contg. C, H, N, O, and halogens. It can duplicate at least one known biodegrdn. pathway for 60% of the compds. in a 84-member validation set; most pathways that did not completely duplicate known metab. could plausibly occur in nature. Users are encouraged, and have begun, to submit addnl. biotransformation rules and comment on existing rules; the system will further develop under the direction of the scientific community. - 17Hatzimanikatis, V.; Li, C.; Ionita, J. A.; Henry, C. S.; Jankowski, M. D.; Broadbelt, L. J. Exploring the Diversity of Complex Metabolic Networks Bioinformatics 2005, 21, 1603– 1609 DOI: 10.1093/bioinformatics/bti213[Crossref], [PubMed], [CAS], Google Scholar17https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2MXjtlGhuro%253D&md5=343be8db9eada86fcba34ae047529decExploring the diversity of complex metabolic networksHatzimanikatis, Vassily; Li, Chunhui; Ionita, Justin A.; Henry, Christopher S.; Jankowski, Matthew D.; Broadbelt, Linda J.Bioinformatics (2005), 21 (8), 1603-1609CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Motivation: Metab., the network of chem. reactions that make life possible, is one of the most complex processes in nature. 
We describe here the development of a computational approach for the identification of every possible biochem. reaction from a given set of enzyme reaction rules that allows the de novo synthesis of metabolic pathways composed of these reactions, and the evaluation of these novel pathways with respect to their thermodn. properties. Results: We applied this framework to the anal. of the arom. amino acid pathways and discovered almost 75,000 novel biochem. routes from chorismate to phenylalanine, more than 350,000 from chorismate to tyrosine, but only 13 from chorismate to tryptophan. Thermodn. anal. of these pathways suggests that the native pathways are thermodynamically more favorable than the alternative possible pathways. The pathways generated involve compds. that exist in biol. databases, as well as compds. that exist in chem. databases and novel compds., suggesting novel biochem. routes for these compds. and the existence of biochem. compds. that remain to be discovered or synthesized through enzyme and pathway engineering.
- 18Ridder, L.; Wagener, M. SyGMa: Combining Expert Knowledge and Empirical Scoring in the Prediction of Metabolites ChemMedChem 2008, 3, 821– 832 DOI: 10.1002/cmdc.200700312[Crossref], [PubMed], [CAS], Google Scholar18https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1cXms1emtLw%253D&md5=92ed84b2e97af5ad9a50beef151fa5dbSyGMa: combining expert knowledge and empirical scoring in the prediction of metabolitesRidder, Lars; Wagener, MarkusChemMedChem (2008), 3 (5), 821-832CODEN: CHEMGX; ISSN:1860-7179. (Wiley-VCH Verlag GmbH & Co. KGaA)Predictions of potential metabolites based on chem. structure are becoming increasingly important in drug discovery to guide medicinal chem. efforts that address metabolic issues and to support exptl. metabolite screening and identification. Herein we present a novel rule-based method, SyGMa (Systematic Generation of potential Metabolites), to predict the potential metabolites of a given parent structure. A set of reaction rules covering a broad range of phase 1 and phase 2 metab. has been derived from metabolic reactions reported in the Metabolite Database to occur in humans. An empirical probability score is assigned to each rule representing the fraction of correctly predicted metabolites in the training database. This score is used to refine the rules and to rank predicted metabolites. The current rule set of SyGMa covers approx. 70% of biotransformation reactions obsd. in humans. Evaluation of the rule-based predictions demonstrated a significant enrichment of true metabolites in the top of the ranking list: while in total, 68% of all obsd. metabolites in an independent test set were reproduced by SyGMa, a large part, 30% of the obsd. metabolites, were identified among the top three predictions. From a subset of cytochrome P 450 specific metabolites, 84% were reproduced overall, with 66% in the top three predicted phase 1 metabolites. A similarity anal. 
of the reactions present in the database was performed to obtain an overview of the metabolic reactions predicted by SyGMa and to support ongoing efforts to extend the rules. Specific examples demonstrate the use of SyGMa in exptl. metabolite identification and the application of SyGMa to suggest chem. modifications that improve the metabolic stability of compds.
- 19Gao, J.; Ellis, L. B. M.; Wackett, L. P. The University of Minnesota Pathway Prediction System: Multi-Level Prediction and Visualization Nucleic Acids Res. 2011, 39, W406– W411 DOI: 10.1093/nar/gkr200[Crossref], [PubMed], [CAS], Google Scholar19https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXosVOmsLs%253D&md5=4b27083f37d70ab33ba63c1e27a2959bThe University of Minnesota Pathway Prediction System: multi-level prediction and visualizationGao, Junfeng; Ellis, Lynda B. M.; Wackett, Lawrence P.Nucleic Acids Research (2011), 39 (Web Server), W406-W411CODEN: NARHAD; ISSN:0305-1048. (Oxford University Press)The University of Minnesota Pathway Prediction System (UM-PPS, http://umbbd.msi.umn.edu/predict/) is a rule-based system that predicts microbial catabolism of org. compds. Currently, its knowledge base contains 250 biotransformation rules and five types of metabolic logic entities. The original UM-PPS predicted up to two prediction levels at a time. Users had to choose a predicted product to continue the prediction. This approach provided a limited view of prediction results and heavily relied on manual intervention. The new UM-PPS produces a multi-level prediction within an acceptable time frame, and allows users to view prediction alternatives much more easily as a directed acyclic graph.
- 20Mu, F.; Unkefer, C. J.; Unkefer, P. J.; Hlavacek, W. S. Prediction of Metabolic Reactions Based on Atomic and Molecular Properties of Small-Molecule Compounds Bioinformatics 2011, 27, 1537– 1545 DOI: 10.1093/bioinformatics/btr177[Crossref], [PubMed], [CAS], Google Scholar20https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXmvVajsLs%253D&md5=97709401a9db367e98b2278a9d601816Prediction of metabolic reactions based on atomic and molecular properties of small-molecule compoundsMu, Fangping; Unkefer, Clifford J.; Unkefer, Pat J.; Hlavacek, William S.Bioinformatics (2011), 27 (11), 1537-1545CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Our knowledge of the metabolites in cells and their reactions is far from complete as revealed by metabolomic measurements that detect many more small mols. than are documented in metabolic databases. Here, we develop an approach for predicting the reactivity of small-mol. metabolites in enzyme-catalyzed reactions that combines expert knowledge, computational chem. and machine learning. We classified 4843 reactions documented in the KEGG database, from all six Enzyme Commission classes (EC 1-6), into 80 reaction classes, each of which is marked by a characteristic functional group transformation. Reaction centers and surrounding local structures in substrates and products of these reactions were represented using SMARTS. We found that each of the SMARTS-defined chem. substructures is widely distributed among metabolites, but only a fraction of the functional groups in these substructures are reactive. Using at. properties of atoms in a putative reaction center and mol. properties as features, we trained support vector machine (SVM) classifiers to discriminate between functional groups that are reactive and non-reactive. Classifier accuracy was assessed by cross-validation anal. A typical sensitivity [TP/(TP+FN)] or specificity [TN/(TN+FP)] is ≈0.8. 
Our results suggest that metabolic reactivity of small-mol. compds. can be predicted with reasonable accuracy based on the presence of a potentially reactive functional group and the chem. features of its local environment. The classifiers presented here can be used to predict reactions via a web site (http://cellsignaling.lanl.gov/Reactivity/). Contact: [email protected].
- 21Yousofshahi, M.; Manteiga, S.; Wu, C.; Lee, K.; Hassoun, S. PROXIMAL: A Method for Prediction of Xenobiotic Metabolism BMC Syst. Biol. 2015, 9, 94 DOI: 10.1186/s12918-015-0241-4[Crossref], [PubMed], [CAS], Google Scholar21https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC2sXjtVyltbo%253D&md5=131eaa3ba02457ab89a73d1481d38ea1PROXIMAL: a method for Prediction of Xenobiotic MetabolismYousofshahi, Mona; Manteiga, Sara; Wu, Charmian; Lee, Kyongbum; Hassoun, SohaBMC Systems Biology (2015), 9 (), 94/1-94/17CODEN: BSBMCC; ISSN:1752-0509. (BioMed Central Ltd.)Background: Contamination of the environment with bioactive chems. has emerged as a potential public health risk. These substances that may cause distress or disease in humans can be found in air, water and food supplies. An open question is whether these chems. transform into potentially more active or toxic derivs. via xenobiotic metabolizing enzymes expressed in the body. We present a new prediction tool, which we call PROXIMAL (Prediction of Xenobiotic Metab.) for identifying possible transformation products of xenobiotic chems. in the liver. Using reaction data from DrugBank and KEGG, PROXIMAL builds look-up tables that catalog the sites and types of structural modifications performed by Phase I and Phase II enzymes. Given a compd. of interest, PROXIMAL searches for substructures that match the sites cataloged in the look-up tables, applies the corresponding modifications to generate a panel of possible transformation products, and ranks the products based on the activity and abundance of the enzymes involved. Results: PROXIMAL generates transformations that are specific for the chem. of interest by analyzing the chem.'s substructures. We evaluate the accuracy of PROXIMAL's predictions through case studies on two environmental chems. with suspected endocrine disrupting activity, bisphenol A (BPA) and 4-chlorobiphenyl (PCB3). 
Comparisons with published reports confirm 5 out of 7 and 17 out of 26 of the predicted derivs. for BPA and PCB3, resp. We also compare biotransformation predictions generated by PROXIMAL with those generated by METEOR and Metaprint2D-react, two other prediction tools. Conclusions: PROXIMAL can predict transformations of chems. that contain substructures recognizable by human liver enzymes. It also has the ability to rank the predicted metabolites based on the activity and abundance of enzymes involved in xenobiotic transformation.
- 22Sun, H.; Scott, D. O. Structure-Based Drug Metabolism Predictions for Drug Design Chem. Biol. Drug Des. 2010, 75, 3– 17 DOI: 10.1111/j.1747-0285.2009.00899.x[Crossref], [PubMed], [CAS], Google Scholar22https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD1MXhs1WhsbfO&md5=52666dc1d960035e9def127cb1963860Structure-based drug metabolism predictions for drug designSun, Hao; Scott, Dennis O.Chemical Biology & Drug Design (2010), 75 (1), 3-17CODEN: CBDDAL; ISSN:1747-0277. (Wiley-Blackwell)A review. Significant progress has been made in structure-based drug design by pharmaceutical companies at different stages of drug discovery such as identifying new hits, enhancing mol. binding affinity in hit-to-lead, and reducing toxicities in lead optimization. Drug metab. is a major consideration for modifying drug clearance and also a primary source for drug metabolite-induced toxicity. With major cytochrome P 450 structures identified and characterized recently, structure-based drug metab. prediction becomes increasingly attractive. In silico methods based on mol. and quantum mechanics such as docking, mol. dynamics and ab initio chem. reactivity calcns. bring us closer to understand drug metab. and predict drug-drug interactions. In this study, we review important progress in drug metab. and common in silico techniques adopted to predict drug regioselectivity, stereoselectivity, reactive metabolites, induction, inhibition and mechanism-based inactivation, as well as their implementation in hit-to-lead drug discovery.
- 23Kingsley, L. J.; Wilson, G. L.; Essex, M. E.; Lill, M. A. Combining Structure- and Ligand-Based Approaches to Improve Site of Metabolism Prediction in CYP2C9 Substrates Pharm. Res. 2015, 32, 986– 1001 DOI: 10.1007/s11095-014-1511-3
- 24Sheridan, R. P.; Korzekwa, K. R.; Torres, R. A.; Walker, M. J. Empirical Regioselectivity Models for Human Cytochromes P450 3A4, 2D6, and 2C9 J. Med. Chem. 2007, 50, 3173– 3184 DOI: 10.1021/jm0613471[ACS Full Text
], [CAS], Google Scholar
24https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BD2sXmsFGnsr0%253D&md5=645bd5ddeb0459636bf6067e0194dd34Empirical Regioselectivity Models for Human Cytochromes P450 3A4, 2D6, and 2C9Sheridan, Robert P.; Korzekwa, Kenneth R.; Torres, Rhonda A.; Walker, Matthew J.Journal of Medicinal Chemistry (2007), 50 (14), 3173-3184CODEN: JMCMAR; ISSN:0022-2623. (American Chemical Society)Cytochromes P 450 3A4, 2D6, and 2C9 metabolize a large fraction of drugs. Knowing where these enzymes will preferentially oxidize a mol., the regioselectivity, allows medicinal chemists to plan how best to block its metab. The authors present QSAR-based regioselectivity models for these enzymes calibrated against compiled literature data of drugs and drug-like compds. These models are purely empirical and use only the structures of the substrates, in contrast to those models that simulate a specific mechanism like hydrogen radical abstraction, and/or use explicit models of active sites. The authors most predictive models use three substructure descriptors and two phys. property descriptors. Descriptor importance from the random forest QSAR method show that other factors than the immediate chem. environment and the accessibility of the hydrogen affect regioselectivity in all three isoforms. The cross-validated predictions of the models are compared to predictions from the authors earlier mechanistic model (Singh et al. J. Med. Chem. 2003, 46, 1330-1336) and predictions from MetaSite (Cruciani et al. J. Med. Chem. 2005, 48, 6970-6979). - 25Rydberg, P.; Gloriam, D. E.; Zaretzki, J.; Breneman, C.; Olsen, L. SMARTCyp: A 2D Method for Prediction of Cytochrome P450-Mediated Drug Metabolism ACS Med. Chem. Lett. 2010, 1, 96– 100 DOI: 10.1021/ml100016x[ACS Full Text
], [CAS], Google Scholar
25https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXjtFOqsL0%253D&md5=4761de933c2c3cef5da2ffb29ac076d0SMARTCyp: A 2D Method for Prediction of Cytochrome P450-Mediated Drug MetabolismRydberg, Patrik; Gloriam, David E.; Zaretzki, Jed; Breneman, Curt; Olsen, LarsACS Medicinal Chemistry Letters (2010), 1 (3), 96-100CODEN: AMCLCT; ISSN:1948-5875. (American Chemical Society)SMARTCyp is an in silico method that predicts the sites of cytochrome P 450-mediated metab. of druglike mols. The method is foremost a reactivity model, and as such, it shows a preference for predicting sites that are metabolized by the cytochrome P 450 3A4 isoform. SMARTCyp predicts the site of metab. directly from the 2D structure of a mol., without requiring calcn. of electronic properties or generation of 3D structures. This is a major advantage, because it makes SMARTCyp very fast. Other advantages are that exptl. data are not a prerequisite to create the model, and it can easily be integrated with other methods to create models for other cytochrome P 450 isoforms. Benchmarking tests on a database of 394 3A4 substrates show that SMARTCyp successfully identifies at least one metabolic site in the top two ranked positions 76% of the time. SMARTCyp is available for download at http://www.farma.ku.dk/P 450. - 26Rydberg, P.; Gloriam, D. E.; Olsen, L. The SMARTCyp Cytochrome P450 Metabolism Prediction Server Bioinformatics 2010, 26, 2988– 2989 DOI: 10.1093/bioinformatics/btq584[Crossref], [PubMed], [CAS], Google Scholar26https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3cXhsVKjtL3I&md5=a8689591d976ca80d0ae6a3e15b32636The SMARTCyp cytochrome P450 metabolism prediction serverRydberg, Patrik; Gloriam, David E.; Olsen, LarsBioinformatics (2010), 26 (23), 2988-2989CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)The SMARTCyp server is the first web application for site of metab. 
prediction of cytochrome P 450-mediated drug metab.
- 27Rydberg, P.; Rostkowski, M.; Gloriam, D. E.; Olsen, L. The Contribution of Atom Accessibility to Site of Metabolism Models for Cytochromes P450 Mol. Pharmaceutics 2013, 10, 1216– 1223 DOI: 10.1021/mp3005116[ACS Full Text
], [CAS], Google Scholar
27https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhtFyjs7c%253D&md5=206ed0792c118423008bca0e6a793096The Contribution of Atom Accessibility to Site of Metabolism Models for Cytochromes P450Rydberg, Patrik; Rostkowski, Michal; Gloriam, David E.; Olsen, LarsMolecular Pharmaceutics (2013), 10 (4), 1216-1223CODEN: MPOHBP; ISSN:1543-8384. (American Chemical Society)Three different types of atom accessibility descriptors are investigated in relation to site of metab. predictions. To enable the integration of local accessibility we have constructed 2DSASA, a method for the calcn. of the at. solvent accessible surface area that is independent of 3D coordinates. The method was implemented in the SMARTCyp site of metab. prediction models and improved the results by up to 4 percentage points for nine cytochrome P 450 isoforms. The final models are made available at http://www.farma.ku.dk/smartcyp. - 28Zaretzki, J.; Bergeron, C.; Rydberg, P.; Huang, T.-W.; Bennett, K. P.; Breneman, C. M. RS-Predictor: A New Tool for Predicting Sites of Cytochrome P450-Mediated Metabolism Applied to CYP 3A4 J. Chem. Inf. Model. 2011, 51, 1667– 1689 DOI: 10.1021/ci2000488[ACS Full Text
], [CAS], Google Scholar
28https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3MXnsVertLs%253D&md5=3d863c866d45fc1c8e90d033a16ed6a1RS-Predictor: A New Tool for Predicting Sites of Cytochrome P450-Mediated Metabolism Applied to CYP 3A4Zaretzki, Jed; Bergeron, Charles; Rydberg, Patrik; Huang, Tao-wei; Bennett, Kristin P.; Breneman, Curt M.Journal of Chemical Information and Modeling (2011), 51 (7), 1667-1689CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)This article describes RegioSelectivity-Predictor (RS-Predictor), a new in silico method for generating predictive models of P 450-mediated metab. for drug-like compds. Within this method, potential sites of metab. (SOMs) are represented as "metabolophores": A concept that describes the hierarchical combination of topol. and quantum chem. descriptors needed to represent the reactivity of potential metabolic reaction sites. RS-Predictor modeling involves the use of metabolophore descriptors together with multiple-instance ranking (MIRank) to generate an optimized descriptor wt. vector that encodes regioselectivity trends across all cases in a training set. The resulting pathway-independent (O-dealkylation vs. N-oxidn. vs. Csp3 hydroxylation, etc.), isoenzyme-specific regioselectivity model may be used to predict potential metabolic liabilities. In the present work, cross-validated RS-Predictor models were generated for a set of 394 substrates of CYP 3A4 as a proof-of-principle for the method. Rank aggregation was then employed to merge independently generated predictions for each substrate into a single consensus prediction. The resulting consensus RS-Predictor models were shown to reliably identify at least one obsd. site of metab. in the top two rank-positions on 78% of the substrates. Comparisons between RS-Predictor and previously described regioselectivity prediction methods reveal new insights into how in silico metabolite prediction methods should be compared. 
- 29Zaretzki, J.; Rydberg, P.; Bergeron, C.; Bennett, K. P.; Olsen, L.; Breneman, C. M. RS-Predictor Models Augmented with SMARTCyp Reactivities: Robust Metabolic Regioselectivity Predictions for Nine CYP Isozymes J. Chem. Inf. Model. 2012, 52, 1637– 1659 DOI: 10.1021/ci300009z[ACS Full Text
], [CAS], Google Scholar
29https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC38XlvFCrsLk%253D&md5=99d070a003355b0caecdad0945077de7RS-Predictor Models Augmented with SMARTCyp Reactivities: Robust Metabolic Regioselectivity Predictions for Nine CYP IsozymesZaretzki, Jed; Rydberg, Patrik; Bergeron, Charles; Bennett, Kristin P.; Olsen, Lars; Breneman, Curt M.Journal of Chemical Information and Modeling (2012), 52 (6), 1637-1659CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)RS-Predictor is a tool for creating pathway-independent, isoenzyme-specific, site of metab. (SOM) prediction models using any set of known cytochrome P 450 (CYP) substrates and metabolites. Until now, the RS-Predictor method was only trained and validated on CYP 3A4 data, but in the present study, we report on the versatility the RS-Predictor modeling paradigm by creating and testing regioselectivity models for substrates of the nine most important CYP isoenzymes. Through curation of source literature, we have assembled 680 substrates distributed among CYPs 1A2, 2A6, 2B6, 2C19, 2C8, 2C9, 2D6, 2E1, and 3A4, the largest publicly accessible collection of P 450 ligands and metabolites released to date. A comprehensive investigation into the importance of different descriptor classes for identifying the regioselectivity mediated by each isoenzyme is made through the generation of multiple independent RS-Predictor models for each set of isoenzyme substrates. Two of these models include a d. functional theory (DFT) reactivity descriptor derived from SMARTCyp. Optimal combinations of RS-Predictor and SMARTCyp are shown to have stronger performance than either method alone, while also exceeding the accuracy of the com. 
regioselectivity prediction methods distributed by Optibrium and Schroedinger, correctly identifying a large proportion of the metabolites in each substrate set within the top two rank-positions: 1A2 (83.0%), 2A6 (85.7%), 2B6 (82.1%), 2C19 (86.2%), 2C8 (83.8%), 2C9 (84.5%), 2D6 (85.9%), 2E1 (82.8%), 3A4 (82.3%), and merged (86.0%). Comprehensive datamining of each substrate set and careful statistical analyses of the predictions made by the different models revealed new insights into mol. features that control metabolic regioselectivity and enable accurate prospective prediction of likely SOMs. - 30Zaretzki, J.; Bergeron, C.; Huang, T.-W.; Rydberg, P.; Swamidass, S. J.; Breneman, C. M. RS-WebPredictor: A Server for Predicting CYP-Mediated Sites of Metabolism on Drug-like Molecules Bioinformatics 2013, 29, 497– 498 DOI: 10.1093/bioinformatics/bts705[Crossref], [PubMed], [CAS], Google Scholar30https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXis1Ojtrw%253D&md5=bb01bdc036c5d338984efd663bd5ed78RS-WebPredictor: a server for predicting CYP-mediated sites of metabolism on drug-like moleculesZaretzki, Jed; Bergeron, Charles; Huang, Tao-wei; Rydberg, Patrik; Swamidass, S. Joshua; Breneman, Curt M.Bioinformatics (2013), 29 (4), 497-498CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Summary: Regioselectivity-WebPredictor (RS-WebPredictor) is a server that predicts isoenzyme-specific cytochrome P 450 (CYP)-mediated sites of metab. (SOMs) on drug-like mols. Predictions may be made for the promiscuous 2C9, 2D6 and 3A4 CYP isoenzymes, as well as CYPs 1A2, 2A6, 2B6, 2C8, 2C19 and 2E1. RS-WebPredictor is the first freely accessible server that predicts the regioselectivity of the last six isoenzymes. Server execution time is fast, taking on av. 2s to encode a submitted mol. and 1s to apply a given model, allowing for high-throughput use in lead optimization projects.
- 31Zaretzki, J.; Matlock, M.; Swamidass, S. J. XenoSite: Accurately Predicting CYP-Mediated Sites of Metabolism with Neural Networks J. Chem. Inf. Model. 2013, 53, 3373– 3383 DOI: 10.1021/ci400518g[ACS Full Text
], [CAS], Google Scholar
31https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC3sXhslKks77E&md5=1de6321b1500c9571db548006767e708XenoSite: Accurately Predicting CYP-Mediated Sites of Metabolism with Neural NetworksZaretzki, Jed; Matlock, Matthew; Swamidass, S. JoshuaJournal of Chemical Information and Modeling (2013), 53 (12), 3373-3383CODEN: JCISD8; ISSN:1549-9596. (American Chemical Society)Understanding how xenobiotic mols. are metabolized is important because it influences the safety, efficacy, and dose of medicines and how they can be modified to improve these properties. The cytochrome P450s (CYPs) are proteins responsible for metabolizing 90% of drugs on the market, and many computational methods can predict which at. sites of a mol.-sites of metab. (SOMs)-are modified during CYP-mediated metab. This study improves on prior methods of predicting CYP-mediated SOMs by using new descriptors and machine learning based on neural networks. The new method, XenoSite, is faster to train and more accurate by as much as 4% or 5% for some isoenzymes. Furthermore, some "incorrect" predictions made by XenoSite were subsequently validated as correct predictions by revaluation of the source literature. Moreover, XenoSite output is interpretable as a probability, which reflects both the confidence of the model that a particular atom is metabolized and the statistical likelihood that its prediction for that atom is correct. - 32Matlock, M. K.; Hughes, T. B.; Swamidass, S. J. 
XenoSite Server: A Web-Available Site of Metabolism Prediction Tool Bioinformatics 2015, 31, 1136– 1137 DOI: 10.1093/bioinformatics/btu761[Crossref], [PubMed], [CAS], Google Scholar32https://chemport.cas.org/services/resolver?origin=ACS&resolution=options&coi=1%3ACAS%3A528%3ADC%252BC28Xht1Gntb7E&md5=8d7928ef60f9ef495f7cf776aadc4d0eXenoSite server: a web-available site of metabolism prediction toolMatlock, Matthew K.; Hughes, Tyler B.; Joshua, Swamidass S.Bioinformatics (2015), 31 (7), 1136-1137CODEN: BOINFP; ISSN:1367-4803. (Oxford University Press)Summary: Cytochrome P 450 enzymes (P450s) are metabolic enzymes that process the majority of FDA-approved, small-mol. drugs. Understanding how these enzymes modify mol. structure is key to the development of safe, effective drugs. XenoSite server is an online implementation of the XenoSite, a recently published computational model for P 450 metab. XenoSite predicts which at. sites of a mol.-sites of metab. (SOMs)-are modified by P450s. XenoSite server accepts input in common chem. file formats including SDF and SMILES and provides tools for visualizing the likelihood that each at. site is a site of metab. for a variety of important P450s, as well as a flat file download of SOM predictions.
- 33. Kirchmair, J.; Williamson, M. J.; Afzal, A. M.; Tyzack, J. D.; Choy, A. P. K.; Howlett, A.; Rydberg, P.; Glen, R. C. FAst MEtabolizer (FAME): A Rapid and Accurate Predictor of Sites of Metabolism in Multiple Species by Endogenous Enzymes. J. Chem. Inf. Model. 2013, 53, 2896–2907. DOI: 10.1021/ci400503s
- 34. Tyzack, J. D.; Hunt, P. A.; Segall, M. D. Predicting Regioselectivity and Lability of Cytochrome P450 Metabolism Using Quantum Mechanical Simulations. J. Chem. Inf. Model. 2016, 56, 2180–2193. DOI: 10.1021/acs.jcim.6b00233
- 35. He, S.-B.; Li, M.-M.; Zhang, B.-X.; Ye, X.-T.; Du, R.-F.; Wang, Y.; Qiao, Y.-J. Construction of Metabolism Prediction Models for CYP450 3A4, 2D6, and 2C9 Based on Microsomal Metabolic Reaction System. Int. J. Mol. Sci. 2016, 17, E1686. DOI: 10.3390/ijms17101686
- 36. Finkelmann, A. R.; Göller, A. H.; Schneider, G. Site of Metabolism Prediction Based on Ab Initio Derived Atom Representations. ChemMedChem 2017, 12, 606–612. DOI: 10.1002/cmdc.201700097
- 37. Geurts, P.; Ernst, D.; Wehenkel, L. Extremely Randomized Trees. Mach. Learn. 2006, 63, 3–42. DOI: 10.1007/s10994-006-6226-1
- 38. Tyzack, J. D.; Williamson, M. J.; Torella, R.; Glen, R. C. Prediction of Cytochrome P450 Xenobiotic Metabolism: Tethered Docking and Reactivity Derived from Ligand Molecular Orbital Analysis. J. Chem. Inf. Model. 2013, 53, 1294–1305. DOI: 10.1021/ci400058s
- 39. Huang, T.-W.; Zaretzki, J.; Bergeron, C.; Bennett, K. P.; Breneman, C. M. DR-Predictor: Incorporating Flexible Docking with Specialized Electronic Reactivity and Machine Learning Techniques to Predict CYP-Mediated Sites of Metabolism. J. Chem. Inf. Model. 2013, 53, 3352–3366. DOI: 10.1021/ci4004688
- 40. Zaretzki, J. M.; Browning, M. R.; Hughes, T. B.; Swamidass, S. J. Extending P450 Site-of-Metabolism Models with Region-Resolution Data. Bioinformatics 2015, 31, 1966–1973. DOI: 10.1093/bioinformatics/btv100
- 41. Powers, D. Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation. J. Mach. Learn. Technol. 2011, 2, 37–63.
- 42. Adams, S. E. Molecular Similarity and Xenobiotic Metabolism; University of Cambridge, 2010.
- 43. Boyer, S.; Arnby, C. H.; Carlsson, L.; Smith, J.; Stein, V.; Glen, R. C. Reaction Site Mapping of Xenobiotic Biotransformations. J. Chem. Inf. Model. 2007, 47, 583–590. DOI: 10.1021/ci600376q
- 44. Tyzack, J. D.; Mussa, H. Y.; Williamson, M. J.; Kirchmair, J.; Glen, R. C. Cytochrome P450 Site of Metabolism Prediction from 2D Topological Fingerprints Using GPU Accelerated Probabilistic Classifiers. J. Cheminf. 2014, 6, 29. DOI: 10.1186/1758-2946-6-29
- 45. Stewart, J. J. P. MOPAC: A Semiempirical Molecular Orbital Program. J. Comput.-Aided Mol. Des. 1990, 4, 1–105. DOI: 10.1007/BF00128336
- 46. MOPAC2016. http://openmopac.net/home.html (accessed Apr 7, 2017).
- 47. Schüürmann, G. Quantitative Structure-Property Relationships for the Polarizability, Solvatochromic Parameters and Lipophilicity. Quant. Struct.-Act. Relat. 1990, 9, 326–333. DOI: 10.1002/qsar.19900090406
- 48. Coulson, C. A.; Longuet-Higgins, H. C. The Electronic Structure of Conjugated Systems. II. Unsaturated Hydrocarbons and Their Hetero-Derivatives. Proc. R. Soc. London, Ser. A 1947, 192, 16–32. DOI: 10.1098/rspa.1947.0136
- 49. Fukui, K.; Kato, H.; Yonezawa, T. A New Quantum-Mechanical Reactivity Index for Saturated Compounds. Bull. Chem. Soc. Jpn. 1961, 34, 1111–1115. DOI: 10.1246/bcsj.34.1111
- 50. Gopinathan, M. S.; Siddarth, P.; Ravimohan, C. Valency and Molecular Structure. Theor. Chim. Acta 1986, 70, 303–322. DOI: 10.1007/BF00534237
- 51. Mulliken, R. S. Electronic Population Analysis on LCAO-MO Molecular Wave Functions. I. J. Chem. Phys. 1955, 23, 1833–1840. DOI: 10.1063/1.1740588
- 52. Mulliken, R. S. Criteria for the Construction of Good Self-Consistent-Field Molecular Orbital Wave Functions, and the Significance of LCAO-MO Population Analysis. J. Chem. Phys. 1962, 36, 3428–3439. DOI: 10.1063/1.1732476
- 53. Holm, S. A Simple Sequentially Rejective Multiple Test Procedure. Scand. Stat. Theory Appl. 1979, 6, 65–70.
- 54. Friedman, M. A Comparison of Alternative Tests of Significance for the Problem of m Rankings. Ann. Math. Stat. 1940, 11, 86–92. DOI: 10.1214/aoms/1177731944
- 55. Friedman, M. A Correction: The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1939, 34, 109. DOI: 10.2307/2279169
- 56. Friedman, M. The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Am. Stat. Assoc. 1937, 32, 675–701. DOI: 10.1080/01621459.1937.10503522
- 57. Shapiro, S. S.; Wilk, M. B. An Analysis of Variance Test for Normality (Complete Samples). Biometrika 1965, 52, 591–611. DOI: 10.1093/biomet/52.3-4.591
- 58. Mauchly, J. W. Significance Test for Sphericity of a Normal N-Variate Distribution. Ann. Math. Stat. 1940, 11, 204–209. DOI: 10.1214/aoms/1177731915
- 59. Greenhouse, S. W.; Geisser, S. On Methods in the Analysis of Profile Data. Psychometrika 1959, 24, 95–112. DOI: 10.1007/BF02289823
- 60. Huynh, H.; Feldt, L. S. Estimation of the Box Correction for Degrees of Freedom from Sample Data in Randomized Block and Split-Plot Designs. J. Educ. Behav. Stat. 1976, 1, 69–82. DOI: 10.3102/10769986001001069
- 61. de Bruyn Kops, C.; Friedrich, N.-O.; Kirchmair, J. Alignment-Based Prediction of Sites of Metabolism. J. Chem. Inf. Model. 2017, 57 (6), 1258–1264.
- 62. OMEGA, version 2.5.1.4; OpenEye Scientific Software: Santa Fe, NM, 2011; https://www.eyesopen.com (accessed Apr 7, 2017).
- 63. Hawkins, P. C. D.; Skillman, A. G.; Warren, G. L.; Ellingson, B. A.; Stahl, M. T. Conformer Generation with OMEGA: Algorithm and Validation Using High Quality Structures from the Protein Databank and Cambridge Structural Database. J. Chem. Inf. Model. 2010, 50, 572–584. DOI: 10.1021/ci100031x
- 64. RDKit 2016.03.4. https://github.com/rdkit/rdkit/releases/tag/Release_2016_03_4 (accessed Apr 7, 2017).
- 65 Steinbeck, C.; Han, Y.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 2003, 43, 493–500. DOI: 10.1021/ci025584y
- 66 Chemistry Development Kit 1.4.19. https://github.com/cdk/cdk/releases/tag/cdk-1.4.19 (accessed Apr 7, 2017).
- 67 scikit-learn 0.18. http://scikit-learn.org/0.18/documentation.html (accessed Apr 7, 2017).
Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jcim.7b00250.
Additional figures and tables with calculated CDK descriptors, hyperparameter optimization results, model validation results, and performance of the random forest algorithm (PDF)