Predicting Total Drug Clearance and Volumes of Distribution Using the Machine Learning-Mediated Multimodal Method through the Imputation of Various Nonclinical Data
  • Open Access
Computational Chemistry


  • Hiroaki Iwata*
    Graduate School of Medicine, Kyoto University, 53 Shogoin-kawaharacho, Sakyo-ku, Kyoto 606-8507, Japan
    *Email: [email protected]
  • Tatsuru Matsuo
    Fujitsu Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa 211-8588, Japan
  • Hideaki Mamada
    DMPK Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1, Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
  • Takahisa Motomura
    Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1, Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
  • Mayumi Matsushita
    Fujitsu Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa 211-8588, Japan
  • Takeshi Fujiwara
    Graduate School of Medicine, Kyoto University, 53 Shogoin-kawaharacho, Sakyo-ku, Kyoto 606-8507, Japan
  • Kazuya Maeda
    Graduate School of Pharmaceutical Sciences, Department of Molecular Pharmacokinetics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
  • Koichi Handa*
    Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan
    *Email: [email protected]

Journal of Chemical Information and Modeling

Cite this: J. Chem. Inf. Model. 2022, 62, 17, 4057–4065
https://doi.org/10.1021/acs.jcim.2c00318
Published August 22, 2022

Copyright © 2022 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY-NC-ND 4.0.

Abstract


Pharmacokinetic research plays an important role in the development of new drugs. Accurate predictions of human pharmacokinetic parameters are essential for the success of clinical trials. Clearance (CL) and volume of distribution (Vd) are important factors for evaluating pharmacokinetic properties, and many previous studies have attempted to use computational methods to extrapolate these values from nonclinical laboratory animal models to human subjects. However, it is difficult to obtain sufficient, comprehensive experimental data from these animal models, and many studies are missing critical values. As a result, studies that use nonclinical data as explanatory variables can train their models on only a small number of compounds. In this study, we perform missing-value imputation and feature selection on nonclinical data to increase the number of training compounds and nonclinical datasets available for these kinds of studies. We obtained novel models for total body clearance (CLtot) and steady-state Vd (Vdss) (CLtot: geometric mean fold error [GMFE], 1.92; percentage within 2-fold error, 66.5%; Vdss: GMFE, 1.64; percentage within 2-fold error, 71.1%). These accuracies were comparable to those of conventional animal scale-up models. Unlike animal scale-up methods, this method does not require animal experiments, which are becoming more strictly regulated over time.


Introduction


Pharmacokinetic evaluations play an important role in the development of new drugs throughout the entire process. (1) Clinical trials are particularly important in drug development, and improving the success rate of these requires the estimation of effective clinical dosages that produce the best drug effect profile. Therefore, it is necessary to accurately predict human pharmacokinetic parameters from nonclinical experimental data before transitioning to human clinical trials. (2) In general, the parameters that have a large effect on the blood concentration profile of a drug during intravenous administration are the volume of distribution (Vd), which quantifies the distribution of the drug inside the human body, and total body clearance (CLtot), which shows the drug processing capacity within the body as a whole.
Vd is determined by the physical properties of the drug, such as protein binding and membrane permeability, and predictions from machine learning models using chemical structures (CS) have been relatively accurate. (3) When nonclinical animal experimental values are available, animal scale-up methods keep the difference between the predicted and experimental values within approximately a 2-fold error. (4) However, because this kind of highly accurate prediction method uses data from large animals such as dogs and monkeys, it is difficult to apply because of the high cost and ethical implications of large animal models. (5)
Predicting CLtot is much more difficult than predicting Vd because there are multiple drug CL pathways, including metabolism mainly by the liver and gastrointestinal tract, bile excretion of the unmetabolized drug, and its excretion in the urine. In one method, the intrinsic CL obtained in in vitro studies using human hepatocytes and microsomes was scaled up to determine hepatic CL. However, in many cases, the data cannot be accurately scaled because of differences in the experimental systems and lot-to-lot variations between human specimens. Furthermore, there are currently no suitable in vitro experimental systems for other organs. (6) The empirical CLtot prediction method using the animal weight power law was accurate to within approximately a 2-fold error on average. However, it has not been verified on external datasets. (7,8)
Several studies have already investigated machine learning for predicting CLtot. (9) Some reports also used experimental values related to CLtot as explanatory variables. (10) We previously proposed a machine learning method based on multimodal learning that takes the CS and nonclinical data as inputs for predicting human CLtot. (11) The key point of this method is that the human CLtot prediction accuracy is increased by using both CS data and animal experimental data. This suggests that it may be possible to further improve human CLtot prediction accuracy by using as explanatory variables not only rat CLtot but also the CLtot values from various animals (e.g., dogs and monkeys) and in vitro experimental values such as the protein binding ratio for each animal species. However, these experimental values are often missing from the compound datasets.
Missing-value imputation is a well-known method for resolving this issue. Methods that use machine learning, such as kNNimputation, (12) multivariate imputation by chained equations, (13) and missForest, (14) are established imputation methods known for their ability to provide improved accuracy in these types of applications. The prediction of drug repositioning with high accuracy has been made possible by the addition of missing-value imputation based on the similarity of compound structures. (15) Missing data related to activity values for different compound targets were predicted using the Random Forest method, and a QSAR model was constructed, which uses the data from these imputed missing values as explanatory variables. (16)
In this study, we first constructed machine learning models for predicting the missing nonclinical data using chemical compound descriptors. Next, we predicted the missing nonclinical data and then constructed our machine learning models for predicting human CLtot and Vdss, which used the nonclinical data with imputed missing values and CS data as explanatory variables. XGBoost and Deep Tensor, (17) a deep learning method that can learn graph data, were used as the bases for these machine learning models. As a result, the prediction accuracy of this method was comparable with that of many animal scale-up methods. Unlike the conventional methods, these models do not need new experimental data, so they appear to be appropriate for predicting the human parameters not only in the clinical stage but also in the early drug discovery stage.

Materials and Methods


Workflow

The workflow used in this study consisted of the following three steps: (i) gathering of the chemical compound and nonclinical data; (ii) imputing the missing values in the nonclinical data using ADMEWORKS (https://www.fujitsu.com/jp/solutions/business-technology/tc/sol/admeworks/index.html); and (iii) selecting features by XGBoost or Random Forest and constructing the prediction model (Figure 1).

Figure 1. Workflow of our novel human CLtot and Vdss prediction method. (A) CLtot analysis flow. (i) There were 741 compounds with human CLtot data and 46 that had values for all 11 features. (ii) All feature values were estimated via prediction using ADMEWORKS. (iii) Feature extraction was performed using XGBoost or Random Forest, and a prediction model was constructed. (B) Vdss analysis flow. (i) There were 751 compounds with human Vdss data and 45 that had values for all 11 features. (ii) All feature values were estimated via prediction using ADMEWORKS. (iii) Feature extraction was performed using XGBoost or Random Forest, and a prediction model was constructed.

Gathering Chemical Compounds, Nonclinical Data, and Data Preprocessing

We obtained 741 sets of human CLtot data with CS data, and 751 sets of human Vdss data with CS data from JCP2013 (4,7) and ChEMBL23. (18) We also obtained various sets of animal experimental data (CLtot, Vdss, and fraction-unbound [fu] data for rats, dogs, and monkeys) and human fu data for each of these compounds. (4,7) In addition, we collected the pKa acid, pKa base, solubility, and caco-2 permeability data, including calculated values, for each compound from PubChem (19) and DrugBank. (20) Caco-2 permeability is a positive/negative binary value, and the values denoted as predicted values were also collected (Table 1). This left 46 compounds for CLtot and 45 compounds for Vdss for which all 11 data items were recorded. For CLtot, the items were rat CLtot, dog CLtot, monkey CLtot, human fu, rat fu, dog fu, monkey fu, pKa acid, pKa base, solubility, and caco-2 permeability; for Vdss, they were rat Vdss, dog Vdss, monkey Vdss, human fu, rat fu, dog fu, monkey fu, pKa acid, pKa base, solubility, and caco-2 permeability (hereinafter defined as “nonclinical data”). These sets of compounds were labeled the “evaluation dataset,” and the sets of compounds not in the “evaluation dataset” were labeled the “training dataset” for human CLtot or Vdss (dataset.xlsx). In addition, the set of compounds formed by removing the “evaluation dataset” from the set of compounds that had rat data was labeled the “training dataset (rat)” for the human CLtot and Vdss data. Note that although the excretion pathways have not been identified for most of the compounds used in this study, when the data from Lombardo et al., who gathered drug data, (7) and Varma et al., (21) who investigated human kidney excretion, were compared, kidney excretion of 50% or less of the CLtot was found in 157 of the 231 remaining compounds. Furthermore, although the eptaloprost data were gathered as intravenous administration data, the cited references contain errors, and the human data have been shown to be for oral administration.
Table 1. Details of the Compound Data
feature | number of compounds | source
human CLtot | 741 | JCP2013, ChEMBL23
rat CLtot | 387 | JCP2013, ChEMBL23
dog CLtot | 284 | JCP2013, ChEMBL23
monkey CLtot | 129 | JCP2013, ChEMBL23
human Vdss | 751 | JCP2013, ChEMBL23
rat Vdss | 351 | JCP2013, ChEMBL23
dog Vdss | 274 | JCP2013, ChEMBL23
monkey Vdss | 125 | JCP2013, ChEMBL23
human fu | 577 | JCP2013, ChEMBL23
rat fu | 237 | JCP2013, ChEMBL23
dog fu | 179 | JCP2013, ChEMBL23
monkey fu | 88 | JCP2013, ChEMBL23
pKa acid | 334 | PubChem, DrugBank
pKa base | 335 | PubChem, DrugBank
solubility | 339 | PubChem, DrugBank
caco-2 permeability | 307 | PubChem, DrugBank
The gathered data were then preprocessed as follows before applying each of the machine learning calculations described in the next section. First, the CLtot, Vdss, and solubility data underwent a logarithmic transformation and then all of the data except the caco-2 permeability values were standardized. It is worth noting that the data used in the animal scale-up method evaluations were the raw data without data preprocessing.
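The preprocessing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not give the base of the logarithm (base 10 is assumed here), standardization is assumed to mean zero mean and unit variance computed over the observed entries, and the function names and sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def log_transform(values):
    """Apply log10 to positive values; missing entries (None) stay missing."""
    return [None if v is None else math.log10(v) for v in values]


def standardize(values):
    """Scale to zero mean and unit variance using the observed entries only.

    Assumes at least two distinct observed values, so the standard
    deviation is nonzero.
    """
    observed = [v for v in values if v is not None]
    mu, sigma = mean(observed), pstdev(observed)
    return [None if v is None else (v - mu) / sigma for v in values]


# Hypothetical rat CLtot values; None marks a compound with no measurement.
rat_cl = [1.0, 10.0, None, 100.0]
scaled = standardize(log_transform(rat_cl))
```

Missing entries are carried through untouched so that they can be filled by the imputation step afterwards. Binary features such as caco-2 permeability would bypass both functions.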

Missing-Value Imputation Using ADMEWORKS

Some of the 11 features of the nonclinical data were missing for some of the compounds in the “training dataset” (Table 1). Therefore, we created machine learning models using existing data for each item and predicted the unknown nonclinical values for these features using these models. ADMEWORKS was used to complete the descriptor calculations, descriptor extractions, prediction model construction, and prediction. First, compound descriptors with a total of 1465 dimensions were calculated: 555 dimensions derived from atom/bond-related parameters, topology-related parameters, and physicochemical parameters, and 910 dimensions obtained by counting the matches from a partial structure search of the CS. Next, descriptors containing missing values or calculation errors and descriptors with a correlation coefficient of 0.9 or higher (the default setting in ADMEWORKS) were excluded. In addition, high-level feature extraction (particle swarm optimization) (22) was performed, and a model was constructed using the remaining descriptors. The machine learning model that maximized the percentage within a 2-fold error in 5-fold cross-validation (or, for caco-2 permeability only, the two-class accuracy) was then adopted for downstream analyses. The nonclinical data for each compound were then predicted using each prediction model. The machine learning methods used for each item and the accuracies of the learned models are listed in Table S1.
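The descriptor-exclusion step can be illustrated with a small, self-contained sketch. It is not the ADMEWORKS procedure, whose internals are not described here. It assumes a greedy filter that drops a descriptor when its absolute Pearson correlation with an already kept descriptor is 0.9 or higher, and it additionally skips constant columns because their correlation is undefined. The descriptor names and values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_descriptors(columns, threshold=0.9):
    """Return the names of descriptors that survive the exclusion rules."""
    kept = []
    for name, values in columns.items():
        if any(v is None for v in values):
            continue  # missing value or calculation error
        if len(set(values)) == 1:
            continue  # assumption: constant columns carry no information
        if any(abs(pearson(values, columns[k])) >= threshold for k in kept):
            continue  # highly correlated with a descriptor already kept
        kept.append(name)
    return kept


# Hypothetical descriptor table: one list of values per descriptor.
descriptors = {
    "mol_weight": [180.2, 151.2, 206.3, 194.2],
    "heavy_atoms": [13, 11, 15, 14],  # nearly collinear with mol_weight
    "h_bond_donors": [1, 2, 1, 0],
    "broken": [0.3, None, 0.1, 0.2],  # contains a calculation error
}
kept_names = filter_descriptors(descriptors)
```

With a greedy filter the result depends on column order, since the first descriptor of a correlated pair is the one retained.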

Feature Selection and Prediction Model Construction

Feature Selection
Because some nonclinical data may contribute little to prediction, and because inappropriate missing-value imputation may have an adverse effect on prediction, specific items among the 11 nonclinical data categories were selected as explanatory variables. This selection was based on the importance of each explanatory variable as determined during the construction of a prediction model using the XGBoost or Random Forest method. (23) First, the prediction models for human CLtot and Vdss were constructed with the “training dataset” by the XGBoost or Random Forest method using all 11 items from the nonclinical data as explanatory variables. This allowed us to evaluate the importance of each of the 11 variables within the model. Next, the top k nonclinical data items by importance were selected as explanatory variables. The search for the best k started from the smallest k at which the sum of the top k importances exceeded 0.5, and k was then increased in increments of one. For each k, 5-fold cross-validation using the “training dataset” was performed with the multimodal model described below, whose explanatory variables included the CS and the top k nonclinical data items by importance. Each k was evaluated using the geometric mean fold error (GMFE) and percentage within a 2-fold error, which are described below. The search ended when either the GMFE or the percentage within a 2-fold error became worse than at the previous value of k (i.e., k – 1). Finally, the k that gave the best GMFE and percentage within a 2-fold error within the search range was considered the best k, and the nonclinical data at the best k were selected for further evaluation.
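The search for the best k can be sketched as follows. This is a minimal illustration rather than the authors' code. The `evaluate` callback stands in for the 5-fold cross-validation of the multimodal model and is hypothetical, as are the feature names, importances, and scores. Two details are assumptions because the text leaves them open: the k at which the search stops is kept in the search range, and ties in GMFE are broken by the percentage within a 2-fold error.

```python
def initial_k(importances, threshold=0.5):
    """Smallest k whose top-k importances sum to more than the threshold."""
    total = 0.0
    for k, value in enumerate(sorted(importances.values(), reverse=True), start=1):
        total += value
        if total > threshold:
            return k
    return len(importances)


def search_best_k(importances, evaluate, threshold=0.5):
    """Grow k one step at a time and stop once either metric gets worse.

    `evaluate(features)` must return (gmfe, percent_within_2fold).
    """
    ranked = sorted(importances, key=importances.get, reverse=True)
    results = {}
    previous = None
    for k in range(initial_k(importances, threshold), len(ranked) + 1):
        gmfe, pct = evaluate(ranked[:k])
        results[k] = (gmfe, pct)
        if previous is not None and (gmfe > previous[0] or pct < previous[1]):
            break  # worse than k - 1 on at least one indicator
        previous = (gmfe, pct)
    # Assumption: lowest GMFE wins, higher percentage breaks ties.
    best_k = min(results, key=lambda k: (results[k][0], -results[k][1]))
    return ranked[:best_k], results


# Hypothetical importances and cross-validation scores, for illustration only.
toy_importances = {"f1": 0.30, "f2": 0.25, "f3": 0.15, "f4": 0.12, "f5": 0.10, "f6": 0.08}
toy_scores = {2: (2.05, 52.0), 3: (1.98, 56.0), 4: (1.92, 60.0), 5: (1.95, 58.0), 6: (1.97, 57.0)}
selected, results = search_best_k(toy_importances, lambda feats: toy_scores[len(feats)])
```

Starting at the cumulative-importance threshold and stopping at the first deterioration keeps the number of cross-validation runs small compared with testing every subset of the 11 items.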
Deep Tensor Model
To evaluate the effectiveness of the missing-value imputation, a prediction model that uses only the CS data as explanatory variables was constructed using Deep Tensor for comparison. The “training dataset” was used to construct this prediction model and the other conditions were set to the same values as those used in the multimodal Deep Tensor model described below.
Multimodal Deep Tensor Model
Prediction models for human CLtot and Vdss were constructed using the CS and nonclinical data as explanatory variables, following the previously described method (11) for Deep Tensor, (17) a deep learning technology for structured graph data (Figure 2). For CLtot and Vdss, four combinations of nonclinical data and training dataset were used to construct the prediction model: (1) rat data only + “training dataset (rat)” (as CS + rat CLtot or Vdss in Table S2); (2) rat data only + “training dataset” (as CS + rat CLtot or Vdss imputed in Table S2); (3) all 11 nonclinical data items + “training dataset” (as CS + 11 features in Table S3); and (4) nonclinical data selected by the feature selection described above + “training dataset” (as CS + selected features in Table S3). The core tensor size was set at 50 × 50, and the neural network structure consisted of two intermediate layers with 1000 neurons in each layer and one neuron in the output layer. The ReLU function (24) was used as the activation function, and batch normalization (25) with a moving-average decay rate of 0.9 and an epsilon value of 2 × 10^-5 and dropout (26) at a rate of 0.5 were applied to the intermediate layers. During training, the number of epochs was set to 50, and the minibatch size was set to 100.

Figure 2. Overview of the multimodal Deep Tensor model.

XGBoost Model
To evaluate the effectiveness of the missing-value imputation, we also built XGBoost models as a traditional machine learning baseline. XGBoost was implemented in Python using scikit-learn. (27) Two kinds of prediction models were constructed: one using only the CS data and one using both the CS data and nonclinical data as explanatory variables. The CS data were transformed into the extended connectivity fingerprint with a bond diameter of four (ECFP4), which was calculated using RDKit with a radius of 2 and 2048 dimensions.
Animal Scale-Up and Conventional Machine Learning Methods
Three approaches are often implemented as conventional methods of human CLtot prediction: the single-species allometric scaling (SSS) method, which uses the CLtot values of a single species (rats, dogs, or monkeys); the simple allometry (SA) method, which uses all three species; and the fraction-unbound corrected intercept method (FCIM). (28) To construct each SSS model, the compounds in the “training dataset” that have the value needed for that model were used. Each model was then evaluated on the “evaluation dataset.” For SA and FCIM, which have no training process on the training dataset, the parameters were tuned using the data of each compound to be predicted. The number of compounds used in the training process for parameter selection and the equations for each method are shown in Table S4.
The SSS and SA methods for Vdss use the Vdss values in the same way as the CLtot prediction models described above. The Øie–Tozer (29) method was also used as a conventional method of human Vdss prediction. The construction and evaluation of each SSS model were similar to those for CLtot. For the SA and Øie–Tozer methods, which have no training process on the training dataset, the parameters were tuned using the data of each compound to be predicted. The number of compounds in the training dataset for each parameter and each method and the equations for each method are shown in Table S5.
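As a rough illustration of how simple allometry tunes its parameters per compound, the sketch below fits the textbook power law Y = a × BW^b to three animal species by least squares on log-transformed values and extrapolates to a human body weight. The exact equations used in this study are given in Tables S4 and S5 and are not reproduced here. The body weights, the synthetic parameter values, and the function names are all assumptions made for the example.

```python
import math


def fit_simple_allometry(body_weights, values):
    """Least-squares fit of log10(value) = log10(a) + b * log10(body weight)."""
    xs = [math.log10(w) for w in body_weights]
    ys = [math.log10(v) for v in values]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = 10 ** (my - b * mx)
    return a, b


def predict(a, b, body_weight):
    """Evaluate the fitted power law at a given body weight."""
    return a * body_weight ** b


# Assumed body weights in kg (rat, monkey, dog) and an assumed 70 kg human.
animal_bw = [0.25, 5.0, 10.0]
# Synthetic whole-body parameter values generated from a known power law,
# so that the fit can be checked; these are not measured data.
animal_cl = [20.0 * w ** 0.75 for w in animal_bw]

a, b = fit_simple_allometry(animal_bw, animal_cl)
human_cl = predict(a, b, 70.0)
```

Because the coefficient and exponent are refitted for every compound from that compound's own animal data, no separate training set is involved, which matches the statement above that SA has no training process.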
Performance Evaluation
GMFE and percentage within a 2-fold error were used as indicators for evaluating the prediction accuracy of the method. When GMFE = X, the mean error between the measured and predicted values can be interpreted as an X-fold error. GMFE is expressed by the following equation
$$\mathrm{GMFE} = 10^{\frac{1}{n}\sum_{i=1}^{n}\left|\log_{10}\left(\frac{\mathrm{predicted}_i}{\mathrm{observed}_i}\right)\right|}$$
where n is the number of compounds, and GMFE values closer to 1 indicate improved accuracy. Furthermore, the percentage within a 2-fold error indicates the proportion of data that are within a 2-fold error (1/2 × correct value ≤ predicted value ≤ 2 × correct value). Values of the percentage within a 2-fold error closer to 100% indicate better accuracy. When the evaluation results did not match for both indicators, GMFE was used as the primary indicator of accuracy because it is the more comprehensive indicator.
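Both indicators are straightforward to compute. The sketch below assumes the common definition of GMFE as 10 raised to the mean absolute log10 ratio of predicted to observed values, and it uses the 2-fold criterion exactly as stated above. The function names and sample values are hypothetical.

```python
import math


def gmfe(observed, predicted):
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for o, p in zip(observed, predicted)]
    return 10 ** (sum(logs) / len(logs))


def percent_within_2fold(observed, predicted):
    """Share (in %) of predictions with observed / 2 <= predicted <= 2 * observed."""
    hits = sum(1 for o, p in zip(observed, predicted) if o / 2 <= p <= 2 * o)
    return 100.0 * hits / len(observed)


# Made-up values: fold errors of 2, 2, 1, and 4, respectively.
observed = [1.0, 10.0, 100.0, 5.0]
predicted = [2.0, 5.0, 100.0, 20.0]

example_gmfe = gmfe(observed, predicted)
example_pct = percent_within_2fold(observed, predicted)
```

Taking the absolute value of the log ratio means that over- and under-predictions of the same magnitude are penalized equally, which is why a GMFE of X reads as an average X-fold error in either direction.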

Results


Evaluation of the Usefulness of Missing-Value Imputation

To clarify the effectiveness of missing-value imputation, the accuracy was evaluated using rat data, which had the fewest missing values. More specifically, three models were created: a model trained using only CS data, a multimodal model using CS and rat CLtot or rat Vdss (CS + rat CLtot or rat Vdss) data, and a multimodal model using CS and rat CLtot or rat Vdss data in which missing values were imputed using predicted values (CS + rat CLtot imputed or rat Vdss imputed). The effectiveness of the missing-value imputation was then evaluated (Table S2). Evaluation was performed using the evaluation dataset, which included established values for human, rat, dog, and monkey CLtot or Vdss for 46 compounds for the CLtot prediction or 45 compounds for the Vdss prediction. Note that, to reduce the variation caused by the limited number of compounds in the evaluation dataset, construction of the models and evaluation using the evaluation dataset were completed five times, and the mean values were used for the evaluation.
Table 2 shows the results of the missing-value imputation for CLtot prediction. First, the accuracies were compared for the model trained using only CS and the multimodal model using CS and rat CLtot (CS + rat CLtot). The training data for the model using only CS consisted of 695 compounds excluding those present in the evaluation dataset (only CS), the GMFE was XGBoost: 2.53 and Deep Tensor: 2.44, and the percentage within the 2-fold error was XGBoost: 45.7% and Deep Tensor: 45.7%. For the multimodal model including CS and rat CLtot, taking the 343 compounds that had a rat CLtot value from the 695 compounds as training data (CS + rat CLtot), the GMFE was XGBoost: 2.15 and Deep Tensor: 2.15, and the percentage within 2-fold error was XGBoost: 52.2% and Deep Tensor: 54.8%. This finding confirmed that the accuracy was improved by introducing rat CLtot values to the CLtot prediction. This result is consistent with our previous report. (11) Next, although accuracy is generally increased by increasing the amount of training data, there were fewer compounds for which nonclinical data, such as rat CLtot values, were measured. Therefore, we performed prediction using ADMEWORKS for the compounds with no rat CLtot value data (see the Materials and Methods for more details). This meant that we could then train the multimodal model using the CS and rat CLtot imputed data with 695 compounds (CS + rat CLtot imputed). This model produced a GMFE of XGBoost: 2.09 and Deep Tensor: 2.09, and the percentage within 2-fold error was XGBoost: 54.3% and Deep Tensor: 54.3%, making it the most accurate model. These results suggest that model accuracy can be improved by increasing the data used during training, even if this is imputed data.
Table 2. Results of the Accuracy Evaluations for Imputations of Rat CLtot Data
parameter | method | training compounds | GMFE | % within 2-fold error
CLtot | XGBoost: only CS | 695 | 2.53 | 45.7
CLtot | XGBoost: CS + rat CLtot | 343 | 2.15 | 52.2
CLtot | XGBoost: CS + rat CLtot imputed | 695 | 2.09 | 54.3
CLtot | Deep Tensor: only CS | 695 | 2.44 | 45.7
CLtot | Deep Tensor: CS + rat CLtot | 343 | 2.15 | 54.8
CLtot | Deep Tensor: CS + rat CLtot imputed | 695 | 2.09 | 54.3
Vdss | XGBoost: only CS | 706 | 1.66 | 82.2
Vdss | XGBoost: CS + rat Vdss | 306 | 1.72 | 75.6
Vdss | XGBoost: CS + rat Vdss imputed | 706 | 1.73 | 68.9
Vdss | Deep Tensor: only CS | 706 | 1.85 | 62.2
Vdss | Deep Tensor: CS + rat Vdss | 306 | 1.89 | 56.9
Vdss | Deep Tensor: CS + rat Vdss imputed | 706 | 1.75 | 64.4
Table 2 also shows the results of the missing-value imputation for Vdss. The model trained using only CS (only CS, using the 706 training compounds that remained after excluding the evaluation dataset) and the multimodal model using CS and rat Vdss (CS + rat Vdss, using 306 compounds) had GMFEs of XGBoost: 1.66 and 1.72 and Deep Tensor: 1.85 and 1.89, and percentages within 2-fold error of XGBoost: 82.2% and 75.6% and Deep Tensor: 62.2% and 56.9%, respectively. This means that the model using only CS (only CS) was more accurate than the multimodal model using CS and actual nonclinical data (CS + rat Vdss). It is known that the structure–activity relationship is stronger for Vdss than for CLtot when predicting from structure information, (3) and this suggests that increasing the number of structures in the training data is by itself enough to increase the accuracy of the prediction model. Furthermore, when we evaluated the model produced using the training data of 706 compounds, with imputed nonclinical data where there was no rat Vdss value, the multimodal model using CS and rat Vdss imputed using predicted values (CS + rat Vdss imputed) had a GMFE of XGBoost: 1.73 and Deep Tensor: 1.75, and a percentage within 2-fold error of XGBoost: 68.9% and Deep Tensor: 64.4%, making it the most accurate of the Deep Tensor models.

Improving Accuracy Using Feature Selection

Improved accuracy was achieved using multimodal machine learning models and imputed values for the missing rat CLtot/Vdss data. We then went on to evaluate the addition of multimodal machine learning models designed to select specific features from the 11 items of nonclinical data in each dataset (Table S3).
We first created a set of learning models for each of the 11 items in the nonclinical data using ADMEWORKS, and missing-value imputation was performed by prediction. Details of the prediction model and model accuracy for each item of the nonclinical data are shown in Table S1. Multiple models were constructed for each item, and missing-value imputation was performed by predicting the unknown nonclinical data using the model with the highest accuracy.
We went on to complete feature selection using these 11 items with their imputed data for any missing values. The importance of each item in the nonclinical data is shown in Figures S1–S4. Using the importance from the XGBoost algorithm, we selected the four items (rat CLtot, dog CLtot, human fu, and pKa acid) critical for CLtot prediction (Figure S1 and Table S6) and the five items (rat Vdss, dog Vdss, pKa acid, pKa base, and human fu) identified for Vdss prediction (Figure S2 and Table S7). Using the importance from the Random Forest algorithm, we selected the same four items (rat CLtot, dog CLtot, human fu, and pKa acid) for CLtot prediction (Figure S3 and Table S8) and six items (dog Vdss, rat Vdss, pKa acid, pKa base, solubility, and human fu) for Vdss prediction (Figure S4 and Table S9).
Finally, multimodal machine learning models were constructed using the selected nonclinical and CS data. The accuracies of this model, five different conventional methods, the model using only CS, and the multimodal model using CS and all 11 items of nonclinical data were evaluated using the “evaluation dataset.” We then repeated the dataset evaluations five times for both the multimodal machine learning model and the model using CS only, and the mean value was used for evaluation, similar to the evaluation of the usefulness of missing-value imputation. Table 3 shows the results of the CLtot prediction using eight different models. Among the five types of conventional models, models that use monkey CLtot data, such as SSS monkey (GMFE: 1.93, percentage within 2-fold error: 58.7%) and FCIM (GMFE: 1.99, percentage within 2-fold error: 52.2%) were the most accurate. Among the multimodal models using nonclinical data with missing-value imputation and CS as proposed in this research, the model using all 11 items (CS + 11 features) presented with a GMFE of XGBoost: 2.06 and Deep Tensor: 2.11 and a percentage within 2-fold error of XGBoost: 58.7% and Deep Tensor: 52.2% that was equivalent to the conventional methods, while the model using feature selection (CS + selected features) was shown to be the most accurate with a GMFE of XGBoost: 1.98 and Deep Tensor: 1.92 and a percentage within a 2-fold error of XGBoost: 50.0% and Deep Tensor: 66.5%. These results indicate that predictive models can be improved by increasing the number of compounds used for training and that these can be enhanced by first imputing any missing data using predictive values. In addition, these results suggest that a better model can be constructed by performing feature selection and training using only the important features from the nonclinical data.
Table 3. Results of Accuracy Evaluations
parameter | method (a) | GMFE (b) | % within 2-fold error
CLtot | SSS rat | 2.36 | 43.5
CLtot | SSS dog | 2.30 | 39.1
CLtot | SSS monkey | 1.93 | 58.7
CLtot | SA | 2.33 | 45.7
CLtot | FCIM | 1.99 | 52.2
CLtot | XGBoost: only CS | 2.40 | 50.0
CLtot | XGBoost: CS + 11 features | 2.06 | 58.7
CLtot | XGBoost: CS + selected features | 1.98 | 50.0
CLtot | Deep Tensor: only CS | 2.44 | 45.7
CLtot | Deep Tensor: CS + 11 features | 2.11 | 52.2
CLtot | Deep Tensor: CS + selected features | 1.92 | 66.5
Vdss | SSS rat | 1.91 | 62.2
Vdss | SSS dog | 1.93 | 71.1
Vdss | SSS monkey | 1.60 | 80.0
Vdss | SA | 2.07 | 68.9
Vdss | Øie–Tozer | 1.46 | 84.4
Vdss | XGBoost: only CS | 1.70 | 77.8
Vdss | XGBoost: CS + 11 features | 1.64 | 71.1
Vdss | XGBoost: CS + selected features | 1.66 | 71.1
Vdss | Deep Tensor: only CS | 1.85 | 62.2
Vdss | Deep Tensor: CS + 11 features | 1.75 | 69.8
Vdss | Deep Tensor: CS + selected features | 1.74 | 74.2
(a) SSS: single-species allometric scaling; SA: simple allometry; FCIM: fu-corrected intercept method; CS: chemical structure.
(b) GMFE: geometric mean fold error.


Table 3 also summarizes the results from the Vdss predictions. First, the Øie–Tozer method (GMFE: 1.46, percentage within 2-fold error: 84.4%) was shown to be the most accurate of the conventional methods evaluated and was closely followed by SSS monkey (GMFE: 1.60, percentage within 2-fold error: 80.0%). Among the multimodal models using nonclinical data with missing-value imputation and CS data, CS + 11 features produced a GMFE of XGBoost: 1.64 and Deep Tensor: 1.75 and a percentage within 2-fold error of XGBoost: 71.1% and Deep Tensor: 69.8%, and the feature selection model (CS + selected features) produced a GMFE of XGBoost: 1.66 and Deep Tensor: 1.74 and a percentage within 2-fold error of XGBoost: 71.1% and Deep Tensor: 74.2%. There were therefore no significant differences among these models, despite a small improvement in the percentage within 2-fold error for Deep Tensor. Unlike in CLtot prediction, Vdss prediction was still more accurate using the conventional Øie–Tozer method, which is based on animal scale-up data.

Discussion


Missing-Value Imputation (NA Imputation)

This research demonstrates that high-accuracy predictions of CLtot and Vdss can be achieved via data extension facilitated by missing-value imputation. Since this method does not require new experimental values, it can be used from the initial stages of drug development.
Although the effectiveness of imputation using this method has been confirmed for rat data, it has not been evaluated for other nonclinical data. There are therefore likely to be nonclinical data for which imputation is not effective and instead reduces prediction accuracy, as well as nonclinical data for which imputation is appropriate but does not improve prediction. With these possibilities in mind, we selected the explanatory variables carefully. Identifying the truly optimal explanatory variables for a multimodal Deep Tensor model would require evaluating all combinations of the 12 candidate explanatory variables, consisting of the CS data and 11 items of nonclinical data. Because this would take a huge amount of time, this study used a simplified selection procedure. More specifically, the CS data were always selected, and only the required items from among the 11 nonclinical data items were added. The number of nonclinical explanatory variables to select was determined from their importance, as obtained from the XGBoost or Random Forest evaluations (Figures S1–S4 and Tables S6–S9). This allowed us to reduce the number of combinations of explanatory variables that needed to be evaluated. Note that the importance obtained from the XGBoost or Random Forest method assumes that the prediction model is constructed using that method, so it may not match the importance in the multimodal Deep Tensor model. Furthermore, in the XGBoost or Random Forest method, the importance of each explanatory variable was calculated using only the nonclinical data, so the importance may change when these variables are used together with the CS data in the multimodal Deep Tensor model. Therefore, we cannot definitively say that we selected the best explanatory variables.
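
To make the size of this search space concrete, the sketch below counts the exhaustive combinations and builds the much smaller family of importance-ranked candidate sets. It assumes one plausible reading of the simplified procedure (CS always included, nonclinical items added in descending order of importance); the item names and importance scores are hypothetical placeholders, not values from Tables S6–S9.

```python
from typing import Dict, List


def count_nonempty_subsets(n_candidates: int) -> int:
    """Number of non-empty subsets of n candidate explanatory variables."""
    return 2 ** n_candidates - 1


def nested_subsets(importance: Dict[str, float]) -> List[List[str]]:
    """Top-1, top-2, ..., top-n items ranked by descending importance."""
    ranked = sorted(importance, key=lambda name: importance[name], reverse=True)
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


# Exhaustive search over the 12 candidates (CS + 11 nonclinical items).
print(count_nonempty_subsets(12))  # 4095

# With CS always selected, each of the 11 nonclinical items is in or out.
print(2 ** 11)  # 2048

# Hypothetical importance scores for 11 placeholder nonclinical items.
importance = {f"item_{i:02d}": 1.0 / i for i in range(1, 12)}

# Ranked, nested selection: only one candidate set per subset size.
candidate_sets = [["CS"] + subset for subset in nested_subsets(importance)]
print(len(candidate_sets))  # 11
```

The ranking shrinks the number of models to train from thousands to about a dozen, at the cost noted in the text: the ranking comes from a different model than the one finally trained.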

Selected Explanatory Variables

The explanatory variables selected for CLtot prediction (CS + selected features) that gave the highest prediction accuracy were CLtot rat, CLtot dog, human fu, and pKa acid in both XGBoost and Deep Tensor (Figures S1 and S3 and Tables S6 and S8). We believe that selecting CLtot for multiple species may help to reflect the inherent metabolic differences between species. (30) The fact that monkey CLtot was not selected may be due to problems with imputation accuracy, since the number of data points available for missing-value imputation of the monkey values was significantly smaller than for the other species. In addition, we believe that these selections are valid because human fu was selected, and this is a common consideration in human CLtot prediction. (7) The addition of the pKa acid variable may help to reflect the allocation of the metabolism/excretion pathways depending on the compound's physical properties. (31)
The explanatory variables selected for the most accurate prediction of Vdss (CS + selected features) were rat Vdss, dog Vdss, human fu, pKa acid, and pKa base in XGBoost (Figure S2 and Table S7) and rat Vdss, dog Vdss, human fu, pKa acid, pKa base, and solubility in Deep Tensor (Figure S4 and Table S9). The Vd of multiple species is likely to reflect the inherent differences in the Vd pathways between species, (32) and the validity of these selections was similarly supported by the inclusion of human fu. (4) However, more physical property parameters (e.g., pKa acid, pKa base, solubility) were selected for Vdss than for CLtot; this likely reflects the fact that the Vd of a compound is determined by interactions between the compound and the constituent components of the tissue (e.g., lipids, phospholipids, acidic glycoprotein). (33)

Comparison with Animal Scale-Up Methods

Conventional animal scale-up methods are known to be useful and reasonably accurate, and they are often used for human pharmacokinetic parameter prediction during drug development. (4,7) For CLtot prediction, the multimodal model with missing-value imputation and feature selection (Deep Tensor: CS + selected features) had the highest accuracy, followed by SSS monkey, which uses monkey data. This was followed by the FCIM method, which uses data from rats, dogs, and monkeys, and by XGBoost: CS + selected features. Among the Vdss prediction methods, the Øie–Tozer method, which is calculated based on the plasma and tissue binding rates in rats, dogs, and monkeys, had the highest accuracy, followed by an animal scale-up method using monkey data (SSS monkey) and then by the proposed multimodal models with missing-value imputation (XGBoost: CS + 11 features and Deep Tensor: CS + selected features), which had slightly lower accuracy (Table 3). The good accuracy of human CLtot and Vdss predictions based on monkey data is plausible given the close genetic relationship between these species. (34,35) However, because of the experimental costs and ethical aspects of animal experiments, monkey studies tend to be conducted in the later stages of nonclinical development, which makes them difficult to use in the initial stages of drug development. (34,36) In contrast, the proposed multimodal model, which combines missing-value imputation and feature selection using existing data, can be applied in the initial stages of drug development and is expected to contribute substantially to efficient drug development.
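
For readers unfamiliar with the two ingredients of this comparison, the sketch below shows a generic single-species allometric scaling of clearance and the geometric mean fold error (GMFE) commonly used to summarize prediction accuracy. It is illustrative only: the exponent of 0.75, the 70 kg human body weight, and the example monkey values are conventional or hypothetical assumptions, and the exact SSS definition used in this study is given in its Methods rather than in this section.

```python
import math
from typing import Sequence


def single_species_scaling(
    cl_animal: float,
    bw_animal: float,
    bw_human: float = 70.0,   # assumed human body weight in kg
    exponent: float = 0.75,   # conventional allometric exponent (assumption)
) -> float:
    """Scale a whole-body clearance (e.g. mL/min, not per kg) from one
    animal species to human using CL_human = CL_animal * (BW_h / BW_a)^b."""
    return cl_animal * (bw_human / bw_animal) ** exponent


def gmfe(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Geometric mean fold error: 10 ** mean(|log10(predicted / observed)|)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


# Hypothetical example: a 5 kg monkey with a clearance of 50 mL/min.
human_cl = single_species_scaling(cl_animal=50.0, bw_animal=5.0)
print(f"scaled human clearance: {human_cl:.1f} mL/min")

# One 2-fold overprediction and one 2-fold underprediction give GMFE = 2.
print(f"GMFE: {gmfe([2.0, 0.5], [1.0, 1.0]):.2f}")
```

A GMFE of 1 means perfect agreement; because the absolute log ratio is averaged, over- and underpredictions do not cancel.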

Conclusions

This study constructed high-accuracy CLtot and Vdss prediction models using missing-value imputation and feature selection for nonclinical data. Previous approaches that used nonclinical data as explanatory variables were less effective because missing values left too few compounds for accurate machine learning. We confirmed that performing missing-value imputation on nonclinical data improves model accuracy by increasing both the number of compounds available for training and the number of nonclinical explanatory variables that can be used. This method differs from animal scale-up methods in that it does not require animal experiments, which have become more strictly regulated in recent years. Although we used the XGBoost and Deep Tensor algorithms in this research, other machine learning algorithms could also be applied because the proposed imputation approach does not depend on the choice of learning algorithm. The increased accuracy of the CLtot and Vdss predictions produced by this method is expected to facilitate the evaluation and identification of candidate structures with improved pharmacokinetic properties at earlier stages of drug discovery.

Supporting Information

The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c00318.

  • The data used in this paper are listed in Supporting_information_dataset.xlsx (PDF)

Author Information

  • Corresponding Authors
    • Hiroaki Iwata - Graduate School of Medicine, Kyoto University, 53 Shogoin-kawaharacho, Sakyo-ku, Kyoto 606-8507, Japan; ORCID: https://orcid.org/0000-0001-9791-0008; Email: [email protected]
    • Koichi Handa - Toxicology & DMPK Research Department, Teijin Institute for Bio-medical Research, Teijin Pharma Limited, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan; Email: [email protected]
  • Authors
    • Tatsuru Matsuo - Fujitsu Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa 211-8588, Japan
    • Hideaki Mamada - DMPK Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1, Murasaki-cho, Takatsuki, Osaka 569-1125, Japan; ORCID: https://orcid.org/0000-0002-5433-7042
    • Takahisa Motomura - Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1, Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
    • Mayumi Matsushita - Fujitsu Ltd., 4-1-1 Kamikodanaka, Nakahara-ku, Kawasaki-shi, Kanagawa 211-8588, Japan
    • Takeshi Fujiwara - Graduate School of Medicine, Kyoto University, 53 Shogoin-kawaharacho, Sakyo-ku, Kyoto 606-8507, Japan
    • Kazuya Maeda - Graduate School of Pharmaceutical Sciences, Department of Molecular Pharmacokinetics, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan
  • Author Contributions

    The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

  • Funding

    This paper is based on the partial results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO) and a grant-in-aid of The Fugaku Trust For Medicinal Research.

  • Notes
    The authors declare no competing financial interest.

    Software used: ADMEWORKS (https://www.fujitsu.com/jp/solutions/business-technology/tc/sol/admeworks/index.html).

Acknowledgments

This research was conducted as part of the activities of the Life Intelligence Consortium (LINC). The authors thank Dr. Yasushi Okuno from the Graduate School of Medicine at Kyoto University for supporting their research activities at the LINC.

Abbreviations Used

CL: clearance
CLtot: total clearance
Vd: volume of distribution
Vdss: steady-state volume of distribution
SSS: single-species allometric scaling
SA: simple allometry
FCIM: fraction-unbound corrected intercept method
CS: chemical structure
GMFE: geometric mean fold error
CYP: cytochrome P450

References

This article references 36 other publications.

  1. Ballard, P.; Brassil, P.; Bui, K. H.; Dolgos, H.; Petersson, C.; Tunek, A.; Webborn, P. J. The right compound in the right assay at the right time: an integrated discovery DMPK strategy. Drug Metab. Rev. 2012, 44, 224–252, DOI: 10.3109/03602532.2012.691099
  2. Andrade, E. L.; Bento, A. F.; Cavalli, J.; Oliveira, S. K.; Schwanke, R. C.; Siqueira, J. M.; Freitas, C. S.; Marcon, R.; Calixto, J. B. Non-clinical studies in the process of new drug development - Part II: Good laboratory practice, metabolism, pharmacokinetics, safety and dose translation to clinical studies. Braz. J. Med. Biol. Res. 2016, 49, e5646, DOI: 10.1590/1414-431x20165646
  3. Lombardo, F.; Jing, Y. In Silico Prediction of Volume of Distribution in Humans. Extensive Data Set and the Exploration of Linear and Nonlinear Methods Coupled with Molecular Interaction Fields Descriptors. J. Chem. Inf. Model. 2016, 56, 2042–2052, DOI: 10.1021/acs.jcim.6b00044
  4. Lombardo, F.; Waters, N. J.; Argikar, U. A.; Dennehy, M. K.; Zhan, J.; Gunduz, M.; Harriman, S. P.; Berellini, G.; Rajlic, I. L.; Obach, R. S. Comprehensive assessment of human pharmacokinetic prediction based on in vivo animal pharmacokinetic data, part 1: volume of distribution at steady state. J. Clin. Pharmacol. 2013, 53, 167–177, DOI: 10.1177/0091270012440281
  5. Russell, W. M. S.; Burch, R. L. The Principles Of Humane Experimental Technique; Methuen, 1959.
  6. Shiran, M. R.; Proctor, N.; Howgate, E.; Rowland-Yeo, K.; Tucker, G.; Rostami-Hodjegan, A. Prediction of metabolic drug clearance in humans: in vitro–in vivo extrapolation vs allometric scaling. Xenobiotica 2006, 36, 567–580, DOI: 10.1080/00498250600761662
  7. Lombardo, F.; Waters, N. J.; Argikar, U. A.; Dennehy, M. K.; Zhan, J.; Gunduz, M.; Harriman, S. P.; Berellini, G.; Liric Rajlic, I.; Obach, R. S. Comprehensive assessment of human pharmacokinetic prediction based on in vivo animal pharmacokinetic data, part 2: clearance. J. Clin. Pharmacol. 2013, 53, 178–191, DOI: 10.1177/0091270012440282
  8. (a) Crouch, R. D.; Hutzler, J. M.; Daniels, J. S. A novel in vitro allometric scaling methodology for aldehyde oxidase substrates to enable selection of appropriate species for traditional allometry. Xenobiotica 2018, 48, 219–231, DOI: 10.1080/00498254.2017.1296208
     (b) Mahmood, I. A Single Animal Species-Based Prediction of Human Clearance and First-in-Human Dose of Monoclonal Antibodies: Beyond Monkey. Antibodies 2021, 10, 35, DOI: 10.3390/antib10030035
     (c) Sasabe, H.; Koga, T.; Furukawa, M.; Matsunaga, M.; Kaneko, Y.; Koyama, N.; Hirao, Y.; Akazawa, H.; Kawabata, M.; Kashiyama, E.; Takeuchi, K. Pharmacokinetics and metabolism of brexpiprazole, a novel serotonin-dopamine activity modulator and its main metabolite in rat, monkey and human. Xenobiotica 2021, 51, 590–604, DOI: 10.1080/00498254.2021.1890275
  9. (a) Wang, Y.; Liu, H.; Fan, Y.; Chen, X.; Yang, Y.; Zhu, L.; Zhao, J.; Chen, Y.; Zhang, Y. In Silico Prediction of Human Intravenous Pharmacokinetic Parameters with Improved Accuracy. J. Chem. Inf. Model. 2019, 59, 3968–3980, DOI: 10.1021/acs.jcim.9b00300
     (b) Gombar, V. K.; Hall, S. D. Quantitative structure-activity relationship models of clinical pharmacokinetics: clearance and volume of distribution. J. Chem. Inf. Model. 2013, 53, 948–957, DOI: 10.1021/ci400001u
     (c) Demir-Kavuk, O.; Bentzien, J.; Muegge, I.; Knapp, E. W. DemQSAR: predicting human volume of distribution and clearance of drugs. J. Comput. Aided Mol. Des. 2011, 25, 1121–1133, DOI: 10.1007/s10822-011-9496-z
  10. (a) Kosugi, Y.; Hosea, N. Direct Comparison of Total Clearance Prediction: Computational Machine Learning Model versus Bottom-Up Approach Using In Vitro Assay. Mol. Pharm. 2020, 17, 2299–2309, DOI: 10.1021/acs.molpharmaceut.9b01294
      (b) Miljković, F.; Martinsson, A.; Obrezanova, O.; Williamson, B.; Johnson, M.; Sykes, A.; Bender, A.; Greene, N. Machine Learning Models for Human In Vivo Pharmacokinetic Parameters with In-House Validation. Mol. Pharm. 2021, 18, 4520–4530, DOI: 10.1021/acs.molpharmaceut.1c00718
  11. Iwata, H.; Matsuo, T.; Mamada, H.; Motomura, T.; Matsushita, M.; Fujiwara, T.; Kazuya, M.; Handa, K. Prediction of total drug clearance in humans using animal data: proposal of a multimodal learning method based on deep learning. J. Pharm. Sci. 2021, 110, 1834, DOI: 10.1016/j.xphs.2021.01.020
  12. Troyanskaya, O.; Cantor, M.; Sherlock, G.; Brown, P.; Hastie, T.; Tibshirani, R.; Botstein, D.; Altman, R. B. Missing value estimation methods for DNA microarrays. Bioinformatics 2001, 17, 520–525, DOI: 10.1093/bioinformatics/17.6.520
  13. Schafer, J. L.; Olsen, M. K. Multiple imputation for multivariate missing-data problems: A data analyst's perspective. Multivariate Behavioral Res. 1998, 33, 545–571, DOI: 10.1207/s15327906mbr3304_5
  14. Stekhoven, D. J.; Buhlmann, P. MissForest--non-parametric missing value imputation for mixed-type data. Bioinformatics 2012, 28, 112–118, DOI: 10.1093/bioinformatics/btr597
  15. Sawada, R.; Iwata, H.; Mizutani, S.; Yamanishi, Y. Target-Based Drug Repositioning Using Large-Scale Chemical-Protein Interactome Data. J. Chem. Inf. Model. 2015, 55, 2717–2730, DOI: 10.1021/acs.jcim.5b00330
  16. Martin, E. J.; Polyakov, V. R.; Zhu, X. W.; Tian, L.; Mukherjee, P.; Liu, X. All-Assay-Max2 pQSAR: Activity Predictions as Accurate as Four-Concentration IC50s for 8558 Novartis Assays. J. Chem. Inf. Model. 2019, 59, 4450–4459, DOI: 10.1021/acs.jcim.9b00375
  17. Maruhashi, K.; Todoriki, M.; Ohwa, T.; Goto, K.; Hasegawa, Y.; Inakoshi, H.; Anai, H. Learning Multi-way Relations Via Tensor Decomposition with Neural Networks. In Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
  18. Gaulton, A.; Bellis, L. J.; Bento, A. P.; Chambers, J.; Davies, M.; Hersey, A.; Light, Y.; McGlinchey, S.; Michalovich, D.; Al-Lazikani, B.; Overington, J. P. ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Res. 2012, 40, D1100–D1107, DOI: 10.1093/nar/gkr777
  19. Kim, S.; Thiessen, P. A.; Bolton, E. E.; Chen, J.; Fu, G.; Gindulyte, A.; Han, L.; He, J.; He, S.; Shoemaker, B. A.; Wang, J.; Yu, B.; Zhang, J.; Bryant, S. H. PubChem Substance and Compound databases. Nucleic Acids Res. 2016, 44, D1202–D1213, DOI: 10.1093/nar/gkv951
  20. Wishart, D. S.; Feunang, Y. D.; Guo, A. C.; Lo, E. J.; Marcu, A.; Grant, J. R.; Sajed, T.; Johnson, D.; Li, C.; Sayeeda, Z.; Assempour, N.; Iynkkaran, I.; Liu, Y.; Maciejewski, A.; Gale, N.; Wilson, A.; Chin, L.; Cummings, R.; Le, D.; Pon, A.; Knox, C.; Wilson, M. DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018, 46, D1074–D1082, DOI: 10.1093/nar/gkx1037
  21. Varma, M. V. S.; Feng, B.; Obach, R. S.; Troutman, M. D.; Chupka, J.; Miller, H. R.; El-Kattan, A. Physicochemical determinants of human renal clearance. J. Med. Chem. 2009, 52, 4844–4852, DOI: 10.1021/jm900403j
  22. Jahandideh-Tehrani, M.; Bozorg-Haddad, O.; Loaiciga, H. A. Application of particle swarm optimization to water management: an introduction and overview. Environ. Monit. Assess. 2020, 192, 281, DOI: 10.1007/s10661-020-8228-z
  23. (a) Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32, DOI: 10.1023/A:1010933404324
      (b) Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V. Scikit-learn: Machine learning in Python. J. Machine Learning Res. 2011, 12, 2825–2830
  24. Glorot, X.; Bordes, A.; Bengio, Y. Deep Sparse Rectifier Neural Networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011; pp 315–323.
  25. Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
  26. Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: a simple way to prevent neural networks from overfitting. J. Machine Learning Res. 2014, 15, 1929–1958
  27. Varoquaux, G.; Buitinck, L.; Louppe, G.; Grisel, O.; Pedregosa, F.; Mueller, A. Scikit-learn: Machine learning without learning the machinery. GetMobile: Mobile Comput. Commun. 2015, 19, 29–33, DOI: 10.1145/2786984.2786995
  28. Tang, H.; Mayersohn, M. A novel model for prediction of human drug clearance by allometric scaling. Drug Metab. Dispos. 2005, 33, 1297–1303, DOI: 10.1124/dmd.105.004143
  29. (a) Jones, R. D.; Jones, H. M.; Rowland, M.; Gibson, C. R.; Yates, J. W.; Chien, J. Y.; Ring, B. J.; Adkison, K. K.; Ku, M. S.; He, H. PhRMA CPCDC initiative on predictive models of human pharmacokinetics, part 2: comparative assessment of prediction methods of human volume of distribution. J. Pharm. Sci. 2011, 100, 4074–4089, DOI: 10.1002/jps.22553
      (b) Obach, R. S.; Baxter, J. G.; Liston, T. E.; Silber, B. M.; Jones, B. C.; Macintyre, F.; Rance, D. J.; Wastall, P. The prediction of human pharmacokinetic parameters from preclinical and in vitro metabolism data. J. Pharmacol. Exp. Ther. 1997, 283, 46–58
  30. Słoczyńska, K.; Gunia-Krzyzak, A.; Koczurkiewicz, P.; Wojcik-Pszczola, K.; Zelaszczyk, D.; Popiol, J.; Pekala, E. Metabolic stability and its role in the discovery of new chemical entities. Acta Pharm. 2019, 69, 345–361, DOI: 10.2478/acph-2019-0024
  31. Wakayama, N.; Toshimoto, K.; Maeda, K.; Hotta, S.; Ishida, T.; Akiyama, Y.; Sugiyama, Y. In Silico Prediction of Major Clearance Pathways of Drugs among 9 Routes with Two-Step Support Vector Machines. Pharm. Res. 2018, 35, 197, DOI: 10.1007/s11095-018-2479-1
  32. Poulin, P.; Dambach, D. M.; Hartley, D. H.; Ford, K.; Theil, F. P.; Harstad, E.; Halladay, J.; Choo, E.; Boggs, J.; Liederer, B. M.; Dean, B.; Diaz, D. An algorithm for evaluating potential tissue drug distribution in toxicology studies from readily available pharmacokinetic parameters. J. Pharm. Sci. 2013, 102, 3816–3829, DOI: 10.1002/jps.23670
  33. (a) Poulin, P.; Theil, F. P. A priori prediction of tissue: plasma partition coefficients of drugs to facilitate the use of physiologically-based pharmacokinetic models in drug discovery. J. Pharm. Sci. 2000, 89, 16–35, DOI: 10.1002/(SICI)1520-6017(200001)89:1<16::AID-JPS3>3.0.CO;2-E
      (b) Poulin, P.; Schoenlein, K.; Theil, F. P. Prediction of adipose tissue: plasma partition coefficients for structurally unrelated drugs. J. Pharm. Sci. 2001, 90, 436–447, DOI: 10.1002/1520-6017(200104)90:4<436::AID-JPS1002>3.0.CO;2-P
      (c) Poulin, P.; Krishnan, K. A biologically-based algorithm for predicting human tissue: blood partition coefficients of organic chemicals. Hum. Exp. Toxicol. 1995, 14, 273–280, DOI: 10.1177/096032719501400307
      (d) Berezhkovskiy, L. M. Volume of distribution at steady state for a linear pharmacokinetic system with peripheral elimination. J. Pharm. Sci. 2004, 93, 1628–1640, DOI: 10.1002/jps.20073
      (e) Rodgers, T.; Rowland, M. Physiologically based pharmacokinetic modelling 2: predicting the tissue distribution of acids, very weak bases, neutrals and zwitterions. J. Pharm. Sci. 2006, 95, 1238–1257, DOI: 10.1002/jps.20502
      (f) Schmitt, W. General approach for the calculation of tissue to plasma partition coefficients. Toxicol. In Vitro 2008, 22, 457–467, DOI: 10.1016/j.tiv.2007.09.010
  34. Tess, D. A.; Eng, H.; Kalgutkar, A. S.; Litchfield, J.; Edmonds, D. J.; Griffith, D. A.; Varma, M. V. S. Predicting the Human Hepatic Clearance of Acidic and Zwitterionic Drugs. J. Med. Chem. 2020, 63, 11831–11844, DOI: 10.1021/acs.jmedchem.0c01033
  35. Miyamoto, M.; Iwasaki, S.; Chisaki, I.; Nakagawa, S.; Amano, N.; Hirabayashi, H. Comparison of predictability for human pharmacokinetics parameters among monkeys, rats, and chimeric mice with humanised liver. Xenobiotica 2017, 47, 1052–1063, DOI: 10.1080/00498254.2016.1265160
  36. Russell, W.; Burch, R. The Principles of Humane Experimental Technique. Wheathampstead, Universities Federation for Animal Welfare: UK; 1959.

Cited By

This article is cited by 24 publications.

  1. Yuanyuan Zhang, Zhiyin Xie, Fu Xiao, Jie Yu, Zhehuan Fan, Shihui Sun, Jiangshan Shi, Zunyun Fu, Xutong Li, Dingyan Wang, Mingyue Zheng, Xiaomin Luo. Prediction of Multi-Pharmacokinetics Property in Multi-Species: Bayesian Neural Network Stacking Model with Uncertainty. Molecular Pharmaceutics 2024, 21 (12) , 6177-6192. https://doi.org/10.1021/acs.molpharmaceut.4c00406
  2. Leonid Komissarov, Nenad Manevski, Katrin Groebke Zbinden, Torsten Schindler, Marinka Zitnik, Lisa Sach-Peltason. Actionable Predictions of Human Pharmacokinetics at the Drug Design Stage. Molecular Pharmaceutics 2024, 21 (9) , 4356-4371. https://doi.org/10.1021/acs.molpharmaceut.4c00311
  3. Matthew Adrian, Yunsie Chung, Alan C. Cheng. Denoising Drug Discovery Data for Improved Absorption, Distribution, Metabolism, Excretion, and Toxicity Property Prediction. Journal of Chemical Information and Modeling 2024, 64 (16) , 6324-6337. https://doi.org/10.1021/acs.jcim.4c00639
  4. Koichi Handa, Saki Yoshimura, Michiharu Kageyama, Takeshi Iijima. Development of Novel Methods for QSAR Modeling by Machine Learning Repeatedly: A Case Study on Drug Distribution to Each Tissue. Journal of Chemical Information and Modeling 2024, 64 (9) , 3662-3669. https://doi.org/10.1021/acs.jcim.4c00046
  5. Franco Lombardo, Jörg Bentzien, Giuliano Berellini, Ingo Muegge. Prediction of Human Clearance Using In Silico Models with Reduced Bias. Molecular Pharmaceutics 2024, 21 (3) , 1192-1203. https://doi.org/10.1021/acs.molpharmaceut.3c00812
  6. Yaguo Gong, Wei Ding, Panpan Wang, Qibiao Wu, Xiaojun Yao, Qingxia Yang. Evaluating Machine Learning Methods of Analyzing Multiclass Metabolomics. Journal of Chemical Information and Modeling 2023, 63 (24) , 7628-7641. https://doi.org/10.1021/acs.jcim.3c01525
  7. Christopher E. Keefer, George Chang, Li Di, Nathaniel A. Woody, David A. Tess, Sarah M. Osgood, Brendon Kapinos, Jill Racich, Anthony A. Carlo, Amanda Balesano, Nicholas Ferguson, Christine Orozco, Larisa Zueva, Lina Luo. The Comparison of Machine Learning and Mechanistic In Vitro–In Vivo Extrapolation Models for the Prediction of Human Intrinsic Clearance. Molecular Pharmaceutics 2023, 20 (11) , 5616-5630. https://doi.org/10.1021/acs.molpharmaceut.3c00502
  8. Koichi Handa, Peter Wright, Saki Yoshimura, Michiharu Kageyama, Takeshi Iijima, Andreas Bender. Prediction of Compound Plasma Concentration–Time Profiles in Mice Using Random Forest. Molecular Pharmaceutics 2023, 20 (6) , 3060-3072. https://doi.org/10.1021/acs.molpharmaceut.3c00071
  9. Ion Brinza, Razvan Stefan Boiangiu, Iasmina Honceriu, Ahmed M. Abd-Alkhalek, Samir M. Osman, Omayma A. Eldahshan, Elena Todirascu-Ciornea, Gabriela Dumitru, Lucian Hritcu. Neuroprotective Potential of Origanum majorana L. Essential Oil Against Scopolamine-Induced Memory Deficits and Oxidative Stress in a Zebrafish Model. Biomolecules 2025, 15 (1) , 138. https://doi.org/10.3390/biom15010138
  10. Hiroaki Iwata. Transforming drug discovery: the impact of AI and molecular simulation on R&D efficiency. Bioanalysis 2024, 16 (23-24) , 1211-1217. https://doi.org/10.1080/17576180.2024.2437283
  11. Jae-Hee Kwon, Ja-Young Han, Minjung Kim, Seong Kyung Kim, Dong-Kyu Lee, Myeong Gyu Kim. Prediction of human pharmacokinetic parameters incorporating SMILES information. Archives of Pharmacal Research 2024, 47 (12) , 914-923. https://doi.org/10.1007/s12272-024-01520-2
  12. Ion Brinza, Razvan Stefan Boiangiu, Marius Mihasan, Dragos Lucian Gorgan, Alexandru Bogdan Stache, Ahmed Abd-Alkhalek, Heba El-Nashar, Iriny Ayoub, Nada Mostafa, Omayma Eldahshan, Abdel Nasser Singab, Lucian Hritcu. Rhoifolin, baicalein 5,6-dimethyl ether and agathisflavone prevent amnesia induced in scopolamine zebrafish (Danio rerio) model by increasing the mRNA expression of bdnf, npy, egr-1, nfr2α, and creb1 genes. European Journal of Pharmacology 2024, 984 , 177013. https://doi.org/10.1016/j.ejphar.2024.177013
  13. Miaoran Ning, Ma Fang, Kushal Shah, Vaishali Dixit, Devendra Pade, Helen Musther, Sibylle Neuhoff. A cross-species assessment of in silico prediction methods of steady-state volume of distribution using Simcyp simulators. Journal of Pharmaceutical Sciences 2024, 32 https://doi.org/10.1016/j.xphs.2024.12.018
  14. Xiaohua Lu, Liangxu Xie, Lei Xu, Rongzhi Mao, Xiaojun Xu, Shan Chang. Multimodal fused deep learning for drug property prediction: Integrating chemical language and molecular graph. Computational and Structural Biotechnology Journal 2024, 23 , 1666-1679. https://doi.org/10.1016/j.csbj.2024.04.030
  15. Houria Bentoumi, Abdeslem Bouzina, Aïcha Amira, Omar Sekiou, Djawhara Chohra, Loubna Ferchichi, Rachida Zerrouki, Nour-Eddine Aouf. Theoretical investigations of some isolated compounds from Calophyllum flavoramulum as potential antioxidant agents and inhibitors of AGEs. Journal of Biomolecular Structure and Dynamics 2024, 7 , 1-27. https://doi.org/10.1080/07391102.2024.2428375
  16. Dennis A Smith, Lucy Melanie Burton, Sophie Amanda Smith. Through a computer monitor darkly: artificial intelligence in absorption, distribution, metabolism and excretion science. Xenobiotica 2024, 54 (7) , 359-367. https://doi.org/10.1080/00498254.2023.2295361
  17. Davide Bassani, Neil John Parrott, Nenad Manevski, Jitao David Zhang. Another string to your bow: machine learning prediction of the pharmacokinetic properties of small molecules. Expert Opinion on Drug Discovery 2024, 19 (6) , 683-698. https://doi.org/10.1080/17460441.2024.2348157
  18. Mahnaz Ahmadi, Bahareh Alizadeh, Seyed Mohammad Ayyoubzadeh, Mahdiye Abiyarghamsari. Predicting Pharmacokinetics of Drugs Using Artificial Intelligence Tools: A Systematic Review. European Journal of Drug Metabolism and Pharmacokinetics 2024, 49 (3) , 249-262. https://doi.org/10.1007/s13318-024-00883-7
  19. Hiroaki Iwata, Yoshihiro Hayashi, Takuto Koyama, Aki Hasegawa, Kosuke Ohgi, Ippei Kobayashi, Yasushi Okuno. Feature extraction of particle morphologies of pharmaceutical excipients from scanning electron microscope images using convolutional neural networks. International Journal of Pharmaceutics 2024, 653 , 123873. https://doi.org/10.1016/j.ijpharm.2024.123873
  20. Hiroshi Komura, Reiko Watanabe, Kenji Mizuguchi. The Trends and Future Prospective of In Silico Models from the Viewpoint of ADME Evaluation in Drug Discovery. Pharmaceutics 2023, 15 (11) , 2619. https://doi.org/10.3390/pharmaceutics15112619
  21. Koichi Handa, Sakae Sugiyama, Michiharu Kageyama, Takeshi Iijima. Combined data-driven and mechanism-based approaches for human-intestinal-absorption prediction in the early drug-discovery stage. Digital Discovery 2023, 2 (5) , 1577-1588. https://doi.org/10.1039/D3DD00144J
  22. Koichi Handa, Seishiro Sakamoto, Michiharu Kageyama, Takeshi Iijima. Development of a 2D-QSAR Model for Tissue-to-Plasma Partition Coefficient Value with High Accuracy Using Machine Learning Method, Minimum Required Experimental Values, and Physicochemical Descriptors. European Journal of Drug Metabolism and Pharmacokinetics 2023, 48 (4) , 341-352. https://doi.org/10.1007/s13318-023-00832-w
  23. Hiroaki Iwata. Application of in Silico Technologies for Drug Target Discovery and Pharmacokinetic Analysis. Chemical and Pharmaceutical Bulletin 2023, 71 (6) , 398-405. https://doi.org/10.1248/cpb.c22-00638
  24. Olga Obrezanova. Artificial intelligence for compound pharmacokinetics prediction. Current Opinion in Structural Biology 2023, 79 , 102546. https://doi.org/10.1016/j.sbi.2023.102546

https://doi.org/10.1021/acs.jcim.2c00318
Published August 22, 2022

Copyright © 2022 The Authors. Published by American Chemical Society. This publication is licensed under CC-BY-NC-ND 4.0.

Figure 1. Workflow of our novel human CLtot and Vdss prediction method. (A) CLtot analysis flow. (i) There were 741 compounds with human CLtot data and 46 that had values for all 11 features. (ii) All feature values were estimated via prediction using ADMEWORKS. (iii) Feature extraction was performed using XGBoost or Random Forest, and a prediction model was constructed. (B) Vdss analysis flow. (i) There were 751 compounds with human Vdss data and 46 that had values for all 11 features. (ii) All feature values were estimated via prediction using ADMEWORKS. (iii) Feature extraction was performed using XGBoost or Random Forest, and a prediction model was constructed.

Figure 2. Overview of the multimodal Deep Tensor model.

  • Supporting Information

    The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.2c00318.

    • The data used in this paper are listed in Supporting_information_dataset.xlsx (PDF)

