C@PA: Computer-Aided Pattern Analysis to Predict Multitarget ABC Transporter Inhibitors

Based on literature reports of the last two decades, a computer-aided pattern analysis (C@PA) was implemented for the discovery of novel multitarget ABCB1 (P-gp), ABCC1 (MRP1), and ABCG2 (BCRP) inhibitors. C@PA included basic scaffold identification, substructure search and statistical distribution, as well as novel scaffold extraction to screen a large virtual compound library. Over 45,000 putative and novel broad-spectrum ABC transporter inhibitors were identified, from which 23 were purchased for biological evaluation. Our investigations revealed five novel lead molecules as triple ABCB1, ABCC1, and ABCG2 inhibitors. C@PA is the very first successful computational approach for the discovery of promiscuous ABC transporter inhibitors.

Computational approaches with respect to ABC transporter inhibition have been undertaken, 43,44 mostly focusing on selective inhibition of ABCB1, 45,46 ABCC1, 16 or ABCG2 47 individually. No approach took inhibitors of more than one ABC transport protein into account. However, this would revolutionize our understanding of ABC transporters as it could address two major aspects: (i) identification of structural requirements for simultaneous targeting of ABCB1, ABCC1, and ABCG2 and, vice versa, identification of structural features for selective inhibition of one of these transporters; and (ii) potentially deciphering common structural features to address other ABC transporters that cannot yet be targeted by small molecules. To pave the way for the discovery and development of novel broad-spectrum ABC transporter modulators, we implemented C@PA, a computer-aided pattern analysis, which is presented in this work.
■ RESULTS
Computational Analysis. Compilation of Data Set and Classification of Compounds. As a first step, 93 reports between 2004 and 2021 were collected in which the evaluation of small-molecule inhibitors of ABCB1, ABCC1, and ABCG2 was described. Reports that did not include biological investigations on all three transporters were not considered, as a subsequent classification of the compounds would fail due to missing activity value(s). The half-maximal inhibition concentration (IC 50 ) values of the compounds were considered as the major indicator of direct inhibition. Other biological data that were not based on tracing of (immediate) ABC transporter-mediated transport (e.g., by a fluorescence dye or a radionuclide) were not taken into account, as these surrogates [e.g., the half-maximal reversal concentrations (EC 50 ) obtained in MDR reversal assays] and their observed effects (e.g., the shift in toxicity of a co-administered antineoplastic agent) may not be (directly) linked to inhibition of transport activity alone but also to unspecific, non-ABC transporter-related targets. The IC 50 values were either extracted from tables as reported in the respective publication or estimated from relative inhibition (I rel ) values compared to the maximal inhibition exerted by a standard inhibitor (I max ). In the latter case, the IC 50 was categorized into <10 μM ("active") or ≥10 μM ("inactive"). The dataset including compound names and SMILES codes, inhibitory activity values against ABCB1, ABCC1, and ABCG2, used cell lines and testing systems, as well as the links to the corresponding literature can be found in Supplementary Table 1. In total, 1049 compounds were identified, which have been evaluated at least once regarding ABCB1, ABCC1, and ABCG2. In case a compound has been evaluated in more than one assay, the mean of the reported IC 50 values was taken for further analysis. 
In case one compound was evaluated with a definite number (e.g., 9.6 μM) and an estimation (e.g., >25 μM), the definite number was always given priority, while the estimated value was not considered. The same accounts for a compound that was classified as "inactive" in one assay and associated with a definite IC 50 value in another assay. If a range was given (e.g., 4−5 μM), the mean has been taken for further analysis (e.g., 4.5 μM). The dataset for ongoing analysis, including compound names and SMILES codes, can be found in Supplementary Table 2. In a next step, the compounds of Supplementary Table 2 were categorized into "active" [1 ("one"); IC 50 value <10 μM] and "inactive" [0 ("zero"); IC 50 values ≥10 μM]. As a result, 256 compounds were found to be active against ABCB1, while 793 were inactive. Concerning ABCC1, 147 were active, while 902 were found to be inactive. Finally, regarding ABCG2, 629 representatives were found as active, and 420 were inactive. Considering their activity profile against ABCB1, ABCC1, and ABCG2, the 1049 compounds were classified into the following eight classes (0−7): (i) class 0 consisted of 276 molecules that had no effect on either ABCB1, ABCC1, or ABCG2 (0, 0, 0); (ii) class 1 comprised 69 selective ABCB1 inhibitors (1, 0, 0); (iii) class 2 contained 58 selective ABCC1 inhibitors (0, 1, 0); (iv) class 3 included 435 selective ABCG2 inhibitors (0, 0, 1); (v) class 4 consisted of 17 dual ABCB1 and ABCC1 inhibitors (1, 1, 0); (vi) class 5 comprised 122 dual ABCB1 and ABCG2 inhibitors (1, 0, 1); (vii) class 6 contained 24 dual ABCC1 and ABCG2 inhibitors (0, 1, 1); and (viii) class 7 included 48 multitarget ABCB1, ABCC1, and ABCG2 inhibitors (1, 1, 1). Supplementary Table 3 provides all 1049 classified compounds with names and SMILES codes.
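The categorization and the eight-class scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and variable names are hypothetical (not taken from the original workflow), and only the 10 μM threshold, the class definitions, and the class sizes come from the text. The sketch also shows that the reported class sizes reproduce the reported per-transporter counts of active compounds.

```python
THRESHOLD_UM = 10.0  # activity threshold stated in the text (IC50 < 10 µM = active)

def activity_flag(ic50_um):
    """Return 1 ("active") if IC50 < 10 µM, otherwise 0 ("inactive")."""
    return 1 if ic50_um < THRESHOLD_UM else 0

# (ABCB1, ABCC1, ABCG2) activity pattern -> class number as listed in the text
CLASS_OF_PATTERN = {
    (0, 0, 0): 0, (1, 0, 0): 1, (0, 1, 0): 2, (0, 0, 1): 3,
    (1, 1, 0): 4, (1, 0, 1): 5, (0, 1, 1): 6, (1, 1, 1): 7,
}

def classify(ic50_abcb1, ic50_abcc1, ic50_abcg2):
    """Map three IC50 values (µM) to one of the classes 0-7."""
    pattern = (
        activity_flag(ic50_abcb1),
        activity_flag(ic50_abcc1),
        activity_flag(ic50_abcg2),
    )
    return CLASS_OF_PATTERN[pattern]

# Class sizes as reported in the text
CLASS_SIZES = {0: 276, 1: 69, 2: 58, 3: 435, 4: 17, 5: 122, 6: 24, 7: 48}

def actives_per_transporter(class_sizes):
    """Sum class sizes over all classes in which a transporter is flagged active."""
    totals = [0, 0, 0]
    for pattern, cls in CLASS_OF_PATTERN.items():
        for i, flag in enumerate(pattern):
            totals[i] += flag * class_sizes[cls]
    return tuple(totals)  # (ABCB1, ABCC1, ABCG2)
```

For example, a hypothetical compound with IC 50 values of 9.6, 25.0, and 4.5 μM against ABCB1, ABCC1, and ABCG2 would fall into class 5 (dual ABCB1 and ABCG2 inhibitor).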
Basic Scaffold Search and Statistical Substructure Analysis. Two main questions should be addressed to identify the critical fingerprints for triple ABCB1, ABCC1, and ABCG2 inhibition ("multitarget fingerprints"): (i) which basic scaffolds do the 48 compounds of class 7 have and (ii) what structural features must be present for promiscuity toward ABCB1, ABCC1, and ABCG2? To address the first question, a scaffold analysis of class 7 compounds was conducted using the Structure-Activity-Report (SAReport) tool. 48 Six basic scaffolds were identified: 4-anilinopyrimidine, pyrrolo[3,2-d]pyrimidine, pyrimido[5,4-b]indole, quinazoline, quinoline, and thieno[2,3-b]pyridine. Figure 2 visualizes these six basic scaffolds.
Conversely, 13 inhibitors could not be categorized, of which 11 did not have a heteroaromatic core structure. Of the other 2, one compound was the only representative of its structural class (thieno[2,3-b]pyrimidine). 16 The other compound (apatinib) contained a pyridine, 35 which was only present in three molecules and therefore did not constitute a heteroaromatic (basic) scaffold on its own according to the SAReport. 48 Nevertheless, two features of these 13 non-categorizable ABCB1, ABCC1, and ABCG2 inhibitors should be highlighted: (i) the thieno[2,3-b]pyrimidine and pyridine scaffolds could be subcategories of the thieno[2,3-b]pyridine and quinoline scaffolds, respectively; and (ii) 9 of the 13 compounds had either dimethoxyphenyl (3 compounds) or trimethoxyphenyl (6 compounds) partial structures, which could be markers for multitarget inhibition.
To address the second question as indicated above, a list of in total 308 partial structures was compiled that are commonly present in organic molecules 50 (names and SMILES codes can be found in Supplementary Table 4). The eight classes were screened against these 308 partial structures using the tool InstantJChem, 51 and the absolute statistical distribution of each partial structure was collected. The relative statistical distribution was calculated, which represented the percentage of occurrence of the corresponding partial structure within the respective class (0−7). As a next step, structural markers were searched for that clearly favored triple ABCB1, ABCC1, and ABCG2 inhibition. For this purpose, the relative statistical distribution was reorganized in five different groups: (i) group A represented the percentage of class 0 (inactive molecules); (ii) group B represented the summed percentages of classes 1− 3 (selective inhibitors); (iii) group C represented the summed percentages of classes 4−6 (dual inhibitors); (iv) group D represented the percentages of class 7 (triple inhibitors); and (v) group E was calculated from the sum of the percentages of classes 4−7 [dual and triple (= multitarget) inhibitors].
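The regrouping into groups A−E can be illustrated with a short Python sketch. It assumes that the relative distribution is the share of a substructure's hit molecules that falls into each class, so that the eight percentages sum to 100%; this reading is an interpretation of the text, not something confirmed by it. Function names and the example counts are hypothetical.

```python
def relative_distribution(class_counts):
    """class_counts: hit counts of one substructure in classes 0..7.

    Returns the percentage of the hit molecules found in each class
    (assumed interpretation; the percentages sum to 100).
    """
    total = sum(class_counts)
    if total == 0:
        return [0.0] * 8
    return [100.0 * count / total for count in class_counts]

def groups(class_counts):
    """Regroup the per-class percentages into groups A-E as defined in the text."""
    pct = relative_distribution(class_counts)
    group_c = sum(pct[4:7])           # classes 4-6: dual inhibitors
    group_d = pct[7]                  # class 7: triple inhibitors
    return {
        "A": pct[0],                  # class 0: inactive molecules
        "B": sum(pct[1:4]),           # classes 1-3: selective inhibitors
        "C": group_c,
        "D": group_d,
        "E": group_c + group_d,       # classes 4-7: multitarget inhibitors
    }

# Hypothetical counts for one substructure in classes 0..7 (illustration only)
example = groups([2, 1, 1, 2, 1, 1, 2, 10])
```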
Identification of Multitarget Fingerprints. From Supplementary Table 4, "clear positive hits" ("Positive Pattern") could be deduced. These were defined as follows: (i) the respective substructure must have occurred at least five times in the 1049 molecules of the dataset; (ii) group D must have accounted for at least 15% of the respective hit molecules; and (iii) the percentage of group D should have been at least the same as the percentage of group B. If the second point was fulfilled but the third was not, (iv) the percentage of group E must have been at least the percentage of group B. Applying these rules, nine substructures could be found as potential markers for triple ABCB1, ABCC1, and ABCG2 inhibition (Supplementary Table 4). A detailed analysis of the sulfoxide and sulfone hits revealed that none of the 1049 compounds contained a sulfoxide residue but only sulfones, of which sulfoxide is a substructure. Hence, we accepted only sulfone as a clear positive hit for triple ABCB1, ABCC1, and ABCG2 inhibition. The thieno[2,3-b]pyridine substructure, for its part, had already been found in the basic scaffold search.
Model Validation and Comparison to Classical Computational Approaches. Before screening of a large virtual compound library, the developed model for compound selection was validated by using a query search tool implemented in InstantJChem. 51 The 1049 compounds served as a validation data set for "Positive Patterns" (Screen 2) and "Negative Patterns" (Screen 3), which were applied as multitarget fingerprints. Applying these two multitarget fingerprints, 30 of the 48 triple ABCB1, ABCC1, ABCG2 inhibitors could be predicted, while 18 represented false negative hits. This equals a virtual hit rate ("Sensitivity") of 62.50%, while the prediction of true negatives ("Specificity") reached 90.81%. To assess the quality and potential superiority of C@PA to classical computational approaches, these results were compared to (i) the 2D similarity search using MACCS fingerprints 16 and (ii) pharmacophore modeling as already reported before. 16 For both approaches, one query molecule was chosen for each of the six basic scaffolds: (i) compound 1 as representative of the 4-anilinopyrimidines; 14 (ii) compound 4 as representative of the pyrrolo[3,2-d]pyrimidines; 17 (iii) compound 5 as representative of the pyrimido[5,4-b]indoles; 17 (iv) compound 6 as representative of the quinazolines; 28 (v) compound 7 as representative of the quinolines; 29 and (vi) compound 8 as representative of the thieno[2,3-b]pyridines. 32 The SMILES codes and inhibitory activities of compounds 1 and 4−8 can be found in Supplementary Tables 1 and 2. For similarity search, MACCS fingerprints were calculated and a Tanimoto coefficient (Tc) with a cutoff value of 0.8 was applied. As a result, the sensitivity of this approach was 43.75%, while the specificity reached 87.31%. Regarding the pharmacophore modeling, a flexible alignment of the stated compounds was performed applying MOE (Figure 4A). 49 Using the consensus methodology implemented in the Pharmacophore Query Editor, five pharmacophore features [(i−iv) F1−F4: aromatic/hydrophobic; and (v) F5: acceptor] were identified that were present in at least four of the six query molecules 1 and 4−8 (tolerance distance: 1.2 Å; threshold value: >50%; Figure 4B). The sensitivity of this approach reached 60.42%, while the specificity had a value of 44.46%. Table 1 gives the prediction values for each class and computational approach. As it turned out, C@PA combined the high sensitivity of the pharmacophore modeling with the high specificity of the similarity search and, moreover, slightly exceeded these values. As C@PA outperformed both approaches in the process of model validation, we felt confident to continue with large-scale virtual screening.
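The validation metrics can be reproduced with the standard definitions of sensitivity and specificity. The following Python sketch uses the counts stated in the text (30 true positives and 18 false negatives among the 48 class 7 compounds). The true-negative count is not given in the text; the value below is back-calculated from the reported 90.81% under the assumption that specificity refers to the 1001 compounds outside class 7, and should be read as an inference.

```python
def sensitivity(true_pos, false_neg):
    """Percentage of actual positives that were predicted as positive."""
    return 100.0 * true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Percentage of actual negatives that were predicted as negative."""
    return 100.0 * true_neg / (true_neg + false_pos)

N_TOTAL = 1049                     # compounds in the data set (from the text)
N_TRIPLE = 48                      # class 7 compounds (from the text)
N_NEGATIVE = N_TOTAL - N_TRIPLE    # assumed reference set for specificity

capa_sensitivity = sensitivity(true_pos=30, false_neg=18)

# Back-calculated (NOT stated in the source): true negatives implied by 90.81%
implied_true_neg = round(0.9081 * N_NEGATIVE)
implied_false_pos = N_NEGATIVE - implied_true_neg
capa_specificity = specificity(implied_true_neg, implied_false_pos)
```

Under this assumption, the reported specificity corresponds to roughly 909 correctly rejected compounds out of 1001.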
Virtual Screening, Selection Criteria, and Manual Candidate Selection. For the discovery of novel triple ABCB1, ABCC1, and ABCG2 inhibitors, the ENAMINE Diverse REAL drug-like compound library comprising 15,547,091 molecules was taken for virtual screening. 52 Three initial selection criteria were formulated: (i) the compound must have contained at least one of the six identified basic scaffolds (Screen 1: "Scaffold Search"; Figure  2); (ii) the compound must have contained at least one of the defined clear positive hits (Screen 2: "Positive Pattern"); and (iii) the compounds must not have been equipped with any of the clear negative hits (Screen 3: "Negative Pattern"). In total, 289,971 compounds had at least one basic scaffold. Amongst these, 73,575 candidates included at least one clear positive hit substructure, while 45,506 of them did not have any clear negative hit substructure. Furthermore, compounds were excluded if they did not have a partition coefficient (LogP) as well as molecular weight (MW) that stretched inside the span of LogP and MW of class 7 compounds (Screen 4; LogP span: 2.4−6.9; MW span: 295−915). This downsized the compound library to 25,060 potential multitarget ABCB1, ABCC1, and ABCG2 inhibitors.
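Screens 1−4 amount to a conjunction of simple filters, which can be sketched as follows. The `Candidate` record and its fields are hypothetical: they stand for precomputed substructure matches and descriptors, whereas the actual substructure searches in this work were run in InstantJChem. Treating the LogP and MW spans as inclusive is also an assumption. The funnel counts are those reported in the text.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """Hypothetical record holding precomputed screening results for one compound."""
    scaffolds: set        # basic scaffolds matched (Screen 1)
    positive_hits: set    # clear positive hit substructures matched (Screen 2)
    negative_hits: set    # clear negative hit substructures matched (Screen 3)
    logp: float
    mw: float

LOGP_SPAN = (2.4, 6.9)     # LogP span of class 7 compounds (from the text)
MW_SPAN = (295.0, 915.0)   # MW span of class 7 compounds (from the text)

def passes_screens(c):
    """Apply Screens 1-4; span boundaries are assumed to be inclusive."""
    return (
        bool(c.scaffolds)                            # Screen 1: at least one basic scaffold
        and bool(c.positive_hits)                    # Screen 2: at least one positive hit
        and not c.negative_hits                      # Screen 3: no negative hit
        and LOGP_SPAN[0] <= c.logp <= LOGP_SPAN[1]   # Screen 4: LogP within span
        and MW_SPAN[0] <= c.mw <= MW_SPAN[1]         # Screen 4: MW within span
    )

# Library sizes after each step, as reported in the text
FUNNEL = [
    ("Library", 15_547_091),
    ("Screen 1", 289_971),
    ("Screen 2", 73_575),
    ("Screen 3", 45_506),
    ("Screen 4", 25_060),
]

def stepwise_retention(funnel):
    """Percentage of compounds kept at each step relative to the previous step."""
    return [
        (name, 100.0 * count / funnel[i][1])
        for i, (name, count) in enumerate(funnel[1:])
    ]
```

With these numbers, Screen 1 alone retains less than 2% of the library, so the scaffold requirement is by far the most restrictive step.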
In order to obtain novel agents that had scaffolds not associated with simultaneous inhibition of ABCB1, ABCC1, and ABCG2 before, substructures of Supplementary Table 4 were emphasized that have not been part of any of the 1049 molecules, which was the case for 146 substructures. The focus of this work was to discover new heteroaromatic scaffolds as multitarget ABCB1, ABCC1, and ABCG2 inhibitors. Hence, out of the 146 novel substructures, 29 heteroaromatic scaffolds were chosen for the "Novel Scaffold Search" (Screen 5; Supplementary Table 5).
Table 1 footnote (fragment): The sensitivity ("virtual hit rate"; true positive hits) as well as the specificity (true negative hits) are highlighted at the very bottom of the table in a rose mark.
Regarding the basic scaffolds, these 1505 molecules comprised (i) 35 […] (26)] were emphasized since piperazine is often present as a linker in multitarget ABC transporter inhibitors. 16,17,53−55 In essence, these experience-based decisions as well as availability and price of the compounds led to the selection of 87 candidates, of which 43 were ordered from ENAMINE. Among these 43 compounds, 23 were available for delivery within the purity requirement of 95% (compounds 11−33; Supplementary Table 6) and were subjected to subsequent biological evaluation. Figure 3B summarizes the virtual screening processes of C@PA.
As can be seen from Figure 5A−C, 17, 5, and 18 of the 23 compounds showed an inhibitory activity against ABCB1 (A), ABCC1 (B), and ABCG2 (C), respectively, of over 20% [+ standard error of the mean (SEM)]. Hence, complete concentration-effect curves were generated to obtain IC 50 values for these compounds, which are summarized in Table 2. Compounds 15, 18, 21, 22, and 26 could be identified as triple ABCB1, ABCC1, and ABCG2 inhibitors and are depicted in Figure 6.
In addition, two compounds revealed a remarkable inhibitory power against ABCG2: the quinoline/1,2,4-oxadiazole/indole derivative 26 (IC 50 = 0.540 ± 0.150 μM; Figure 7) and the pyrimidine/1,2,4-oxadiazole/indole derivative 27 (IC 50 = 0.220 ± 0.020 μM; Figure 8). This is a special finding given that screenings usually do not provide compounds with very high activities. In particular, the results for compound 27 must be put into perspective, as its inhibitory power against ABCG2 equaled that of the "gold standard", compound 34. Hence, it represents a promising lead molecule for ongoing research.
Confirmation of Inhibitory Power of Compounds 15,18,21,22,26,and 27. In order to confirm the found results with respect to multitarget ABCB1, ABCC1, and ABCG2 inhibition of compounds 15,18,21,22, and 26 as well as ABCG2 inhibition of compound 27, Hoechst 33342 (ABCB1 and ABCG2), 15,57 and daunorubicin (ABCC1) 17 fluorescence accumulation assays have been performed as described previously 15,17,57 with minor modifications, using either ABCB1-overexpressing A2780/ADR, ABCC1-overexpressing H69AR, or ABCG2-overexpressing MDCK II BCRP cells. In short, Hoechst 33342 and daunorubicin are substrates of ABC transporters that passively diffuse into the cells and become extruded by the respective ABC transporter. ABC transporter inhibition leads to an intracellular accumulation of these fluorescence dyes. Hoechst 33342 intercalates with the DNA in the nucleus and accumulates in intracellular membrane bilayers, both leading to a fluorescent complex that could be detected using a microplate reader. On the other hand, daunorubicin is already fluorescent and has been evaluated via flow cytometry. In both assays, the measured fluorescence values correlated with the degree of inhibition of the respective transporter. Ten micromolar cyclosporine A and compound 34 have been used as references to define 100% inhibition of ABCB1 and ABCC1 as well as ABCG2, respectively. The data for the multitarget ABCB1, ABCC1, and ABCG2 inhibitors 15, 18, 21, 22, and 26 are summarized in Table 3.
Compounds 15, 18, 21, 22, and 26 could be confirmed as triple ABCB1, ABCC1, and ABCG2 inhibitors. Generally, the IC 50 values correlated with the values of the calcein AM (ABCB1 and ABCC1) and pheophorbide A (ABCG2) assays (Table 2). Only the IC 50 value of compound 26 determined in the daunorubicin assay (ABCC1) fell out of the correlation: at 0.764 μM, it was over 12 times lower than could have been expected from the calcein AM data. However, such discrepancies frequently occur as IC 50 values generally depend on the fluorescence dye used, in particular its polarity, lipophilicity, (velocity of) membrane distribution, and affinity to the respective transporter. 15
Table 2 footnote (fragment): ABCB1-overexpressing A2780/ADR, ABCC1-overexpressing H69AR, or ABCG2-overexpressing MDCK II BCRP cells in either calcein AM (ABCB1 and ABCC1) or pheophorbide A (ABCG2) assays were used. 14−16,56 The positive control (100%) was defined by the effect value of 10 μM cyclosporine A (ABCB1 and ABCC1) or compound 34 (ABCG2), while buffer medium served as a negative control (0%). Shown is the mean ± SEM of at least three independent experiments. Rose mark: discovered triple ABCB1, ABCC1, and ABCG2 inhibitors. a No IC 50 determined due to lack of inhibitory activity in the initial screening (Figure 5A−C).
Thiadiazoles have also only once been reported in association with selective, dual, or triple ABCB1, ABCC1, and ABCG2 inhibition. 66 However, these compounds had estimated IC 50 values of far beyond 25 μM. 66 Thieno[3,2-c]pyridines, on the other hand, have never been associated with either ABCB1, ABCC1, or ABCG2.
A biological hit rate of 21.7% is common for single-target computational approaches as reported in the literature that subsequently validated their postulations with biological studies. 67−71 However, this number is very impressive for multitarget screening studies, in particular considering the huge challenges involved in the development of C@PA. We identified four major aspects that impacted the model development.
The first aspect is related to the selection of molecules as basis for the development of C@PA. The amount of data that could be used was highly limited. We found only 93 reports containing 1049 compounds that qualified for data processing. Many compounds were not characterized in full by complete concentration-effect curves, and their IC 50 values had to be estimated for a proper compound categorization and classification. The data processing procedures in these 93 reports that led to the published IC 50 values were not standardized (e.g., three- vs four-parameter logistic equation). Some IC 50 values provided a limited number of significant digits or were not accompanied by standard deviations or standard errors. Furthermore, certain IC 50 values resulted from so-called "partial inhibitors" (IC 50 absolute vs IC 50 relative ). Additionally, the applied assay systems were not standardized and varied within the 93 reports. While the majority of testing systems were accumulation (uptake) assays, some findings were based on efflux experiments. Furthermore, the transporter host system varied in the reported biological studies. While the majority of authors used living cells, some used inside-out membrane vesicles, both of which influence compound distribution and binding, but also transporter abundance and functionality (e.g., pump rate). 24 The living cells for their part were either transfected or selected cells, which impacted the expression and abundance of (functional) transport protein and eventually the inhibitory activity against the respective transporter. More importantly, a great variety of fluorescence dyes has been used to assess the corresponding compounds. It is well known that inhibitory activity can be strongly dependent on the nature of the fluorescence dye [e.g., its polarity, lipophilicity, velocity of diffusion and distribution, as well as affinity toward the transporter(s)]. 15,24,58 Moreover, fluorescence measurements themselves pose a risk of artifacts, which can be explained by secondary effects like quenching (with each other or with the evaluated compounds). This can be circumvented by the use of other types of measurements, like radioactivity counts in radionuclide studies. However, this kind of testing system has only been used by a minority of authors. Finally, it must be taken into consideration that the 93 reports came from various laboratories with different non-standardized assay procedures, resulting in the very same assay being executed in various manners. Taken together, the errors of these individual data-related aspects compounded and contributed to the final uncertainty of C@PA's prediction capabilities.
Table 3 footnote (fragment): 15,17,57 Cyclosporine A (10 μM; ABCB1 and ABCC1) and compound 34 (10 μM; ABCG2) were used as a reference for 100% inhibition, and buffer medium represented 0%. Shown is the mean ± SEM of at least three independent experiments.
Journal of Medicinal Chemistry
pubs.acs.org/jmc Article
The second major aspect stems from the initial categorization of compounds into "active" and "inactive". The "activity threshold" has been set to 10 μM. A threshold in general adversely affects compounds close to the chosen value, which inevitably leads to misclassifications. However, only 19 (ABCB1), 24 (ABCC1), and 42 (ABCG2) so-called "borderline-compounds", where the IC 50 ± SD/SEM values either overlapped with the threshold of 10 μM, or were defined as "around 10 μM" (∼10) or exactly 10 μM (10.000), have been identified from Supplementary Table 1. Hence, the problem of miscategorization of the compounds is rather negligible. Although the value of 10 μM seems to be quite high, one must take into account that broad-spectrum ABCB1, ABCC1, and ABCG2 inhibitors almost always exert their effect in the single- to double-digit micromolar concentration range. 
As stated out in the Introduction, only about 50 triple ABCB1, ABCC1, and ABCG2 inhibitors exerted their effect below 10 μM, 14−17,21,23,25−42 and only 22 of them had activities below 5 μM. 14,15,21,23,25,26,28,32,34,37−39 Setting the threshold to higher activities (lower IC 50 values) would have led to a radical downsizing of the data set. This would not have left enough space for action and interpretation. Even setting the threshold at 10 μM allowed only for 48 triple ABCB1, ABCC1, and ABCG2 inhibitors to be considered as a basis of scaffold analysis and the following computational measures. A higher threshold would have led to a larger number of triple ABCB1, ABCC1, and ABCG2 inhibitors, but this would have led to the inclusion of rather inactive compounds. In addition, IC 50 values above 10 μM imply that the necessary test concentrations were much higher (up to 100 μM or more). At these concentration ranges, compound-related assay interferences (e.g., solubility problems, solvent effects, short-term cell toxicity, fluorescence quenching, or unspecific binding) are much more likely to have occurred. Hence, compounds with activities above 10 μM could not be considered as "active". However, it must also be stated that, due to the 10 μM threshold, the value distribution after compound categorization and classification was rather unequal. This can be seen, for example, when comparing selective ABCG2 inhibitors (class 3) with 435 representatives, and dual ABCB1 and ABCC1 inhibitors (class 4) with 17 compounds. This mainly depended on the literature itself and could not be influenced.
The third major aspect was the data processing and the definition of selection criteria. A virtual hit rate of 62.5% is above average; however, the model was not able to predict all 48 triple ABCB1, ABCC1, and ABCG2 inhibitors, although its selection criteria were in part deduced from these. In terms of the selection criteria, it must generally be considered that selectivity and promiscuity are not discrete but continuous attributes of compounds. Statistically speaking, there is a fluent border between both attributes. Molecules consist mostly of several partial structures that for their part can independently or collectively interact with the target(s) leading to selectivity or promiscuity. This ambivalent characteristic of individual substructures can lead to the fact that these are present in both single-or multitarget inhibitors. Our aim was to define unambiguous selection criteria, or at least as close to this as possible. This explains why many substructures present in the triple inhibitors could not be acknowledged for the prediction of the very same triple inhibitors from the data set of 1049 compounds. Inclusion of these discriminated partial structures would inevitably have led to the prediction of more false positive hits and a decreased biological hit rate. To avoid a "randomization" of the model, we chose 15% as the threshold for the selection of clear positive hits. This threshold allowed for the selection of a sufficient number of substructures as clear positive hits. A higher percentage almost eliminates these clear positive hits, while a lower percentage results in the selection of less pronounced multitarget substructures (leading to more false positive hits). On the other hand, this number of 15% implies that the residual 85% of molecules contained dually active, selective, or even inactive compounds. This imbalance posed in our point of view the highest impact on the development of C@PA. 
Furthermore, novel scaffolds (Screen 5) were chosen that have never been reported before regarding the ABC transporters ABCB1, ABCC1, and ABCG2 according to the initial data set of 1049 compounds (Supplementary Table 4). Selecting for these 29 novel heteroaromatic scaffolds carried an inherent risk of lowering the biological hit rate. However, as the task of this investigation was to identify novel heteroaromatic scaffolds and molecules, stepping into this unknown territory was obligatory. Finally, the manual selection also posed a risk of faulty selection. As outlined above, these criteria were mainly based on our experience with ABC transporter inhibitors. 16,17,27−29,55 C@PA benefited from these experience-driven decisions, as the following shows: (i) a strong focus was put on individual substituents like fluorine, chlorine, cyano, or methoxy, especially in combination. Strikingly, 6 of the 23 compounds had such a combination (17−18, 21, 26, 31, and 33), among which were three triple ABCB1, ABCC1, and ABCG2 inhibitors (18, 21, and 26; 50.0%). More strikingly, almost all (15, 18, 21, and 26; 80.0%) of the triple inhibitors had at least one such substructure. Moreover, when turning the focus on dual and triple (= multitarget) inhibitors of ABC transporters, 71.4% (10 out of 14) had at least one of these substructures; (ii) the partial structures piperazine (22), homopiperazine (18), and piperidine (26) were reflected in the five multitarget ABCB1, ABCC1, and ABCG2 inhibitors (60.0%). Hence, we conclude that the manual selection supported rather than impaired the model and contributed to the finding of multitarget ABCB1, ABCC1, and ABCG2 inhibitors.
The fourth and final major aspect is the target variety. Multitarget inhibition was the focus of the present study. As ABCB1, 2 ABCC1, 3 and ABCG2 4 have their individual "preferences" regarding inhibitors, finding simultaneously interfering agents is quite an obstacle, which distinguishes C@PA from other approaches in the literature. 67−71 Compound characteristics such as lipophilicity or MW can inversely correlate with the inhibition of the respective transporter, therefore complicating the search for a multitarget inhibitor. This initially raised the question of whether a rational approach to obtain novel multitarget ABCB1, ABCC1, and ABCG2 inhibitors was possible at all. Despite these multifaceted challenges, the model proved that it is generally possible to predict broad-spectrum ABCB1, ABCC1, and ABCG2 inhibitors after processing of literature data and identification of critical fingerprints. This can be perceived not only from the finding of five novel multitarget ABCB1, ABCC1, and ABCG2 inhibitors but also from the discovery of nine dual ABCB1 and ABCG2 inhibitors (13, 16, 17, 23−25, 27, 30, and 32). Consequently, 60.9% of the selected 23 molecules were multitarget inhibitors of ABC transporters. Although dual inhibition was not in the scope of the present study, it must be acknowledged that these numbers mean that suitable molecular patterns were extracted for multitarget ABCB1, ABCC1, and ABCG2 inhibition. In addition, two major achievements are that (i) the 1,2,4-oxadiazole moiety can be suggested as the seventh basic scaffold for triple ABCB1, ABCC1, and ABCG2 inhibition, and (ii) the fluorine, chlorine, methoxy, and cyano substructures, as well as the piperazine, homopiperazine, and piperidine linkers can, in association with multitarget ABC transporter inhibition, at least be considered as "secondary positive patterns". 
Both achievements complement the multitarget fingerprints and will be of use when improving C@PA's prediction capabilities (e.g., as C@PA_1.2).
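The hit rates quoted in this discussion follow directly from the compound counts given in the text, as the short Python check below illustrates. The helper name is hypothetical; all counts are taken from the text.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100.0 * part / whole, 1)

N_TESTED = 23   # purchased and biologically evaluated compounds
N_TRIPLE = 5    # triple inhibitors (15, 18, 21, 22, and 26)
N_DUAL = 9      # dual ABCB1 and ABCG2 inhibitors

biological_hit_rate = percent(N_TRIPLE, N_TESTED)            # triple inhibitors only
multitarget_rate = percent(N_TRIPLE + N_DUAL, N_TESTED)      # dual and triple inhibitors
substituent_rate = percent(10, N_TRIPLE + N_DUAL)            # multitarget hits with F, Cl, CN, or OMe
```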
C@PA provides the unique opportunity to shift the methodology to discover multitarget ABCB1, ABCC1, and ABCG2 inhibitors from "serendipity" to "rationale". Gaining novel multitarget inhibitors is no longer a matter of luck but of statistics, and C@PA also proved to be highly efficient compared to other computational approaches, such as similarity search and pharmacophore modeling. Remarkably, considering that common motifs within the ABC transporter superfamily exist, C@PA also provides the unique chance to predict and discover novel agents that target understudied ABC transporters that cannot yet be addressed by small molecules. Finally, this methodology may be transferred to other protein families as well, thereby advancing drug development in other scientific areas in general.

■ EXPERIMENTAL SECTION
Computational Analysis. Compilation of Data Set and Categorization of Compounds. Literature research to find and assemble inactive, selective, dual, and triple inhibitors of the ABC transport proteins ABCB1, ABCC1, and ABCG2 was conducted using the National Center for Biotechnology Information (NCBI). 72 Reports were only considered when they either presented simultaneous testing at ABCB1, ABCC1, and ABCG2, or the respective compound had been evaluated regarding ABCB1, ABCC1, and ABCG2 in several individual reports. SMILES codes (isomeric if applicable) were either obtained from PubChem, 72 manually assembled from associated content and supplementary material as provided by the respective report, or manually drawn according to the 2D representations provided by the corresponding report using ChemDraw Pro [version 17.1.0.105 (19)]. Determined IC 50 values and deviations were assembled as reported in the respective literature with reference to the applied testing system (detection method and host system; Supplementary Table 1). Where the IC 50 had to be estimated from relative inhibition data, the concentration of the respective compound used and its effect relative to a standard ABCB1, ABCC1, and ABCG2 inhibitor were taken into account to categorize the corresponding compound into "active" (estimated IC 50 value <10 μM) or "inactive" (estimated IC 50 value ≥10 μM). In total, 1049 compounds from 93 reports between 2004 and 2020 were taken into account for further data processing. The associated original literature is also provided in Supplementary Table 1. For compound categorization, the assembled data in Supplementary Table 1 were fused to associate one compound with one single IC 50 value (Supplementary Table 2). In the case of two reported IC 50 values or a given IC 50 span, the mean was calculated. In case of defined and estimated IC 50 values, the defined value was given priority. 
In the case of activity (IC 50 present) and inactivity (IC 50 not present), the defined IC 50 value was given priority. Compounds with IC 50 values below 10 μM were considered as active (1, "one"), others as inactive (0, "zero"). The data provided in Supplementary Table 2 Table 3).
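The fusing and categorization rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to build Supplementary Table 2: the function names `merge_to_label` and `activity_fingerprint`, and the representation of an estimated value as a boolean, are hypothetical.

```python
from statistics import mean

ACTIVITY_CUTOFF_UM = 10.0  # activity threshold stated in the text (IC50 < 10 µM)


def merge_to_label(defined_ic50s, estimated_active=None):
    """Collapse all reports for one compound and one transporter into 1 or 0.

    defined_ic50s: measured IC50 values in µM (hypothetical input; may be empty).
    estimated_active: True/False derived from relative inhibition data, or None.
    Returns 1 (active), 0 (inactive), or None if no data are available.
    """
    if defined_ic50s:
        # Defined values take priority; several values are averaged.
        return 1 if mean(defined_ic50s) < ACTIVITY_CUTOFF_UM else 0
    if estimated_active is not None:
        return 1 if estimated_active else 0
    return None


def activity_fingerprint(abcb1, abcc1, abcg2):
    """Bundle the three per-transporter labels into one tuple (assumed layout)."""
    return (abcb1, abcc1, abcg2)
```

For example, a compound with defined IC 50 values of 4 and 8 μM at one transporter is labeled 1, whereas one with values of 4 and 20 μM (mean 12 μM) is labeled 0.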
Basic Scaffold Search and Statistical Substructure Analysis. The Structure-Activity-Report (SAReport) tool 48
Identification of Multitarget Fingerprints. "Clear positive hits" as indicators for triple ABCB1, ABCC1, and ABCG2 inhibition were defined as follows: (i) the respective substructure must have appeared at least five times within the 1049 molecules; (ii) group D must be at least 15%; and either (iii) group D must be at least equal to group B, or (iv) group E must be at least equal to group B. "Clear negative hits" were defined as follows: (i) the respective substructure must have appeared at least five times amongst the 1049 molecules; (ii) the respective substructure must not account for class 7 compounds (group D must be 0%); and (iii) group B must be at least equal to group C.
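The hit criteria can be expressed as two small predicates. The sketch below takes the occurrence count and the group percentages of a substructure as given inputs; how groups B to E are computed is not restated here, and the function and parameter names are hypothetical.

```python
MIN_OCCURRENCES = 5     # criterion (i) for both hit types
MIN_GROUP_D_PCT = 15.0  # criterion (ii) for clear positive hits


def is_clear_positive(count, group_b, group_d, group_e):
    """Return True if a substructure meets the 'clear positive hit' rules.

    count: occurrences of the substructure among the 1049 molecules.
    group_b, group_d, group_e: group percentages (assumed to be 0-100).
    """
    return (
        count >= MIN_OCCURRENCES
        and group_d >= MIN_GROUP_D_PCT
        and (group_d >= group_b or group_e >= group_b)
    )


def is_clear_negative(count, group_b, group_c, group_d):
    """Return True if a substructure meets the 'clear negative hit' rules."""
    return count >= MIN_OCCURRENCES and group_d == 0.0 and group_b >= group_c
```

A substructure can satisfy neither predicate, in which case it is simply not used as a positive or negative pattern in this sketch.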
Model Validation and Comparison to Common Computational Approaches. Model validation for C@PA was conducted by applying Screen 2 ("Positive Pattern") and Screen 3 ("Negative Pattern") using a query search tool implemented in InstantJChem. 51 The 2D similarity search was performed using the MACCS fingerprints as implemented in MOE. 49 This MACCS fingerprint contains 166 structural keys indicating the presence of specified structural fragments in the molecular graph representation. The similarity between the six selected query inhibitors 1 and 4−8 and the 1049 molecules in the dataset was measured using the Tanimoto coefficient (Tc) with a cutoff value of 0.8. For the pharmacophore model, the six selected query inhibitors were aligned using the flexible alignment tool implemented in MOE 49 as described before in detail. 16 The best alignment was used to generate the pharmacophore model using the consensus methodology implemented in the Pharmacophore Query Editor. In total, 196,439 conformers for the 1049 molecules in the dataset were generated using the conformational search tool in MOE 49 by applying the stochastic search method with a conformation limit of 10,000. The threshold for the identification of multitarget pharmacophore features was set at 50.0% with a tolerance value of 1.2.
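For orientation, the Tanimoto comparison used in the 2D similarity search can be written out directly. The sketch below is not the MOE implementation: it represents each fingerprint as a set of "on" key indices, and it assumes that a dataset molecule counts as a hit when its Tc to at least one query reaches the cutoff (whether the cutoff is inclusive is also an assumption).

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def similar_to_any(query_fps, candidate_fp, cutoff=0.8):
    """True if the candidate reaches the cutoff against at least one query."""
    return any(tanimoto(q, candidate_fp) >= cutoff for q in query_fps)
```

Two fingerprints sharing four keys out of five distinct keys in total give Tc = 4/5 = 0.8 and would therefore just pass under the inclusive reading.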
Virtual Screening, Selection Criteria, and Manual Candidate Selection. The ENAMINE Diverse REAL drug-like database was downloaded 52 and screened for compounds with (i) at least one basic scaffold, (ii) at least one clear positive hit, (iii) no clear negative hit, (iv) a LogP and MW that fell within the range of class 7 compounds (LogP span: 2.4−6.9; MW span: 295−915), and (v) at least one "novel scaffold". LogP and MW were calculated using MOE (version 2019.01). 49 In total, 1505 potential candidates resulted. Of these, 87 were manually selected by experience-driven decisions depending on availability and price; 41 of them were ordered from ENAMINE, and 23 were delivered within the purity requirement of 95%. All compounds were screened for substructures present in pan-assay interference compounds (PAINS) and did not contain any of these motifs. 73 The identities of compounds 11−14, 16−19, and 21−32 were determined by ENAMINE via 1 H NMR spectroscopy. Compounds 15, 20, and 33 were analyzed in our laboratory using a Bruker Avance 500 MHz spectrometer. All NMR spectra were recorded in DMSO-d 6 , and chemical shifts (δ) are expressed in ppm calibrated to the solvent signal of DMSO (δ: 2.50 ppm). Spin multiplicities of compounds 11−33 are given as singlet (s), doublet (d), doublet of doublets (dd), doublet of triplets (dt), and multiplet (m). The purity of compounds 11−33 was determined by ENAMINE via LC-MS analysis and stated as at least 96% pure. The complete analytical assessment of the compounds is provided in the Supporting Information.
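The five screening criteria (i) to (v) combine into a single filter. The following sketch assumes that the substructure counts and the LogP and MW values have already been computed for each library compound; the `Candidate` record, its field names, and the inclusive treatment of the span boundaries are hypothetical.

```python
from dataclasses import dataclass

LOGP_SPAN = (2.4, 6.9)    # LogP span of class 7 compounds, from the text
MW_SPAN = (295.0, 915.0)  # MW span of class 7 compounds, from the text


@dataclass
class Candidate:
    basic_scaffolds: int  # number of matched basic scaffolds
    positive_hits: int    # number of matched clear positive hits
    negative_hits: int    # number of matched clear negative hits
    logp: float
    mw: float
    novel_scaffolds: int  # number of matched "novel scaffolds"


def passes_screen(c: Candidate) -> bool:
    """Apply criteria (i)-(v); span boundaries are assumed to be inclusive."""
    return (
        c.basic_scaffolds >= 1
        and c.positive_hits >= 1
        and c.negative_hits == 0
        and LOGP_SPAN[0] <= c.logp <= LOGP_SPAN[1]
        and MW_SPAN[0] <= c.mw <= MW_SPAN[1]
        and c.novel_scaffolds >= 1
    )
```

Applying such a filter to every library entry, for example with `[c for c in library if passes_screen(c)]`, yields the candidate list that is then reduced by manual selection.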
Calcein AM Assay. Calcein AM assays to assess inhibitory activity against ABCB1 and ABCC1 were applied as described earlier. 14−16,56 Twenty microliters of either 50 or 100 μM of compounds 11−33 were added to a 96-well flat-bottom clear plate (Greiner, Frickenhausen, Germany) and complemented with 160 μL of cell suspension containing either A2780/ADR or H69AR cells at a density of 30,000 and 60,000 cells/well, respectively. After incubation (5% CO 2 -humidified atmosphere; 37°C) for 30 min, calcein AM (3.125 μM; 20 μL) was added to each well followed by immediate measurement of fluorescence increase (excitation: 485 nm; emission: 520 nm; interval: 60 s; duration: 1 h) using POLARstar and FLUOstar Optima microplate readers (BMG Labtech, software versions 2.00R2/2.20 and 4.11-0; Offenburg, Germany). Slopes from the linear fluorescence increase were calculated and compared to the respective slopes of the standard inhibitors. To determine IC 50 values, in-depth concentration-effect curves were generated by plotting the slopes against several logarithmic concentrations of the tested compounds. Data analysis was performed using GraphPad Prism (version 8.4.0, San Diego, CA, USA) using the statistically preferred model (three- or four-parameter logistic equation).
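The two numerical steps of this analysis, the slope of the linear fluorescence increase and the logistic concentration-effect model, can be illustrated as follows. This is a sketch and not the GraphPad Prism routine: the function names are hypothetical, the slope is obtained by ordinary least squares, and the logistic model is written in its concentration form for an effect that rises with concentration. Fixing the Hill slope at 1 gives the three-parameter variant.

```python
def linear_slope(times, signals):
    """Ordinary least-squares slope of fluorescence versus time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_s = sum(signals) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times, signals))
    return sxy / sxx


def four_param_logistic(conc, bottom, top, ic50, hill=1.0):
    """Effect at a given concentration (conc > 0) for a rising logistic curve.

    bottom/top: lower and upper plateaus; ic50: concentration at the midpoint;
    hill: Hill slope (hill=1.0 corresponds to the three-parameter model).
    """
    return bottom + (top - bottom) / (1.0 + (ic50 / conc) ** hill)
```

At `conc == ic50` the model returns the midpoint between the two plateaus, which is the defining property of the IC 50 in this parameterization.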
Pheophorbide A Assay. The pheophorbide A assay to assess inhibitory activity against ABCG2 was applied as described earlier. 14−16 Compound and cell preparation was conducted as described above. In total, 45,000 cells in a 160 μL suspension were pipetted into flat-bottom clear 96-well plates (Thermo Scientific, Rochester, NY, USA) after 20 μL of the respective compound dilution had been added. A pheophorbide A solution (20 μL; 5 μM) was added, and the reaction mixture was incubated for 120 min (5% CO 2 -humidified atmosphere; 37°C). Subsequently, the intracellular fluorescence was detected via flow cytometry (Guava easyCyte HT, Merck Millipore, Billerica, MA, USA) at an excitation wavelength of 488 nm and an emission wavelength of 695/50 nm. The absolute fluorescence values were compared to the effect caused by the standard ABCG2 inhibitor compound 34. Relative inhibition and IC 50 values were determined as described above.
Hoechst 33342 Assay. To confirm the inhibitory effect of compounds 15, 18, 21, 22, and 26 against ABCB1 and ABCG2, as well as compound 27 against ABCG2, a Hoechst 33342 assay was performed as described earlier. 15,57 Twenty microliters of the dilutions of the compounds in KHB were pipetted into black 96-well plates (Greiner, Frickenhausen, Germany). Cells were processed as described before, and approximately 30,000 cells were seeded into the plates in 160 μL per well. After a 30 min incubation period at 37°C and 5% CO 2 , 20 μL of Hoechst 33342 solution (10 μM) was added, resulting in a final Hoechst 33342 concentration of 1 μM. Fluorescence intensity was assessed in 60 s intervals over a period of 120 min at an excitation wavelength of 355 nm and an emission wavelength of 460 nm using microplate readers (POLARstar and FLUOstar Optima by BMG Labtech, Offenburg, Germany). The average fluorescence values at the steady state were calculated for each concentration and plotted against the logarithm of the compound concentration. Relative inhibition and IC 50 values were determined as described above.
Daunorubicin Accumulation Assay. For further confirmation of the inhibitory potency of triple inhibitors on ABCC1, a daunorubicin accumulation assay was applied as described before with minor modifications. Dilution series of test compounds and cell culture were performed as described for the calcein AM assay. To 20 μL of the test compounds in different concentrations in a clear flat-bottom 96-well plate (Thermo Scientific, Waltham, MA, USA), 160 μL of the cell suspension containing approximately 45,000 H69AR cells in colorless culture medium without supplements was added. Then, 20 μL of a 30 μM daunorubicin solution was pipetted into the mixture, which was incubated for 180 min protected from light at 37°C in a 5% CO 2 -humidified atmosphere. Fluorescence was measured by flow cytometry (Guava easyCyte HT) at a 488 nm excitation wavelength and 695/50 nm emission wavelength. Data analysis was performed as described before. Relative inhibition and IC 50 values were determined as described above.
Table 4); and molecular formula strings of "Screen 5" compounds before manual selection (Supplementary Table 5