Evaluation and Characterization of Trk Kinase Inhibitors for the Treatment of Pain: Reliable Binding A ﬃ nity Predictions from Theory and Computation

: Optimization of ligand binding a ﬃ nity to the target protein of interest is a primary objective in small-molecule drug discovery. Until now, the prediction of binding a ﬃ nities by computational methods has not been widely applied in the drug discovery process, mainly because of its lack of accuracy and reproducibility as well as the long turnaround times required to obtain results. Herein we report on a collaborative study that compares tropomyosin receptor kinase A (TrkA) binding a ﬃ nity predictions using two recently formulated fast computational approaches, namely, Enhanced Sampling of Molecular dynamics with Approximation of Continuum Solvent (ESMACS) and Thermodynamic Integration with Enhanced Sampling (TIES), to experimentally derived TrkA binding a ﬃ nities for a set of P ﬁ zer pan-Trk compounds. ESMACS gives precise and reproducible results and is applicable to highly diverse sets of compounds. It also provides detailed chemical insight into the nature of ligand − protein binding. TIES can predict and thus optimize more subtle changes in binding a ﬃ nities between compounds of similar structure. Individual binding a ﬃ nities were calculated in a few hours, exhibiting good correlations with the experimental data of 0.79 and 0.88 from the ESMACS and TIES approaches, respectively. The speed, level of accuracy, and precision of the calculations are such that the a ﬃ nity predictions can be used to rapidly explain the e ﬀ ects


■ INTRODUCTION
The availability of computational methods that can reliably, rapidly, and accurately predict the binding affinities of ligands to a target protein of interest would greatly facilitate drug discovery programs by enabling project teams to more effectively triage design ideas and therefore synthesize only those compounds with a high probability of being pharmacologically active. The overall effect of this approach would be to reduce the number of "design−synthesis−test" cycles needed to generate compounds of sufficient bioactivity to progress to the clinic. Until recently, in silico methods of binding affinity prediction have not been regarded as reliable enough to produce such actionable results. The American Chemical Society's Cross-Pharmaceutical Industry group has been discussing this particular challenge of conducting elaborative studies and has come out with an opinion piece recently, in which it is specifically noted that "coordinated, blinded prediction challenges offer the best opportunity to develop broad understanding and broadly applicable methods". 1 The Coveney group has now developed a suite of computational methods that deliver rapid, accurate, precise, and reliable binding affinity predictions. We report here a prospective computational study on a then-ongoing project at Pfizer 2 to assess the effectiveness of these methods based on the use of a Binding Affinity Calculator (BAC) software tool and associated services. 3 The approach makes use of an automated workflow 4 running in a high-performance computing environment that builds models, runs large numbers of calculations, and analyzes the output data in order to place reliable error bounds on predicted ligand binding affinities. In order to assess the reliability of these methods, the predictions were performed blind at UCL and subsequently compared with the experimentally determined TrkA binding affinity data. 2 For a new method being pursued by an industrial computational chemist, these types of live blinded individual project team collaborations are the best kinds of pilot studies that can be performed.
Tropomyosin receptor kinase A (TrkA) is used as the target protein in this study. TrkA is the cognate receptor of the nerve growth factor (NGF) neuropeptide, a neurotrophic factor involved in the regulation of growth, maintenance, proliferation, and survival of certain target neurons. 5 Preclinical and clinical studies have identified a crucial role for NGF in the pathogenesis of pain; the clinical efficacy of anti-NGF monoclonal antibody (mAb) therapies against several pain end points is well-documented, and preclinically the inhibition of the TrkA kinase domain by small-molecule kinase inhibitors has been shown to reverse the effects of NGF-mediated pain transduction. 6,7 In view of the key role of NGF in modulating pain, there is significant interest in the clinical development of small-molecule TrkA inhibitors to complement anti-NGF mAb treatment options.
The UCL group has recently introduced new approaches termed Enhanced Sampling of Molecular dynamics with Approximation of Continuum Solvent (ESMACS) and Thermodynamic Integration with Enhanced Sampling (TIES) 8 for the reliable prediction of ligand−protein free energies. The ESMACS approach centers on the molecular mechanics Poisson−Boltzmann surface area (MMPBSA) method, 9 which employs a continuum approximation for the aqueous solvent, while TIES is based on thermodynamic integration. The approaches emphasize the necessity of invoking ensemble-based sampling to reliably compute macroscopic quantities using microscopic modeling methods. 10 The BAC tool 3 is employed to automate much of the complexity of running and marshalling the required molecular dynamics simulations as well as collecting and analyzing data. The BAC tool utilized to perform the ESMACS and TIES studies

Journal of Chemical Information and Modeling
Article constitutes a computational pipeline built from an array of software tools and services and requires access to suitable computing resources. Integration and automation are central to the reliability of the method, ensuring that the results are reproducible and can be delivered rapidly.

■ COMPUTATIONAL METHODS
The TrkA binding affinities of the compounds in Table 1 were calculated using ESMACS and TIES. These compounds were picked as representatives of the full series previously studied experimentally, 2 covering the dynamic range of the TrkA pharmacology assay and containing the structural features deemed as key determinants of TrkA activity. (A second series of Pfizer pan-Trk inhibitors, structurally dissimilar to those highlighted in Table 1, also showed good agreement between their ESMACS-predicted and experimentally determined TrkA binding affinities. Studies of those inhibitors will be published in due course.) These congeneric compounds all have the same net neutral charge. In ESMACS, the MMPBSA.py.MPI 11 module of the AMBER12 package 12 was employed for the free energy calculations of the complex (G complex ), the receptor (G receptor ), and the compound (G ligand ) (eq 1). The polar solvation free energy was calculated using the Poisson− Boltzmann equation with a grid spacing of 0.5 Å and dielectric constants of 1 and 80 for the protein and solvent, respectively. The energetic analyses, including the configurational entropy calculations, were conducted on 50 snapshots of complexes, protein, and compounds extracted evenly from each 4 ns production run in the ensemble MD simulations of complexes only (1-traj and 2-traj; see below) or from separate ensemble simulations of complexes and compounds (3-traj). In TIES, the energy derivatives ∂V/∂λ (eq 2) were recorded every 2 fs during the simulations, and a stochastic integration 10,13 was performed using a trapezoidal method. Hydrogen-bond analyses were performed for the ensemble simulation trajectories. A hydrogen bond is considered to be formed when the distance between a hydrogen-bond acceptor and a hydrogen-bond donor is less than a defined distance cutoff and the acceptor−hydrogen−donor angle is greater than an angle cutoff. The cutoffs used in the current study are 3.0 Å for the distance and 135°for the angle. The protein and the compounds can have intramolecular and intermolecular hydrogen bonds, of which the intermolecular ones contribute significantly to the binding affinities. Some "bridging" water molecules, which are hydrogen-bonded to the protein and the compounds at the same time, also play an important role in binding of ligands to proteins, in addition to their direct interactions. In ESMACS predictions, the inter-and intramolecular interactions of the compounds and the protein are calculated explicitly, while the interactions with water, including such bridging water molecules, are taken into account implicitly.
In ESMACS, the free energy is evaluated approximately on the basis of the extended MMPBSA method, 8,14 including configurational entropy and the free energy of association. 15 Free energy changes (ΔG binding ) are determined for the molecules in their solvated states. The binding free energy change is then calculated as where G complex , G receptor , and G ligand are the free energies of the complex, the receptor, and the ligand, respectively.
The three terms on the right-hand side of eq 1 can be generated from single simulations of the complexes or from separate simulations of complexes, receptor(s), and ligand(s). The former is the so-called one-trajectory (1-traj) method, in which the trajectories of the receptor(s) and ligand(s) are extracted from those of the complexes. The latter is the socalled three-trajectory (3-traj) method when the energies are derived from independent simulations of the three components or the two-trajectory (2-traj) method when the energy is derived from independent simulations for only the receptor or the ligand. In the drug development field, binding is usually investigated for a set of ligands bound to the same protein target. The free energy of the receptor, G receptor , is then the same for all of ligands when it is derived from an independent simulation of the receptor and hence can be treated as a constant. In the current study, we employ three approaches to calculate ΔG binding : (a) employing only simulations of the complexes (1-traj); (b) the same as (a) but using the average of the receptor free energies ⟨G receptor ⟩ from the 1-traj method (denoted as 2-traj); and (c) the same as (b) but also invoking separate ligand simulations for the derivation of G ligand (denoted as 3-traj).
In our more recent TIES method, 13 an "alchemical transformation" 16 for the mutated entity, either the ligand or the protein (in this study it is always the ligand), is used in both aqueous solution and within the ligand−protein complex. The relative free energy changes for the alchemical mutation processes, ΔG aq alch and ΔG complex alch , are calculated as where λ (0 ≤ λ ≤ 1) is a coupling parameter such that λ = 0 and λ = 1 correspond to the initial and final thermodynamic states, V(λ) is the potential energy of an intermediate state λ, and ⟨···⟩ λ denotes an ensemble average over configurations representative of the state λ. The relative binding free energy difference is then calculated as Compared with ESMACS, the TIES approach, like all other alchemical-based free energy methods, is usually more accurate in comparing one congeneric ligand to another. However, the nature of these alchemical methods implies that their applications will be limited when diverse sets of compounds are of interest. A recent publication 17 in which the authors claimed that their implementation of free energy perturbation (FEP) calculations successfully predicted a number of more active compounds (a correlation coefficient of 0.71 between the predicted and experimental binding affinities was achieved) has the same limitations. ESMACS, however, has no such limitations. Our previous studies 8 have shown that ESMACS can be applied to sets of compounds that are highly diverse in terms of both the number of atoms and the net charge. In addition, ESMACS is able to make assessments of the effect that an individual ligand has on the protein to which it binds. In particular, the application of two-and three-trajectory versions of ESMACS allows one to probe the extent to which both the ligand and protein adjust their conformations on binding, broadly according to a "lock and key" or "induced fit" recognition mechanism.
In the current study, we apply the ESMACS approach for all of TrkA compounds highlighted in Table 1 and consider TIES

Journal of Chemical Information and Modeling
Article for a selected subset. In the evaluation and characterization of TrkA binding affinity predictions from the Pfizer compound set, the ESMACS and TIES protocols were conducted as recently described. 8,13 TrkA compounds were optimized at the Hartree−Fock level with the 6-31G* basis (HF/6-31G*) in Gaussian 03 18 and parametrized using Antechamber and RESP in AmberTools 12 with the general AMBER force field (GAFF). 19 The Amber ff99SBildn force field 20 was used for the protein. In ESMACS, we used 25 replicas in an ensemble calculation for each ligand. In TIES, five replicas were used for each selected pair of ligands. The MD package NAMD2.9 21 was used throughout the equilibration and production simulations with periodic boundary conditions. We used a protocol established in our previous publications 13,14 in which 2 ns of equilibration and 4 ns of production were conducted for each replica. Energy analyses showed that a 4 ns production run was sufficient to have convergence results for the current molecular systems (see Figure S1 in the Supporting Information). In ESMACS, the MMPBSA.py.MPI 11 module of the Amber package 12 was used to extract the free energies of the complexes, the protein, and the compounds (eq 1), including configurational entropy from normal mode calculations.
ESMACS was performed on SuperMUC, two separate highend supercomputers with a total of ca. 245 000 cores in a combination of various processor technologies (https://www. lrz.de/services/compute/supermuc/systemdescription/), at the Leibniz Rechenzentrum (Leibniz Supercomputing Centre) in Garching, Germany. We were able to run our calculations on them as if it were a single supercomputer. 22 TIES calculations were performed on ARCHER, a Cray XC30 supercomputer (equipped with ca. 110 000 cores), the U.K.'s National High Performance Computing Service located in Edinburgh. A single ESMACS study of one ligand−protein interaction and a TIES study of the binding free energy difference for two congeneric ligands can both be completed in about 8 wall-clock hours on 1200 and 3120 CPU cores, respectively, on ARCHER (see details in our previous publication 8,13 ). (It is possible to further reduce the time required by use of GPU accelerators.) The ESMACS and TIES approaches are both scalable, allowing a large number of calculations to be performed concurrently, depending only on the computing resources available.

■ RESULTS
We report a collaborative study performed between academic computational chemists and a pharmaceutical company on a data set of 16 pan-Trk ligands (Table 1). 2 Of the TrkA cocrystal structures provided by Pfizer, 2 we elected to utilize the cocrystal structure of TrkA and 1 (Figure 1, PDB code 5JFV) as it contained the highest number of crystallographically defined TrkA residues. The missing residues were built by ModLoop, 23 while the initial structures of the complexes were constructed by a docking method employing UCSF DOCK. 24 Nine of the 16 compounds were successfully docked into the binding site of the TrkA protein with orientations agreeing with those from available X-ray structures. Compounds that failed to generate suitable cocomplexes via docking were manually positioned on the basis of X-ray and/or modeled structures of similar compounds.
Comparison of Experimental TrkA Binding Free Energies with ESMACS Predictions. The comparison of the TrkA binding free energies obtained from ESMACS predictions and those determined experimentally is shown in Figure 2. The predicted binding free energies from the 1-traj approach exhibit a moderate Pearson correlation with the experimental data (with a coefficient of 0.42; Figure 2a). The 2traj approach, in which the same free energy of the receptor averaged from the 1-traj method is used, significantly improves the correlation between the predictions and experimental data (with a coefficient of 0.76; Figure 2b). Incorporating the free energies of the ligands from their individual simulations in water (the 3-traj approach) generates a correlation similar to that from the 2-traj approach (with a coefficient of 0.79; Figure  2c). The improvements are achieved by including the adaptation energies 8 of the receptor in the current study, and possibly of the ligands in general cases, which quantify the free energy changes of a molecule between its bound and free states. The 3-traj ESMACS results are in good agreement with the experimental measurements. It should be noted that the ESMACS method is capable of comparing the binding affinities of the entire set of ligands, notwithstanding the relatively significant structural differences. It should also be noted that the experimental TrkA binding free energies were approximated from half-maximal inhibitory concentration (IC 50 ) values, which provide only semiquantitative estimates. In addition, experimental IC 50 values for individual compounds vary between intralab measurements, and their average IC 50 values vary between 1.6-and 4.2-fold when measured independently across the two laboratories (Table 1 and Table  S1 in the Supporting Information). A root-mean-square (RMS) error of ∼24% has recently been inferred from interlaboratory variations of reported binding affinities by Chodera and Mobley. 25 Improvement of Docking Predictions by the ESMACS Approach. For the nine compounds that were successfully docked into the TrkA protein, the binding free energies were also calculated using the structures after docking and energy minimization. The minimization process optimizes the geometry of a collection of atoms so that the net interatomic force on each atom is acceptably close to zero. The remaining seven compounds were not included in these calculations because the manually constructed structures exhibit some close contacts between the protein and the compounds, to which the binding free energy estimations are very sensitive. When the compounds could be docked into a single protein structure, the ranking of their binding affinities was reasonably predicted from the docked structures ( Figure 3a). However, the results from  Table S2), which are not shown in the figures for reasons of clarity. The experimental data from two sites (Pfizer, Sandwich and TCG Lifescience) are displayed in black and red, respectively. The correlation coefficients shown in the figure were calculated using the averages of the calculated binding free energies and the experimental data from Pfizer, Sandwich (black circles) and TCG Lifescience (red circles) where the former are not available. Large uncertainties are associated with the correlation coefficients because of the large error bars of the calculated and experimental binding affinities. Further analyses with bootstrapping resampling (see details in the Supporting Information) generate correlation coefficients of 0.39 ± 0.26, 0.60 ± 0.26, and 0.62 ± 0.27 for the 1-, 2-, and 3-traj approaches, respectively. It is evident that the 2-and 3traj methods improve the ranking provided by the 1-traj method.  Table 1).

Journal of Chemical Information and Modeling
Article the docking structures overestimate the free energy differences among the compounds, as shown in Figure 3a, in which the large slope of the regression line (3.33) indicates that the predicted energy differences are on average more than 3 times larger than the experimentally measured ones. The energy minimization significantly decreases the overestimations (Figure 3b, with a regression slope of 1.81). The ESMACS study further decreases the overestimation and makes the points on the scatter plot more concentrated around the regression line (Figure 3c). The docking method ranks the binding affinities reasonably well only for the compounds with closely related structures. Its results are also strongly dependent on the quality and choice of the structure(s) of the target protein. 26 The ESMACS approach, however, is not sensitive to the initial structure and achieves a similar correlation coefficient between the calculated and experimental binding affinities for all of the compounds irrespective of whether they can be successfully docked (Figure 2c).
ESMACS Binding Affinities: Contributions from Energy Components. The components of the free energy calculations (see the Supporting Information for more details) may provide insight into the mechanism of compound binding. 27 In this study, no apparent correlations could be found for the individual energy components of the internal van der Waals, electrostatic, or electrostatic solvation energies. The total bonded energies, including bond, angle, and dihedral interactions, manifest a moderate correlation with the experimental binding free energies in the 3-traj approach, with a Pearson coefficient of 0.54 (Figure 4a). The nonbonded interactions, including the internal electrostatic, solvation electrostatic, and van der Waals interactions, have a slightly weaker correlation (Pearson coefficient of 0.43) than the bonded energies. The combination of bonded and nonbonded interactions significantly improves the correlation, with a Pearson coefficient of 0.75 (Figure 4c). The bonded and nonbonded energies are the two components of the total adaptation energy (see below), which is associated with the conformational changes upon compound binding. It is therefore not surprising that both of them show correlations with the binding affinities. The configurational entropy components do not exhibit a significant correlation with the experimental binding affinity ( Figure S2). They vary in a range of 4.6 kcal/mol; the MMPBSA energies vary in a range of 14.41 kcal/mol by comparison. Inclusion of the contribution from configurational entropy into the MMPBSA energy does not improve the Pearson correlation coefficients significantly (see Figure 2c, which includes configurational entropy, and Figure  4c, which does not).
Adaptation Energy: A Measure of Conformational Change upon Binding. The improvement obtained with the 3-traj approach compared with the 1-traj approach is achieved by relaxing the assumption that the receptor and the compounds sample similar conformations in both the free and bound states. Unfavorable adaptation energies 8 are usually induced when the conformations within the free state are significantly perturbed upon compound binding. The adaptation energies from the 3-traj approach indeed provide more insights into the mechanism of compound binding. They are related to the structural modifications made to the compounds and the protein and shed important light on the optimization. As the ranking of the ligand binding is the main concern of the study and the relative adaptation energies of the protein are energetically as informative as the absolute ones in the binding affinity comparison, no attempt has been made here to compute accurate absolute adaptation energies for the protein.
All of the adaptation energies for the protein are relative ones in the current paper. Non-negligible adaptation energies are associated with conformational changes upon compound binding, meaning that the binding involves "induced fit" recognition. Compound 8, for example, introduces the largest adaptation energy within the protein, while 16 has the largest adaptation energy within the compound ( Figure 5). The binding of 8 induces a conformational change within the protein because there is a reluctance to accommodate the methoxy group in the linker region of 8 into the binding site; the arrangement of the amide group in 16 forces the compound to adopt a high-energy conformation in the protein's binding site (see details below).
Comparison of Experimental TrkA Binding Free Energies with TIES Predictions. The TIES method provides a more accurate approach to estimate relative binding affinities, but the scope for its use is rather tightly circumscribed, as it is applicable only for pairs of compounds that do not present significant structural differences. Our previous studies have demonstrated that TIES offers more quantitative accuracy in its predictions than ESMACS. 13,28 Here we apply TIES to the

Journal of Chemical Information and Modeling
Article TrkA data set to highlight the accuracy of the approach. In TIES, studies are performed for 14 pairs of TrkA compounds ( Figure 6). The calculated binding free energy differences correlate well with the experimental data, with a Pearson correlation of 0.88 (Figure 6), compared with 0.79 from the 3traj ESMACS study (Figure 2c). A directional agreement is achieved if a prediction has the same sign as the experimental observation; otherwise, the result is deemed to be directional disagreement. For 10 out of the 14 pairs studied, TIES successfully achieves directional agreements, meaning that the results predict the direction of the change in the binding affinity correctly ( Figure 6). As shown in the figure, all but one of the TIES predictions that directionally disagree lie on the border of quadrants II and III within their calculated error bars. When the error bars of the experimental ΔΔG are also taken into account, a better agreement might be achieved than that reported here.

■ DISCUSSION
The calculations, especially the ESMACS ones, can provide further insights for binding affinity variation between structurally similar ligands. In order to analyze the results, it is necessary to consider the differences between the one-, two-, and three-trajectory methods. The 1-traj simulation calculates ΔG binding assuming there is no energetic penalty for the adaptation of the protein or ligand to the binding conformation; the 2-traj method takes the change in protein energy into consideration, and the 3-traj method accounts for the changes in both protein and ligand energies. The simulations enable the exploration of key differences in binding modes for structurally related ligands, as we now discuss.
Hinge Binding Group Modifications: Compounds 1, 4, and 22. Initial compounds in the pyrrolopyrimidine series, such as 1, contain an amino group at the 4-position of the hinge binding group (Table 2). Compounds such these, while TrkAactive, required optimization of their kinase selectivity profile. 29 It is known within the scientific literature that minimizing the number of hydrogen-bonding interactions made between a kinase inhibitor and the kinase hinge binding region can lead to an enhanced kinome selectivity profile. 30,31 In order to design compounds with an optimal kinase selectivity profile, an analysis of the crystal structure of 1 bound to TrkA and the nonliganded TrkA apo structure was undertaken. 29 The aminopyrrolopyrimidine motif of 1 makes a two-point hydrogen-bonding interaction with hinge residues E590 and M592. When 1 is superpositioned onto the TrkA apo crystal structure, the −NH 2 motif of the aminopyrrolopyrimidine group overlays directly with a conserved water molecule observed in the apo structure. The −NH 2 group was therefore removed to generate compounds such as 4, in which a bridging water molecule forms hydrogen bonds with the backbone carbonyl oxygen at E590 and the ketone oxygen of the TrkA ligand. The hydrogen bond between the ligand and the backbone N−H of hinge residue M592 is maintained. The deletion of the −NH 2 group had only a minimal effect on the TrkA activity, as 1 and 4 had broadly similar IC 50 values in the TrkA pharmacology assay.
ESMACS calculations show that compounds without the −NH 2 group at the 4-position (all compounds except 1, 3, 6, and 22; Table 2) have a bridging water molecule between the Figure 6. Correlation between TIES-predicted relative binding affinities and experimental data. The black line is the correlation line, while the dotted lines (x = 0 and y = 0) create four quadrants. Ten out of the 14 data points are in quadrants I (x > 0 and y > 0) and III (x < 0 and y < 0), meaning that the calculated binding free energy differences have the same sign as those from the experimental data.

Journal of Chemical Information and Modeling
Article ketone oxygen of the ligand and the backbone carbonyl oxygen of hinge residue E590, with an average occupancy of 41 ± 9%. The occupancy indeed underscores the presence of a water molecule at the location as the molecule dynamically moves into and out of the cutoff distance and angle that designate the occurrence of a bridging water molecule. Simulations of compounds containing the −NH 2 group at the 4-position (1, 3, and 6; Table 2) show the same bridging water to be present but at a far lower frequency (occupancy values of just 2 ± 0% for 1, 3, and 6) (Figure 7).
ESMACS confirms that the bridging water molecule highlighted in Figure 7 helps mediate and thus satisfy key protein−compound interactions, rendering the overall binding free energies of compounds such as 4 less affected by the absence of the H-bond when −NH 2 is not present. This capability is reflected in the close similarity between the calculated TrkA free energy values for 1 and 4 and those derived from TrkA pharmacology experiments (Table 2). Compound 22 was synthesized as part of an effort to characterize putative compound metabolites. As highlighted in Table 2 (Figure 7b). The result of having a nonsatisfied hydrogen-bonding group (−OH) in 22 is a reduction in overall ligand binding affinity. This is reflected not only in the measured activity data (ΔG exp = −7.1 kcal/mol) but also in the relative binding affinities calculated by ESMACS and TIES (Table 2).
Hinge Binding Group Modifications: Compounds 12, 17, 23, and 24. The TrkA cocrystal structure of compounds such as 12 indicated that a polar atom capable of making an additional hydrogen-bonding interaction with M592 might be accommodated at the 2-position of the hinge binding group (Table 3). Compound 17 was synthesized initially to test this

Journal of Chemical Information and Modeling
Article theory and proved to be more active in the TrkA pharmacology model than the parent molecule 12. Compound 23 was then synthesized to see whether adding a methyl group to the amine would further boost the potency through the introduction of hydrophobic interactions with the lipophilic side chain of Y591. Compound 23 was potent in the TrkA assay, albeit slightly less so than 12. Compound 24 was synthesized as a putative metabolite of 12 and was found to be >15-fold less active at TrkA.
ESMACS shows that addition of polar groups at the 2position increases the total occupancy of hydrogen bonds with M592 from 23 ± 4% (for 12) to 85 ± 4%, 80 ± 4%, and 91 ± 19% (for 17, 23, and 24, respectively). This includes the occupancy of the hydrogen bond made between the pyrimidyl nitrogen and the −NH group of M592, which is common to all of the compounds. The increased occupancies of 17, 23, and 24 result in the additional hydrogen bonds between the polar groups at the 2-position of the compounds and the CO group of M592. The simulations also show that addition of an −NH 2 group (17) does not introduce any unfavorable steric interaction within the protein or the ligand (see Figure 8). The slightly improved binding affinity of 17 versus 12 stems from the favorable electrostatic interaction (including the solvation electrostatic component from the Poisson−Boltzmann calculation) between the compound and the protein. The addition of the methyl group on the amine (23), however, induces an unfavorable adaptation energy within the protein (see the comparison of relative adaptation energies for 12/17 and 23 in Figure 5a) due to steric hindrance between the protein and 23 (Figure 8c). The presence of an −NH 2 group at the 2-position (17) does not induce an unfavorable adaptation energy. The introduction of an −OH group (24), however, introduces a significant adaptation energy in the protein because the −OH group forms the strongest hydrogen bond with M592 among the subgroup of compounds 12, 17, 23, and 24, which induces steric hindrance within the complex and reduces the overall binding affinity (Figure 8).

Linker Group Modifications: Compounds 1, 4, and 22.
The TrkA cocrystal structure of compounds such as 1 highlighted a conserved water molecule that formed hydrogen-bonding interactions with the N−H group of the amide linker and protein residues K544 and E560 (Table 4 and Figure  9). A methyl group was added to the amide nitrogen to assess the effect of displacing the conserved water molecule (and the potentially altered amide conformation relative to the pyridyl ring) on the TrkA activity. The effect of transposing the amide N−H to the other side of the carbonyl group was also investigated, as this, if the activity were retained, could open up new parallel chemistry opportunities within the chemotype. Compound 3, an example of the N-methylamide, remained active at TrkA, although with reduced affinity versus its desmethyl congener 1. Compound 6, an example of the transposed N−H amide, also remained active at TrkA with almost identical affinity as 1.
ESMACS shows that the modifications highlighted in Table  4 do not alter the key hydrogen-bonding interaction between the carbonyl oxygen of the amide group of 1, 3, and 6 and the backbone N−H of D668, with occupancies of 56 ± 7%, 54 ± 3%, and 51 ± 6%, respectively ( Figure 9). The altered position of the amide group in 1 and 6 slightly affects the appearance of the bridging water molecules between the ligands and residue E560, with occupancies of 15 ± 6% and 20 ± 5% for 1 and 6, respectively. Minimal direct hydrogen-bonding interactions are predicted between the amide −NH group of 1 and 6 and residue K544 or E560. The replacement of the amide −NH group of 1 by the more hydrophobic N-methyl group is predicted to displace the adjacent bridging water molecule (Figure 9a,b) and overall make binding to TrkA energetically less favorable. There is a good correlation between the ESMACS-predicted and experimentally derived free energies of TrkA binding for 1, 3, and 6 ( Table 4).
Linker Group Modifications: Compounds 7 and 8. Although the linker group of the pyrrolopyrimidine/pyrrolopyridine series binds to a relatively narrow region of the ligand binding site, between the two adjacent ATP and DFG pockets, the tolerance of the TrkA protein toward bulkier linker groups was briefly investigated as part of the TrkA program. An example from this work is 8, in which a methoxy group has been added to the linker-group carbonyl. As highlighted in Table 5, a significant reduction in TrkA activity is observed for 8 versus 7.
ESMACS calculations indicate that the oxygen atom of the methoxy group of 8 is located proximal (3.47 ± 0.31 Å) to the carbonyl group of residue D668. The occupancy of a bridging water molecule between the methoxy group and the carbonyl is only 5 ± 2%, indicating that the orientation and the space between these groups preclude a bridging water molecule being available to quench any electrostatic repulsion (Table 5 and Figure 10). The chirality of the methoxy group also prevents the polar oxygen from having any direct or favorable waterbridged interactions with E560 or D668. The binding of 8 induces the largest adaptation energy (Figure 5a) in the TrkA protein of all compounds assessed. Energetic analyses show that although the bonded energy and the van der Waals interactions are favorable for 8 compared with 7 (by 3.70 and 2.29 kcal/ mol, respectively), a significantly unfavorable electrostatic energy (10.59 kcal/mol), mainly from the interactions between the polar oxygen of the methoxy group and the polar/charged groups of the protein (carbonyl group of D668 and carboxylate The calculated free energies of TrkA binding for 7 and 8 correlate well with those obtained experimentally. Hence, utilization of ESMACS to predict the TrkA binding affinity of 8 during the Pfizer TrkA project might well have influenced the medicinal chemistry team to deprioritize this compound and redirect synthetic efforts in other directions. Linker Group Modifications: Compounds 4 and 16. The hydrogen bond between the linker-group carbonyl and the backbone N−H of D668 is a key interaction across the pyrrolopyrimidine/pyrrolopyridine series. During the Pfizer TrkA program, reversing the amide group was modeled to assess whether the usual ligand binding mode could be adapted to maintain this interaction in the new amide arrangement. Docking of 16 into TrkA using a Pfizer docking protocol (data not shown) suggested that the reverse amide would likely not be able to adapt a conformation in which this interaction would be retained. However, 16 was synthesized as an example of a reverse amide system to challenge and/or verify the docking result. As can be seen in Table 6, reversing the amide group led to a significant reduction in TrkA activity.
ESMACS-based component binding analysis shows that 16 has the least favorable bonding and nonbonding interactions of the compounds in Table 1. The reversed positions of the −NH and CO groups prevent formation of the hydrogen bond between the ligand and residue D668 (Figure 11), with an Hbond occupancy of <1% compared with occupancies of 30− 56% for the rest of the compound set (except for 26, whose

Journal of Chemical Information and Modeling
Article occupancy is 14% as a result of the competition from an extra hydrogen bond between the polar oxygen at R4 and residue K544) (Table 1). Indeed, ESMACS calculations reveal that the −NH and CO groups of 16 align in a noncomplementary fashion with the −NH and CO groups of residue D668, introducing a large unfavorable electrostatic interaction ( Figure   11). Compound 16 has the largest ligand adaptation energy, indicating that the switching of N−H and CO does not enable an alternate binding mode to be adopted, and therefore, 16 must adopt a high-energy conformation to complex with the protein. As is clear in Table 6, ESMACS correctly predicts the reduced TrkA potency of 16 compared with 4.

■ CONCLUSIONS
Methodologies that can predict the binding affinities of molecules ahead of synthesis represent a key area of interest in the pharmaceutical industry. The ability to effectively prioritize prospective compounds on the basis of their likely activity in a key pharmacology assay offers the potential to rapidly improve lead optimization timelines in drug discovery programs by reducing the number of "design−synthesis− screen" cycles needed to identify candidate-quality molecules suitable for clinical testing. 32 Despite the relative diversity of the compounds in this study, the 3-traj version of the ESMACS approach provides good agreement between the theoretically predicted binding free energies and those derived from experimentally measured activities for the entire data set. TIES generates even better agreement between the calculated and experimental binding free energy differences for a selected subgroup of ligands. This is manifest in the better correlation coefficient obtained in TIES (0.88) than in ESMACS (0.79) and accurate binding free energy differences from the former. This is the case for all of Figure 10. ESMACS simulations of (a) 7 and (b) 8 bound to TrkA. Final conformations of 4 ns production runs from all 25 replicas are overlapped and smoothed by averaging over 10 frames. The key TrkA protein residues D668, E560, and L564 are highlighted. The protein is shown in cyan cartoon, and ligand atoms are colored by element: hydrogen in white, carbon in cyan, oxygen in red, and nitrogen in blue. The unfavorable electrostatic interactions between the methoxy group of 8 and the carbonyl group of D668 and between the methoxy group and the carboxylate group of E560 are highlighted by pink arrows. Table 6. Structures and Properties of 4 and 16 a a TrkA free energy activities derived from experimental IC 50 values are denoted as ΔG exp . The relative binding free energies from experiment and ESMACS and TIES calculations are denoted as ΔΔG exp , ΔΔG ESMACS , and ΔΔG TIES , respectively. Experimental data were obtained from Pfizer, Sandwich (U.K.) for these two compounds.

Journal of Chemical Information and Modeling
Article molecular systems we have studied. 13,28,33 ESMACS also provides structural and energetic insight into the binding of individual ligand−protein systems: some compounds bind by a "lock and key" recognition mechanism, for which no significant conformational adjustment is required by the protein and the ligands; others bind according to an "induced fit" recognition mechanism involving significant conformational changes associated with non-negligible adaptation energies.
The Binding Affinity Calculator (BAC) used in this study automates the workflows for ESMACS and TIES calculations, making the use of these approaches easy and user-friendly. With powerful computing resources now widely available, these robust approaches should become more routine for industrial groups in the field of structure-based drug design. The results described herein, based on an analysis of TrkA ligands synthesized as part of the TrkA program at Pfizer, suggest that the binding affinity calculations have the potential to be successfully applied in real-time prospective ligand design.