STACKED - Solvation Theory of Aromatic Complexes as Key for Estimating Drug binding.

The use of fragments to biophysically characterize a protein binding pocket and determine the strengths of certain interactions is a computationally and experimentally commonly applied approach. Almost all drug like molecules contain at least one aromatic moiety forming stacking interactions in the binding pocket. In computational drug design, the strength of stacking and the resulting optimization of the aromatic core or moiety is usually calculated using high level quantum mechanical approaches. However, as these calculations are performed in a vacuum, solvation properties are neglected. We close this gap by using Grid Inhomogeneous Solvation Theory (GIST) to describe the properties of individual heteroaromatics and complexes and thereby estimate the desolvation penalty. In our study, we investigated the solvation free energies of heteroaromatics frequently occurring in drug design projects in complex with truncated side chains of phenylalanine, tyrosine, and tryptophan. Furthermore, we investigated the properties of drug-fragments crystallized in a fragment-based lead optimization approach investigating PDE-10-A. We do not only find good correlation for the estimated desolvation penalty and the experimental binding free energy, but our calculations also allow us to predict prominent interaction sites. We highlight the importance of including the desolvation penalty of the respective heteroaromatics in stacked complexes to explain the gain or loss in affinity of potential lead compounds.


■ INTRODUCTION
Molecular recognition in biological systems strongly depends on specific interactions between two molecules. In structure-based drug design these structure−activity relationships between ligands and their target molecules are rationalized and optimized and provide additional valuable information for the drug discovery process. 1,2 However, the number of possible and favorable interaction types, which have to be considered in the drug design process, increased significantly over the past decades. 3 Various options of possible interactions exist in a protein−ligand binding site. Crystal structures of protein− ligand complexes, such as the PDB, can elucidate which of those various interaction types are actually relevant in a given protein− ligand binding site.
Lead structures are starting points in medicinal chemistry, which sometimes lack the affinity which is required to function as drugs, and affinity optimization steps increase lipophilicity, molecular size, and molecular complexity. 4,5 An attempt to characterize this druglikeness was proposed with the Lipinsky rule of 5, describing molecular size and hydrophilicity as primary risk factors in drug design. 6 π-stacking interactions between aromatic rings play a central role in medicinal chemistry as an important contribution to ligand binding. 1,7 Aromatic rings are frequently used in medicinal chemistry. 10,8,9 Their characteristic features, such as planarity and the distinct π-electron cloud on top of and below the aromatic rings allow for addressing specific challenges in target recognition as they offer multiple interaction possibil-ities. 7,10 These molecular interactions include π−π interactions, 11 cation−π, 12,13 amid−π, 14 halogen−π, 15 and hydrogenbond interactions via heteroatoms. 16 Before a drug can interact with the protein of interest, it has to be, at least partly, stripped of its solvation shell. However, in the field of computer aided molecular design of small molecules, usually high-level quantum mechanics calculations are performed to assess the strength of stacking interactions of aromatic heterocycles. 17,18 Nevertheless, recent work has revealed a direct correlation of vacuum stacking interactions and the solvation free energy of heteroaromatics. 19 Furthermore, the substitutions on a heteroaromatic core substantially influence the electrostatic properties and alter not only the strength and favored stacking geometry but also the solvation properties. 20 Including hydration properties in computational approaches has been a hurdle in structurebased drug design. 21 Programs like GIST, 22,23 SZMAP, 24 and WaterMap 25−27 are well established to mainly characterize the solvation properties of the protein binding sites. Water molecules within the active site of a protein play a crucial role and therefore have to be considered in structure-based drug design. 28,29 Displacement of water molecules from a protein binding site is considered a major contributor to protein−ligand binding. 30,31 The shape and flexibility of a ligand-binding site in a protein is strongly influenced by water molecules and improves the complementarity between the protein and the ligand. The protein−ligand interaction is stabilized by a network of hydrogen bonds induced by water molecules. 32 However, also the solvation of the ligand has to be taken into account, because the solvation changes by forming the protein−ligand complex, as the ligand has to strip off parts of its solvation shell. 27,33 Different desolvation and solvation properties of ligands have shown to strongly influence receptor−ligand complex stabilities and highlight the critical role of ligand desolvation in determining binding affinity. 34 Additionally, studies revealed that the desolvation effects represent a critical barrier in the binding event. The desolvation of a hydrophobic ligand and the active site of the β2-adrenergic receptor was shown to be the ratelimiting step in the ligand−receptor binding process. 35−37 Introducing polar substituents contributed to an increase of the ligand desolvation barrier reflected in a substantial k on increase. 38 Furthermore, it was shown that the energetic penalty resulting from desolvation of the polar groups compromises the observed binding affinity. 38,39 Various studies confirmed that the inclusion of polar substituents results in a large penalty of desolvation upon binding and reveals a less favorable enthalpy of binding. 33,40−42 Predicting and explaining the hydration properties of proteins as well as ligands is an ongoing challenge in the field of drug design. 43,44 More general computational approaches to investigate the protein binding site make use of small molecules, acting as socalled probes, in short molecular dynamics simulations like SILCS 45,46 or MDMix. 47−49 Analyzing the positions of all fragments during these short simulations allows characterization of the binding site and generation of pharmacophore models. Besides these molecular dynamics based fragment methods, the idea of fragment docking was already established almost 20 years ago. 50 This approach already included estimations of the desolvation penalty by using the generalized Born equation with numerical calculation of the Born radii. 51 Newer scoring functions include already an implementation of the desolvation penalty and focus on characterization of the hydration and interaction hotspots of protein−ligand binding sites. 52 Experimentally, a similar technique is used to assess the hydration properties and at the same time to explore the chemical space of possible ligands, namely, the fragment-based lead discovery (FBLD) approaches. 53 When using FBLD, different cores with a variety of substituents are added to the protein of interest to act as probes. By measuring the affinities of these fragments, lead finding is facilitated. Additionally, the fragments are crystallized with the protein to give additional information on the binding pocket to advance the lead optimization process. In this study, we investigated an FBLD study of phosphodiesterase (PDE)-10-A. 54 PDE-10 is one of the known 11 families of PDEs and reveals a unique primary amino acid sequence and distinct enzymatic activity. Chemical modulators for PDE-10 were revealed to be valuable, both in Journal of Chemical Information and Modeling pubs.acs.org/jcim Article investigating enzyme properties and developing therapeutics, as shown by PDE-10 inhibitors. Inhibiting PDE-10-A might also be a novel treatment for psychosis. 55 Here, we investigate how the solvation properties of aromatic complexes correlate with stacking interactions and highlight the importance of including solvation of ligands and the respective desolvation penalty to enhance the understanding of protein− ligand binding.
■ METHODS Data Set. The set of molecules investigated in this study has been already studied in terms of aromatic stacking interactions 17,56 and is part of a fragment based lead optimization study on PDE-10-A (see Figure 1). 54 Quantum Mechanical Calculations. Monomers were optimized using Gaussian 09 57 at the ωB97XD 58 /cc-pVTZ 59 level as proposed by Huber et al. 60 in their investigation of stacking interactions of monocycles. To obtain the stacked complexes, we placed the truncated amino acid side chains in the center of a Cartesian coordinate system. For toluene and 4methylphenol, we kept the center of the aromatic ring in the origin of the coordinate system. In the case of 3-methylindole, the center of the bond bridging the two aromatic cycles was kept at 0.0, 0.0, 0.0 while the aromatic system was kept flat in the x−y plane. In the z-direction we placed the heteroaromatics 3.9 Å on top using a MOE-SVL script. 61 For the heteroaromatics in our test set, we kept the center of mass at x = 0.0 and y = 0.0 ( Figure  2). These geometries were then optimized using the same setup employed for the monomers. For the structures of the FBLD we superposed the optimized structure with the fragments obtained from the PDB (electron density see Figure S2). Due to the fact that we deal with rigid molecules, the superposition did not cause a large deviation from the crystal coordinates. To mimic amino acids, we chose toluene (phenylalanine), 4-methylphenol (tyrosine), and 3-methylindole (tryptophan; Figure 2).
The interaction energies were calculated by subtracting the energies of the two geometry optimized monomers from the respective complex energy as proposed by the supermolecular approach. 62 We did not use basis set superposition correction to allow for better comparability with previously published data. 63 Simulation Details. The aromatic compounds were generated from SMILES 64 using RdKIT. The structures of the stacked complexes were taken from the high-level QM calculations performed in the previous step. From the resulting files, we performed parametrizations using antechamber. 65 The van der Waals, bond length, angle, and torsion parameters were taken from the General AMBER Force Field (GAFF). 66 After performing a minimization of the molecules, using Gaussian 09, 57 the partial charges were derived using the restrained electrostatic potential (RESP) method based on the QM calculation at the HF/6-31G* theory level. Topology and coordinate files for the molecular dynamics simulations were generated using AmberTools. 67 The simulations were performed in a cubic water box of TIP3P 68 water molecules with a minimal wall distance of 10 Å. To ensure proper equilibration, we followed the protocol outlined by Wallnoefer et al. 69 In the production runs, we performed 10 ns NpT simulations with a time step of 2 fs. We applied restraints on the small molecules of 1000 kcal/(mol Å 2 ). Spatial geometries were saved every 1000 steps, resulting in a final trajectory of 5000 frames. To allow for NpT conditions, we used the Berendsen barostat with a pressure relaxation time of 2.0 ps and Langevin dynamics with a friction coefficient of 2.0 ps −1 .
Grid Inhomogeneous Theory. GIST allows a thermodynamic analysis of water molecules based on a molecular dynamics (MD) trajectory using a grid-based approach. GIST uses a grid to replace the spatial integrals from IST with discrete sums over the voxels on the constructed grid. 22,23 One shortcoming of this method is that the solute has to be restrained to one conformation for each simulation. However, our systems are rigid rings, and therefore restraining the solute to only one conformation is no limitation. Recently published studies highlighted that GIST is a valid approach to study biomolecular systems and small molecules in combination with MD simulations, allowing to differentiate between a reference state, i.e., pure water, and an investigated state. A detailed description on the calculation of entropic and enthalpic terms is given by Nguyen et al. and in our previous and further recent publications. 19,22,23,41,70−72 The free energy of solvation of a solute ΔG Solv can be approximated according to eq 2: where k B is the Boltzmann constant, T is the temperature, G Solv (q) the solvation free energy of a solute in a single conformation q, and p(q) the probability to find the solute in this conformation q. G Solv (q), for a particular conformation q, is calculated by restraining its coordinates, making the solute retain its initial conformation. As mentioned, the compounds investigated in this work can be considered as rigid. Therefore, the error introduced by a single conformation approach is negligible. However, if our ligands would be more flexible, we could account for that by using multiple GIST calculations and combining them using eq 2.

■ RESULTS
We performed calculations on 18 monocycles frequently occurring in computational drug design, which have been recently investigated in a study focusing on π-stacking of heteroaromatics 17 (Figure 1). All these aromatic compounds   . GIST maps of benzene, 1,3,5-triacin, pyrimidone, and toluene as monomers and the maps of the respective complexes. The vacuum stacking interaction is obtained from quantum mechanical optimizations using DFT at the ωB97XD/cc-pVDZ level using Gaussian 09. The desolvation penalty was calculated as the difference of the sum of solvation free energy of the complex and the sum of the respective monomers. Red mesh: favorable solute−water enthalpy (cutoff −0.1 kcal/(mol·Å 3 )). Blue mesh: low water entropy (cutoff −0.08 kcal/mol·Å 3 )).

Journal of Chemical Information and Modeling
pubs.acs.org/jcim Article acid mimics individually, we identified a correlation of 0.86 for stacked compounds with toluene, a correlation of 0.88 for stacking on 4-methylphenol, and a correlation of 0.64 for 3methylindol ( Figure S1). By excluding the two mentioned triacin derivatives, the Pearson correlation of the latter improves significantly to 0.92. To investigate the influence of geometry, we performed GIST calculations using the structures published by Bootsma et al. 17 for heteroaromatics stacking on toluene. We find a Pearson correlation of 0.98 for the desolvation penalty of the different geometries, indicating that the desolvation penalty of these small complexes is not that dependent on the conformation. As shown in a recent publication on stacking interactions of benzene, 19 we observe a good correlation for vacuum stacking interactions and the solvation free energy of the monomers for stacking on each amino acid mimic individually (Pearson coefficients of 0.87 for toluene, 0.80 for 4-methylphenol, and 0.77 for 3-methylindol). However, when comparing all three amino acid mimics, we find that the correlation vanishes ( Figure  3). This is naturally the case as the solvation of the monomers remains constant, while the stacking interactions differ by more than 3 kcal/mol. As expected, we found that the vacuum stacking interactions are always stronger for 3-methylindole compared to toluene. To overcome this difference and to allow better comparability, we calculated the desolvation penalty as the difference between the solvation free energy of the complex and the solvation free energy of the respective monomers. For our data set of parallel displaced stacked geometries, we obtained a Pearson correlation of −0.74 for vacuum stacking interactions and the desolvation penalty of the optimized geometry.
Besides the solvation free energies of the heteroaromatics and the respective complexes, we obtained maps containing  19 This allows us to compare which part of the hydration shell has to be broken to allow interactions between the two heteroaromatics. For benzene and toluene, we can identify only the hydration hot-spot of the respective π clouds (Figure 4.A). For the complex, we find that the hydration site of each π cloud has to be replaced to form the complex. We calculate a desolvation penalty of 2.46 kcal/mol and a vacuum stacking interaction of −4.55 kcal/mol for the complex. With higher grades of substitutions, we observe a decrease of the solvation free energy of heteroaromatics, e.g., 1,3,5-triacin (−7.46 kcal/mol). However, in complex the nitrogen atoms can still form a major part of the monomer's hydration shell ( Figure 4B). Consequently, we obtained a resulting desolvation penalty of only 2.18 kcal/mol for 1,3,5-triacin, while vacuum stacking interactions revealed −5.77 kcal/mol for the complex. The lowest solvation free energy in our data set and the strongest vacuum stacking interactions are calculated for pyrimidone with −14.18 kcal/mol solvation free energy and −8.34 kcal/mol of stacking with toluene. This compound does also show the highest desolvation penalty with 3.50 kcal/mol ( Figure 4C).
We applied our approach on a fragment-based lead discovery campaign on PDE-10-A, Table 1. We deleted everything from the crystal structure but the heteroaromatic ligand and Phe729, the stacking amino acid. We calculated the hydration free energy of this small fragment alone, because we assumed that the pocket is equal for all ligands and that the error introduced by the double sided π-cloud of the phenylalanine is compensated throughout the data set.
As shown in previous studies for monocyclic heteroaromatics, stacking interaction correlates well with the dipole moment; 60 however, we can also identify a correlation for the solvation free energy of these aromatic molecules and vacuum stacking energy as well as dipole moment. 19 Due to the high number of substitutions on these fragments, and the deviation from ideal stacking geometries due to the binding pocket, we could not identify a correlation for the binding free energy either with vacuum stacking interaction (Pearson correlation −0.18) or with the solvation of the respective fragments (Pearson correlation −0.26). However, the Pearson correlation of these 10 fragments for experimental binding free energy and the calculated desolvation penalty resulted in 0.79, Figure 5 and Table S2.
In the FBLD approach on PDE-10-A, the compound with the lowest binding affinity was identified as 4-amino-2-methylphenol (PDB accession code: 4LLP). While the vacuum stacking interactions would predict this specific fragment with the worst stacking interaction energy of this data set, we clearly see that it has the second lowest desolvation penalty. In the study by Recht et al., 54 two fragments are chosen for further development: 4chloro-1,3-benzothiazol-2-amine (PDB accession code: 4MSH) and 8-nitroquinoline (4MSN). Judging solely from the desolvation penalty, these two compounds were ranked second and third in our desolvation penalty. While they also show substantially higher vacuum stacking interactions than 4amino-2-methylphenol, at 6.66 and 6.90 kcal/mol, they still stack worse than the top stacking compounds, i.e., 5-nitro- Figure 5. Correlation of desolvation penalty with experimentally determined binding affinity. The points are color coded by the calculated vacuum stacking interaction energy for the geometries found in crystal structures (left). On the right, we plot the stacking interactions against the experimental binding free energies. The points are color coded by the calculated desolvation penalty. The correlation for the experimental binding free energy and the desolvation penalty results in 0.79, while the vacuum stacking interaction shows a Pearson correlation of −0.18. The sum of the vacuum stacking interaction energy and the desolvation penalty revealed a Pearson correlation of merely 0.13. This is however due to 4-amino-2-methylphenol, which shows such a low vacuum stacking interaction that it cannot be compensated by the low desolvation penalty, SI Figure 4. After performing a geometry optimization, using the previously mentioned quantum mechanical setup, of the respective fragments, we obtained a Pearson correlation of −0.40 for the experimental binding energy and the energy for the optimized complex.

■ DISCUSSION
This study reveals that the desolvation penalty of complexes plays a crucial role in predicting binding properties of several aromatic cores and lead compounds. The optimized positions of the unsubstituted fragments revealed a tendency to be displaced toward the methyl group of toluene and 4-methylphenol, equally ( Figure 6A). This trend was also observed in a recent study on stacking interactions by Bootsma et al. 17 This is most likely a result of the best possible alignment of the dipoles while not violating parallel displaced geometries and the rule of heteroatoms being positioned outside of the amino acid mimic π-cloud. 60 However, when looking at the structures of the FBLD approach, we clearly see that the heteroaromatics are usually away from the protein backbone, mainly due to steric hindrance. Nevertheless, the change of environment from vacuum to water might also influence the optimal geometry. To test this hypothesis, we performed optimizations for pyridazine and 4-methylphenol in implicit solvent and were able to identify geometries with lower energies when the heteroaromatic compound is displaced from the methyl group of 4-methylphenol, Figure 6.
It is known that the free energy surface of stacked heteroaromatics is rather shallow, 60 therefore the two obtained local energy minima show very similar stacking interaction energies, −5.28 kcal/mol vs −5.25 kcal/mol. However, when we optimized the additional minimum, Figure 6B, in a vacuum, the geometry dissociated.
When projecting the enthalpy and entropy grids from the GIST calculation into the respective crystal structures, we can clearly identify where the water molecules of the solvation shell are or should be replaced by hydrophilic amino acids (Figure 7). Not only can we identify the amino acids mainly involved in the binding process, but we can also correctly predict the positions of water molecules modeled in the crystal structure (SI Video 1).
Gln726 shows H bonds with all fragments presented in the study by Recht et al., regardless of their different binding geometries. 54 We can see that the best binding fragment (PDB accession code: 4LLP) interacts not only in terms of π-stacking with Phe729 and H bonds with Gln726, but also the substituted hydroxyl group of the fragment points toward the solvent exposed region of the binding site. The hydration shell of the fragment is not completely broken in forming the complex, and the already small desolvation penalty can be mostly compensated. When comparing the desolvation penalties of the fragments in 4MSH and 4MSA, they are almost identical (Figure 7). However, in this case, we can at least partly attribute the higher affinity of 4MSH to the more favorable stacking geometry compared to 4MSA. The most unfavorable fragment in terms of experimental ΔGBind as well as in desolvation penalty results from 4,6-dimethylpyrimidin-2-amine with the PDB accession code 4LLX (Figure 7). We do not only observe a high desolvation penalty, but we can also see from the crystal structure despite finding H-bond interactions with GLN726 that the fragment solely offers hydrophobic interactions toward the solvent accessible part of the binding pocket.

■ CONCLUSION
In our study, we confirm the recently published correlation of solvation free energy of monomers with vacuum stacking interaction for amino acid mimics of phenylalanine, tyrosine, and tryptophan. Furthermore, we show that the desolvation penalty calculated for the respective complexes shows a strong inverse correlation with vacuum stacking interactions of optimized heteroaromatic cores. This can be of great interest in the initial phase of a drug design process where vacuum stacking interactions are used to optimize the scaffold of a lead compound.
In a protein environment, the experimental geometries are not always optimized, and further substitutions on the heteroaromatics and the interactions in the binding pocket make it harder to use vacuum stacking interactions alone as a predicative tool in drug design. We clearly see in the studied FBLD campaign that stacking alone does not tell the whole story, because hydration properties influence complex binding. Vacuum stacking interactions mainly align the dipoles, resulting in geometries that are not entirely favorable, as shown by the Figure 7. GIST resulting from the calculations of the fragment inhibitors with toluene, projected into the respective crystal structures. Red mesh: favorable solute−water enthalpy (cutoff −0.1 kcal/(mol·Å 3 )). Blue mesh: low water entropy (cutoff −0.08 kcal/mol·Å 3 )). We find that the major hydration hot spots from the GIST calculations are replaced by interactions with Phe729 and GLN726.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article optimization of L19 with and without implicit solvents. Additionally, calculation of the desolvation penalty allows a comparison of substituted heteroaromatics rather than just the core. Calculations of complexed geometries revealed that the GIST calculations of stacked fragments do not only correlate well with the experimental KI values but also show the preferred environment of a substituted heteroaromatic compound within the binding pocket.
■ ASSOCIATED CONTENT * sı Supporting Information The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acs.jcim.9b01165. Prediction of the positions of water molecules modeled in the crystal structure (MP4) Additional data necessary to regenerate the plots and correlation in this paper; information on the X-ray densities of the experimentally determined structures (PDF)