Estimating Atomic Contributions to Hydration and Binding Using Free Energy Perturbation

We present a general method called atom-wise free energy perturbation (AFEP), which extends a conventional molecular dynamics free energy perturbation (FEP) simulation to give the contribution to a free energy change from each atom. AFEP is derived from an expansion of the Zwanzig equation used in the exponential averaging method by defining that the system total energy can be partitioned into contributions from each atom. A partitioning method is assumed and used to group terms in the expansion to correspond to individual atoms. AFEP is applied to six example free energy changes to demonstrate the method. Firstly, the hydration free energies of methane, methanol, methylamine, methanethiol, and caffeine in water. AFEP highlights the atoms in the molecules that interact favorably or unfavorably with water. Finally AFEP is applied to the binding free energy of human immunodeficiency virus type 1 protease to lopinavir, and AFEP reveals the contribution of each atom to the binding free energy, indicating candidate areas of the molecule to improve to produce a more strongly binding inhibitor. FEP gives a single value for the free energy change and is already a very useful method. AFEP gives a free energy change for each “part” of the system being simulated, where part can mean individual atoms, chemical groups, amino acids, or larger partitions depending on what the user is trying to measure. This method should have various applications in molecular dynamics studies of physical, chemical, or biochemical phenomena, specifically in the field of computational drug discovery.


■ INTRODUCTION
Free energy methods refer to an existing set of methods for estimating free energy differences, for example traditional Free Energy Perturbation (FEP) summed using either exponential averaging (EXP) or the Bennett Acceptance Ratio (BAR) and thermodynamic integration (TI). These methods have become a cornerstone of accurate binding free energy calculations, 1,2 and many reviews have scrutinized the details of such methods, concluding that they are a promising addition to the set of tools used in the drug discovery industry. 3−8 Free energy methods are implemented in many commonly used biological simulation packages, for example NAMD, 9−11 Desmond, 12 GROMACS, 13 BOSS, 14 AMBER, 15 and others. These methods are path based and rely on a set of intermediate states, named λ-windows, in which the interaction parameter λ is altered from 0 to 1. When λ = 0, the simulation represents the initial state of a physical system; this might be the unbound state of a protein and drug complex in a binding free energy calculation or a simulation cell of water for a solvation free energy calculation. When λ = 1, the simulation represents the final state of a physical system. This might be the bound state for the protein-drug complex or the solvated molecule for the solvation free energy calculation. Then a set of intermediate states is taken to gradually measure the free energy change as λ varies from 0 to 1. FEP can be used to extract hydration free energies, 16−18 free energies associated with mutating one molecule into another or one protein into another, 19 and in the context of drug discovery, binding free energies of inhibitors to proteins 12,20,21 The method described in this work is called atom-wise free energy perturbation (AFEP) and is an extension to the EXP method. It is applied to energy and trajectory data from a conventional Molecular Dynamics (MD) simulation with multiple lambda windows. AFEP estimates the contributions from each of the atoms in a system to a general free energy change, given that the total energy of the system is defined to be a sum over atomic energies which are calculated from the MD trajectories. AFEP relies on the same set of MD simulations as conventional FEP analysis does, running a simulation at each of the intermediate λ-values. It then calculates quantitatively the contributions to the free energy change from each "part" of the system by approximating the decomposition of free energy in a simple and intuitive way. The definition of "part" is flexible and could be single atoms or clusters of atoms grouped together (e.g., chemical groups, amino acids, or the solvent) depending on the desired application. The decomposition provided by AFEP cannot be additive between different systems as shown by Mark et al., 22 and similar chemical motifs cannot be expected to have the same free energy contribution in different molecules. However, the results can be interpreted within a given molecule and used as an empirical tool to highlight atoms which contribute relatively favorably or unfavorably to a free energy change.
In this work the atom-wise decomposition is calculated for two types of free energy change: a few hydration free energies and a binding free energy. The extension to FEP discussed in this study is called Atom-wise Free Energy Perturbation because it processes the results of an FEP simulation to give the atom-wise breakdown of the free energies. Many extensions have previously been developed for free energy methods, 1,2,23 and methods for statistically optimizing the results have been developed 11 including BAR 24,25 which is used in this study to calculate total free energy changes. To the authors' knowledge, none of these methods offer the atom-wise distribution results that AFEP provides.
In general, free energy methods output a single number, a free energy change, that can help predict whether a drug will bind, and this has the potential to improve the computational drug design workflow. 12 With the correct collaboration there is hope that free energy methods will become an important part of the drug design and discovery process. 26 With AFEP one can attempt to determine why a drug binds and which parts play which a role in the binding. In this work we give all of the AFEP formulas in the main text. The mathematical derivation of the AFEP method is included in the Supporting Information.
The results are shown for AFEP directly applied to simulations of methane, methanol, methylamine, methanethiol, and caffeine solvated in water to measure a hydration free energy and a system of human immunodeficiency virus type 1 protease (HIV-1pr) with bound and unbound lopinavir drug molecules to estimate the atom-wise distribution of the binding free energy (see Figure 1).
The four methane-like molecules are chosen as similar simple examples of common chemical side groups. Caffeine is a small molecule that blocks the adenosine receptor in the human body. Although quite small compared to most medicinal molecules, it is an example of a molecule that might be considered in the field of drug discovery. Caffeine dissolves in water at room temperature; it is then expected to have a negative free energy of hydration. Caffeine was chosen as it is a small and familiar molecule that mixes polar and nonpolar groups and can be used to test the predictions of the AFEP method against chemical intuition.
HIV-1pr is one of many constituent proteins in HIV-1. This particular protein is responsible for cutting and cleaving parts of the host, after which the virus will go on to reproduce inside the host using the host's cell based machinery. 27 Specifically the virus uses HIV-1pr to cleave Gag and Gag-Pol: two proteins that are essential for the virus to hijack to synthesize a functional and intact viral particle. If HIV-1pr is successfully blocked with a strongly binding inhibitor, that is, an inhibitor with a highly negative binding free energy, the cutting process will be blocked. This will make it impossible for the virus to reproduce, and the infection can subsequently be eliminated by the human body. Lopinavir is a widely used inhibitor of HIV-1pr that is often combined with ritonavir, another drug molecule. 28 This application was picked as lopinavir binds strongly to the well studied HIV-1pr structure (see Figure 2). 29 ■ THEORY AFEP is a methodology that splits the total free energy change of the system into a sum of atom-wise contributions. The method is based on the Zwanzig equation 30 used in the EXP free energy method to calculate the Helmholtz free energy difference ΔF AB between two thermodynamic states A and B with β = (k B T) −1 , where k B is Boltzmann's constant, U X is the total energy of system X, and ⟨·⟩ A represents a state average over system A. From hereon, all state averages are performed over system A without loss of generality, and we will omit the subscript A from equations. Eq 1 is a statement that for any free energy differences between systems only the difference in system energy is needed to calculate the free energy difference.
In practice the free energy difference must be reasonably small because of limited sampling overlap from the start and end states, which would lead to poor convergence. The Zwanzig equation is the starting point for the EXP method, and this equation would normally be applied between each pair of subsequent intermediate λ-values to help smoothly measure a larger system change in smaller steps. For the AFEP method we

Journal of Chemical Theory and Computation
Article define the expansion of the difference in total system energy, ΔU, in terms of the N constituent atoms in those systems where u Xk is the potential energy associated with atom k in system X. The exact representation of u Xk depends on the force field used to simulate the system, and an expression is given later in the text. The goal of AFEP is to write the free energy change as where Δ k ( ) is the contribution to the free energy from atom k, and the sum ranges over all of the N atoms in the system. It is worth stating that the free energy cannot usually be separated into a simple sum in this way, because the entropic contributions are formed from correlations of the multibody density of the system which are not necessarily separable over atoms. 31 The results should not be overinterpreted nor expected to sum together to conserve free energy in an additive way. 22 The special stylization of Δ k ( ) then represents that this is a 'contribution' from an atom k but not a true free energy per se. The goal of this work is to derive a mathematical expression for such a free energy contribution and assess if such an estimate has any meaningful predictive power as an empirical tool.
To achieve this atom-wise decomposition we perform the following stages: • Replace ΔU in the Zwanzig equation (eq 1) with the sum of energies (eq 2). These energies serve as a means of distinguishing which contribution belongs to which atom.
• Expand the exponential term as a Taylor series.
• Perform a multinomial expansion on each power term of this Taylor series.
• Observe that the resulting series is a sum of products of differences in atom-wise energy Δu k . The Δu k in the products are raised to integer exponents.
• Weight the terms by their exponents and group them into contributions toward each atom in the system. For example, a term that looks like Δu 1 2 Δu 2 Δu 3 half belongs to atom 1 and one-quarter to atoms 2 and 3.
• Write the logarithm in the Zwanzig equation as a Taylor series.
• Perform a multinomial expansion on each term in the Taylor series of the logarithm.
• Observe that this series is also a sum of products of differences in the previous terms. These previous terms are also raised to integer powers.
• Apply the same grouping technique to all products of terms to get individual contributions for atoms • Find the closed form for these individual groupings that correspond to the Δ k ( ) AB in eq 4. This procedure is covered in Sections 1−5 in the Supporting Information in full mathematical detail. The resulting closed form expression for Δ k ( ) AB is given by (5) and by inspection of eqs 1 and 5, we have so in effect we have defined a set of weights w k such that a AB (7) by eq 4 then (8) and the weight for atom k is defined by The calculation of these weights will allow the calculation of the free energy contribution for each atom in the system. Because the weights are independent of the value of the total free energy change ΔF AB any method desired can be used to calculate ΔF AB , for example a standard BAR FEP simulation, 24 which is relatively easy using modern methods. 11 All that is needed are the energies associated with each atom in the system, u Xa , which will come from the force field parameters used to simulate the system. The full AFEP expression for the estimate of the free energy contribution from atom a is then given by One could define other partitioning schemes that give a AB AB , for some weight function f(ΔU, Δu a ). A simple example that sums to 1 for all a is f(ΔU, Δu a ) = ⟨Δu a /ΔU⟩. This weight function only considers the energy of each atom and would struggle to predict entropic effects; it is also not rooted to a derivation like the AFEP weights.

■ SIMULATION DETAILS
In the MD simulations, the total system energy comes from electrostatic, Lennard-Jones, bonded, Urey−Bradley, angle, dihedral, and improper terms in the force field, which is a set of parameters for each type of atom in the system. The CHARMM36 force field was used in these simulations. 32 This includes the parametrization of each amino acid in the protein and the explicit water molecules around the protein in the simulation. Fixed bonds and bond angles were used, meaning certain terms are not included in the energy calculations.
To get the atom-wise energies u Xa needed to construct the AFEP weights (eq 9) from the MD simulation we must define the connection between the energy and the force field being used. This is given by  (11) where each term is the sum of all appropriate interactions containing atom a. This energy is a shared contribution from all types of energy measured with Hamiltonian X (the Hamiltonian will change as the FEP variable, λ, changes). For

Journal of Chemical Theory and Computation
Article simulations with fixed bond lengths and fixed angles, which are common in simulations of biological molecules, we have u bonded = u angle = 0. Eq 11 is a statement that for all possible types of interaction, if a number of atoms are involved in an interaction, the energy of that interaction is shared equally between all of the atoms involved. This is an approximation, and variations to this scheme could potentially be made, but to share equally is the least assuming scheme. This work does not include intramolecular terms between the nearest and next nearest bonded neighbors (1-2 and 1-3 terms).
Methane-like Molecules and Caffeine Simulations Details. Caffeine and the four methane-like molecules were simulated with explicit TIP3P 33 water with a lambda schedule that starts with the molecule completely noninteracting with the water (λ = 0) and ends with full interaction between the molecule and the water (λ = 1). The resulting free energy change is then the free energy of hydration for the molecule. For the methane-like molecules these free energies are shown in Table 2 along with experimental values. 34,35 There is experimental data for the caffeine free energy measurement taken from the FreeSolv database, 16 and a similar computational measurement was performed by Mobley et al. 16 These results make up Table 4, which compare experimental and computational results. The force field parameters for all molecules were generated with the ParamChem CGenFF software. 36,37 This software picks the best atom types and charges to represent a given molecule according to its bonding and connectivity similarity with a test set of molecules. The original chemical structure and topology files for caffeine are taken from the ZINC database entry ZINC00001084. 38 The systems were first equilibrated under NVT conditions and then NpT conditions to find a suitable density and subsequently run under NpT conditions for the main trajectory data collection. 32 λ-windows were used, and an MD trajectory was collected for each value of λ. Further details of the MD simulations are given in the Supporting Information.
HIV-1pr with Lopinavir Simulations. The structures used for the simulation of lopinavir in the binding pose of HIV-1pr are from The Protein Data Bank, reference 2Q5K. 39,40 For convenience of processing, the entire free energy calculation was performed in one cell. In this cell there are two copies of the drug lopinavir: one is in the solution and is fully interacting at the beginning of the simulation, and the other is restrained in the natural binding pose in the protein from the PDB data 39 and is noninteracting in the beginning of the simulation. The copy in the solvent was placed 42 Å from the protein which was deemed to be suitably far to stop the majority of interactions with the protein. The simulations had three stages of lambda schedule, as is common for such binding free energy calculations. 20 The first stage is to turn on the Lennard-Jones interactions of lopinavir in the binding site; this creates a cavity in the protein, and the interactions are turned on very slowly at first. This is to prevent a so-called "end point catastrophe" in an FEP simulation, which is where very weakly interacting atoms in the MD simulation can overlap and the divergent form of their interaction potential leads to very large energies and slow convergence. 11 The second stage is to turn the Coulombic charge-based interactions of the solvent based copy of lopinavir off and to turn the charge-based interactions of the binding site ligand on; the third stage is to turn the Lennard-Jones interactions of the solvent copy of lopinavir off. This leaves the end point of the simulation representing the fully interacting and bound inhibitor in the protein and the copy of the inhibitor in the solution fully uninteracting. To keep the inhibitor in the binding site throughout the simulation, a tethering force is used between three atoms in the protein and one atom in the drug molecule. These restraints can be seen in Figure 3. There is a free energy term that must be considered from this tethering which can be decomposed into two parts: the unnatural energy associated with the unphysical tethering force and the unnatural entropic term associated with prohibiting the ligand from accessing the full volume of the simulation cell. The first contribution is measured using a thermodynamic integration   There is a strong hydrogen bond between H and O310 (dashed pink). Four atoms were chosen, O310 the ligand (pink oxygen) and C42, C47, and C44 in HIV-1pr (orange carbons). There is one dihedral restraint through all four, one angular restraint through O310, C47, and C44, and one distance restraint between O310 and C47.

Journal of Chemical Theory and Computation
Article (TI) 41,42 of the system by varying the strength of each degree of freedom in the tethering forces. The strength of these contributions can then be calculated and corrected for. The latter contribution is analytically calculated using a derived equation; the expression for this correction is given by eq 12.
Further details of the MD simulation of HIV-1pr with lopinavir are given in the Supporting Information.
Restraints Correction Factor. A three term restraint across four atoms is used to hold lopinavir in the binding pose in the binding site of HIV-1pr. Three atoms from the binding site and one atom in the ligand are connected with a dihedral term across all four atoms, an angular term across two of the protein atoms and the ligand atom and a separation term from one of the atoms in the protein to the atom in the ligand. This allows the ligand to adopt different orientational poses during the simulation, but keeps it fixed to the protein such that it does not wander through the partially interacting protein when the ligand and protein are only weakly interacting. An analytic correction term ΔA r can be derived to compensate for the restricted environment the ligand resides in as there is an entropic cost of the ligand not being free to wander around the simulation cell. The strength of the restraints is chosen to be suitably weak as not to affect the AFEP results.
Following the method used in the paper by Boresch et al. 20 who analytically calculate this entropic cost for a six point restraint of the ligand in the binding site of the protein, a similar expression was derived for a three point restraint; this gives the free energy correction term associated with constraining the ligand when interactions are turned off as where V is the standard system volume for 1 molar concentration, K r is the strength constant of the distance constraint, K θ is the strength constant of the angular constraint, and K ϕ is the strength constant of the dihedral constraint. r aA,0 is the equilibrium distance between the drug atom and the protein atom. θ A,0 is the equilibrium angle between the atoms with the angular constraint. In addition to this entropic term, there is the energetic cost of the restraints themselves. This should be made as small as necessary to not perturb the system unnaturally when interactions are at the normal level. The restraint must also be strong enough to hold the ligand in the natural binding pose when interactions with the rest of the system are switched off. A set of TI measurements was taken out to ensure the contribution from the restraints when interactions were turned on was close to 0 kcal/mol, which would indicate that it is not interfering with the dynamics of the fully interacting system. The total free energy associated with the restraint was found to be 0.6113 kcal/mol. This value is relatively small and is deemed to be acceptable for the purposes of this simulation. Figure 3 shows a small section of the system to demonstrate the restraints used. Table 1 shows the values of the constants used in the simulation; these are then used to calculate the correction factor (eq 12).

■ RESULTS
Methane, Methanol, Methylamine, and Methanethiol. FEP calculations were performed for methane, methanol, methylamine, and methanethiol to demonstrate the AFEP method for similar small molecules with different side chains. Table 2 shows the total hydration free energy change for these four molecules. The calculated values agree with the experimental values reasonably well. Table 3 shows that atom-wise contributions for the molecules as calculated using AFEP.
Caffeine. A free energy calculation was performed for caffeine solvated in water; AFEP was then applied to the trajectory information to produce an atom-wise breakdown of the hydration free energy. Table 4 shows the total free energy calculated from a standard FEP simulation compared with experiment and a similar computation. Figure 4 shows the atom-wise breakdown of free energies associated with each atom in the caffeine molecule in units of kcal/mol. All of the water molecules in the simulation have been partitioned into one group because they are indistinguishable. For the caffeine simulation the contribution from the water molecules sums to exactly ΔF/2. This arises because the atom-wise energy is defined to be shared evenly across interactions in eq 11. When partitioned in this way the free energy contribution per water molecule is less meaningful, and specific water molecules of chemical interest should not be partitioned into the bulk. Contributions from water molecules within certain shells from the solute could be analyzed by partitioning them carefully according to their distance from the solute.
Some of the atoms in a caffeine molecule are effectively in the same chemical environment. For example H1, H2, and H3 (as labeled in Figure 4) are in the same environment because C1 could rotate freely about its bond with N1. In the limit of

Journal of Chemical Theory and Computation
Article perfect sampling we would expect these three hydrogen atoms to have the same free energy contribution. In practice it may take a long time in an MD simulation to observe the conformational changes required to sample this. This problem is common in many forms of MD. Obvious symmetries can be input manually by grouping atoms together; however, the user should be careful not to insert fictitious symmetries. In the case of the MD simulations performed in this study, fixing the atoms prevents the rotation, and the weights for these atoms are not expected to be the same. A deeper analysis of the values in Figure 4 is given in the Analysis section. Figure 5 shows the convergence of some of the atom-wise weights associated with atoms in the caffeine molecule as the number of considered trajectory frames is increased. The convergence is steady and fairly rapid only requiring around 1000 frames to get a reasonable feel for the contribution from each atom. The values in Figure 4 are calculated from 1800 MD trajectory frames and are expected to be converged according to Figure 5.
HIV-1pr with Lopinavir. Table 5 shows the five terms making up the total system free energy, the total free energy, and an experimental comparison. Three terms are from the different stages of simulation, and the remaining two are correction terms from the restraints used to control the simulations. Figure 6 shows the results of an AFEP calculation on the MD simulation trajectories. Each contribution is given by w bound ΔF bound + w swap ΔF swap + w unbound ΔF unbound . In each case the weights are given by the difference between the atom-wise weights from the copy of the inhibitor in the binding site and

Journal of Chemical Theory and Computation
Article the inhibitor in the solvent. The 12 most contributing and 12 least contributing sites across the lopinavir molecule are labeled. Red atoms are strongly contributing to the binding. Blue sites are those that disfavor the binding. White sites are neutral. Some of the red atoms are points of symmetry; for example, C35, H6, C15, C6, and H2 could all be reflected on the molecular structure by rotations of the side groups. This again implies perfect sampling of all conformational degrees of freedom has not occurred; however, in this case the binding site will restrict the ligand and prohibit these rotations. AFEP proves useful in determining that this asymmetry is present in the results.

■ ANALYSIS
Methane-like Molecules. The free energy calculations for four methane-like molecules mentioned in Tables 2 and 3 show a reasonable agreement with experimental free energy values. For methane the AFEP decomposition predicts that each of the hydrogens is weakly favorable and of similar magnitude which could be interpreted as interactions with the surrounding water. In methanol the oxygen atom is very favorable, and the central carbon has become slightly polarized by the addition of the oxygen. The least favorable atom is the hydrogen in the hydroxyl group, H4; this may be due to hydrogen atoms from surrounding water molecules which are bonding with O crowding closely to H4 and offsetting the contributions made from hydrogen bonds from water to H4. If H4 and O are combined, then the contribution from the OH group is still very favorable. A similar solvent crowding effect may explain the positive contributions from H4 and H5 in methylamine. Again the central carbon is slightly polarized, and the nitrogen atom is very favorable, presumably from hydrogen bonding with the surrounding water. For methanethiol, the net free energy change is small, but each of the contributions is of similar magnitude to methanol and methylamine. The sulfur atom is relatively favorable. The free energies between different molecules cannot be easily interpreted, which is expected according to Mark et al. 22 For example, if methane and methanol are compared, the free energy of mutating H to OH cannot be found by using the total free energy and by splitting methane into CH 3 + H. This is because the CH 3 partition of methane and methanol is not equivalent, partly due to the polarization of the central carbon in methanol. For all cases, the results potentially make more sense when chemical groups are summed over.
Caffeine Total Free Energy. Table 4 shows the total hydration free energy for experimental and simulated caffeine. There is a mismatch between the total free energy for computational simulations and experimental measurements; this is likely because the force field does not contain full information about the molecular interaction, and this is a common problem with classical molecular dynamics simulations that is not specific to the AFEP methodology. The agreement between the two computational methods is within 1 kcal/mol; this is a reasonable tolerance for demonstrating the simulation was run correctly, and the results can be used to explore the AFEP method. The simulation by Mobley et al. was performed in the GROMACS molecular dynamics software 16 and has slight differences in the input and parameter files. It is possible that convergence of the total free energy change could be improved by adding more λ-windows to the BAR calculation. However, showing an accurate result for the total free energy change is not the goal of this work; moreover one of the goals is to show that AFEP produces a valid decomposition independent of the accuracy of the input free energy change.
Caffeine Atom-wise Free Energy. For caffeine the atomwise weights given by eq 9 were well converged. This can be seen in Figure 5, where after around 1000 considered trajectory frames the individual contributions to the free energy do not change considerably. The atom-wise values shown in Figure 6 correspond to 1800 input trajectory frames. Although the results appear well converged, we can tell that perfect sampling has not occurred. Chemically speaking, H1, H2, and H3 should have exactly the same contribution by rotational symmetry about the C1−N1 axis. However, the free energy contribution from H2 is somewhat different than H1 and H3. This highlights that fixed atoms were used in the simulation; if fixed atoms were not used, then such a difference would have indicated a sampling problem. These kinds of sampling problems are not particular to the AFEP methodology and are common in MD simulations. However, for AFEP there is the advantage that prior symmetry information of this kind could be input manually using the atom partitioning scheme. The researcher may still have to be careful before inputting such degeneracies. For example, if the energy of a conformational barrier is particularly high, much greater than the average thermal energy in the system, k B T, then one may lose information by partitioning the atoms together. In this case AFEP could even be a way of diagnosing such conformational barriers, by checking sets of atoms that should have the same contribution by symmetry for distinctly different contributions.
If the AFEP contributions are summed over methyl groups, two out of three cancel to almost zero net contribution. The signs of contributions are consistent across the molecule; for example, all three methyl groups show positive hydrogens attached to a negative carbon attached to a negative nitrogen. N2 is the best contributor to the solubility of the molecule. This makes sense as it is the most solvent exposed part of the molecule and the N2 molecule is assigned a large partial charge in the force field parameters. According to the AFEP simulation C6 is actively resisting the solvation of the molecule. Both of the oxygen atoms have quite large contributions to the hydration free energy. This makes sense as they will create hydrogen bonds with the surrounding water. H4 and C2 are both unfavorably contributing to the solvation. This may suggest that replacing H4 with an OH or NH 2 group would increase the solvation free energy of the molecule.
Lopinvair Total Binding Free Energy. The total free energy change of the HIV-1pr and lopinavir complex was made from five distinct contributions; there are the contributions from the three lambda stages, turning the VDW interactions on in the binding site, swapping the charge interactions from the solvent molecule to the bound molecule, and turning the VDW interactions off in the solvent, as well as the analytic correction for the entropic cost of restraining a ligand in free space to the binding site (eq 12), and the free energy associated with the restraints measured using thermodynamic integration. The sum of these five contributions gave the final value of −22.59 kcal/ mol. The experimental result for this binding is −15.2 kcal/ mol 45 as found using the BindingDB. 44 Lopinvair Atom-wise Free Energy. The AFEP contributions were found for atoms in the lopinavir molecule. The unfavorably contributing sites are colored blue, and the favorable sites are colored red in Figure 6. Both the oxygens and the nearest carbons are positively contributing for O1 and

Journal of Chemical Theory and Computation
Article O4. This is in contrast to the caffeine molecule, where oxygens are paired with unfavorable carbons. There are a few examples of bonded pairs with a favorable and unfavorable atom, for example HN4 and N4, HN2 and N2, HN3 and N3, and HO4 and O4; the latter pairs the strongest of each type of contribution, and all pairs result in significant cancellation if summed over. There are some obvious side groups on the molecule that have few strongly binding atoms and are quite neutral to the binding process. One example is the ring featuring H6, which could potentially be replaced with a similar sized side group that might display a stronger binding.
The protein atoms in the binding site can also be considered and also have atom-wise contributions. Visualization of the interactions between the drug and the binding site are hard due to the complicated 3-dimensional nature of the problem. In a drug discovery context only the drug molecule can be altered and not the protein, so these binding site atoms are not shown in this work. However, in investigations looking into the mechanism of specific protein interactions, this additional information is expected to be useful.

■ CONCLUSIONS
The Atom-wise Free Energy Perturbation (AFEP) method described in this article provides a detailed breakdown of a free energy change across partitions of atoms in a molecular dynamics simulation. The partitions can be selected by the user to capture information at an appropriate length scale. The results shown in this work indicate that a full atom-wise decomposition may be too detailed for certain features, and groups such as amines and hydroxyls should probably be summed over. The main used equations were directly derived from the Zwanzig equation as used in the exponential averaging method (EXP) for free energy changes at fixed volume. During this derivation it was defined that the system energy is decomposable as a sum over the atoms in the system. Two further assumptions were made in the derivation, the first that when decomposing the system energy into individual contributions associated with atoms, the energy is shared equally between groups of atoms interacting with a given force field term. The second that a simple repartitioning scheme can be used during the multinomial expansion stages of the derivation to give a meaningful estimate of the free energy at each atom. Because of the nature of the last approximation, the AFEP method cannot produce an additive decomposition of the free energy, and comparisons between the partitions of dif ferent molecules should be made very carefully. However, AFEP can be used as an empirical tool for studying the internal free energy differences of a given molecule.
The AFEP weights used in the decomposition appear to converge relatively quickly. The system may be partitioned as the user chooses under the method because these weights can be linearly combined. This means interchangeable components of the system, namely the fluid molecules that are in an indistinguishable environment, can be combined together. Atoms that are symmetric by rotation of chemical groups can also be partitioned together in this manner, as could entire amino acids if a large protein was being studied.
The potential applications of this information are numerous. Only six specific examples were covered in this work, but AFEP could be applied to simulations relating to various branches of physics, materials science, chemistry, and biochemistry. AFEP was applied to a calculation of the hydration free energy of methane, methanol, methylamine, methanethiol, and caffeine.
AFEP highlighted which atoms appear to interact the most with the surrounding water molecules. AFEP makes sensible decompositions for the four methane-like molecules and indicates a particular hydrogen atom in caffeine (H4) is relatively weakly interacting with the surrounding water. This type of analysis could be a useful tool to predict, for example, which molecule similar to caffeine dissolves into water more readily.
AFEP was also used to produce a free energy breakdown of the binding free energy change for a protein−ligand complex, HIV-1pr with lopinavir. The binding free energy was made of three stages, AFEP was applied to each stage, and the results were summed along with two correction factors for the restraints used in the MD simulations. AFEP highlighted some sections of lopinavir that are neutral to the binding and may find use as a tool to suggest improvements to a given ligand to increase binding strength. AFEP also showed which components of the lopinavir molecule are the strongest binding and are likely to be essential in the binding process.
The AFEP method can used to analyze the free energy contributions of atoms in most physical and chemical systems from appropriate molecular dynamics simulations arranged in lambda windows. This technology will have applications in the field of computational drug discovery and may assist in developing ligands for other disease target proteins in less time and cost. There is also a great scope for further biological and chemical uses away from the field of drug discovery, as the methods used are general and unassuming about the underlying physical system.

* S Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acs.jctc.8b00027.
Derivation of the equations used in the AFEP method and proof that they are equivalent to the Zwanzig equation, details of the molecular dynamics simulations performed (PDF)