Combining Machine Learning and Enhanced Sampling Techniques for Efficient and Accurate Calculation of Absolute Binding Free Energies

Calculating absolute binding free energies is challenging and important. In this paper, we test some recently developed metadynamics-based methods and develop a new combination with a Hamiltonian replica-exchange approach. The methods were tested on 18 chemically diverse ligands with a wide range of different binding affinities to a complex target; namely, human soluble epoxide hydrolase. The results suggest that metadynamics with a funnel-shaped restraint can be used to calculate, in a computationally affordable and relatively accurate way, the absolute binding free energy for small fragments. When used in combination with an optimal pathlike variable obtained using machine learning or with the Hamiltonian replica-exchange algorithm SWISH, this method can achieve reasonably accurate results for increasingly complex ligands, with a good balance of computational cost and speed. An additional benefit of using the combination of metadynamics and SWISH is that it also provides useful information about the role of water in the binding mechanism.


Calculation of Absolute Binding Free Energy by Combining Machine Learning and Enhanced Sampling Techniques
. Systems used for the enhanced sampling simulations with experimental and calculated ABFEs (in kcal mol −1 ). Chemical and physical properties of the ligands for each system selected. 3.00E-03 2.32E-09 -11.8 -0.7(0.7) -3.9(1.7) 0.0 -3.7 -11.9(1. 3) 376.45 28  The structural information about the ligands were obtained from the PubChem database. 3 c Experimental free energies were obtained from the relation ΔG = −RT ln(K i ) at T = 298 K. 4,5 d Results of fun-metaD for the funnel located over the right hand side (F RHS ) and the left hand side (F LHS ) of the binding cavity, respectively. e Results of the re-projection of fun-metaD based on the RMSD with respect the bound and unbound structure as a reference. f Results of fun-SWISH and COMet-Path methodologies applied at the funnel located over the LHS (F LHS ). g Results up to 1000 ns of fun-SWISH simulations. Figure S1. Speed and scalability of enhanced sampling simulations on different hardware for a holo complex of human sEH (around 75000 atoms), for fun-metaD, COMet-Path and fun-SWISH with GROMACS 2018.3 with the PLUMED 2.4 plug-in. Scaling of performance with increasing number of computational nodes is shown by successive bars. The performance indicated in the first bar corresponds to 1 CPU node = 2x Platinum 8160 (48 cores) and 1 GPU node = 1x Tesla P100. Figure S2. Correlation between the experimental and calculated binding free energies for the six systems selected for COMet-Path methodology that had more than 1 re-crossing (5alp, 5aly, 5aia, 5alt, 5akg and 5akk). The plots show the results for: (top) fun-metaD for F RHS and F LHS ; (bottom-left) COMet-Path methodology, and (bottom-right) fun-SWISH. For the methodologies at the bottom only F LHS was applied (see Table S1 for exact values plotted). The black line indicates the ideal behavior, while the green and orange shaded regions show the  2 and 3.5 kcal·mol -1 tolerance, `4 respectively, around the ideal behavior. Error bars were calculated as described in the manuscript in Computational Methods section.

Results from unbiased Molecular Dynamic simulations of the full-size systems
The conformational flexibility of the sEH-ligand complexes was examined for the 18 selected systems, divided into three sets of six for each pocket in the binding cavity, i.e., tunnel, right-hand side (RHS) and lefthand side (LHS). We have run one replica per system for 300 ns of unbiased MD simulations. The root mean square deviation (RMSD) profiles of the full-size systems (N-and C-lobes) exhibit high conformational flexibility (see Figure S3A and S4). A detailed examination of the unbiased MD simulations supports these observations, as the flexibility of the loop linking both N-and C-lobes provides a high flexibility to the whole complex. The per-residue fluctuations reveal that, in most of the cases, the N-lobe shows notably larger fluctuations than the C-lobe ( Figure S5). In order to confirm these results, we have performed an alignment of residues 250-540 of the C-lobe of the full-size system over the reference structure of the C-lobe. In this case, the stability of the C-lobe is shown to be reassuringly high ( Figure S6). Additionally, it has been validated through inspection of the trajectories that the linker between the N-and C-lobe is not sufficiently long to permit any obstruction of the binding cavity by the N-lobe.    `8

Results from unbiased Molecular Dynamic simulations of the C-lobe size systems
Considering that the full system does not provide any advantage from a structural point of view, we decided to reduce the size of the system to only simulate the residues of the C-lobe (aa 224-547) where the binding cavity is located. In order to confirm that the deletion of the N-lobe does not affect the stability or shape of the binding site, we carried out 300 ns of unbiased MD simulations for the 18 C-lobe-fragment complexes. The RMSD and RMSF results indicate that the systems are very stable throughout the simulations (see Figures S7-S9).      Table S2. RMSD (Å) and standard deviation determined for full and C-lobe systems, aligned over the backbone atoms of equilibrated reference structure.
RMSD Full-size system (Å) RMSD C-lobe system (Å) PDB code BB of N-and C-lobe BB of C-lobe BB of C-lobe Ligand apo 5 Table S1 grouped by the ligands initially located in the narrow tunnel (top), RHS (middle) and LHS (bottom) of the binding cavity. Here, systems that did not achieve at least 1 re-crossing have been excluded. The black line indicates the ideal behavior, while the green and orange shadows show the  2 and 3.5 kcal·mol -1 tolerance, respectively, around the ideal behavior. Error bars were calculated as described in Computational Methods. Figure S24. Correlation between the deviation of calculated binding free energies (obtained after 500 ns of fun-metaD) simulations and experimental values, with respect the molecular weight of the ligands in the simulations. The ideal deviation of 0 is shown by bold green line, and the green shaded region shows the "acceptable" band of ± 2 kcal·mol -1 . The trends for F LHS , F RHS and combined are shown in purple, green, and black respectively. Error bars were calculated as described in Computational Methods.

Results from COMet-Path Simulations
Below, we describe the theoretical background related to COMet-Path simulations.

Maximisation of Path Entropy
Maximum Path Entropy or Maximum Caliber is an approach used to infer the dynamical rates of transitions on a network given the stationary populations. This approach forms the basis of the Spectral Gap Optimisation of Order Parameters (SGOOP) method, and also the Coefficients Optimisation of a Metric for Path Collective Variables (COMet-Path); one of the methodologies used in this research. The approach is related to the Maximum Entropy approach, which can be used to infer static observables instead.
This derivation follows the steps detailed in Dixit et al. 6 The starting point is a Markovian random walker on a directed network with nodes and edges . We can write the following master equations, which describe the = { , ,⋯} instantaneous probability of being at a node at time , , dependent on the probabilities of all the network nodes: We then discretise the time into intervals and replace the transition rates with transition probabilities . The δ ω transition matrix can be approximated as , using the first two terms of the Taylor expansion for = Ωδ ≈ + Ωδ δ . The time-normalized path entropy is then written as: We also enforce normalisation and stationarity, as well as constraints on the mean number of transitions in a time interval and ensemble averages of arbitrary dynamical rate variables These constraints are enforced using ⟨ ⟩ . Lagrange multipliers, and the unconstrained Caliber is then written as: Maximising the Caliber will result in equations for stationary probabilities dependent on the structure of the entire network. This result, however, does not include the detailed balance condition , required (along with ω = ω stationarity) to ensure the system is in equilibrium. If we choose to enforce it, we obtain: .
This expression can be simplified further if the average value of any of the dynamical variables is not constrained, but rather only the mean jump rate is. In such case, and zero otherwise and we get: The two equations above define the transition matrix . With the knowledge of the transition rates for the entire Ω network, an arbitrary time evolution of the system can be simulated given the initial conditions.

Spectral gap definition
Consider the eigenvalues of the transition probability matrix ordered from the largest to the smallest λ Ω λ 0 ≡ 1 > λ 1 ≥ if we observe x barriers in the free energy estimate reweighted as a function of the s path collective variable, then λ 2 … the spectral gap is defined as . λλ + 1 The largest eigenvalues correspond to the slowest changes in the system and represent the transition over the main barriers in the free energy surface. By maximising the spectral gap, we are therefore increasing the separation between these and the fast transitions. The energy threshold for the barriers is tunable by the user. This value of choice will separate slow and fast processes, so it is sensible to keep it in the order of , since barriers less then this threshold can easily be crossed by an unbiased system at a given temperature, which would classify the transitions as fast processes.

General metric definition
The definition of a general metric results from the combination of collective variables with the coefficients (weights) , as follows: Since we are grouping CVs with potentially different units, in order to combine them properly, the coefficients would need to have the inverse units and the metric itself is dimensionless. However, the range of the CVs should be scaled so that the coefficients better reflect their relative importance. Using these now dimensionless coefficients , we would write this equation as:

COMet algorithm
COMet algorithm relies on at least one simulation already being performed that can give us an estimate of the free energy. Any method that allows for the reprojection of the free energy on a new set of CVs through reweighting or otherwise is acceptable.
The selected CVs need to be calculated for the set of initial snapshots describing the transition. In order to ensure a smoothly varying path for a wide range of possible combinations of CVs, we need a set much larger than the target number of snapshots to be used for the PCV calculation. Since typically this number varies between 10 and 20, an ideal number of initial snapshots would be at least 100. In general, these values could be obtained in two ways: careful interpolation or calculation for a regular smoothly varying trajectory.
After the initial setup is completed, the optimisation itself can proceed. The initial coefficients are taken as equal for all CVs, subject to the normalisation condition: For the metric with a given combination of coefficients, the optimal spacing of snapshots is then found. It is important to ensure the frames vary as smoothly as possible. To this end, we used an empirical energy function found previously by trial and error: = log (σ( 1 ) + 0.35 1 + 0.65 σ( 2 ) + 0.1 2 ) .
Where and represent the sets of distances to the first and second neighbour frames, while denotes standard 1 2 σ deviation and the bar stands for the average. The ideal selection of frames would have the lowest value of this energy. Because this optimisation needs to be performed for every single combination of variables, it is also important that it is performed quickly and reliably in every case. The method we use is a sequential optimisation of the snapshot positions. First, we start from equally spaced snapshots on the path. Then, we optimise the position of a second snapshot between the first and third one. We then apply this procedure for all but the first and the last snapshot. We iterate this until we reach self-consistency, that is the snapshots selected stay the same after optimising all of them. While this approach is not guaranteed to reach the global minimum, it provides a good estimate very quickly. Typically, only a few rounds are needed.
After these snapshots are selected, we then calculate the values of the corresponding PCVs based on the data from the initial simulation. This in turn allows us to reweigh the free energy onto the path variable. From this free energy surface, we can then calculate the transition matrix as described above. We assume that transitions along the path are Ω only possible continuously, and hence we can discount any transitions onto states that are not direct neighbours. This assumption is valid for a well-chosen path, but jumps may still occur if the system is further away from the conformations captured by the snapshots. The eigenvalues are then computed and the spectral gap calculated. Since the aim is to maximise the spectral gap, the energy function for our simulated annealing procedure is simply the negative of the spectral gap. For the next step, we perturb the coefficients slightly and repeat the procedure. Finally, the combination of coefficients resulting in the best spectral gap is chosen.   Figure S27. Free energy surfaces for the 6 replicas obtained after 300 ns of fun-SWISH simulations for systems 5alp, 5aia and 5alt, using  1 as a control over fun-metaD at the same funnel-shaped restraint F LHS . Contour lines are drawn every 2 kcal mol -1 . Figure S28. Project CV (pp.proj) value as a function of the sampling time, for ligands initially bound in the tunnel pocket. Re-crossings (rx) are defined as one unbinding/rebinding event and the total count per system is shown in the upper right of each plot. The upper limit "bound" projection values used in determining the re-crossing count were calculated from the average initial values (grey dashed line) plus one standard deviation. Everything below that value (grey shaded region) is considered bound. For fun-SWISH only simulations at F LHS were run. Figure S29. Project CV (pp.proj) value as a function of the sampling time, for ligands initially bound in the RHS pocket. Re-crossings (rx) are defined as one unbinding/rebinding event and the total count per system is shown in the upper right of each plot. The upper limit "bound" projection values used in determining the re-crossing count were calculated from the average initial values (grey dashed line) plus one standard deviation. Everything below that value (grey shaded region) is considered bound. For fun-SWISH only simulations at F LHS were run. Figure S30. Project CV (pp.proj) value as a function of the sampling time, for ligands initially bound in the LHS pocket. Re-crossings (rx) are defined as one unbinding/rebinding event and the total count per system is shown in the upper right of each plot. The upper limit "bound" projection values used in determining the re-crossing count were calculated from the average initial values (grey dashed line) plus one standard deviation. Everything below that value (grey shaded region) is considered bound. For fun-SWISH only simulations at FLHS were run. a The colour code shows the difference between the experimental and the calculated binding free energies (ΔG calc -ΔG exp ) for each methodology, being the difference equal and/or lower than 2 kcal mol -1 colored in green, between 2 and 3.5 kcal mol -1 in orange, and differences larger than 3.5 kcal mol -1 colored in red.  a The colour code shows the difference between the experimental and the calculated binding free energies (ΔG calc -ΔG exp ) for each methodology, being the difference equal and/or lower than 2 kcal mol -1 colored in green, between 2 and 3.5 kcal mol -1 in orange, and differences larger than 3.5 kcal mol -1 colored in red. Figure S31. Free energy surfaces for the 6 replicas obtained after A) 300 ns and B) 1000 ns of fun-SWISH simulations for 5am3 system, using  1 as a control over fun-metaD at the same funnel-shaped restraint F LHS . Contour lines are drawn every 2 kcal mol -1 .

Relationship between free energies from SWISH and enthalpic contributions from water interactions
The absolute binding free energy is defined as the difference of the free energies between the bound and the unbound state. We can separate these free energies into their contributions: G bound = G water -protein + G protein -ligand + G water -ligand + G internals We now make an assumption that the scaling doesn't affect the internal degrees of freedom or the protein-ligand interaction, given that only the water interactions are scaled. We can also assume that the entropic contributions are independent of scaling. The scaling of the water-protein interactions, denoted by , goes from 0.95 to 1.20. The scaling of the water-λ ligand interactions goes from 1.05 to 0.80 and can be expressed as .

-λ
Now we can express the absolute binding free energy with a particular value of scaling:  Figure S32. Correlation between H diff,w obtained from fun-SWISH simulations and logP for the sEH complexes (the outliers after 300 ns of fun-SWISH simulations are not considered).