Per|Mut: Spatially Resolved Hydration Entropies from Atomistic Simulations

The hydrophobic effect is essential for many biophysical phenomena and processes. It is governed by a fine-tuned balance between enthalpy and entropy contributions from the hydration shell. Whereas enthalpies can in principle be calculated from an atomistic simulation trajectory, calculating solvation entropies by sampling the extremely large configuration space is challenging and often impossible. Furthermore, to qualitatively understand how the balance is affected by individual side chains, chemical groups, or the protein topology, a local description of the hydration entropy is required. In this study, we present and assess the new method “Per|Mut”, which uses a permutation reduction to alleviate the sampling problem by a factor of N! and employs a mutual information expansion to the third order to obtain spatially resolved hydration entropies. We tested the method on an argon system, a series of solvated n-alkanes, and solvated octanol.


The mutual information expansion of hydration entropy
We describe the configuration of N water molecules using N translational degrees of freedom from ℝ³ and N orientations from SO(3). The set of translational degrees of freedom is T = {1, …, i, …, N}, where the index i denotes the translational coordinates of molecule i. In the same fashion, the set of rotational degrees of freedom is R = {1, …, j, …, N}.
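The third-order mutual information expansion of the total system entropy is defined as [1][2][3][4]
$$S_\mathrm{tot} \approx \sum_{i \in (T \cup R)} S_1(i) \;-\; \sum_{(j,k) \in \mathrm{Pairs}(T \cup R)} I_2(j,k) \;+\; \sum_{(l,m,n) \in \mathrm{Triples}(T \cup R)} I_3(l,m,n), \qquad (1)$$
where S_1(i) is the entropy of the single degree of freedom i, I_2(j, k) and I_3(l, m, n) are the pair and triple mutual information terms, (j, k) ∈ Pairs(T ∪ R) is a pair with j ≠ k, and (l, m, n) ∈ Triples(T ∪ R) is a triple of degrees of freedom with unique l, m, n.
The entropy from equation 1 is split into rotational, translational, and mixed parts:
$$S_\mathrm{tot} \approx S_\mathrm{trans} + S_\mathrm{rot} + S_\mathrm{trans\text{-}rot}.$$
The translational entropy reads
$$S_\mathrm{trans} = \sum_{i \in T} S_1(i) \;-\; \sum_{(j,k) \in \mathrm{Pairs}(T)} I_2(j,k) \;+\; \sum_{(l,m,n) \in \mathrm{Triples}(T)} I_3(l,m,n),$$
and the rotational entropy reads
$$S_\mathrm{rot} = \sum_{i \in R} S_1(i) \;-\; \sum_{(j,k) \in \mathrm{Pairs}(R)} I_2(j,k) \;+\; \sum_{(l,m,n) \in \mathrm{Triples}(R)} I_3(l,m,n).$$
The translation-rotation correlation contribution is given by the remaining mixed terms and reads
$$S_\mathrm{trans\text{-}rot} = -\sum_{j \in T,\, k \in R} I_2(j,k) \;+\; \sum_{(l,m,n) \in \mathrm{Triples}(T \cup R) \setminus (\mathrm{Triples}(T) \cup \mathrm{Triples}(R))} I_3(l,m,n).$$
In this Per|Mut implementation, we neglect the mixed three-body correlation terms due to their small contribution and slow convergence.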
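The structure of the truncated expansion in equation 1 can be illustrated with a minimal sketch. It is not the Per|Mut implementation: it assumes discrete toy variables and a plug-in (histogram) entropy estimator in place of the k-nearest neighbor estimator used for the continuous degrees of freedom, and the function names `entropy` and `mie_entropy` are hypothetical. Entropies are returned in nats.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(samples, idx):
    """Plug-in Shannon entropy (nats) of the marginal over the columns in idx."""
    counts = Counter(tuple(s[i] for i in idx) for s in samples)
    n = len(samples)
    return -sum(c / n * math.log(c / n) for c in counts.values())


def mie_entropy(samples, order=3):
    """Mutual information expansion truncated at the given order (1, 2, or 3).

    Assumption: every column of `samples` is one discrete degree of freedom.
    """
    dofs = range(len(samples[0]))
    s1 = {i: entropy(samples, (i,)) for i in dofs}
    total = sum(s1.values())                      # sum of single-body entropies
    if order >= 2:
        s2 = {p: entropy(samples, p) for p in combinations(dofs, 2)}
        # subtract the pair mutual information I_2(j, k) = S(j) + S(k) - S(j, k)
        total -= sum(s1[j] + s1[k] - s2[(j, k)] for j, k in s2)
    if order >= 3:
        for l, m, n in combinations(dofs, 3):
            s3 = entropy(samples, (l, m, n))
            # add the triple term I_3 = sum S_1 - sum S_2 + S_3
            total += (s1[l] + s1[m] + s1[n]
                      - s2[(l, m)] - s2[(l, n)] - s2[(m, n)] + s3)
    return total


# Toy system: two independent fair bits and their XOR.
# All pair mutual informations vanish; the correlation is purely three-body.
samples = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1)]
second_order = mie_entropy(samples, order=2)
third_order = mie_entropy(samples, order=3)
print(second_order, third_order, math.log(4))
```

For three degrees of freedom the third-order expansion is exact, so `third_order` recovers the joint entropy ln 4, whereas the second-order truncation misses the three-body correlation and overestimates the entropy.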
Using this separation, the full entropy is
$$S_\mathrm{tot} \approx S_\mathrm{trans} + S_\mathrm{rot} - \sum_{j \in T,\, k \in R} I_2(j,k),$$
where S_trans and S_rot are the translational and rotational entropies, and I_2(j, k) is the mutual information between the translational degree of freedom j and the rotational degree of freedom k.

Choosing the scaling factor ξ for the composite metric
The composite metric defined in eq. 8 of the main text measures combined distances between molecular positions x in Euclidean space and molecular orientations q. The scaling factor ξ ensures equal units under the square root, where d_eucl yields distances in nm and d_quat returns unitless orientational distances. In the limit of infinite sampling, its numerical value is irrelevant; at finite sampling, however, practical considerations limit it to a reasonable range.
Figure S1: (A) Qualitative illustrations of the effects of too small ξ, reasonably chosen ξ, and too large ξ (from left to right) using an example data set. The horizontal and vertical axes symbolize the translational degrees of freedom (in Euclidean space) and the orientational degrees of freedom (using quaternions), respectively. The grey ellipses visualize the balls induced by the respective (ξ-dependent) metric around the black data point to its k = 5-nearest neighbor (blue). (B) The mutual information between the translational and rotational degrees of freedom of a water molecule close to the hydroxyl group of octanol as a function of the scaling factor ξ. The correct value of ≈ 7.4 J·mol⁻¹·K⁻¹ is reached in a plateau region between ξ ≈ 0.2 and 30 using k = 1.
As shown in Fig. S1, too small or too large scaling factors ξ result in strongly elongated k-nearest neighbor balls, either along the translational or the orientational degrees of freedom.
Since the k-nearest neighbor method estimates the local probability density as k over the volume of the ball to the kth neighbor, the probability density is assumed to be constant within each ball. For strongly elongated balls, this assumption no longer holds (see Fig. S1A). As a result, the mutual information is underestimated, as correlations are "smeared out".
This effect is demonstrated in Fig. S1B, where the mutual information between translational and rotational degrees of freedom of a water molecule close to the hydroxyl group of octanol was calculated for scaling factors ξ between $10^{-2.5}$ and $10^{+2.5}$ nm⁻¹. As expected, the mutual information is underestimated for very small and very large values, whereas the correct value of ≈ 7.4 J·mol⁻¹·K⁻¹ is reached in a plateau region between ξ ≈ 0.2 and 30.
As the width of the distribution of a single water molecule after permutation reduction is […] Here, a fixed scaling factor of ξ = 10 nm⁻¹ was chosen within the plateau region of suitable values.
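The k-nearest neighbor density estimate and the ξ-dependent elongation of the neighbor balls described above can be sketched as follows. Eq. 8 of the main text is not reproduced in this section, so the sketch assumes a composite metric of the form d = sqrt((ξ·Δx)² + Δq²), with ξ in nm⁻¹ as in the text. It further replaces the quaternion orientation by a single unitless coordinate q so that the ball becomes an ellipse in two dimensions. The function name `knn_density` and the toy data are hypothetical.

```python
import math
import random


def knn_density(points, query, k, xi):
    """k-nearest neighbor density estimate p ~ k / (N * V_k) at `query`.

    points: list of (x, q) pairs; x is a translational coordinate in nm,
            q a unitless orientational coordinate (1D stand-in for a quaternion).
    xi:     scaling factor in 1/nm of the ASSUMED metric
            d = sqrt((xi * dx)**2 + dq**2).
    Returns the density estimate and the semi-axes (along x, along q) of the ball.
    """
    dists = sorted(math.hypot(xi * (x - query[0]), q - query[1])
                   for x, q in points)
    r_k = dists[k - 1]                 # distance to the k-th nearest neighbor
    semi_x, semi_q = r_k / xi, r_k     # in (x, q) coordinates the ball is an ellipse
    volume = math.pi * semi_x * semi_q
    return k / (len(points) * volume), (semi_x, semi_q)


random.seed(0)
# Toy data: uniform on the unit square, so the true density is 1 everywhere.
points = [(random.random(), random.random()) for _ in range(5000)]
center = (0.5, 0.5)

density, (a, b) = knn_density(points, center, k=100, xi=1.0)
_, (a_small, b_small) = knn_density(points, center, k=100, xi=0.1)
print(f"density estimate at xi = 1: {density:.2f}")
print(f"ball aspect ratio at xi = 0.1: {a_small / b_small:.1f}")
```

Under the assumed metric, the aspect ratio of the ball is 1/ξ, so a ξ far from a suitable value stretches the ball along one type of degree of freedom. For uniform toy data this is harmless, but for correlated data the density is no longer approximately constant across such a ball, which is the origin of the underestimated mutual information.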

Liquid argon at 120 K and 1 bar
To mimic the number density of water, the argon test system was simulated at an average pressure of 10000 bar. Although the system behaves as a liquid in the simulation, these conditions represent a metastable region in the phase diagram. We therefore carried out an additional analysis at liquid-argon conditions of 120 K and a pressure of 1 bar. Because the number density under these conditions is slightly smaller than that of water, the pair correlation and triple correlation cut-offs were increased to 1.15 nm and 0.5 nm, respectively.
As shown in Fig. S2, the entropy yielded by Per|Mut is accurate within 1.3 % of the TI reference value.
Figure S2: Comparison between the TI reference values (grey) and the third-order Per|Mut entropies of the argon test system at high pressure (red) and the same system at a temperature of 120 K and a pressure of 1 bar (blue).

Choosing the hydration shell around alkanes
To compare the entropy losses associated with the solvation of various alkanes, the same number of water molecules must be considered around each alkane. We therefore defined a solvation shell containing the closest n water molecules (after permutation reduction) around each solute. The solvation shells need to be large enough that all water molecules affected by the solutes are included in the calculations, even for the largest solute (decane). To determine the optimal solvation shell size n, we calculated the alkane entropy differences for the closest 20, 50, 100, 200, 300, and 500 water molecules. As shown in Fig. S3A, the slope of the entropy loss (average entropy loss per additional C-atom) converges once the closest 100 water molecules are considered, indicating that all relevant solvation shell entropy effects are included. For larger solvation shells (e.g., the closest 200, 300, or 500 molecules), the average entropy loss per additional C-atom remains almost unchanged, but the statistical error increases, as more bulk water molecules are (unnecessarily) included in the calculation. We therefore considered the closest 100 water molecules the appropriate hydration shell.

Octanol TI reference entropy
Initial attempts to determine the solvation entropy difference between octane and octanol through thermodynamic integration (TI) proved difficult, as the entropy change due to the Coulomb interactions in particular could not be sufficiently converged. In contrast to the widely used free-energy formulation of TI, entropy TI generally suffers from poor convergence properties.
To overcome this problem, we instead calculated the solvation entropy difference between propane and propanol, assuming that the entropy change due to the addition of the hydroxyl group is essentially independent of the hydrocarbon chain length. This allowed us to reduce the system size from initially 1728 water molecules to only 547 molecules, while a distance of at least 1 nm was kept between the centered solute and the boundaries of the simulation box. We simulated 250 intermediate λ-windows, each of which lasted 1 µs. We obtained an entropy difference of (25.1 ± 0.1) J·mol⁻¹·K⁻¹, where the error was determined by omitting 500 ns from each window.
The result serves as an estimate for the entropy difference between octane and octanol.