A Polarization-Consistent Model for Alcohols to Predict Solvation Free Energies

Classical nonpolarizable models, normally based on a combination of Lennard-Jones sites and point charges, are extensively used to model thermodynamic properties of fluids, including solvation. An important shortcoming of these models is that they do not explicitly account for polarization effects, i.e., a description of how the electron density responds to changes in the molecular environment. Instead, polarization is implicitly included, in a mean-field sense, into the parameters of the model, usually by fitting to pure liquid properties (e.g., density). This causes problems when trying to describe thermodynamic properties that involve a change of phase (e.g., enthalpy of vaporization), that directly depend on the electronic response of the medium (e.g., dielectric constant), and that require mixing or solvation in different media (e.g., solvation free energies). Fully polarizable models present a natural route for addressing these limitations but at the price of a much higher computational cost. In this work, we combine the best of those two approaches by running fast simulations using nonpolarizable models and applying post facto corrections to the computed properties in order to account for the effects of polarization. By applying this new paradigm, a new united-atom force field for alcohols is developed that is able to predict both pure liquid properties, including dielectric constant, and solvation free energies in different solvents with a high degree of accuracy. This paves the way for the development of a generic classical nonpolarizable force field that can predict solvation of drug-like molecules in a variety of solvents.


INTRODUCTION
Solubility is a very important property in the manufacture and formulation of drugs. 1 For example, drug absorption from the gastrointestinal tract depends strongly on the solubility of orally administrated drugs. Nowadays, most drug candidates have a low solubility in water, and as a consequence, their absorption has to be improved by using a suitable formulation approach. About 40% of drugs in the market and nearly 90% of drugs in development are poorly water soluble. 2 Cosolvents can be used to enhance the solubility of nonpolar solutes by several orders of magnitude. These compounds commonly contain both hydrogen-bonding groups, which interact strongly with water, and nonhydrogen-bonding groups. Dimethylacetamide, ethanol, and propylene glycol are widely used in the pharmaceutical industry as cosolvents. 3 Consequently, knowing the solubility of the drug in water and in other solvents, during its formulation process, is extremely important. 4,5 In addition, the octanol/water partition coefficient is a very important property in the pharmaceutical industry, and for example, it can be used to describe a drug's ability to diffuse through lipids. 6 Apart from their practical importance as solvents, alcohols are also interesting from a fundamental point of view, 7−9 as they are the simplest molecules that combine a hydrophobic moiety with a hydrogen-bonding functional group.
Experimental determination of the solubility of drug molecules is very laborious and time consuming, especially when the solubility is very low. As a result, much time and effort could be saved by accurately predicting it using computational methods. 10,11 In particular, molecular dynamics (MD) simulations using classical nonpolarizable models have shown great promise for predicting drug solubilities. 12,13 Despite this promise, however, MD has not yet surpassed the accuracy of much less computationally expensive approaches like quantitative structure−property relationships or group contribution methods in blind tests of solvation free energy predictions. 14 This is most likely due to the inherent shortcomings and lack of transferability of the underlying nonpolarizable molecular models, which cannot explicitly account for polarization effects since the partial charges are assumed to be fixed. Although progress has been made using higher levels of theory, such as fully polarizable models 15 or QM/MM calculations, 16 these approaches are too computationally expensive for routine use. Therefore, it is important to assess to which extent the lack of polarization effects in classical simulations can be taken into account by inexpensive correction schemes.
In 1987, Berendsen et al. proposed a correction that accounts for the repolarization, or distortion, of a molecule when it moves from the liquid phase to the gas phase and used this to develop a new water model, SPC/E. 17 Since then, most "modern" water models 18−21 and a few methanol models 22,23 have been parametrized using this correction. However, the issue has been mostly neglected in the development of generic force fields like OPLS-AA, 24 GROMOS, 25 TraPPE-UA, 26 and GAFF. 27 Moreover, it has been argued on the basis of theoretical considerations that not only is the Berendsen distortion correction underestimated but it should be supplemented by another term to represent the interaction of the polarized molecule with the purely electronic degrees of freedom of the solvent. 28 When these two energy terms, which have opposite signs, were estimated from high-level quantum mechanical calculations for liquid water, it was shown that they nearly cancel each other out. 29 It is not yet clear to which extent this effect is observed for other molecules of lower polarity. Furthermore, when polarization effects are represented by a simple dipole moment scaling factor, 30 dielectric constants that are in quantitative agreement with experiments are obtained for a wide variety of molecules, eliminating previously observed systematic deviations. 31 These recent advances suggest that marked improvements in the performance of classical nonpolarizable models can be obtained if accurate polarization corrections are employed in a consistent fashion in force field parametrization and validation stages. In this paper, we demonstrate that this can indeed be achieved for the particular case of alcohol molecules.
Apart from the generic force fields mentioned above, there have been several attempts to parametrize specific models for alcohols. For example, Weerasinghe and Smith 22 developed a methanol model from Kirkwood−Buff integrals aimed at describing pure liquid and aqueous solution properties. The Berendsen correction was employed in calculations of the enthalpy of vaporization, but no corrections were applied to the dielectric constant or to thermodynamic properties of mixing. The more recent OPLS/2016 model 23 was reparametrized including solid−fluid experimental data. Several properties were used for the validation of the model including the static dielectric constant, which was calculated using polarization scaling corrections. 30,31 Interestingly, while the calculation of the enthalpy of vaporization applied the model-dependent Berendsen correction, this was not used when computing other vapor−liquid coexistence properties, such as phase diagrams and vapor pressure. Another modification of OPLS was the one proposed by Kulschewski and Pleiss. 32 They modified the partial charges of the hydroxyl group in the OPLS-AA force field for different alcohols (including methanol to octanol) to consider the influence of the alkyl tail on the electronic structure of the hydroxyl group. Their simulated densities, self-diffusion coefficients, and dielectric constants are closer to the experimental values than the predictions from the original OPLS-AA force field. However, these modifications increase the complexity of the model and reduce its transferability. 32 Moreover, no attempt was made to include polarization corrections in the calculations. An improved GROMOS force field for alcohols was obtained by fitting to the density, enthalpy of vaporization, and free energy of solvation in water and in cyclohexane, 33 but again, no polarization corrections were used. In conclusion, although there are several force fields for alcohols available in the literature, none of them, to the best of our knowledge, has been parametrized using consistent polarization corrections. Here, we develop a new united-atom model for alcohols taking into consideration polarization corrections at the parametrization and validation stage for all relevant properties. Unitedatom (UA) models have fewer interaction sites than all-atom models, and therefore, they are less computationally expensive and simpler to parametrize. Furthermore, our starting point is a modified TraPPE-UA force field for hydrocarbons 34 (i.e., alkanes, alkenes, and alkynes) that corrects small but systematic deviations when predicting solvation free energies in hydrophobic solvents using generic force fields. 35 In particular, we show that very accurate predictions of solvation free energies in solvents of different polarity can be obtained when polarization effects are consistently accounted for.

METHODOLOGY
2.1. Bulk Properties. Simulations were run with Gromacs 5.1.2. 36 First, 399 molecules of methanol were inserted in a cubic box of approximately 3 nm each side, and a steepest descent minimization run was performed. After this, a 100 ps NVT simulation and a 100 ps NPT simulation, using the Berendsen coupling scheme, 37 were performed to reach the desired temperature (298.15 K) and pressure (1 bar). Subsequently, a 25 ns NPT simulation was run where the temperature was kept at 298.15 K using the V-rescale thermostat, 38 with a time constant of 0.1 ps, and the pressure was controlled using the Parrinello−Rahman barostat, 39 with a time constant of 2 ps and an isothermal compressibility of 4.5 × 10 −5 bar −1 . A leapfrog algorithm 40 was used to integrate Newton's equations of motion with a time step of 2 fs. All bonds were constrained using the LINCS algorithm. 41 The Verlet scheme was chosen for neighbor searching, with a cutoff radius of 1 nm for both van der Waals and electrostatic interactions, and long-range dispersion corrections for energy and pressure were applied. Long-range electrostatic interactions were calculated using PME 42 with a Fourier spacing of 0.16 nm. Additionally, periodic boundary conditions were always applied unless stated otherwise.
The same procedure was followed when simulating ethanol (473 molecules), propanol (361 molecules), butanol (281 molecules), pentanol (216 molecules), hexanol (180 molecules), heptanol (147 molecules), octanol (128 molecules), nonanol (111 molecules), decanol (103 molecules), 2propanol (367 molecules), 2-butanol (290 molecules), 2pentanol (233 molecules) 2-methyl-2-propanol (289 molecules), and 2-methyl-2-butanol (228 molecules). The number of molecules in each simulation box was selected to maintain an approximately constant box size (ca. 3 nm), in order to minimize the use of computational resources. We have previously shown that the results (with the exception of the self-diffusion coefficient; see below) are independent of system size provided long-range corrections are employed. 35 In total, 10 independent simulations were run for each compound, and error bars were calculated to give a 95% confidence interval using a Students' t equal to 2.262. 43 In some cases, the error bars were too small to be plotted, and therefore, they are not visible in the graphs.
The enthalpies of vaporization were calculated using eq 1 Journal of Chemical Information and Modeling pubs.acs.org/jcim Article where U gas is the molar potential energy in the vapor phase, U liq is the molar potential energy in the liquid phase, R is the ideal gas constant, and T is the temperature. This equation assumes that the kinetic energy per molecule is the same in the gas phase and in the liquid phase, an assumption which has been shown to be very accurate for small molecules. 44 To estimate the potential energy in the vapor phase (i.e., the intramolecular energy of an isolated molecule), a 50 ns simulation of a single molecule in vacuum with no boundary conditions was run for each alcohol, and the first 10 ns were discarded. For these simulations, the cutoff scheme selected for neighbor searching was "group", and the cutoff radii were all set to 0. A correction term (C pol ) was added to the enthalpy calculation to account for polarization effects. This term was calculated using eq 2 45 Here, μ l and μ g are the liquid and gas phase dipole moments, respectively. Additionally, α and ϵ el are the electronic polarizability of the molecule in the gas phase and the highfrequency dielectric permittivity of the medium, respectively. The first term on the right-hand side of eq 2 is the distortion energy, which is favorable when moving from the liquid phase to the gas phase (as in a vaporization process). The second term is the interaction energy of the molecule with the surrounding electronic continuum, described here by a simple Kirkwood−Onsager model 46−48 for a dipole in a spherical cavity. The radius of the cavity is treated self-consistently so that it is eliminated from the equations for the polarization energy. 45 Equation 2 contains only the dipole moment contributions to the polarization energies. Although it has been shown that quadrupole (and potentially higher moment) contributions can be non-negligible, 29,49 it was not possible to calculate those contributions here due to the absence of sufficiently accurate data for the quadrupole moments and quadrupole polarizabilities of alcohols in the liquid phase. Note also that quantum vibration corrections were not included in the calculation of the enthalpy of vaporization, as they are already negligible for methanol. 22 Finally, it is important to clarify that the correction C pol employed here is independent of the parameters of the model, unlike the correction proposed by Berendsen et al. which depends on the effective dipole of the model (μ model ), as can be seen from eq 3. 17 The theory of Leontyev and Stuchebrukhov 45 is based on decoupling the contributions to the dielectric response of the solvent coming from the "fast" electronic degrees of freedom and those arising from the "slow" nuclear degrees of freedom, following the Born−Oppenheimer approximation. As such, the liquid state can be thought of as nuclear charges q i moving in a polarizable electronic continuum with a high-frequency dielectric permittivity (ϵ el ) equal to n 2 (where n is the refraction index of the medium), while the relative permittivity of the gas phase is close to 1 (ϵ el ≈ 1 for a perfect vacuum). This purely electronic continuum interacts with the polarized molecule and also screens its charges, resulting in effective charges that are much lower than those of the real liquid, i.e., q i eff = q i /ϵ el 0.5 . Hence, in order to consistently account for polarization effects, the value of μ l to be used in eq 2 should be the dipole of the real liquid, which is normally much higher than the dipole moment of nonpolarizable fixed-charge models due to the charge screening effect. 29,45 As mentioned earlier, the original Berendsen correction uses the dipole moment of the model as a proxy for the real liquid dipole, hence leading to strongly underestimated distortion corrections. 45 In previous work, 29 we used the real dipole moment of liquid water estimated from experiments 50 and high-level ab initio MD calculations. 51 Although, to the best of our knowledge, there is no reliable experimental estimate of the liquid phase dipole moment for alcohols, we were able to find a few AIMD studies that report the liquid dipole of methanol. Pagliai et al. 52 reported a value of 2.64 D from simulations on a rather small box of 26 methanol molecules using the BLYP exchangecorrelation functional. The same functional was used by Handgraaf et al. 53 with a larger box of 64 molecules, yielding a value of 2.59 D. In a study of the methanol vapor/liquid interface with a box of 120 molecules, Kuo et al. 54 report average bulk liquid dipole moments around 2.7 D, using both BLYP and PBE functionals. Finally, a more recent study by Sieffert et al. 55 report a range of values obtained with several functionals in a box of 64 molecules, ranging from 2.58 to 2.84 D. Importantly, some of the calculations were carried out with dispersion corrections, which are known to play important roles in determining the structure of liquids. 51 The average value of those calculations is 2.7 D, which we take to be the best estimate of the dipole moment of liquid methanol. Unfortunately, we were unable to find similar studies for larger alcohols; therefore, an alternative approach was sought. Namely, we used an equation proposed by Leontyev and Stuchebrukhov,45 again based on the theory of continuum dielectrics in a spherical cavity, where ϵ sol is the static dielectric constant of the solvent.
Using experimental values for μ g and ϵ sol and estimating ϵ el from the square of the experimental refraction index at the sodium D-line frequency, we obtain a value for methanol of 2.7 D, in precise agreement with the average of AIMD simulations mentioned above. This expression also provides a good estimate of the dipole moment of liquid water. 29 This gives confidence that eq 4 can provide reasonable estimates of the liquid phase dipole moments for the remaining alcohol molecules. Table S1 reports details of these calculations for all the molecules considered in this work, together with the corresponding polarization corrections obtained from eq 2.
The diffusion constants were calculated using the Einstein relation (eq 5). The left-hand side of eq 5 was obtained by linear regression of the mean square displacement (MSD), where the times were weighted according to the number of reference points. In every case, the fitting was done between t = 100 ps and t = 500 ps, where t is the time from the reference positions and not simulation time. The estimated error was the difference of the diffusion coefficients obtained from fits over the two halves of the fit interval. 56 Journal of Chemical Information and Modeling pubs.acs.org/jcim Article Here, D A is the self-diffusion constant of particles of type A, and the left-hand side of eq 5 is the limit of the mean square displacement when time tends to infinity. It has been shown that the apparent self-diffusion coefficients depend significantly on the system size, and thus, it is important to correct for the deviations observed when comparing to an infinite system. 57 The diffusion coefficient depends linearly on the inverse box length (1/L), and consequently, the diffusion constant for an infinite system can be calculated by extrapolating a straight-line fit 57 ( Figure  S1). For this reason, simulations were run for four different box lengths (3, 4, 5, 6 nm), and a correction term was calculated for each alcohol molecule by subtracting the simulated diffusion constant obtained using a cubic box of 3 nm each side from the intercept of the fitted line. It was assumed that these corrections would be similar for other parameter sets, and therefore, they were used during force field optimization and validation stages. To corroborate this hypothesis, new correction terms were calculated using the final optimized force field, and the values obtained were consistent with the ones calculated with TraPPE (see Table S2 for a comparison).
The static dielectric constant was estimated with Gromacs through eq 6 58 where k B is the Boltzmann constant, V is the volume of the simulation box and M is the total dipole moment of the system. The dielectric constant obtained using Gromacs was then corrected for polarization effects using eq 7. 31 where ϵ sim is the dielectric constant obtained from eq 6, and k is the ratio between the dipole moment of the real liquid (estimated from eq 4) and the dipole moment of the nonpolarizable model. This equation, derived elsewhere, 31 takes into account the purely electronic response of the liquid, which is not accounted for in nonpolarizable models, as well as the charge screening caused by the presence of this electronic continuum.

Free
Energy of Solvation. The free energy of solvation (ΔG sol ) was calculated by alchemical transformations using standard protocols, 59 with full details provided in the Supporting Information. Here, we provide only a summary of the most important methodological considerations.
First of all, we computed ΔG sol via a one-step transformation using the option couple-intramol = "no" in Gromacs. This means that the decoupled state of the solute molecule corresponds to the proper vacuum state without periodicity effects and not to a molecule without intramolecular interactions. 56 This implies the assumption that the contribution of the intramolecular degrees of freedom of the solute to the solvation free energy are the same in the gas phase and in the liquid phase. We have previously shown that this assumption is accurate for solvation of alkanes up to hexadecane. 35 To further check the accuracy of this assumption for alcohols, the free energy of solvation of decanol obtained from two independent one-step simulations was compared to the results obtained using a full thermodynamic cycle (i.e., a standard two-step simulation), and no significant differences were observed (see Table S3 in the Supporting Information).
The free energies were sampled using Bennett's acceptance ratio (BAR) method. 60,61 It is more efficient and insightful to calculate electrostatic and Lennard-Jones (LJ) free energy contributions separately, 59,62 and this was the procedure followed here. The LJ component of the free energy of solvation was calculated using 15 lambda values (0, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.7, 0.8, 0.9, and 1), while seven lambda values were used to obtain the electrostatic component (0, 0.3, 0.6, 0.7, 0.8, 0.9, and 1). The intermediate lambdas were chosen based on their relative entropy (s), which is a measure of how one probability distribution diverges from a second. 63 Table S4 shows the relative entropies for the LJ and electrostatic component of methanol as an example. The error in the free energy calculations was estimated using block averaging and is reported as error bars representing approximately a 95% confidence interval on the mean.
Free energy simulations were run using a leapfrog stochastic dynamics integrator, 64 and therefore, the temperature was kept constant using the Langevin method. A soft-core function was used to avoid instabilities close to the noninteracting state. 65 The soft-core parameters sc-power and sc-sigma were 1 and 0.3, respectively. Sc-alpha was 0.5 for the LJ term since this seems to be the ideal value for transformations of chargeless molecules, 62 while sc-alpha = 0 was used for the electrostatic component. All other parameters were the same as the ones used to calculate bulk properties except for the Fourier spacing, which was 0.12, and the time constant for pressure coupling, which in this case was 5 ps. Long-range van der Waals interactions were calculated using a cutoff method with longrange corrections. In some situations, it is necessary to use PME when calculating van der Waals interactions; 66 however, as can be seen from Figure S5 in the Supporting Information, both methods are equivalent for the alcohols studied here.
The length of the simulation differed from molecule to molecule, and it was chosen by plotting free energy of selfsolvation vs simulation time to ensure convergence (see Figures S3 and S4 in the Supporting Information). When calculating the LJ term of the free energy of self-solvation, simulations were run for 5 ns (methanol to propanol), 7 ns (butanol to heptanol), and 10 ns (octanol to decanol), while longer simulation times were needed to calculate the component of the free energy of self-solvation that is due to electrostatic interactions. The simulation times in this case were 5 ns (methanol), 15 ns (ethanol and propanol), 35 ns (butanol and pentanol), and 50 ns (hexanol to decanol). Additionally, 10 ns simulations were run to obtain the free energy of solvation of primary alcohols in hexadecane and the free energy of solvation of alkanes in octanol. In all cases, the first 500 ps were discarded.
Polarization corrections were added to the solvation free energies calculated from MD simulations. These corrections were calculated from eqs 2 and 4, but C pol has the opposite sign, since the solvation process moves a molecule from the vapor to the condensed phase (i.e., the direction is the opposite of the vaporization process). Note that when applying those equations to solvation, the experimental values of μ g and α correspond to the solute molecule, while values of ε sol and ε el correspond to the surrounding solvent; hence, μ l is the estimated dipole moment of the solute in the solvent of interest. Results for the polarization corrections in all systems studied here are reported in Section 3.1. Experimental values for the free energies of solvation were taken from the Minnesota Solvation Database 67 and the Katritzky database. 68 Journal of Chemical Information and Modeling pubs.acs.org/jcim Article Additionally, some free energies of self-solvation were calculated from the experimental vapor pressure and density at 298 K, using eq 8 67 i k j j j j j y where 24.45 atm is the pressure of an ideal gas at 1 molar concentration and 298 K, P v is the vapor pressure in atm, ρ is the density in g/L ,and M w is the molecular weight.

Force Field
Parameters. The new model developed here is based on the TraPPE-UA (Transferable Potentials for Phase Equilibria-United Atom) force field 69 as this force field performed better than OPLS-UA and GROMOS at predicting hydrophobic solvation. 35 Only nonbonded parameters were modified, while all bonded parameters were kept unchanged. This force field treats the CH x groups as pseudoatoms, located at the sites of the carbon atoms, while it models all other atoms explicitly. Nonbonded interactions are described by pairwiseadditive LJ 12-6 potentials and Coulombic interactions of partial charges Ä where ϵ ij , σ ij , q i , and q j are the LJ well depth, LJ size, and partial charges, respectively, for the pair of atoms i and j separated by a distance r ij , and ϵ 0 is the vacuum permittivity. The LJ parameters for the interaction between two different atoms are calculated using the Lorentz−Berthelot combining rules (eqs 10 and 11) To describe intramolecular interactions, the TraPPE force field uses fixed bonds lengths, a harmonic potential for bond angle bending (eq 12), and torsional potentials to restrict the dihedral rotations around bonds (eq 13) where θ, θ 0 , and k θ are the instantaneous bending angle, the equilibrium bending angle, and the force constant, respectively.
The LJ parameters for alkanes, alkenes, and alkynes have previously been optimized to correct for the small but systematic deviations observed when predicting their solvation free energies in hydrophobic solvents using the original TraPPE-UA force field. 34 These parameters are listed in Table 1, along with the new parameters found for alcohols in this work (see Section 2.4). Bonded parameters, which are identical to those of TraPPE-UA, are reported in Table S5 for completeness, while the original TraPPE-UA nonbonded parameters are reported in Table S7.
2.4. Force Field Optimization. In this paper, we developed a new force field for alcohols by incorporating polarization corrections on all relevant properties (see Section 2.1) at the parametrization stage. To our knowledge, it is the first time that this approach has been applied in force field development. We have named our new model PolCA, standing for "Polarization-Consistent Approach". Below, we describe in detail the strategy used to optimize the potential parameters for PolCA.
The density, diffusion constant, and enthalpy of vaporization of methanol, 1-pentanol, and 1-heptanol were simulated using different LJ parameters for the oxygen atom and different partial charges for the hydroxyl group. The partial charge of the α-carbon was obtained as minus the sum of the other two partial charges to ensure neutrality of the molecule. The parameters for the alkyl chain groups were taken from the model we have proposed for alkanes, alkenes, and alkynes 34 (Table 1). Additionally, the LJ parameters of the alkane methyl and methylene groups were used for the α-carbons without modification.
Meta-models, that predict molecular simulation results for a given set of parameters, were used to decrease computational cost and allow for a more extensive exploration of the force field parameter space. This technique has already been tested by Cailliez et al. 71 and proven to be successful. They also found that it was much more convenient and efficient to model each individual property used for the calibration process instead of directly modeling the full objective function. 71 For this reason, a learning set was built for each molecule used in the optimization routine, and then, each property was fitted to a second-order model using eq 14 72 where k is the number of parameters and X = (x 1 , x 2 , .., x k ). Coded values of the force field parameters were used instead of using the unmodified parameters as this resulted in better prediction of the properties. Equation 15 was used to convert σ, ε, oxygen's partial charge (q O ), and hydrogen's partial charge (q H ) into their corresponding coded values.  In order to obtain a good estimation of the objective function's minimum, it is important to have a good correlation between the simulated values and the values obtained with the meta-models. This was checked by plotting predicted vs simulated values for each calibration property, and as can be seen from Figure 1, there is a very good match in all cases. The predictivity coefficients shown in the plots were calculated using eq 16 71 16) where N is the number of points in the sample, y sim (X i ) is the simulated value, f(X i ) is the value predicted by the metamodel, and y ̅ is the mean value of y sim (X i ) on the whole sample. Additionally, once an optimum was found, new simulations were run using these parameters, and the values predicted by the meta-models were shown to be in agreement with the simulated values. An ideal force field is one where the difference between experimental and simulated values is zero. Consequently, an objective function was created that takes into account these differences (eq 17) where j = 1 corresponds to methanol, j = 2 to pentanol, and j = 3 to heptanol. Additionally, f(X) is the value predicted using the meta-model, y exp is the experimental value, and ρ, D A and ΔH represent the density, diffusion, and enthalpy of vaporization, respectively. This function was minimized using an optimized steepest descent algorithm, similar to the one used by Di Pierro and Elber. 73 The point that describes the original TraPPE force field was the initial guess. From this point, the algorithm searched along the opposite direction of the gradient of the objective function to find a new point. The step length, the distance to move along the specified direction, was found using 100 equidistant trial steps and choosing the one that returns the lowest function value. The maximum step length (t max ) tried was obtained from eq 18 73 where X k is the current iterate, and ∇F(X k ) is the gradient of the function evaluated at X k .
The optimization routine was done in two parts. First, an optimum was found using larger trial steps (from 0.01 t max to t max ), and then, this optimum was chosen as the starting point for a new optimization algorithm with smaller steps (from 0.0001 t max to t max ). The maximum number of iterations was 4000 for the first part and 100 for the second part because the initial point for the second optimization routine tends to be close to the optimum, and thus, not many extra iterations are needed.
Sensitivity Analysis. A variance-based global sensitivity analysis was performed to determine the most influential parameters for each target property. A global sensitivity analysis can be used to identify which factors make no significant contribution to the variance of the output and, therefore, can be fixed to any given value within their range of variation. 74 Sobol′ indices 75 can be used to rank the input variables based on their importance. First-order indices S i give the influence of each parameter taken alone, while total sensitivity indices S T consider the total effect of an input parameter. 76 The importance of interaction effects for a specific parameter depends on the difference between the firstorder and total indices. If these two indices are close to each other, interaction effects are not important for that parameter. Furthermore, the sum of all S i is equal to 1 for additive models (no interaction between parameters) and less than 1 for Journal of Chemical Information and Modeling pubs.acs.org/jcim Article nonadditive models. For this sensitivity analysis, only firstorder indices were used. First-order indices can be evaluated using Monte Carlo simulations to estimate the mean value, the total variance (D), and the partial variance due to variable x i (D i ) (eqs 19−22 76 )  (22) where N sim is the number of samples, x m denotes the mth sample point, and x (∼i)m is the mth sample point without the variable i. Two matrices of random numbers, called (1) and (2), and order N sim × k (k is the number of input variables) need to be generated first. The superscripts (1) and (2) indicate which matrix the sample point needs to be taken from. The results of the sensitivity analysis for the four fitting parameters considered and for the three target properties (density, enthalpy of vaporization, and self-diffusion coefficient) are shown in Figure 2. The charges on the hydroxyl group are seen to have the strongest influence for all three target properties. This is not surprising, given the importance of hydrogen bonds in the structure and thermodynamics of alcohols. 77−79 The oxygen LJ diameter is of secondary importance, except for density, where it plays a major role. Again, this is to be expected, since density is strongly sensitive to intermolecular packing and hence to excluded volume. At the other extreme, ε is shown not to be a relevant factor for any of the target properties of alcohols. As such, to reduce the degrees of freedom of the force field optimization and thus avoid overfitting, we have decided to keep ε constant and equal to the TraPPE value in the remainder of this paper.

TraPPE with Polarization Corrections.
We begin by comparing the performance of the TraPPE-UA model for alcohols with and without applying the polarization corrections described in Section 2.1. This will assess how useful our analytical corrections are when applied to a well-established, previously developed, force field. We note that there is no a priori guarantee that the corrections will work well, given that the force field was developed with the standard approach that does not explicitly account for polarization effects. We focus only on properties for which polarization corrections are required, i.e., phase change properties and the dielectric constant.
In Table 2, we report, for all solute/solvent pairs considered in this work, the two contributions to the polarization correction in eq 2, i.e., the negative distortion term and the positive electronic polarization term as well as the final value of the correction. We note that the sign of these terms corresponds to the transfer of a molecule from the liquid phase to the gas phase, as in calculating the enthalpy of vaporization; for the opposite direction, as in solvation free energy calculations, the signs are reversed. We also present the values of the distortion correction estimated using the Berendsen approach (eq 3) for both the original TraPPE model and for the new PolCA model. We reiterate the fact that the Berendsen correction has different values for the two models because it uses the dipole moment of the model as a proxy for the dipole moment of the liquid (for the alcohol molecules presented on this table, the dipole moment of the PolCA model is 2.07 D, see below, while the dipole moment of TraPPE is 2.26). In contrast, our polarization corrections are model independent and can therefore be applied to correct simulation results obtained with any force field.
An interesting observation from Table 2 is that for the pure alcohol systems (i.e., where the solute and solvent are both alcohols), the distortion and electronic contributions nearly cancel out, leaving total polarization corrections that are close to zero (they vary between −0.41 and 0.46 kJ/mol). This near cancellation between the two contributions has previously been observed for water. 29,45 Further research is needed to determine if this is a universal feature for molecules with functional groups of high polarity. In contrast, the Berendsen correction is always negative because it only considers distortion effects and is significantly larger in magnitude, particularly for the TraPPE model.
In Figure 3, we compare the predictions of the TraPPE model for the enthalpy of vaporization and self-solvation free energy of alcohols with and without applying polarization corrections. The original TraPPE model, having been parametrized against vapor−liquid coexistence of pure alcohols, performs quite well for these systems. As expected from the values in Table 2, applying our polarization corrections to pure alcohol systems has a very small effect, albeit in the direction of improving agreement with experimental data (see Table 3 for a quantitative comparison of the root-mean-square deviation (RMSD) for each approach). However, applying the Berendsen correction worsens the agreement with experiments, particularly for the smaller alcohols for which those corrections are largest in magnitude.
A completely different behavior is observed for solvation of alcohols in hexadecane, as shown in Figure 4. In this case, the original TraPPE systematically overestimates the solvation free   Journal of Chemical Information and Modeling pubs.acs.org/jcim Article energy (i.e., predicts less favorable solvation than observed experimentally), showing poor transferability from a polar to a nonpolar environment. This has been a well-documented limitation of fixed-charge force fields, as discussed in the Introduction. Applying the Berendsen correction only makes matters worse, as it shifts the solvation free energies to even less negative values. The reason for this becomes apparent when analyzing Table 2. The dipole moment of the alcohol solutes does not change much when they are moved from the gas phase to an alkane solvent, as the latter is nonpolar and hence does not significantly polarize the solute. However, the background electronic continuum is much the same as for polar solvents; in fact, the electronic response accounts for practically the entire dielectric constant of alkane solvents. 31 As a consequence, the distortion component of polarization is actually quite small for solvation in alkanes and is thus significantly overestimated by the Berendsen approach. In contrast, the electronic contribution retains a similar magnitude as for the pure alcohol systems and thus dominates the polarization effects. This leads to total polarization corrections for the solvation free energy of polar molecules like alcohols, in nonpolar solvents like alkanes, that are quite substantial. Applying these corrections to the TraPPE results significantly improves agreement with experiments (see also Table 3 for a quantitative comparison). Finally, we analyze the effect of polarization corrections on the dielectric constant predictions in Figure 5. It is clear that the original TraPPE model significantly underpredicts the experimental dielectric constant for all the alcohol molecules. By approximately accounting for both nuclear and electronic polarization effects, eq 7 leads to a much improved agreement with experiments across the entire range. This reinforces the conclusions of a previous study, where the corrections were seen to practically eliminate systematic deviations between simulated and experimental dielectric constants over a wide range of compounds and molecular models. 31 An important difference is that in our previous study the scaling factor k was adjusted empirically, whereas in the present work, it was derived from first -principles using eq 4.
3.2. New PolCA Model. The analysis in the previous section conclusively demonstrates that polarization corrections, as defined by eqs 2 and 7, can significantly improve the performance of an existing model for predicting the enthalpy of vaporization, dielectric constant, and solvation free energies in both polar and nonpolar solvents. However, the TraPPE model did not include any polarization corrections during parametrization. The reason why it works so well for pure alcohols is because the overall polarization correction for those systems turns out to be very small (Table 2). In this section, we develop a new polarization-consistent model that includes polarization corrections from its inception, as described in Section 2.4.
The TraPPE force field can predict very well the density of small primary alcohols (from methanol to pentanol), but from hexanol to decanol, it overestimates this property ( Figure 6). Also, the difference between simulation and experimental values seems to increase with the length of the alkyl chain, suggesting that the predictions will get worse as we move toward even heavier alcohols. As a preliminary test, the density of each alcohol was calculated using the set of parameters we have previously proposed for hydrocarbons 34 combined with the original parameters for the hydroxyl group taken from TraPPE-UA. We call this model "modified TraPPE", and the corresponding nonbonded parameters are provided in Table  S8. The predicted densities of small alcohols obtained this way were too low, as can be seen in Figure 6. However, from octanol to decanol, the values obtained were closer to the experimental densities compared to those obtained using TraPPE-UA. This can be explained by the fact that the original TraPPE parameters were obtained from calculations of the vapor−liquid equilibria (including saturated liquid densities) of methanol and ethanol, 69 while the new parameters were designed to fit the density, enthalpy of vaporization, and free energy of solvation of alkanes. 34 As the number of carbon atoms in an alcohol increases, the nonpolar interactions become more important, and the behavior of the alcohol tends to be more similar to that of the alkane with the same number of carbon atoms. This is the reason why a new set of LJ parameters for the oxygen and new partial charges for the hydroxyl group and the α-carbon had to be found. As described in Section 2.4, this was achieved by optimizing those parameters to simultaneously match the density, enthalpy of vaporization, and diffusion coefficient of methanol, 1-pentanol, and 1-heptanol.
The PolCA model performs better than TraPPE at predicting the density of alcohols from hexanol to decanol, while still giving good predictions for smaller alcohols. The root-mean-square deviation of the new model with respect to the experimental data is 2.79 kg/m 3 , while for TraPPE it is 4.23 kg/m 3 .
Additionally, simulations were run at five other temperatures (283, 303, 313, 323, and 333 K) for methanol, ethanol, butanol, hexanol, and decanol to validate the new PolCA   Figure 7 shows that PolCA very accurately captures the temperature dependence of the density for all the alcohols.
TraPPE overestimates the self-diffusion constant of primary alcohols, except for the diffusion of methanol which is very close to the experimental value, while PolCA underestimates the self-diffusion constant of methanol but accurately predicts this property for the other primary alcohols (Figure 8). The root-mean-square deviation (RMSD) of the new PolCA model with respect to the experimental data is 0.291 × 10 −5 cm 2 /s, while for TraPPE it is 0.207 × 10 −5 cm 2 /s. These deviations are quite small, considering the inherent simplifications of both models, and we believe our model strikes a good compromise. It is, of course, possible to improve the accuracy of the fit by increasing the number of degrees of freedom of the model (e.g., varying ε or scaling the charges of methanol relative to the other alcohols), but that would violate our aim of keeping the model as general and as transferable as possible.
Both TraPPE and PolCA do a very good job at predicting the enthalpy of vaporization. Our model slightly overpredicts this property for methanol, but from ethanol on, it matches the experimental data almost exactly (Figure 9). It is important to mention that experimental values were taken from the NIST Web site, 80 and the error bars for nonanol and decanol are quite large (6 kJ/mol). Therefore, the values obtained using TraPPE fall within the error bars but are a lot lower than the average. The RMSD of PolCA is 0.71 kJ/mol, and the RMSD of TraPPE (C) is 2.96 kJ/mol. TraPPE and PolCA are both able to accurately predict the dielectric constant once the corrections defined in eq 7 are applied ( Figure 10). The root-mean-square deviations for PolCA and TraPPE (C) are 1.84 and 2.06, respectively. This level of agreement is particularly remarkable considering that the simulation results constitute pure predictions; i.e., the dielectric constant was not part of the training set for the parametrization of either TraPPE or PolCA. It further emphasizes the need to consistently account for polarization effects when comparing predictions of nonpolarizable force fields against experimental dielectric constants. Figure 11 compares the experimental data for the free energy of self-solvation of primary alcohols (i.e., when the solute and solvent are the same molecule) against predictions from TraPPE and PolCA. First of all, we note that the value for pentanol reported in the Minnesota Solvation Database 67 is most likely in error, as it disagrees with the value reported by Katritzky et al. 68 and with estimates based on the vapor pressure and deviates significantly from the trend for the remaining alcohols. With this in mind, and excluding this outlier from the analysis, the RMSD for TraPPE (C) is 1.59 kJ/mol, while it is 1.45 kJ/mol for PolCA. This shows that both models do a good job at predicting this property, although the TraPPE (C) values are on the upper (i.e., less   . The polarization corrections for the self-solvation free energy are the same as for the enthalpy of vaporization but with opposite sign; as such, their overall magnitude is also quite small. The good agreement observed for the free energies of selfsolvation reinforces our confidence that the PolCA model is performing well for properties of the pure liquid alcohols. A much more stringent test is to assess its performance in mixtures with different components. Here, we restrict our comparison to mixtures with alkanes because those are the only other molecules for which our polarization-consistent approach has been applied. 34 Figure 12 shows the free energy of solvation of a series of alkanes in octanol; i.e., we are testing the ability of the model to describe alcohols as solvents. The PolCA model yields excellent agreement with experimental data (RMSD of 0.90 kJ/mol), although TraPPE also performs quite well (RMSD of 1.10 kJ/mol). It is worth emphasizing that in this case no polarization corrections need to be applied. The alkanes are nonpolar molecules and hence, the electro-static contribution to the solvation free energy is effectively zero 97 (Table S1).
A more interesting comparison is presented in Figure 13, which shows the free energy of solvation of primary alcohols in a hexadecane solvent. Notice that for methanol there is a discrepancy in the two experimental data points available. The performance of both models is identical for small alcohols, but our PolCA model yields improved predictions for larger alcohols due to the reparametrized alkane force field parameters relative to TraPPE. In both cases, the simulation predictions for methanol are in agreement with the experimental data of Katritzky 68 but deviate from that of the Minnesota Solvation Database. 67 Overall, Figure 13 shows that PolCA leads to consistent predictions of solvation of alcohols in alkanes and that polarization corrections are the key to achieve this good performance. The RMSD for PolCA is 1.833 kJ/mol, while for TraPPE (C) the RMSD is 2.828 kJ/mol.
The root-mean-square deviation of each property can be found in Table 4.

Secondary and Tertiary Alcohols.
To increase the generality of the PolCA model, we have extended it to describe also secondary and tertiary alcohols. As a first attempt, the LJ parameters of the alkane CH and C pseudoatoms were used for the α-carbons without modification (i.e., identical to the corresponding parameters in pure alkanes 34 ), but this resulted in a model that significantly underpredicted the density. This was somewhat expected since the same effect was noticed by the developers of the TraPPE-UA force field, who had to reduce σ of the α-CH and α-C by 7.48% and 9.38%, respectively. 69 This could be explained by the fact that a C−O    Journal of Chemical Information and Modeling pubs.acs.org/jcim Article bond is shorter than a C−C bond and that the hydroxyl oxygen has an electron withdrawing effect over the carbon. 69 As a second attempt, therefore, we decided to also scale down the sigma values for α-CH and α-C pseudoatoms by the same factor (i.e., reducing them by 7.48% and 9.38% with respect to the corresponding alkane pseudoatoms). The performance of this model for a few nonprimary alcohols is presented in Table  5, from which we can see that the performance of the model is quite satisfactory overall, particularly for secondary alcohols. For tertiary alcohols, the performance deteriorates slightly, and the model is unable to predict the dielectric constant of these molecules.

CONCLUSIONS
In this work, we have proposed a new nonpolarizable force field for alcohols that was parametrized taking into consideration polarization effects. Nonpolarizable force fields, which are normally parametrized to match pure liquid properties, tend to fail at predicting the free energy of solvation of polar molecules in nonpolar solvents since they do not explicitly account for changes in the electron distribution of a molecule due to its environment. We have shown in this article that a consistent account of polarization effects through simple and computationally inexpensive post facto corrections is able to circumvent this limitation. In fact, our new PolCA model is able to predict pure liquid properties and solvation free energies in different solvents with a high degree of accuracy, including the free energy of solvation of primary alcohols in hexadecane. The static dielectric constant of pure alcohols was also accurately predicted after corrections were added to take into account the difference between the simulated dipole of the molecule and the true dipole moment in the liquid phase. The new parameters for the alcohol functional group are consistent with a recent united-atom model for aliphatic hydrocarbons 34 that eliminated systematic deviations in solvation free energy predictions from existing UA force fields. When compared to the benchmark TraPPE-UA force field for alcohols, our force field performs better at predicting the density, enthalpy of vaporization, dielectric constant, free energy of self-solvation, free energy of solvation in hexadecane, and free energy of solvation of alkanes in octanol. The RMSD of the diffusion coefficient is slightly higher for PolCA than it is for TraPPE, but this is likely due to an inherent limitation of the UA approach, since our model sacrifices the diffusion of methanol to predict the diffusion of ethanol to decanol with more accuracy than TraPPE. Indeed, the improved performance of PolCA is seen most significantly for larger alcohol molecules, precisely due to the elimination of systematic errors in the description of the alkane moieties. Nevertheless, it is important to highlight that TraPPE-UA already performs quite well, especially for smaller alcohols, once post facto polarization corrections are added. Even though this model was not parametrized taking into account polarization corrections, it performs well because the corrections for properties involving a change of phase in pure alcohols are quite small for alcohols with a short alkyl chain.
In summary, our paper demonstrates that consistently considering polarization effects in the prediction of properties that involve changes in the electrostatic environment leads to improved predictions and increased transferability. Indeed, remarkable improvements were observed when polarization corrections were applied to the predictions of both our new PolCA model and the existing TraPPE-UA force field for alcohols, indicating that such corrections offer a simple route for improving the transferability of generic nonpolarizable models. We expect this work to pave the way for a systematic parametrization of classical nonpolarizable models that adequately accounts for polarization effects, thus achieving a significant increase in accuracy with negligible increase in computational cost. Nevertheless, there are still several aspects of the polarization-consistent approach that can be improved, extended, or generalized. For example, we used an analytical expression to estimate the electronic contribution to the polarization energy that employs the assumption of a spherical solute cavity. Although this was shown to work well even for long-chain alcohols, it is likely that a more sophisticated approach will be needed for nonspherical molecules that contain more than one polar functional group. Furthermore, correction terms that are similar in spirit to those proposed here will still need to be developed for other properties that involve changes in phase, such as melting enthalpies, vapor− liquid equilibrium curves, critical properties, vapor pressures, and surface tensions. In particular, it should be noticed that our approach cannot be applied in its present form to simulations that involve direct coexistence between two phases. Efforts to address all these limitations are currently underway in our research group.