Anisotropic Electrostatic Interactions in Coarse-Grained Water Models to Enhance the Accuracy and Speed-Up Factor of Mesoscopic Simulations

Water models with realistic physical–chemical properties are essential to study a variety of biomedical processes or engineering technologies involving molecules or nanomaterials. Atomistic models of water are constrained by the feasible computational capacity, but calibrated coarse-grained (CG) ones can go beyond these limits. Here, we compare three popular atomistic water models with their corresponding CG model built using finite-size particles such as ellipsoids. Differently from previous approaches, short-range interactions are accounted for with the generalized Gay–Berne potential, while electrostatic and long-range interactions are computed from virtual charges inside the ellipsoids. Such an approach leads to a quantitative agreement between the original atomistic models and their CG counterparts. Results show that a timestep of up to 10 fs can be achieved to integrate the equations of motion without significant degradation of the physical observables extracted from the computed trajectories, thus unlocking a significant acceleration of water-based mesoscopic simulations at a given accuracy.


■ INTRODUCTION
Coarse-grained (CG) molecular dynamics (MD) simulations offer an efficient way to study the most diverse systems at a mesoscopic scale, with applications ranging from biochemistry 1,2 to material science. 3−5 The basic idea behind CG is to decrease the number of interacting sites describing individual molecules. By reducing the model resolution, the computational cost and the configuration space of the system decrease, thus enabling the modeling of larger and more complex systems compared to atomistic simulations. For some phenomena, such as the conformation change of enzymes and functional proteins, 6,7 the limiting factor of all-atom (AA) simulations is the timescale needed to witness a specific process. In this regard, CG models enable longer timesteps and thus accessible simulation times by suppressing the highfrequency motion characteristics of light atoms and/or averaging out some intramolecular degrees of freedom. Several CG models of water have been proposed during the last two decades, following different approaches for the mapping process. 8−12 However, current CG models often do not provide an adequate description of intermolecular interactions due to the lack of many-body contributions, which are particularly important for the accurate description of, for example, water properties. 13 Instead, the MOLC model uses finite-size aspherical particles 14 connected with directional bonds. The particles are decorated with virtual sites representing point charges. Differently from previous approaches, short-range interactions are accounted for with the generalized Gay−Berne potential, 15 while electrostatic and long-range interactions with a modified version of the usual Coulomb pairwise summation and the reciprocal-space Ewald solver, respectively. 16 This work explores the tantalizing possibility of carrying out CG-MD simulations of liquid water with near-atomistic quality using the MOLC model, 16 which is available as a user-defined package for the popular materials modeling code LAMMPS. 17,18 The CG models of water presented in this work use one site to represent the whole molecule but with the possibility to host an arbitrary number of virtual charges to account for realistic electrostatic interactions. Other one-site water models were described in the literature, e.g., using a Stillinger−Weber potential to mimic the effect of anisotropic hydrogen-bonding interactions, 19 projecting the many-body interactions into pairwise basis sets, 20 or accounting for electrostatic interactions with a point dipole, 10 even if significant discrepancies with AA models have been shown. However, many biological processes such as electropore formation in membranes are governed by electrostatic interactions, and thus more detailed modeling of electrostatics is required to capture the dynamic behavior of water. 21 Our solution is to explicitly include three virtual charges per water molecule, having a 1-to-1 correspondence between the CG model and the source AA model. Hence, the aim of this work is to test that atomistic models of water can be rewritten in terms of finite-size spheres (thus taking advantage of the longer timesteps allowed by the Richardson iteration method 22 ), without significantly losing the accuracy of all-atom description for structural and dynamical properties of liquid water.

■ COMPUTATIONAL METHODS
The MOLC force field has been described in detail elsewhere. 16 In the MOLC model, the electrostatic interactions are accounted for via virtual sites acting as point charges (see Figure 1a,b). The sites are placed within a skin distance from the external surface of a parent CG particle, i.e., an ellipsoid. For the specific case of water, we use spheres, which are a special class of ellipsoids. The short-and long-range interactions are evaluated with a custom algorithm based on the Coulomb pairwise summation in real space and a particle− particle particle-mesh (PPPM) Ewalds solver in the reciprocal space. 23 Typically, a reduced set of charges is used to replace the complete set of all-atom charges. However, here, we use the three charges of the original water model without any further modification. The novelty introduced in the MOLC model is that the point charges are defined in the ellipsoid framework to which they are related. In other words, each ellipsoid is decorated with a set of charges whose position is defined with respect to the three principal axes of the ellipsoid. As the virtual sites are defined anywhere inside the ellipsoid, we refer to them as "off-center" charges. At each timestep, the Cartesian coordinates of each virtual charge are reconstructed by combining the position and orientation of the parent ellipsoidal particle. In this way, there is no need to keep track of the position of the virtual charges nor to integrate it explicitly.
Electrostatic interactions in the MOLC model are computed by rewriting the usual Coulombic expression from the charge frame of reference to the ellipsoid frame of reference. The direct Coulombic potential in real space is given by where C is an energy-conversion constant, q 1 and q 2 are the virtual charges, and the norm R R R 12 12 12 = · is the scalar distance. As shown in Figure 2, the vector distance between the virtual charges is expressed as where R i = P i + S i (i = 1, 2) is the position of the virtual charge, P i is the center of the ellipsoid, and S i is the relative position of the virtual charges in the ellipsoid system of reference. In this notation, R i is obtained by rotating the charge position from the ellipsoid's frame to the frame of reference using the quaternion of the parent ellipsoid followed by a translation. The force that the virtual charge 1 exerts on its parent ellipsoid bead is computed from the gradient of the potential with respect to P 1 as while the force on the virtual charge 2 is simply defined as F 2 = −F 1 . Finally, the torque is defined as while, in this case, the torque on the second particle can be written as All of the details of the derivation of forces and torques are reported in Supporting Information Note 1.  The AA water models studied are based on a three-site or four-site rigid molecule with a point charge on each site (in the case of Tip4P, the oxygen charge is displaced by 0.1546 Å), plus a single Lennard-Jones potential on the oxygen atom. The molecules are kept rigid with the SHAKE 24 or RATTLE 25 algorithms. In the MOLC force field, three atoms are replaced with a single sphere, whose mass is that of the water molecule and whose radius is the Lennard-Jones σ parameter. The virtual site representing the oxygen charge is placed at the center of the sphere, while the virtual sites representing the hydrogen atoms are placed on the plane z = 0 in the molecular frame of reference. Despite using one particle per molecule, the MOLC model of water includes all of the terms of the corresponding AA model, described with a different mathematical formalism. Consequently, an arbitrary AA configuration mapped to its CG counterpart will have the same intermolecular energy in both representations. A slight difference is expected for torques: in the AA model, each atom will generate a contribution acting on the center-of-mass of the rigid molecule; in the CG model, only the virtual particles carrying the charge of hydrogen atoms will produce a torque acting on the center of the CG sphere. However, this difference is limited as the center-of-mass of the water molecule lies some 0.07 Å from the position of oxygen, as shown in Figure 1a. The CG trajectories can be easily reverse-mapped to the AA representation using the position and quaternion of each ellipsoid, fully exploiting the extra rotational degrees of freedom associated with finite-size particles. The reversemapped trajectories were computed with the open-source Backmap code. 26 For the special cases of the rigid water models (e.g., MD simulations integrated with the SHAKE algorithm 24 ), the AA trajectories can be directly compared to the reverse-mapped trajectories using structural observables such as radial distribution functions (RDFs). All of the details related to the generation and equilibration of the simulated systems as well as the protocols adopted for the evaluation of the physical observables are reported in Supporting Information Note 2. Representative samples of the LAMMPS codes employed for the simulations are available in the Zenodo archive at https://doi.org/10.5281/zenodo.5552351.

■ RESULTS AND DISCUSSION
We test the validity of the MOLC representation for three widely used all-atom water models, namely, SPC-E, 27 Tip3P-Ew, 28 and Tip4P-05, 29 by comparing the computed selfdiffusion coefficient, dynamic viscosity, surface free energy, and enthalpy of vaporization 30−37 between the AA and corresponding CG models at T = 298 K and p = 1 atm (see Supporting Information Note 2 for methodological details). For each AA and CG model, samples made of 500, 1000, and 5000 water molecules were studied to evaluate possible size dependence on the calculated physical observables. 38 For the sake of simplicity, from this point on, we will refer to the Tip3P-Ew and Tip4P-05 models as Tip3P and Tip4P, respectively. A side-by-side comparison of AA and CG samples is shown in Figure 1c,d.
The computed density of SPC-E and Tip4P CG models is 0.993 ± 0.004 g/cm 3 , while that of the Tip3P CG model is 0.995 ± 0.004 g/cm 3 , in excellent agreement with the corresponding AA values (0.994 ± 0.003 g/cm 3 for SPC-E and Tip4P and 0.992 ± 0.004 g/cm 3 for Tip3P). Furthermore, the 1-to-1 correspondence between the CG and AA representations allows us to reintroduce the atomic detail into the CG trajectories without loss of structural information (and vice-versa): the resulting reverse-mapped trajectory was then used to compute the oxygen−oxygen (O−O), oxygen−   simulations. The first and second peaks in the O−O distance are consistently broader in the CG samples, resulting also in a smaller depth of the first well, which is more pronounced for the TIP4P CG model. We attribute this difference to the shifted reference of the axis of inertia, which is centered in the center-of-mass in the AA model ( Figure 1a) while in the center of the bead in the CG one (Figure 1b). This leads to slightly different dynamics between the AA and CG models. Such a difference is magnified in the TIP4P model since it includes an additional interaction center producing a torque on each molecule.
The self-diffusion coefficient (D), dynamic viscosity (μ), and surface free energy (γ) of both the AA and CG water models evaluated with a timestep of 1 fs are summarized in Supporting Information Table S1. The results of AA simulations are in good agreement with reference modeling and experimental values found in the literature, 10,30,31,34,36 without showing a system size dependence within the computed error bars (see Supporting Information Note 3 for details). Nevertheless, we observe a progressive reduction of the error bar of D with larger samples because of the improved statistics. The selfdiffusion coefficient and dynamic viscosity derived from CG simulations are also plotted against the AA reference, as shown in Figure 4a,b. Results show that the self-diffusion coefficient computed from CG simulations is slightly lower than that from AA ones. Consistently, μ of CG samples is higher than that of AA ones. This evidence agrees with the Stokes−Einstein where k B is the Boltzmann constant and r is the radius of water molecules, strictly valid for spherical particles), thus proving the self-consistency of the proposed model (see Figure 4c). The transient values of μ and D are reported in Supporting Information Figures S2 and S3, highlighting the substantial convergence of simulations. Furthermore, all obtained results remain in good agreement with the experimental values. 30,39 Finally, in Figure 4d, we compare γ computed with the AA and CG models. The results show an average reduction with respect to the AA reference. However, in the specific case of SPC-E and Tip4P CG models, the calculated γ still represents a good approximation of the experimental value (72.0 mJ/m 2 ), 40 making the CG water models perfectly suitable for describing multiphase systems and interfacial phenomena. The observed differences between the AA and CG results can be explained by the fact that the moment of inertia of the CG model, based on a finite-size sphere, is about 10 times larger than that of the AA model.
The self-diffusion coefficient, dynamic viscosity, and surface free energy computed from the CG-MD simulations integrated with a longer timestep of 10 fs are summarized in Figure 5 and fully reported in Supporting Information Table S2. Also, in this case, the evaluated physical observables are not sensibly dependent on the system size. The viscosity and self-diffusion coefficients obtained with a longer timestep (Figure 5a,b) are consistent with those obtained with a timestep equal to 1 fs. The self-diffusion coefficients D* computed via the Stokes− Einstein relation from the μ values are also consistent with those obtained with the mean square displacement ( Figure  5c). No significant difference is observed for the computed value of the surface free energy (Figure 5d). These results show that no substantial degradation of the computed properties of CG models is observed using a longer timestep of 10 fs. Furthermore, for the specific case of SPC-E and Tip4P CG models, we recall the good accuracy with respect to the experimental values.
The computed enthalpy of vaporization for samples of 5000 molecules is reported in Supporting Information Table S3, showing a substantial agreement between the AA and CG models. A comprehensive assessment of other thermodynamic properties, such as the low-temperature density anomaly or the melting temperature, 19,41,42 will be the subject of future investigations.
Considering the Tip3P water as a representative case study for assessing the computational performance of the MOLC model, Figure 6a reports the wall time ratio (wall time ratio = CG elapsed computational time/AA elapsed computational time) between CG and AA simulations with the same trajectory length. At a given timestep (1 fs), the computational cost of the CG models is approximately the same as the AA ones, the wall time ratio between CG and AA being in the range of 0.8−1.8. Indeed, despite reducing the number of particles from 3 to 1, the electrostatic interactions are identical in the AA and CG force fields and so is the computational cost. On the other hand, the possibility of using a timestep of 10 fs leads to a speed-up factor (speed-up factor = CG performance/AA performance, the performance being evaluated as nanoseconds of trajectory computed per day) of up to 6× (see Figure 6b).
The use of a longer timestep has the obvious advantage of significantly speeding up the mesoscopic model with respect to The Journal of Physical Chemistry B pubs.acs.org/JPCB Article the original atomistic one. As a matter of fact, the timestep of AA water simulations typically does not exceed 2 fs, even for accelerated MD simulations of biomolecules where the scale of microseconds is routinely achieved. 43,44 Here, we investigate the longest timestep still able to guarantee stable CG simulations and realistic water properties. For this, we compute the viscosity of the Tip3P water model using timesteps of 1, 10, 15, 20, 40, and 80 fs. As the timestep is increased, numerical instabilities arise from the accumulation of integration errors. To mitigate the inaccuracies in sampling the higher frequency motions, the mass of CG water molecules is progressively increased as to have a larger moment of inertia and slower dynamics (see Supporting Information Figure S4), an expedient already adopted in other works. 45 The higher mass of CG water molecules leads to simulation stability of up to 80 fs timestep, but results show a steady increase of dynamic viscosity with mass (see Supporting Information Figure S5) in agreement with literature values. 46 The scaled density and RDFs, instead, are in excellent agreement with the reference values even with the largest timestep tested. A core feature of the MOLC model is to represent the intermolecular interactions via a Gay−Berne potential and point charges, which can be seamlessly mixed with standard AA force fields based on Lennard-Jones potentials. Using this framework, water models based on four or more point charges can be readily transformed into a CG model without any reparameterization, and AA force fields can be mixed with CG models in the same simulation. Another direction for the finetuning of CG water models involves replacing the finite-size isotropic spheres with finite-size anisotropic ellipsoids to reproduce the anisotropic inertia tensor of the AA water model. Ultra-coarse-grained models, 47−49 i.e., where a single particle represents a cluster of water molecules, are also natively supported by the MOLC model and easily implementable by placing many electrostatic sites inside a large spheroidal particle, whose position and strength should be optimized to simultaneously reproduce the packing of pure water and its enthalpy of vaporization. The implementation of new water models is beyond the scope of this article but it is mentioned to encourage the translation of existing force fields in the MOLC representation. Similarly, from an application point of view, further studies will be necessary to validate the proposed model in multiphase systems (e.g., for the description of wettability or adsorption phenomena), 50−52 nanoconfined geometries, 53,54 or heat-transfer processes. 55,56

■ CONCLUSIONS
In this work, we have proposed, tested, and validated a new coarse description of classic water models based on the MOLC force field. We chose three popular all-atom models (SPC-E, Tip3P-Ew, and Tip4P-05) and found that their corresponding coarse-grained representations show accurate structural and dynamical properties: density, radial distribution functions, self-diffusion coefficient, dynamic viscosity, surface free energy, and enthalpy of vaporization. We observed a reduction of the CG self-diffusion coefficient, which is matched by an increase in the dynamic viscosity, consistent with the Stokes−Einstein relation. The computed surface free energy is approximately 14% smaller for all of the CG models. However, the computed surface free energy is still reasonably close to the experimental value, confirming that the CG models of water are good enough for describing interface problems such as material wettability or adsorption phenomena. A speed-up factor between 3 and 6 times is obtained with respect to the AA model, entirely being due to the increase in the integration timestep unlocked by coarse graining.
Derivation of forces and torques in the Molc model (Note 1), methodological detail (Note 2), detail for the evaluation of error bars (Note 3), Tables S1−S3, and Figures