A Surface Site Interaction Point Method for Dissipative Particle Dynamics Parametrization: Application to Alkyl Ethoxylate Surfactant Self-Assembly

Dissipative particle dynamics (DPD) is a coarse-grained approach to the simulation of large supramolecular systems, but one limitation has been that the parameters required to describe the noncovalent interactions between beads are not readily accessible. A first-principles computational method has been developed so that bead interaction parameters can be calculated directly from ab initio gas-phase molecular electrostatic potential surfaces of the molecular fragments that represent the beads. A footprinting algorithm converts the molecular electrostatic potential surfaces into a discrete set of surface site interaction points (SSIPs), and these SSIPs are used in the SSIMPLE (surface site interaction model for the properties of liquids at equilibrium) algorithm to calculate the free energies of transfer of one bead into a solution of any other bead. The bead transfer free energies are then converted into the required DPD interaction parameters for all pairwise combinations of different beads. The reliability of the parameters was demonstrated using DPD simulations of a range of alkyl ethoxylate surfactants. The simulations reproduce the experimentally determined values of the critical micelle concentration and mean aggregation number well for all 22 surfactants studied.


■ INTRODUCTION
The formation of supramolecular structures such as micelles, vesicles, and bilayer membranes is a fundamentally important process in biology and in the industry with many applications in health and personal care products. 1 The self-assembly of surfactants is a complicated process, and despite the development of simple tools that can be used to predict some aspects of surfactant behavior based on chemical structure (e.g., the critical packing parameter 2 and the hydrophilic lipophilic balance (HLB) 3 ), the development of new surfactant systems still relies on experimental screening. The key parameters are the critical micelle concentration (CMC), which is the concentration at which surfactants start to aggregate into supramolecular structures (micelles), and the mean aggregation number (N agg ), which is the average number of surfactants in a micelle. Common techniques used to measure the CMC are surface tension, 4 dynamic light scattering (DLS), 5 fluorescence, 6 UV−vis, 7 and NMR 8 spectroscopy, and in the case of charged species, conductometry 9 and capillary electrophoresis. 10 Methods for determining the N agg include DLS, 11 small-angle neutron diffraction, 12 and time-resolved fluorescence quenching. 13 Many factors affect these measurements: for example, temperature, the presence of electrolytes, and organic impurities in the solution. 14,15 The method of analysis also plays a role because different methods are sensitive to different aspects of the self-assembled supramolecular structure. 16 As a result, experimental screening of new surfactant formulations is time-consuming and expensive, and in silico prediction of a surfactant self-assembly processes would be an attractive alternative. One computational approach is all-atom molecular dynamics (MD), but due to the long equilibration times and the large numbers of molecules required, coarse graining (CG) approaches are the method of choice. 17 Coarse graining combines several atoms or molecules into single CG beads, reducing the number of force centers and the associated computational cost.
Various CG methods have been used for the prediction of surfactant behavior in solution, including Monte Carlo (MC) 18 and lattice Boltzmann methods. 19 Although recent progress in coarse-grained molecular dynamics (CG-MD), using force fields such as MARTINI shows promise; 20−22 the size of the simulated systems using Lennard-Jones potentials with hard core repulsions is still limited in length and time scale. One of the possibilities for further increasing the computational efficiency is the use of methods that consider only soft core interactions, such as dissipative particle dynamics (DPD). Originally developed by Hoogerbrugge and Koelman 23 and extended by Espanõl and Warren, 24 DPD uses soft core beads, which move according to Newton's equations of motion.
The total force acting on a DPD bead is the sum of a conservative, a drag, and a random force. The drag and random forces are used solely for thermostatting (NVT ensemble). 24 The conservative force derives from a short-range soft pair potential of the form where a ij is the repulsion amplitude describing the interaction between beads i and j, r ij is the distance between the two beads, and R ij is the cutoff distance beyond, which there is no interaction between the beads. Since the conservative force accounts for the difference in noncovalent interactions between different beads, determination of the parameters in eq 1 is critical for accurately reproducing the relationship between the chemical structure and surfactant behavior.
Much effort has gone into the development of systematic parametrization methods to obtain the DPD repulsion amplitudes a ij . An early approach by Groot and Warren links a ij with Flory−Huggins χ parameters. 25 This correlation has been used to obtain a ij from experimental solubility data, 26 Hildebrand cohesive energy density parameters, 27,28 and infinite dilution activity coefficients. 29 Alternatively, a ij can be obtained by using experimental data on the mutual solubility of small molecules. For example, by using the CG approach for the alkyl ethoxylate surfactants shown in Figure 1, the interaction between the terminal alcohol bead and a bead in the ethylene glycol chain could be obtained using the mutual solubilities of methanol and methoxymethane. Computational approaches have also been investigated as an alternative to empirical parameterization. 30−32 A combination of MC and MD simulations was used to derive relationships between Flory−Huggins χ parameters and a ij values for interactions between different sized beads. 33 COSMO-RS was used to calculate infinity dilution activity and 1-octanol/water partition coefficients to obtain DPD repulsion parameters. 34−36 However, estimation of repulsion amplitudes based on experimental or calculated molecular properties has some limitations. Since DPD beads typically represent a fragment of a molecule, in reality, a portion of the molecular surface is buried by the overlap between covalently connected beads; this area should not be included when computing the bead interactions. Computational approaches provide an opportunity to remove the regions of the bead overlap from a molecular surface. Saathoff used COSMO-SAC to delete parts of the molecular surface in the calculation of solvation free energies. 31 An alternative approach to avoid this problem was adopted by Anderson et al. who used DPD simulations of the partition coefficients of complete molecules to optimize the set of a ij values, fitting to the experimental data. 37 Here, we propose a new approach to the calculation of DPD bead repulsion amplitudes a ij using surface site interaction points (SSIPs). 38 We develop the method in the context of CG models of alkyl ethoxylate surfactants shown in Figure 1. In this approach, different beads are used to represent chemical subgroups containing between one and three heavy atoms, as in Anderson et al. This provides flexibility and allows straightforward extension to more complicated systems. We show by simulation that the a ij values thus obtained accurately reproduce the experimental CMC and N agg values for this class of nonionic surfactants.

■ APPROACH
Surface site interaction points (SSIPs) provide a quantitative description of all intermolecular interactions that a molecule can make with its environment. Molecules are represented as discrete sets of SSIPs of a surface area of 10 Å 2 and a volume of 5 Å 3 , as illustrated in Figure 2. The number and properties of the SSIPs required to represent a specific molecule are calculated using the ab initio molecular electrostatic potential surface and a footprinting algorithm. 39 SSIPs can been used in the surface site interaction model for the properties of liquids at equilibrium (SSIMPLE) algorithm described in ref 38 to calculate solvation free energies and partition coefficients, which as we have outlined above, can be used for estimating the a ij parameters required for DPD.
In SSIMPLE, the solvent and solute molecules are each described as an ensemble of SSIPs. The equilibrium constant for the pairwise interaction of any two SSIPs is given by where ϵ i and ϵ j describe the interaction properties of two SSIPs and E vdW is a constant that was estimated to be −5.6 kJ mol −1 based on the experimental data on the enthalpy change for the The values of K ij can be used to determine the speciation of all possible SSIP contacts in a liquid phase. For a solute SSIP x dissolved in a solvent S1, solving the set of simultaneous equations allows the calculation of where [x bound ] is the concentration of SSIP x that is bound to a solvent SSIP, and [x free ] is the concentration of SSIP x that is free. This represents the overall equilibrium constant for the interaction of SSIP x with the solvent. The SSIP description of noncovalent interactions was parameterized using equilibrium constants for H-bonded complexes, which were measured at room temperature, and the approach has not yet been generalized to different temperatures, so this paper will focus on the room temperature behavior of surfactants.
Equating the concentrations of free SSIPs in two different phases allows calculation of the change in free energy (ΔG 12 ) for moving an SSIP from solvents 1 to 2 according to θ θ θ θ where θ 1 and θ 2 are the fractional occupancies of the two phases equal to the total SSIP concentration relative to the maximum possible SSIP concentration (300 M), R is the gas constant, and T is the absolute temperature. Equation 4 represents ΔG 12 as the sum of a binding free energy, which accounts for the interactions that the SSIP makes with the solvent SSIPs (first term) and a confinement free energy (second term). The confinement energy term is obtained by using an equilibrium constant of unity for all pairwise interactions in a phase and corrects for the difference in the probability of interaction associated with constraining the SSIPs to phases with different overall SSIP concentrations. For a solute molecule that is represented by multiple SSIPs, the free energy of transfer between two phases is calculated by summing the values of ΔG 12 over all SSIPs. Similarly, it is possible to represent a DPD bead as a set of SSIPs and to calculate the free energies of transfer of beads between different liquid phases as the sum of the free energies of transfer of the individual SSIPs that represent the bead. These free energies of transfer are then used to obtain DPD repulsion amplitudes a ij as follows.
First, the Flory−Huggins χ parameter is calculated from the bead transfer free energy using eq 5, which corrects for differences in the volumes of different beads. 41 where v r , v 1 , and v 2 are the van der Waals volumes calculated using the 0.002 electron/bohr 3 electron density isosurface for two molecules of water, and the fragments that represent beads 1 and 2, respectively, and ΔG 12 is the calculated change in free energy of transfer of 1 mol of bead 1 from a pure liquid composed of bead 1 to a dilute solution in a liquid composed of bead 2. Second, the linear correlation proposed by Groot and Warren is used to obtain the actual DPD repulsion parameters using eqs 6 and 7 25 χ where c p = 0.291 is a constant appropriate to the overall DPD bead density ρr c 3 = 3 used in this work. 25 jj ij (7) where a ii and a jj are the self-interaction parameter for beads i and j, respectively. One of the limitations of this method is that the interaction between beads of the same type, a ii , cannot be calculated. One approach is to use the same self-interaction parameter for all beads, and the water a ii parameter derived from compressibility data is commonly used (25 k B T). 27 However, this approximation assumes that all beads have the same volume, 25 so we use self-interaction parameters reported by Anderson et al., which were tuned to match the experimental densities of selected molecular liquids. 37 ■ RESULTS AND DISCUSSION DPD Parameters. The SSIP description of the four beads required for DPD simulations of the surfactants in Figure 1 is shown in Figure 3. SSIPs were calculated using methoxymethane for EO, ethane for C2, methane for T, and ethanol for OH. To convert these molecular descriptions to bead descriptions, the SSIPs associated with the hydrogen atoms located at the points of covalent connectivity between beads were removed, i.e., the points indicated by dotted lines in Figure 3. The SSIP description of water has been reported previously, 43 and the SSIP values used for all of the beads are summarized in Table 1. These values were used in SSIMPLE to calculate free energies of transfer for all pairwise combinations of beads (Table 1). A concentration of 1 mM was used for the solute beads to make sure that there are no The Journal of Physical Chemistry B pubs.acs.org/JPCB Article solute−solute interactions, so the results are equivalent to the infinite dilution values required for eq 5. The concentrations of the solvent beads were estimated based on structurally related liquids: the concentration of methanol was used as the bead concentration for a liquid composed of OH beads, half of the concentration of dimethoxyethane was used as the bead concentration for a liquid composed of EO beads, one-quarter of the concentration of n-octane was used as the bead concentration for a liquid composed of C2 beads, and oneeighth of the concentration of n-octane was used as the bead concentration for a liquid composed of T beads. The transfer free energies in Table 1 are used in eqs 5 to 7 to obtain the repulsion parameters listed in Table 2. For beads of different sizes, a volume correction is used in eq 5. The van der Waals volume of the terminal beads T and OH was calculated by using half the calculated volume of ethane and 1,2ethandiol, respectively. For the inner beads C2 and EO, the van der Waals volumes were obtained by calculating the volumes of homologous series of alkanes or ethylene glycols and taking the slope of a plot of volume versus the number of bead repeats in the chain (see the Supporting Information). The van der Waals volumes and a ii values used in eqs 5 to 7 are reported in Tables 1 and 2. The cross-interaction parameters Δa ij in Table 2 show that the strongest repulsion, about 20 k B T, occurs between water and the two hydrocarbon beads, C2 and T, as expected. The values compare well with those reported in the literature. 25,28,37,44−46 The values of Δa ij for the interactions of the terminal alcohol bead with the hydrocarbon beads, OH− C2 and OH−T, are approximately 9 k B T, which represents a significant repulsion. The OH bead is well solvated by itself due to H-bonding interactions between the hydroxyl groups, and these interactions are lost when this bead is transferred to a hydrocarbon solvent. The values are similar to those reported in the literature: values of Δa ij of 6 and 7 k B T were used previously for interactions between OH−C2 and OH−T. 37 The values of Δa ij for the interactions of the ethylene glycol bead with the hydrocarbon beads, EO−C2 and EO−T, are approximately zero. The major difference between the EO bead and the hydrocarbon beads is the presence of two Hbond acceptor sites on the EO oxygen (Table 1). However, none of these beads have H-bond donor sites, so there are no significant differences in the interactions that the beads make with each other. These EO-hydrocarbon cross-interaction parameters in Table 1 differ from those reported in the literature. Because of the limited experimental data on the mutual solubilities of alkanes and oligoethylene glycols, Δa ij was originally estimated to be intermediate in the value between the hydrocarbon−hydrocarbon and hydrocarbon− water parameters, and a value of 6.5 k B T was used in the literature. 26 Attempts to calculate this parameter using COSMO-RS were inconclusive due to a strong dependence on conformation. 34 Anderson et al. used experimental water− octanol partition coefficients to obtain a value of 3.1 k B T for the EO-hydrocarbon cross-interaction parameter, which is closer to the values in Table 2. 37 The values of Δa ij for the interactions of the terminal alcohol and ethylene oxide beads with water, W−OH and W−EO, are both negative. The miscibility of water with alcohols and with oligoethylene oxides makes it impossible to determine the values of Δa ij from experimental solubility measurements. Extrapolation from high-temperature measurements on polyethylene oxide−water mixtures gave a value of Δa ij of 1.0−1.5 k B T for the W−EO interaction. Anderson et al. used experimental logP measurements to obtain a value of −1.25 k B T for the W−EO interaction, which is consistent with the negative value that we calculated. The origin of the negative values of Δa ij in the SSIMPLE calculation is due to the fact that both alcohols and ethylene glycols are H-bond acceptors, and so, the H-bonds formed with the water H-bond donors are favorable in the mixtures.
DPD Simulations. To test the parametrization, DPD simulations were performed on linear, nonionic surfactants belonging to the alkyl ethoxylate family shown in Figure 1;   The Journal of Physical Chemistry B pubs.acs.org/JPCB Article these are generically denoted as C m E n where m is the number of carbon atoms in the hydrocarbon chain and n is the number of oxygens in the ethylene glycol chain. In the literature, these compounds are generally described using two types of beads, 26,29,45,47 one for the hydrophobic tail and the other one for the hydrophilic head group, but here, we use the more detailed CG description shown in Figure 1. 37 The solvent beads used in the simulations were each composed of two molecules of water. The value of R ij for the interaction of two water beads was set equal to r c , the length unit in DPD, which is in turn set using the mapping number identity proposed by Groot and Rabone: 26 ρN m v m = 1 where ρ is the bead density, N m (the "mapping number") is the number of water molecules in a water bead, and v m = 30 Å 3 is the molecular volume of water. With N m = 2 and dimensionless bead density ρr c 3 = 3, this means that r c = 5.64 Å. The correlated cutoff distances for all other bead−bead interactions (R ij ) were taken from Anderson et al. 37,42 In our CG representation of these surfactants (Figure 1), the DPD molecules are linear chains with a stiff harmonic spring potential between connected pairs of beads. We set k b = 150 k B T and choose r 0 to match (approximately) the physical bond lengths, as described below. To provide additional chain rigidity, we supplement this two-body spring potential with a three-body angular potential where k a = 5 k B T and θ 0 = 180°. For two covalently bonded C2 beads, the equilibrium distance r 0 was set at 0.39r c , which gave an average distance during the simulation of 0.45r c , equivalent to a bond length of 2.55 Å. 37 The other equilibrium distances (r 0 ) between covalently bonded beads were modified by 0.1r c for each heavy atom added or deleted relative to C2, which gives 0.29r c for C2−T, 0.49r c for C2−EO, 0.59r c for EO−EO, and 0.49r c for EO−OH, as in Anderson et al. 37,42 All the simulations were performed in a cubic box of side 40r c , containing a total of 192,000 beads. Box sizes from 20r c to 40r c were investigated, but no effect on the value of the calculated CMC was observed (see the Supporting Information). Simulations were run for 2 × 10 6 to 4 × 10 6 timesteps with a timestep of 0.01 in DPD time units. In the literature, the DPD time scale in this kind of CG representation has been estimated using the diffusion of small molecules 35 so that one DPD timestep corresponds approximately to 0.5 ps, and our simulation runs correspond roughly to 1−2 μs. Simulations were performed using the DL_MESO DPD package (version 2.7) 48 and analyzed using a combination of the UMMAP analysis tool 49 and purpose-written scripts. Trajectory files were collected every 500 timesteps, and the NVT ensemble was ensured by a commonly used DPD thermostat based on the standard velocity Verlet integration. 50 DPD simulations were run at 4, 5, and 6 wt % in water for all the C m E n surfactants shown in Figure 1. Figure 4 shows examples of the results. Two molecules were considered part of the same supramolecular cluster, if they were closer than the cutoff distance R ij , and this criterion was used to calculate the number of "free" surfactants as monomers or in submicellar aggregates for each timestep in a simulation. Figure 4a shows the free surfactant concentration for four different surfactants plotted against the number of timesteps. In all cases, the equilibrium was established between 1.0 × 10 5 and 2.5 × 10 5 timesteps. The cutoff distance criterion was also used to calculate the total number of molecules present within each supramolecular aggregate in each timestep of the simulation. The distribution of supramolecular assemblies was calculated for all timesteps greater than 5 × 10 5 , and the populations were summed to obtain the aggregation number distribution (P(N)) for each simulation. Figure 4b shows an example of an aggregation number distribution for simulation of C 10 E 6 . There are clearly two populations. There are a large number of monomers that appear close to the origin, and then, there is a clear gap in the trimer to 10-mer region before a bell-shaped distribution appears with a maximum population of around 20. The bell-shaped distribution corresponds to micellar aggregates of various sizes. The results from single simulations can be rather noisy because the molecules tend to become kinetically trapped in the micellar assemblies. Figure 4b shows that the values of P(N) obtained from a simulation of C 10 E 6 are not a simple function of N. It is possible to obtain a smoother aggregation number distribution by combining multiple simulations, and Figure 4c illustrates the result for C 10 E 6 . Critical Micelle Concentrations. The aggregation number distributions were used to calculate CMC values. The minimum number of molecules required for a cluster to be considered a micelle must first be assigned. A variable N cut is introduced such that clusters with aggregation numbers N > N cut are identified as micelles, and those with N < N cut are identified as monomers and submicellar aggregates; the totality of the latter were used to define the free surfactant concentration. For the example shown in Figure 4b, there is a clear gap between the two populations, so choosing a value of N cut anywhere in the 5−10 range will give the same result. In some simulations, there was an overlap between the submicellar and micellar populations, but in these cases, it was still possible to identify a sparsely populated intermediate region where the precise value of N cut had minimal effect on the calculated CMC values (see the Supporting Information). The values of CMC obtained from different simulations of the same surfactant were similar, leading to relatively well-defined values with small error bars. The surfactant concentration used in a DPD simulation can affect the calculated CMC value, 51 but even for simulations that run over a wider concentration range (1−15 wt % of surfactant; see the Supporting Information), no significant variations in CMC were found. 52 CMC values were calculated for all 22 surfactants by averaging the results from the three different concentrations used in the simulations, 4, 5, and 6 wt %. The results are compared with the corresponding experimentally determined values in Figure 5. The error bars in the calculated CMC values are larger for the C12 surfactants due to the small numbers of free surfactants in individual simulations, but in all cases, there is excellent agreement between calculation and experiment. The simulations clearly reproduce the trend that CMC is independent of n, the length of the ethylene glycol chain, but highly dependent on m, the length of the hydrocarbon chain. The results are also quantitatively accurate reproducing the order of magnitude drop in CMC for every two CH 2 groups added to the hydrocarbon chain (Stauff−Klevens rule).
Mean Aggregation Numbers. For each timestep in a DPD simulation, the mean aggregation number (N agg ) can be calculated using Some authors use different N cut numbers to determine the values of N agg and the CMC from the same P(N) distribution, but here, we used the same N cut for calculation of both values (Table 3). 29,44,53 Consistent values of N agg were obtained for simulations at different surfactant concentrations between 2 and 10 wt %, and the results in Table 3 are quoted as the range of N agg values obtained from individual simulations. All simulations performed in this work gave spherical micelles with no significant population of other morphologies in agreement with simulations 29,37 and experimental reports in the literature. 54−59 For C 12 E 3 , C 12 E 4 , C 10 E 3 , and C 10 E 4 , shortlived interactions of two spherical micelles lead to transient elongated aggregates. 37,60 Figure 6a−d,f−i shows snapshots illustrating the evolution of the micelle structure with time for C 8 E 4 and C 10 E 6 . The time scale for equilibration of the N agg shown in Figure 6e,j is much longer than that for the equilibration of the monomer concentration shown in Figure 6a. The reason is that the equilibration of micelle size must take place via exchange of molecules between micelles via monomers in solution, and both the number of micelles and the dissociation rate from a micelle are low. Figure 6e,j illustrates the effect of surfactant concentration on N agg equilibration time. For C 8 E 4 , all of the simulations converge to the same value of N agg (about 50) after 3 × 10 6 timesteps, but the more dilute solutions equilibrate more slowly because there are fewer micelles. For C 10 E 6 , equilibration is significantly slower, and after 4 × 10 6 timesteps, the values of N agg for different concentrations have not converged. This behavior is illustrated in Table 3 where N agg values are reported after 2 × 10 6 timesteps for all simulations and after 4 × 10 6 timesteps for selected examples. The N agg values for C 6 E n surfactants show no substantial variation between 2 × 10 6 and 4 × 10 6 timesteps, confirming that the equilibration occurs rapidly. For C 8 E n , the N agg values increase slightly between 2 × 10 6 and 4 × 10 6 steps, but as shown in Figure 6e, equilibrium is reached by 4 × 10 6 timesteps  The Journal of Physical Chemistry B pubs.acs.org/JPCB Article simulations. For C 10 E n and C 12 E n , there is a significant difference between the values of N agg obtained at 2 × 10 6 and 4 × 10 6 timesteps. Figure 6j shows that the slow equilibration of the more hydrophobic surfactants means that the calculated values of N agg are likely to be underestimated. 45 We note that, in principle, the mean aggregation number is an increasing function of concentration. 61 This could account for the behavior seen here. Additionally, there are slow processes on millisecond time scales, which are likely to be inaccessible in our microsecond simulations. 62 Thus, the number of micelles may be poorly equilibrated even if the free surfactant concentration has reached a steady state. This obviously impacts the calculation of the mean aggregation number. Table 3 compares the calculated values of N agg with the corresponding experimentally determined values reported by Swope et al. 63 Experimental measurement of N agg is not straightforward, and the application of different techniques to very broad distributions of micelle size can lead to discrepancies of an order of magnitude in reported values (see C 10 E 5 and C 12 E 5 in Table 3). However, the values of N agg calculated for the fully equilibrated C 6 E n and C 8 E n surfactants agree well with the experimental ranges. For the surfactants with longer hydrocarbon chains, C 10 E n and C 12 E n , the simulations consistently underestimate the value of N agg .

■ CONCLUSIONS
A method for calculating the bead interaction parameters required for dissipative particle dynamics (DPD) simulations has been developed. The method is based on ab initio calculation of the gas-phase molecular electrostatic potential surfaces of the molecular fragments that represent the beads, so the approach should be generally applicable to the coarse graining of any molecular system using DPD. A footprinting algorithm was used to convert the molecular electrostatic potential surfaces into a discrete set of surface site interaction points (SSIPs), and these SSIPs were used in the SSIMPLE algorithm to calculate the free energies of transfer of one bead into a solution of any other bead. The bead transfer free energies were used to obtain the required DPD interaction parameters for all pairwise combinations of different beads. The reliability of this computational approach to determination of accurate DPD parameters was demonstrated using DPD simulations of a range of alkyl ethoxylate surfactants. The simulations reproduce the experimentally determined values of critical micelle concentration and aggregation number well for all 22 surfactants studied. The approach provides a powerful new tool for first-principles calculation of DPD parameters and for prediction of the surfactant properties of molecules for which experimental data is not available.  The Journal of Physical Chemistry B pubs.acs.org/JPCB Article