Automated Coarse-Grained Mapping Algorithm for the Martini Force Field and Benchmarks for Membrane–Water Partitioning

With a view to high-throughput simulations, we present an automated system for mapping and parameterizing organic molecules for use with the coarse-grained Martini force field. The method scales to larger molecules and a broader chemical space than existing schemes. The core of the mapping process is a graph-based analysis of the molecule’s bonding network, which has the advantages of being fast, general, and preserving symmetry. The parameterization process pays special attention to coarse-grained beads in aromatic rings. It also includes a method for building efficient and stable frameworks of constraints for molecules with structural rigidity. The performance of the method is tested on a diverse set of 87 neutral organic molecules and the ability of the resulting models to capture octanol–water and membrane–water partition coefficients. In the latter case, we introduce an adaptive method for extracting partition coefficients from free-energy profiles to take into account the interfacial region of the membrane. We also use the models to probe the response of membrane–water partitioning to the cholesterol content of the membrane.

: Schematic of the geometric constructions used to determine weights for a virtual site with n = 4 neighbours.

Derivation of virtual site equations
The equations which are solved to calculate the weights for virtual sites are given in Section 2.3.2. The derivation for these equations in the case where the virtual site has n = 4 neighbours is given here. The endpoints of the lines which cross the box, s ij , (see Figure 1) can be written in terms of the horizontal and vertical components of the site weights, β h and β v , and the positions of the real sites, r i . r 1 is set to the origin for simplicity.
The position of the virtual site, v, is: Substituting t = r 4 − r 3 − r 2 , and separating into x and y components, we get two equations: Making β v the subject of Eq. (3a) gives and substituting into Eq. (3b) gives The substitutions in Eq. (6) of the main article yield the quadratic Eq. (5) for β h in the main article. Eq. (4) above then provides the value of β v .

Procedure for generating ring geometries
This section details the algorithm for generating the constraint and dihedral network described in Section 2.3.3. In the following pseudocode, the ring of real sites is indexed in clockwise order from 0 to n. For a set of beads ijkl, the algorithm generates constraints jk and dihedrals ijkl within one loop. The algorithm stops once the total number of constraints, C, is reached. The indices j and k are iterated in opposite direction round the ring, starting from n and 1 respectively.
C=n-3 j=n; k=1 while constraints to add: i=(j+1) mod (n+1); l=k+1 add constraint jk add dihedral ijkl if C constraints added: stop k=k+1 i=k-1; l=j-1 add constraint jk add dihedral ijkl j=j-1   Fig. 2(d). At this size, the edges of dihedrals begin to overlap, and the automatically generated model becomes unstable during MD integration. However, it is rare to encounter this situation; most ring systems with up to four connected rings, and some larger, are represented by six or fewer real sites. In the case of even larger ring systems, the automatically generated mapping can act as a starting point for manual adjustment of the constraint and dihedral network, or reassignment of additional real sites to virtual sites. Figure 3 shows the convergence of bonded parameters for dodecane as the number of conformers generated, n, increases. Dodecane is modelled by three identical coarse-grained beads, so the two equilibrium bond lengths should be identical. As expected, very low numbers of conformers lead to inconsistent results. However, the average bond lengths and angles converge relatively quickly. When n reaches 200 there is a high level of confidence that the value calculated is close to the true average for the conformer generation method used.

Effect of bonded parameters
There are small differences between the two even at high n, but again, when n reaches 200 these differences are negligible.
The bonded parameters, particularly the bond lengths, changed significantly depending on whether the atomistic conformers from ETKDG were minimised. It should be noted that ETKDG conformers do not strictly require minimisation, as the method aims to directly predict crystal structures. 1 However, we investigated the influence of different conformer generation schemes by testing both dodecane models for density and partitioning. We compared these results to a model where the bonded parameters were generated using a more standard bottom-up procedure, in which a single-molecule atomistic simulation was run properties are of interest, therefore, the method used to parametrise bonded interactions is less important, and fine-tuning may not be necessary.