Optimization of Slipids Force Field Parameters Describing Headgroups of Phospholipids

The molecular mechanics force field Slipids developed in a series of works by Jämbeck and Lyubartsev (J. Phys. Chem. B2012, 116, 3164–3179; J. Chem. Theory Comput.2012, 8, 2938–2948) generally provides a good description of various lipid bilayer systems. However, it was also found that order parameters of C–H bonds in the glycerol moiety of the phosphatidylcholine headgroup deviate significantly from NMR results. In this work, the dihedral force field parameters have been reparameterized in order to improve the agreement with experiment. For this purpose, we have computed energies for a large amount of lipid headgroup conformations using density functional theory on the B3P86/cc-pvqz level and optimized dihedral angle parameters simultaneously to provide the best fit to the quantum chemical energies. The new parameter set was validated for three lipid bilayer systems against a number of experimental properties including order parameters, area per lipid, scattering form factors, bilayer thickness, area compressibility and lateral diffusion coefficients. In addition, the order parameter dependence on cholesterol content in the POPC bilayer was investigated. It is shown that the new force field significantly improves agreement with the experimental order parameters for the lipid headgroup while keeping good agreement with other experimentally measured properties.


■ INTRODUCTION
Lipids are key components of biological membranes which are crucial parts of all cells. 1 Due to their amphiphilic character, lipids in water assemble into bilayer structures with polar headgroups facing the aqueous phase and aliphatic tails forming a hydrophobic core in the bilayer center. Together with other biomolecules, including proteins and carbohydrates, they constitute a barrier between the cytosol and the extracellular space. Furthermore, biological membranes are responsible for compartmentalization within the cell separating functional units called organelles. 2 In addition, they play a crucial role in the transport of molecular cargo in and out of cells and organelles by the processes of endo-and exocytosis 3 and by accommodating transmembrane gates and pumps. 4 Although the composition of biological membranes found in nature is diverse 5 and varies between different types of cells, studies of one-component lipid bilayers can give valuable insight. In the biologically relevant L α phase, 6 the membrane structure is highly dynamic where lipids change conformation, rotate, diffuse in the lateral plane, and undergo collective motion. The time scales of these processes cover several orders of magnitude from the picosecond to the microsecond and millisecond range. 7 Although the lipid molecules that constitute the membrane reorient and wobble as they diffuse in the lateral plane, they possess orientational order with respect to the membrane normal. In addition, bonds within the lipid molecules have different degrees of orientatinal order. Bonds located in more or less rigid parts of the lipid possess a high degree of order, while others exist in flexible regions and are for this reason less ordered. The degree of orientational order is an essential feature of the bilayer structure that has important consequences for the properties of biological membranes.
Experimentally, NMR spectroscopy is the primary source of information for orientational order 8 and dynamics 9 in lipid bilayers, while small-angle X-ray (SAXS) and neutron (SANS) scattering 10,11 are considered the most accurate tools for determination of bilayer thickness and area per lipid. 12,13 Furthermore, molecular dynamics (MD) simulation is a powerful complement to experimental studies that can inform about the structure and dynamics of lipid bilayers in atomistic detail. The results obtained from MD are however limited by the quality of available force fields that describe atomic interactions. This includes popular force fields such as Berger modification of the united atom GROMOS force field, 14 CHARMM36, 15 and GAFFlipid. 16 The force field Slipids, developed in a series of works 17−19 by Jambeck and Lyubartsev, has used partial atom charges and parameters of lipid tails optimized against accurate quantum chemistry calculations, while torsion and Lennard-Joned parameters for the lipid headgroups were inherited from the CHARMM36 force field. It has been shown that the Slipids force field generally provides a good description of lipid bilayers in agreement with experiment for many structural and dynamical properties. 12 A benchmark study 20 comparing the membrane/water partition coefficients obtained using different force fields, including GAFFlipid, 16 Berger, 14 CHARMM36, 15 and Slipids, 17 showed that Slipids and CHARMM36 gave the closest agreement to experiment. However, it has also been found that the order parameters for C−H vectors in the glycerol moiety (g1, g2, and g3 in Figure 1) calculated with Slipids are in poor agreement with accurate NMR experiments. 21 This shows that these C−H bonds do not sample the correct orientational distributions which are sensitive to the dihedral terms in the force field. CHARMM36 dihedral parameters were tuned empirically 15 to provide good agreement with the order parameters from NMR. A problem with this approach is that the order parameter is an average over the orientational distribution function, and since different distributions can give the same average, this approach does not guarantee that the correct orientational distribution is sampled. Slipids has been developed using a different procedure where the force field parameters are first fitted to quantum chemistry data, without empirical input, and then validated against experimental data. A realistic description of the lipid bilayer structure is crucial in order to better understand scientific questions including the formation and role of lipid microdomains 22 (rafts), interaction between pharmaceuticals and biological membranes, 23 and how lipid−protein interactions affect protein function. 24 For this reason, a large amount of scientific work has been devoted to improving force fields used in simulations of lipid bilayers. 12 An overview of the development, validation, and comparison of force fields for lipids and lipid bilayers has been presented in a number of publications. 12,21,25,26 The ability of force fields to reproduce experimental properties, including order parameters, has been subject to extensive investigation within the NMRlipids project. 21,27−29 However, to the best of our knowledge, there is yet no parameter set available that is capable of reproducing order parameters of all C−H vectors in phosphatidylcholine lipids within experimental error bars. In this work, Slipids dihedral force field parameters have therefore been reparametrized against quantum chemical calculations with the purpose of achieving better agreement with experiment. The next section "Methods and Models" will describe the strategy that was undertaken for parameter optimization and how the obtained parameters were validated against experimental data. In the section "Results and Discussion", results are presented and the performance of the new parameter set is discussed. The last section "Conclusion" provides the concluding remarks.

■ METHODS AND MODELS
Parameterization Strategy. In the Slipids force field, the potential energy has the standard functional form shown in eq 1 where the first three terms are contributions from bonds, angles, and dihedral angles and the last two terms describe short-range repulsion, van der Waals attraction, and electrostatic interactions. In this work, we focus on the dihedral part and it is therefore convenient to express the force field energy as in eq 2 where E dihedral is the dihedral contribution and E other contains all of the other interactions.  In the force field, each atom in a molecule is mapped to an atom type representing a chemically distinct environment. A sequence of four atom types i, j, k, and l defines a dihedral interaction which is modeled by a sum of cosine terms each having a multiplicity, m, a dihedral force field parameter k m,ijkl , and a phase shift, δ m,ijkl ; see eq 1. For symmetry reasons, the phase term is restricted to 0 and 180°and can therefore be removed if the dihedral force field parameter is allowed to take both positive and negative values. For this reason, phase shifts are not fitted in the present work. An approach for fitting also the phase terms is described in ref 30. Previously, some dihedral interactions in the Slipids force field were modeled using cosine terms of multiplicity 4 and 6. In this work, we only include cosine functions with multiplicity m max ≤ 3.
The most popular approach for parametrizing dihedral angles has previously been to fit parameters to an energy profile obtained by carrying out incremental rotations about dihedral angles, one at a time, and evaluating the quantum chemical energy at each point. 31 Frequently, such calculations have been performed using small "model compounds" to reduce computational cost. 15 A serious drawback with such procedures is that dihedral angles are optimized one by one and are hence treated as independent. This work attempts to overcome this problem by taking a different approach where all dihedrals of a lipid headgroup are optimized simultaneously. For this purpose, a representative set of lipid conformations was extracted form a well-equilibrated MD trajectory of a DMPC lipid bilayer (303 K, 1.013 bar, 128 lipids, 30 TIP3P water molecules per lipid) described with the current version of the Slipids force field. In order to simplify computationally intense quantum chemistry calculations, the hydrocarbon tails of lipids were replaced by ethyl groups bonded to the carbonyl carbon. For each conformation, two energies were calculated: (1) the single point quantum chemical energy, E QC , computed at the DFT B3P86/cc-pvqz level of theory and (2) the force field energy with all dihedral force field parameters set to zero, E other . Justification for the choice of this level of quantum chemical theory is given in the section "Quantum Chemistry Test Calculations" in the Supporting Information. Extracting N conformations from the trajectory and computing the corresponding energies gives a set of reference energies, eq 3: The goal of the parametrization is to find the set of M dihedral force field paramters {k j } that best reproduce the reference energies. This was done by minimizing the objective function in eq 4 ∑ ∑ where the first term is the sum of squared deviations between reference energies, R i , and the corresponding force field energy, E dihedral,i ({k j }). A constant, c, is introduced, since only relative energies are fitted. Since some dihedral angles (or their combinations) can have little effect on the conformational energy, minimizing the sum of squared deviations between the reference energies and the model may lead to unphysically high parameters for some dihedrals. For this reason, the second term is included in eq 4, intended to reduce overfitting by adding a quadratic penalty to the objective function when parameters rise. A regularization parameter, λ, determines the strength of this penalty. The regularized optimization problem was solved using the LSMR method, 32 as implemented in the lsmr function from the SciPy package sparse.linalg. In order to investigate the effect of replacing the hydrocarbon tails with ethyl groups, we carried out additional optimization on a smaller PC headgroup fragment where the tails were replaced by propyl groups and the bond C g3 −O was replaced by a terminal C g3 −H. The results of these substitutions are further discussed in the text.
Validation of Fitted Parameters. In order to test the performance of the fitted parameters, MD simulations were carried out for fully hydrated bilayers consisting of three different lipids (DMPC, POPC, and DPPC); see Figure 1. Each system consisted of 128 lipids, arranged in a bilayer periodic in the XY-direction and hydrated by 30 water molecules per lipid for DMPC and DPPC and 40 waters per lipid for POPC. Furthermore, POPC bilayers were simulated with the addition of different amounts of cholesterol. In Figure 1, we have also introduced labels that will be used later on in the discussion of order parameters. DMPC and POPC systems were simulated at T = 303 K, while for DPPC the temperature was set to T = 323 K. For each system, properties including order parameters, area per lipid, area compressibility, lateral diffusion coefficients, and scattering form factors were calculated and compared with experimental data. The aim of the validation was twofold: (1) show that the new force field gives order parameters in better agreement with NMR experiments and (2) demonstrate that other properties described correctly by the unmodified force field are also reproduced by the new parameter set. All simulations were performed using Gromacs 2019, 33 and the trajectories were analyzed using the tranal utility from the MDynaMix package. 34 The computational details are presented in Table 1.

■ RESULTS AND DISCUSSION
Optimization of Torsion Angle Parameters. Several computations of the optimized force field parameters using different values of the penalty factor λ were carried out. The quality of fit was quantified by the Pearson coefficient, indicating the correlation between the force field energies and the reference energies, and by the sum of squared deviations (SSD) between same quantities. Furthermore, we divided the whole set of conformations into two parts of equal size, a training set and a The Journal of Physical Chemistry B pubs.acs.org/JPCB Article validation set. Parameters were fitted against the energies in the training set, and the sum of squared deviations (SSD) between force field energies and quantum chemical energies was then calculated, both for the training and the validation sets. Figure 2 shows how the Pearson coefficient and the ratio of the SSD for the validation set to that of the training set varies as function of λ.
If a small λ is used, the obtained parameters fit the quantum chemical energies better compared to parameters obtained using a larger λ. The ratio of the SSD between the validation set and training set quantifies overfitting and decreases upon the initial increase of λ. For the model to describe the real trends in the data, it is desirable to increase λ to reduce overfitting. However, increasing λ eventually deteriorates the quality of fit, which is reflected by a lower Pearson coefficient. In order to achieve a compromise between these trends, we selected λ = 2 and fitted parameters using this penalty factor for the whole data set. In addition, we have investigated how the number of lipid headgroup conformations, N, affects the resulting optimized parameters. We found that increase of N up to several hundreds of conformations significantly reduces the variations, although fluctuations remain up to 2000 conformations. Further details are given in the "Parameter Convergence" section of the Supporting Information. The final force field parameters were optimized using λ = 2 and N = 2000 and used throughout the work.
Fine-Tuning of Lennard-Jones Parameters. Initial test simulations showed that the fitted dihedral force field parameters, while improving values of the order parameters in lipid headgroups, have led to underestimation of the experimental area per lipid by roughly 0.01−0.02 nm 2 . Although this deviation is small, it falls outside the experimental error bars. We found that increasing the Lennard-Jones repulsion parameter, A = 4εσ 12 , of atoms in the PC headgroup improves the results. Increasing σ by 0.02 Å for atoms in the choline group (atom types CTL5 and HL) while keeping the dispersion coefficient, B = 4εσ 6 , constant (by a corresponding adjustment of ε) gave results in good agreement with experiment. Further details on fine-tuning of the Lennard-Jones parameters and their effect on the results is given in the section "Effect of Tuning Lennard-Jones Parameters" of the Supporting Information.
Effect of Replacing Lipid Tails. In order to better understand the effect of replacing full lipid tails by ethyl groups, quantum chemistry calculations were performed on smaller lipid fragments where the tails instead were replaced by propyl groups, but choline and phosphate groups were cut and replaced by a methyl group. This fragment is shown in Figure 3a, while the fragment in Figure 3b is the one for which optimization of all dihedral parameters for the lipid headgroup was carried out as described in the sections above. Parameters for the fragment in Figure 3a were optimized using the same procedure as that described previously. Table 2 shows force field parameters for backbone dihedral angles for the fragments shown in Figure 3a and b. It follows from Table 2 that optimized parameters for the    Figure 3a and Figure 3 around the CL−OSL bond are different for the two fragments. This can be related to the fact that in the fragment in Figure 3a the zwitterionic part of the lipid headgroup was substituted by a methyl group. However, the result also shows that the force field parameters describing the dihedral labeled C (around the CL−CTL2 bond, that is closest to the termination of the tail) are qualitatively the same for the two fragments. This gives confidence that replacing the lipid tails by ethyl groups does not influence our results significantly. For the final force field, we used the fragment shown in Figure 3b which includes the whole lipid headgroup with the lipid tails replaced by ethyl groups.
Order Parameters for C−H Vectors. Since the main motivation of this work was to improve the description of the orientational order of lipids in a fluid phase bilayer, it is relevant to start analysis from the order parameters of C−H vectors, which are computed according to eq 5 where θ is the angle between the C−H vector and the bilayer normal. Figure 4 shows order parameters for all C−H vectors in the three lipid bilayer systems calculated using the new and old parameter sets together with order parameters from NMR. 7 Figure 4, have been interpreted and not measured in the experiment. The uncertainty in the NMR data is ±0.02, and for the simulation results, the errors are all within ±0.01. One can see that the old force field strongly deviates from the NMR results for g1, g2, and g3, while the new parameter set provides significantly better agreement for these order parameters. Still for g1 the new parameter set overestimates the forking (splitting of the order parameters for the two C−H vectors), and results for POPC are not in full agreement with NMR.
For the γ and β order parameters, both force fields are in good agreement with experiments, while the α order parameter is somewhat closer to zero in the new parameter set, indicating that these C−H vectors have orientations closer to θ = 54.7°. Both force fields perform well for the order parameters in the lipid tails; however, those in the lipid headgroup are more challenging to reproduce. Still, overall, the new parameter set provides a substantially better description of the order parameters in the glycerol region of lipids and thus more correctly reproduces the conformational ensemble and ordering of lipids in the considered bilayers.
Area per Lipid and Bilayer Thickness. An important property often compared between experiments and simulations of lipid bilayers is the area per lipid. This quantity can be accurately determined from scattering experiments. From MD simulations, the area per lipid is calculated according to eq 6 = A L L n X Y lipid lipids (6) where L X and L Y are the dimensions of the simulation box in the plane of the bilayer and n lipids is the number of lipids in one leaflet. Since the box dimensions vary during a MD simulation in the NPT ensemble, the area per lipid fluctuates. The instantaneous area per lipid is shown as a function of simulation time in Figure 5, and average values over the trajectory are presented in Table 3. Both the new and old force fields give the area per lipid in perfect agreement with experiment for DMPC and POPC. For DPPC, the two parameter sets underestimate the area per lipid by roughly 0.01 nm 2 , but this difference is within the experimental and computational uncertainty. In addition to the area per lipid, another important property that governs the insertion of biomolecules, including membrane proteins, into the cell membrane is the bilayer thickness. From our simulations, we calculated the thickness according to two conventions both of which can be determined experimentally. The first is the distance between phosphate peaks in the electron density (D HH ) and the second is the Luzzati thickness (D B ) which is calculated according to eq 7 where d z is a distance along the z-direction (membrane normal) that is greater than the membrane thickness and ρ w is the probability distribution of water. Table 4 shows the Luzzati thickness and the distance between the phosphate peaks,   Table 4 that the experimental values are within the error bars of the results obtained using the new force field for all systems. The experimental values are also within the error bars of the results obtained with the old force field for POPC and DPPC, but for DMPC the experimental value is just outside the error interval. Electron Density Profiles and Scattering Form Factors. In order to investigate how well the new force field describes the overall structure of the lipid bilayer, electron density distributions were computed along the direction normal to the bilayer plane. Figure 6 shows the total electron density distribution and the contribution from water calculated with the new and old force fields for all three lipid bilayers. It can be noted that the results obtained using the two parameter sets are in close agreement with each other for the three investigated systems. Although X-ray scattering experiments probe these electron density distributions, obtaining them experimentally requires use of models which makes the comparison with simulations ambiguous. However, form factors are obtained by Fourier transformation of electron density according to eq 8 and can be directly compared with the scattered intensity observed experimentally   The errors in the Luzzati and phosphate thickness from simulations are all less than ±0.04 and ±0.05 nm, respectively. The uncertainties in the experimental Luzzati thickness are all within ±0.08 nm.
The Journal of Physical Chemistry B pubs.acs.org/JPCB Article where ρ(z) is the electron density at distance z from the bilayer center along the normal and ρ w is the electron density in bulk water. Figure 7 shows form factors calculated from the electron density distributions in Figure 6 together with experimental data. These results show that for the lipid bilayer systems investigated in this work both the old and new force fields give form factors in close agreement with X-ray scattering experiments. Area Compressibility. In order to investigate if the new force field correctly describes the area compressibility of the simulated lipid bilayers, we calculated it from fluctuations in the bilayer surface area A according to eq 9 where ⟨A⟩ is the mean bilayer area, σ A 2 is the variance of the area, T is the temperature, and k B = 1.380649 × 10 −23 J/K is Boltzmann's constant. This was done from 200 ns simulations performed using the Parinello−Rahman barostat 50 that gives the correct fluctuations for the NPT ensemble. The area compressibilities from our MD simulations are presented in Table 5 together with experimental values. 51−53 The new force field gives a lower area compressibility than the old parameter set. When comparing the results from our simulations with experiments, it is observed that the old force field gives closer agreement for DMPC while the new parameter set performs better for DPPC. For POPC, the area compressibilities obtained from both force fields are within the range reported from experiment.
Lateral Diffusion. In order to investigate how the new force field describes dynamics in the lipid bilayer, we calculated lateral diffusion coefficients. The diffusion coefficient was obtained from the mean square displacement (MSD) of lipids according to where r(0) and r(t) are the positions of a lipid molecule's center of mass in the plane of the bilayer at a time origin and at a later time t, respectively. The average is taken over all molecules and time origins on the trajectory. Experimentally, lateral diffusion coefficients are often measured using NMR spectroscopy 54,55 or fluorescence recovery after photobleaching (FRAP). 56 Lateral diffusion coefficients calculated from our and MD simulations are shown in Table 6 together with experimental values. 54−56 It is noted that the diffusion coefficients calculated for DMPC and POPC using the new force field underestimate the experimental   Cholesterol Content. Cholesterol is an important component of the cell membrane that can make up to 30−50 mol % of its total lipid content in some tissues. 60 Furthermore, cholesterol is known to rigidify and increase the ordering of hydrocarbon chains in lipid bilayers. 19,45 However, order parameters in lipid headgroups show only a weak dependence on the cholesterol content 45 which is not described correctly by force fields currently available. 21 In order to investigate how the new force field describes these effects, we calculated order parameters for POPC bilayers with different cholesterol contents (20,30, and 60 mol %). As expected, addition of cholesterol increases order parameters in the lipid tails, and this is discussed in the section "Effect of Cholesterol Content" of the Supporting Information. The results for the order parameters in the headgroup region of lipids as a function of cholesterol content are presented in Figure  8. It shows that both the old and new force fields have only a weak dependence of order parameters on cholesterol content, but the values of the order parameters obtained with the new parameter set show significantly better agreement with NMR experiments. 45 This is most clearly seen for the g2 and g3 order parameters. The new force field is also closer to the experimental result 45 for the g1S order parameter, while for g1R both models perform well. Also, α and β order parameters show reasonably good agreement with NMR data. Notably, for a cholesterol content of 20 and 60 mol %, a small forking of the α and β order parameters is obtained which is not observed experimentally. 45

■ CONCLUSION
In this work, a new strategy, based on simultaneous optimization of parameters to fit quantum chemical energies for an extensive set of molecular conformations, has been applied for deriving force field parameters for dihedral angles in headgroups of phosphatidylcholine lipids. The force field obtained using this scheme underwent extensive validation against a number of experimental properties including order parameters, area per lipid, area compressibility, bilayer thickness, scattering form The uncertainties in the calculated diffusion coefficients are all within ±1.1 cm 2 /s. b Measurement was conducted above the phase transition temperature (296 K for DMPC and 315 K for DPPC). The Journal of Physical Chemistry B pubs.acs.org/JPCB Article factors, and lateral diffusion coefficients. Furthermore, we investigated the effect of cholesterol content on order parameters. It was shown that the new force field provides generally good agreement with experiment for all of these properties. Although the new parameter set does not succeed to reproduce all order parameters within experimental uncertainties, it gives significantly better agreement for order parameters in the glycerol moiety compared to the previous version of the Slipids force field. An important point is that the new parameters were derived exclusively on the basis of high quality quantum chemical (DFT) computations. We note that the lipid headgroup conformations used in this work were sampled from a trajectory simulated using the original version of the Slipids force field. These conformations can therefore be biased by the initial parameters that were used to generate them. For this reason, force field parametrization using the approach undertaken in this work should ideally be carried out as an iterative process. This will however require quantum chemical energies to be recalculated which is the most computationally expensive part in the parametrization procedure. Further development can include update of partial atomic charges and Lennard-Jones terms consistently using the same quantum chemistry method. Furthermore, a larger data set will give better statistics and reduce fluctuations in the energy surfaces with respect to the number of lipid fragments used in the optimization. The recent emergence of large databases of quantum chemical computations 61 opens new possibilities for data-driven force field parametrization procedures. Such approaches will likely become increasingly important tools for constructing accurate force fields for various molecular systems in the future. Our work shows that such an approach carried out even on a limited scale can provide substantial improvement in force field development.
■ ASSOCIATED CONTENT
Archive containing developed force field parameters and topology files in Gromacs format (ZIP) Details on computed properties and quantum chemical calculations (PDF)