Using Open Data to Rapidly Benchmark Biomolecular Simulations: Phospholipid Conformational Dynamics

Molecular dynamics (MD) simulations are widely used to monitor time-resolved motions of biomacromolecules, although it often remains unknown how closely the conformational dynamics correspond to those occurring in real life. Here, we used a large set of open-access MD trajectories of phosphatidylcholine (PC) lipid bilayers to benchmark the conformational dynamics in several contemporary MD models (force fields) against nuclear magnetic resonance (NMR) data available in the literature: effective correlation times and spin–lattice relaxation rates. We found none of the tested MD models to fully reproduce the conformational dynamics. That said, the dynamics in CHARMM36 and Slipids are more realistic than in the Amber Lipid14, OPLS-based MacRog, and GROMOS-based Berger force fields, whose sampling of the glycerol backbone conformations is too slow. The performance of CHARMM36 persists when cholesterol is added to the bilayer, and when the hydration level is reduced. However, for conformational dynamics of the PC headgroup, both with and without cholesterol, Slipids provides the most realistic description because CHARMM36 overestimates the relative weight of ∼1 ns processes in the headgroup dynamics. We stress that not a single new simulation was run for the present work. This demonstrates the worth of open-access MD trajectory databanks for the indispensable step of any serious MD study: benchmarking the available force fields. We believe this proof of principle will inspire other novel applications of MD trajectory databanks and thus aid in developing biomolecular MD simulations into a true computational microscope—not only for lipid membranes but for all biomacromolecular systems.


■ INTRODUCTION
Ever since the conception of the Protein Data Bank (PDB) 1,2 and Genbank, 3,4 open access to standardized and searchable pools of experimental data has revolutionized scientific research. Constantly growing and improving in fidelity due to collaborative effort, 5−8 the now hundreds of databanks 9 fuel the data-driven development of biomolecular structure determination, 10 refinement, 11 prediction, 12 and design 13 approaches as well as the development of drugs, 14,15 materials, 16,17 and more. 18,19 It is clear that open data enables scientific progress that is far beyond the resources of a single research group or institute. Consequently, the call for public availability and conservation of data has extended to molecular dynamics (MD) simulation trajectories of biomolecules, 20−22 and the discussion on how and by whom such databanks for dynamic structures would be set up is currently active. 23−26 While there are currently no general MD databanks in operation, individual databanks are accepting contributions on nucleic acid, 27 protein/DNA/RNA, 28 cyclodextrin, 29 G-protein-coupled receptor, 30 and lipid bilayer 31 simulations.
Since 2013, the NMRlipids Project (nmrlipids.blogspot.fi) has promoted a fully open collaboration approach, where the whole scientific research processfrom initial ideas and discussions to analysis methods, data, and publicationsis all the time publicly available. 32 While its main focus has been on conformational ensembles of different lipid headgroups and on ion binding to lipid membranes, 32−34 the NMRlipids Project has also built a databank 31 (zenodo.org/communities/nmrlipids) containing hundreds of atomistic MD trajectories of lipid bilayers and indexed at nmrlipids.fi.
MD databanks are expected to be particularly relevant for disordered biomolecules, such as biological lipids composing cellular membranes or intrinsically disordered proteins. These, in contrast to folded proteins or DNA strands, cannot be meaningfully described by the coordinates of a single structure alone. Realistic MD simulations, however, can provide the complete conformational ensemble and dynamics of such molecules as well as enable studies of their biological functions in complex biomolecular assemblies. Unfortunately, the current MD force fields largely fail to capture the conformational ensembles of lipid headgroups and disordered proteins. 32,34−37 Therefore, before they can be used to draw conclusions, the quality of MD simulations must always be carefully assessed against structurally sensitive experiments. For lipid bilayers, such evaluation is possible against NMR and scattering data. 38 Here, we demonstrate the use of a pre-existing, publicly available set of MD trajectories to rapidly evaluate the fidelity of phospholipid conformational dynamics in state-of-the-art force fields. The rate at which individual molecules sample their conformational ensemble is traditionally used to assess if a given MD simulation has converged. Going beyond such practicalities, realistic dynamics are particularly desired for the intuitive interpretation of NMR experiments sensitive to molecular motions 39 as well as to understand the dynamics of biological processes where molecular deformations play a rate-limiting role, such as membrane fusion. 40 The here presented comprehensive comparison of dynamics between experiments and different MD models at various biologically relevant compositions and conditions is thus likely to facilitate the development of increasingly realistic phospholipid force fields.
Above all, our results demonstrate the power of publicly available MD trajectories in creating new knowledge at a lowered computational cost and high potential for automation. We believe that this paves the way for novel applications of MD trajectory databanks as well as underlines their usefulnessnot only for lipid membranes but for all biomolecular systems.

■ METHODS
Lipid Conformational Dynamics in NMR Data. We analyzed the veracity of phosphatidylcholine (PC) lipid dynamics in MD based on two quantities that are readily available from published 39,41−43 13 C NMR experiments and directly quantifiable from atomistic MD simulations: the effective C−H bond correlation times τ e and the spin−lattice relaxation rates R 1 .
Effective C−H Bond Correlation Times τ e . In a lipid bilayer in the liquid crystalline state, each individual lipid samples its internal conformational ensemble and rotates around the membrane normal. Lipid conformational dynamics are reflected in the second-order autocorrelation functions of its C−H bonds where the angular brackets depict time average, μ⃗ (t) is the unit vector in the direction of the C−H bond at time t, and P 2 is the second-order Legendre polynomial P x x ( ) (3 1) 2 1 2 2 = − . To analyze the internal dynamics of lipids, the C−H bond autocorrelation function is often written as a product where g f (τ) characterizes the fast decays owing to, e.g., the internal dynamics and rotation around membrane normal, and g s (τ) the slow decays that originate from, e.g., lipid diffusion between lamellae with different orientations and periodic motions due to magic-angle spinning conditions ( Figure 1). Ferreira et al. 41 have experimentally demonstrated that, for all phospholipid carbons, the motional correlation times contributing to g f are well below μs and to g s well above 100 μs. This separation of timescales gives rise to the plateau g(1 μs ≲ τ ≲ 100 μs) = S CH 2 illustrated in Figure 1. S CH is the C−H bond order parameter where θ(t) is the angle between the C−H bond and the bilayer normal. S CH can be independently measured using dipolar coupling in 13 C or quadrupolar coupling in 2 H NMR experiments. Knowing the set of S CH for all the C−H bonds in a lipid is highly useful in order to evaluate its conformational ensemble. 38 As S CH describe the conformational ensemble of the lipid, the fast-decaying component g f of the C−H bond autocorrelation function intuitively reflects the time needed to sample these conformations. The complex internal dynamics containing multiple timescales can be conveniently summarized using the effective correlation time which is related to the gray shaded area below the correlation function in Figure 1. The τ e detects essentially an average over all the timescales relevant for the lipid conformational dynamics. Their relation to process speeds is intuitive: an increase in longlived correlations increases τ e . Spin−Lattice Relaxation Rates R 1 . The C−H bond dynamics relate to R 1 , the spin−lattice relaxation rate, through where ω H is the 1 H and ω C the 13 C NMR Larmor frequency and N H is the number of hydrogens covalently bonded to the carbon. The rigid dipolar coupling constant d CH ≈ − 2π × 22 kHz for the methylene bond. The spectral density j(ω) is given by the Fourier transformation of the C−H bond autocorrelation function g(τ) (eq 1). Clearly, the connection between R 1 and molecular dynamics is not straightforward; the magnitude of R 1 does, however, reflect the relative significance of processes with timescales near the inverse of ω H and ω C . These two frequencies depend on the field strength used in the NMR experiments: Typically, R 1 is most sensitive to motions with timescales of ∼0.1−10 ns. (In our experimental data, 39,41−43 ω C = 125 MHz and ω H = 500 MHz, which gives (2π × 125 MHz) −1 = 1.3 ns and (2π × 625 MHz) −1 = 0.25 ns.) A change in given R 1 , therefore, indicates a change in the relative amount of processes occurring in a window around the sensitive timescale; inferring also the direction to which the processes changed (speedup/slowdown) requires measuring R 1 at various field strengths.
Data Acquisition and Analysis. All the experimental quantities used in this work were collected from the literature sources 39,41−43 cited at the respective figures.
The simulation trajectories were collected from the generalpurpose open-access repository Zenodo (zenodo.org), with the majority of the data originating from the NMRlipids Project 32,33 (nmrlipids.blogspot.fi). The trajectories were chosen by hand based on how well the simulation conditions matched the available experimental data (lipid type, temperature, cholesterol content, and hydration) and how precisely one could extract the quantities of interest from the trajectory (length of simulation and system size). Note that, apart from the sampling accuracy, simulation size does not affect τ e and R 1 (Figure 2). Table 1 lists the chosen trajectories of pure POPC (1-palmitoyl-2-oleoylglycero-3-phosphocholine) bilayers at/near room temperature and at full hydration; Table 2 lists the trajectories with cholesterol; and Table 3 those with varying hydration. Full computational details for each simulation are available at the cited Zenodo entry.
The trajectories were analyzed using in-house scripts. These are available in ref 83, along with a Jupyter notebook outlining an example analysis run. To enable automated analysis of several force fields with differing atom naming conventions, we used the mapping scheme developed within the NMRlipids Project to automatically recognize the atoms and bonds of interest for each trajectory.
After downloading the necessary files from Zenodo, we processed the trajectory with Gromacs gmx trjconv to make the molecules whole; that is, we made sure that, for each covalent bond, the partaking atoms are from the same periodic image of the molecule. For the united atom Berger model, hydrogens were added using the Gromacs 4.0.2 tool g_protonate. We then calculated the S CH (eq 3) with the OrderParameter.py script that uses the MDanalysis 84,85 Python library. The C−H bond correlation functions g(τ) (eq 1) were calculated with Gromacs 5.1.4 86 gmx rotacf (note that on MD timescales g s = 1 so that g = g f ) after which the S CH were used to normalize the g f to obtain the reduced and normalized correlation function that is, the integrand in eq 4. The effective correlation times τ e were then calculated by . If g f ′ did not reach zero within t anal /2, the τ e was not determined, and we report only its upper and lower estimates.
To quantify the error on τ e , we first estimate the error on g f ′(τ), where we account for two sources of uncertainty: g f (τ) and S CH 2 . Performing linear error propagation on eq 7 gives f CH Here, the ΔS CH was determined as the standard error of the mean of the S CH over the N l individual lipids in the system. 32 Similarly, we quantified the error on g f (τ) by first determining the correlation function g f m (τ) for each individual lipid m over  a Number of POPC molecules. b Number of water molecules. c Simulation temperature. Note that the temperature varied across these openly available simulation data, but in no case was T lower than in the experiment. Thus, as dynamics slows down when the temperature drops, any overestimation of τ e by MD (as typically seen in Figure 3) would get worse if the simulations were done at the experimental 298 K. d Trajectory length used for analysis. e Reference for the openly available simulation files.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article the whole trajectory and then obtaining the error estimate Δg f (τ) as the standard error of the mean over the N l lipids. Importantly, this gives an uncertainty estimate for g f (τ) at each time point τ.
To obtain the lower bound on τ e , we integrate the function That is, t l equals the first time point at which the lower error estimate of g f ′ reached zero, or t l = t anal /2, if zero was not reached before that point.
To obtain the upper error estimate on τ e , we first integrate the function g f ′(τ) + Δg f ′(τ) over time from τ = 0 until t u = min{t 0 , t anal /2}. Note, however, that this is not yet sufficient, because there could be slow processes that the simulation was not able to see. Although these would contribute to τ e with a low weight, their contribution over long times could still add up to a sizable effect on τ e . That said, it is feasible to assume (see Figure 1A) that there are no longer-time contributions to g f than something that decays with a time constant of 10 −6 s. We use this as our worst case estimate to assess the upper bound for τ e , that is, we assume that all the decay of g f from the time point t u onward comes solely from this hypothetical slowest process that decays with a time constant of 10 −6 s. The additional contribution to the upper bound for τ e then reads The R 1 rates were calculated using eq 5. The spectral density j(ω) was obtained from the normalized correlation function g f ′ by fitting it with a sum of 61 exponentials with logarithmically spaced timescales τ i ranging from 1 ps to 1 μs and then calculating the spectral density of this fit based on the Fourier transformation 41 The R 1 rate of a given C−H pair was first calculated separately for each lipid m (using eq 5 with N H = 1, and j m (ω) obtained for the normalized correlation function g f ′ m ). The resulting N l measurements per C−H pair were then assumed independent: their mean gave the R 1 rate of the C−H pair, and the standard error of the mean its uncertainty. The total R 1 rate of a given carbon was obtained as a sum of the R 1 rates of its C−H pairs. When several carbons contribute to a single experimental R 1 rate  Journal of Chemical Information and Modeling pubs.acs.org/jcim Article due to the overlapping peaks (for example, in C2 carbon in the acyl chains and the γ carbons), the R 1 from simulations was obtained as an average over carbons with overlapping peaks. The segment-wise error estimates were obtained by standard error propagation, starting from the uncertainties of the R 1 rates of the C−H pairs.
To gain some qualitative insight on the timescales at which the main contributions to the R 1 rates arise, we also calculated "cumulative" R 1 rates, R 1 (τ), which contained those terms of the sum in eq 12 for which τ i < τ. Note that here the g f ′ averaged over lipids was used; therefore, the "cumulative" R 1 (τ→∞) does not necessarily have exactly the same numerical value as the actual R 1 .  Finally, we note that the fit of eq 11 provides an alternative to estimating τ e because When the simulation trajectory is not long enough for the correlation function to reach the plateau, integrating g f ′ gives a lower bound estimate for τ e , while the sum of eq 13 includes also (some) contribution from the longer-time components via the fitting process. However, in practice, the fit is often highly unreliable in depicting the long tails of the correlation function, and thus we chose to quantify τ e using the area under g f ′ and estimate its uncertainty as detailed above.

■ RESULTS AND DISCUSSION
Using open-access MD simulation trajectories, we benchmark phospholipid conformational dynamics in six MD force fields.
We start with pure POPC bilayers in their liquid crystalline fully hydrated state (see Table 1 for simulation details and Figures 3 and 4 for the data) and then proceed to check the changes in dynamics when cholesterol is added to the bilayer (Table 2 and Figure 5) and when the hydration level is reduced (Table 3 and Figure 6). Our yardsticks are the effective correlations times τ e (eq 4) and the R 1 rates (eq 5) measured at 125 MHz 13 C (500 MHz 1 H) Larmor frequency; an MD model with correct rotational dynamics in a window around ∼1 ns will match the experimental R 1 rates, whereas the τ e reflect all the sub-μs timescales ( Figure 1). Pure POPC at Full Hydration: Slipids and CHARMM36 Reproduce τ e Excellently. The top panels of Figure 3 compare the effective correlation times τ e obtained for fully hydrated POPC bilayers in experiments (black) and in six different MD force fields (color). We see thatas implied by the discussion leading to eq 10sub-μs MD simulations typically lead to asymmetric error bars on τ e ; if these open-access trajectories were extended, the τ e values would more likely increase than decrease. Qualitatively, every force field captures the general shape of the τ e profile: dynamics slows down toward the glycerol backbone in both the headgroup and the tails.
Quantitatively, most MD simulations tend to produce too slow dynamics in the glycerol region ( Figure 3). This is consistent with previous results for the Berger model 41 and with the insufficient conformational sampling of glycerol backbone torsions observed in 500-ns-long CHARMMc32b2 87,88 simulations of a PC lipid. 89 The best overall τ e performance is seen in Slipids and in particular CHARMM36 (Figure 3). This is in line with CHARMM36 reproducing the most realistic conformational ensembles for the headgroup and glycerol backbone among the MD simulation force fields benchmarked here. 32,34 Indeed, it is important to keep in mind that the conformational ensembles greatly differ between force fields and are not exactly correct in any of them. 32,34 Consequently, the calculated τ e times and R 1 rates depict the dynamics of sampling a somewhat different and incorrect phase space for each model. To this end, we try to avoid overly detailed discussion on the models and rather concentrate on common and qualitative trends. That said, there are a few carbon segments in the data for which the experimental order parameters, R 1 , and τ e are all (almost) reproduced by simulations, suggesting that both the conformational ensemble and the dynamics are correctly captured by MD in these cases. For example, Slipids performs well at the β and α and CHARMM36 at the g 3 , g 2 , and C2 segments. These are, however, exceptions.
An Excellent τ e May Be Accompanied by a Poor R 1 , or Vice Versa. The lower panels of Figure 3 compare the experimental and simulated R 1 rates under the same conditions that were used for the τ e above. Notably, there are several instances where the R 1 comparison distinctly differs from what was seen for τ e .
There are cases where a matching R 1 is accompanied by a larger-than-experimental τ e . MacRog for the β, α, and g 1 segments provides a prominent example of this. Such a Journal of Chemical Information and Modeling pubs.acs.org/jcim Article combination suggests that MD has the correct relative weight of 1 ns-scale dynamics but has too slow long-time dynamics. There are also cases where τ e matches experiments but R 1 does not, such as the β and α segments in CHARMM36. Therein, a cancellation of errors occurs in τ e : the overestimation of the relative weight of 1 ns-scale dynamics is compensated by wrong dynamics at the other timescales. As CHARMM36 overall performs rather well for C−H bond order parameters, R 1 , and τ e , we proceed to study this shortcoming on the headgroup R 1 rates in some more detail.
Conformational Dynamics of PC Headgroup Segments in MD. Figure 4A zooms in on the headgroup (γ, β, and α) segments, whose τ e were not clearly visible on the scale of Figure 3. We see that, for γ, no force field provides both τ e and R 1 , but Slipids comes closest. For β and α, Slipids captures both measurables near perfectly. In other words, among the benchmarked force fields, Slipids gives the most realistic description of the conformational dynamics in the headgroup region. CHARMM36, e.g., overestimates (R 1 ) the relative weight of timescales around ∼1 ns.
To investigate closer how the differences between force fields arise, Figure 4B shows the "cumulative" R 1 (τ), where the ranges of steepest increase indicate timescales that most strongly contribute to R 1 rates.
For the γ segment, Figure 4B shows that, for models that overestimate the R 1 rate (MacRog, CHARMM36, and Slipids, see Figure 4A), the major contribution to R 1 arises at τ > 50 ps, whereas for models that underestimate R 1 (Lipid14 and ECC), the major contribution comes from τ < 50 ps. This also manifests in the distribution of fitting weights (α i in eq 11) in Figure 4C: the later non-zero weights occur, the larger is the resulting R 1 of γ.
For the β and α segments, Figure 4B shows that the main contribution to R 1 rates arises from processes between 100 ps and 1 ns. CHARMM36 has the largest relative weights of all models in this window ( Figure 4C), which explains its overestimation of R 1 of β and α. All the other models have R 1 rates close to experiments, but only Slipids simultaneously gives also the τ e correctly. Notably, Slipids has its largest weights at τ < 100 ps. Indeed, the considerable weights at short (<10 ps) timescales in all models except MacRog and at long (>10 ns) timescales in MacRog and Berger hardly manifest in R 1 . However, the latter contribute heavily to τ e , which is thus considerably overestimated by MacRog and Berger (Figure 3).
It would be highly interesting to identify the origins of the observed artificial timescales, particularly for the 0.1−1 ns window overpresented in CHARMM36, and propose how to correct those in the simulation models. After all, it is known that the R 1 rates of mono-and disaccharides 90 and proteins 91 in solution agree satisfactorily with experiments when the artificially low viscosity of TIP3P water is accounted for by a simple scaling. Viscosity at the bilayer−water interface, however, remains an open questionalthough one which a careful comparison between spin relaxation rates of lipid headgroups in simulations and experiments might be able to answer. Nevertheless, we refrain from further analysis here as the connection between the fitted correlation times and the correlation times of distinct motional processes, such as dihedral rotations and lipid wobbling, turns out to be highly non-trivial.
Effect of Cholesterol. An essential component in cell membranes, cholesterol has various biological functions. It is well known to order the acyl chains in lipid bilayers, but its effect on the headgroup is more controversial. 65,92 For example, it has been proposed that lipid headgroups reorganize to shield cholesterol from water. 92 However, while acyl chains do substantially order, NMR experiments show no significant conformational changes in the headgroup upon addition of even 50% of cholesterol, which suggests that the tail and head regions behave essentially independently. 32, 65 In principle, the headgroups could shield cholesterol from water even without changing their conformational ensemble: by reorienting only laterally on top of the cholesterol. In this case, one would expect the rotational dynamics of headgroup segments to change when cholesterol is added.
Top panels of Figure 5A depict the experimental effective correlation times τ e in pure POPC bilayers and in bilayers containing 50% cholesterol. The τ e at the glycerol backbone slow down markedly when cholesterol is added. Tail segment dynamics slows down too, most notably close to the glycerol backbone. In stark contrast, the τ e of the headgroup segments (γ, β, and α) remain unaffected. Furthermore, cholesterol induces Error estimates for the simulated Δτ e are the maximal possible based on the errors at 0% and 50% cholesterol; for other data, regular error propagation is used. The Berger Δτ e is not shown because the available open-access trajectories were too short to determine meaningful error estimates. Table 2 provides further simulation details; for segment labeling, see Figure 3.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article no measurable change in the headgroup β and α segment dynamics at short (∼1 ns) timescales, as demonstrated by the experimental R 1 rates ( Figure 5A, bottom panels). That said, there is a small but measurable impact on R 1 at γ. In summary, these experimental findings support the idea 39 that the acyl chains and the headgroup can respondin agreement with the relative uncoupling of the PC head and tails reported in simulations 93 almost independently to changes in conditions and composition. All four benchmarked force fields ( Figure 5B) qualitatively reproduce the experimental increase in τ e : Slipids and CHARMM36 give rather decent magnitude estimates, while MacRog grossly overestimates the slowdown of glycerol, C2, and C3 segments. Notably, MacRog appears to predict slowdown also for the headgroup (β and α), for which experiments detect no change. Note that, while CHARMM36 correctly shows no change in τ e of the γ, β, and α segments, it does predict an erroneous ΔR 1 for all three, indicating some inaccuracies in the headgroup rotational dynamics. Such inaccuracies might be reflected in the recent findings 94 (obtained using CHARMM36) that the headgroups of PClipids neighboring (within 6.6 Å) a lone cholesterol spend more time on top of the said cholesterol than elsewhere. Interestingly, the tail ΔR 1 seem to be qualitatively reproduced by all three allatom force fields, whereas Berger fails to capture the trend at the oleoyl double bond. All these findings are in line with the general picture obtained from C−H bond order parameters: 38 MD simulations capture the changes in the acyl chain region rather well but changes in and near the glycerol backbone region can be overestimated. Of the benchmarked force fields, CHARMM36 appears most realistic in reproducing the effects of cholesterol on the glycerol backboneand Slipids on the PC headgroup conformational dynamics.
Effect of Drying. Understanding the impact of dehydration on the structure and dynamics of lipid bilayers is of considerable biological interest. Dehydrated states are found, e.g., in skin tissue. Most prominently, the process of membrane fusion is always preceded by removal of water between the approaching surfaces, and thus the dehydration-imposed changes can considerably affect fusion characteristics, such as its rate. Figure 6A shows how a mild dehydration affects C−H bond dynamics in the PC headgroup and glycerol backbone; the plot compares the experimental effective correlation times τ e measured for POPC at full hydration and for DMPC (1,2dimyristoyl-sn-glycero-3-phosphocholine) at 13 waters per lipid. The τ e are the same within experimental accuracy, which suggests two conclusions. First, the headgroup (γ, β, and α) τ e are rather insensitive to the chemical identities of the tails. This is analogous to what was seen experimentally when adding cholesterol ( Figure 5A): structural changes in the tail and  Table 3 for simulation details.
Journal of Chemical Information and Modeling pubs.acs.org/jcim Article glycerol regions do not (need to) affect the headgroup dynamics. Second, a mild dehydration does not alter the τ e in the headgroup and glycerol regions. Figure 6B shows the effects of dehydration in three MD models. Combination of the unrealistically slow dynamics, especially in the glycerol backbone (Figure 3), and the relatively short lengths of the available open-access trajectories (Table 3) led to large uncertainty estimates; thus, we only point out qualitative trends here. For all headgroup and glycerol segments, the simulated τ e indicate slowdown upon dehydration. This is manifested in the increase in the magnitude of the error estimate (cf. the Berger data for β and α) as well as in the increase in the lower limit of the error. For CHARMM36 the lower error estimates stay almost constant all the way until 7 w/l, whereas for Berger and MacRog, they hint that a retardation of dynamics starts already between 15 and 10 w/l.
These simulational findings suggest that experiments reducing hydration levels below 10 w/l would also show an increase in τ e . This prediction is in line with the exponential slowdown of the headgroup conformational dynamics upon dehydration that was indicated by 2 H NMR R 1 measurements of DOPC bilayers: R 1 ≈ exp( − n w/l /4). 95 The slowdown was attributed to the reduced effective volume available for the headgroup 95 as it tilts toward the membrane upon dehydration; such tilt is observed via changes of the lipid headgroup order parameters 96 and is qualitatively reproduced by all the simulation models. 32 Figure 6C shows a collection of experimental 13 C NMR R 1 rates for the headgroup segments at different water contents; in addition to the full hydration POPC data from Figure 3, DMPC at 13 w/l 42 and POPC at 20 and 5 w/l 43 are shown. Experimentally, an increasing trend with decreasing hydration is observed for all three segments, indicating changes of headgroup dynamics at short (∼1 ns) timescales. Interestingly, only CHARMM36 captures this, whereas Berger and MacRog give decreasing R 1 rates for β and α.
The slowdown characteristics discussed here are of significance not only for computational studies of intermembrane interactions, such as fusion, but also when simulating a bilayer (stack) under low hydration: slower dynamics require longer simulation times for equilibration, for reliably quantifying the properties of the bilayers, and for observing rare events.

■ CONCLUSIONS
We have here demonstrated that open-access databanks of MD trajectories enable the creation of new scientific information without running a single new simulation. More specifically, we have benchmarked (against published NMR data 39,41−43 ) the conformational dynamics of a wide range of phosphatidylcholine MD models using existing open-access trajectories from the Zenodo repository, in particular those belonging to the NMRlipids Databank (zenodo.org/communities/nmrlipids).
We found that every MD model captures the 13 C NMR effective correlation time (τ e ) profile of POPC qualitatively, but that most are prone to too slow dynamics of the glycerol backbone C−H bonds (Figure 3). While no force field perfectly reproduces all the experimental data, CHARMM36 and Slipids have overall impressive τ e . This is a particularly exciting finding concerning CHARMM36 as it is also known to reproduce quite well the experimental conformational ensemble. 32 That said, we do find that CHARMM36 struggles with the balance of dynamics in the headgroup region: The R 1 rates, sensitive for ∼1 ns processes, are too high for the γ, β, and α segments ( Figure   4). In fact Slipids, which also reproduces the experimental headgroup order parameters, 32 appears to outperform CHARMM36 when it comes to headgroup conformational dynamics ( Figure 4).
Further, we found that when cholesterol is mixed into a POPC bilayer, MD qualitatively captures the slowdown of conformational dynamics in the tail and glycerol regions ( Figure 5). However, the benchmarked force fields overestimate the changes in the ∼1 ns dynamics of the headgroupexcept Slipids, which captures well the effects of cholesterol on PC headgroup conformational dynamics.
Finally, we found that, upon reducing the water content below 10 waters per lipid, MD exhibits slowdown of headgroup and backbone dynamics in qualitative agreement with experimental data. That said, only CHARMM36 (but not Berger or MacRog) qualitatively captures the experimentally detected increase in R 1 rates upon dehydration ( Figure 6).
While work is still needed in capturing even the correct phospholipid conformations, 32 realistic dynamics will be an essential part of developing MD into a true computational microscope. Here, we gathered a set of published experimental 13 C NMR data on phosphatidylcholine conformational dynamics and charted the typical features of the existing MD models against it, thus laying the foundation for further improvement of MD force fields. Importantly, our work demonstrates the potential of open-access MD trajectories in achieving such benchmarks at a reduced computational and labor costbut it also highlights the challenges inherent in using such data: not all system permutations might readily exist (here the dehydration data for Lipid14 and Slipids were lacking, see Figure 6); the available sampling (simulation length and size) might vary, requiring extreme care with error estimation; and one has to remain aware of the subtle differences in the many simulation parameters (barostats, temperatures, characteristics of the MD engines, etc.). That said, it has not escaped our notice that a pool of well indexed and documented open-access data provides an ideal platform for automation, which in turn will facilitate faster progress in pinpointing the typical failures of the existing force fields, in identifying key differences in models describing chemical variations of the same molecule type (such as different lipid headgroups), and in developing better models through data-driven approaches.
■ ASSOCIATED CONTENT * sı Supporting Information