A Data-Driven Dimensionality Reduction Approach to Compare and Classify Lipid Force Fields

Molecular dynamics simulations of all-atom and coarse-grained lipid bilayer models are increasingly used to obtain useful insights for understanding the structural dynamics of these assemblies. In this context, one crucial point concerns the comparison of the performance and accuracy of classical force fields (FFs), which sometimes remains elusive. To date, the assessments performed on different classical potentials are mostly based on the comparison with experimental observables, which typically regard average properties. However, local differences of the structure and dynamics, which are poorly captured by average measurements, can make a difference, but these are nontrivial to catch. Here, we propose an agnostic way to compare different FFs at different resolutions (atomistic, united-atom, and coarse-grained), by means of a high-dimensional similarity metrics built on the framework of Smooth Overlap of Atomic Position (SOAP). We compare and classify a set of 13 FFs, modeling 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) bilayers. Our SOAP kernel-based metrics allows us to compare, discriminate, and correlate different FFs at different model resolutions in an unbiased, high-dimensional way. This also captures differences between FFs in modeling nonaverage events (originating from local transitions), for example, the liquid-to-gel phase transition in dipalmitoylphosphatidylcholine (DPPC) bilayers, for which our metrics allows us to identify nucleation centers for the phase transition, highlighting some intrinsic resolution limitations in implicit versus explicit solvent FFs.


S1.1 Area Per Lipid (A L )
Area per Lipid A L is defined as where B x and B y are the width of the simulation box in the x and the y axis, respectively, and N L is the number of lipid molecules in a single layer.

S1.2 Membrane Thickness (D HH )
Membrane thickness D HH is given by the average distance of the phosphate groups projected along the direction normal to the plane define by the bilayer S1 (z axis in our case), namely: where z up is the average of the z-position of the phosphate group in the upper layer, and z up is the average of the z-position of the phosphate group in the lower layer.

S1.3 Membrane Compressibility (K A )
Membrane compressibility K A is defined as where A is the total bilayer area, γ is the surface tension, k B is the Boltzmann constant, T is the temperature, N L is the number of lipids in a single layer, A L is the area per lipid, and σ 2 A is the mean square fluctuation of A L .

S1.4 Lateral Diffusion Coefficient (D l )
Lateral Diffusion Coefficient D l can be computed exploiting the Einstein relation for a diffusive particle in a 2-dimensional space where [∆ r] 2 is the mean square displacement (MSD) for all the lipids in the membrane. Practically speaking, for long lag times t, it is possible to obtain D l as the slope of [∆ r] 2 : The number obtained from a MD simulation using this relation is often called D pbc l . This is due to the fact that the presence of periodic boundary condition cause a systematic deviation of the diffusion coefficient values. Following hydrodynamic theory, it is possible to obtain an analytical correction. [5][6][7] In this work, we computed and showed D pbc l values only to prove the variability of this observable in different force field parametrization. A corrected (and more precise) evaluation of D l is beyond the scope of this work.

S2 Smooth Overlap of Atomic Position
The Smooth Overlap of Atomic Position (SOAP) 8 is a fairly recent many-body atomic descriptor, that aims to accurately reproduce atomic or in general particle densities of a given system of interest. The description obtained by SOAP, commonly called "atomic-environment" is usually encoded in a high dimensional vector, that stores all the information regarding a specific center of application of the descriptor. Others known examples of atomic descriptors are: S2 Atom-centered Symmetry Functions (ACSFs), 9 Coulomb Matrix (CM), 10 sine matrix, 11 many-body tensor representation (MBTR). 12 All the notations and terminology about SOAP were first introduced by Bartok et al., 8 where the authors extensively described and validated all the mathematical and physical properties of such framework. In this section we will include some basic notations and expressions to help the reader get familiar with such formalism, but we will not enter in too much detail on the framework, since it deviates from the scope of our article 1 . Let a generic system of interest be composed of a set G of I = (1, . . . , i, . . . , N ) atomic sites 2 , given a cutoff sphere S(r cut ) the ith-particle density is formally defined as where the index a runs over a subset A ⊆ G of sites included in the spherical cutoff S(r cut ). Equation S1 assumes only one species is taken into account in the summation, but the same concepts can be extended to multiple species systems without losing generality, as stated in the original work. 8 According to the original SOAP formulation, the sum of the delta functions of Eq. S1 is then replaced by a gaussian smoothed density function (see original SOAP paper for details). 8 This is expressed combining a set of radial basis {g n } and spherical harmonics Y lm (θ, φ) functions where c a nlm (Γ) are the spherical harmonics and radial functions expansions' coefficients, that cycles over the indices n (the index related to the radial expansion), l and m (the indices related to the spherical harmonics expansion), and the index a of centers included in the cutoff function.ρ (i) (Γ) is also commonly referred as Gaussian smoothed density function. The generic expansion coefficient c nlm (Γ) is defined as the inner product where dV represent the infinitesimal volume element, and we left out a and i indices for clarity, since they do not affect the definition. Eq. S2 allows the construction of a description of the local density of the atomic site, that in general approximates well the estimate of ρ (i) (Γ) (Eq. S1) and retains its permutation invariant property (since permutation of the species only affects the order of the summation, but not the result). From now on, we will refer tõ ρ as simply ρ, and we will also drop the explicit dependency from the system configuration of the expansion coefficients to simplify our notation. 1 For the interested reader, we would like to suggest the reading of Ref. 8,13 2 We use the general term sites to indicate both real and pseudo atoms used as centers for the SOAP representation. Eqs. S9 S12 Figure S1: Flowchart of our computational approach. We start from the full MD trajectory of a sample bilayer. As a first rework, all the non relevant atoms are removed (i.e. solvent atoms), then upon choosing a relevant representation for our system we further reduce our trajectory of interest. The final reworked trajectory is ready to be fed to the DScribe 14 package that compute the SOAP power spectra. Some additional information of these few first steps is also reported in the following figure S2. The power spectra represent the central results of our workflow since they can be reworked to give measures of similarity (Eq. S9) and distance (Eq. S12) between environments. Moreover, power spectra can be appropriately (i.e. after a dimensionality reduction analysis) fed to a clustering algorithm to collect information about their features.

S4
Given this approximation (Eq. S2), in Ref. 8 it is shown that to keep a local density that results rotationally invariant, one can consider the so-called "power spectrum" of Eq. S2, which leads to the expression with coefficients c n lm defined as the inner product like in Eq. S3. Recalling that we are presenting the SOAP formalism only for an homogeneous system, extension to multiple species is straightforward, which would lead to a similar result to Eq. S4, namely with α and β being the identifiers for 2 kind of species (i.e. the atomic number for "real" atomic centers). Additional information and discussion of the multispecies treatment can be found in Refs. 13 The vector defined in Eq. S4 is formally referred as the partial power spectrum of the Gaussian smeared atomic density, which defines the SOAP vector and it represent also the computational output obtained from the SOAP calculation using the DScribe 14 package. A schematic of the pseudo-code for real atomic centers is available at the DScribe documentation web-page 3 . Moreover, we report schematic representations of our computational workflow in Fig. S2 and S2. All the SOAP calculations were done using the DScribe 14 python package. The complete set of input files and scripts used to analyze the trajectories is available on GitHub at https://github.com/GMPavanLab/lipids-comparison.

S2.1 SOAP similarity measure
Given the mathematical description of SOAP, it appears natural to consider the possibility to compare different atomic environments. This task can be done through the definition of a (dis)similarity measure (i.e., a metric) although others strategies have been already explored. 13 In Ref. 8 the authors laid the basis of a straightforward (dis)similarity measure between two different environments (indexed in this section with the letters i and j, which refer to the local environments surrounding the SOAP centers i and j). Operatively, this metrics is defined as inner product of two densities, 8 namely  1-bead representation 4-beads representation Figure S2: Schematic representation of the SOAP analysis. This refers mainly to the last trajectory reduction and to the SOAP spectra calculation part of the flowchart in Figure S2. a) centers selection of a sample Martini 2.2 lipid and the definition of the single lipid Gaussian smoothed density profile. From left to right: the asterisks identify the 4 sites (3 kinds of bead) considered by SOAP; concentric circles of different colors identify the application of the SOAP density profiles on every center. b) at the assembly level, the SOAP lipid environment is completed by considering all the other lipids contribution inside a specified r cut centered in the lipid of choice (white asterisk). c) computation through the DScribe library 14 of the SOAP power spectra which define the complete lipid environment, up to the chosen cutoff. The target lipid (identified by the asterisk) representation is affected by all the others in its surroundings, defining its local environment. This schematic procedure is repeated for all the lipid in a target membrane to obtain its complete local characterization (lipid-wise). d) Isodensity surfaces, having an illustrative purpose, for the 1-and 4-beads perlipid representations, obtained using the QuickSurf representation of VMD 15 using default parameters. S6 possible rotations, leading to the definition of a SOAP similarity linear kernel whereR represent a generic rotation, the tilde superscript means that the kernel is not unit-normalized and ζ is an integer that needs to be ζ 2 in order to maintain the angular information. 8 In all our calculations the ζ-constant was kept equal to two. Recalling the definition of SOAP power spectrum (Eq. S4), after some algebraic manipulation (for which a complete derivation is available in Ref. 8 ), we can rewrite Eq. S7 as wherep i represents the non-normalized SOAP power spectrum vector of the i-th environment. 8 To make all the kernels commensurable, we decided to consider the unit-normalized kernel, defined as gives a similarity measure that can vary from 0 (the environments are not superimposed) to 1 (the environments are completely superimposed).

S2.2 Considerations on the SOAP metrics
The definition of the SOAP similarity kernel in terms of a dot product (Eq. S8) leads naturally to a definition of a similarity metric. Let a = (a 1 , a 2 , . . . , a D ) and b = (b 1 , b 2 , . . . , b D ) be two points in Euclidean D-space, their distance is given by (S10) The position of a point in a Euclidean D-space is tied to the knowledge of Euclidean vector seen as a directed line segment from the space frame origin (tail) to a point in that space (tip). Its magnitude is simply its norm, a = √ a · a, which can be seen as the distance between the vector tail and tip. If the relationship between a and b involve a direction, this relation is expressed simply as , which is called the displacement vector from b to a. The Euclidean distance between a and b then, is just the Euclidean length of this displacement vector At this point combining Eq. S11 with the definition of SOAP kernel (Eq. S9) we can measure the similarity distance between two environments as Recently, the definition of this metric and some derivations from it proven to be a useful tool in evaluating the similarity of crystalline, disordered and molecular compounds. 13

S2.3 Reducing the dimensionality of SOAP descriptors
The total SOAP descriptor feature space dimensionality n D can be predicted by the parameters used for computing the SOAP power spectra. In a system formed by a single type of beads, we have While, in presence of different kind of beads (like in our case) the dimensionality reads where n beads is the number of kind of beads present. This feature is called crossover-enabled representation, because it can take into account the crossspecies evaluation of the SOAP power spectrum. In our case, we have a n D = 2700, because we choose n max = l max = 8, and we had 3 kind of beads in our representation. Irrespective of the cutoff value, the dimensionality of the SOAP spectra calculated using both the TwoNN algorithm 16 and the FCI algorithm 17 is greater than 20. It is clear that the size of the raw SOAP embedding for the 4 bead reduced representation is highly redundant. In fact, 80% of the variance is accounted for by just 5 pca components as shown in Table S1. PCA was used as dimensionality reduction technique throughout the work, for its its efficiency and low computational cost.

S2.4 Jensen-Shannon divergence
To validate the average SOAP distance metrics, we computed the Jansen-Shannon divergence over the probability densities sampled by MD for all systems.
In order to do so, we proceed as follows. Firstly, the SOAP spectra were projected to the first 5 dimensional component calculated using a dataset composed of samples of SOAP spectra for each full atomistic systems. The probability density for each system was estimated via the non parametric Pak algorithm 18 on 20 bootstrapped samples in the reduced representation.
The Jansen-Shannon divergence was then calculated between the probabilities densities extrapolated over a much finer grid for all FF pairs. The extrapolation of out of sample data is not dealt directly by the available Pak implementation (https://github.com/mariaderrico/DPA) and was carried out by applying a distance waiting criteria on neighbouring; additionally, the lack of a asymptotic decay intrinsic of the non parametric nature of the estimator, was dealt with by applying a smoothing term.
To avoid memory issue resulting from the number of grid points required (about n d , for a discretization into n intervals) and avoid estimating probabilities over scarcely sampled areas of the domain, the grid was generated in a lazy manner and was progressively filtered via a Multilabel Perceptron classifier, tuned to strip out grid points likely to belong to the discretization (hence outside the range of values generated by the various ensembles) and not to any of the force fields systems.
The use of alternative to this strategy for the determination of the probability densities making use of Variational Autoencoders (thus providing the possibility of calculating divergences out-of-the-box) have been considered and deemed beyond the score of the present works.

S3 PAMM clustering
The analysis of molecular motifs was carried out following the same workflow for the two data sets of lipid membranes. First, the collection of all the SOAP vectors for all the five temperatures of specific force field was reduced to a sub-set using a farthest point sampling method. We used 2000 representative points, sampled non uniformly in this way for the grid definition. The KDE was then carried out to select the underlying modes. We used as parameters for the clustering input-dim = 5, fspread = 0.3, quick-shift = 1, merger = 0.001 and a total of 73 bootstrap runs for the statistical calculations. The complete set of input files and scripts used for the PAMM unsupervised clustering and motifs analyses of the MD trajectories is available on GitHub at https://github.com/GMPavanLab/lipids-comparison.

S3.1 Choosing PAMM clustering parameters
We started our analysis building a dataset that contains conformations from all the 5 different temperature simulated. Operatively, for each temperature we took a subsample of 30,000 points randomly selected for which we computed the SOAP power spectra. Using this complete dataset we employed a PCA dimensionality reduction on the spectra, considering the first 5 eigenvalues to compute PAMM clusters. A 2D projection of this PCA reduction for the whole data set is available in Figures S3 and S4. The PAMM clustering algorithm 19 output, as discussed by the authors of the original work, mainly depends by few parameters that affect the way the Kernel Density Estimation (KDE) is computed. The two parameters that affect the estimation are: fspread, and Ngrid. The fspread is a localization parameters that affects the anisotropic multivariate Gaussian kernel used during the KDE step of the algorithm, when the optimal KDE bandwidth is estimated. 19 From a practical point of view, from the data reported in Fig.S3 we can observe that for values of 0.3 and lower we have a substantial stability of the estimation (in terms of number and position) of micro clusters, at a given Ngrid (2000 points). Furthermore, the robustness of this clustering is confirmed by a subsequent bootstrap protocol. As a consequence, we can obtain a hierarchical dendrogram that shows the relations between the micro clusters, and clearly shows that there are two main families of clusters (as expected in a transition between 2 phases). The Ngrid parameter is the number of points extracted from the total dataset used to obtain a reliable KDE within a reasonable computational time (the use of the whole dataset is extremely computational intensive and it is practically out of reach). 19 From the data reported in Fig.S4 it appears evident that increasing the points of the grid makes the estimation more and more robust. A finer and more sparse grid (1000 and 500 points) has the drawback of a coarser density that leads to a hierarchical dendrogram that can have a misleading interpretation of the clusters, whereas a bigger grid (3000 and 2000 points) gives a more robust macro clusters estimation at the cost of an higher computational effort. S10 Figure S3: PAMM clustering output vs. the fspread parameter applied to the whole data set (all 5 temperatures) keeping Ngrid fixed at 2000 points (used in the analysis of Figure 5 in the main paper). From top to bottom: microclusters, percentages of the micro clusters population, dendrograms showing the micro-clusters hierarchy, molecular motifs (macro clusters).
Considering both Ngrid and fspread together, we can summarize the effect of both saying that the raise of Ngrid and the lowering of fspread return a higher resolution in PAMM analysis, raising the computational cost. In any case, the robustness of the analysis is confirmed by the conservation of the number of macroclusters (2, one per phase) in all the possible combination of parameters explored. Finally after all the test performed, we decided to use values of fspread= 0.3 and Ngrid= 2000 points because in our opinion they give the best compromise between computational cost needed and precision in the estimation. S11 Figure S4: PAMM clustering output vs. the Ngrid parameter applied to the whole data set (all 5 temperatures) and keeping fspread fixed at 0.3 (used in the analysis of Figure 5 in the main paper). From top to bottom: micro-clusters, percentages of the micro clusters population, dendrograms showing the microclusters hierarchy, molecular motifs (macro clusters).

S4 Voronoi tessellation analysis
To validate the results of our technique with a comparable method, we computed the Voronoi tessellation analysis for all the calculations of DPPC membranes performed. Namely, we computed frame by frame the Voronoi tesselation on the XY plane for both the sides of the bilayer using the APL@voro program, 20 considering the phosphate bead as the center of every cell. Following the available experimental data for DPPC area per lipids (at 293 K in gel phase we have 21 0.473 nm 2 , while at 318 K in liquid phase we have 22 0.623 nm 2 ), we set as an area per lipid threshold for liquid/gel phase at 0.550 nm 2 , and we labeled all the lipids accordingly.  Figure S5: Voronoi tessellation for the XY plane for DPPC membranes at different temperatures for Martini 2.2 and Dry Martini models (we are showing the same frames as in Figure 5). We colored in red the cells with a surface smaller than 0.55 nm 2 , and blue otherwise. With this technique we are qualitatively in agreement with the results of SOAP/PAMM analysis, showing a sharp transition between 293 K and 323 K for Martini 2.2, and a lack of phase transition for Dry Martini.
From a first qualitative analysis ( Figure S5) we can observe a sharp population inversion for Martini 2.2 between 323 and 293 K, whole for Dry Martini the we can see a growth in the red (area < 0.550 nm 2 ) population. After this first test, we perform a quantitative analysis S6, finding that in both the analysis techniques the population inversion is present only for the Martini 2.2 model (right side of the Figure), substantially confirming the results obtained with SOAP-PAMM analysis. On the other hand, we do not have a precise superposition of the identified lipids within the 2 techniques: this can suggest that the analysis of the whole molecules (that means the head and the lipid tail, as performed for SOAP-PAMM calculations), can return different results. Furthermore, this Voronoi-based analysis is based on a XY projection making it not easily adaptable to, e.g., vescicles. SOAP-PAMM analysis instead, being S13 Voronoi population analysis Figure S6: Voronoi tessellation analysis results. In the left side of the plot, we have the fraction of the lipids that have a Voronoi cell area smaller and larger than 0.55 nm 2 (in blue and red, respectively). In the right side of the plot, we show the percentage of the lipids that are identified in the same phase (liquid or gel) for both the SOAP and Voronoi analysis (green lines) and the fraction of lipid that are assigned to different groups in the two techniques (pink lines). intrinsically 3-dimensional, can be used also in other, less ideal contexts. S14 S5 Simulation details

S5.1 All-atom Force Fields
To prepare the initial configurations for Charmm36 and Slipids force fields we used the web tool CHARMM-GUI, 23 putting a 50 Å-high layer of TIP3P 24 water for both sides of the membrane without counterions. For AMBER LIPID17, we took one bilayer configuration without water from Charmm36 system preparation and using tleap from AmberTools20, 25 we solvated it and converted the model to GROMACS format using Parmed. 26 We thus followed the minimization and equilibration protocol of CHARMM-GUI, namely: • 5000 steps of Steepest descent minimization; • 2.5e5 steps (timestep of 2 fs, 500 ps) of MD and position (40 kJ/mol/nm 2 ) and dihedral restraints (100 kJ/mol/rad 2 ) in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 4.5 × 10 −5 bar −1 ) and thermostat 27 (reference temperature of 303 K with a coupling constant of 1 ps); • 2.5e5 steps (timestep of 2 fs, 500 ps) of MD in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 4.5×10 −5 bar −1 ) and thermostat 27 (reference temperature of 303 K with a coupling constant of 1 ps); S15 • 5e5 steps (timestep of 2 fs, 1 ns) in NPT ensemble using a Parrinello-Rahman barostat 28 (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 4.5 × 10 −5 bar −1 ) and Nosé-Hoover thermostat 29 (reference temperature of 303 K with a coupling constant of 1 ps).
All the choice regarding electrostatic long-range calculations are the same as in production runs for every AA FF (see table S2). At the end of this minimization protocol we simulated a 100-ns long equilibration run with the same parameters as the final production run (see table S2).

S5.2 United Atom Force Fields
We obtained the initial conformations from the internet: • Berger FF from Ref; 30,31 • Gromos-CKP from Lipidbook 32 • Gromos-43a1-s3 from Lipidbook 32 we subsequently solvated the system with a 50 Å-high layer of SPC water 33 for both sides of the membrane, and applied the all-atom CHARMM-GUI protocol for minimization and equilibration. After this initial protocol, we performed a a 100-ns long equilibration run with the same parameters as the final production run (see table S2).

S5.3 Non-polarizable Martini Force Field
To prepare the initial configurations for Martini 2.2 force field we used the web tool CHARMM-GUI, 23 putting a 50 Å-high layer of Martini water 34 water for both sides of the membrane without counterions. Martini 3.0 beta 3.2 has the same mapping of Martini 2.2, thus we used the same initial conformation as a starting point. We followed the minimization and equilibration protocol given by CHARMM-GUI, namely: • 5000 steps of Steepest descent minimization with a soft-core potential on LJ and Coulomb interaction; • 5000 steps of Steepest descent minimization; • 5e5 steps (timestep of 2 fs, 1 ns) of MD and position restraint on head group (200 kJ/mol/nm 2 ) in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 3×10 −4 bar −1 ) and velocity rescale thermostat 35 (reference temperature of 303 K with a coupling constant of 1 ps); • 2e5 steps (timestep of 5 fs, 1 ns) of MD and position restraint on head group (100 kJ/mol/nm 2 ) in NPT ensemble using a Berendsen barostat S16 (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 3×10 −4 bar −1 ) and velocity rescale thermostat 35 (reference temperature of 303 K with a coupling constant of 1 ps); All the choice regarding electrostatic long-range calculations are the same as in production runs (see table S2). At the end of this minimization protocol we simulated a 100-ns long equilibration run with the same parameters as the final production run (see table S2).

S5.4 Polarizable Martini Force Field
To prepare the initial configurations for Martini 2.2p and Martini 2.3p force fields we used the web tool CHARMM-GUI, 23 putting a 50 Å-high layer of POL water 36 water for both sides of the membrane without counterions. We followed the minimization and equilibration protocol given by CHARMM-GUI, namely: • 5000 steps of Steepest descent minimization with a soft-core potential on LJ and Coulomb interaction; • 5000 steps of Steepest descent minimization; • 5e5 steps (timestep of 2 fs, 1 ns) of MD and position restraint on head group (200 kJ/mol/nm 2 ) in NPT ensemble using a Berendsen barostat S17 (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 3×10 −4 bar −1 ) and velocity rescale thermostat 35 (reference temperature of 303 K with a coupling constant of 1 ps); • 2e5 steps (timestep of 5 fs, 1 ns) of MD and position restraint on head group (100 kJ/mol/nm 2 ) in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 5 ps, and a compressibility of 3×10 −4 bar −1 ) and velocity rescale thermostat 35 (reference temperature of 303 K with a coupling constant of 1 ps); All the choice regarding electrostatic long-range calculations are the same as in production runs (see table S2). At the end of this minimization protocol we simulated a 100-ns long equilibration run with the same parameters as the final production run (see table S2).

S5.5 Dry Martini Force Field
To prepare the initial configurations for Dry Martini force field we used the web tool CHARMM-GUI, 23 putting a 50 Å-high empty layer on both sides of the membrane without counterions. We followed the minimization and equilibration protocol given by CHARMM-GUI, namely: • 5000 steps of Steepest descent minimization with a soft-core potential on LJ and Coulomb interaction; S18 • 5000 steps of Steepest descent minimization; • 5e5 steps (timestep of 2 fs, 1 ns) of MD and position restraint on head group (200 kJ/mol/nm 2 ) in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); • 2e5 steps (timestep of 5 fs, 1 ns) of MD and position restraint on head group (100 kJ/mol/nm 2 ) in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); • 1e5 steps (timestep of 10 fs, 1 ns) of MD and position restraint on head group (50 kJ/mol/nm 2 ) in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); • 5e4 steps (timestep of 15 fs, 750 ps) of MD and position restraint on head group (20 kJ/mol/nm 2 ) in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); • 5e4 steps (timestep of 20 fs, 1 ns) of MD and position restraint on head group (10 kJ/mol/nm 2 ) in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); • 1.5e6 steps (timestep of 20 fs, 1 ns) of MD in NVT ensemble with a langevin thermostat (reference temperature of 303 K with a coupling constant of 4 ps); All the choice regarding electrostatic long-range calculations are the same as in production runs (see table S2). At the end of this minimization protocol we simulated a 100-ns long equilibration run with the same parameters as the final production run (see table S2).

S5.6 Sirah Force Field
To prepare the initial configurations for Dry Martini force field we started from the initial POPC bilayer conformation for Charmm36 FF, coarse-graining it following the tutorial on www.sirahff.com. We solvated the system putting a 50 Å-high layer of WAT4 water model 37 on both sides of the membrane without counterions. We followed the minimization and equilibration protocol given in the Sirah FF bilayer tutorial, namely: To prepare the initial configurations for Dry Martini force field we used the web tool CHARMM-GUI, 23 putting a 50 Å-high empty layer on both sides of the membrane without counterions. We followed the minimization and equilibration protocol given by CHARMM-GUI, namely: • 100 steps of Steepest descent minimization followed by 4900 steps of conjugate gradient minimization.
• 2.5e4 steps (timestep of 20 fs, 500 ps) of MD in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 8 ps S19 and langevin thermostat (reference temperature of 303 K with a coupling constant of 5 ps); • 2.5e5 steps (timestep of 20 fs, 5 ns) of MD in NPT ensemble using a Berendsen barostat (reference pressure of 1 bar, a coupling constant of 8 ps and langevin thermostat (reference temperature of 303 K with a coupling constant of 5 ps); All the choice regarding electrostatic long-range calculations are the same as in production runs (see table S2). At the end of this minimization protocol we simulated a 100-ns long equilibration run with the same parameters as the final production run (see table S2).