Optimal Bond Constraint Topology for Molecular Dynamics Simulations of Cholesterol

We recently observed artificial temperature gradients in molecular dynamics (MD) simulations of phase-separating ternary lipid mixtures using the Martini 2 force field. We traced this artifact to insufficiently converged bond length constraints with typical time steps and default settings for the linear constraint solver (LINCS). Here, we systematically optimize the constraint scaffold of cholesterol. With massive virtual sites in an equimomental arrangement, we accelerate bond constraint convergence while preserving the original cholesterol force field and dynamics. The optimized model does not induce nonphysical temperature gradients even at relaxed LINCS settings and is at least as fast as the original model at the strict LINCS settings required for proper thermal sampling. We provide a python script to diagnose possible problems with constraint convergence for other molecules and force fields. Equimomental constraint topology optimization can also be used to boost constraint convergence in atomistic MD simulations of molecular systems.


INTRODUCTION
Molecular dynamics (MD) simulations give us a molecularly detailed view of biological processes. To reach relevant length and time scales, the number of degrees of freedom can be reduced by coarse graining (CG). As one of the most successful approaches, the Martini force field 1,2 on average maps four non-hydrogen atoms into a single bead. With the Martini force field, integration time steps of up to about Δt = 30 fs can be used, compared to the 2 fs typical in all-atom simulations. CG force fields make it possible to study phase separation in lipid bilayers 3,4 as possible models for the lipid rafts 5,6 implicated in a myriad of cellular processes. 7−9 However, we recently observed unphysical temperature gradients across the liquid-ordered (L o ) and liquid-disordered (L d ) phase boundaries in Martini 2 simulations of phaseseparating ternary lipid mixtures. 10 We traced these gradients to an insufficient convergence of the highly coupled bondlength constraints in the Martini 2 cholesterol model, 10,11 with cholesterol being a major component of phase-separating lipid membranes. Insufficient constraint convergence also affects other observables at typical time steps, such as the diffusion coefficient or the contact fraction, 12 which describes the degree of the phase separation in the system.
Slow convergence of the linear constraint solver (LINCS) 13,14 is expected for any system with strongly coupled constraints. In Martini 2, this includes sterols such as cholesterol and ergosterol. In atomistic simulations, the issue arises, e.g., for angle-constrained butane or pentane, where LINCS fails entirely 13 or cholesterol with all bonds constrained. 10 Thus, an easy-to-use tool to estimate the required LINCS settings for proper convergence and a strategy to ameliorate bond length constraint issues would be valuable.
Here, we present a general method based on rigid-body mechanics 15 and the use of virtual sites 16 to optimize the LINCS convergence behavior of highly constrained topologies. Applied to the Martini 2 cholesterol topology, we obtain a cholesterol model that is optimized in terms of LINCS convergence but fully retains the original parametrization of the molecule in terms of force field and dynamics. First, we recapitulate the fundamental problem with the joint use of coupled constraints and the LINCS algorithm. We provide a set of guidelines that can help to avoid the creation of such constraints, and we present a script that enables the detection of constraint-related issues without explicitly performing any simulation. We apply the script to assess the behavior of constrained molecules in the Martini 3 small-molecule library 17 and different atomistic cholesterol topologies. Second, we modify the cholesterol constraint topology to reduce the timestep dependence and the temperature gradients of the system. Finally, we perform Martini 2 simulations of (i) a single cholesterol molecule, (ii) a phase-separating lipid bilayer, and (iii) a G-protein-coupled receptor (GPCR) embedded in a cholesterol-containing membrane using both the original Martini 2 and our optimized cholesterol models to demonstrate the improvements introduced by our model.

LINCS Constraint Algorithm.
As discussed in detail in the original LINCS papers, 13 where r n+1 unc is the position after the unconstrained time step n + 1, M is the diagonal matrix of particle masses, B n is the gradient matrix of the constraint equations at time step n, and the vector d contains the prescribed length of the constraints. A key step in enforcing the constraints with eq 1 is the inversion of B n M −1 B n T , a square matrix of dimensions K × K with K being the number of distance constraints. To reduce the computational cost, one rewrites the inverse in the form where S is a diagonal matrix defined as the inverse square root 13 with i A and i B indexing the two sites in distance constraint i = 1, ..., K. Note that eq 20 in ref 13 actually defines the inverse of S. A n is a sparse symmetric matrix with zeros along its diagonal. The off-diagonal elements of A n are given by the cosine of the angle between the constraints multiplied by a dimensionless mass factor. In LINCS, 13 This expansion is only applicable if the largest absolute value of the eigenvalues of A n is less than one, max( ) 1 i max = | | < , and it converges poorly as the magnitude of the eigenvalue approaches one. As the authors of LINCS note, angleconstrained butane has λ max = 0.8 while angle-constrained pentane has λ max = 1.2. 13 The reason for coupled constraints being prone to fail is that increasing powers (p) of the A n matrix represent the coupling effect of constraints that are p constraints away. The largest power p in the truncated series in eq 3 corresponds to the requested LINCS order (lincs_order). In coupled triangles, the third constraint away from a given constraint is already the constraint itself. As such, for highly coupled geometries, the expansion in eq 3 usually converges slowly or not at all. Consequently, the Gromacs MD simulation engine 18 internally doubles the requested lincs_order for all constraints involved in triangular arrangements. 14 As the A n matrix can be diagonalized, one can conveniently assess the convergence of eq 3 with increasing lincs_order based on how fast λ max Here, we use this relationship as a rule-of-thumb to estimate the lincs_order required for convergence.

Rigid-Body Mechanics.
In Newton's equations of motion, the mechanics of a rigid body is completely specified by the zeroth, first, and second moments of its mass distribution, namely, the total mass (M), the center of mass (CoM) vector (R), and the inertia tensor (X). Therefore, one is free to alter the positions of a set of rigidly connected, massive, noninteracting particles as long as M, R, and X are kept constant. Sets of points that possess the same M, R, and X are called equimomental systems. 15 Naturally, to preserve the dynamics of the system, the massless interacting sites must subsequently be reconstructed around the new "scaffold" of massive noninteracting particles. This well-known concept is the basis of virtual sites in MD. 16 While finding equimomental systems is trivial for rigid linear molecules, it becomes increasingly complicated for planar and nonplanar molecules due to couplings between M, R, and X. Recently, Laus and Selig presented a general procedure to generate equimomental systems from a regular tetrahedron. 15 Here, we only present a brief outline of the theory. The central object of the formalism is the pseudo inertia tensor given by The X̃matrix can be readily constructed from outer products of the homogeneous coordinates of the particle positions r r r r ( , , , 1) where m i are the masses of the individual particles. It is clear from eq 4 that setting the center of mass of a point cloud R = 0 and orienting it such that X is diagonal produces a pseudo = that is also diagonal. This transformation can be expressed as where G is an element of the special Euclidean group SE(3) in the 4 × 4 matrix representation, and it encodes the translation and rotation of the point cloud of rigidly connected particles. Moreover, the individual terms in eq 5 are rank 1 matrices. Because four points not in a common plane already result in a Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article full-rank matrix, any general rigid body can be replaced by a rigid structure with a minimum of four points. Following ref 15 and without loss of generality, we assume in the following that we are in the principal frame of the rigid body, where by an appropriate translation and rotation the center of mass has been placed at the origin and the tensor of inertia is diagonal. The pseudo tensor of inertia is then also diagonal, where we used eq 8 and UU T = I. The 4D rotations by U thus correspond to the required equimomental transformations. 15 However, to interpret the 4D-rotated vectors s DUq with transformed masses m i ′ that together leave X̃and thus the inertia tensor, center of mass, and total mass unchanged. Importantly, the resulting masses In addition to the fact that 4D rotations allow one to create all possible equimomental systems, one has considerable freedom in fixing certain positions or masses of the final system. 15 We note that in this process one can also change the number of mass points, e.g., by reducing the number of sites to the minimum of four for a general rigid body. 15

Topology Optimization to Minimize λ max .
We developed a python script using the MDAnalysis 19 package to compute λ max from a single molecular configuration. The use of a single configuration is justified only if the constrained particles do not undergo significant fluctuations during their motion. This condition holds for the Martini 2 cholesterol model as the two coupled triangles never substantially deviate from coplanarity. On the basis of the framework for the generation of equimomental systems 15 introduced in the previous section, we minimized λ max of A n computed by the script with respect to equimomental configurations of a fixed number of rigid sites. In this way, we aimed at reducing the lincs_order required for properly constraining Martini 2 cholesterol.
We optimized the cholesterol model by minimizing the largest eigenvalue λ max of the constraint matrix A n as follows (see Figure S1 for illustration).
1. First, we decoupled the masses of the four beads involved in the two coupled constraint triangles from the interaction sites by introducing four additional noninteracting sites positioned initially at the location of the respective interacting bead. The cholesterol tail bead forming the fifth massive site was left unchanged throughout the optimization, being connected to the rest of the molecule by a flexible bond and thus not part of any constraint. The four newly introduced beads initially inherited the masses of the original beads, while the original beads became massless virtual interaction sites. As a result of the decoupling, the optimized model has 12 beads, 4 more than the original model. 2. Second, we iteratively optimized the positions and masses of the four newly introduced massive sites in an equimomental manner. We used eq 7 to find the matrix D̃and vectors q i (i = 1, ..., 4) for the four massive sites. Then, at each iteration step, we generated random fourdimensional rotations U ∈ SO(4) for small rotation angles. The application of D̃U to q i produced new homogeneous coordinate vectors r i and hence provided new positions r i ′ and masses m i ′. Following Laus and Selig, 15 this procedure guaranteed that M, R, and X remained fixed throughout the optimization process. 3. Then, we computed the A n matrix of the newly generated configuration with modified positions and masses of the four massive constrained particles and determined λ max and the estimate of the required lincs_order. Following a Monte Carlo scheme, we accepted any new configuration that lowered λ max and repeated the refinement (steps 2 and 3) until a sufficiently low value was reached. To avoid pathological structures, we specified a minimum distance of 2.7 Å between any two beads as a constraint for the optimization. 4. After the iterative equimomental optimization of the masses and positions of the four newly introduced sites (black circles in Figure 1, right), the relative positions of the seven massless virtual sites (four originally massive interaction sites plus the three original virtual sites; shown as pink and magenta circles in Figure 1, right) were reconstructed in the reference frame of the four new sites.

MD Simulations.
To compare the geometric properties of the original Martini 2 cholesterol model and our optimized model, we performed MD simulations of single isolated cholesterol molecules in the NVT ensemble using the two topologies. The simulations were 50 ns long each with time steps of 10 fs and lincs_order = 4. This combination of settings ensured sufficient convergence of the molecular properties.
We also performed MD simulations of ternary phaseseparating lipid mixtures based on the simulations of Thallmair et al. 10 Systems consisting of 1276 dipalmitoyl-phosphatidylcholine (DPPC), 912 cholesterol, and 950 dilinoleoylphosphatidyl-choline (DLiPC) molecules corresponding to a molar ratio of 0.42/0.28/0.30 were built using the program insane.py in a random distribution. 20 The bilayers were solvated in a 0.15 M NaCl solution. This resulted in an overall number of 38 917 CG water beads (10% of which were antifreeze particles) and 428 NaCl ion pairs. All simulations were performed with Gromacs 18 version 2020.1. Energy minimization of the initial structures using a steepest descent algorithm was followed by two 500 ps pre-equilibration runs with time steps of 1 and 10 fs, respectively. The equilibration was concluded by a run of 1 μs, which is long enough for phase separation to occur. During production, we simulated 750 000 000 steps corresponding to 7.5 μs with Δt = 10 fs, 15 μs with Δt = 20 fs, and 22.5 μs with Δt = 30 fs. The MD parameters and settings corresponded to the "New-RF" values. 21 During the simulations, the pressure was maintained at 1 bar using a Parrinello−Rahman barostat 22 with semi-isotropic coupling. We employed a coupling constant of τ p = 12.0 ps and a compressibility of β = 3 × 10 −4 bar −1 . The temperature was kept constant at 310 K using a velocity rescaling thermostat 23 with a coupling constant of τ T = 1.0 ps. One thermostat was used for the solvent beads (water, antifreeze, and ion beads) and a second independent thermostat for the lipids.
We assessed the two different constraint topologies of cholesterol on the membrane properties by performing runs with the original Martini 2 cholesterol model and with the optimized geometry of virtual sites. We simulated both models using lincs_order = 4, 6, and 8 and time steps of Δt = 10, 20, and 30 fs to check for possible temperature gradients and to evaluate the dependence of structural and dynamic properties on the lincs_order and the time step.
To investigate the differences in lipid−protein interactions, we performed simulations of a β 2 -adrenergic receptor (β 2 AR, PDB ID: 2RH1) embedded in an asymmetric lipid bilayer consisting of 65% POPC and 35% cholesterol in the upper leaflet and 55% POPC, 35% cholesterol, and 10% PIP 2 in the lower leaflet. The initial configuration was obtained from Song et al. 24 (personal communication) and simulated using lincs_order = 4, 6, and 8 and time steps of Δt = 10, 20, and 30 fs to assess the impact of improper constraining on the interactions of β 2 AR and cholesterol. Three replicas of 15 μs length were simulated with each combination of lincs_order and time step, and the first 5 μs of every trajectory were discarded from analysis as equilibration. All reported quantities were averaged over the three replicas.

Analysis.
To assess the achieved improvements in the optimized cholesterol model, we computed a range of observables for the original and optimized cholesterol models. To show that our approach does not alter the original model in any detectable fashion, we extensively compared the solventaccessible surface area (SASA) and the equilibrium and RMSD values of pairwise bead distances of the single isolated cholesterols. The SASA was computed using gmx sasa and a probe sphere of radius 1.85 Å, while the bead distances were computed using a PLUMED 2.7 25 script.
In the systems containing lipid bilayers, we computed the temperature of the different lipids using the Gromacs 18 tool gmx traj. For the different cholesterol models, the temperatures calculated from the kinetic energies were corrected for the respective numbers of free and constrained degrees of freedom. The standard Martini 2 cholesterol consisting of eight beads has 3N = 24 degrees of freedom. However, the three massless virtual sites do not contribute to the kinetic energy. Together with the 5 constraints imposed on the structure, they leave only 10 degrees of freedom. Hence, a correction of 24/10 was applied to the temperature values. The same reasoning results in a correction factor of 36/10 for our optimized model.
To gain deeper insights into the temperature gradients of the phase-separating systems, we computed the lateral distribution of temperature in the membrane and its difference between DLiPC and DPPC lipids (ΔT) with an in-house Python script using the MDAnalysis library. 19 The kinetic energy of the lipids was calculated through E k = mv 2 /2, where v is the velocity and m is the mass of the particle. According to the equipartition theorem, the temperature was then obtained as , where k B is Boltzmann's constant and N DoF is the number of actual degrees of freedom (corrected for the constraints). Finally, the temperature was binned into 2D histograms. For ease of representation, we averaged the 2D temperature maps along the axis parallel to the boundaries between the L o and L d regions, that is, between the low-and high-temperature domains.
The gmx mindist tool was used to compute the contact fraction between DPPC and DLiPC lipids defined as 12 where c A−B is the number of contacts between species A and B within a cutoff radius of 0.7 nm. The contacts were evaluated based on the PO4 bead of the lipids. The lateral diffusion coefficients of the phospholipid species were calculated from the mean square displacement of their CoM. The trajectories of individual lipids were unwrapped using the NPT-corrected scheme of von Bulow et al. 26 and analyzed with a Generalized Least Squares estimator 27 to obtain the diffusion coefficients. The lateral motion of the lipids was measured with respect to the CoM of the entire bilayer, thereby eliminating the drift of the overall CoM.
We analyzed the protein−lipid interaction between the β 2adrenergic receptor and cholesterol using the PyLipID package. 24 Similar to the original paper, we used 0.475 and 0.80 nm for the dual cutoffs that deal with the "rattling effect" in lipid binding. 24 The binding sites were required to consist of at least four residues. We computed the per-residue and perbinding-site lipid count, occupancy, duration, residence time, and unbinding rate k off for each combination of lincs_order and time step, as in simulations of membranes without proteins. Additionally, we calculated the SASA and binding pose RMSD for the binding sites. The per-residue and perbinding-site observables computed by PyLipID were averaged over the 50 highest scoring residues and the top 3 scoring binding sites, respectively. See the original PyLipID paper 24 for more details about the quantities.

Cholesterol Optimization Decreases λ max .
In addition to three virtual sites, the Martini 2 model of cholesterol contains five massive sites, four of which form two coupled triangles. The original and optimized cholesterol models are illustrated in Figure 1.
Due to the presence of the coupled triangles, λ max ≈ 0.95 and the estimated lincs_order is 72. Here and in the following, the internal doubling by Gromacs is not taken into account. 14 The reason behind the high eigenvalues is closely related to the unequal masses and coupled, far-from-equilateral triangles involved in the constraints. While the use of equal masses and equilateral triangles would completely distort the topology of cholesterol, it would result in λ max ≈ 0.50 and lincs_order = 5. The excessively large lincs_order of the original Martini 2 cholesterol is not only infeasible in simulations but would also be applied to all constraints in the system, not just the coupled ones with convergence issues.
Using the procedure outlined in the Methods section, we reduced λ max from 0.95 to 0.80, corresponding to a decrease in the required lincs_order from 72 to 16. Even though the topology optimization described above efficiently reduced the required lincs_order, large distortions of the geometry of the original model can cause other instabilities. In the case of cholesterol, a further reduction in lincs_order was not possible, because the resulting topologies had massive beads that were too close. Due to the proximity of these massive beads, the integration of the equations of motion produced overly large deviations from the prescribed values, leading to a different kind of LINCS instability.

Optimized Model Leaves Cholesterol Geometry Intact.
We verified that the optimization procedure does not alter the configurations of the interacting beads by running simulations of a single cholesterol molecule in vacuum with both models. Our optimized model excellently reproduces the mean SASA value and its distribution (see Figure S2). To further prove the correctness of the optimized model, we computed all pairwise particle distances (Tables S1 and S2). The optimized model reproduces the mean values of all distances of the original model up to a precision of 0.02 Å as well as their standard deviation (compare Tables S3 and S4). Furthermore, we computed the probability density functions of the single bond, angle, and dihedral angle of the cholesterol models that do not rely on constraints or virtual sites. The distributions are virtually indistinguishable (Figure 2).

Optimized Model Eliminates Artificial Temperature Gradients in Phase-Separating Systems.
We extensively compared the properties of lipid bilayers containing cholesterol described with the original and optimized model using a range of Δt and lincs_order values. As a first test, we evaluated the average temperature difference between DLiPC and DPPC lipids, ΔT = T DLiPC − T DPPC . For the original cholesterol model, the insufficient convergence of the LINCS algorithm led to the development of significant temperature differences, as shown in Figure 3. With the least strict LINCS settings (lincs_order = 4) and largest time step (30 fs), the differences can reach ΔT = 56 K between the two lipid types (Figure 3a). Decreasing the time step to 20 fs is not adequate even when lincs_order = 8 is used. To recover a temperature difference below 2 K with the original model, one has to use a 10 fs time step, incurring a significant penalty in the simulation performance.
The magnitude of the temperature gradient is even more striking when one examines the different membrane domains as a function of position (see Figure 3b and 3c). The reasons for the even larger temperature differences are that the individual phases are not composed uniquely of single lipid types and that the L o phase contains the majority of cholesterol along with DPPC. We found a temperature difference between the two halves of the simulation box as high ΔT ≈ 80 K.
In contrast to the original cholesterol model, we observed only small temperature differences between the two phospholipid types with the optimized model. Even for a low lincs_order = 4 and long time step of 30 fs, the temperature difference is only 3.9 K (Figure 3a). For lincs_order = 6 and a 30 fs time step, ΔT drops to 0.2 K. Moreover, the temperature difference between the two membrane domains was in all cases under 8 K and became negligible for time steps below 30 fs or lincs_order = 8 for a 30 fs time step.

Properties of Phase-Separating System Converge When Temperature Gradients Are Eliminated.
As a test of the dynamic properties, we computed the ratio of diffusion coefficients of DPPC and DLiPC lipids. As a probe of the local structure, we calculated the lipid−lipid contact fraction defined in eq 10. Results are shown as a function of the observed artificial temperature difference ΔT between DLiPC and DPPC lipids (Figure 4). Both quantities are greatly affected by ΔT and can only be considered converged within the sampling uncertainty when ΔT < 2 K (vertical dashed line). For the original cholesterol model, this requires a short time step of 10 fs; by contrast, for the optimized cholesterol model we have converged results in all cases except for the lowest lincs_order = 4 combined with the longest time step of 30 fs. Both models converge to the same values at small ΔT, further supporting the consistency of our optimization procedure.
As another property impacted by temperature gradients, we computed the distribution of cholesterol along the membrane normal. Figure 5 indicates that while the cholesterol distribution is quite sensitive to the combination of time step and LINCS settings in the original model, all curves are on top of each other in the optimized model. Moreover, the original model converges to our optimized model in the limit of ΔT ≤ 2 K, that is, at small time steps and high lincs_order settings. The slight asymmetry in the width and height of the cholesterol populations of the two leaflets is due to the position restraints along the z axis of the DLiPC lipids in one of the two leaflets, which was applied to suppress membrane undulations. 29

Cholesterol−β 2 AR Interactions Are Only Weakly
Affected by Insufficient Constraining. We investigated the interactions between cholesterol and a β 2 AR in a mixed-lipid, asymmetric bilayer, as described in the Methods section. Here, we analyzed the difference between the temperature of cholesterol and the reference temperature of the thermostat ΔT chol = T ref − T chol . The observed ΔT chol values as a function of time step and lincs_order are shown in Figure 6 for the original (top) and optimized models (bottom). The large (>2 K) values of ΔT chol indicate that the protein−lipid systems also suffer from the LINCS convergence issues observed in the phase-separating bilayers, albeit to a much lesser extent. This is due to a lower local cholesterol concentration compared to the L o phase of the ternary system. By contrast, the systems simulated using the optimized model show virtually no temperature differences (see Figure 6 bottom) irrespective of the chosen parameters.
Despite the significant values of ΔT chol , the observables computed with PyLipID 24 for the system containing β 2 AR do not exhibit systematic changes as a function of ΔT chol in the case of the original model or significant differences between the two models (see Figures 7, S3, and S4). However, the system with the least strict settings (and largest ΔT chol ) tends to be an outlier. Therefore, we discourage the use of lincs_order = 4 in combination with a long time step of Δt = 30 fs in the investigated system. We also note that while there are no systematic differences between the two cholesterol topologies, the quantities determined using the optimized model have smaller variances (Figures S3 and S4). Moreover, if the lipid bilayers exhibit phase separation and thus have larger local cholesterol concentrations, we expect an increased impact of the nonconverged constraints on the protein−cholesterol interactions. 4.6. Optimal Cholesterol Model Improves Computational Performance. Results reporting on the computational efficiency of the original and the optimized cholesterol models in the phase-separating lipid bilayer are listed in Table 1. For details about the overall hardware configuration and the efficiency of the β 2 AR-containing simulations, we refer to section 6 of the Supporting Information and Table S5, respectively. For the given settings, MD simulations with the optimized cholesterol model incurred a performance penalty of ∼10−15% in all cases. Because this comparison does not take into account whether the physics of the system is correct or not, we also compared the performance of the previously recommended parameters for proper LINCS convergence (lincs_iter = 2, lincs_order = 12, Δt = 20 fs) and (lincs_iter = 3, lincs_order = 12, Δt = 30 fs). 10 Requiring the temperature gradient to be negligible (|ΔT| < 2 K, based on the convergence of properties in Figure 4), one   Crucially, using our optimized model, other constrained molecules present in the simulation box are not subjected to the overly high LINCS requirements of the original cholesterol model. We evaluated the computational cost of unnecessarily constraining molecules in the simulations of the membraneembedded β 2 AR using the same LINCS settings as for the protein-less phase-separating lipid bilayer. The protein model contains 462 constraints that do not require strict LINCS settings. The use of the optimized model results in an ∼30% performance increase compared to stricter LINCS settings 10 (see Table S5).

Perspective on Other Potentially Affected Molecular Topologies. 4.7.1. Martini 3 Small-Molecule Library Does Not Suffer from LINCS Convergence Issues.
The representation of a rigid topology by two connected triangular constraints, the so-called hinge model, as it is used in cholesterol, served as a blueprint for the Martini 3 topologies of a number of small molecules. Therefore, we assessed the quality of the constraint topology in terms of λ max for 77 constrained Martini 3 small molecules 17 (available at https:// github.com/ricalessandri/Martini3-small-molecules) by estimating the required LINCS order for convergence and by performing explicit simulations (see Tables S6−S8). Our python script identified the molecules BZTA (benzothiazole), BZTH (benzothiophene), and MINDA (1-methylindazole) as having the largest eigenvalues of λ max = 0.76. While these λ max values are not excessively large, the reason behind them is the same as that for cholesterol: uneven masses and slightly distorted triangles. All three molecules were of planar, trapezoidal geometry with constraints applied to the four sides and the longer diagonal. As a test, we "flipped" the constraint along the diagonal of the molecules to constrain the other, shorter diagonal, which resulted in a decrease of λ max in all three cases (BZTA and BZTH, 0.76−0.71; MINDA, 0.76− 0.65).
In MD simulations using the new-rf input parameters 21 and various combinations of lincs_order and time step, we found that all differences between the solute temperature and the thermostat reference temperature were less than 1.5 K and that the temperature difference between the solute and the solvent never exceeded 2 K (Tables S6−S8). Interestingly, "flipping" the diagonal constraint in the molecular topologies did not produce a clear improvement (Tables S6−S8), most likely due to the only moderately large λ max values.
The explicit simulations fully support our conclusions drawn based on λ max . Remarkably, the eigenvalue analysis of all 77 molecules took less than 3 min on a standard laptop, while the explicit simulations require tests using various lincs_order and time step values and take a few hours per system using high-performance computers (running on a single node, 12 000 particles, and 15 million integration steps).
Finally, the topologies involve 2-to-1 mappings of nonhydrogen atoms to CG beads and contain "tiny" beads. The standard Martini 3 parameters of the "tiny" beads restrict the time step Δt to well below 30 fs. 30 While we did not encounter any crashes during the explicit simulations of the above systems, caution must be taken when other molecules with "tiny" beads are present.

Atomistic Topologies with All Bonds Constrained Suffer from Poor Constraint Convergence in LINCS.
In our previous study, we showed that atomistic systems containing cholesterol also can suffer from temperature gradients due to nonconverged constraints. 10 Two examples are the CHARMM36 force field with hydrogen mass repartitioning (HMR) 31 and the CHARMM36 model with hydrogens modeled as virtual sites (VIS). 32,33 To allow time steps of up to Δt = 5 fs, both models constrain all bonds. Although these cholesterol models do not contain any coupled triangles, larger rings of five or six atoms are present (Figure 1, left). Similar to three-membered rings, the resulting coupled constraints affect the convergence of the LINCS algorithm.
We analyzed the largest eigenvalues λ max of the A n matrix for the standard CHARMM36 cholesterol model as well as the HMR and VIS ones using our script. While the standard CHARMM36 cholesterol model is typically run by constraining solely bonds involving hydrogen atoms, HMR and VIS constrain all bonds to enable larger time steps. As expected, the standard CHARMM36 model 34 has a low λ max value of 0.06 because no coupled constraints are present. The other two models, however, exhibit considerably higher λ max values of 0.73 (HMR) and 0.71 (VIS). For proper convergence, they would require lincs_order = 11 and 10, respectively. Note that the internal doubling of lincs_order is not initiated by Gromacs for these topologies because no constrained triangles are present.
This shows that also for atomistic systems in which all bonds are constrained, the analysis of the eigenvalues of the A n matrix is a valuable diagnosis tool. Our script can be used to detect potential convergence issues of the LINCS algorithm and estimate the required LINCS settings. 18

CONCLUSIONS
For phase-separating lipid bilayers, Martini 2 simulations with typical parameter settings have recently been found to suffer from substantial artificial temperature gradients across the phase boundaries. The locally different temperatures impacted other physical properties of the system such as the ratio of diffusion coefficients between the saturated and the unsatu- rated lipids, the degree of phase separation, 10,11 and the distribution of cholesterol along the membrane normal direction (Figures 4 and 5). The origin of the artifact was traced back to insufficient convergence of the highly coupled bond constraints in cholesterol, one of the major components in such bilayers.
Here, we used the mechanics of rigid bodies 15 to develop an optimization strategy for constraint molecular topologies to achieve quicker constraint convergence with the LINCS algorithm. We did not consider alternative ways of solving the constraint equations or other numerical methods to invert the matrix I − A n in LINCS. In the optimization of the constraint topology, we minimized the largest absolute value of the eigenvalues of the A n matrix, λ max . We also provide a python script to rapidly evaluate the quality of the constraint topology in terms of λ max (available at https://github.com/biophys/constraint-coupling-analysis). We demonstrate the optimization strategy for the Martini 2 cholesterol model. By fully preserving the force field and the dynamics in the limit of infinitesimal time steps and perfect accounting for the constraints, the optimized model reproduces the singlemolecule properties of the original model such as the solvent-accessible surface area or the bond/angle/dihedral distributions. With the exception of the largest time step and lowest lincs_order considered, the new model did not develop artificial temperature gradients in the phase-separating bilayer. The optimized model is publicly available at the Martini Web site (http://cgmartini.nl/images/parameters/ ITP/martini_v2.0_CHOL_02-optLINCS.itp).
We further investigated the magnitude of the artifacts and their impact on cholesterol−protein interactions using a membrane-embedded β 2 -adrenergic receptor. Whereas the temperature of the original cholesterol model deviated significantly from the target temperature of the thermostat at larger time steps, there were no significant differences observed in lipid organization and dynamics around the protein between simulations with the original and the optimized cholesterol model. For MD simulations of membrane proteins with the optimized cholesterol model, we recommend the combined use of at least lincs_order = 6 with at most a 30 fs time step.
In the optimization of the cholesterol model, we ensure that the energetic and dynamic properties of the original model are fully maintained. The four additional beads in the constraint topology incur a computational cost. On the other hand, the stricter LINCS settings required for the original Martini 2 cholesterol model also impact the computational cost. In MD simulations of phase-separating ternary lipid mixtures with LINCS settings chosen to ensure similarly small temperature gradients, we achieved comparable performance with the original and optimized cholesterol model in terms of simulated time per wall-clock time (ns/day). The performance advantage of the optimized model increased to ∼30% in the presence of the membrane protein β 2 AR because the increase in the lincs_order required for the original cholesterol model applies also to constraints in the membrane protein.
We also analyzed the constraint topologies of the Martini 3 small-molecule library 17 for the highest λ max , and we performed explicit simulations for the three molecules with the largest eigenvalues λ max that corroborated the eigenvalue analysis. Overall, even for the largest λ max = 0.76 we did not observe appreciable temperature gradients. The analysis and optimization method presented here can be readily incorporated into automatic topology builders and is potentially useful for other constrained molecules as well as rigid-body simulations.
We conclude by emphasizing the generality of the procedure described here to optimize the molecular constraint scaffold for rapid constraint convergence with LINCS. Possible applications include automated topology building of molecules, 35,36 e.g., at the Martini 3 level of coarse graining. 35,36 ■ ASSOCIATED CONTENT

Data Availability Statement
Simulation input files and analysis scripts of this study are openly available on Zenodo at 10.5281/zenodo.7199702. The optimized topology is available at http://cgmartini.nl, while the python analysis script for the eigenvalues is available at https://github.com/bio-phys/constraint-coupling-analysis.
Solvent-accessible surface area of the original and optimized models, mean values and standard deviations of pairwise intramolecular distances of the original and optimized models, cholesterol−β2AR interactions computed by PyLipID, hardware configuration for the comparison of model performance, performance comparison of membrane-embedded β2AR simulations, and temperature differences in the Martini 3 small-molecule library (PDF)