Energy Landscapes Reveal Agonist Control of G Protein-Coupled Receptor Activation via Microswitches

Agonist binding to G protein-coupled receptors (GPCRs) leads to conformational changes in the transmembrane region that activate cytosolic signaling pathways. Although high-resolution structures of different receptor states are available, atomistic details of allosteric signaling across the membrane remain elusive. We calculated free energy landscapes of β2 adrenergic receptor activation using atomistic molecular dynamics simulations in an optimized string of swarms framework, which shed new light on how microswitches govern the equilibrium between conformational states. Contraction of the extracellular binding site in the presence of the agonist BI-167107 is obligatorily coupled to conformational changes in a connector motif located in the core of the transmembrane region. The connector is probabilistically coupled to the conformation of the intracellular region. An active connector promotes desolvation of a buried cavity, a twist of the conserved NPxxY motif, and an interaction between two conserved tyrosines in transmembrane helices 5 and 7 (Y–Y motif), which lead to a larger population of active-like states at the G protein binding site. This coupling is augmented by protonation of the strongly conserved Asp792.50. The agonist binding site hence communicates with the intracellular region via a cascade of locally connected microswitches. Characterization of these can be used to understand how ligands stabilize distinct receptor states and contribute to development drugs with specific signaling properties. The developed simulation protocol can likely be transferred to other class A GPCRs.


Supplementary Tables
(distance between water molecules' oxygen atom and residue's closest heavy atom were considered) Closest ion to Asp79 2.50 Distance to closest sodium from any Asp79 2.50 heavy atom Y-Y motif Tyr219 5.58 -Tyr326 7.53 Cζ distance |Asp79-Asn322| Asp79 2.50 -Asn322 7.49 closest heavy atom distance Met82 ∆RMSD Difference between the RMSD of Met82 2.53 heavy atoms to the inactive and active structures 2RH1 and 3P0G Trp286 ∆RMSD Difference between the RMSD of Trp286 6.48 heavy atoms to the inactive and active structures 2RH1 and 3P0G S3   (22 ) Active structures marked with an asterix (*) are bound to a nanobody.

System Preparation
Four systems were equilibrated and used to define the endpoints for the string simulations, see Table S1. The initial coordinates were taken from PDB structures 3P0G (23 ), 2RH1 (2 ) and 3SN6 (1 ). The intracellular binding partners, if any, were removed. Residues Asp130 3.49 and His172 4.64 were not protonated, the Asn187Glu mutation in 3P0G was reverted, Glu122 3.41 was protonated and residues His172 4.64 as well as His178 were protonated on the epsilon position. The agonist was either bound in the ligand binding site, referred to as the holo condition, or absent from the system, referred to as the apo condition. For each system, the protein was inserted in a POPC (24 ) bilayer and solvated in explicit solvent. Na + and Cl − ions were added to neutralize the systems' total charge. System preparation was performed using CHARMM-GUI (25 ).

MD simulation details
The systems were parametrized using the CHARMM36m force field (26 ) and TIP3P water model (27 ). The ligand had its amine group protonated and was parameterized with CHARMM's general force field using CHARMM-GUI's ligand modeler (28 ).
MD simulations were run with GROMACS 2016.5 (29 ) patched with plumed 2.4.1 (30 ) in an NPT ensemble at 310.15 K. The simulation systems were minimized, thermalized and equilibrated under gradually decreasing constraints, using CHARMM-GUI's default configuration. The time step considered was 2 fs and hydrogen bond lengths were constrained with the LINCS algorithm. Thermostatting was performed using Nose-Hoover thermostats with a time constant τ = 1 ps for protein, membrane and electrolyte while barostatting involved semiisotropic Parrinello-Rahman pressure coupling with a time constant τ p = 5 ps and compressibility of 4.5 10 −5 bar −1 .

CV selection
The collective variables (CVs) used to optimize the minimum free energy pathway should capture slow degrees of freedom of the system, separate distinct functional states and characterize the overall process of activation. In contrast to other enhanced sampling methods where the simulation performance is reduced by increasing the number of CVs, the string is one-dimensional and the computational complexity and expected simulation time of the swarms of trajectories method does not increase with the number of CVs. We thus designed a protocol that allows to derive a relatively low dimensional set of CVs but that does not restrict itself to an artificially low number.
The CVs were derived from an unrestrained MD trajectory of holo β 2 AR (Condition A in (32 )). The CVs were set to be certain interatomic distances using the following protocol: first, to determine the number of conformational states (clusters) present in the simulation in an unbiased way, we clustered the simulation snapshot using spectral clustering on inverse inter-residue distances and computed the SPECTRUS quality score (33 , 34 ) for a number of clusters ranging from 2 to 12 ( Fig S1). The highest quality score was obtained for 3 clusters (Fig S1a). By projecting the cluster onto characteristic variables (1 , 23 ) (TM6-TM3 displacement and the NPxxY twist), the clusters were shown to correspond to the active, the inactive as well as the intermediate state ( Fig. S1b) previously identified by Dror et al (32 ). Next, we trained a multilayer perceptron classifier (35 ) using as input all the interresidue distances and as output the cluster ID. The neural network used ReLu activation and was trained using a lbfgs solver 1 . The classifier performed well with less than 1% false classifications during 2-fold cross validation. The classifier was then trained with all available data and finally deep Taylor decomposition (36 ) was used to identify interresidue distances of relevance for the classification. The top five interatomic distances were used as CVs and can be seen in Fig. 2.
The CVs were scaled to generate unitless values ranging from 0 to 1. This is important 1 Unless otherwise specified, the default settings of scikit-learn were used.

S7
when we compute the drift and reparametrize the string, otherwise CVs corresponding to small-scale movements will be suppressed during reparametrization.
Although it is easy to come up with variations to the scheme presented here for CV discovery, the basic idea is to find human-interpretable CVs which well parametrize the transition between states along the activation path.
There is no requirement that the MD trajectories have to come from unrestrained longtimescale simulations, even though that was used in this paper. Data to use for clustering and CV set parametrization may be obtained by running MD simulations by, for example, removing the ligand to reduce the timescales of deactivation from the active states, mutating a residue, adding a biased force to the system, launching simulations from different crystal structures, increasing the temperature to overcome ergodicic barriers or any other methods which approximately samples states along the activation path.

Swarms of trajectories simulation setup
In the string with swarms of trajectories method, the most probable transition path is estimated iteratively from the drift along the local gradient of the free energy landscape.
From each point on the string, a number of short trajectories are launched (the swarm), which enables to calculate the drift. The string is then updated considering this drift and reparametrized to ensure full sampling of the configurational space (Fig. S4). Convergence is reached when the string diffuses around an equilibrium position.
All swarms of trajectories simulations were instantiated with the coordinates from the 3P0G structure (the first two simulation systems in Table S1). The endpoints of the main strings describing the transition between inactive and active states (subscript t ) were fixed to the output coordinates of the equilibrated structures of 2RH1 and 3P0G. Two additional short strings were setup to increase sampling in the active (subscript a ) and inactive (subscript i ) regions. The endpoints of the active substrings were set to the output coordinates of the equilibrated structures of 3P0G and 3SN6, while the endpoints of the inactive strings were S8 fixed to the average CV coordinates of the second half of two 20 ns unrestrained trajectories which were launched from the output coordinates of the Holo t and Apo t MD simulations steered towards 2RH1.
The path for simulations Holo t was inferred using the same data as was used for clustering. A rough estimate of the free energy landscape was calculated from the probability density landscape estimated using the Scikit-learn (35 ) kernel density estimator (KDE) with automatic bandwidth detection. The initial pathway was then estimated by letting an initial string iteratively follow the gradient of the free energy landscape estimate until convergence to a pathway where the gradient is approximately zero in any direction perpendicular to the string (Fig. S2). The average, partially converged path between iterations 20-30 from Holo t was used as input path for simulations Apo t and Holo Ash79,t . All active and inactive substrings for Apo and Holo were instantiated as straight paths between the endpoints. Finally

Algorithmic improvements to the swarms of trajectories method Adaptive number of swarm trajectories
The most probable transition path is estimated from the drift along the local gradient of the free energy landscape. Increasing the number of trajectories will reduce the standard error in the estimate but also incur an additional computational cost. Thus we suggest an adaptive approach for choosing the number of swarms (Fig. S5b).
Assuming that a single swarm trajectory drifts in CV space with a vector, s, from its starting point on the string, the average drift vector, S, for n swarm trajectories is: We suggest a convergence metric, ∆S, to estimate how well S has converged: where b ≥ 1 is the batch size, equal to the number of simulations running in parallel on a single compute node. As long as ∆S(n) is above a certain threshold, batches of additional b S10 short trajectories are launched from that point (Fig. S4). The iteration stops when ∆S(n) reaches the chosen threshold.

Optimized string reparametrization between iterations
The points along the string do not necessarily need to lie equidistant in CV space; what is important is to prevent all the points from drifting to the nearest free energy minima and instead force some points to lie in transition regions of high free energy. The re-positioning of the points along the string is broadly called reparametrization (Fig. S4). In order to make 3. Compute the transition weights, w i , one per arc on the string: is a small number to avoid zero weights, which would correspond to two points having the exact same coordinates. 4. Normalize the elements of the weight vector w so that i w i = 1. S11 5. Finally, reparametrize the string so that the distance of the arc between point i and point i + 1 is w i L, where L is the total length of the string. We see that setting all weights to a constant value corresponds to keeping all arcs equidistant.
Another modification to the algorithm is to gradually increase the string resolution along the minimum path optimization. This will increase the sampling intensity in the vicinity of the string around the most probable transition path, eventually yielding a higher resolution of the free energy landscape. The number of points on the string is changed at defined iterations by inserting an extra point at the start of the string and reparametrizing it again.

Configuration exchange between string points enables smoothing orthogonal CVs
The most straightforward way to set up a new iteration of the string simulation is to copy the output of the restrained simulation for a certain point and use as initial coordinates for the next iteration's restrained simulation for the same point. However, this may result in unrestrained orthogonal degrees of freedom getting trapped in local free energy minimas. In theory, two adjacent points may end up structurally quite different in all unbiased dimensions.
We therefore use the closest output coordinates in CV space of any trajectory from the previous iteration as input for the restrained simulation (Fig. S5c). This increases sampling of the orthogonal degrees of freedom and also allows us to run the restrained simulations for a shorter period of time, since the input coordinates are already optimally close to the center of the harmonic potential. The optimized string-of swarms simulation protocol was implemented in python. The MDTraj python library (37 ) was used to load and analyze trajectories.

Free Energy Computation
The string does not converge to a unique configuration; it diffuses between an equilibrium of minimum free energy pathways and the swarms can be considered to come from a stochastic diffusion process at equilibrium. By discretizing the system into microstates, or bins, it is S12 possible to use the short trajectories from the swarms to create a transition matrix and derive the free energy distribution of the system (38 ) along some variable (Fig. S9).
The transition probabilities T ij of the transition matrix T can be estimated from the normalized number of transitions, N ij , from bin i to bin j: The transition matrix of a physical system at equilibrium is constrained by detailed balance, such that for the stationary probability distribution, ρ: However, due to the limited amount of sampling from the swarms, the resulting transition matrix does not necessarily fulfill this condition. Instead, we have to regularize the transition matrix to obey detailed balance in the same way as is typically done to construct a reversible Markov State Model (39 ). The eigenvector with unit eigenvalue of such a matrix is the stationary distribution, ρ.
The free energy, ∆G of the system is simply computed with the Boltzmann factor: In this work, the swarm trajectories from the last 200 iterations were used to construct the transition matrix and to compute the free energy along the variables described in Table   S2.
Even though Eq. 4 is valid for any variable and timescale, the dynamics of the system can be affected by projection onto a lower-dimensional space and may be sensitive to short lagtimes (i.e. the length of the swarm trajectories in this context) when constructing the transition matrix. A common test of convergence is to compute the timescales for a sequence of lagtimes and select the shortest lagtime for which the kinetics do not change (39 ). Such S13 test is not possible to perform with the current implementation due to the use of short One should bear in mind that the resulting free energy landscape can be slightly affected by a the choice of discretization technique. In this paper figures are generated with bins of fixed width. If the bins are too large it will lead to free energy barriers of reduced height. The reason for this is that the barrier may not be resolved properly and mixed with neighbouring regions of lower free energy. On the other hand, a bin width too small will lead to unphysical oscillations in the free energy profile due to insufficient sampling.

Error estimation from a posterior distribution of transition matrices
The number of transitions is finite and varies with the size of the system. Therefore there is an uncertainty associated with each transition probability T ij . Since many elements go into constructing T , the impact of all the statistical uncertainties on the final free energy landscape cannot be directly derived. Instead, we apply a Bayesian approach and use Metropolis Markov chain Monte Carlo (MCMC) to sample over the posterior distribution of transition matrices, given the elements of T ij (39 ). As a result we obtain a number of transition matrices and can directly compute the mean and variance of the stationary probability distribution, and therefore also of the free energy landscape. Thus, this gives us another way to assert convergence since trajectories drawn from an equilibrium ensemble will generate a more narrow distribution of free energy profiles than trajectories drawn from different nonequilibrium processes. We estimated the errors on the free energies with a MCMC method using the implementation of a regularized transition matrix in MSMBuilder (39 ). In all error estimations, we sampled 1000 transition matrices.
Outliers in the distribution of sampled free energy profiles (Fig. S8, S10 and S11) may in a few cases reverse the relative stability of states. In general, however, the majority of  Figure S1: Clustering of a long MD simulation trajectory of β 2 AR deactivation into states (32 ). (a) The SPECTRUS quality score for different number of clusters, normalized such that the highest value is 1. The peak occurs at 3 clusters, indicating the data is optimally partitioned into three clusters. (b) Three states found via spectral clustering on the Cα inter-residue inverse distance map. Each configuration is colored according to its cluster assignment and projected in the TM6 displacement and NPxxY RMSD to inactive space. The cluster centers (configurations with minimum total RMSD to all other configuration in that cluster) are marked as empty circles. Figure S2: Initial path guess, via rough estimation of the free energy landscape corresponding to configurations from a long time scale trajectory (32 ). The density map that serves as a basis to compute the free energy landscape is computed by a kernel density estimator (KDE) and projected onto the top two original CVs. The fixed starting and end points, marked with a star and a triangle respectively, are taken to be the equilibrated crystal structures 3P0G and 2RH1.   Figure S4: An optimized iteration along the string-of-swarms algorithm. The input frames from the previous iteration are equilibrated under CV-based restraints (box 1). The swarm trajectories are then launched in parallel batches from the points (box 2). As the short trajectories finish, the drift vector is computed and updated. If the vector changes above a threshold (diamond 3), new batches will be launched (Fig. S5b). Once the drift vector has converged, the final drift as well as the drift weights, a measure of how much the swarm trajectories tend to move between two neighboring points, are computed (box 4). The points are then redistributed (so-called reparametrized) on the string (box 5), such that the number of transitions between neighboring points is kept relatively constant. At selected iterations, new points are also added to the string (diamond 6), requiring additional reparametrization (Fig.  S5a). The configurations with coordinates closest to the new points of the reparametrized string are used as input for the next iteration (box 8) (Fig. S5c). The method allows to iteratively find the most probable transition path between two fixed endpoints, as illustrated here for a simple 2 well model. To increase the resolution of the minimum free energy path as a minimal computational cost, the number of points on the string, and from which the swarms are launched, is gradually incremented and the lengths of the arcs on the strings are set to maximize the transitions between neighbouring points. (b) The number of swarms per string point is optimized on-the-fly by designing an adaptive scheme: the swarm convergence is evaluated by computing the difference between the average drift vector before (grey) and after (green) the last batch of short trajectories (black trajectories) has finished; if the norm of the difference (red) divided by the norm of the average drift after is above a certain limit, new trajectories are instantiated. This is done until this difference reaches a value before the defined limit. (c) To promote convergence and prevent different string points from diverging from one another in the dimensions that were not considered in the CV space, exchanges between string points are promoted: to select input coordinates for the next string iteration (black circles in ii), configurations from the closest swarm trajectory (green trajectories in ii) originating from the previous string iteration (black circles in i, grey circles in ii) are selected. These can come from the same string point or from a neighboring one. S27      Figure S9: Postprocessing protocol for computing the free energy in the vicinity of the activation path. Given some variable of interest to project the free energy onto (box 1), the system is partitioned into bins along this variable (box 2). Next, we iterate through all swarm trajectories and count the number of transitions between all bins (box 3). A transition probability matrix is then constructed (box 4). We then compute the probability distribution at equilibrium (box 5), i.e. the probability for the system to be in a certain bin while fulfilling detailed balance (Eq. 4). Finally, the free energy of all bins is computed from the probability distribution.  Figure S11: Uncertainties associated with the free energy landscapes projected onto various variables of interest including the distance between Asp79 2.50 and Asn322 7.49 (a-c), the Met82 2.53 heavy atom ∆RMSD) (d-f ), the Trp286 6.48 (the so-called 'toggle-switch' residue) heavy atom ∆RMSD) (g-i) and the distance from Asp79 2.50 to the closest sodium ion (j-l) from an MCMC sampling round over 1000 transition matrices. The boxplots show the median (as a horizontal line in orange), the interquartile range (as a box), the upper and lower whiskers (as vertical lines) as well as the outliers (as circles).