Conformational Changes in the Epidermal Growth Factor Receptor: Role of the Transmembrane Domain Investigated by Coarse-Grained MetaDynamics Free Energy Calculations

The epidermal growth factor receptor (EGFR) is a dimeric membrane protein that regulates key aspects of cellular function. Activation of the EGFR is linked to changes in the conformation of the transmembrane (TM) domain, brought about by changes in interactions of the TM helices of the membrane lipid bilayer. Using an advanced computational approach that combines Coarse-Grained molecular dynamics and well-tempered MetaDynamics (CG-MetaD), we characterize the large-scale motions of the TM helices, simulating multiple association and dissociation events between the helices in membrane, thus leading to a free energy landscape of the dimerization process. The lowest energy state of the TM domain is a right-handed dimer structure in which the TM helices interact through the N-terminal small-X3-small sequence motif. In addition to this state, which is thought to correspond to the active form of the receptor, we have identified further low-energy states that allow us to integrate with a high level of detail a range of previous experimental observations. These conformations may lead to the active state via two possible activation pathways, which involve pivoting and rotational motions of the helices, respectively. Molecular dynamics also reveals correlation between the conformational changes of the TM domains and of the intracellular juxtamembrane domains, paving the way for a comprehensive understanding of EGFR signaling at the cell membrane.


■ INTRODUCTION
Membrane proteins play a key role in the structure and function of cells. Most membrane proteins have transmembrane (TM) domains formed by bundles of α-helices that span the lipid bilayer. The dynamics of TM domains are responsible for conformational changes that underlie, e.g., receptor activation and solute transport by membrane proteins. Thus, understanding the underlying free energy landscapes of TM helix interactions provides the key to mechanisms of membrane protein function. A relatively simple and well characterized example of such interactions is provided by TM helix dimers such as those found in receptor tyrosine kinases (RTKs). 1−3 The receptor tyrosine kinases are a major family of membrane receptors controlling many cellular activities. 4 Among the RTKs, epidermal growth factor receptors (EGFR/ErbB1/Her1) regulate processes including cell proliferation, migration and differentiation. Aberrant ErbB activity is associated with a number of illnesses, including cancers and neurodegenerative diseases, 5,6 and therefore ErbB receptors are important drug targets. 7,8 Understanding the molecular mechanisms of signal transduction by the EGFR is of fundamental importance in providing a paradigm for other RTKs 9−11 and will facilitate development of a range of therapeutic interventions. 12 −14 There has been extensive structural characterization of the extracellular ectodomain and the intracellular kinase domain of the EGFR. 15 These studies indicate that the EGFR functions as a dimer. 16 Binding of the ligand (EGF) to the ectodomain of the receptor induces a conformational change that is transmitted through an allosteric mechanism to the kinase domain. 17 Formation of an asymmetric dimer of the kinase domains enables tyrosine phosphorylation, which in turn activates the downstream signal transduction cascade in the cytoplasm. 18 Although cryo-electron microscopy has yielded images of the intact EGFR, 19 in the absence of a high resolution structure of the complete receptor, including the transmembrane (TM) domain, our understanding of its activation mechanism remains incomplete ( Figure 1). 1 The TM helix dimer is believed to play a key role in the activation mechanism. 20 Mutagenesis experiments have also highlighted the role of the juxtamembrane domain (JM), immediately C-terminal to the TM domain, in regulating the allosteric transition between inactive and active states of EGF-bound receptor dimers 11 and in their interactions with lipids. 21 Truncation of the ectodomain activates intracellular phosphorylation, 22 demonstrating that the ectodomain has an autoinhibitory function, and that the TM domain can independently activate the receptor. 20 It is suggested that the active state is characterized by dimerization contacts in the N-terminal region of the TM helices ( Figure  1C, motif highlighted in red) along with an antiparallel interaction between JM segments. 23 The nature of the inactive state(s) remains somewhat less clear but is thought to involve a more C-terminal interaction of the TM helices (as recently discussed in the literature 24 ).
In this context it is important to elucidate modes of interaction of the helices within the dimeric TM domain. 25−27 The conformational dynamics of these interactions may be explored through computer simulations, which have been used to provide detailed descriptions of different ensembles of conformations visited by TM helix dimer systems. 28−30 However, due to the limiting time scale of atomistic molecular dynamics (MD) simulations only partial description of EGFR dynamics have so far been achieved. 31,32 Consequently, models which employ a simplified (e.g., coarse-grained: CG) molecular representation are valuable, 33 as they may be used to investigate TM helix dimerization events beyond the accessible time scale of atomistic simulations. 34−36 CG simulations have been used to obtain one-dimensional projections of the free energy landscape of TM helix dimerization for glycophorin A, 37,38 for ErbB, 39 and for EphA1 40 receptors. However, in order to link the conformational motions of TM helices to possible activation mechanisms, an accurate and exhaustive (i.e., multidimensional) description of the underlying free energy landscape is essential. This remains beyond the accessible time scale of simple CG simulations, even with the availability of advanced supercomputing resources. A potential solution is provided by the use of enhanced sampling techniques. 41−43 These methods accelerate the simulation of the process of interest, allowing capture of long time scale motions at affordable computational cost.
Here we present an innovative computational protocol that by combining CG simulations with well-tempered metadynamics 44,45 (CG-MetaD) greatly extends the time scales typically reached by simulations. In particular, CG models reduce the dimensionality of the system under investigation, 36 while metadynamics enhances the phase space exploration, thus allowing practical investigation of long time scale and largescale motions of biomolecules. 46−48 Pioneering applications of metadynamics to CG models have indeed shown great potential. 49,50 In the present study we employ ∼380 μs of CG-MetaD simulations to fully inspect both association and dissociation events of the EGFR TM helices. This provides an accurate and comprehensive description of the associated free energy landscape, which allows identifying the low free-energy basins that correspond to distinctive TM dimer conformations. The reversible pathways of transition between these dimeric states reveal key functional motions of the TM helices to reach the active dimer state, thus unveiling previously unknown features of the activation mechanism. This information is of paramount relevance to develop an exogenous control of the receptor activity. The conformational stability of the identified low energy dimer states has been confirmed by over 3 μs of atomistic MD simulations, thus providing structural details that explain and complement previous studies. This study demonstrates that CG-MetaD provides a powerful method to investigate large-scale motions in the TM domains of membrane proteins, thus representing an important advance in simulation of protein conformational dynamics in a membrane environment.

■ METHODS
Coarse-Grained Models. The starting structure of the TM domain dimer was a homology model based on the NMR structure of the ErbB1/ErbB2 heterodimer (pdb id: 2KS1), 51 generated using Modeler. 52 Coarse-graining used an in-house variant 53,54 of the Martini force field. 55,56 In addition to the standard bonded interactions, harmonic restraints are generally required to conserve secondary structure elements. Therefore, in the TM models, harmonic restraints were applied between backbone particles separated by three residues in the α-helical region (from Ile 619 to His 648 ) to mimic hydrogenbonds, while the intrinsic flexibility of the N-loop and C-loop regions were maintained with no additional restraints. In the TM+JM models (based on the 2M20 NMR structure), an elastic network model acts between pairs of backbone particles separated by less than 0.7 nm.
CG-MetaD Simulation. The CG-MetaD simulation was performed with the well-tempered version 45 of metadynamics, 44 implemented in Plumed 1.3. 57 The estimate F(s,t) of the free-energy surface F(s) at time t was determined as where V(s,t) is the bias potential added to the system at time t as a function of a set of selected CVs, s. ΔT is the difference between the fictitious temperatures (T + ΔT) of the selected CVs and the temperature T of the system. Therefore, ΔT influences the exploration rate of the CVs space. The bias potential V(s,t) is made up by the sum of the Gaussians deposited so far. In our case the main simulation was performed at 323 K (i.e., the temperature at which the MARTINI force field was parametrized) with a time step of 40 fs. The lateral separation (L) between the center of mass of the helices was chosen as single active collective variable. Gaussians of height 0.05 kJ/mol and width 0.05 nm were initially used and deposited every 5000 steps with a bias factor (T + ΔT)/ΔT of 10. This means that the Gaussian height was gradually decreased along the

Journal of the American Chemical Society
Article 380 μs of CG-MetaD simulation on the basis of the adaptive bias with a ΔT of 2907 K (Supporting Information Figure S1).
Additional CG-MetaD simulations were performed using different parameters: the width of the Gaussians was reduced to 0.025 nm in one simulation or alternatively a soft wall potential was used to restrain the lateral helix separation L within 4.5 nm (Supporting Information Figure S2). The results obtained from the different simulations led to a coherent description of the free energy landscape (Supporting Information Figure S3), demonstrating the robustness of the approach. In contrast we observed both tempering and convergence issues with higher deposition rates (either with higher gaussians or higher deposition frequencies). Unless stated in the text, CG-MetaD results refer to the main simulation.
CG-MD Simulations. A series of standard CG simulations were performed using Gromacs 4.5.5 58 (i.e., ∼185 μs of CG-MD simulations, see Supporting Information Table S1). A leapfrog algorithm was used to integrate Newton's equations of motion, with a time step of 40 fs. Standard CG-MD and CG-MetaD simulations were performed in NpT and NVT ensembles, respectively. The temperatures of the individual monomers, the solvent and the lipid bilayer were separately maintained at 323 K using the Bussi thermostat with relaxation times of 1 ps. 59 In standard CG-MD simulations, semiisotropic pressure coupling was used. The Berendsen barostat was employed to keep the average pressure at 1 bar, 60 with time constants of 10 ps and compressibilities of 3 × 10 −5 bar −1 . Lennard-Jones interactions were shifted to zero between 0.9 and 1.2 nm, while electrostatic interactions were shifted to zero between 0 and 1.2 nm using a relative dielectric constant of 20. Neighbor lists were updated every 10 steps for the calculation of nonbonding interactions, using a 1.3 nm cutoff.
AT-MD Simulations. The populations of lowest energy identified by CG-MetaD were converted to atomistic level using a fragmentbased approach. 61 A total of over 3 μs of standard atomistic MD simulations (AT-MD) were performed using the AMBER force field ff14SB 62 and LIPID14 63 for protein and lipids, respectively. The TIP3 water model was used. 64 All the simulations were performed with the pmemd module of the AMBER MD code. 65 Short-range van der Waals interactions were switched to zero at a cutoff distance of 1 nm. The long-range electrostatic interactions were computed by means of the particle mesh Ewald (PME) method 66 using a real-space cutoff of 1 nm and a 0.1 nm grid spacing in periodic boundary conditions. The SHAKE algorithm was applied to constraint bonds involving hydrogen atoms, and thus an integration time step of 2 fs could be used. Harmonic constraints were applied to the protein and gradually released along the equilibration process. The temperatures of the individual monomers, the solvent and the lipid bilayer were maintained at 323 K using the Langevin thermostat with collision frequency of 3 ps −1 . Production runs were performed in the NpT ensemble at 1 atm under anisotropic pressure scaling conditions using a Monte Carlo barostat and pressure relaxation time of 2 ps.
Collective Variables (CVs). Analysis of the simulations was carried out using Plumed 1.3, 67 Gromacs 4.5.5 58 and locally written scripts. The probability distribution function of various CVs was reweighted for free energy calculations. 57 The CVs used were: L, the lateral distance between the centers of mass of the TM helices; Ω, the crossing angle between the TM helices; Δd, calculated as the difference in interhelix distance at the N-termini and the C-termini; and Δ|Θ|, which describes the rotation of the helices along their long axis relative to one another. More detailed definitions are provided in the paper when these CVs are used. However, the crossing angle CV (Ω) benefits here from a more complete description. The positions of the backbone particles of the TM helical region were used to determine, by singular value decomposition (SVD), the eigenvectors h 1 and h 2 associated with the helix axes. The absolute value of crossing angle was determined using the definition |Ω| = asin(|h 1 × h 2 |/(|h 1 |·| h 2 |)). One might also consider atan2(|h 1 × h 2 |,h 1 ·h 2 ) that leads to identical |Ω| values. In order to compute the sign of Ω, h 1 was first aligned with the positive x-axis, with the same rotation applied to h 2 , which provided two new helix vectors h 1 ′ and h 2 ′. Then, the cross product h 1 ′ × h 2 ′ was aligned with the x-axis, and the same rotation applied to the two helix vectors provided h 1 ″ and h 2 ″. All rotations were done using the origin of the Cartesian coordinate system as center. The sign of Ω was determined by considering two properties: (i) the sign of the x-component of the geometric center of the cross product h 1 ″ × h 2 ″, (ii) the comparison of the x-component of the geometric centers of h 1 ″ and h 2 ″. Negative and positive signs of Ω characterized right-and left-crossings, respectively.
Minimum Free Energy Paths (MFEPs). All minimum free energy paths (MFEPs) presented in the paper were calculated in one of the two dimensions of the free energy surfaces obtained by CG-MetaD calculations. The paths were obtained by scanning of the lowest free energy points of the surface along the (CV) dimension of interest. We compared the MFEPs to free energy profiles obtained for each CV. For the determination of the free energy profiles, partition functions were calculated at each value of the CV of interest. Thus, contrary to MFEPs, all the free energy points obtained in the second dimension of the FES contributed to the free energy profiles. The comparison showed that both are really similar, although the MFEPs allowed a more accurate detection of the free energy basins of the FES. Therefore, we chose to present our results on the basis of the MFEPs. Importantly the relative free energies of the basins identified from the different MFEPs were similar for each dimer population (Supporting Information Figures S3 and S4). This guarantees both the validity of the free energy calculations and the suitability of the different CVs to describe the multidimensional conformational landscape of the EGFR TM domain (Supporting Information Table S2).
■ RESULTS AND DISCUSSION CG-MetaD Sampling. A starting structure for the simulations of the EGFR TM helix dimer was based on the NMR structure of the ErbB1/ErbB2 heterodimer (pdb id: 2KS1). This structure includes flexible N-terminal (extracellular) and C-terminal (intracellular) loops, which we will refer to as the N-loop and C-loop, respectively ( Figure 1). Therefore, in the CG simulations, harmonic restraints were applied (see Methods above) to maintain the secondary structure of the core α-helical region (from Ile 619 to His 648 ) while retaining the intrinsic flexibility of the N-loop and C-loop regions. The Nloop contains residue K618 that is thought to interact with the glycolipid GM3 in cell membranes. 69,70 The C-terminus of the TM helix plus the C-loop contains six basic residues (R645 to R653) that can interact with anionic lipids (e.g., PIP 2 ) of the intracellular membrane leaflet. 21,70,71 The TM helix dimer was embedded in a phospholipid bilayer. 34 To describe the motion of one helix with respect to the other we used the lateral separation between the centers of mass of the helices (L) as a collective variable (CV). The sampling of TM helix dimerization took a total of ∼0.4 ms of CG-MetaD simulation, during which 65 interconversion events between the dimeric and monomeric states of the TM helices were observed. Thus, the diffusive exploration of the CV space in our CG-MetaD simulation yielded a well-characterized and converged free energy surface ( Figure 2 and Supporting Information Figure  S1).
Free Energy of Dimerization. From the free energy surface (FES) one can accurately quantify free energies of interaction. 72 Thus, the free energy of TM helix dimerization was calculated from the FES as the difference between the dimeric and monomeric states, yielding an estimate of −38 ± 3 kJ/mol (error calculated as the standard deviation from the weighted average value of the TM helix dimerization free energy obtained from the last part of the simulation where the calculation is converged; see also Supporting Information Journal of the American Chemical Society Article Figure S1C). This is in reasonable agreement with an experimental estimate of the dimerization free energy of −31 kJ/mol, derived from a K d of ∼2 μM in FRET experiments on EGFR TM helices in LDAO micelles. 73 We do note that these values are larger than some other computational and experimental estimates, of −26 kJ/mol 39 and −11 kJ/mol 74 respectively. However, considering the sensitivity of the dimerization process to experimental conditions (e.g., lipid vs detergent), the approximations implicit in the CG force fields used in simulations, and especially the different construct sequences of the EGFR TM domain used in the various studies (Supporting Information Figure S5), it is perhaps unsurprising that we observe differences. Indeed, we note that a wide range of values (from −30 kJ/mol to −60 kJ/mol) has been estimated for the dimerization free energy of the canonical glycophorin A helix dimer on the basis of atomistic 75,76 and CG 37,38 simulations.
2D Free Energy Landscape. Using a reweighting protocol we computed the Boltzmann distribution of CVs different from those biased in the original calculation. 57 This allows for a more detailed inspection and characterization of the motions observed during the simulation. In this manner we reconstructed the FES as a function of both the original CV (the interhelix distance L) and of a CV corresponding to the interhelix crossing angle (Ω). The latter is a frequently used structural descriptor of the packing interactions within TM helix dimers. 34 The resulting FES ( Figure 2B) reveals two freeenergy minima. The lower minimum corresponds to righthanded (Ω = −30°) TM helix dimer conformations, while the second minimum is 9 kJ/mol higher and corresponds to a lefthanded dimer (Ω = ∼ +12°; Figures 2BC and 3A). Within the right-handed energy basin of the TM dimer, we refer to the  Ω is the crossing angle between helices; Δd CV is calculated as the difference in interhelix distance at the N-termini and the C-termini (defined as the centers of mass of TG 625 xxGA 629 and A 637 xxxG 641 motifs, respectively); Δ|θ| describes the rotation of the TM helices about their long axes relative to one another (see also Supporting Information Figure S4). Each free energy profile was calculated as the minimum free energy path (MFEP) located at low L values of the associated FES (see Methods). (D) Structure of the dimer populations identified in the basins of the free energy landscape of EGFR TM domain. Colored spheres represent residues mainly engaged in interhelical contacts, while gray tubes show helix backbones. Averages or ranges of crossing angles are provided for each dimer.

Journal of the American Chemical Society
Article most populated ensemble of conformations as Nter+ ( Figure  3D) since it shows the two helices interacting through residues of the N-terminal motif along with additional contacts at the helix core (G 625 xxGA 629 xxLL 633 xxxA 637 ). We will refer to the left-handed dimer as the Lzip population ( Figure 3D) since the h e l i c e s f o r m l e u c i n e z i p p e r -l i k e i n t e r a c t i o n s (V 635 xxL 638 xxxL 642 ). The Nter+ state is 11 ± 2 kJ/mol more stable than the Lzip state.
Although the helix crossing angle is a metric commonly used to characterize TM helix dimers, 34 it may not allow us to readily distinguish subtly different dimer conformations within the same macro-ensembles (i.e., within the right-handed and lefthanded energy basins). Therefore, we undertook a series of local analyses of all dimer conformations.
Multidimensional Free Energy Landscape. The sequence of the EGFR TM domain contains a tandem small-X 3small sequence motif toward the N-terminus (TG 625 xxGA 629 ) and a single small-X 3 -small motif (A 637 xxxG 641 ) toward the Cterminus (Figure 1). 51 Such small-X 3 -small motifs play a key role in TM helix dimerization in a number of membrane proteins. 77 Interconversion between different TM helix packing modes has been implicated in activation of the EGFR and related receptors (see the literature 24 for a recent discussion). On this basis, we defined a new CV, Δd, which corresponds to the difference in interhelix distances between these two motifs, such that a negative value of Δd indicates an N-terminal helix/ helix interaction and a positive value indicates a C-terminal interaction ( Figure 3B). The minimum free energy path (MFEP) along the Δd CV, corresponding to the lowest energy points computed in the Δd range from −1 to +1 nm, reveals the presence of four energy minima ( Figure 3B). The deepest energy minimum (at Δd = −0.1 nm) corresponds to the previously described Nter+ population. An additional low freeenergy minimum, named Cter+, shows the two helices interacting through both the C-terminal small-X 3 -small motif a . We note that the Nter+ population is slightly more stable than Cter+ (∼1.4 kJ/mol), as suggested by previous experimental studies. 78 The other two free-energy minima, Nter and Cter, are 6 to 7.5 kJ/mol higher than Nter+ and correspond to helix dimers in contact primarily through their N-terminal and C-terminal small-X 3 -small motifs, respectively. These interactions are similar to those found in the Nter+ and Cter+ conformations, with the lack of additional contacts at the core of the helices explaining the higher free energy values of these populations (Supporting Information Figure S4A−D).
From the analyses presented so far, the Nter+ dimer always represents the lowest free-energy minimum. However, the lefthanded Lzip dimer is absent from the FES computed as a function of Δd ( Figure 3B). Accordingly, we defined an additional CV, Δ|θ|, which is able to further characterize specific conformational differences among the diverse dimer structures. This CV describes the rotation of the TM helices about their long axes relative to one another. The MFEP computed as a function of Δ|θ| reveals three out of the five dimer populations previously described ( Figure 3C). In this representation the Nter dimer appears as an intermediate conformation between Lzip and Nter+.
CG-MetaD simulations have allowed us to explore the largescale motions of the TM domain, revealing the existence of five dimeric states ( Figure 3D, Supporting Information Table S2). We have used each of these dimeric state models as the starting point for extended atomistic simulations, in order to explore in more detail specific inter-residue interactions between the two helices. Thus, each dimeric model of the five lowest energy minima identified by CG-MetaD was used to initiate a 0.5 μs of unrestrained atomistic (AT) MD simulations (yielding a total simulation time of 2.5 μs). We found that the dimer conformations were stable during the AT-MD simulations, with, e.g., no significant change in helix crossing angles over the course of the 0.5 μs simulations ( Figure 4A; also see Supporting Information Figures S6). Examination of the structures at the end of the atomistic simulations confirmed that the hydrophobic residues of the TM domain form interactions with the lipid aliphatic chains within the bilayer core, while interhelix contacts are formed by the interaction motifs identified through the CG-MetaD simulation. We found that a number of H-bonding interhelix interactions involving the N-ter and C-ter loop residues also contribute to stabilize the dimeric states, in addition to electrostatic interactions between cationic residues and lipid headgroups ( Figure 4B; also see Supporting Information SI text).
Examination of the CG-MetaD derived free energy landscape reveals that in the unbound state (i.e., monomeric TM helices at L > 3 nm) the surface is not flat along the Ω dimension ( Figure 2C). The shallow free-energy dependence on the crossing angle, with favorable free energies at Ω ∼ −18°and Ω ∼ +18°, corresponds to the (monomeric) helices adopting a tilted orientation, with a lowest free-energy tilt angle of ∼20°r elative to the bilayer normal (Supporting Information Figure  S7), close to values reported for the tilt of synthetic peptides in model lipid bilayers. 79−81 Dimerization Pathways. The bias added to the system during the CG-MetaD calculation allows acceleration of sampling, and the reconstruction at the end of the simulation of the thermodynamics of the event under study (i.e., the freeenergy landscape and free-energy minima identification). The characterization of the two-dimensional FES ( Figure 2B) also therefore provides information on the possible pathways connecting the monomeric state to the different dimeric structures. We divided the FES into two regions separated by the free energy barrier observed at a crossing angle of 0°in order to calculate MFEPs for dimerization (i.e., from high to low L values). We observed similar features for the calculated pathways in the two regions ( Figure 5A). At high L values, the MFEPs linearly follow the free energy basins at Ω ∼ −18°and Ω ∼ +18°discussed above, which represents the tilted orientations of TM helices in their monomeric state. In contrast at intermediate L values, (1.5 nm < L < 3 nm) corresponding to predimeric states, the MFEPs show a nonlinear pattern with an increase of the magnitude of the crossing angle between the TM helices when they first contact each other. This is followed by a decrease in the magnitude of the crossing angle when the helices pack together more closely ( Figure 5A). Subsequently, the MFEPs follow the curvature of the free energy wells observed at low L values.
These findings prompted us to compare the dimerization pathways predicted by CG-MetaD to those observed through unbiased CG simulations. We therefore performed three standard CG-MD simulations, each of 5 μs duration starting from monomeric states (L ∼ 3 nm) with different Ω values. These simulations displayed dimerization of the TM helices with features similar to those predicted by the MFEPs. However, comparison of the CG-MD simulations showed a

Journal of the American Chemical Society
Article range of trajectories on the FES, highlighting the stochastic behavior of individual dimerization events.
Dimerization of the TM helices occurred early after ∼50 ns in the first simulation, following the MFEP calculated in the region of negative Ω values of the FES ( Figure 5B). In the second simulation a wider exploration of the FES at high L values was seen ( Figure 5C). The longer diffusion time of the helices in the monomer state resulted in a well-balanced representation of the two low energy populations, with both right-handed and left-handed monomeric states sampled. In this simulation the two TM domains first interacted with each other through the extracellular N-loops. These predimeric states can readily convert from left-to right-handed forms in agreement with the quasi-linear isoenergy lines observed along the Ω axis in this region of the FES (1.5 nm < L < 2 nm; Figure  5A). Subsequently, more stable interactions are formed via the residues at the core of the helices. In these dimeric states (L ≈ 1 nm) the system still explores both right-handed and lefthanded forms, through an equilibrium between the Lzip and the Nter populations. The final stage of the simulation shows the stabilization of the dimer in right-handed forms corresponding to the global minimum of the FES ( Figure  5C). The third simulation (Supporting Information, Figure S8) showed a mixed profile with respect to the first two simulations, namely a fast association of the helices followed by the exploration of both right-handed and left-handed forms to reach the final dimeric state. Interestingly these results suggest that the equilibrium between the Lzip and the Nter populations may be a preliminary pathway to access the most stable Nter+ state.
The agreement between CG-MetaD and standard CG-MD in finding the lowest energy pathways in the monomeric, predimeric, and dimeric states demonstrates that the metadynamics bias did not corrupt the thermodynamics of the system, and that the CVs L and Ω properly describe the motion of the helices during the dimerization/dissociation processes.  Elucidating the structural transitions that follow the initial dimerization step is fundamental to our mechanistic understanding of EGFR activation. In particular, the equilibrium dynamics of the TM domain may reveal how the helix interactions change when the active and inactive dimeric states interconvert. Thus, we evaluated through standard (i.e., unbiased) CG-MD simulations whether the four higher energy dimeric states identified in the multidimensional landscape could convert into the Nter+ conformation, which is thought to correspond to the activated state of the receptor. 23 First, we performed a standard CG-MD simulation (of 5 μs duration) starting from the Cter+ conformation, which corresponds to a possible inactive state of the receptor. 23,24,32 This explored the various low energy conformations of the right-handed (negative Ω value) region of the FES ( Figure 6A). Analysis and structural examination revealed that there is equilibrium between the Cter+ and the Nter+ populations, in agreement with the free energy calculations that indicated a low energy barrier between the two dimers (∼2 kJ/mol from Nter+ to Cter+; see Figure 3B). In particular, we found that these two populations can interconvert through a "pivot" motion whereby the contacts formed by the N-terminal small-X 3 -small motif alternate with those formed by C-terminal motif ( Figure 6B). A similar mechanism has been invoked to explain the correlation between the TM conformational transitions and the activation of the EGFR intracellular kinase domains. 23,24,32 Comparable standard CG-MD simulations starting from the (left-handed) Lzip dimer showed a transition to the more stable right-handed Nter+ ensemble ( Figure 6C). This transition occurred through a two-step mechanism involving helix rotation as characterized by evaluation of Δ|θ| ( Figure 6D). In particular, after ∼1.5 μs one of the two helices rotated leading to the intermediate (asymmetric dimer) Nter population. Subsequently rotation of the second helix leads to the Cter+ conformation that rapidly evolves to the active Nter+ ensemble ( Figure 6D). Alongside some experimental studies, 82 our results suggest that the "rotation" motion of the TM helices of the EGFR may correspond to another possible activation pathway.
We also performed CG-MD simulations starting from Cter and from Nter (Supporting Information Figure S9). We found a rapid transition (after ∼0.25 μs) from the Cter population to Nter+, which then existed in equilibrium with Cter+. This is in line with the free energy landscape previously described ( Figures 3B). Starting from the Nter dimer, an equilibrium with the Lzip population was reached and followed by transition to Nter+. These observations confirm our suggestion that Nter is an intermediate state between Lzip and Nter+ ( Figure 3C). Some structural asymmetry, observed in Nter conformations ( Figure S9C), may play a role in the kinetics of conversion to the Nter+ active state. Such asymmetry of TM helix dimers was also recently discussed for other RTKs. 27,40 Taken together these simulations support the picture suggested by, e.g., NMR studies 24 of a number of different structures of the TM dimer, interconversion between which may provide pathways for activation of the EGFR.
Influence of Membrane Thickness. For a number of TM helix dimers, the lipid environment has been suggested to modulate the stability of specific conformations. 25,37,83 Our CG-MetaD simulations have allowed us to identify Nter+ as the most stable dimeric state. We therefore simulated the dimeric states in membranes differing in their bilayer thickness (Supporting Information Figure S10) to investigate the effect of bilayer thickness on the conversion pathways from the higher energy states (Cter+, Nter, Cter, Lzip) to this stable conformation. Overall, these simulations confirm the results obtained in the 16:0 PC bilayer used in the majority of the simulations. However, some interesting differences were found. In particular, in a thinner (12:0 PC) bilayer the amplitude and

Journal of the American Chemical Society
Article the frequency of the pivot motions were increased, whereas these appeared to be slowed in a thicker (20:0 PC) bilayer ( Figure 7A). Similar behavior was observed regarding the rotational motions, even leading in some case to the absence of interconversion in the course of simulations ( Figure 7B). These findings support the view that the bilayer thickness can modulate TM helix dimer stability, 37,38,76 and that the physical properties of plasma membrane (nano)domains may be able to modulate receptor activation. 2,84,85 Indeed, in future studies it would be of interest to explore the dependence of free energy landscape on more complex lipid bilayer compositions, including interactions with, e.g., cholesterol and phosphatidylinositol 4,5-bisphosphate such as have been observed in, e.g., simulations with GPCRs. 86 Structural Correlations. The structure of the TM domain of the EGFR has been determined by NMR in DPC micelles (pdb id: 2M0B). 3 It shows interhelix contacts at the C-terminal small-X 3 -small motif similar to those found in the Cter+ population identified in our simulations ( Figure 8A). Furthermore, we note that the structure of the ErbB1/B2 TM helix dimer (pdb id: 2KS1) obtained by NMR in DMPC/ DHPC bicelles resembles more closely our Nter+ ensemble. 51 Another NMR structure of the EGFR TM domain plus the adjacent intracellular juxtamembrane (JM) region (pdb id: 2M20) also revealed interhelix contacts at the N-terminal small-X 3 -small motif within the TM domain. 23 This dimer structure is very similar to the Nter+ ensemble identified as the population of lowest free-energy in our CG-MetaD calculations ( Figure  8B). This agreement prompted us to undertake a series of 5 μs standard CG-MD simulations using the TM+JM NMR structure (pdb id: 2M20) as a starting conformation. In these simulations, we found that the experimental conformation is stable, displaying contacts between the TM helices at the Nterminal motifs with additional hydrophobic leucine-zipper interactions between the JM segments. 23 Encouragingly, the coarse-grained model reproduced the global structure and especially the interhelical contacts determined by NMR ( Figure  8C).
Correlated Motions of the TM and JM Domains. We also simulated the behavior of the NMR TM+JM dimer using CG-MD in a phospholipid bilayer containing anionic lipids (POPC:POPG 85:15) in order to mimic the environment presented by a mammalian plasma membrane. 32,87 After ∼2.7 μs a conformational transition was observed from the Nter+ to the Cter+ population (Supporting Information Figure S11A). This change occurred through a pivot motion of the TM domain and was associated with the separation of the JM segments ( Figure 9A). Interestingly a recent simulation study suggests that a similar motion is associated with an inhibitory effect of binding of the Herceptin monoclonal antibody to the Her2 receptor. 33 In another set of CG-MD simulations of the TM+JM dimer, this time performed in a zwitterionic (PC) bilayer and starting from the Nter+ conformation, we observed a rotational motion of the helices that led to the Lzip conformation, before returning to the initial Nter+ conformation (Supporting Information Figure S11B). Similarly to what was observed for the TM domain, the transition between the right-handed and the left-handed conformations is mediated by an asymmetric Nter conformation (Supporting Information Figure S11C). Structural characterization also shows that the JM leucine-zipper interactions are disrupted during this process and reformed at the end of the simulation ( Figure 9B). Extrapolating back to the full-length receptor, reformation of the antiparallel leucine-zipper interactions in the JM region would be expected to latch the kinase domains in the asymmetric active conformation. 22 Thus, our CG-MD simulations reveal reversible transitions between conformations, which may be identified as corresponding to the likely active and inactive states of the receptor, providing structural insights into the dynamics of EGFR activation.
Mechanisms of EGFR Activation. Overall, our simulations suggest that when the TM helices are in contact through the Cterminal small-X 3 -small or Lzip motifs, the JM segments are distant and cannot readily form antiparallel leucine-zipper interactions (Supporting Information Figure S11). This further suggests that the Lzip, Cter and Cter+ conformations all correspond to inactive states of the receptor. We note that all these inactive dimer states exhibit an increased distance between the two K618 residues in the N-loop that have been previously identified as critical for regulation of EGFR activation by glycolipids such as GM3. 69 We also note that such variation of distance between charged residues in this specific upper region of the TM domain was found sufficient to regulate activation of the growth hormone receptor. 30 Thus, our findings provide a description of the possible role of the TM domain in relaying conformational changes between the extracellular and the intracellular regions of the receptor (Supporting Information Figure S11). We suggest that the

Journal of the American Chemical Society
Article EGFR can be activated either through a "pivot" motion of the TM helices and/or through an alternative "rotational" mechanism, both leading to the right-handed Nter+ active dimeric state ( Figure 9C).

■ CONCLUSIONS
Large-scale motions of TM domains play a key role in the activation mechanisms of membrane receptors such as RTKs. These functional motion of TM helices can be difficult to characterize by either experimental or theoretical approaches. Here, we have described an advanced computational protocol that combines two well-established techniques, namely Coarse-Grained molecular dynamics 88 and well-tempered Meta-Dynamics (CG-MetaD), 45 allowing us to overcome the time scale limits of current simulations. Using this approach, we have simulated helix/helix interactions of the EGFR TM domain to reconstruct the underlying free energy landscape, providing a detailed description of the conformational motions of this canonical RTK TM dimer at near-atomic resolution. In particular, the enhancement of the phase space exploration using CG-MetaD allows us to describe states and transitions between states that would be otherwise only partially described by more standard approaches. Running large ensembles of multiple standard CG-MD simulations with different initial configurations 89 would be an alternative approach but requires a priori structural knowledge on starting configurations that is achievable only by a thorough sampling such as is possible by CG-MetaD.
Our study reveals the various modes of interaction between EGFR TM helices. The agreement between the simulated packing modes and the available structural data supports our thermodynamic characterization. Our calculations predict the existence of five dimeric states of the EGFR TM domain, which are separated from one another by less than 10 kJ/mol (i.e., ∼4 kT), and for which the relatively fast kinetics of interchange may be modulated by the thickness of phospholipid bilayers (Figures 6 and 7). These features underlie considerable conformational flexibility of the TM domain, supporting its likely role in the activation mechanism of the EGFR. We have described two mechanisms leading to the population of lowest

Journal of the American Chemical Society
Article free-energy, Nter+, corresponding to the activated state of the whole receptor. The first involves a "pivoting" motion of the helices that converts the Cter+ (inactive) population in the Nter + (active) population, while the second is a "rotational" mechanism that leads from a left-handed Lzip (inactive) population to the right-handed Nter+ (active) population. Simulations in the presence of the juxtamembrane (JM) domain confirm this scenario and demonstrate the likely critical role of the TM domain in the allosteric regulation of the EGFR (Figure 9). Thus, we are able to fully describe functionally relevant motions of the TM domain "switch" which links the ecto and intracellular domains within the intact receptor dimer.
It is of course worthwhile to consider possible limitations of the CG model (i.e., MARTINI) employed in these studies. 36 The simplifications of the molecular systems include a mapping of four non-hydrogen atoms into one single CG particle, as well as the short-range nature of the interactions between particles. These features lead to smoothed free energy landscape, which intrinsically modifies the kinetics of the system. 55 However, the enthalpy-driven parametrization of the MARTINI force field potential aids prediction of energetically relevant states, including intermediate states. 36 It has been demonstrated that the CG force field reproduces physicochemical properties of solvated phospholipid bilayers 55,90 and describes protein− lipid 84,91−94 and protein−protein 95 interactions. To date, MARTINI has successfully complemented experimental investigations of various TM protein domains. 30,83,96,97 In particular, the in-house variant used in the present study has previously shown quantitative agreement with experimental measures on a number of TM systems. 81,98,99 Furthermore, comparisons of PMFs for helix/helix interactions of glycophorin A derived from MARTINI v2.1 vs v2.2 suggest that the overall well depth changes by less than kT (unpublished observations). Although the standard formulations of MARTI-NI present limitations that may affect the dynamics of the system, such as the lack of solvent polarizability and entropic components, 36,100 the correlation between our results and available experimental data suggests that this CG force field is able to predict thermodynamically relevant structural properties ( Figure 8).
Using the CG-MetaD approach we have addressed the challenging problem 89 of characterizing accurately the free energy landscape of protein/protein interactions within membranes. The association and dissociation of the TM helices of EGFR in a phospholipid bilayer environment are described by a large number of interconversion events between the dimeric and monomeric states (see Movie S1). Through the accelerated sampling of the slowest motions of the system, CG-MetaD simulations allow us to overcome the time scale limits of current simulations. Estimating effective rates of the observed events will be important for future developments of the CG-MetaD methodology. A protocol has been recently proposed to extract transition rates from atomistic metadynamics simulations. 101,102 However, in the present case the kinetic calculations are rather complex since the sampling acceleration observed with the CG model along the CV space needs to be rigorously evaluated.
By enabling us to go beyond the time scale and sampling of current simulations, our CG-MetaD studies alongside standard equilibrium CG and AT simulations reveal new structural information which complements those obtained previously. Our results demonstrate that the combination of CG and MetaD provides an exhaustive exploration of the phase space leading to an accurate description of the free energy landscape underlying the association of the EGFR TM helices. This allows identification of new dimeric states and characterization of multiple, reversible interconversion events between these states of the TM domain dimer of the EGFR. This contributes to our understanding of the mechanisms of EGFR signaling by helping to bridge the gap between the extracellular and intracellular regions. The intrinsic functional motions of the TM domain will reciprocally modulate and be modulated by additional molecular interactions in the structurally intact receptor in vivo. 103 Future investigations will need to analyze the possible effects of more complex membrane environments 86 and to probe the similarities and/or differences among homologous RTKs. 21 This study endorses CG-MetaD as a powerful method that opens new avenues in free energy calculations of biomolecules 104−106 and in the investigation of large-scale conformational motions underlying the activation mechanism of RTKs and related membrane proteins.

* S Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/jacs.6b05602.
Movie S1 (AVI) Tables with details of simulation setups, additional analysis, text and figures (PDF)