Energy Landscapes and Structural Ensembles of Glucagon-like Peptide-1 Monomers

GLP-1 and its analogues are important pharmaceutical agents in the treatment of type 2 diabetes and obesity, but their propensity to aggregate into amyloid fibrils poses a significant safety issue. Many factors may contribute to this aggregation propensity, including pH. Although the monomeric structure of GLP-1 is known to have a strong impact on primary nucleation, probing its diverse structural ensemble is challenging. Here, we investigated the monomer structural ensembles at pH 3, 4, and 7.5 using state-of-the-art computational methods in combination with experimental data. We found significant stabilization of β-strand structures and destabilization of helical structures at lower pH, correlating with the observed aggregation lag times, which are shorter under these conditions. We further identified helical defects at pH 4, the condition that led to the fastest observed aggregation, in agreement with our far-UV circular dichroism data. The detailed atomistic structures that result from the computational studies help to rationalize the experimental results on the aggregation propensity of GLP-1. This work provides new insight into the pH dependence of the monomeric structural ensembles of GLP-1 and connects them to experimental observations.


Starting Points and Force Field for Simulations for GLP-1 (7-36)
The starting points for the GLP-1 (7-36) simulations were generated using the same methods as for GLP-1 (7-37), described in the main paper. Protein Data Bank structure 5OTU¹ was used as a seed for basin-hopping global optimization. The protonation states studied are provided in Table S1.
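For readers unfamiliar with basin-hopping, the idea of alternating random perturbations, local minimisation, and a Metropolis test on the minimised energies can be illustrated on a toy problem. The following is a minimal sketch only: the one-dimensional double-well function, the gradient-descent minimiser, and all parameter values are hypothetical stand-ins, and none of this reproduces the force field or software used for GLP-1.

```python
import math
import random

def energy(x):
    # Toy double-well "landscape" (hypothetical, not a peptide force field).
    return x**4 - 4.0 * x**2 + x

def gradient(x):
    # Analytical derivative of the toy energy.
    return 4.0 * x**3 - 8.0 * x + 1.0

def local_minimise(x, step=0.01, iterations=2000):
    # Plain gradient descent stands in for a proper local minimiser.
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

def basin_hopping(x0, steps=100, max_move=2.0, temperature=1.0, seed=0):
    """Return (x, energy) of the lowest minimum found from start x0."""
    rng = random.Random(seed)
    current = local_minimise(x0)
    best = current
    for _ in range(steps):
        # Perturb the current minimum, then relax to the nearest minimum.
        trial = local_minimise(current + rng.uniform(-max_move, max_move))
        delta = energy(trial) - energy(current)
        # Metropolis test on the energies of the minimised structures.
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            current = trial
        if energy(current) < energy(best):
            best = current
    return best, energy(best)

# Start in the higher-energy well and let the search find the lower one.
x_min, e_min = basin_hopping(x0=1.3)
print(f"lowest minimum found: x = {x_min:.3f}, E = {e_min:.3f}")
```

Because the acceptance test compares minimised energies rather than raw energies, barriers between neighbouring minima do not need to be crossed explicitly.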
Table S1: Sequence and protonation states of GLP-1 (7-36) variants. Approximate pH values are given so that simulations can be more readily compared to experiment. Three-letter residue labels indicating the protonation state follow the convention of the AMBER package. Here, histidine is represented not by HIS, but by HID if the residue is deprotonated and has a hydrogen on the delta nitrogen only, by HIE if the residue is deprotonated and has a hydrogen on the epsilon nitrogen only, and by HIP if the residue is protonated and so has hydrogen atoms on both the delta and epsilon nitrogens. GLU represents glutamic acid in its deprotonated form, whereas GLH represents this residue when protonated. Residues highlighted in red are those which differ from 7-36 Prot1.

Label    Approximate pH    Sequence
7-36 Prot1    7.

The key Leu14-Lys20 interaction, which arises at pH 4, is highlighted in red on both the structure and the schematic.
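The effect of the AMBER protonation labels on the overall charge can be tallied with a short script. This is a minimal sketch under stated assumptions: the charge table covers only the formal side-chain charges of standard AMBER residue names, the chain termini are ignored, and the four-residue fragment is purely illustrative rather than a row of Table S1.

```python
# Formal side-chain charges for AMBER residue names whose charge depends on
# protonation. Residues not listed are treated as neutral; termini are ignored.
SIDE_CHAIN_CHARGE = {
    "HID": 0, "HIE": 0, "HIP": 1,   # histidine: neutral tautomers vs protonated
    "GLU": -1, "GLH": 0,            # glutamic acid: deprotonated vs protonated
    "ASP": -1, "ASH": 0,            # aspartic acid: deprotonated vs protonated
    "LYS": 1, "ARG": 1,             # basic residues in their protonated forms
}

def net_side_chain_charge(residues):
    """Sum the formal side-chain charges over a list of AMBER residue names."""
    return sum(SIDE_CHAIN_CHARGE.get(name, 0) for name in residues)

# Illustrative fragment (an assumption for this example) in two states.
near_neutral = ["HIE", "ALA", "GLU", "GLY"]
acidic = ["HIP", "ALA", "GLH", "GLY"]
print(net_side_chain_charge(near_neutral))  # -1
print(net_side_chain_charge(acidic))        # 1
```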

GLP-1 (7-36) Discussion
Radius of Gyration (see row 1 of Fig. S3): At pH 7.5, the compactness depends strongly on which of the δ and ϵ nitrogen atoms of residue 1 (a histidine) is protonated. Lowering the pH so that both of these nitrogen atoms become protonated results in a generally compact structure. However, lowering the pH still further, so that the glutamic acid residues also become protonated, results in a predominance of extended α-helix and β-strand structures.
α-helical vs β-strand Content (see rows 2 and 3 of Fig. S3): At high pH, the deepest and widest funnel consists of a multiplicity of structures with high α-helical content. Lowering the pH narrows this funnel and reduces the percentage α-helical content of individual structures. β-strands are observed at neutral pH, but they account for only a small proportion of structures. Upon lowering the pH to 4, these β-strands become more prevalent, and this structure type forms the second-lowest energy funnel (as in the 7-37 case), with the α-helix type forming the lowest.
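The radius of gyration used as an order parameter here is the root-mean-square distance of the atoms from their centre of mass. A minimal sketch of the calculation follows; the coordinates are toy values chosen for illustration, and whether the authors used mass weighting or a particular atom selection is not stated, so both are assumptions.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Radius of gyration for a list of (x, y, z) tuples.

    If masses is None, all points are weighted equally (an assumption).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    # Centre of mass, one Cartesian component at a time.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total
    return math.sqrt(msd)

# Toy five-point chains (hypothetical coordinates in Å).
extended = [(3.8 * i, 0.0, 0.0) for i in range(5)]
compact = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
           (0.0, 3.8, 0.0), (1.9, 1.9, 3.0)]
print(radius_of_gyration(extended) > radius_of_gyration(compact))  # True
```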
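The percentage α-helical and β-strand content of a single structure can be illustrated from a per-residue secondary-structure string. This sketch assumes DSSP-style one-letter codes (H for α-helix, E for extended strand); the assignment program actually used is not stated in this section, and the example string is invented.

```python
def secondary_structure_content(ss):
    """Return percentage helix (H) and strand (E) content of a DSSP-style string."""
    n = len(ss)
    if n == 0:
        raise ValueError("empty secondary-structure string")
    return {
        "helix": 100.0 * ss.count("H") / n,
        "strand": 100.0 * ss.count("E") / n,
    }

# Invented 20-residue assignment: 8 helical and 4 strand residues.
example = "--HHHHHHHH--EEEE--TT"
print(secondary_structure_content(example))  # {'helix': 40.0, 'strand': 20.0}
```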

Analysis of aggregation-prone configurations
Amyloid aggregation generally proceeds via the adoption of assembly-competent monomeric states, so-called N* states. Analysing the structural ensembles to identify these states and their propensities can provide insight into the likelihood of aggregation, as demonstrated for amyloid-β in previous studies.²,³ Such an analysis requires knowledge of the amyloid core region, so that the eventual fold in the fibril can be compared with the monomeric structures.
To our knowledge, such a structure is not publicly available for GLP-1, although some structures of aggregating fragments are available. This limitation means that we could not conduct a full N* analysis; instead, we compared the structures across the landscapes to one of the reported peptide structures. The peptide structure used for comparison is PDB 8ONQ,⁴ a seven-residue peptide exhibiting an anti-parallel β-sheet-like structure.
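Comparing a seven-residue reference with a longer monomer requires choosing which stretch of the monomer to compare. The sketch below slides a window along the chain and scores each window against the reference. Note the substitution: it uses a distance RMSD (dRMSD), which needs no superposition, as a simplified stand-in for the backbone RMSD, and whether the authors scanned windows in this way is not stated. All coordinates are toy values.

```python
import math
from itertools import combinations

def distance_rmsd(a, b):
    """dRMSD between two equal-length lists of (x, y, z) points."""
    if len(a) != len(b):
        raise ValueError("point lists must have equal length")
    pairs = list(combinations(range(len(a)), 2))
    # Compare every intra-structure distance in a with its counterpart in b.
    sq = sum((math.dist(a[i], a[j]) - math.dist(b[i], b[j])) ** 2 for i, j in pairs)
    return math.sqrt(sq / len(pairs))

def best_window(monomer, reference):
    """Return (start_index, dRMSD) of the monomer window closest to reference."""
    width = len(reference)
    scores = [
        (distance_rmsd(monomer[s:s + width], reference), s)
        for s in range(len(monomer) - width + 1)
    ]
    score, start = min(scores)
    return start, score

# Toy trace (hypothetical, in Å): five helix-like points, then seven in a line.
helix = [(2.3 * math.cos(math.radians(100 * i)),
          2.3 * math.sin(math.radians(100 * i)),
          1.5 * i) for i in range(5)]
x0, y0, z0 = helix[-1]
strand = [(x0 + 3.3 * k, y0 + 3.8, z0) for k in range(7)]
monomer = helix + strand
reference = [(3.3 * k, 0.0, 0.0) for k in range(7)]  # idealised extended peptide

start, score = best_window(monomer, reference)
print(start, round(score, 3))
```

In this toy case the best-scoring window is the one that coincides with the straight segment; for real structures a superposition-based backbone RMSD would also distinguish mirror images, which dRMSD cannot.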
Figure S6 shows the energy landscapes for GLP-1 (7-37) using the backbone RMSD to

Explicit solvent simulations
For each landscape for GLP-1 (7-37), we selected four minima representative of the different funnels for explicit solvent molecular dynamics (MD) simulations to validate their stability.
Each structure was solvated in OPC water in a truncated octahedral solvation box with a distance of 8.0 Å between the solute and the box surface. Cl⁻ ions were added to give an effective concentration of 0.15 M, and K⁺ ions were used to neutralise the system.
The energy of the system was first minimised with and without constraints on the solute, before the simulation box was heated to 300 K with restraints on the solute. These restraints were removed stepwise, before the simulation box was equilibrated in an NPT ensemble. We then ran three 50 ns production trajectories for each minimum.
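The number of ions corresponding to a target concentration follows from the cell volume. The sketch below shows the arithmetic only; the 60 Å cell length is a hypothetical value (the actual cell size depends on the solute and the 8.0 Å buffer), and using the full cell volume rather than the solvent-accessible volume is a simplifying assumption.

```python
import math

AVOGADRO = 6.02214076e23           # particles per mole
LITRES_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L

def truncated_octahedron_volume(a):
    """Volume (Å^3) of a truncated octahedral cell.

    a is the length of the periodic lattice vectors (Å), i.e. the cell
    length reported alongside angles of 109.47 degrees.
    """
    return 4.0 * math.sqrt(3.0) / 9.0 * a ** 3

def ions_for_concentration(volume_A3, molar):
    """Whole number of ions giving roughly `molar` mol/L in the given volume."""
    return round(molar * AVOGADRO * volume_A3 * LITRES_PER_CUBIC_ANGSTROM)

# Hypothetical cell length of 60 Å at the 0.15 M concentration quoted above.
volume = truncated_octahedron_volume(60.0)
print(round(volume), ions_for_concentration(volume, 0.15))
```

Counter-ions needed for neutralisation are counted separately, one per unit of net solute charge.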
For each pH value, one minimum is helical (185102 for pH 7.5, 23216 for pH 4 and 131790 for pH 3), while the other three are disordered with varying secondary structure content.

Figure S3: Disconnectivity graphs of the free energy landscapes for monomeric 7-36 GLP-1, coloured using order parameters for key structural features. Top row: The radius of gyration is used as the order parameter, with compact structures in red and extended structures in blue and green. Middle row: The order parameter is the α-helical content, where red is no helical content and green and blue are medium to high helical content. Bottom row: The β-strand content is used for colouring, with red indicating no β-strand content, and green and blue indicating a medium and high level, respectively.

Figure S6: The energy landscapes for 7-37 GLP-1 at the three pH values, coloured by the backbone RMSD to 8ONQ. The colouring closely resembles the β-strand content and indicates the higher stability of aggregation-prone species at lower pH.

Figure S7: Backbone RMSD changes during explicit solvent MD simulations for four minima at pH 7.5. The individual trajectories are shown in grey, with the average and standard deviation in green. No large structural changes are observed on the MD time scale.
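The average and standard deviation shown in such a plot can be obtained frame by frame across the repeat trajectories. This is a minimal sketch with invented RMSD values; the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev

def per_frame_mean_std(trajectories):
    """Per-frame mean and sample standard deviation across trajectories.

    trajectories is a list of equal-length lists of RMSD values (Å).
    """
    if len({len(t) for t in trajectories}) != 1:
        raise ValueError("trajectories must have the same number of frames")
    frames = list(zip(*trajectories))  # one tuple of values per frame
    return [mean(f) for f in frames], [stdev(f) for f in frames]

# Invented RMSD traces for three repeats, two frames each.
repeats = [[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]]
means, stds = per_frame_mean_std(repeats)
print(means, stds)  # [2.0, 2.0] [1.0, 0.0]
```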