A Population Shift between Sparsely Populated Folding Intermediates Determines Amyloidogenicity

The balance between protein folding and misfolding is a crucial determinant of amyloid assembly. Transient intermediates that are sparsely populated during protein folding have been identified as key players in amyloid aggregation. However, due to their ephemeral nature, structural characterization of these species remains challenging. Here, using the power of nonuniformly sampled NMR methods we investigate the folding pathway of amyloidogenic and nonamyloidogenic variants of β2-microglobulin (β2m) in atomic detail. Despite folding via common intermediate states, we show that the decreased population of the aggregation-prone ITrans state and population of a less stable, more dynamic species ablate amyloid formation by increasing the energy barrier for amyloid assembly. The results show that subtle changes in conformational dynamics can have a dramatic effect in determining whether a protein is amyloidogenic, without perturbation of the mechanism of protein folding.


■ INTRODUCTION
Protein unfolding is commonly associated with amyloid formation. This view is supported by the large number of intrinsically unfolded proteins that are the causative agents of amyloid diseases, such as α-synuclein, amyloid-β peptide (Aβ), and amylin. 1−3 In terms of folded protein precursors, a link between local or global unfolding and the onset of aggregation has been documented. 4 Indeed, decreased native state stability and a reduction in co-operativity have been linked with enhanced amyloidogenicity of several proteins, including human lysozyme, 5 transthyretin, 6 prion protein, 7 superoxide dismutase (SOD), 8 and β 2 -microglobulin (β 2 m). 9,10 These findings have guided theoretical approaches to predict aggregation-prone regions of folded proteins by combining the intrinsic propensity of a protein sequence to self-assemble with its probability to become exposed (e.g., by increased predicted hydrogen-exchange rates). 11,12 Along similar lines, strategies to stabilize native-like conformations by the use of small molecules, 13,14 aptamers, 15 molecular chaperones 16,17 or via other protein−protein interactions 18,19 all reduce amyloid formation. However, a quantitative link between the conformational properties and dynamics of individual partially folded species and amyloid propensity remains elusive, in part because of difficulties in identifying and characterizing amyloid intermediates in atomic detail. In addition, the complexity of the energy landscape of proteins, which involves many potentially amyloidogenic species, exacerbates the intricacy of the system.
In this study we investigate the structural, kinetic, and thermodynamic properties of sparsely populated intermediates of human and murine β 2 m (hβ 2 m, mβ 2 m, respectively) and link their properties to the known, very different amyloid propensities of these proteins. 20 Hβ 2 m forms amyloid fibrils in the joints of patients undergoing hemodialysis, in a pathological condition known as dialysis-related amyloidosis. 21,22 Previous studies have examined the link between the folding pathway of β 2 m and its aggregation propensity. These studies showed that hβ 2 m does not aggregate at neutral pH unless additives such as Cu 2+ , 23 trifluoroethanol (TFE), 24 or other cosolvents, 25,26 which partially unfold the protein, are added. As hβ 2 m contains a thermodynamically unfavorable cis prolyl-peptide bond at position 32, protein folding involves a slow folding phase which is attributed to trans−cis proline isomerization of this bond. 9,27 A native-like intermediate containing a trans-Pro 32 (I T ) accumulates during folding of hβ 2 m, which can be trapped by removal of the N-terminal six residues of the protein, to create the variant ΔN6. 28,29 At neutral pH, the concentration of I T correlates directly with the rate of amyloid formation of hβ 2 m, 9 suggesting that formation of this on-pathway native-like folding intermediate is a key determinant of amyloid formation. This supposition is confirmed by the ability of ΔN6 to aggregate readily at pH 6.2. 28 On the other hand, mβ 2 m, despite being 70% identical in sequence to hβ 2 m and also containing a cis-X Pro 32 bond, does not form amyloid fibrils at neutral pH, or even when unfolded under acidic conditions (unless high concentrations of salt are added) ( Figure 1A and 1B). 19,30 The basis of such dramatically different amyloid propensities despite the sequence and structural similarities of these two proteins remained unclear.
Here, we set out to investigate the molecular origins of the reduced amyloidogenicity of mβ 2 m. We characterize the stability, structure, and dynamics of the native protein and show that despite its inability to form amyloid, mβ 2 m is kinetically and thermodynamically less stable than its human counterpart. The folding pathway of mβ 2 m is then explored using real-time NMR, taking advantage of the power of nonuniform sampling (NUS) methods to reveal detailed information on the energy landscape of mβ 2 m folding. Combined with other biophysical methods, we show that while mβ 2 m also folds through an I T state, this species is relatively more flexible than its hβ 2 m counterpart and in conformational exchange with other, less-structured non-native states.
The molten globule-like characteristics of the mβ 2 m folding intermediate reduce the lifetime of the structured, well-folded I T state, which now represents only a minor substate in the structural ensemble. Our findings confirm the vital role of I T (and hence a highly structured, yet non-native species) in determining the aggregation of β 2 m. Moreover, the results highlight the importance of defining the energy landscape of amyloidogenic proteins in detail to allow prediction of their amyloid propensity. The findings presented also suggest that targeting a defined non-native species should be a successful means of controlling the fate of assembly of β 2 m and, in principle, that of other amyloidogenic proteins which aggregate via a specific, non-native precursor.
■ METHODS
Protein Preparation. The pINK plasmid containing the hβ 2 m, mβ 2 m, or ΔN6 gene was transformed into E. coli cells of the BL21 (DE3) pLysS strain. Starter cultures were generated by inoculating 100 mL of LB medium containing 50 μg/mL carbenicillin and 50 μg/mL chloramphenicol with cells containing the relevant gene and incubating overnight at 37°C, 200 rpm. 2 L flasks containing 1 L of LB or HDMI (1 g/L 15 N-NH 4 Cl, 2 g/L 13 C-glucose) medium were inoculated with 10 mL of starter culture. Cells were incubated at 37°C, 200 rpm until they reached an OD 600 of ∼0.6, and the expression of β 2 m was then induced by the addition of isopropyl β-D-1-thiogalactopyranoside (IPTG; final concentration of 1 mM). Expression was allowed to continue overnight at 37°C, and cells were harvested the next morning using a Heraeus continual action centrifuge operating at 15 000 rpm. The cell pellet containing β 2 m as inclusion bodies was chemically lysed by the addition of 50−100 mL of lysis buffer (100 μg/mL lysozyme, 50 μg/mL DNase I, 50 μg/mL phenylmethylsulfonyl fluoride (PMSF), 10 mM Tris-HCl pH 8.0). Further cell disruption was performed using a constant cell disrupter system (Constantsystems) at a high pressure of 20.0 kpsi. Inclusion bodies were separated using centrifugation (15 000 rpm using a Sorvall SS34 rotor) in a Beckman centrifuge for 40 min at 4°C, and the inclusion body pellet was washed with 10 mM Tris-HCl pH 8.0 buffer four times. Finally, β 2 m was solubilized in 10−20 mM Tris-HCl pH 8.0 (hβ 2 m, ΔN6) or 10−20 mM Tris-HCl pH 8.5 (mβ 2 m) containing 8 M urea (MP biomedicals) and refolded by dialysis (3000 MW cutoff) against 2−5 L of the same buffer but lacking urea.
The refolded protein was centrifuged for 30 min at 15 000 rpm (Sorvall SS34 rotor) to pellet insoluble material, and the supernatant was loaded on a Q-Sepharose (GE Healthcare) column already equilibrated with 2 column volumes of 20 mM Tris-HCl pH 8.0 (hβ 2 m, ΔN6) or 20 mM Tris-HCl pH 8.5 (mβ 2 m) for anion exchange purification. Bound protein was eluted with a gradient of 0−400 mM NaCl (in the same buffer) over 800 mL and was freeze-dried after dialysis in dH 2 O or concentrated using 3000 MW cutoff centricons (Avanti LTD). Freeze-dried protein was resuspended in 10 mM sodium phosphate buffer pH 7.0 (hβ 2 m, ΔN6) or 10 mM sodium phosphate buffer pH 8.2 (mβ 2 m), filtered through 0.2 μm filters (Fisher Scientific), and gel-filtered using a HiLoad Superdex-75 Prep column (Amersham Biosciences), calibrated with a standard gel filtration calibration kit (GE Healthcare). The monomer peak was collected, concentrated, aliquoted, and stored at −80°C or freeze-dried.

Figure 1. (A) The NMR structure of monomeric hβ 2 m (gray-2XKS 28 ) overlaid with the crystal structure of mβ 2 m bound to the MHC-I complex (green-1LK2 38 ) or with the solution structure of ΔN6 (red-2XKU 28 ). (B) Aggregation assay of 80 μM hβ 2 m (gray), mβ 2 m (green), or ΔN6 (red) in 10 mM sodium phosphate buffer, pH 6.2 (three replicates for each protein). Negative stain electron micrographs, color-coded with the same scheme, are shown on the right (scale bar = 500 nm). (C) The 1 H− 15 N HSQC spectrum of mβ 2 m in the same buffer as (B). (D) Correlation between experimental RDCs measured for mβ 2 m in 10 mg/mL phage PF1 and those back-calculated from the crystal structure 1LK2 38 (R 2 = 0.85).

Journal of the American Chemical Society
NMR Spectroscopy. Assignments of the backbone atoms of mβ 2 m were obtained using samples of 500 or 750 μM uniformly labeled ( 15 N, 13 C) protein in 10 mM sodium phosphate buffer pH 6.2, 83.3 mM NaCl, 0.02% (w/v) NaN 3 , and 10% (v/v) D 2 O. Three-dimensional (3D) NMR experiments were recorded at 25°C using Varian Inova spectrometers (Agilent) operating at proton frequencies of 500 MHz (HNCA, HNCO, CBCA(CO)NH, HN(CA)CO) and 750 MHz (HNCACB), equipped with a room temperature or cryogenic probe, respectively. Samples for H/D exchange experiments were prepared in 10 mM sodium phosphate buffer pH 6.2 and then freeze-dried. Freeze-dried protein was dissolved in 100% (v/v) D 2 O containing 83.3 mM NaCl and placed into the NMR tube after manual mixing. The loss of intensity of amide proton resonances was then monitored by SOFAST 1 H− 15 N HSQC 31 spectra (5−10 min each) at 25°C. Residual dipolar coupling (RDC) experiments (J modulated series) were carried out using a sample of 200 μM 15 N-mβ 2 m in 10 mM sodium phosphate pH 8.2 and aligned in 11 mg/mL bacteriophage PF1 (Asla Scientific). RDC data were back-calculated from crystal structures using PALES. 32 For relaxation experiments, a sample of 80 μM 15 N-mβ 2 m was prepared in 10 mM sodium phosphate buffer at pH 6.2 with 10% (v/v) D 2 O and 0.02% (w/v) NaN 3 . For real-time refolding experiments two refolding protocols were followed: (1) protein samples were made in 10 mM sodium phosphate buffer pH 6.2 and freeze-dried. Unfolding was performed by dissolving the freeze-dried protein (2−3 mg) in 30−60 μL of the same buffer containing 8 M urea at 37°C for 1 h, and the protein was then refolded by rapid 10-fold dilution in 10 mM sodium phosphate pH 6.2, 10% (v/v) D 2 O, and 0.02% (w/v) NaN 3 ; (2) protein samples were made in 250 μL of 10 mM sodium phosphate and 10% (v/v) D 2 O, and the pH was adjusted to 2.0 (or pH 3.6) using Tris-HCl.
Refolding was then initiated by addition of 50 μL of 300 mM sodium phosphate buffer, pH 7.2 (final pH 6.1−6.3). Both refolding protocols were found to give rise to similar spectra of the intermediate species. This observation demonstrates that the increased flexibility of the murine intermediate (see Results) is not the result of the residual 0.8 M urea present at the end of the first refolding protocol. The refolding from I T to native hβ 2 m was monitored by a series of SOFAST 1 H− 15 N HSQC spectra 31,33 at 25°C, with 80 increments in the indirect dimension, two scans per increment and 512 complex points, resulting in a total acquisition time of 45 s.
To assign the I 1 state of mβ 2 m, refolding was monitored by continuous acquisition of NUS NMR spectra. 3D HNCO+ and HNCA+ 34 with a total acquisition time of ca. 17 h for each spectrum and 2D 1 H− 15 N BEST-TROSY 35 were collected on separate samples (800 MHz Bruker AVANCE III HD spectrometer with 3 mm TCI cryoprobe). 2D 1 H− 15 N BEST-TROSY, having the highest sensitivity, was chosen as a reference spectrum to guide 3D spectra multidimensional decomposition (MDD) coprocessing with a sliding time frame window, resulting in a temporal resolution of a few minutes 36 (see Supplementary Methods).
To aid assignment of the real-time spectra, two samples were prepared in which the early intermediate of mβ 2 m was highly populated. The first consisted of 600 μM of uniformly labeled 13 C, 15 N mβ 2 m in 10 mM sodium phosphate and 10 mM sodium acetate pH 3.6, and the second consisted of 250 μM of uniformly labeled 13 C, 15 N mβ 2 m in 10 mM sodium phosphate pH 6.2 with 1 M urea added. Both samples gave rise to HNCA spectra that closely resembled the real-time HNCA spectrum of the early intermediate of mβ 2 m. Additional 3D spectra, including HNCA, HNCO, and CBCACONH, were recorded on these samples using a 600 MHz Varian Inova spectrometer equipped with a room temperature probe. TALOS+ 37 was used to predict the backbone order parameter (S 2 ). TALOS+ uses H, NH, CO, Cα, and Cβ backbone chemical shifts to calculate the random coil chemical shift index, which is then converted to backbone S 2 . Aggregation assays performed on these samples confirmed that the early folding intermediate of mβ 2 m is not aggregation-prone.
Aggregation Assays. Samples containing 60 μM protein in 10 mM sodium phosphate buffer, pH 6.2, or in 10 mM sodium phosphate with 10 mM sodium acetate, pH 3.6, or in 10 mM sodium phosphate buffer, pH 6.2, with 1 M urea (with the appropriate amount of NaCl added to give a total ionic strength of 100 mM), 0.02% (w/v) NaN 3 , and 10 μM Thioflavin T (ThT) were incubated at 37°C in sealed 96 well plates (Thermo Scientific) with agitation at 600 rpm. Fluorescence was monitored at 480 ± 10 nm after excitation at 440 ± 10 nm using a FLUOstar Optima microplate reader (BMG Labtech).
Equilibrium Unfolding. Urea stock solutions containing 75 mM sodium phosphate buffer, pH 6.2, and either no urea or 10.5 M urea were made, and the exact concentration of urea was determined using the measured refractive index (Ceti refractometer). The stock solutions were used to make samples of protein containing 0−10 M urea in 0.2 M increments, with a final protein concentration of 0.2 mg/mL. Samples were incubated at 25°C for 12 h before analysis using tryptophan fluorescence. Fluorescence was excited at 295 nm, and the emission was monitored at 340 nm using a Photon Technology International QM-1 spectrofluorimeter (PTI). The data were then globally fit to a two-state model in which ΔG o UF (kJ mol −1 ) is the equilibrium stability, M UF is the equilibrium m-value, a and c represent the denaturant-dependence of the folded and unfolded signal intensities, respectively, and b and d are the signal intensities of the folded and unfolded states, respectively, in the absence of denaturant.
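The two-state model described above can be illustrated with a short sketch. The functional form and the sign convention used here (a positive ΔG o UF for a stable fold, decreasing linearly with denaturant) are assumptions, because the fitted equation itself is not reproduced in this text; the function and parameter names are illustrative only.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def two_state_signal(urea, dG_UF, m_UF, a, b, c, d, temp_K=298.15):
    """Signal expected for a folded/unfolded equilibrium at a given urea concentration.

    Assumed convention (not taken from the paper): dG_UF > 0 means the folded
    state is favoured in the absence of denaturant.
    a, c: denaturant dependence of the folded and unfolded signals.
    b, d: folded and unfolded signals at 0 M denaturant.
    """
    dG = dG_UF - m_UF * urea                  # linear free energy relationship
    k_unfold = math.exp(-dG / (R * temp_K))   # [unfolded]/[folded]
    frac_unfolded = k_unfold / (1.0 + k_unfold)
    folded_signal = a * urea + b              # sloping folded baseline
    unfolded_signal = c * urea + d            # sloping unfolded baseline
    return (1.0 - frac_unfolded) * folded_signal + frac_unfolded * unfolded_signal


def midpoint(dG_UF, m_UF):
    """Urea concentration at which folded and unfolded states are equally populated."""
    return dG_UF / m_UF
```

In a global fit, a function of this kind would be evaluated for every sample in the 0−10 M urea series and the parameters adjusted to minimize the residuals; the fitting routine itself is omitted here.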
Electron Microscopy (EM). Carbon coated copper grids were prepared by the application of a thin layer of Formvar with an overlay of carbon. Samples were centrifuged (14 000g, 10 min), and the pellets were resuspended in fresh 10 mM sodium phosphate buffer, pH 6.2, diluted to a final protein concentration of 12 μM with deionized water and then applied to the grid in a dropwise fashion. The grid was then carefully dried with filter paper before it was negatively stained by the addition of 18 μL of 2% (w/v) uranyl acetate. Micrographs were recorded on a Philips CM10 or a JEOL JEM-1400 electron microscope.
Stopped Flow Experiments. Experiments were performed using an Applied-Photophysics SX1.8MV stopped-flow fluorimeter. The temperature was held constant at 37°C (±0.1°C) using a Neslab circulating water bath system. Experiments were performed in buffered solutions containing 10 mM sodium phosphate (pH 6.2) and 83. [...] dilution of 80 μM protein in 10 mM phosphate (pH 2.5) into 80 mM sodium phosphate (pH 6.2). Data were normalized to the signal of the folded and unfolded protein in 0 and 8 M urea, respectively. At each urea concentration at least seven kinetic traces were obtained, averaged, and fitted to a single exponential function using the manufacturer's software. ΔG o UI1 was determined by plotting the fluorescence at the end point of a 20 s kinetic trace of folding against urea concentration and by plotting the fluorescence of the unfolded state against urea concentration (in the latter case, the values at low urea concentration were obtained by linear extrapolation from the values at high urea concentration). The fluorescence of the I T state decreased with increasing urea concentration until it approached the fluorescence of the unfolded state. To estimate ΔG o UI1 at 0 M urea, data were also recorded in the presence of 0.4 M Na 2 SO 4 and the data in 0 and 0.4 M Na 2 SO 4 were fitted globally to eq 1.

■ RESULTS
The native state of mβ 2 m is thermodynamically unstable. As well as being 70% identical in sequence, mβ 2 m and hβ 2 m/ΔN6 have similar structures (backbone RMSD ≈ 1.5 Å) ( Figure 1A). However, only ΔN6 is able to aggregate into amyloid fibrils at pH 6.2 as monitored by the increase in ThT fluorescence and by negative stain EM ( Figure 1B), in agreement with previous studies. 19 To enable NMR studies of mβ 2 m, chemical shift assignments were obtained for the native monomeric protein using a range of 3D NMR experiments (BMRB 19772) (85% of backbone atoms were assigned; see Methods). The 1 H− 15 N HSQC spectrum of native mβ 2 m at pH 6.2 shows a single set of well-dispersed intense peaks, characteristic of a folded protein in solution that undergoes limited chemical exchange on the ms time scale ( Figure 1C).
The only available structure of mβ 2 m is the crystal structure of the protein bound to the heavy chain of the major histocompatibility complex (MHC-I). 38 In the case of hβ 2 m, binding to the heavy chain causes conformational changes, particularly in the AB loop and the D strand of the protein. 39 To determine whether the crystal structure of mβ 2 m bound to the MHC-I complex constitutes a good representation of the structure of the monomeric protein in solution, residual dipolar couplings (RDC) were measured. There is excellent agreement (R 2 = 0.85) between the measured RDCs and those back-calculated from the crystal structure of MHC-I-bound mβ 2 m ( Figure 1D), indicating that the solution structure of monomeric mβ 2 m closely matches that of the protein bound to MHC-I. Therefore, differences in the structures of the native proteins cannot explain the different amyloid potential of the two variants.
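As a minimal sketch of how agreement between measured and back-calculated RDCs can be summarized, the function below computes a squared Pearson correlation coefficient. Whether the R 2 reported above was obtained in exactly this way is an assumption, and the couplings in the usage example are invented placeholders rather than data from this study.

```python
def pearson_r_squared(measured, back_calculated):
    """Squared Pearson correlation between two equally long lists of couplings (Hz)."""
    if len(measured) != len(back_calculated) or len(measured) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(back_calculated) / n
    # Sums of squares and cross-products about the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, back_calculated))
    sxx = sum((x - mean_x) ** 2 for x in measured)
    syy = sum((y - mean_y) ** 2 for y in back_calculated)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant data set")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical couplings, for illustration only
example_measured = [1.0, 2.0, 3.0, 4.0]
example_back_calculated = [1.0, 3.0, 2.0, 4.0]
example_r2 = pearson_r_squared(example_measured, example_back_calculated)
```

A value close to 1 indicates that the structure used for back-calculation describes the measured couplings well.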
We next assessed whether differences in thermodynamic and/or kinetic stability between mβ 2 m and hβ 2 m/ΔN6 could rationalize their different amyloid propensity. Equilibrium denaturation experiments revealed that mβ 2 m is less stable than hβ 2 m (ΔΔG° un = −12.4 kJ/mol) (Figure 2A, 2B and Supplementary Table 1). Remarkably, mβ 2 m is also less stable than the aggregation-prone ΔN6 (Figure 2A, 2B), demonstrating that thermodynamic instability cannot explain the inability of mβ 2 m to form amyloid. Notably, ΔN6 shows a reduced m-value compared with hβ 2 m and mβ 2 m, consistent with exposure of hydrophobic residues that are normally buried in the core of hβ 2 m (10 of the 17 core residues become more exposed in ΔN6 28 ). Hydrogen exchange experiments monitored by 1 H− 15 N HSQC spectra revealed that mβ 2 m is also kinetically less stable than hβ 2 m, with amide protons exchanging with the solvent more rapidly than those of hβ 2 m, while ΔN6 is the least kinetically stable of the three proteins studied here ( Figure 2C, Figure S1A−C, and Figure S2). NMR relaxation experiments on native mβ 2 m also showed no regions of increased dynamics on the ps−ns time scale, apart from the DE loop, which is known to be flexible in all β 2 m variants ( Figure S2D−F). Thus, there is no correlation between thermodynamic or kinetic stability and amyloidogenicity of these different β 2 m variants.
Real-Time Characterization of a Transient Folding Intermediate. We next investigated whether the folding pathway of mβ 2 m also involves transient formation of an intermediate containing trans X-Pro 32 (known as the I T state 28 ), the accumulation of which has been shown to correlate directly with the rate of amyloid formation of the human protein. 9 For these experiments, mβ 2 m was unfolded either by incubating the protein at pH 2.0 or by the addition of 8 M urea at pH 6.2. Refolding was then initiated by dilution to a buffer of pH 7.2 or to a buffer lacking urea (see Methods, Figure S3), and NMR spectra were collected in real time to track the refolding reaction in residue-specific detail. In the case of hβ 2 m, and in accordance with previous studies, 28 a well-dispersed spectrum was observed 3 min after refolding was initiated, in which only small chemical shift differences are detected in comparison with those of the native protein ( Figure 3A and 3B). These results indicate that a native-like intermediate (the I T state) accumulates during folding of hβ 2 m, consistent with previous results. 9,28 As previously reported, the spectrum of ΔN6 is very similar to that of the human I T state ( Figure 3C). 28 In marked contrast with the behavior of hβ 2 m, however, the 1 H− 15 N HSQC spectrum of mβ 2 m 3 min after refolding is initiated revealed only ∼20 intense peaks with limited chemical shift dispersion in the 1 H dimension, which coexist with the most intense peaks of the native state ( Figure 3D and 3E). Interestingly, the partially folded state of mβ 2 m at pH 3.6 gives rise to a 1 H− 15 N HSQC spectrum that closely resembles the spectrum collected at pH ≈ 6.2, 3 min after refolding was initiated ( Figure 3F). These results suggest that partially folded species are significantly populated during the folding of mβ 2 m, but that these species differ in structure and/or dynamics compared with their human counterparts.

Figure 2. ([...] Table S1). (C) Representative hydrogen exchange profiles for the amide hydrogen of residue 83 in hβ 2 m, mβ 2 m, and ΔN6 at 25°C and pH 6.2 (see Figure S2).
In order to assign the real-time spectrum of the intermediate state of mβ 2 m, continuous, NUS NMR spectra (2D-BEST-TROSY HSQC, 3D-HNCA+, and 3D-HNCO+) were collected during refolding and the whole data set was coprocessed together, resulting in a temporal resolution of a few minutes. 36 Importantly, and in contrast with other real-time studies of protein folding which consist of acquisition of sequential standalone spectra, 40 this NUS approach requires the acquisition of only a single spectrum. Therefore, it does not require prior knowledge about the folding reaction in order to decide on the length of each individual experiment, since the time resolution can be determined in the processing step. Additionally, increased sensitivity is achieved by coprocessing less sensitive 3D NMR spectra (e.g., HNCA) with 2D experiments (e.g., HSQC). 41 The same set of intense resonances shown in Figure 3E were observed in 3D real-time spectra, consistent with these residues being flexible in the folding intermediate of mβ 2 m. As only 20 spin systems are present in the real-time HNCA spectrum, assignment was challenging. To overcome this issue, further backbone assignment experiments were performed on mβ 2 m at pH 3.6 ( Figure 3F) and mβ 2 m at pH 6.2 with 1 M urea added, both of which gave rise to spectra similar to those of the kinetic intermediate ( Figure 3F and data not shown), removing the need for rapid data acquisition (see Methods). The residuetype specific information of the Cβ atoms from these equilibrium experiments greatly facilitated the assignment of the real-time HNCA spectrum.
The assignment walk on the Cα resonances of mβ 2 m 5 min after folding was initiated is shown in Figure 4A. The assignment revealed that all of the intense peaks shown in Figure 3E correspond to residues located in the N-terminal region, the A strand, and the AB loop of mβ 2 m ( Figure 4B and 4C), regions of the polypeptide chain whose dynamics have been implicated in the initiation of the aggregation of the human protein. 19,28 The C-terminal four residues of the protein were also detected with chemical shifts that are different (Δδ H+N > 2 ppm) from those in the native structure. The backbone assignments (N, NH, Cα, Cβ, CO atoms; Supplementary Table 2) allow the accurate prediction of the order parameter S 2 and, therefore, an assessment of protein dynamics. Figure 4B and 4C show that the 20 N-terminal [...] show broader lines, suggesting that they correspond to residues with a higher degree of folding ( Figure S5). The far-UV CD spectrum of mβ 2 m obtained 3 min after the pH jump from pH 2.0 to pH 6.2 shows that in I 1 73% of native β-sheet structure is already formed, as quantified by the ratio of the intensities at 219 nm ( Figure 4D). These results show that mβ 2 m at pH 3.6 is partially structured, with the majority of residues being in a β-sheet conformation, potentially native-like. These residues undergo chemical exchange on the ms time scale (or exchange rapidly with solvent) and, therefore, cannot be observed in the 1 H− 15 N spectrum shown in Figure 3E.
In order to estimate the stability of the I 1 state, stopped-flow fluorescence experiments were used to determine the fluorescence intensity of mβ 2 m 20 s after folding was initiated ( Figure 4E). These experiments revealed that, by contrast with the I T state of hβ 2 m, for which the ΔG° un is −9.57 ± 0.54 kJ/mol at 37°C, 9 the I 1 state of mβ 2 m (ΔG° un ≈ −4.8 kJ/mol at 37°C) is only marginally stable in solution, in accordance with the real-time NMR data ( Figure 4E). Overall, the data show that mβ 2 m folds through a flexible/molten globule-like intermediate state in which the N-terminus and the A strand are dynamic and detached from a native-like β-sandwich fold (I 1 ) ( Figure 4C).
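To give a feel for what these free energies imply, the sketch below converts each value into an equilibrium constant relative to the unfolded state using K = exp(−ΔG/RT). It assumes a simple two-state equilibrium between the intermediate and the unfolded state and that a negative ΔG° un denotes a favourable intermediate; both are simplifying assumptions, and the function names are illustrative.

```python
import math

R = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def equilibrium_constant(dG_kJ_per_mol, temp_K):
    """K = [intermediate]/[unfolded], assuming dG < 0 favours the intermediate."""
    return math.exp(-dG_kJ_per_mol / (R * temp_K))


def fraction_intermediate(dG_kJ_per_mol, temp_K):
    """Fraction of molecules in the intermediate for a two-state equilibrium."""
    k = equilibrium_constant(dG_kJ_per_mol, temp_K)
    return k / (1.0 + k)


TEMP_37C = 310.15  # 37 degrees C in kelvin

k_human_IT = equilibrium_constant(-9.57, TEMP_37C)   # I_T of the human protein
k_murine_I1 = equilibrium_constant(-4.8, TEMP_37C)   # I_1 of the murine protein
```

Under these assumptions the human I T state is favoured over the unfolded state by a factor of roughly 41, whereas the murine I 1 state is favoured by a factor of only about 6, which is consistent with the description of I 1 as marginally stable.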
The flexible intermediate I 1 is not aggregation-prone. The amyloid fibrils of hβ 2 m are composed of parallel, in-register β-strands, 42,43 while in the native monomer the β-strands are all antiparallel ( Figure 1A). Thus, a major conformational change has to occur on the pathway to fibrils. Detachment of the A strand might represent a first step toward the remodeling of the native protein, and therefore, the early intermediate of mβ 2 m, I 1 , might be expected to be highly amyloidogenic. To test this hypothesis, aggregation assays were performed using ΔN6 at pH 6.2 as a mimic of the highly aggregation-prone state I T , and mβ 2 m at pH 3.6, conditions which favor the less structured intermediate state (I 1 ) of mβ 2 m. Consistent with previous results, these experiments showed that ΔN6 aggregates rapidly at pH 6.2 with a lag time of ∼30 h, resulting in the formation of amyloid fibrils ( Figure 5A). In marked contrast, no increase in ThT fluorescence was observed for mβ 2 m at pH 3.6 ( Figure 5A). Indeed, the majority of the murine protein remained soluble after 100 h of incubation, while ΔN6 was quantitatively converted into amyloid fibrils ( Figure 5B and 5C). Interestingly, the small amount of mβ 2 m that was not found in the supernatant also formed short fibrils ∼300 nm in length ( Figure 5C). These results show that the partially folded state of mβ 2 m is not highly aggregation-prone. On the other hand, the specific structural features of the native-like I T intermediate of hβ 2 m are crucial for assembly.
A native-like I T intermediate is populated on the pathway to native mβ 2 m. The data presented above demonstrate that the conformational properties of the dynamic I 1 state of mβ 2 m are different from those of the native-like hβ 2 m I T state. However, additional states could be populated after the formation of I 1 and prior to the formation of native mβ 2 m. Indeed, as the folding time progresses, a third set of peaks (apart from those of the native and the flexible intermediate states (Figure 3)) emerges in the real-time 1 H− 15 N HSQC spectrum ( Figure S4). These peaks show small chemical shift differences compared with the native mβ 2 m resonances and are generally broad ( Figure S4), suggesting that additional, more native-like states are populated at later times during the folding of the protein. This observation presumably reflects an ordered assembly mechanism, in which the initially highly dynamic intermediate (I 1 ) folds to the native state via a transiently populated, more structured (I T -like) state.
To investigate this possibility further, partially folded mβ 2 m at pH 3.6 (which mimics the I 1 state; Figure 3E and 3F) was allowed to fold by a rapid pH jump to pH 6.2 and folding was monitored in real time using NMR. The 1 H− 15 N HSQC spectrum collected 3 min after refolding was initiated was well dispersed ( Figure 6A), in marked contrast with the molten globule-like spectrum of the I 1 state shown in Figure 3E. Indeed, the spectrum obtained 3 min after the pH jump is reminiscent of that of native mβ 2 m with significant chemical shift differences being limited to residues in the N-terminal region (residues 1−6), the BC, DE, and FG loops ( Figure 6A and 6B). Moreover, the peaks that show chemical shift differences from the native protein are remarkably similar to the third set of peaks observed in Figure S4B, consistent with folding from the partially folded state (I 1 ) to a more native-like intermediate (presumably I T ). Importantly, residues that show significant chemical shift changes are surrounded by residues whose resonances are not detected in the real-time NMR spectrum ( Figure 6B). These areas reside in close spatial proximity to Pro 32 in the native structure and, therefore, are exchange-broadened. A similar scenario has been observed for the real-time folding intermediate of hβ 2 m (I T ). 40 Furthermore, ΔN6 shows chemical shift differences to native hβ 2 m in these same regions. 28 Together, the results show that mβ 2 m folds through a native-like intermediate state that has similar structural properties to the amyloidogenic I T state of hβ 2 m. However, due to its decreased stability, this species is copopulated with an ensemble of partially folded, flexible states in which the N-terminal region is highly disordered (I 1 ).

■ DISCUSSION
Characterization of Protein Energy Landscapes Using Sparse Data. The results presented above highlight the importance of determining the precise details of the folding energy landscape of a protein in order to elucidate whether one or more partially folded, or non-native states, have the potential to initiate amyloid assembly. Characterization of ensembles of interconverting non-native species that are not only sparsely populated but also short-lived, such as those involved in protein folding and aggregation, is a challenging task, even for the most advanced biophysical methods. While time-resolved NMR studies on proteins can be highly informative, these studies usually suffer from low resolution and poor sensitivity. By combining the use of sparsely sampled NMR and coprocessing of more complicated/less sensitive experiments with others that show increased sensitivity, we demonstrate here the detection and atomic level depiction of the early molten globule state, I 1 , of mβ 2 m, which is populated for only ∼15 min, a task that would not have been possible using standard NMR methodologies (see Supplementary Methods). The power of the real-time NMR experiments allowed us to identify conditions that stabilize the I 1 state (pH 3.6 or addition of 1 M urea at pH 6.2) and to perform more detailed NMR experiments on the trapped intermediate state that led to a complete description of the folding mechanism of the murine protein. The approach is potentially applicable to other amyloidogenic proteins or proteins that fold through the accumulation of transient intermediate states.
Figure 6 (caption fragment). Blue dots represent residues for which assignments are missing in the native spectrum. Residues that show chemical shift differences greater than 1 ppm (dashed line) are colored yellow, those that show chemical shift differences less than 1 ppm are shown in gray, and residues that are broadened beyond detection in the 3 min spectrum are colored red. The structure of mβ 2 m colored in the same color scheme is shown on the right.
The Precise Balance between Folding Intermediates Determines the Aggregation Propensity of β 2 m. The folding pathway of hβ 2 m has been investigated in detail over the past decade using different protein variants and different techniques. 9,28,44−48 Together, these studies have shown that the folding of hβ 2 m involves the formation of a native-like intermediate I T that is kinetically trapped by virtue of the non-native trans X-Pro 32 bond. 9,47 This species is preceded by a less well characterized species (I 1 ) 44 that forms in the dead time of a stopped-flow experiment (<3 ms) and is less structured than the I T state (Figure 7). The fine details of the exchange processes between these different folding intermediates have a dramatic effect on the propensity to aggregate. For hβ 2 m the structured aggregation-prone I T state is the most highly populated intermediate during folding, accumulating, on average, to ∼4% (pH 7.0, 37 °C) at equilibrium. 9,10 By contrast, the flexible I 1 state represents the most highly populated intermediate state during the folding of mβ 2 m. Indeed, the most intense peaks of the I 1 state (but not those of the I T state) are visible in the spectrum of native mβ 2 m, enabling its equilibrium population to be estimated at ∼7% (Figure S6). This reduces the population of the mβ 2 m I T state, with the effect that aggregation no longer occurs (at least on an experimentally tractable timescale) (Figure 7). Thus, although the folding mechanisms of human and murine β 2 m are conserved (the same species are populated on the pathway to the native state), the precise balance between these states reduces the population of the key amyloidogenic precursor I T for mβ 2 m and thus defines the course of amyloid formation.
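To give a feel for what the equilibrium populations quoted above mean energetically, the following minimal sketch converts a fractional population into a free energy difference. It assumes a simple two-state equilibrium between the intermediate and the rest of the ensemble (K = p/(1 − p), ΔG = −RT ln K), which is a simplification of the multistate landscape described here. The function name is hypothetical, and the temperature used for the mβ 2 m value is an assumption, since the text gives conditions only for the hβ 2 m measurement.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol*K)


def free_energy_from_population(p_minor: float, temp_c: float) -> float:
    """Free energy (kJ/mol) of a minor state relative to the rest of the ensemble.

    Assumes a two-state equilibrium: K = p / (1 - p), dG = -R * T * ln(K).
    """
    if not 0.0 < p_minor < 1.0:
        raise ValueError("population must lie strictly between 0 and 1")
    temp_k = temp_c + 273.15
    k_eq = p_minor / (1.0 - p_minor)
    return -R * temp_k * math.log(k_eq) / 1000.0


# ~4% I_T for human beta-2-microglobulin at 37 degrees C (values from the text)
dg_it_human = free_energy_from_population(0.04, 37.0)

# ~7% I_1 for the murine protein; 37 degrees C is an ASSUMED temperature
dg_i1_murine = free_energy_from_population(0.07, 37.0)

print(f"I_T (human):  {dg_it_human:.2f} kJ/mol above the remaining ensemble")
print(f"I_1 (murine): {dg_i1_murine:.2f} kJ/mol above the remaining ensemble")
```

Under these assumptions, both intermediates lie only a few kJ/mol above the dominant state, so modest changes in stability are enough to change which intermediate is the more populated.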
Interestingly, urea denaturation experiments on a partially folded state of hβ 2 m formed at pH 3.6 have shown that the N-terminal six residues and the A strand are the least stable regions, while the rest of the protein forms a stable core. 49 These results are consistent with the real-time NMR studies on the I 1 state of mβ 2 m presented here, showing that the conformational properties of the early partially folded states of hβ 2 m and mβ 2 m are similar. Notably, neither of these states is able to form the long, straight fibrils characteristic of amyloid; instead they form short rod-like fibrils (mβ 2 m at pH 3.6) or worm-like fibrils (hβ 2 m at pH 3.6) 50 (Figure 5).
A direct link between decreased native state stability and increased aggregation propensity has been observed for several proteins including lysozyme, 51 transthyretin, 6 and antibody light chains. 52 In these cases, destabilizing mutations enhance the rate of exchange between the native protein and partially folded non-native species, which show increased amyloidogenicity compared with the native state. Interestingly, mβ 2 m is less thermodynamically and kinetically stable than hβ 2 m and, as a consequence, the molten globule-like nonamyloidogenic I 1 state is the most abundant non-native species (Figure 7). The increased conformational dynamics result in a protein that is unstable yet protected from amyloid assembly, since I 1 is not able to form amyloid. These findings argue against a simple link between native state stability and amyloidogenicity (at least for β 2 m). Instead, they highlight the importance of the precise conformational properties of the native-like I T state, which are vital for assembly.
Figure 7. Balance between protein folding and aggregation. The native-like I T state is predominantly populated during the folding of hβ 2 m (gray scheme, left). This allows the entrance to the aggregation landscape (red scheme) as I T shows enhanced amyloidogenicity. In the case of mβ 2 m (right), I T represents only a minor conformation during folding; instead, the flexible molten globule-like state in which the A strand is detached from the β-sandwich fold is the major non-native species. As I 1 is not aggregation-prone, mβ 2 m is protected from misfolding and instead folds to the native state (the energy levels of the aggregation landscape are drawn for illustration purposes only).
Together, the results demonstrate that amyloid formation of β 2 m at neutral pH is initiated via the highly structured I T state. Hence, from the myriad of potential non-native conformations that could be populated during folding, only the I T state allows the entrance of β 2 m to the aggregation landscape (Figure 7). Such a finding highlights the ordered specificity in the early stages of assembly into amyloid and opens the opportunity to target a specific non-native state in order to control the onset of aggregation, for example through the development of antibodies, nanobodies, or small molecules that specifically recognize this species. The results highlight the importance of considering multiple factors in order to predict amyloid formation. Hydrophobicity, the propensity of the sequence to aggregate, the stability of the native state, solubility, transient exposure of aggregation-prone regions through protein dynamics, and the stability of the polypeptide sequence within the fully assembled fibril structure itself may all contribute to enhanced amyloidogenicity. Although these factors are well understood individually, the interplay between them during aggregation remains poorly explained. The results presented emphasize the importance of understanding the energy landscape of aggregation in intricate detail, from both thermodynamic and kinetic viewpoints, in order to predict whether or not a protein will aggregate and how/why minor alterations in solution conditions/amino acid sequence can have a dramatic effect on the course of assembly, by small changes in the relative populations of amyloidogenic versus nonamyloidogenic states.
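The idea that small changes in relative stability redistribute population between amyloidogenic and nonamyloidogenic states can be illustrated with Boltzmann weighting. The sketch below is purely illustrative: the free energies are hypothetical values chosen to mimic an "I T -dominated" and an "I 1 -dominated" landscape, not the measured values of Table S1, and the function name and the temperature of 310.15 K are likewise assumptions.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant, kJ/(mol*K)


def boltzmann_populations(free_energies_kj, temp_k=310.15):
    """Return fractional equilibrium populations for states with given free energies.

    free_energies_kj maps a state name to its free energy (kJ/mol) relative to
    an arbitrary common reference; populations follow exp(-G / RT) weighting.
    """
    weights = {
        state: math.exp(-g / (R_KJ * temp_k))
        for state, g in free_energies_kj.items()
    }
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


# HYPOTHETICAL free energies (kJ/mol) relative to the native state N
landscape_a = {"N": 0.0, "I_T": 8.0, "I_1": 12.0}  # I_T is the major intermediate
landscape_b = {"N": 0.0, "I_T": 12.0, "I_1": 7.0}  # I_1 is the major intermediate

pops_a = boltzmann_populations(landscape_a)
pops_b = boltzmann_populations(landscape_b)

for label, pops in (("A", pops_a), ("B", pops_b)):
    summary = ", ".join(f"{state}: {100 * p:.2f}%" for state, p in pops.items())
    print(f"landscape {label}: {summary}")
```

In this toy calculation, shifting the two intermediates by only a few kJ/mol swaps which one dominates and lowers the population of the amyloidogenic I T state severalfold, while the native state remains by far the most populated species in both cases.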

■ CONCLUSION
In this study we have used NUS NMR methods to study the relationship between protein folding and aggregation of a globular protein that forms amyloid fibrils from a structured precursor state, using β 2 m as the test protein. We show that the least thermodynamically stable protein is the least aggregation-prone sequence of the family of proteins studied here. Analysis of the folding energy landscape of the protein using real-time NMR revealed that the decreased stability and decreased lifetime of a precise and well-defined native-like amyloidogenic precursor (I T ) are sufficient to tip the balance from aggregation to folding. The power of sparsely sampled NMR allowed us not only to detect a dynamic intermediate state (I 1 ) of mβ 2 m but also to structurally characterize this species in residue-specific detail. The results reveal that the least stable protein (mβ 2 m) predominantly populates a flexible intermediate (I 1 ) that is not aggregation-prone, while its more stable counterpart (hβ 2 m) folds through a native-like intermediate that has enhanced amyloidogenicity. Subtle changes in the folding energy landscape thus lead to dramatic changes in the aggregation outcome. The results reveal that protein stability does not correlate with aggregation propensity. Instead, it is the precise balance and kinetic partitioning of intermediate states that determine whether β 2 m will fold to the native state or aggregate to form amyloid fibrils.

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/jacs.6b02464.
Supplementary Methods, explanation of time-resolved NUS NMR; Table S1, unfolding free energies of native and intermediate states; Table S2, chemical shift assignments of the I 1 state; Figures S1−S2, hydrogen exchange and relaxation data for mβ 2 m; Figures S3−S4, refolding of mβ 2 m by pH jump or urea and spectra at later times during refolding; Figure S5, CON spectrum of the I 1 state; and Figure S6, detection of I 1 in the spectrum of native mβ 2 m (PDF)

■ AUTHOR INFORMATION
Corresponding Author
*s.e.radford@leeds.ac.uk
Notes