Ion Mobility Mass Spectrometry Uncovers the Impact of the Patterning of Oppositely Charged Residues on the Conformational Distributions of Intrinsically Disordered Proteins

The global dimensions and amplitudes of conformational fluctuations of intrinsically disordered proteins are governed, in part, by the linear segregation versus clustering of oppositely charged residues within the primary sequence. Ion mobility-mass spectrometry (IM-MS) affords unique advantages for probing the conformational consequences of the linear patterning of oppositely charged residues because it measures and separates proteins electrosprayed from solution on the basis of charge and shape. Here, we use IM-MS to measure the conformational consequences of charge patterning on the C-terminal intrinsically disordered region (p27 IDR) of the cell cycle inhibitory protein p27Kip1. We report the range of charge states and accompanying collision cross section distributions for wild-type p27 IDR and two variants with identical amino acid compositions, κ14 and κ56, distinguished by the extent of linear mixing versus segregation of oppositely charged residues. Wild-type p27 IDR (κ31) and κ14, where the oppositely charged residues are more evenly distributed, exhibit a broad distribution of charge states. This is concordant with high degrees of conformational heterogeneity in solution. By contrast, κ56, with linear segregation of oppositely charged residues, shows limited conformational heterogeneity and a narrow distribution of charge states. Gas-phase molecular dynamics simulations demonstrate that the interplay between chain solvation and intrachain interactions (self-solvation) leads to conformational distributions that are modulated by salt concentration, with the wild-type sequence showing the most sensitivity to changes in salt concentration. These results suggest that the charge patterning within the wild-type p27 IDR may be optimized to sample both highly solvated and self-solvated conformational states.


■ INTRODUCTION
Conformational heterogeneity is a defining hallmark of intrinsically disordered proteins (IDPs). 1 As autonomous units, IDPs interconvert among disparate conformations under physiological conditions. 2,3 The amplitudes of conformational fluctuations and the time scales associated with these fluctuations span a wide range, showing sequence specificity and dependence on solution conditions. IDPs are of interest due to their range of functions and their involvement in a range of diseases, particularly cancers and neurodegenerative disorders. 4,5 Bioinformatics and recent proteomics studies indicate that about 25−30% of eukaryotic proteins are mostly disordered, 6 that more than half of eukaryotic proteins have long segments of disorder, 6,7 and that more than 70% of signaling proteins have long disordered regions. 8 Understanding how intrinsically disordered regions (IDRs) mediate the function of a protein requires accurate physical descriptions of their sequence-to-conformation relationships. IDPs and IDRs are often enriched in proline, glutamic acid, lysine, serine, and glutamine, yet depleted in tryptophan, tyrosine, phenylalanine, cysteine, isoleucine, leucine, and asparagine in comparison to folded, globular proteins. 8,9 An emerging theory suggests that the context, or adaptive location, of a given residue within a protein allows modulation of different functional conformational ensembles, which govern how that region will interact with a given partner. 10 One parameter to define this "context" is the net charge per residue 11 (NCPR, defined as NCPR = f₊ − f₋, where f₊ and f₋ are the fractions of positively and negatively charged residues, respectively, in the amino acid sequence). While the context will alter the solvation and self-solvation state of the protein, NCPR is useful in predicting whether a polyelectrolytic IDP will form a collapsed globule or a swollen extended coil.
While some IDPs are polyelectrolytic (contain either positively or negatively charged residues), a larger fraction are polyampholytic (contain both positively and negatively charged residues); consequently, the NCPR parameter alone, which subsumes the f₊ and f₋ values, is inadequate to describe the sequence-to-conformation relationships of such proteins. Das and Pappu proposed that a combination of the fraction of charged residues (FCR, defined as FCR = f₊ + f₋) and the linear sequence patterning of oppositely charged residues influences the conformational features of an IDP, including the degree of chain compaction and the amplitude of conformational fluctuations. 12 The extent of linear mixing versus segregation of oppositely charged residues was quantified using a parameter κ. The κ-values range between 0 and 1, where low values correspond to well-mixed sequences of positive and negative residues, and at κ-values near 1 oppositely charged residues are completely segregated in the linear sequence.
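As a minimal illustration of the two composition parameters defined above, the following Python sketch computes NCPR and FCR from a one-letter sequence. The residue sets (Lys/Arg counted as positive, Asp/Glu as negative, His treated as neutral), the function names, and the two toy sequences are assumptions made for illustration; they are not taken from the p27 constructs.

```python
# Sketch: NCPR = f+ - f- and FCR = f+ + f- from a one-letter sequence.
# Assumption: K/R are positive, D/E are negative, His is neutral.
POSITIVE = frozenset("KR")
NEGATIVE = frozenset("DE")


def charge_fractions(sequence):
    """Return (f_plus, f_minus), the fractions of positive and negative residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    f_plus = sum(res in POSITIVE for res in seq) / n
    f_minus = sum(res in NEGATIVE for res in seq) / n
    return f_plus, f_minus


def ncpr(sequence):
    """Net charge per residue."""
    f_plus, f_minus = charge_fractions(sequence)
    return f_plus - f_minus


def fcr(sequence):
    """Fraction of charged residues."""
    f_plus, f_minus = charge_fractions(sequence)
    return f_plus + f_minus


if __name__ == "__main__":
    # Hypothetical toy sequences with identical composition.
    for name, seq in (("mixed", "EKEKEKEKEK"), ("segregated", "EEEEEKKKKK")):
        print(name, ncpr(seq), fcr(seq))
```

Both toy sequences return the same NCPR and FCR although one is well mixed and the other fully segregated, which is the limitation that motivates a separate patterning parameter such as κ.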
A recent study probed the effect of altering κ-values on the conformational properties of the intrinsically disordered C-terminal domain of p27Kip1. 13 Sequence variants of p27(96−198) (referred to hereafter as p27-C) were generated by altering the charge patterning of the sequence between residues 100 and 180 while keeping the amino acid composition fixed. The sequence of the short linear motif (T187−P188−K189−K190) was kept constant since phosphorylation of T187 is the key signaling step that leads to p27 degradation and subsequent activation of Cdk2/cyclin A, which drives progression of the cell division cycle into S-phase. 14 Atomistic simulations and solution-phase small-angle X-ray scattering (SAXS) showed a clear inverse correlation between the κ-value and the ensemble-averaged radius of gyration (R_g) of 6 permutants of p27-C.
Native mass spectrometry is a promising technique for the study of IDPs. 15−17 The charge state distribution (CSD) following nanoelectrospray ionization (nESI) provides a measure of conformation heterogeneity of a protein in solution. Proteins are observed with discrete net charges, which in positive ionization mode are due to the differentially protonated forms of the protein. A given CSD is governed by the availability of the solvent-accessible ionizable residues within the protein as well as components of the solution. In the majority of cases, proteins that possess high degrees of secondary and tertiary structure in solution display narrow charge state distributions. This is indicative of a finite number of accessible protonation sites. Disordered proteins, however, primarily present broad CSDs that are attributable to heterogeneous ensembles of conformations, ranging from highly compact to highly extended states, with concomitant different numbers of surface exposed, ionizable sites. IM-MS experiments enable separation of these different charge states based on their size and shape and allow the measurement of rotationally averaged collision cross sections (CCSs).
Other MS-based techniques that have proven useful in the analysis of IDPs include hydrogen−deuterium exchange (HDX)-MS and cross-linking-MS. HDX-MS reports on the solvent accessibility of specific residues in a protein, thereby allowing localization of conformational changes to particular regions of the protein, provided that the backbone amide hydrogens are differentially protected; depending on the inherent flexibility of the IDP and the time window for analysis, HDX-MS can be highly revealing. 18,19 Cross-linking-MS reveals residues that are in close spatial proximity to one another, either from different proteins or within the same protein. This has allowed the characterization of protein−protein interactions involving IDPs, which can be difficult to achieve with traditional structural biology techniques. 20

Here, we report results from IM-MS experiments on p27-C as well as two sequence variants engineered to have κ-values of 0.14 and 0.56 (referred to as κ14 and κ56, respectively; wild-type p27-C has a κ-value of 0.31 and is referred to as κ31). The designed variants κ14 and κ56 were predicted, and experimentally verified, to be less and more compact than wild-type p27-C, respectively. 13 Analysis of these charge pattern variants using IM-MS reveals differences in charge-patterning-encoded conformational properties that are not detectable using conventional solution-phase experiments. Additionally, we investigated the impact of salt concentration on the sequence-specific conformational distributions of the variants and compared this to the salt dependence of the conformational distributions of the wild-type p27-C. Our results demonstrate sequence-specific and salt concentration-dependent conformational distributions, suggesting that the conformations of each permutant are heavily modulated by the salt content of the solution from which they are sprayed.
Further insights into the behavior of the sequence variants are gained by performing gas-phase molecular dynamics (MD) simulations.
■ METHODS
Protein Preparation. p27-C constructs were generated by insertion of synthetic DNA sequences (Integrated DNA Technologies) into a pET28a vector (Novagen). All variants were generated using the QuickChange II XL Site-Directed Mutagenesis Kit (Stratagene). p27 variants were expressed in E. coli and purified by Ni²⁺ affinity chromatography. His-tags were removed by cleavage with thrombin or TEV protease, and the proteins were further purified by reverse-phase HPLC. p27-C-κ56 has an internal thrombin site, and therefore the His-tag cleavage site was mutated to a TEV site. The purified proteins were buffer exchanged into either 10, 100, or 200 mM ammonium acetate pH 6.8 using Bio-Rad Micro Bio-Spin P6 columns (Bio-Rad, Hercules, CA, USA). Samples were subsequently diluted to 30 μM with an appropriate buffer solution.
Nanoelectrospray Ionization (nESI). All MS and IM-MS experiments were conducted using nanoelectrospray ionization. Samples were ionized from thin-walled glass capillaries (i.d. 0.9 mm, o.d. 1.2 mm, World Precision Instruments, Stevenage, UK) pulled in-house into nESI tips with a Flaming/Brown micropipette puller (Sutter Instrument Co., Novato, CA, USA). A positive potential of 1.6 kV was applied to the solution via a thin platinum wire (diameter 0.125 mm, Goodfellow, Huntingdon, UK).
Mass Spectrometry. All MS experiments were performed on a Q-ToF Global (Waters, Manchester, UK), with a sampling cone voltage of 60 V, a collision voltage of 5 V, a source temperature of 80 °C, a source pressure of 2.7 mbar, and a collision cell pressure of 2.3 × 10⁻³ mbar.
Journal of the American Chemical Society Article

Ion Mobility-Mass Spectrometry. IM-MS experiments were carried out on a Waters Q-ToF I instrument that was modified in-house to include a 5.1 cm drift tube, which has been described elsewhere. 21 The temperature and pressure of helium in the drift cell were approximately 28 °C and 4 Torr, respectively. Measurements were made at 6 different drift voltages from 60 to 20 V. The precise pressure and temperature were recorded for every drift voltage and used in the calculations of CCSs. Each experiment was performed in triplicate. Ion arrival time distributions were recorded by synchronization of the release of ions into the drift cell with the mass spectral acquisition. The CCS distribution plots are derived from raw arrival time data using eq 1, 22 where m and m_b are the masses of the ion and buffer gas, respectively; z is the ion charge state; e is the elementary charge; k_B is the Boltzmann constant; T is the gas temperature; ρ is the buffer gas density; L is the drift tube length; V is the voltage across the drift tube; and t_d is the drift time. The raw arrival time output (t_a) includes the time the ions spend outside of the drift cell but within the mass spectrometer, known as the dead time (t_0). The value for t_0 was calculated by taking an average value of the intercept from a linear plot of average arrival time versus pressure/temperature and was subtracted from the arrival time to give the drift time, t_d = t_a − t_0 (eq 2). All MS and IM-MS data were analyzed using Masslynx v4.1 software (Waters, Manchester, UK), ORIGAMI, 23 Origin v8.5 (OriginLab Corporation, USA), and Microsoft Excel.
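The conversion from arrival time to CCS described above can be sketched as follows. The expression used is the standard low-field (Mason-Schamp) drift tube relation, written with the symbols defined in the text; it is assumed here to correspond to eq 1 and is not copied from the typeset equation. The buffer gas density ρ is treated as a number density obtained from the ideal gas law, and the function names, unit conversions, and example inputs are illustrative assumptions.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
DALTON = 1.66053906660e-27   # atomic mass unit, kg
PA_PER_TORR = 101325.0 / 760.0


def number_density(pressure_torr, temperature_k):
    """Buffer gas number density (m^-3), assuming an ideal gas."""
    return pressure_torr * PA_PER_TORR / (K_B * temperature_k)


def drift_time(arrival_time_s, dead_time_s):
    """t_d = t_a - t_0: remove the time spent outside the drift cell."""
    return arrival_time_s - dead_time_s


def ccs_from_drift_time(t_d, z, mass_da, voltage, temperature_k,
                        pressure_torr, length_m=0.051,
                        buffer_mass_da=4.002602):
    """Collision cross section in square angstroms (assumed Mason-Schamp form).

    Defaults assume helium buffer gas and the 5.1 cm drift tube from the text.
    """
    m = mass_da * DALTON
    m_b = buffer_mass_da * DALTON
    rho = number_density(pressure_torr, temperature_k)
    omega_m2 = (
        math.sqrt(18.0 * math.pi) / 16.0
        * z * E_CHARGE / math.sqrt(K_B * temperature_k)
        * math.sqrt(1.0 / m + 1.0 / m_b)
        * t_d * voltage / length_m ** 2
        / rho
    )
    return omega_m2 * 1e20  # m^2 -> square angstroms


if __name__ == "__main__":
    # Hypothetical inputs: arrival and dead times, ion mass and charge.
    t_d = drift_time(arrival_time_s=1.0e-3, dead_time_s=0.3e-3)
    print(ccs_from_drift_time(t_d, z=7, mass_da=11000.0, voltage=60.0,
                              temperature_k=301.15, pressure_torr=4.0))
```

Because the pressure and temperature are passed per call, the same function can be applied to each drift voltage with the values recorded for that measurement.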
Global Collision Cross Section Distributions. The global CCS distributions were obtained by first interpolating the individual CCS distributions of each charge state so that they span an identical CCS range (0−3500 Å² with 50 Å² spacing) and subsequently summing them together to generate feature-rich distributions. The relative intensity of each charge state is equated to the integrated area of the CCS distribution of that charge state.
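The interpolate-then-sum procedure can be illustrated with a short Python sketch. Linear interpolation, zero intensity outside the measured range, and trapezoidal integration are assumptions chosen for simplicity, and the function names and toy input data are hypothetical; the text does not state which interpolation scheme was used.

```python
def make_grid(start=0.0, stop=3500.0, step=50.0):
    """Common CCS axis (square angstroms): 0-3500 in steps of 50."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def interpolate(xs, ys, grid):
    """Linearly resample (xs, ys) onto grid; xs must be strictly ascending.

    Points outside the measured range are set to zero (an assumption).
    """
    out = []
    j = 0
    for g in grid:
        if g < xs[0] or g > xs[-1]:
            out.append(0.0)
            continue
        while j < len(xs) - 2 and xs[j + 1] < g:
            j += 1
        x0, x1 = xs[j], xs[j + 1]
        y0, y1 = ys[j], ys[j + 1]
        out.append(y0 + (y1 - y0) * (g - x0) / (x1 - x0))
    return out


def trapezoid_area(xs, ys):
    """Integrated area under a sampled curve."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def global_ccs_distribution(per_charge_state, grid=None):
    """Sum per-charge-state CCS distributions on a shared grid.

    per_charge_state maps charge z -> (ccs_values, intensities).
    Returns (grid, summed_intensity, area_per_charge_state).
    """
    grid = make_grid() if grid is None else grid
    total = [0.0] * len(grid)
    areas = {}
    for z, (ccs, intensity) in per_charge_state.items():
        resampled = interpolate(ccs, intensity, grid)
        areas[z] = trapezoid_area(grid, resampled)
        total = [t + r for t, r in zip(total, resampled)]
    return grid, total, areas


if __name__ == "__main__":
    # Hypothetical triangular distributions for two charge states.
    data = {7: ([1000, 1100, 1200], [0, 1, 0]),
            8: ([1100, 1200, 1300], [0, 2, 0])}
    grid, total, areas = global_ccs_distribution(data)
    print(areas, max(total))
```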
Modeling of CCS Framework Boundaries. The procedure of calculating the lower and upper boundaries of the CCS distribution has been described elsewhere. 16 In brief, the lower boundary is predicted by assuming that the globular form of the protein is approximately spherical in shape with a density of ρ (0.904 Da/Å³). The volume of the protein sphere can be calculated via V = M_w/ρ, where M_w is the molecular weight of the protein. The radius of the sphere is therefore r = (3V/4π)^(1/3). The CCS of a sphere of this radius is given by eq 3, where a scaling factor of 1.19 is then applied for the conversion from geometric size to CCS in helium as previously outlined. 24 The upper CCS boundary is assumed for a protein structure that adopts a fully extended, rod-like conformation. In this case, on the basis of Cauchy's theorem, the average projected area of a convex solid, such as a rod modeled as a long and thin cylinder, is a quarter of its surface area, where the surface area is defined by eq 4. Thus, the upper CCS boundary can be calculated using eq 5, where l is the length of the cylinder (the contour length of the chain), defined from the distance between α-carbons in a protein chain (3.63 Å) such that for a given polypeptide chain with n residues l = 3.63n Å, and r is the radius obtained from the average radius from the volume of each amino acid as shown previously. The same scaling factor is applied to convert from a geometrical shape to a CCS_He value. These theoretical CCS limits are highly approximate and do not take into consideration proline residues, disulfide bridges, or noncovalent interactions or restrictions. Instead they serve as lower and upper bounds to which experimental results can be compared.
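A minimal Python sketch of the two framework bounds, built only from the geometric statements above, is given below. It assumes that the sphere bound is the projected area πr² multiplied by 1.19, and that the rod is a closed cylinder (side wall plus two end caps) whose surface area is divided by four and scaled by the same factor; the exact forms of eqs 3−5 are not reproduced in this text, so these expressions are a reading of the prose rather than the published equations. The cylinder radius is left as an input, and the function names and example values are hypothetical.

```python
import math

PROTEIN_DENSITY = 0.904   # Da per cubic angstrom, from the text
HELIUM_SCALING = 1.19     # geometric size -> CCS in helium, from the text
CA_SPACING = 3.63         # angstroms between alpha-carbons, from the text


def lower_ccs_bound(mass_da):
    """Compact (spherical) bound: scaled projected area of a sphere."""
    volume = mass_da / PROTEIN_DENSITY                      # V = M_w / rho
    radius = (3.0 * volume / (4.0 * math.pi)) ** (1.0 / 3.0)
    return HELIUM_SCALING * math.pi * radius ** 2


def upper_ccs_bound(n_residues, cylinder_radius):
    """Extended (rod-like) bound: a quarter of the cylinder surface area.

    Assumes a closed cylinder; cylinder_radius (angstroms) must be supplied
    by the caller, since its value is not given in the text.
    """
    length = n_residues * CA_SPACING
    surface = (2.0 * math.pi * cylinder_radius * length
               + 2.0 * math.pi * cylinder_radius ** 2)
    return HELIUM_SCALING * surface / 4.0


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    print(lower_ccs_bound(mass_da=11000.0))
    print(upper_ccs_bound(n_residues=103, cylinder_radius=3.0))
```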
Gas-Phase Desolvation Molecular Dynamics. The starting structures of the p27-C constructs were obtained from the converged, solution-phase Metropolis Monte Carlo (MC) simulations by Das et al.; 13 briefly, these simulations were carried out using the ABSINTH implicit solvation model with explicit representation of Na⁺ and Cl⁻ ions. Collision cross sections were computed for the ensembles obtained at two simulation temperatures, 298 and 328 K, each with three replicate runs.
All MD simulations were performed using the Amber15 molecular dynamics package and the Amber ff99SB force field. 25 These are gas-phase simulations in which no boundary conditions were imposed, the nonbonded cutoff was set to 999 Å, and a 1 fs time step was used. The SHAKE algorithm was used for all bonds involving hydrogen atoms.
In order to capture the charge state distribution from experimental results, a charge permutation protocol was developed to generate an ensemble of protein protomers. Two representative structures from each permutant were taken from the MC ensembles to give a broad description of the solution geometries. A total of 5000 protomers were constructed for each structure, resulting in 10 000 protomers for each protein. A new protomer was generated at each iteration of the protocol by randomly neutralizing negative charges while keeping positive residues protonated. Protomers with identical charge distributions were removed from the ensemble. The employed protonation protocol builds on previous methodology. 26 Subsequently, protomers were segregated based on their charge state, and each structure was subjected to steepest descent energy minimization and gas-phase equilibration to remove any unfavorable steric clashes. On the basis of the energy of the system, the ∼5 most energetically favorable structures were kept for further simulation (fewer if the number of protomers for a particular charge state was <5). The retained structures were heated and equilibrated at 300 K and then subjected to 10 ns of unrestrained vacuum simulation. In total, 100, 92, and 99 simulations were carried out for the κ14, κ31, and κ56 permutants, respectively.
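The charge permutation step can be sketched in Python as below. This is an illustrative reading of the description above, not the authors' protocol: the way a target charge state is drawn (uniformly from a chosen range), the integer site labels, and all function names are assumptions. Duplicates are removed by storing each protomer as the set of neutralized acidic sites.

```python
import random


def generate_protomers(n_positive, negative_sites, n_iterations,
                       charge_range=(5, 15), seed=0):
    """Randomly neutralize acidic sites while all basic sites stay protonated.

    negative_sites is a list of (hypothetical) site labels. Each protomer is
    the frozenset of neutralized sites, so identical charge distributions
    collapse to a single entry.
    """
    rng = random.Random(seed)
    base_charge = n_positive - len(negative_sites)  # all acidic sites charged
    protomers = set()
    for _ in range(n_iterations):
        # Assumption: the target charge state is drawn uniformly.
        target = rng.randint(*charge_range)
        k = target - base_charge  # number of acidic sites to neutralize
        if not 0 <= k <= len(negative_sites):
            continue
        protomers.add(frozenset(rng.sample(negative_sites, k)))
    return protomers


def group_by_charge(protomers, n_positive, n_negative):
    """Segregate protomers by net charge z."""
    groups = {}
    for neutralized in protomers:
        z = n_positive - (n_negative - len(neutralized))
        groups.setdefault(z, []).append(sorted(neutralized))
    return groups


if __name__ == "__main__":
    acidic = list(range(14))  # hypothetical labels for 14 acidic residues
    unique = generate_protomers(15, acidic, n_iterations=5000, seed=1)
    by_charge = group_by_charge(unique, 15, len(acidic))
    for z in sorted(by_charge):
        print(z, len(by_charge[z]))
```

In a full workflow each retained protomer would then be minimized and ranked by energy within its charge state; that step depends on the MD engine and is not shown.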
Finally, the lowest energy structure was extracted for each p27-C permutant from the [M + 7H]⁷⁺ MD simulation ensemble, placed in a water droplet consisting of ∼6000 TIP3P water molecules (radius of 30 Å), and placed in a vacuum. The droplet was then heated and equilibrated at T = 350 K for 1 ns. In order to simulate droplet desolvation, MD simulations were split into 500 ps segments at a constant temperature of 350 K for a period of 42.5 ns. At each interval, water molecules further than 40 Å from the protein surface were removed and the velocity of each atom was reassigned according to the Maxwell−Boltzmann distribution at the preset simulation temperature. The reason for splitting the simulation into smaller segments was twofold. (1) Due to the evaporative cooling 27,28 that occurs during desolvation, the temperature of the droplet decreases, potentially freezing the system. Reassignment of the velocities ensures that the temperature of the system remains constant, mimicking an Andersen thermostat. 29 (2) Removal of excess water reduces the number of particles in the system and significantly reduces the computational time required to simulate the droplet desolvation. In the final stages of the protocol, it was necessary to raise the temperature of the system to 400 K; this was required to remove the last remaining sticky waters. The temperature of 400 K was maintained for 5 ns until all water molecules had evaporated. The desolvation protocol described above broadly follows a previously described methodology. 30,31 All simulations were analyzed using Amber15's cpptraj module. Structural rearrangements, as well as protein desolvation, were monitored using the backbone radius of gyration (R_g), solvent accessible surface area (SASA), and CCS. VMD was used for visualization purposes, and an in-house developed MATLAB script was used to visualize hydrogen bond connectivity maps.
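The bookkeeping performed between the 500 ps segments can be sketched in plain Python, independent of any MD engine. The sketch assumes that the distance of a water molecule from the protein surface can be approximated by the minimum distance from its oxygen to any protein atom, and that velocities are redrawn component-wise from a Gaussian with variance k_B·T/m; both choices, along with the function names and toy coordinates, are illustrative assumptions rather than the protocol actually used.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann constant, J/K


def n_segments(total_ns=42.5, segment_ps=500.0):
    """Number of restart segments for the desolvation run."""
    return int(round(total_ns * 1000.0 / segment_ps))


def trim_waters(protein_atoms, water_oxygens, cutoff=40.0):
    """Keep waters whose oxygen lies within cutoff (angstroms) of the protein.

    Assumption: protein surface distance ~ minimum distance to any atom.
    """
    kept = []
    for w in water_oxygens:
        if min(math.dist(w, a) for a in protein_atoms) <= cutoff:
            kept.append(w)
    return kept


def maxwell_boltzmann_velocity(mass_kg, temperature_k, rng):
    """Draw one velocity vector (m/s) at the preset temperature."""
    sigma = math.sqrt(K_B * temperature_k / mass_kg)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))


if __name__ == "__main__":
    rng = random.Random(0)
    protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]   # toy coordinates
    waters = [(45.0, 0.0, 0.0), (51.0, 0.0, 0.0), (0.0, -39.0, 0.0)]
    for _ in range(n_segments()):
        # ... run one 500 ps MD segment with an external engine here ...
        waters = trim_waters(protein, waters)
        velocities = [maxwell_boltzmann_velocity(2.99e-26, 350.0, rng)
                      for _ in waters]
    print(len(waters), "waters remain")
```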
Collision Cross Section Calculations. CCS values were calculated using the exact hard sphere scattering method, as implemented in EHSSrot 32 with atom parametrizations of Siu et al. 33 The cross sections were calculated every 50 ps during the MD simulations and every 50 frames for Monte Carlo ensembles.
■ RESULTS
It is immediately obvious that the charge patterning in these proteins has a substantial effect on their mass spectra and on the CCS distributions that they occupy. Previous IM-MS studies have demonstrated how the charge state range, Δz, provides information on the extent of structure or disorder in the solution phase. 16 For proteins with a molecular mass below 100 kDa, empirical evidence has provided rules to help interpret the ESI-MS data. A protein with minimal dynamics in solution and a tightly configured structure will present with Δz ≤ 5. A value of Δz > 5 indicates a protein that is unfolded, either because it is sprayed from denaturing conditions or because of intrinsic disorder that results in a multiplicity of conformations in solution and a correspondingly high number of charging possibilities, and hence a broad CSD. A protein that has regions of both structure and disorder, or that fluctuates among several weakly energetically favorable structures, will present a Δz > 5, with higher occupancy in the lower charge states. The net charge, pI, FCR, and NCPR are parameters frequently used to distinguish compositional biases of IDPs, and these are shown in Figure 1d. Importantly, all three p27-C sequence variants have identical compositional parameters. Accordingly, to a first approximation, one might expect that all constructs should be characterized by similar conformational distributions. However, results from SAXS measurements show that the permutants exhibit different degrees of compaction. 13 While p27-C-κ31 has a solution R_g value of 28.1 Å, the value for p27-C-κ14 is slightly higher at 29.4 Å, which is still within the experimental error. However, the R_g for p27-C-κ56 is significantly lower at 23.3 Å, suggesting that this sequence prefers an ensemble of compact conformations, a feature also reflected in the MS and IM-MS results shown in Figure 1 and discussed below.
Mass Spectrometry and Ion Mobility Mass Spectrometry of p27-C Permutants. Das et al. 13 proposed that IDPs with different κ-values respond differently to changes in the concentration of solution ions. 12 We tested this hypothesis by examining each permutant in solutions with different salt concentrations to investigate the modulation of conformational features and resultant charge state distributions. Solutions of 10, 100, and 200 mM ammonium acetate were used and are referred to as low-, middle-, and high-salt solutions, respectively.
Focusing first on the high-salt solutions, stark differences were observed among the three permutants (Figure 1b,c); the MS profile for p27-C-κ14 is typical of a highly disordered protein with a large charge state range (Δz  16+ ; we propose that at this stage, the protein is present in a highly extended conformation and any addition of protons has a negligible effect on the overall dimensions. In contrast to the other two permutants, the CSD of p27-C-κ56 (Figure 1b, right) is narrow, with charge states between [M + 6H]⁶⁺ and [M + 13H]¹³⁺. This is suggestive of a protein with low conformational heterogeneity in solution, a feature that is consistent with the chain compaction observed in SAXS measurements and the lowered amplitudes of conformational fluctuations observed in the atomistic simulations of Das et al. 13 The overall observed CCS distribution is much narrower for p27-C-κ56 than for the other permutants. The observed ΔCCS is just 1750 Å² in contrast to 2400 and 2250 Å² for κ14 and κ31, respectively. However, the CCS range for each individual charge state is remarkably wide even though the increase in CCS with the addition of each proton is very small, indicating a broad ensemble of conformers that present with highly similar net charges.
Reducing the concentration of the ammonium acetate solution from which the proteins were sprayed and desolvated has a limited effect on the Δz from the CSDs, but the relative intensities of individual charge states alter (Figure S1). The most significant differences are found for the κ31 and κ56 permutants. In the case of p27-C-κ31, the dominant ions [

In terms of the IM-MS results (Figure 1c), the p27-C-κ14 protein variant displays a linear increase in CCSs for each successive charge state at 10 mM ammonium acetate (Figure S2a). For the medium and high salt conditions, the intensity of the [M + 8H]⁸⁺ ion decreases and the result is a marked jump between compact and extended forms from ∼1200 Å² at [M + 7H]⁷⁺ to 1900 Å² at [M + 8H]⁸⁺ and 2100 Å² at [M + 9H]⁹⁺. The p27-C-κ31 still follows a more even increase in its CCS values with charge; however, the increased relative intensities of the higher charge states led to their broadening. Finally, the p27-C-κ56 permutant retains a wide CCS distribution for each charge state, with the [M + 6H]⁶⁺ being the dominant species at low salt concentration (Figure S2c

Effect of Salt Concentration on Global Conformations. The global CCS distribution of the p27-C variants at each experimental condition (Figure 2) summarizes the overall conformational heterogeneity in terms of CCS. First, the CCS distribution profile of p27-C-κ14 sprayed from the low salt solution (Figure 2a) shows that this construct is free to access almost any shape under these experimental conditions, suggesting that there are small energetic barriers to switching between conformers in solution. The absence of abrupt changes of ion intensity with respect to CCS supports this suggestion. In contrast, the conformational profiles of p27-C-κ14 sprayed from middle- (Figure 2b) and high-ionic (Figure 2c) strength solutions suggest that the protein is stabilized in extended conformational states with CCSs centered around 2000 Å². A small proportion of molecules are present in more compact conformations with CCS values in the range from 750−1500 Å².
As previously mentioned, a conformational change appears to occur at around 1500 Å², with conformations of higher surface areas being more highly populated.
When sprayed from the low-salt solution (Figure 2d), p27-C-κ31 exists in a range of conformations, similar to p27-C-κ14 when sprayed from equivalent conditions. The medium-salt solution (Figure 2e) appears to stabilize extended conformations above 2000 Å² for p27-C-κ31, which is also similar to what we observe for p27-C-κ14, but here the smaller conformations below 1500 Å² are more easily accessed. When p27-C-κ31 is sprayed from the high-salt solution (Figure 2f), more compact conformations are preferred, indicating a switch between the conformational states preferred in 100 mM versus 200 mM salt. The reason for this is not known, but we can speculate that the WT sequence has a patterning of charged residues that enables such behavior, which may relate to its biological function.
When p27-C-κ56 is sprayed from 10 mM ammonium acetate (Figure 2g), the protein adopts compact states, with most of the intensity being around 1000 Å², displaying significantly less heterogeneity in its CCS than the other permutants. As the salt content is increased to 100 mM (Figure 2h), the protein experiences a slight shift in the conformational landscape; the most intense peak shifts from 1150 to 1600 Å², indicating that most of the molecules are now in a more extended conformation, and the minima and maxima of the CCS distribution are both now 250 Å² larger. A high salt concentration (Figure 2i) causes further depletion of the previously dominant conformation around 1250 Å² and leads to an increase in the intensity of the conformations at 1600 and 2000 Å².
Effect of Protein Charge on Collision Cross Section Distributions by MD Simulations. Gas-phase MD simulations were employed to gain insight into the behavior of solution-derived structures from the Monte Carlo simulation in the absence of solvent. In order to achieve this, two representative structures from the MC ensembles (one compact and one extended) were selected as seed structures to create an ensemble of charge permutants (protomers) in the charge state range of [M + 5H]⁵⁺ to [M + 15H]¹⁵⁺, which spans most of the experimentally measured CSD (Figure 1d). The charge state of the protein was adjusted by selectively neutralizing negatively charged amino acids. On the basis of the number of positively and negatively charged residues of the p27 permutants (15 and 14, respectively), the maximum number of charge combinations covering the charge states between [M + 5H]⁵⁺ and [M + 15H]¹⁵⁺ was 15 913. The charge permutation process created 10 000 protomers for each permutant, all of which were energy minimized, and the ∼100 lowest energy protomers were selected; between 2 and 10 protomers were present for each simulated charge state. It is worth noting that protomers with minimal Coulombic energy might not necessarily have the highest probability to exist experimentally; however, it is likely that highly probable protomers were selected, despite the multitude of charge permutations available for even the smallest proteins. 28,34 Simulation of multiple charge states of the protein was motivated by the desire to better represent the heterogeneous nature of the CSD observed experimentally. Normally, simulations would be performed on either the net charge of the protein ([M + 1H]¹⁺), which is not observed experimentally, or a single charge state based on the pK_a value at the selected pH.
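The size of the protomer space can be estimated with a short combinatorial count. The sketch below assumes that all 15 basic residues stay protonated and that a charge state z is reached by neutralizing k of the 14 acidic residues, so that z = 15 − (14 − k); the function names are hypothetical and the enumeration rule is an assumption about how the count was made.

```python
from math import comb


def charge_state(n_positive, n_negative, n_neutralized):
    """Net charge when n_neutralized acidic sites carry no charge."""
    return n_positive - (n_negative - n_neutralized)


def count_protomers(n_positive=15, n_negative=14, z_min=5, z_max=15):
    """Number of distinct neutralization patterns per charge state."""
    counts = {}
    for k in range(n_negative + 1):
        z = charge_state(n_positive, n_negative, k)
        if z_min <= z <= z_max:
            counts[z] = comb(n_negative, k)
    return counts


if __name__ == "__main__":
    counts = count_protomers()
    for z in sorted(counts):
        print(z, counts[z])
    print("total", sum(counts.values()))
```

With no acidic sites neutralized the same rule gives a net charge of +1, matching the [M + 1H]¹⁺ state mentioned above. Under these assumptions the total comes to 15 914, one more than the figure quoted in the text; the origin of that single-case difference is not evident from the description.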
Focusing on the p27-C-κ14 permutant first (Figure S4), the extended conformers with charge states between [M + 5H]⁵⁺ and [M + 11H]¹¹⁺ were found to undergo average structural compaction of −3% to 9% when compared to the equilibrated starting structure at t = 0 ns (not taking the equilibration time into account). Simultaneously, the compact conformer only experienced minor structural rearrangement, which caused an expansion of ∼1%.  Table S1.
The results from solution-based MC and gas phase MD simulations are summarized in Figure 3. Das et al. 13 previously used the structures from MC simulations to accurately represent the solution phase SAXS data; however, these structures were less successful for structural assignment for the gas phase ions. The distribution obtained from the MC ensemble was successful in accounting for the extended conformers, typically associated with higher charge states, while the compact states were inaccessible; this is not surprising since the MC ensembles were generated in the presence of implicit solvent with dielectric constant (ε) of 78, which would weaken any long-range interactions compared to the vacuum of a mass spectrometer (ε = ∼1). In contrast, the gas-phase MD simulations provided better correspondence with the experimental data for more compact structures and this MD methodology accounts for the majority of the experimentally measured CCSs, although still fails to provide exemplar conformational states for the most compact forms we measure. Borysik et al. 26 have previously stated that in order to represent the extremely compact conformers of IDPs, it is necessary to first activate solvated structures in a simulated annealing approach to overcome any energy barriers that might prevent conformational collapse; however, this approach was not applicable for high charge states of the protein, as it was found to induce large deformations of the structures.
Analysis of Protein Desolvation using Molecular Dynamics Simulations. In order to mimic the gradual transfer of the protein from the solution into the gas phase, we performed additional computations in which each permutant was immersed in a droplet of water and subjected to stepwise water evaporation. Desolvation was carried out on the [M + 7H]⁷⁺ charge state of the κ14, κ31, and κ56 permutants. In each case, the simulation was performed on a compact structure, representative of the protein ensemble. The [M + 7H]⁷⁺ ion was selected as it lies below De La Mora's interpretation of the Rayleigh limit (z = 8.2) and is experimentally present in a compact conformation. The protein is solvated in ∼6000 water molecules without counterions or free protons, maintaining the starting charge state throughout the evaporation process. Figure 4a−c shows representative snapshots for the desolvation of droplets containing the p27-C permutants, while the time-dependent simulation results are shown in Figure S8. In agreement with previous studies by Consta et al. 35,36 and Kim et al., 37 due to the lack of fissile ions such as Na⁺ or NH₄⁺, as the size of the droplet decreased and the ratio of charge to droplet volume increased, spike-like protrusions developed on the surface of the droplet. In the early stages of the simulation (0−5 ns), the protein structure undergoes minor rearrangement, following loss of favorable protein−water contacts. The structural changes are exemplified by an increased radius of gyration (R_g), solvent accessible surface area (SASA), and CCS. Following the initial conformational changes, p27-C-κ31 collapsed to a more compact form with a CCS of 1295 Å² (∼6.5% smaller than at t = 0 ns). The κ14 permutant showed similar conformational broadening, as indicated by an increase in the R_g, SASA, and CCS; however, as the droplet size decreased, the CCS was only reduced by 3% to 1350 Å² (Figure S7).
The κ56 permutant desolvation MD trajectory was started from a slightly larger conformation; however, it also exhibited initial conformational expansion and subsequent size reduction (Figure S9). In this case, the CCS was reduced from 1475 Å² to 1305 Å², an approximately 13% reduction in size, also reflected in decreases in SASA and Rg.
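The radius of gyration used above to track expansion and collapse can be computed directly from atomic coordinates. The following is a minimal sketch of the standard mass-weighted definition; the function name and the toy coordinates are hypothetical and are not taken from the simulations reported here.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) tuples.

    If no masses are given, every particle is assumed to have unit mass.
    The result has the same length unit as the input coordinates.
    """
    if not coords:
        raise ValueError("coords must not be empty")
    if masses is None:
        masses = [1.0] * len(coords)
    if len(masses) != len(coords):
        raise ValueError("coords and masses must have the same length")

    total_mass = sum(masses)
    # Centre of mass, one component at a time.
    com = tuple(
        sum(m * c[k] for m, c in zip(masses, coords)) / total_mass
        for k in range(3)
    )
    # Mass-weighted mean squared distance from the centre of mass.
    msd = sum(
        m * sum((c[k] - com[k]) ** 2 for k in range(3))
        for m, c in zip(masses, coords)
    ) / total_mass
    return math.sqrt(msd)


# Hypothetical toy geometries (arbitrary length units), for illustration only.
compact = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
extended = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (9, 0, 0)]

if __name__ == "__main__":
    print("compact Rg: ", radius_of_gyration(compact))
    print("extended Rg:", radius_of_gyration(extended))
```

In practice, a trajectory analysis package would evaluate this quantity frame by frame; the sketch only makes the definition explicit.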
Interestingly, the hydrogen bond network maps shown in Figure 4d−f showcase the number and importance of hydrogen bonds present during the desolvation protocol. In the case of the κ14 variant, a high number of short-distance hydrogen bonds are observed, in particular in the region of residues 80−105. This is most likely due to the close proximity of the oppositely charged residues within the amino acid sequence, which prevents the formation of a fully collapsed structural form of the protein, as indicated by the broad CSD and the preference toward higher charge states. In contrast, the κ56 permutant was found to preferentially form hydrogen bonds between the charged patches at residues 30−36, 71−76, and 99−105, which results in the formation of compact conformations. The segregation of oppositely charged residues for the wild-type (κ31) is between that of κ14 and κ56. Accordingly, the hydrogen bond network consists of numerous short- and long-distance contacts.

Figure 3. […] 13 charge permutation molecular dynamics (MD), and the CCS range predicted using the framework method (marked as horizontal lines). 16 The width in the violin plots represents the signal intensity of the experimentally measured distributions and population density of the in silico determined structures. An interactive version of this figure is available online at https://beveridge-migas-p27.netlify.com/assets/Figure_3.html.

Journal of the American Chemical Society

■ DISCUSSION
As evidenced by the large differences in the CCS distributions for the p27-C κ-value permutants (Figures 1 and 2), the patterning of charged residues affects the global conformations of disordered protein chains. The p27-C-κ14 permutant displays well-spaced charged residues in its linear sequence, while the p27-C-κ56 variant exhibits dense clusters of oppositely charged residues. Consequently, for p27-C-κ56 there is an increased likelihood of long-range attraction and charged-residue pairing, resulting in more compact conformations. Previous SAXS measurements 13 support this, as do the experimental findings reported here. IDPs with high fractions of charged residues (FCR ≥ 0.3) and lower κ-values are predicted to have conformational properties similar to self-avoiding random walks due to a counterbalancing of intrachain electrostatic attractions and repulsions. 12 This screening of intrachain repulsions by attractions renders sequences with low κ-values insensitive to changes in salt concentration. In contrast, IDPs with higher κ-values are expected to adopt more compact conformations in solvents with low excess salt. This is because of favorable intrachain electrostatic attractions between blocks of oppositely charged residues. Increasing the salt concentration weakens these attractions, thereby engendering chain expansion.
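To make the sequence descriptors in this argument concrete, the sketch below computes the fraction of charged residues (FCR) and a blob-averaged charge asymmetry δ, the quantity that κ normalizes by its maximum over sequences of the same composition. Several choices here are assumptions for illustration: only K/R and D/E are counted as charged, a single sliding blob size of 5 residues is used, the normalization to κ is omitted, and the two toy sequences are hypothetical rather than the p27 permutants.

```python
POSITIVE = set("KR")  # assumption: His is treated as neutral
NEGATIVE = set("DE")


def charge_fractions(seq):
    """Return (f_plus, f_minus), the fractions of positive and negative residues."""
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    f_plus = sum(res in POSITIVE for res in seq) / n
    f_minus = sum(res in NEGATIVE for res in seq) / n
    return f_plus, f_minus


def fcr(seq):
    """Fraction of charged residues, f_plus + f_minus."""
    f_plus, f_minus = charge_fractions(seq)
    return f_plus + f_minus


def sigma(seq):
    """Charge asymmetry (f_plus - f_minus)^2 / (f_plus + f_minus); 0 if uncharged."""
    f_plus, f_minus = charge_fractions(seq)
    total = f_plus + f_minus
    return 0.0 if total == 0 else (f_plus - f_minus) ** 2 / total


def delta(seq, blob=5):
    """Mean squared deviation of blob-level asymmetry from the whole-chain value."""
    if len(seq) < blob:
        raise ValueError("sequence is shorter than the blob size")
    overall = sigma(seq)
    blobs = [seq[i:i + blob] for i in range(len(seq) - blob + 1)]
    return sum((sigma(b) - overall) ** 2 for b in blobs) / len(blobs)


# Hypothetical toy sequences with identical composition.
well_mixed = "EKEKEKEKEK"
segregated = "EEEEEKKKKK"

if __name__ == "__main__":
    for name, seq in (("well mixed", well_mixed), ("segregated", segregated)):
        print(f"{name}: FCR = {fcr(seq):.2f}, delta = {delta(seq):.4f}")
```

Both toy sequences have the same FCR, but the segregated one yields a much larger δ, mirroring the distinction between low-κ and high-κ permutants of identical amino acid composition.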
To understand the success of ion mobility mass spectrometry in delineating the conformational variability arising from charge segregation in IDPs, and to explain the broad agreement with solvated measurements as well as the additional contributions from highly compact forms, it is critical to consider the transition from solution to the gas phase. The process by which molecules leave the droplet solution and become gaseous ions has long been debated. 38−40 The accepted view is that ions with well-defined, globular structures follow the charge residue model (CRM) of desolvation, where it is hypothesized that Rayleigh-charged nanodroplets contain a single molecule of solute and evaporate to dryness; as the droplet shrinks, excess charges are lost via fission events and the remaining charges are transferred to the protein during the final stages of desolvation. Ions produced via this mechanism tend to have lower resultant charge states. An alternative model of desolvation for disordered proteins is the chain ejection model (CEM) proposed by Konermann et al. 40 In the CEM, unfolded proteins have larger solvent accessible surface areas that expose their hydrophobic regions; these proteins are likely to migrate to the surface of the droplet and, when their terminus is exposed to the gas phase, the remaining part of the structure is pulled with it. In contrast to the CRM, ions produced via the CEM have higher charge states. This model applies if and only if the IDPs are akin to random coils or self-avoiding walks. Since IDPs can sample a broad spectrum of conformations, ranging from those that are as compact as folded proteins (or even more so) to fully extended chains, they are unlikely to undergo only the CEM, because ejection of a compact region via the CEM would be unfavorable. In light of this, we previously proposed that a hybrid of the CRM and the CEM will govern the generation of the intermediate charge states that are present in a multitude of conformational families. 18 Considering the observations made from MS and IM-MS results for the three permutants, we propose that the high abundance of high charge states of p27-C-κ31 (WT) at low and medium salt (10 and 100 mM ammonium acetate) is predominantly governed by the CEM. 40 By contrast, at high salt (200 mM ammonium acetate) the intensity of the low charge states increases dramatically, which could be attributed to alterations of the conformational space of the proteins, modulated by the higher salt concentration and consequently resulting in preferential desolvation via the CRM. The CSD of the κ14 permutant indicates a preference toward higher charge states, irrespective of the buffer environment; hence, the CEM is dominant, although the compact conformations observed for the lowest charge states would have been produced via the CRM.
The maximum number of charges that a spherical conformation of a protein the size of p27-C can hold is 8.2, as determined by De la Mora's interpretation of the Rayleigh limit, 24 which implies that all observed ions above [M + 8H]⁸⁺ are characterized by extended or at least partially extended conformations. Interestingly, a significant change in intensity occurs between charge states [M + 8H]⁸⁺ and [M + 9H]⁹⁺ for the κ31 and κ14 forms. The threshold at which the apparent conformational switch occurs for κ56 appears to be lower, between charge states [M + 7H]⁷⁺ and [M + 8H]⁸⁺, according to the change in the signal intensity in the mass spectra. Moreover, [M + 8H]⁸⁺ is also the charge state at which a conformational switch occurs ([M + 7H]⁷⁺ for κ56), perhaps also indicative of a change in the desolvation mechanism.
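The Rayleigh-limit argument can be reproduced with a short calculation. The sketch below assumes the commonly quoted empirical form of De la Mora's relation for a compact, water-like protein ion, z_R ≈ 0.0778·√M with M in Da. The molecular mass used is a hypothetical value of about 11.1 kDa, chosen because it reproduces the quoted limit of 8.2; it is not stated in this section.

```python
import math

# Assumed empirical prefactor (charges per sqrt(Da)) for a globular protein
# ion with the density and surface tension of water.
DE_LA_MORA_PREFACTOR = 0.0778

# Hypothetical mass for illustration only; not a value reported in the text.
ASSUMED_MASS_DA = 11_100.0


def rayleigh_charge_limit(mass_da):
    """Estimated maximum charge a compact, spherical ion of this mass can hold."""
    if mass_da <= 0:
        raise ValueError("mass must be positive")
    return DE_LA_MORA_PREFACTOR * math.sqrt(mass_da)


def highest_compact_charge_state(mass_da):
    """Largest integer charge state that lies at or below the estimated limit."""
    return math.floor(rayleigh_charge_limit(mass_da))


if __name__ == "__main__":
    z_limit = rayleigh_charge_limit(ASSUMED_MASS_DA)
    print(f"Estimated Rayleigh limit: z = {z_limit:.1f}")
    print(f"Highest compact charge state: {highest_compact_charge_state(ASSUMED_MASS_DA)}+")
```

Under these assumptions, charge states up to 8+ are compatible with a compact, spherical ion, while 9+ and above require at least partial extension, in line with the intensity change discussed above.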
The CSD and CCS distributions obtained from MS and IM-MS experiments highlight the conformational diversity of the three permutants when transferred from solution to the gas phase. It appears that during the desolvation process, p27-C ions undergo significant structural rearrangement, broadly increasing the conformational space in comparison to the SAXS measurements. This observation is supported by the CCSs obtained for computational models from the SAXS ensemble mentioned above. The CCS distributions of the three permutants span a wide range, yet the Monte Carlo structures were only able to account for the most extended conformational families. The additional pool of structures created during in vacuo MD accessed the intermediate conformations; however, it was still unable to provide models for the highly compact conformations. Finally, using a previously reported IM-MS framework 16 to estimate the smallest and largest possible CCS a protein can adopt based purely on its amino acid composition highlights the structural heterogeneity of p27-C, as the κ14 and κ31 permutants occupy nearly the entire width of the available CCS range, while the κ56 permutant covers a narrower range (Figure 3). The findings reported herein are in agreement with previous studies 26 that highlight how nESI/ESI processes enable the creation of low charge states in self-solvated compact states, the extent of which is modulated by the solution conditions and, in our case, by the distribution of charged residues along the amino acid sequence.

■ CONCLUSIONS
A variety of factors can affect the mass spectra of proteins; however, the charge state distribution is predominantly affected by the solution-phase conformation, which is in turn modulated by the solvent composition and in part by the ESI process. Here, we demonstrate how MS and IM-MS methods can be used to investigate the conformational diversity of a set of intrinsically disordered proteins, p27-C and two of its permutants in which the charge patterning within the primary amino acid sequence was altered. The proteins were studied qualitatively, illustrating how the conformational effects of small changes to the amino acid sequence depend on the ionic strength of the buffer. Both MS and IM-MS results clearly delineated the different permutants and highlighted how IDPs in which charged residues are clustered closely together (high κ-value) form more compact conformations, while those with an even distribution of charged residues along the amino acid sequence demonstrated increased conformational diversity. The experimental results were supplemented by comparison with solution-derived and classical MD structures, highlighting the level of compaction occurring once p27-C ions enter the gas phase, while water evaporation MD showed the sequential water loss and structural collapse upon desolvation of the ion, in a process akin to the CRM desolvation model.

Author Contributions
∥ RB and LGM contributed equally to this work.

Notes
The authors declare no competing financial interest. The data and all essential metadata that support this study are available from the corresponding author on request and in the interactive data plots as described. A number of figures presented in this article were recreated in an interactive format to enable in-depth interrogation of the presented results. These are deposited online at https://github.com/BarranLab/Beveridge_Migas_p27_2018 and can be viewed at https://beveridge-migas-p27.netlify.com.

■ ACKNOWLEDGMENTS
LGM would like to thank Dr.