Tuning the Continuum of Structural States in the Native Ensemble of a Regulatory Protein

The mesoscale nature of proteins allows for an efficient coupling between environmental cues and conformational changes, enabling their function as molecular transducers. Delineating the precise structural origins of such a connection and the expected spectroscopic response has, however, been challenging. In this work, we combine urea–temperature double perturbation experiments with theoretical modeling to probe the conformational landscape of Cnu, a natural thermosensor protein. We observe unique ensemble signatures that point to a continuum of conformational substates in the native ensemble. These substates respond intricately to perturbations, as monitored through secondary and tertiary structure, distances between an intrinsic FRET pair, and hydrodynamic volumes. Binding assays further reveal a weakening of the Cnu functional complex with temperature, highlighting the molecular origins of signal transduction critical for pathogenic response in Enterobacteriaceae.

It is well established that proteins sample a variety of functionally relevant conformations in their native ensemble. 1,2 This flexibility arises from the weak noncovalent nature of the stabilizing interactions, the large number of degrees of freedom associated with the main chain, and the finite sizes of protein molecules. The resultant mesoscopic nature of proteins translates to large surface-area-to-volume ratios, thus contributing to specific interactions with the solvent molecules and large solvent-coupled fluctuations even at thermodynamic equilibrium. 3,4 Solvent properties can, therefore, be tuned either by temperature or by adding cosolvents (urea or guanidinium hydrochloride) to modulate these interactions and hence perturb the folding landscape of proteins. Perturbation experiments have accordingly contributed immensely to the understanding of protein properties, particularly in two-state-like systems. 5,6 In such proteins, adding cosolvents tunes the relative macroscopic populations of the folded and unfolded states, resulting in distinct sigmoidal unfolding curves. However, it has generally been challenging to extract or interpret the origin of signals beyond a simple two-state equilibrium because of the complexity intrinsic to such analysis. 7 In fact, double-perturbation experiments involving cosolvents and temperature reveal distinct signal dependencies in globally downhill and incipient downhill folders, 8,9 arising from differences in the structural features of the ensembles that are populated in response to one perturbation and tuned by the other. Such an intrinsically tunable landscape allows proteins to act as molecular transducers or rheostats; that is, they couple changes in ambient conditions to their conformations, which in turn can determine the functional response. 10
In this regard, it was recently identified that the four-helix bundle protein Cnu (Figure 1a), a single gene product, displays thermosensor-like properties that are critical for efficient pathogenic response in Enterobacteriaceae that commonly infect human gastrointestinal tracts. 11 Global spectroscopic and site-specific NMR experiments, hydrodynamic measurements, theoretical modeling, and simulations indicate that the Cnu native ensemble is best described by an array of conformational states that are in dynamic equilibrium with one another in a single broad native well. If this is indeed the case, then solvent modulations with chemical denaturants together with thermal perturbations should result in nontrivial effects on the folding landscape. Moreover, the cosolvent- or temperature-dependent spectroscopic signatures are expected to differ from conventional observations. 5 To explore these issues in detail, we monitor the response of the native ensemble of Cnu to perturbations by urea and temperature with far- and near-UV circular dichroism (CD), fluorescence (specifically, tyrosine-tryptophan resonance energy transfer), hydrodynamic measurements, and simulations, and we also perform binding studies.
We first probe the features of the Cnu folding landscape with a variant of the statistical mechanical Wako-Saito-Muñoz-Eaton (WSME) model. 12–14 Using the same parameters as in a previous study, a chemical denaturant dependence is introduced following the linear free-energy relation commonly observed in experiments 15 (see the Supporting Information) and as employed before. 16 Such a perturbation reveals that the native ensemble of Cnu can be coarsely divided into two subensembles, N and N* (Figure 1b). Importantly, their properties vary as a function of both temperature and cosolvent concentration. This can be seen as horizontal shifts in the positions of the population maxima or free-energy minima of both N and N* toward more disorder (Figure 1b). This suggests that probes that are sensitive to the fine features of the landscape should reveal distinct spectroscopic signals under each of the conditions. In addition, the populations of the native ensemble, N*, and the relative difference in population between N and N* are predicted to result in sigmoidal, broad and near-parabolic, and linearly decreasing temperature dependencies at individual cosolvent concentrations, respectively (Figure 1c–e). The resulting apparent Tm, measured as the temperature at which the signals cross over in sign, should follow a linear trend with urea (Figure 1f).
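The linear dependence of an apparent melting temperature on denaturant can be illustrated with a deliberately simplified two-state caricature. This is not the WSME calculation itself, whose denaturant term is described in the Supporting Information. The symbols below are assumptions introduced only for this sketch: ΔH_m is the unfolding enthalpy at the zero-denaturant midpoint T_m, m is a temperature-independent denaturant m-value, [D] is the denaturant concentration, and the heat-capacity change on unfolding is neglected.

```latex
% Two-state sketch (assumed form, not the WSME expression used in the paper):
% linear free-energy relation in denaturant plus a van 't Hoff temperature term.
\begin{align}
  \Delta G_{\mathrm{unf}}(T,[D]) &= \Delta H_m\left(1 - \frac{T}{T_m}\right) - m\,[D] \\
  % The apparent midpoint is where the free energy of unfolding vanishes:
  \Delta G_{\mathrm{unf}}\bigl(T_m^{\mathrm{app}},[D]\bigr) = 0
  \quad &\Longrightarrow \quad
  T_m^{\mathrm{app}}([D]) = T_m\left(1 - \frac{m\,[D]}{\Delta H_m}\right)
\end{align}
```

Under these assumptions the apparent midpoint falls linearly with [D], with slope −m·T_m/ΔH_m, which is the qualitative behavior the model predicts in Figure 1f.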
Cnu has one tryptophan in the fourth helix and five tyrosines that are distributed throughout the structure. In proteins rich in aromatic residues, near-UV CD spectral analysis can provide detailed structural information because the spectra are sensitive to the tertiary packing environment of tyrosine and tryptophan. 17 The near-UV CD spectral signatures of Cnu at four different urea concentrations (0 to 3 M) reveal distinct overall and relative amplitudes for the spectral bands (265, 270, 280, and 290 nm), clearly indicating that distinct ensembles are populated at these urea concentrations (for example, compare the spectra at 1 and 3 M urea in Figure 2a). Remarkably, the signals at 290 and 280 nm follow the same trends predicted by the WSME model, suggesting that they probe the overall population of the native ensemble and that of N*, respectively (Figure 2b,c). Singular-value decomposition (SVD) analysis of the temperature–wavelength spectra reveals an anticorrelation between the bands of tyrosine and tryptophan in the second component, which reports on spectral changes (Figure 2d). The amplitude of this component decreases linearly with temperature and changes sign (positive to negative) at specific temperatures as a function of urea (Figure 2e). This observation is also in accordance with the predictions of the WSME model, which attributes this dependence to the differences in the populations of N and N* (Figure 1e).
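The SVD step described above can be sketched on synthetic data. Everything numerical here is an assumption made for illustration: the band positions and widths, the amplitudes, the matrix orientation (rows are wavelengths, columns are temperatures), and the helper name `second_component_crossing`. None of it is the measured Cnu data. The sketch builds a mean spectrum plus an anticorrelated tyrosine/tryptophan difference spectrum whose weight falls linearly with temperature, then recovers the temperature at which the second-component amplitude changes sign.

```python
import numpy as np

def second_component_crossing(signal, temperatures):
    """Temperature at which the second SVD amplitude changes sign (or None).

    signal: 2D array with rows = wavelengths and columns = temperatures.
    """
    # Thin SVD: columns of u are basis spectra, rows of vt their temperature profiles.
    u, s, vt = np.linalg.svd(signal, full_matrices=False)
    amplitude = s[1] * vt[1]  # overall sign is arbitrary; the zero crossing is not
    for i in range(len(temperatures) - 1):
        a0, a1 = amplitude[i], amplitude[i + 1]
        if a0 * a1 <= 0 and a0 != a1:
            t0, t1 = temperatures[i], temperatures[i + 1]
            # Linear interpolation to the zero of the amplitude.
            return t0 - a0 * (t1 - t0) / (a1 - a0)
    return None

# Synthetic, noise-free spectra (all values hypothetical).
wavelengths = np.arange(260.0, 311.0, 1.0)        # nm
temperatures = np.linspace(278.0, 358.0, 16)      # K
tyr = np.exp(-0.5 * ((wavelengths - 280.0) / 4.0) ** 2)
trp = np.exp(-0.5 * ((wavelengths - 290.0) / 4.0) ** 2)
mean_spectrum = 5.0 * (tyr + trp)                 # temperature-independent part
difference_spectrum = tyr - trp                   # anticorrelated bands
true_crossing = temperatures.mean()               # 318 K by construction
weights = 0.01 * (true_crossing - temperatures)   # linear decrease, sign change

signal = (np.outer(mean_spectrum, np.ones_like(temperatures))
          + np.outer(difference_spectrum, weights))

crossing = second_component_crossing(signal, temperatures)
print(f"second-component sign change near {crossing:.1f} K")
```

The synthetic components are constructed to be mutually orthogonal so that the SVD returns them cleanly. With real, noisy spectra the second component is generally a mixture, and the crossing temperature should be read with that in mind.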
The urea-dependent far-UV CD signal at 222 nm again displays a pattern that has not been reported in any protein system: the signal intensity increases (becomes more negative) with the urea concentration, reaches a plateau, and then decreases in intensity in a sigmoidal fashion (Figure 2f and Supporting Information Figure S1). The position of the minimum moves toward lower urea concentrations and concomitantly decreases in magnitude, suggestive of a malleable native ensemble. What could contribute to this unique dependence? Careful analysis of the spectral features of far-UV CD bands in proteins has shown that tyrosine exhibits a strong positive band when in a helical conformation. 18 The fact that Cnu has five tyrosines and that the signal intensity increases with urea suggests that some tyrosines populate nonhelical conformations even at 298 K, despite the overall structure appearing to be folded. As the temperature is increased, the probability of the unfolded ensemble increases, thus resulting in a decrease in the signal intensity. In other words, the observed rollover in far-UV CD signals at 222 nm arises from a delicate balance between these two features.

The Journal of Physical Chemistry Letters
Letter

Circular dichroism experiments show that the native ensemble of Cnu changes both its secondary and tertiary structure in distinct ways under solvent perturbations. The structural changes are more probable in the C-terminal helix because of its weak packing and the large conformational flexibility of the loop connecting the third and fourth helices. 11 We therefore expect the relative distance between W67 (located in the fourth helix) and Y40 (in the third helix; Figure 1a) to increase with perturbation magnitude within the native ensemble, that is, in the pretransition region where there is only a minimal population of the unfolded state.
The quantum yield (QY), as estimated by exciting the protein at 274 nm, increases with urea concentration, reaches a plateau, and then decreases sigmoidally, mirroring far-UV CD observations (Figure 3a). The apparent chemical midpoint is estimated to be ∼5.3 M at 298 K from a first-derivative analysis of the QY data. To understand the possible structural changes that contribute to this unique observation, we perform a global SVD of the raw temperature/urea-wavelength fluorescence data. The spectral deconvolution results in two significant components. The first represents the average spectrum, and its amplitude accounts for the intrinsic temperature dependence of fluorescence (Figure S2). The second SVD component again points to an anticorrelation similar to that observed in near-UV CD spectral analysis, but this time between the emission bands of tyrosine (emission maximum ∼305 nm) and tryptophan (emission maximum ∼340 nm; Figure 3b). However, unlike near-UV CD signals, which are challenging to interpret, the changes in fluorescence intensities have a clear structural origin. Specifically, there can be FRET (Förster resonance energy transfer) between tyrosine (donor) and tryptophan (acceptor) if they are close in space. In fact, the Cα–Cα distance between W67 and Y40 is ∼11 Å in the native structure, very close to the expected R0 for this FRET pair (∼9–12 Å). The FRET-induced effects can be seen more clearly in the plot of the amplitudes of the second component, which change continually with increasing temperature and change sign at specific temperatures depending on the urea concentration (Figure 3c).
In other words, the short tyrosine-tryptophan distance between W67 and Y40, as expected of a fully folded structure at lower temperatures, results in larger FRET between this pair and hence a decreased intensity for the tyrosine band (negative spectrum) and an increased intensity for the tryptophan band (positive spectrum) at 278 K (Figure 3d, obtained by multiplying U2 with V2). At higher temperatures, the distance between W67 and Y40 increases due to partial unfolding; this results in reduced FRET between this pair, and hence the tyrosine spectrum dominates over that of tryptophan (Figure 3d). The relative dominance of tyrosine emission over tryptophan depends on the urea concentration, and this results in a linear decrease in apparent melting temperatures with urea (inset to Figure 3d), consistent with predictions from the WSME model (Figure 1f). Interestingly, we also find evidence for temperature-induced collapse in the unfolded ensemble using this simple technique, very similar to other FRET-based observations. 19 This can be seen as a decrease in FRET intensity (Y40–W67 getting closer) with increasing temperatures at 6 M urea (black in Figure 3c).
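The distance sensitivity invoked above follows from the standard Förster relation, E = 1/(1 + (r/R0)^6). The minimal sketch below evaluates it for the ∼11 Å native separation and the ∼9–12 Å R0 range quoted in the text. Two assumptions are worth flagging: the Cα–Cα distance is used as a stand-in for the actual chromophore separation, and the 15 and 20 Å distances are hypothetical values chosen only to show how quickly the efficiency falls as the pair moves apart.

```python
def fret_efficiency(distance, forster_radius):
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6)."""
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)

R0_RANGE = (9.0, 12.0)       # Å, range quoted in the text for the Tyr-Trp pair
NATIVE_DISTANCE = 11.0       # Å, Cα-Cα distance from the text (proxy for the dyes)
EXPANDED = (15.0, 20.0)      # Å, hypothetical expanded-ensemble distances

for r0 in R0_RANGE:
    for r in (NATIVE_DISTANCE, *EXPANDED):
        print(f"R0 = {r0:4.1f} Å, r = {r:4.1f} Å -> E = {fret_efficiency(r, r0):.2f}")
```

Because the native separation sits close to R0, the pair operates in the steepest part of the efficiency curve, where modest expansions of the native ensemble translate into large changes in the tyrosine and tryptophan emission intensities.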
The continuous increase in the W67–Y40 distances as a function of temperature or urea in the native ensemble should result in a concomitant increase in the hydrodynamic volumes. We clearly observe an increase in the Stokes radius of the protein with increasing urea concentration at 278 K from analytical size-exclusion chromatography (vertical bars in Figure 4a). The protein dimensions approach those of a molten globule at 278 K and 3 M urea and at 298 K and 0 M urea (continuous line in Figure 4a), following the empirical formula of Uversky and coworkers 20 (dashed line in Figure 4a). These observations highlight that fixing the urea concentration and tuning the temperature, or vice versa, results in an equivalent effect on the native ensemble. This reciprocity is particularly advantageous because it is easier to simulate temperature effects and, importantly, to perform binding experiments in which temperature is modulated.
We therefore supplement the previous 6 μs of MD simulations in explicit water (280 and 310 K) 11 with another 3 μs of cumulative simulation time at 295 K. The resulting Cα–Cα distance distribution between the Y40–W67 pair is near-unimodal at 280 K, indicative of a well-folded ensemble (blue in Figure 4b). Despite the limited sampling, increasing the temperature perturbs the native ensemble dramatically, with the Y40–W67 distances spanning a large range with distinct conformational substates (Figure 4b), as expected from the experimental FRET temperature dependence. Coarse-grained simulations also point to a similar feature in the native ensemble (Figure S3). The consistency among the three approaches employed here (experiments, theoretical modeling, and simulations) provides strong evidence that the native ensemble of Cnu is a structural continuum.
The sensitivity of Cnu to solvent conditions, a feature expected of protein molecular rheostats, raises questions about the biological necessity for a tunable native ensemble. 21,22 Because the Enterobacteriaceae family predominantly infects human hosts, the constant body temperature of 310 K becomes a major thermodynamic variable. Microbiological and biochemical experiments have also shown that a complex between Cnu and H-NS represses pathogenic response at low temperatures (∼280 K) while promoting the expression of toxins at higher temperatures (∼310 K; the body temperature of humans). 23 It is the molecular patch formed by helices 3 and 4 of Cnu that is responsible for binding to H-NS. 24,25 Because experiments and simulations point to a continuous increase in the distances between these two helices with changing solvent conditions, this suggests a simple mechanism by which the binding affinity can be regulated: the binding interface should be well formed at 280 K, thus promoting complex formation, whereas the interface should be destabilized at high temperatures, thus disfavoring complex formation (Figure S3b). To test this experimentally, we monitored the change in tryptophan fluorescence anisotropy of Cnu by titrating it with H-NS(1–59) at different temperatures (Figure 4c). We find that the binding affinity decreases by a factor of ∼5 between 278 and 310 K (Figure 4d). However, our interpretation is complicated by the fact that H-NS(1–59), a fragment of a larger protein, by itself displays a steep pretransition upon thermal modulation (Figure S4), indicating that it also undergoes structural loss in the same temperature range. While it is not clear whether the full-length H-NS would undergo a similar structural change, our work provides the first experimental evidence that a synergy between the tunable conformational ensembles of both proteins potentially dictates the extent of pathogenic response.
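To give a sense of what a ∼5-fold loss of affinity means for complex formation, the sketch below uses a simple 1:1 binding isotherm. It is an illustration under stated assumptions and not the fitting procedure used for Figure 4c,d. It assumes that the titrant is in excess so that its free and total concentrations coincide, that the observed anisotropy is a population-weighted average of free and bound values, and that "affinity" is read as 1/Kd. The Kd values, concentrations, and anisotropies are hypothetical numbers in arbitrary units.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of Cnu in complex for 1:1 binding with excess titrant."""
    return ligand_conc / (kd + ligand_conc)

def observed_anisotropy(ligand_conc, kd, r_free, r_bound):
    """Population-weighted anisotropy (assumes equal brightness of both states)."""
    return r_free + (r_bound - r_free) * fraction_bound(ligand_conc, kd)

KD_COLD = 1.0            # hypothetical Kd at 278 K, arbitrary concentration units
KD_WARM = 5.0 * KD_COLD  # ~5-fold weaker binding at 310 K, as reported in the text
R_FREE, R_BOUND = 0.05, 0.15  # hypothetical anisotropies of free and bound Cnu

for conc in (0.5, 1.0, 5.0, 25.0):
    cold = fraction_bound(conc, KD_COLD)
    warm = fraction_bound(conc, KD_WARM)
    print(f"[H-NS] = {conc:5.1f}: bound fraction {cold:.2f} (cold) vs {warm:.2f} (warm), "
          f"anisotropy {observed_anisotropy(conc, KD_COLD, R_FREE, R_BOUND):.3f} vs "
          f"{observed_anisotropy(conc, KD_WARM, R_FREE, R_BOUND):.3f}")
```

At a titrant concentration equal to the cold-temperature Kd, half of Cnu is bound in this picture, whereas only about one-sixth is bound after a 5-fold increase in Kd. A shift of this size is what allows a modest change in the conformational ensemble to act as a regulatory switch.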
Our results show that the structure of the protein Cnu is highly malleable, displaying hitherto undocumented complexity and thus making it optimally sensitive to fluctuations in the environment. Such a feature has been theoretically predicted before even for folded single-domain proteins 26,27 and is increasingly being observed by several different experimental approaches, including ensemble multisite FRET, 28 single-molecule FRET, 29 calorimetric measurements, and NMR. 30 These observations, including ours, also highlight that ensemble descriptions, which account for the statistical nature of protein chains, are the way forward, and such an approach can provide a reliable basis for deciphering complex spectroscopic signals. Because natural selection acts at the level of function, the inherently tunable nature of proteins should have a precise functional reason, which we exemplify here using Cnu as a model system. Bacteria owe their survival in extremes of environmental conditions to molecular sensors like Cnu. It is therefore natural to expect a similar coupling between the conformational features of the folding landscape of several other proteins and extrinsic variables like pH and ionic strength. Understanding the molecular origins of environmental sensitivity from such natural sensors could pave the way for effective protein design strategies and potentially reveal promising drug targets.
