Silk Protein Solution: A Natural Example of Sticky Reptation

Silk is one of the most intriguing examples of biomolecular self-assembly, yet little is understood of molecular mechanisms behind the flow behavior generating these complex high-performance fibers. This work applies the polymer physics of entangled solution rheology to present a first microphysical understanding of silk in the linear viscoelastic regime. We show that silk solutions can be approximated as reptating polymers with “sticky” calcium bridges whose strength can be controlled through the potassium concentration. This approach provides a new window into critical microstructural parameters, in particular identifying the mechanism by which potassium and calcium ions are recruited as a powerful viscosity control in silk. Our model constitutes a viable starting point to understand not only the “flow-induced self-assembly” of silk fibers but also a broader range of phenomena in the emergent field of material-focused synthetic biology.


■ INTRODUCTION
Protein-based materials have been, and are becoming, increasingly economically and technologically important in areas ranging from food industry 1,2 to biomaterials. 3,4 To ensure efficacy and to deliver consistent performance for industrial implementation, it is crucial to understand the molecular mechanisms controlling their properties. An excellent example is Bombyx mori silk fibroin. Undergoing low-energy processing 5 and using efficient solvent recycling, 6 the B. mori silkworm employs variation in pH, flow, and metallic cations to precipitate the protein to form silk fibers, which greatly outperform their synthetic polymer counterparts in terms of unique combinations of strength and toughness. 4,7,8 The underlying molecular design principles that underpin this, however, remain largely obscure due to a poor understanding of silk feedstock in the gland.
Enabled by the recent discovery of a correlation between changes in viscosity with variations in ionic composition, 6 we present quantitative viscoelastic data on the silk feedstock evaluated against a molecular model for the dynamics of entangled, associating polymers. 9−12 By matching quantitative information about the underlying structure, chemistry, and dynamics to the macroscopic viscoelastic properties, this methodology provides a new window into critical microstructural parameters and a viable starting point to understand not only the "flow-induced self-assembly" of silk fibers 5−8 but also insights into processing and structure development using synthetic biology. 1−4 The similarity between silk feedstock and synthetic polymeric systems is suggested by the structure of "native silk fibroin" (NSF) and its high concentration in silk feedstock in vivo. Indeed, several NMR studies 13−16 show that NSF can be pictured as an intrinsically disordered protein, akin to a randomly coiled polypeptide chain, 17 in a crowded environment above the overlap concentration. 18 Remarkably, the NSF feedstock contains 28 ± 8 Ca 2+ cations per NSF chain and, depending on the stage of cocoon construction, 6−60 K + cations. 6 It was found that this increase in potassium concentration correlated with an order of magnitude decrease in the viscosity. Inspection of the sequence of amino acids, see Figure 1, 6 reveals that each chain incorporates a relatively small number of negatively charged glutamate and aspartate amino acids, which lead us to hypothesize that divalent calcium can bind and form an ionic bridge that acts as a temporary cross-link between the chains. Within this hypothesis, the concentration of monovalent potassium can modulate the strength of these cross-links by either inhibiting the binding sites at the protein or by screening the electrostatic interactions.
Our aim is to confirm this qualitative picture of NSF as a randomly coiled linear "sticky" polymer above the overlap concentration quantitatively and do this by describing the molecular rheology of silk feedstock by "sticky reptation" (SR). 9 The SR model was previously proven successful in predicting macroscopic material properties in terms of the molecular motion of the chain. This motion is restricted by the molecules' crowded environment, and the chain consequently "reptates" (diffuses in a curvilinear path along its own contour) in a tubelike confinement formed by its neighbors. 21 Crucially, in this instance, the temporary cross-links (i.e., "stickers") hinder the motion of the chain and thereby control the rheological properties of the sample.
In the following, we will discuss how the quantitative physical properties of silk feedstock, including the network topology and the chemistry of the sticker, feed into the SR model. Next, we will discuss our experimental procedure to access the linear viscoelastic regime for samples with varying viscosity. We then fit the SR model to our experimental data to extract the physical parameter values and show that (i) there are approximately ten entanglements per chain, (ii) the number of stickers is of the same order of magnitude, and (iii) the sticker lifetime is the main control variable responsible for the variations in viscosity.

■ STICKY REPTATION MODEL
Using a polymer physics interpretation of the biochemistry described in the introduction, we view NSF as a semidilute "sticky entangled polymer", see Figure 2. 9,22,23 Within this interpretation, the intrinsically disordered protein with intermolecular calcium bridges can be modeled by a randomly coiled polymer with "sticky" groups and which is confined in a tubelike confinement due to its crowded environment. The diameter of this tube determines the effective number of entanglements Z e that it has with its neighboring chains. Due to these entanglements, the silk feedstock forms a rubber-like network where each entanglement contributes a thermal energy k B T to the high-frequency elastic "plateau modulus" 24 with υ the average volume of an amino acid residue (0.13 nm 3 , ref 25), N = 5525 the number of amino acids, 19 and ϕ the volume fraction of the protein (with values ranging from 16 to 22%). The rheology of linear polymers is considerably affected even for a small fraction of stickers along their length, Z s /N, because the long-wavelength Rouse relaxation of the chain and their center-of-mass diffusion are hindered. Furthermore, for times shorter than the sticker dissociation time, t < τ s , the plateau modulus is increased by the presence of the temporary crosslinks generated by the stickers and becomes G s = G e (1 + Z s /Z e ), with Z s the number of stickers per chain. 26 Assuming a sticker lifetime that is much larger than the relaxation time of the strand between stickers with τ 0 the Rouse time of a monomer in a nonsticky chain of length N, we can model the relaxation of the long-wavelength modes of the chain using the "sticky-Rouse" relaxation modulus where the coefficient κ equals unity for modes with a wavelength λ ∼ q −1 , shorter than the entanglement strand (i.e., for wavenumbers q ≥ Z e ) and equals 1/5 for longer wavelengths (i.e., for wavenumbers q < Z e ). 27 In this equation, we assume that most of the stickers are bound and sticker dissociation is followed by the Rouse relaxation of a strand of length twice the length of a strand between stickers, N/Z s . For more discussion on a variable fraction of bound and unbound stickers, we refer to refs 9, 28.
The assumption of slow sticker dissociation implies that the stickers reduce the Rouse time of the chain from τ R = τ e Z e 2 to τ R = τ s Z s 2 . Consequently, the time scale at which the chain escapes the tubelike confinement is τ rep = τ R Z e = τ s Z s 2 Z e , see refs 9 and 26. To describe the corresponding relaxation modulus, we follow Chen et al. 12,26 and use the double-reptation model 29 and with g( ] a function that captures approximately the modification to reptation entanglement loss arising from contour-length fluctuations. Finally, α is in principle a universal constant, but in practice increases with a decreasing number of entanglements per chain. 30 This may be remedied by employing more sophisticated predictions for the influence of contour-length fluctuations. 27 For now, we set α = 10 (resembles the value of 8.7 for polystyrene 30 ) as this gave the best fit quality throughout the entire data set (more information on the fitting algorithm is given in the Appendix 1).
Apart from matching the model to oscillatory rheology data, we will also confirm that the parameter values are physically meaningful by using them to approximate the zero-shear viscosity. This viscosity scales approximately as Macromolecules pubs.acs.org/Macromolecules Article with τ d the reptation time after which the behavior of the silk feedstock crosses over from elastic to viscous. The simplest estimate for this time is τ d ∝ L 2 /D, at which the reptating chain escapes the tubelike confinement by curvilinear diffusive motion. Here, L = Z e R(N/Z e ) is the contour length, using the functional form R(n) = bn 1/2 for the end-to-end distance of a subchain of n monomers, and D is the curvilinear diffusivity. In the absence of stickers, the diffusivity is given by the Rouse 22,23 In the presence of Z s cross-links per chain with a sufficiently long lifetime τ s ≫ τ 0 (N/Z s ) 2 (see eq 2), the results are modified using the sticky-Rouse diffusivity, D SR = R 2 (N)/ τ s Z s 2 , 9 to give the sticky reptation time τ d ∝ τ s Z e Z s 2 . Inserting this into eq 6 yields η 0 ∝ G e τ s (Z s 2 Z e + Z s /Z e ). The term Z s /Z e is much smaller than Z s 2 Z e and may therefore be neglected and the macroscopic zero-shear viscosity of silk feedstock scales as with G e ∝ Z e .

■ EXPERIMENTAL METHOD
The NSF feedstock samples were obtained by dissecting commercially bred B. mori silkworms, as described previously. 17,31 Briefly, 5th instar silkworms were housed at ambient temperature (22 ± 2°C) and humidity, in ventilated tubes until required. At various stages prior to or during cocoon spinning, a candidate was selected at random, sacrificed by decapitation and its (two) silk glands were ejected into a plastic Petri dish. One gland was transferred to a second Petri dish, submerged in water, and divided using tweezers. It was impractical to obtain NSF from the narrow posterior (i.e., upstream) section; consequently, this was also discarded. Also, to minimize contamination of the fibroin (fiber-making protein) with sericin (the adhesive protein), the anterior (i.e., downstream) section was discarded. (This procedure prevents contributions by the potential microphase separation of NSF with other proteins. 32 ) Next, the membrane was carefully peeled from the (middle mid and middle posterior) section of the gland and a portion was transferred to a rheometer for characterization. The flow behavior was measured at 24−25°C using a combination of shear and oscillatory methods as follows: 1. 100 s at a constant shear rate of γ̇= 1 s −1 to distribute the specimen evenly between the cone and plate and supersede any residual flow stress from the loading. A low shear rate viscosity (η 1 ) was obtained by averaging data from the final 30 s of these measurements. 2. Oscillatory frequency sweeps to measure the dynamic moduli, phase angle, and complex viscosity. A strain of 0.02 (i.e., 2%) was used, which is well within the linear viscoelastic limit for NSF (which was found to be 0.1 across the frequency ranges used from strain-sweep measurements at 0.1, 1, and 10 s −1 and strains up to 100%). Measurements were made from high-to-low frequency to be consistent with the previous work, although lowto-high-frequency scans produced identical results to within experimental error. 3. Shear rate ramp from γ̇= 0.1 to (nominally) 100 s −1 , over 5−10 min, or until the sample slipped or underwent a flow-induced phase change and was ejected from the rheometer. For silkworms that were dissected prior to cocoon construction, the silk glands were still full and contained sufficient content to perform the measurements using a Discovery High Resolution (DHR2, TA Instruments) rheometer, fitted with a 20 mm diameter, 1°angle cone (CP1/20). For silkworms that were dissected after they had already produced a significant amount of fiber, the glands contained less material and the measurements were performed on a Bohlin Gemini (Malvern Instruments), for which a custom-made 10 mm diameter, 1°a ngle cone (CP1/10) was available. We have verified that the two instruments gave identical linear frequency-dependent rheology measurements on a poly(dimethylsiloxane) (PDMS) standard 9696 although the Bohlin was limited to a narrower frequency range (0.63− 89 rad s −1 ) than the DHR2 (0.13−126 rad s −1 ). As in the previous work, 17,31 excess NSF was not removed to avoid initiating gelation and a water flood around the sample area was used, to prevent drying out and skin formation. This raised the possibility of the sample becoming slightly diluted during the experiment, although the previous work 31 showed this to be a relatively minor problem over the time scale (less than 25 min total) required for each specimen. As previously observed empirically, the oscillatory G′ and G″ data could be simultaneously fitted well using a sum of Maxwell modes with ω the angular frequency, τ i the relaxation time, and g i the weighting factor of mode i. Depending on the quality of the data and the frequency range over which it was measured, it was generally possible to fit a model with 3 or 4 modes. This permitted estimation of the zero-shear rate viscosity (η 0 ) from the oscillatory data which agreed closely with the value obtained by extrapolating the shear rate ramp data toward a zero-shear rate.
In this work, we also go beyond the empirical fits and fit the data with the SR model itself. To do so, we developed a Monte Carlo algorithm that not only extracts the physical parameter values of the model but also explores the fit quality in great detail; see the Appendix 1 for a description of the algorithm and a short discussion on the representation of the error bars in this work.
For the elemental analysis of our samples, a separate portion of the (MM and MP) gland contents was transferred into a tared plastic weighing boat, immersed in distilled, deionized water, and loosely covered with a paper towel. The NSF initially dissolved, then formed a thin film as the water evaporated. The initial solid content of the NSF Macromolecules pubs.acs.org/Macromolecules Article was determined gravimetrically, based on the dry weight. This film was subsequently redissolved in a mixture of ethylene diamine tetra-acetic acid (EDTA) and tetramethyl-ammonium hydroxide, and the elemental content was determined by optical emission spectroscopy (external service within the University of Sheffield, using Spectro-Ciros Vision by Ametek Inc., Mahwah, NJ).

■ RESULTS
We have carried out the experimental procedure for 15 silkworms. This provided samples with zero-shear viscosities ranging from 400 to 12 000 Pa s and 6−51 potassium cations per NSF chain. As in our previous study, 6 we find a weak negative correlation between the viscosity and potassium concentration, see the left panel of Figure 3. The right panel of Figure 3 shows that the shear viscosity (measured under constant shear) and the modulus of the complex viscosity (measured under oscillatory shear, within the linear response range) are in agreement for shear rates up to 10 s −1 . Beyond this shear rate, however, significant differences were observed, which appeared to be caused by the specimen slipping or flow-induced phase separation and gelation during constant shear experiments. We have extracted the values of the physical parameters from our oscillatory rheology data by fitting the SR model (see the Appendix 1). Figure 4 displays the curve fits for all data sets, for which we provide the parameter values in Table 1. Before interpreting the variations in these parameter values, we examine the physical relevance of the SR model by confirming that the extracted parameter values do not violate the expressions for either plateau modulus (eq 1) or viscosity (eq 7).
To confirm eq 1 is not violated, we have plotted the plateau modulus, G e , as a function of a measure for the entanglement concentration ϕZ e , in Figure 5. These data, which are obtained from the oscillatory rheology data and represented by the symbols in the graph, are within the errors consistent with eq 1 (line). Further, in the inset, we compare the predicted zero-shear viscosity (line) from the physical parameters (obtained from oscillatory rheology measurements) and using the scaling approximation in eq 7 to the independently measured zeroshear viscosity (line) under steady shear. The excellent agreement implies that the previously empirically determined zero-shear viscosity using Maxwell modes (see the Experimental Section) can now be determined in a physically meaningful way using the SR model. Now that we have confirmed the applicability of the SR model to the silk feedstock, we can interpret the variations of the physical parameters in terms of the underlying polymer physics.  Table 1). The lines represent the SR model, and the symbols represent the rheological measurements of the samples in Figure 3. Measurements on a silicone viscosity standard using the DHR2 or Bohlin Gemini produced essentially identical results, although the latter was limited to a narrower frequency range. The data using the DHR2 and Bohlin machines are displayed separately (left and right panels, respectively) here merely to avoid making the figure too cluttered.   We first focus on the variability in the number of entanglements per NSF chain in the different samples, which is larger than expected from the range of concentrations alone. 23 Hence, the explanation must be sought in alternative origins, such as variability in relaxation (e.g., by contour-length fluctuations 27 or constraint release) or lengths of the entanglement tube. The latter may, for instance, vary due to variations in solubility with ionic strength, which in turn affects the compactness of the NSF chain at small length scales. Indeed, previously measured variations in the radius of gyration of intrinsically disordered proteins under dilute conditions were quantitatively explained by temperature-and ionic strengthdependent solvation effects of hydrophilic amino acids. 25,33−36 While we presently do not possess solubility information about the NSF protein that could explain the origin of the variations in the entanglement density, we are able to estimate the consequence of this to the viscosity. We find from Table 1 that the number of entanglements per chain, Z e , and concentration, ϕ, vary by a factor of approximately 1.8 and 1.3, respectively, while the zero-shear viscosity varies by a factor 30, implying that the variations in viscosity are predominantly accounted for by a variation in the sticker chemistry.
The contribution to the viscosity from the stickers is controlled by the number of stickers per chain, Z s , and by their dissociation time, τ s . We have plotted these quantities, as well as the number of entanglements per chain, Z e , in Figure 6.
The effective number of stickers is approximately a factor two smaller than the number of charged patches on the protein sequence (see Figure 1); this suggests that the rheologically observed dissociation time is determined by the equilibration of a strand of length 2N/Z s (which is also the subchain released by a single sticker dissociation), rather than N/Z s , as suggested earlier by Leibler et al. 9 More importantly, the number of stickers is rather stable, which suggests that most stickers remain bound and that the variations in feedstock viscosity are predominantly determined by the lifetime of the stickers. The scatter in the sticker lifetimes is consistent with the scatter of the viscosities in Figure 3 and reflects the biodiversity between the different samples. We may conclude that at high potassium concentrations, the sticker dissociation time is always short and long sticker dissociation times only occur at modest potassium concentrations.
The usual approach to understanding variations in the sticker lifetime with changing temperature is in terms of an associated activation energy and the Arrhenius equation 12 For melts of synthetic ionomers, the binding energy E act is of the order 14k B T or higher and the prefactor τ s,0 is considered constant. 12,26,37 Using this approach, we may reinterpret the temperature-dependent rheological data of silk that we previously analyzed empirically by curve fitting four Maxwell modes 18 (the data was obtained using the same experimental procedure as in the present work, except that the content of metallic cations was not measured).
We have extracted the temperature dependence of the sticker dissociation time from the rheological data of samples with a sufficiently high room-temperature viscosity (η 0 of the order 2500 Pa s; for lower viscosities, no statistically significant temperature dependence could be distinguished). Figure 7 shows the linear rheology over a range of 5−50°C of a representative silk feedstock sample with a room-temperature zero-shear viscosity of η 0 (25°C) = 2554 Pa s (neither included in Table 1 nor previously published in ref 18; from Figure 3, we estimate that there are 5−30 potassium cations per NSF chain in the sample). As in the case of dependence on ionic strength, we have measured the dynamic moduli (left panel) and we have fitted the SR model to these data to extract the physical parameter values (the right panel of Figure 7 and Table 2). The right panel of Figure 7 shows that any variations in the number of active stickers per chain, the number of entanglements per chain, and the sticker dissociation time are weak compared to the error margins. In particular, by fitting the Arrhenius equation to the sticker dissociation time, we find an apparent activation energy, E act , of 20.1 ± 14.7 kJ/mol. This value suggests that the stickers are bound less strongly than the stickers in synthetic ionomers (35−90 kJ/mol 12,26,37 ).   The relation τ s = τ s,0 exp(E act /RT) ≫ τ 0 (N/Z S ) 2 (obtained by combining eqs 10 and 2) suggests that τ s,0 is much larger than the Rouse time of a monomer, τ 0 . There are several contributions to a renormalized and larger value of the prefactor τ s,0 in the expression for the sticker dissociation time. First, the anionic patches on the fibroin typically contain three acidic groups (glutamic and aspartic acid) and sometimes a base within a sequence of 10−15 amino acids. 19 Hence, sticker dissociation is most likely not simply the time to break an ionic bond between just two carboxylates and may be affected by the local structure of the protein. Second, and more significantly, sticker dissociation and reassociation may occur and prolong the rheologically observable, "effective", sticker dissociation time. This implies that after dissociation, the "free" stickers have to separate beyond a certain escape radius to effect a reattachment to different sites. It is this event that constitutes a rheologically effective single release event. Viewing the silk feedstock as an electrolyte, with ions such as calcium and potassium present, this escape radius is set by the Debye screening length for electrostatic interactions, λ D = 0.3/√I [nm], with I [mol/L] the ionic strength. The first-passage time τ s,0 is therefore set by the square of the diffusion length λ D divided by the Rouse diffusion coefficient. The latter is inversely proportional to the length g of the amino acid strand within the interaction volume, which is g ≈ λ D 1/υ with υ = 1/2 the swelling exponent of a freely jointed chain. 23 Combining these expressions gives τ s,0 ∝ I −2 , i.e., the sticker dissociation time decreases with increasing ionic strength. This argument is qualitatively consistent with our observation that increasing potassium concentration correlates with decreasing viscosity.

■ CONCLUSIONS
We conclude that by viewing silk through the lens of polymer entanglement physics combined with electrostatics, it is possible to gain insights into an exceptionally elegant natural ion-based fiber processing route that harnesses the sensitive response of both the persistence length of the chain and the dynamic stability of reversible cross-links. We have shown that the sticky reptation (SR) model describes how the structure and dynamics at the molecular level, especially interactions driven by topology and electrostatics combined, affects the macroscopic linear viscoelastic properties, thus providing a viable starting point to examine gelation and precipitation phenomena of protein solutions in response to composition variations and to flow. 1−8 Finally, we suggest that the role of the SR model is not simply limited to illuminating this processing route in nature but can be directly used to apply this insight to design synthetic bioinspired materials. 11,12 ■ APPENDIX: MODEL FITTING To determine the physical parameter values (τ s , Z s , Z e and G e ), we fit the model to the experiment using a Markov Chain Monte Carlo algorithm. This algorithm selects values of the physical All fits were obtained using α = 10.  Table 1). In all cases, we verify that the maximum, mean, and median of the distribution are within the standard deviation of the distribution. The reported error bars in later figures represent the interquartile range of the distribution unless indicated differently.
Macromolecules pubs.acs.org/Macromolecules Article parameters and calculates the fit quality using χ 2 ≡ ∑ n=1 N data ([G experiment ′ − G fit ′ ]/G fit ′ ) 2 + [G fit ″ − G experiment ″ ]/G fit ″ ) 2 . The denominators represent the experimental errors, which we have assumed to be proportional to the magnitude of the absolute modulus value; this weighting method takes into account that G′ and G″ may vary orders of magnitude at different frequencies, see ref 38. Using random numbers, the parameter values are varied and a new fit quality is calculated. If the fit quality is improved, the values are accepted, and else they are only accepted using an acceptance probability ∝ exp(−χ 2 /2σ 2 ).
This generates a distribution of (typically a few hundred or thousand) samples, and σ is chosen such that σ 2 matches the variance of χ 2 in the distribution: This is achieved using "umbrella sampling": we fit with an overlarge value of σ to generate distribution and subsequently sample the obtained distribution with varying values of σ.
This provides the final distributions of the physical parameter values (see Figure 8). The distributions are not perfectly Gaussian and sometimes may even show a bimodal distribution. For instance, in the case of τ s , sometimes a maximum at small values 10 −2 appears; this is at the upper boundary of the frequencies that we can measure and is of numerical rather than physical origin. Further, most of the distributions are skewed. Therefore, in all cases, we verify that the difference between the mean, median, and maximum is within the standard deviation. To reflect the skewness, the error bars in our figures represent the interquartile range of the distribution (e.g., the interval in which 50% of the samples lies), unless indicated differently. A numerical implementation of the SR model is available in Python (RepTate software 39 ) and in C at https://github.com/ CharleySchaefer/Bombyx/. The latter implementation includes the curve-fitting algorithm used in the present work.