Genetic Functionalization of Protein-Based Biomaterials via Protein Fusions

Proteins implement many useful functions, including binding ligands with unparalleled affinity and specificity, catalyzing stereospecific chemical reactions, and directing cell behavior. Incorporating proteins into materials has the potential to imbue devices with these desirable traits. This review highlights recent advances in creating active materials by genetically fusing a self-assembling protein to a functional protein. These fusion proteins form materials while retaining the function of interest. Key advantages of this approach include elimination of a separate functionalization step during materials synthesis, uniform and dense coverage of the material by the functional protein, and stabilization of the functional protein. This review focuses on macroscale materials and discusses (i) multiple strategies for successful protein fusion design, (ii) successes and limitations of the protein fusion approach, (iii) engineering solutions to bypass any limitations, (iv) applications of protein fusion materials, including tissue engineering, drug delivery, enzyme immobilization, electronics, and biosensing, and (v) opportunities to further develop this useful technique.


INTRODUCTION
By forming multiple weak bonds with ligands, proteins can reversibly bind specific molecules with very high affinity.In vivo, proteins require these traits to create and regulate the vital chemical pathways, cells, and tissues necessary for life.The wide variety of natural protein functions has been further augmented by engineering proteins with novel abilities. 1,2−19 Creating devices using materials composed of proteins confers many advantages, including providing a facile method to stabilize useful proteins by embedding them into the materials.The fidelity of transcription and translation (1 error per 10 4 amino acids) ensures that each protein has the same length and amino acid sequence. 20,21−22 Protein-based biomaterials can have useful mechanical properties, including extreme strength, as exemplified by dragline silk, and high extensibility, as observed for elastin and elastin-like proteins (ELPs).Finally, the sequence of the protein monomers can be easily and quickly engineered using standard molecular biology techniques, allowing the materials' sequence, assembly mechanism, structure, stability, and physical properties to be tailored for specific applications. 23−29 Although protein monomers can elicit antibody production when injected into an animal host, assembly of these monomers into a stable material tends to prevent an immune response.−32 Consequently, protein-based materials are often suitable for in vivo applications.−35 Figure 1.Fusion proteins in materials assembly.A) A schematic depicting the production of fusion proteins and their assembly into materials.B) Schematic of fusion protein sequences.The functional protein (dark blue) may be placed N-terminal (top) or C-terminal (bottom) to the selfassembling protein (gray) or within a flexible loop of the self-assembling protein.Purification tags (dark gray) may be included at either terminus or within the protein sequence.Soluble, flexible linkers (green) are often required to prevent each component from sterically interfering with the folding or function of other components of the fusion protein.Linker refers to the 1 letter code for amino acids inserted between the functional protein domain and the self-assembling protein.Fusion Terminus indicates whether the functional protein was fused to the N-terminal side, the C-terminal side, or in the middle of the self-assembling protein.b Produced in Bombyx mori instead of E. coli.c Elastin-like polypeptide (ELP).d Silkworm silk derivative.

ADVANTAGES OF FUSION PROTEINS
Materials that are composed of proteins offer a unique opportunity to efficiently immobilize and display useful functional proteins.−39 The sequence of amino acids is unique to each protein and genetically encoded in the sequence of nucleic acids in that protein's gene (DNA), which is also a linear sequence.To create a "fusion gene", the DNA sequence encoding the functional protein is placed end-to-end with the DNA sequence encoding the self-assembling protein, without intervening stop codons.When the appropriate regulatory DNA sequences are added to the fusion gene and placed in a live cell (Figure 1, Step 1), the cell will use the DNA instructions to produce a single "fusion protein" that contains the amino acid sequence of both proteins in a single chain of amino acids (Figure 1, Step 2).In the example shown in Figure 1 (top), the functional protein sequence is followed by the sequence of the self-assembling protein.Ideally, these proteins will independently fold into their unique 3-dimensional structures defined by their amino acid sequence (Figure 1, Step 3), and the self-assembling protein portion of the fusion protein will interact with other copies of the fusion protein to form materials.The structure and function of the fused functional proteins are often preserved and improved due reduced protease access and limited mobility.This confined space surrounding the fused proteins hampers protein unfolding, resulting in prolonged enzymatic, signaling, and/or biological activities..In practice, an additional short linker sequence of amino acids must often be added in between the functional sequence and the self-assembling sequence to prevent one sequence from sterically hindering the folding or function of the other sequence.This approach can be used to incorporate peptides, protein domains, and full-length, folded proteins into a wide variety of protein-based materials (Tables 1 and 2).There are numerous advantages to this approach.By combining production of the functional protein, production of the self-assembling protein, and protein immobilization into a single-step, single-pot process, the time and cost required to produce materials is substantially reduced. 84,85Therefore, the extent of functionalization does not have to be independently validated after each production batch.The added functional protein has the potential to increase the yield of the recombinant fusion protein, an effect that has been observed in E. coli 36,40 and in the silkworm Bombyx mori. 86This increase in yield correlates with the solubility of the fusion protein 36 and can help prevent the random (nonproductive) association of self-assembling monomers into aggregates. 47The functional proteins are uniformly attached to the material by the same bond.Consequently, there are no variations in the attachment site that have the potential to misposition or inactivate a fraction of the functional proteins.Because the functional protein is attached to the material via a covalent peptide bond, any solution conditions that are safe for the materials will also preserve this bond, reducing or eliminating leaching. 87Because every monomer has one functional protein attached, by definition the materials are saturated and the coverage is even.For some combinations of functional protein and self-assembling proteins, fusion dramatically increases the stability of the functional protein. 36,39However, this is not the case for all fusion proteins that form materials. 88Finally, many different proteins can be incorporated and even patterned into a single material as protein fusions. 38n many cases, the addition of a functional protein does not harm either monomer production or materials assembly. 36,39,44,59For instance, the Drosophila melanogaster protein Ultrabithorax (Ubx) self-assembles in vitro into nanoscale fibrils which further associate into macroscale films and fibers. 89Ubx can be fused to much larger proteins, to highly charged proteins, or to dimers/tetramers without impacting its ability to form materials or the morphology of the resulting fibers (Figure 2A,E; Table 2). 36Additional evidence that materials formation is unaffected is provided by competing Ubx with Enhanced Green Fluorescent Protein-Ubx (EGFP-Ubx) monomers prior to materials assembly.If EGFP impaired Ubx assembly, then the fraction of Ubx protein in the material should be higher than the fraction of Ubx in the monomer mixture.Instead, the ratio of Ubx and EGFP-Ubx proteins in the final material was the same as in the initial protein mixture.Therefore, both proteins assemble equally well (Figure 2F,G). 36he functionality of the appended functional protein is often preserved in protein materials. 36,91,92Incorporating functional Materials formation was quantified as the length of fiber drawn from the assembly solution, averaged over three experiments. 360 Reprinted with permission from Tsai et al., 2015. 36roteins into materials can stabilize these proteins relative to the corresponding free protein monomer.For example, both EGFP and the catalytic domain of Arginine Deiminase retain activity when fused to human Ferritin heavy chain (hFTN-H), creating the recombinant proteins termed hFTN-H::GFP and hFTN-H::ADI 132−410 .When compared to recombinant EGFP or arginine deiminase monomers, these materials demonstrated increased stability in fluorescence-emitting and arginine-degrading activity upon heating (70 °C) or in the presence of a high concentration of guanidinium hydrochloride (4.5 M), acid, or alkali.91 Unlike mCherry monomers, some fluorescence intensity from mCherry is retained in mCherry-Ubx fibers even after ethanol treatment for 30 min (80%) or autoclaving for 20 min (20%), a trait that facilitates materials sterilization.36 This increase in stability can assist with materials storage as well.Thatikonda et al. reported that portions of an antibody fused to an engineered spider silk protein (scFv-NC silk fusion protein) created fibers that remained functional after storage at room temperature for 1 week.44 Similarly, the enzyme xylanase was 40% active after 11 months of wet storage, > 1 month of dry storage, or 70% ethanol treatment.59 The enzyme β-Glucosidase is only 20% active at 60 °C, whereas β-glucosidase fused to an elastin-like polypeptide is 90% active at this temperature.62 The source of this incredible stability is likely due to the confined geometry effect, in which functional proteins embedded within the materials lack the space required for the large-scale motions associated with protein unfolding.92−94 Specific interactions between the functional and self-assembling proteins may also contribute in some cases.
Finally, more than two genes may be fused, thus incorporating multiple proteins and functions into the same polypeptide.One example of a multifusion protein is resilin− elastin−collagen chimeric recombinant materials, designed to tailor the mechanical properties of the resulting materials. 95he EGFP-VEGF-Ubx double fusion also self-assembled into materials, which retained both green fluorescence and VEGF activity. 12The addition of EGFP to VEGF-Ubx improved the solubility, and thus protein yield, during monomer production in E. coli. 12Surprisingly, VEGF was active when positioned as the middle protein in a double fusion protein that is incorporated into materials, despite its requirement for both a surface-exposed dimer interface and a receptor binding interface to function. 12,96

FUSION PROTEIN DESIGN, TYPICAL DIFFICULTIES, AND SOLUTIONS
The general design strategies for creating fusion proteins from a self-assembling protein and a functional protein are depicted in Figure 1B.Depending on the proteins used, the functional protein may either be located N-terminal or C-terminal to the self-assembling protein, or even inserted into a loop of the selfassembling protein.Some fusions may need a linking amino acid sequence connecting the functional protein to the selfassembling protein to provide enough space for both proteins to independently fold and function.Usually, these linkers combine highly soluble amino acids with glycine to provide flexibility.A tag may also be included to aid protein purification.This tag may be attached to the N-or Cterminus of either the functional protein or the self-assembling protein.An additional flexible linker is often used to prevent interference between the tag and the adjacent protein.

Functional Proteins.
Although protein fusion offers a facile mechanism to functionalize materials, surprisingly few proteins have been incorporated into materials via protein fusion.
3.1.1.Breadth of Proteins Successfully Fused to Materials.Table 1 lists domains, or folded portions of proteins, that have been incorporated into protein materials via protein fusion.These sections of proteins can fold and function independent of the remainder of the protein sequence.Because proteins domains are small, they are less likely to impede materials assembly and thus are often selected for materials functionalization.This approach is often selected for incorporating ligand binding or enzyme catalysis activities.To our knowledge, Table 2 lists all reported full-length proteins incorporated into materials via protein fusions as of January 2024.The most common functions added to materials are fluorescence, ligand binding, enzymatic catalysis, cell binding, and cell signaling (Tables 1, 2).For instance, Myoglobin-Ubx materials retain the ability of myoglobin to bind O 2 , 38 while the immunoglobulin G (IgG) binding domains C2 and Z still bind antibodies when fused to the spider silk derivative 4RepCT. 39Amyloid fibrils containing chitin-binding domains bind chitin more readily than unmodified fibrils. 97Luciferase and xylanase both retain their enzymatic function in Luciferase-Ubx and Xylanase-4RepCT materials. 38,59Likewise, collagen, when fused to silk, enhances human mesenchymal stem cell (hMSC) proliferation and attachment, 98 while addition of the VE-Cadherin extracellular domain EC1−2 to Mussel Foot Protein 5 increases adhesion of endothelial progenitor cells. 43mportantly, the ability to add active functions to materials is not limited to one type of ligand or material composition.Seven proteins have been successfully fused to 4RepCT and 25 proteins have been fused to Ubx (Tables 1, 2).Conversely, Enhanced Green Fluorescent Protein (EGFP) has been successfully fused to 3 different materials-forming proteins: silkworm silk light chain, Ubx, or ferritin heavy chain. 38,86,91ikewise, basic Fibroblast Growth Factor (bFGF) has been fused to 4RepCT, Ubx, and the silkworm silk protein light chain. 36,63,64This versatility in the design of fusion proteins reflects the power of this approach: in principle, any functional protein could be added to any protein-based materials, tuning the activity as well as the mechanical and physical properties of that material for a specific application.

Self-Assembling Proteins.
The other main component in a fusion protein is the self-assembling protein.
Obviously, the choice of self-assembling protein will impact the morphology, biocompatibility, degradability, and mechanical properties of the resulting materials.Self-assembling proteins can be broadly classified in three different categories: proteins that naturally self-assemble and proteins that selfassemble into materials in vitro but not as part of their normal function in vivo.Both categories have been successfully functionalized via protein fusion.−102 3.2.1.Natural Self-Assembling Proteins.Three prominent examples of proteins that naturally self-assemble to make bulk materials include silk, elastin, and resilin. 103Silk proteins are naturally synthesized by spiders, silkworms and bees, among others. 24,103,5Natural silks have a long history of use in medicine. 106,107These proteins are challenging to produce as recombinant monomers and to assemble in vitro; therefore, many laboratories have developed modified versions of silk proteins to ameliorate these problems.These modified proteins are often still referred to as the natural protein (e.g., "silk") despite deletions and/or alterations to the protein sequence.The biocompatibility and mechanical properties of silks have made them valuable tools for many applications, 39 including bone regeneration, 104,108 soft tissue regeneration, 109 cell adhesion, 40 cardiac tissue engineering, 110 and drug delivery. 111s components of the extracellular matrix, collagen and elastin have chemical and mechanical properties capable of supporting cell growth to create extensible or flexible tissues. 24,112Materials composed of Elastin-like Recombinamer (ELR) or Elastin-like Polypeptides (ELPs) are being developed for many applications, including skin tissue engineering and wound healing. 113Elastin reversibly forms hydrogels at elevated temperatures.This feature has been exploited as a facile and scalable purification technique− proteins fused to elastin or peptides derived from elastin can be separated by repeated gelation, depolymerization and washing/ dilution steps. 114,115Consequently, a large number of proteins have been fused to elastin (e.g., Blue Fluorescence Protein, 10,50 Cyclosporin A, 56 and Thioredoxin 50 ), although not always with the goal of functionalizing materials.Notably, proteins fused to elastin must be thermostable to withstand the temperatures used for gelation.Stabilization of the appended protein once the gel has formed may ameliorate this need. 62esilin, an elastomeric protein found in insect cuticles, forms materials which are characterized by low stiffness and high energy storage capacity, making them elastic, extensible, and resilient. 19,103,116Resilin has important roles for flight in arthropods, leg movement in arachnids, vocalization in cicadas, and jumping in fleas. 19,117Resilin's mechanical and biological properties make it a very promising scaffold for demanding but flexible tissues such as vocal cords 116,118 and blood vessels. 119,120Although fewer protein fusions have been constructed using Resilin, such fusions are possible: the Kiick lab has fused several functional domains simultaneously to Resilin to create materials that bind both ligands and cells. 42n addition, hybrid fusion proteins have also been created to combine the properties of silk, collagen, elastin, resilin, amyloids, and/or other proteins (Figure 3). 121,122−128 A diagram of self-assembling proteins that have been successfully fused to each other is depicted in Figure 3.These combination materials have recently been reviewed by Rodriguez-Cabello et al. 129 By far the most common combination of proteins is silks and elastins, to make silk-elastin-like proteins, or SELPs.−132 Sometimes these fusions are created to improve the mechanical properties of the resulting materials.For instance, fibers composed of an elastinlike polypeptide and squid ring tooth protein (a.k.a.suckerin) have a breaking strength that exceeds recombinant spider silks. 133A similar suckerin-silk fusion has also been developed. 134.2.2.Proteins That Only Assemble in Vitro.Occasionally, a globular protein that does not form materials as part of its native function is able to self-assemble in vitro.Examples include the Repeats-in-Toxin (RTX) domain, a Pseudomonas aeruginosa alkaline protease, and Ultrabithorax (Ubx), a Drosophila melanogaster Hox transcription factor.The RTX domain, when fused to soluble GFP, forms green fluorescent amyloid protein fibrils and sheets, suggesting this structure may be able to host other proteins as well. 152A wide variety of functional proteins and peptides have been genetically fused to Ubx.Ubx self-assembles into films and fibers in vitro. 89,151The conditions used for Ubx assembly (aqueous buffers near neutral pH) allow proteins genetically fused to Ubx to retain their structure and activity.Full-length proteins successfully fused to Ubx include both dimers (e.g., the pro-angiogenic growth factor VEGF) 12,36 and tetramers (e.g., fluorescent protein AmCyan). 36Furthermore, Ubx can form materials in vitro when fused to multiple proteins, such as EGFP-VEGF-Ubx or NusA-NusA-Ubx.Indeed, Ubx can accommodate fusions greater than 9 times its size (332 kDa appended protein vs 39 kDa Ubx) and still form materials. 153usion order.In theory, the functional protein or peptide can be placed at the N-terminus or C-terminus of the selfassembling protein or inserted into the middle of the selfassembling protein, most likely in a flexible region or loop. 42In practice, the fusion order can have a large impact on the yield of purified monomers, the ability of the monomers to selfassemble into materials, or the activity of the appended protein.For example, fusions in which the added functional protein is C-terminal to Ubx do not produce protein in E. coli; however, when the proteins are appended to the N-terminus of Ubx, fusions are produced that form materials and retain their activity. 36,38Likewise, the production levels, also termed "expression", of four elastin-like proteins (ELPs) is higher when the functional protein is added to the N-terminus of the ELP as opposed to its C-terminus. 50However, the specific activity of three of the proteins, chloramphenicol acetyltransferase (CAT), thioredoxin (Trx), and interleukin-1 receptor antagonist (IL1Ra), is reduced when fused to ELP's Cterminus, thus the function of the appended protein may also be sensitive to fusion order. 50Many proteins have also been fused to the N-terminus of the silk derivative 4RepCT. 39,59In all of the examples listed, the functional proteins retained activity in the materials.An advantage of N-terminal fusions is that stable functional proteins can help protect the nascent fusion protein from aggregation or proteolysis during translation. 154This benefit is especially important when the selfassembling protein contains long, intrinsically disordered (unstructured) regions, as in Ubx and some silk and elastin variants.
Other self-assembling proteins require the functional protein to be fused to the C-terminus.One example is the selfassembling protein RTX.Although GFP fused to the Nterminus of RTX domain was poorly expressed, the C-terminal fusion was easily produced and assembled into materials upon dilution or EGTA addition. 152Likewise, appending cold-shock protein CspB to the C-terminus of Poly(A)-binding Protein Nuclear 1 (PABPN1), an amyloidogenic nuclear protein, produced a fusion protein that was easily produced, assembled, and retained the activity of CspB in the materials. 53In the CBD-CsgA-Mfp3 fusion protein, the chitin binding domain (CBD) misfolds when fused to the amyloidogenic protein CsgA and a mussel foot protein (Mfp3).Altering the fusion order and the choice of mussel protein resolved this issue. 97hile materials can be formed from a fusion of a peptide driving silica deposition (R5) to a spider silk derivative, silica deposition is more controlled when R5 is C-terminal to the silk, suggesting fusion order affects R5 folding or accessibility. 108 few self-assembling proteins permit fusions on either the N-or C-terminus.This ability is very desirable, since the activity of the functional protein may also be determined by fusion order.Thioredoxin can be fused to the N-terminus, Cterminus, or in the middle of elastin and still successfully form materials, although the activity of thioredoxin was never verified. 50No significant difference in production or assembly was observed in materials composed of either Blue Fluorescent Protein-ELP or ELP-Blue Fluorescent Protein. 50A hydroxyapatite binding domain, termed VTK, was fused to either the N-or C-terminus, or both, of a spider-silk inspired selfassembling protein.Although all three fusion proteins formed materials, the double fusion (VTK-silk-VTK) was better able to crystallize hydroxyapatite. 104All members of a family of fusion proteins containing one or more copies of Cytochrome b 562 and multiple copies of an SH3 (amyloid-forming) domain were able to form amyloid fibrils.However, the morphology of the fibrils was determined by the arrangement of Cytochrome b 562 sequences (N-terminal, C-terminal, or both) and the number of SH3 domains (2 or 3) in the fusion protein. 146learly, the ability of both the functional protein and the selfassembling protein to tolerate fusions to either terminus varies between proteins in a manner that is not yet predictable.
A third option is to insert the sequence of one protein into that of a host protein.This strategy requires the identification of a "cut" site on the host protein that can accept the inset without disrupting the host protein's structure or stability.Such sites are generally surface loops or turns that make few long-range contacts within the host protein. 155.4.Linkers.Many protein fusions require an amino acid sequence to link the self-assembling protein to the functional protein/peptide (Figure 1).The linker spatially separates the proteins, thus providing room for the two proteins to independently fold and function. 156Careful linker design can increase the folding, stability, expression and/or activity of fusion proteins, among other functionalities. 157,158The length and sequence of the linker are key factors in determining whether protein fusions are able to be produced, retain functionality, and ultimately form bioactive materials. 159lycine-rich linkers provide flexibility, which may be required for each protein in the fusion to independently fold and function. 160Indeed, glycine is a preferred amino acid in natural linkers. 160Longer linkers may be required for proteins with more complicated structures, including dimers and tetramers.For example, the longer sequences were used for Ubx fusions with such proteins, GTNIDDDDKHMSGSG or GSGSH, whereas GH was deemed a sufficient linker when fusing Ubx to small soluble proteins. 36For natural linkers, the average linker length is roughly 10 amino acids. 160In these examples, glycine was used to provide flexibility, while histidine and serine increased the water solubility of the linker.Similarly, the amino acid sequence LEALFQGPNS sequence was used to link IgG and Albumin binding domains to 4RepCT. 39In another example, the cold-shock protein CspB, when linked to a polyalanine sequence by either GAAGG or GAA, was able to fold and form fibrils.However, the GAA sequence prevented CspB folding, and thus these materials were inactive.Therefore, the increased length and/or the enhanced flexibility of the GAAGG linker is important for creating a successful fusion of CspB to polyalanine. 535][156][157]161,162 The amino acids Pro, Arg, Phe, Thr, Glu and Gln can act as rigid spacers. Althoug the fusion was not designed to form materials, the bifunctional enzyme complex Fluc-R-LRE, composed of Firefly luciferase (Fluc) and Luciferin-regenerating enzyme (LRE) fused together by a rigid linker (R, whose sequence is EAAAK), exhibits enhanced luminescence imaging, with stronger and more stable signal.163 Linker peptides that combined flexible, glycine-rich sequences with a rigid sequence allow the user to control linker flexibility, and thus the orientation and dynamics of the linked proteins.116,164 In contrast, some protein fusions have no need of a linker between different domains.This is the case for the protein fusions Ure2 1−80 -HRP, Ure2 1−93 -HRP, and Ure2 1−93 -AP composed of the HRP and AP enzymes fused to the amyloidogenic protein Ure2.These fusion proteins form amyloid nanofibrils and microgels.144,165 Likewise, the constructs BLP-ELP, Trx-ELP, CAT-ELP, and IL1Ra-ELP fusion proteins lack a linker peptide.50 3.5. TagPurification tags may also be used in some protein fusions, usually connected to the fusion protein by a flexible linker to prevent the tag from interfering with protein folding (Figure 1).A polyhistidine tag (6xHis or His 6 -tag) or FLAG-tag is often used for purification by affinity chromatography.Purification tags can be attached to either the N-or Cterminus of the functional or self-assembling protein.104,108,166 However, some proteins, such as Ubx, do not assemble with even short peptides or tags fused to one of their termini.36 Furthermore, it is important that the tag is sufficiently exposed to be able to bind the chromatography resin.Consequently, fusion order can also be important for short purification tags. 167 Theost of subsequent purification steps used in concert with specific purification tags can vary widely.168 Intriguingly, due to their unique ability to withstand temper-ature transitions, aggregating tags such as ELPs can function both as a purification mechanism and as a self-assembling protein material.169,170 Depending on the net solubility of the fusion, some proteins with an aggregating tags will be expressed as inclusion bodies in bacteria, allowing for insoluble fraction collection.171−174 Several recent reviews on ELPs and other tag sequences are available.168−170,175−177 3.6.Typical Fusion Design Difficulties and Solutions.Despite its numerous advantages, several aspects of fusion and assembly could fail.Incorporation into materials may damage the structure of the functional protein or impede access of the molecules that bind the functional protein (Figure 3).These inhibitory packing interactions could occur within a single polypeptide chain, or between components of different polypeptide chains.Furthermore, inhibitory interactions could prevent folding or assembly of the materials-forming protein.Difficulties with generating and assembling protein fusions have prevented this technique's widespread use (Figure 4).However, many creative solutions have been devised to overcome these issues.
3.6.1.Resolubilizing Proteins from Inclusion Bodies.Methods to purify protein from inclusion bodies can perform better than standard chromatographic approaches for purification. 178Some self-assembling proteins (or proteins fused to aggregating tags) will express as inclusion bodies in bacteria.Most of these proteins are folded in inclusion bodies, and express in this manner due to their propensity to interact with each other.However, if a particular protein is only partially folded in inclusion bodies, it may be susceptible to degradation by proteases. 172This phenomenon is very protein dependent, as some proteins show increased resistance to proteases when produced in inclusion bodies (reviewed in ref 179).A second issue is how to resolubilize inclusion bodies so that materials can be formed in a controlled manner.For proteins, like self-assembly proteins, that are folded or nearly folded, harsh denaturants may not be necessary to disrupt the inclusion bodies.−182 3.6.2.Proteolysis and Aggregation.The intrinsically disordered, or unstructured, regions of functional proteins have the potential to impact protein fusions through a variety of mechanisms.The presence of unstructured regions increases proteolysis rates, 183 potentially inactivating the functional protein or releasing it from the self-assembling protein.In some cases, unstable or intrinsically disordered proteins are also more susceptible to amorphous aggregation, reducing the yield of soluble monomers.Finally, intrinsically disordered regions occupy more volume in solution than a structured region similar in sequence length. 183Therefore, disordered proteins with a shorter amino acid sequence than their structured counterparts may still exceed the size limits for fusion to a given self-assembling protein.
−186 Such aggregation could compete with the formation of orderly materials.Thus, the high protein concentrations typically required for in vitro assembly may present challenges.A list of assembly concentrations for specific self-assembling proteins is provided in Table 3.
3.6.3.Low Yield When Expressing the Fusion Protein.Finally, appending a functional protein has the potential to either increase or decrease the production yield of the corresponding recombinant fusion protein.Fusion to a wellfolded, stable, soluble protein is a well-established strategy for improving the expression and final yield of a soluble protein of interest. 36,152Conversely, fusion to a desirable, yet less soluble protein will reduce protein yield.Reductions in protein solubility, and hence protein expression, caused by adding a functional protein can sometimes be corrected by adding a third, more soluble protein. 36As one example, adding EGFP to the VEGF-Ubx fusion protein improved protein expression without interfering with VEGF activity; both VEGF-Ubx and EGFP-VEGF-Ubx fibers were able to signal to cells. 12,36gure 4. Potential problems caused by fusion proteins and noncovalent strategies for resolving fusion protein difficulties.A) Problems with fusion proteins fall into two categories, Loss of Function and Prevention of Assembly.Loss of function can be caused by the self-assembling protein (gray) inactivating the functional protein (dark blue), which in turn can lead to proteolysis or aggregation.In addition, direct binding by the functional protein to the self-assembling protein could sequester critical protein surfaces needed for either function or materials assembly.Conversely, the functional protein could prevent assembly by impairing contact between self-assembling proteins because the functional protein is too large, too charged, or because quaternary structure mispositions the self-assembling protein monomers.B) Use of unfused protein to generate noncovalent interactions between unfused protein and fusion monomers.Silk-silk interactions can bind silk chimeras bearing a functional protein (gray/blue double spheres) to a material composed only of silk (gray spheres).This approach has been used to coat the surface of silk with collagen, fibronectin, Z domain, and xylanase. 98,189) Functionalization of antibodies via noncovalent interactions.Materials composed of Z-4RepCT fusions harbor a Z domain (blue component of gray/blue double spheres) which noncovalently binds the constant region of antibodies (green).The antibodies, in turn, can recognize their cognate ligand, in this case VEGF dimers (red double spheres). 39.6.4.Loss of Activity during Assembly.Many proteinbased materials are assembled in harsh conditions (e.g., high temperature, low pH, or organic solvents), which are likely to unfold or inactivate most functional proteins. 36,38,54,187,188The most general solution to this problem is, when possible, to assemble materials in near physiological conditions to maintain the activity of the appended protein. 38,39In some cases, the sequence of the self-assembling protein can be engineered to allow assembly in conditions that preserve the activity of the fused protein. 39aterials assembly, even in physiological conditions, can theoretically denature or inactivate the functional protein (Figure 4A). 165For instance, if folding of the functional protein yields less free energy than assembly of the materials, then assembly could drive denaturation of the functional protein.If the materials-forming protein formed additional, noncovalent interactions with the functional protein that blocked its activity (e.g., physically blocked a ligand binding site), then the functional component of a fusion protein would be inactive in both the monomeric and assembled states.For example, although monomers of alkaline phosphatase (AP) fused to Ure2 1−93 prion domain were active, reduced catalytic activity in the materials suggests the structure of AP may be damaged by assembly. 165In addition, k cat /K m was also reduced, suggesting that the substrate access is also hindered in the materials. 165n alternative approach is to fold the added protein after materials assembly.Folding is generally triggered by adding an interacting ligand or cofactor.For instance, a derivative of Nephila clavipes spider dragline silk fused to the C-terminal domain of Dentin Matrix Protein 1 (CDMP1) was only able to self-assemble into films and form β-sheets in the presence of methanol, which would unfold many proteins. 190Upon incubating the assembled materials in simulated body fluid, a solution with ion concentrations similar to human blood plasma, Ca 2+ bound to CDMP1, causing CDMP1 to fold postassembly and mineralize the films. 190This feature could potentially be applied in bone tissue engineering. 190.6.5.Fusion Impedes Assembly.The reverse problem can also occur, in which the functional protein impairs or prevents materials assembly.For instance, when the enzyme Xylanase (Xyl) was fused to a derivative of recombinant spider silk (4RepCT), the structure of the materials was unaffected; however, the assembly kinetics noticeably slowed. 59Moreover, the Xyl-4RepCT fusion accelerated disintegration of the material.This effect was countered by using a mixture of Xyl-4RepCT and nonfunctionalized 4RepCT and steaming the resulting foam. 59Highly charged functional proteins may slow or prevent materials assembly through charge−charge repulsion.Furthermore, functional proteins with quaternary structure (e.g., dimers, trimers) have the potential to misposition the self-assembling protein and prevent materials assembly. 36As in the Xyl-4RepCT example, mixing unfused monomers with the functionalized protein fusion can mitigate steric clashes caused by quaternary interactions.However, multimerization of the functional protein does not always inhibit materials formation, as observed for VEGF-Ubx, SDF-1α-Ubx, bFGF-Ubx, AmCyan-Ubx, TheSSB-Ubx, TmaSSB-Ubx, and Ure2 1−93 -AP. 12,36,165Interestingly, Ure2 1−93 -AP proteins retain the ability to self-assemble despite the fact that alkaline phosphatase (AP) forms a dimer with mirror symmetry whereas the Ure2 dimer is axially symmetric, and no flexible linker was included to separate the two proteins. 165nderstanding what properties of these proteins enable such versatile applications is key to improving the design of new materials.
3.6.6.Steric Hindrance.A major concern in designing protein fusions is steric hindrance between the fused proteins.−193 Although the focus of this review is macroscale materials, fusions to amyloids and VLPs provide useful examples of steric hindrance.Fusion to proteins that form nanoscale materials, such as the antigen to viral coat/capsid proteins, may result in steric hindrance and protein misfolding, 194−196 although there are many notable successes.Huang and Mason (2004) used the Agrobacteriummediated transient expression system to produce and test antigen fusions formed by the Hepatitis B surface antigen (HBsAg) and GFP. 191When GFP is placed on the N-terminus of HBsAg, but not on the C-terminus, the protein fusion GFP-HBsAg folds correctly and assembles into VLPs.Thus, the study concludes that steric hindrance does not allow correct folding of a functional protein to the C-terminus of HBsAg, but any protein up to 239 amino acids can be added the Nterminal position. 191,197Likewise, fusion of GFP to Cauliflower Mosaic Virus Movement Protein changes the topology of the assembled materials due to a steric conflict between the GFP portions of the proteins. 198Cappelli and co-workers compared the success of fusing a β-barrel domain to 3 materials-forming proteins and 3 viral coat proteins, which form particles of varying sizes. 199Two of the three tested viral coat proteins formed successful fusions as well as all of the nonviral materials forming proteins.
Amyloid-forming materials may also be hindered by creating fusion proteins.This steric clash can be overcome by including extra copies of unfused protein, or altering the proteins involved.For instance, the full-length N-terminal prion domain of Ure2, which contains 93 residues, was fused to the N terminus of HRP and resulted in low enzyme activity of the amyloid fibrils.However, when the prion domain was shortened to 80 residues the catalytic activity approximately doubled.The authors concluded that the larger 93 amino acid Ure2 domain hindered substrate access to HRP. 165 Steric hindrance can be avoided by selecting more compatible proteins for fusion, by including a long flexible linker, or by reducing the dose of the functional protein in the final material.3.6.7.An Alternative Approach to Using Fusions: Attaching Functional Proteins Postassembly.If other solutions fail, one other approach to remove problematic interactions between the functional and self-assembling proteins is to create materials using unfused protein, and then rely on noncovalent interactions between the selfassembling proteins to adhere the fusion monomers to the surface.As an example, silk−silk interactions enable recombinant silk fusion proteins to bind the surface of silk materials (Figure 4B). 98,189For protein fusions that are difficult to produce, this approach efficiently places the functional fusion protein only on the materials surface, where it will be most effective.A disadvantage of this approach is that surface coverage must be assessed.In a more complex design, materials were functionalized with antibodies through multiple noncovalent interactions on the material's surface. 39The Z domain of Protein A, which binds the constant region of antibodies, was fused to 4RepCT silk (Figure 4C).These Z-4RepCT silk materials are capable of noncovalently binding to the constant region of a wide array of antibodies. 39Films and fibers of silk fusion protein Z-4RepCT, noncovalently bound to an anti-VEGF antibody, are able to specifically bind VEGF or other antibody/protein antigen pairs without the delay of additional cloning and materials production steps. 39Because the functionalized proteins are added after materials assembly, the danger that the materials formation process could unfold the functionalized protein is eliminated.However, adhering the functional protein to the materials via noncovalent bonds raises the possibility that the functional protein will leach from the materials at a rate determined by the dissociation kinetics of the Z domain:antibody and antibody:functional protein interactions.For applications that require the functional protein to be released, modifying these noncovalent interactions provides opportunities to tune the release rate.
An alternative to fusing the functional protein directly to the self-assembling protein is to attach the functional protein to the materials postassembly via an enzymatically catalyzed covalent bond.In the Streptococcus pyogenes fibronectin-binding protein (FbaB), an isopeptide bond forms spontaneously between the carboxyl terminus of one protein and the amino group of a lysine side chain.When the FbaB protein is split into two complementary pieces, the fragments, termed SpyTag and SpyCatcher, reunite and form an amide covalent bond in minutes. 200Each fragment can be genetically fused either to the functional protein or to the self-assembling protein.The FbaB fragments form a covalent bond that links the functional protein to the materials postassembly (Figure 5). 201,202Similarly, fragments of intein proteins can covalently link a functional protein to a self-assembling protein via an amide bond. 203,204The SpyTag/SpyCatcher system was used to attach the enzyme α-glucosidase (Ima1p) to the surface of VLPs formed by the major capsid protein VP2 from Parvovirus B19 (B19 VLPs) as a potential treatment of glycogen storage disease type II (also known as Pompe disease).The resulting materials retained enzyme activity and demonstrated enhanced stability when compared to monomeric fusion protein. 205imilarly, the Covid Receptor binding domain was attached to capsid-like particles using SpyTag/SpyCatcher to create an effective vaccine. 206The SpyTag/SpyCatcher system has also been used to enhance gelation by elastin-like proteins, including fusions in which heterologous proteins have been placed in the middle of the Elastin-Like Protein amino acid sequence. 51However, adding split protein fragments using SpyTag/SpyCatcher or the related systems SnoopTag/ SnoopCatcher or DogTag/DogCatcher has the potential to reduce the solubility and production yield of the resulting fusion proteins, or decrease surface coverage. 202Thus, a new system has recently been developed using SnoopTag2, DogTag2, and Snoopligase2, in which the ∼15 kDa proteins domains are replaced with short peptides. 207Another alternative approach is to use self-labeling protein tags, such as HaloTag7, SNAP-tag, and CLIP-tag (reviewed in 207). 208hese enzymes form covalent bonds with synthetic molecules, which, if attached to another protein, would create a covalent link between the two proteins.Reactive enzymes are available that create a wide variety of covalent linkages (Reviewed in Figure 5. Covalent linkage of proteins via the SpyTag/SpyCatcher or SnoopTag/SnoopCatcher system.A) SpyTag (green), when genetically fused to a heterologous protein (gray pentagon) will bind SpyCatcher (red) fused to a different protein (gray sphere).After SpyTag and SpyCatcher form an isopeptide covalent bond linking the two proteins, the fused proteins (pentagon and circle) are covalently linked to each other.B) The SpyTag/ SpyCatcher and SnoopTag/SnoopCatcher systems use similar amino acids to form an isopeptide bond.C) The isopeptide bond (blue) formed between SpyTag (green) and SpyCatcher (red).Figure constructed using pdb file 4MLI. 21208).These enzymes have been systematically compared to facilitate selection of the optimal system for a given application. 209More detailed descriptions of the enzymatic ligation and affinity binding strategies for immobilizing proteins can be found elsewhere. 15,207,210,211

APPLICATIONS OF MACROSCALE PROTEIN-BASED MATERIALS
3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19]27,213 Opportunities for specific applications arise both from the general properties of each protein material and the range of proteins that can be incorporated as protein fusions. This mdular approach allows the inclusion of one or more functional peptides or proteins into a single protein sequence with tunable properties.36,213 Protein-based materials are being developed for a variety of biomedical and biotechnological applications such as chemical catalysis, antibody capture and immobilization, cell binding, tissue engineering/regenerative medicine, and drug delivery.3−14 4.1.Chemical Catalysis.The efficiency, stereoselectivity, and ability of enzymes to catalyze reactions in environmentally friendly conditions have increased the interest in using enzymes for industrial chemical production. 59,214Chemical catalysis through the immobilization and stabilization of enzymes on solid supports has many advantages, including facile separation of enzyme from the reaction mixture, which both allows enzyme recovery and aids product purification.165 These advantages reduce product cost, an especially important aspect in industrial processes.59 Although most enzyme immobilization strategies have relied on either physical entrapment, biotin/streptavidin recognition, adsorption, or nonspecific binding to a flat or microparticle surface, 214 enzymes have also been genetically fused to proteins that form materials, including the silk derivative 4RepCT, Ubx, and the amyloidogenic protein Ure2 (Tables 2 and 3), and retained their activity once inside these materials.36,38,59 Another example of catalytic biomaterials includes materials based on the enzyme organophosphate hydrolase (OPH) fused to varying numbers of α-helical leucine zipper domains (H domains) and soluble linker domains (S domains), both with and without an N-terminal polyhistidine purification tag.60 All of the protein constructs were able to self-assemble into enzymatic hydrogels, which retained the ability to degrade organophosphate (OP) compounds, making them useful to be applied as OP biosensors or in bioremediation and decontamination systems.Furthermore, OPH-based hydrogels were capable of retaining their catalytic activity after 5 months of cold storage in buffer solution.In addition, a lyophilized protein sample was still able to form hydrogels and catalyze OP degradation upon rehydration after 6 months of storage at −20 °C. Nature sometimes solves this problem by fusing together multiple enzymes that catalyze different steps in a series of chemical reactions.215,216 Materials composed of fusion proteins provide a unique opportunity to manipulate the geometry of enzyme packing to reduce product diffusion within the materials, and thus increase reaction yields.
The chemical environment can also be used to manipulate enzyme activity.Horseradish peroxidase, when genetically fused to an ELP, coacervates in response to osmotic stress. 217nterestingly, the resulting increase in substrate and enzyme concentrations accelerate the reaction rate, providing an avenue for dynamic control of the enzyme via assembly and disassembly. 217,218.2.Antibody Capture and Immobilization.Many applications are based on functionalization of materials with affinity domains that recognize antibodies.Once antibodies noncovalently bind these materials, the materials can be potentially used to bind specific ligands (as in sensors), 41 to bind and display proteins (for tissue engineering applications), 39 or to deliver proteins (for therapeutics). 39,44In this versatile approach, the same base material can be used with any antibody/antigen combination.In particular, effective immobilization of antibodies on a solid support is of extreme importance for the specificity and sensitivity of immunoassay procedures. 41,219For instance, recombinant mussel adhesive protein (MAP), when genetically fused with the B and C domains of the immunoglobulin G (IgG)-binding Protein A, enabled efficient and oriented antibody immobilization onto diverse surfaces such as glass, polystyrene, and aluminum. 41his functionalized material has potential applications in immunoassays and immunosensors. 41As discussed above, the Z domain of protein A, which binds IgG, has been genetically fused to the silk derivative protein 4RepCT, 189 The resulting materials were used to immobilize the growth factor VEGF, which is largely used in cell culture and tissue engineering applications.These versatile materials could potentially be applied to immobilize other specific proteins depending on the type of antibody captured. 39.3.Cell Binding.−222 Integrin-and fibronectin-binding sequences were both inserted into the collagen-like protein Scl2 from Streptococcus pyogenes fused to recombinant silk, thus creating chimeric collagen-silk fibers, allowing these materials to attach and grow human mesenchymal stem cells (hMSCs). 98ibronectin is a component of the extracellular matrix (ECM) that binds to many molecules, including other ECM proteins and cell adhesion molecules, and is composed of subunits termed Type I, Type II, and Type III. 222ecombinant silk-elastin-like proteins (SELPs) blended with 6mer+FNII, which is a sequence repeat from spider silk genetically fused to fibronectin type II, formed films with increased ability to promote cell adhesion of human skin fibroblasts when compared to 6mer+FNII alone. 40Likewise, coating a plastic surface with a fibronectin-silk variant (FN-4RepCT) improves the seeding efficiency of dermal microvascular endothelial cells. 223The great variety of peptides that recognize specific types of normal and diseased cells 224 have the potential to increase cell attachment and migration to materials for applications involving diverse tissues. 225These tools make peptide fusion materials a particularly powerful tool for cell analysis (cell isolation, detection) and tissue engineering (cell attachment and patterning) applications.
4.4.Tissue Engineering.Protein-based materials are widely applied in tissue engineering and regenerative medicine (reviewed in refs 224−226). 40,64,104,106Because cells respond to both mechanical and chemical cues, matching the materials' properties to the target tissue is very important for tissue regeneration or engineering.−128,227−232 This plasticity allows the properties of the materials to be tailored to suit a variety of applications. 30 large number of proteins and peptides function as chemical cues that control cell behavior and support cell growth, attachment, and proliferation within the scaffold.In addition to peptides, growth factors (generally dimeric proteins) have also been fused to materials to direct cell behavior.For instance, VEGF-Ubx fibers can induce migration, attachment, and survival of human umbilical vein endothelial cells (HU-VECs). 12A double-layered adhesive microneedle bandage was able to preserve cardiac muscle and enhance regeneration during rat heart remodeling.These bandages were composed, in part, of Mussel Adhesive Protein fused to peptides derived from VEGF and fibronectin. 233Recently, the Xia lab fused PDGF-BB to silk sericin to create hydrogels that both augmented osteogenesis by mesenchymal stem cells and inhibited cell differentiation into adipocytes. 228Despite lacking a folded protein structure, peptides can be surprisingly potent.An ELP, modified with peptides to promote laminin adhesion and hydroxyapatite binding, forms osteoid tissue and enhances bone healing in vivo. 234Such materials have the potential to be used not only as scaffolds for tissue engineering, but also as printable bioinks, allowing spatial control of cell function during 3D tissue reconstruction. 235any protein biomaterials have been engineered to induce biomineralization and aid in bone regeneration. 66,104,236,237A recombinant silk derivative fused to bone sialoprotein (BSP), termed 6mer + BSP, formed films that retained the capacity of BSP to induce deposition of calcium phosphate, which is important for bone mineralization. 66Films formed by 6mer alone did not induce mineral deposition.Furthermore, 15mer recombinant silk separately fused to the N-terminus, Cterminus, or both termini of the hydroxyapatite binding domain VTK increased the formation of crystalline hydroxyapatite and allowed growth and differentiation of hMSCs. 104n a final example, a transgenic silk fibroin modified with a Drich peptide has a higher Ca 2+ -binding activity in vitro and presented higher activity and mineralization capacity than natural silk fibroins. 37.5.Drug Delivery.Protein-based materials also have the potential to be used as drug delivery systems. 24,26Genetically engineered biomaterials that are biocompatible, biodegradable, and bind ligands with high specificity have the potential to be used for the controlled release of drugs and other molecules. 5,125,238Recombinant lipid-ELP biomaterials have been developed to self-assemble into drug delivery micelles of different sizes and shapes, such as spheres and rods (Figure 6).The yeast N-myristoyltransferase enzyme was coexpressed with ELP in E. coli followed by a post-translational modification of ELP with myristic acid.Myristoylated ELPs can thus form micelles with lipid cores that can encapsulate hydrophobic anticancer drugs, such as doxorubicin and paclitaxel. 239NGR-ELP BC micelles were constructed to target tumors by fusion of ELP block copolymers, consisting of both hydrophilic and hydrophobic ELP blocks, and NGR, which targets the CD13 membrane protein in angiogenic blood vessels in tumors. 240dditionally, the fusion of the p53-interacting anticancer protein, azurin, to ELP resulted in the formation of micelles at temperatures higher than the lower critical solubility temperature. 80,171The resulting micelles were found to have six times higher apoptotic induction in LNCaP prostate cancer than the azurin-ELP monomers, indicating an effective delivery system for targeted cancer treatment. 80ariation in the number and ratio of silk and elastin motifs in SELPs tunes the aggregate size, stability, and thermoresponsive behavior in nanogels composed of these proteins. 121mportantly, the ability to tune the physicochemical properties of silk nanogels alters the pore size, which in turn dictates the diffusion of molecules through hydrogels. 241Additional structures and physical properties have been achieved by combining ELPS with synthetic polymers.1 and 2. The identity of the self-assembling protein in each fusion is indicated by color.
generically fused to capsid-like particles, elicited a much more robust immune response than free Receptor Binding Domain protein, highlighting the benefits of creating particles with a high density of the antigen. 206Indeed, increasing antigen density by fusing to capsid-like particles or VLPs have increased the immunogenicity of other antigens as well. 242,243his display of multiple antigens both elicits a broader response, and improves cell uptake. 195,196Therefore, viruses may be much more amenable to this potent approach to materials functionalization than generally assumed.Other proteins that form nanoscale materials also see improved antigenicity, including hemagglutinin-ferritin and flagellin fused to several epitopes from SARS-CoV-2. 244,245These increases in cell attachment and cell response appear to occur in macroscale materials, 12 although far fewer quantitative studies have been reported.

CONCLUSIONS AND FUTURE PERSPECTIVES
5.1.Summary.Each protein-based material has a unique set of characteristics that define its optimal applications, such as biocompatibility, surface chemistry, and mechanical properties.As the number and diversity of protein-based materials grows, the opportunity to match or engineer the properties of a material for a specific application improves dramatically.Applying the protein fusion technique to these proteins allows creation of covalently functionalized materials in a single pot, single component, single step reaction.For each selfassembling protein, the design of protein fusions will be limited by factors, such as protein size, extensibility, stability, and fusion order, that are unique for that material.Nevertheless, the number of materials that have been functionalized by protein fusion has expanded in the past decade, creating new opportunities for materials engineering and functionalization.Figure 6 depicts all of the fusion protein materials discovered through an extensive literature search, as a function of the appended proteins' size.Clearly, while incorporated proteins vary in size and complexity, most successful fusion proteins are monomers that are <50 kDa in size. 207Mixing fusion proteins with unfused proteins in materials may allow larger proteins to be incorporated using existing materials. 246ese successes have also revealed gaps in knowledge and in the currently available techniques that limit this field.

Materials Structure and Assembly Mechanisms.
Compared with proteins that form nanoscale materials, selfassembling proteins that are composed of many interaction motifs interspersed with flexible linking regions−such as silk, elastin, and Ubx -are capable of forming the most different types of macroscale materials (Figure 7).For many macroscale materials, little is known about the mechanisms of monomer assembly, or the structure and protein interfaces that form the final product.Understanding these mechanisms for a wide variety of protein systems should improve both the design of new self-assembling proteins and facilitate optimization of existing materials. 247These traits have been explored for many proteins that produce nanoscale materials, such as virus-like particles or capsid proteins, bacterial S-layer proteins, and amyloids.Principles learned from these well-characterized nanomaterials may guide understanding of their macroscale counterparts.

Understanding Phase
Behavior/Multiphase Behavior.Because protein assembly is driven, at least initially, by noncovalent interactions, the physical and chemical environment can have a substantial impact on materials formation.Thus, phase behavior is an inherent property of self-assembling proteins, and a deeper understanding of how to add or eliminate the dependence of assembly on a particular parameter is needed to design new, sensitive protein systems and to enable their reliable use in many applications. 274−276 Because each phase transition is sequence-dependent, fusing two ELPs together, each with a different sequence, will create a protein with multiple transitions, and thus multiple phases. 276his idea is extended in artificial IDPs designed by the Chilkoti lab, which designed two proteins, one an artificial PGXGrepeat ELP, and the other a partially ordered polymers (ELP with polyalanine helices inserted within the sequence).Mixtures of these proteins create tunable, complex unique microarchitectures using only droplet microfluidics and heating/cooling, both scalable approaches. 275−273 multiphasic sequences to respond to the environment allow the creation of new 3D architectures without having to change the sequence or the assembly method.The function of these materials, and thus their potential applications, is determined by the size, shape, and internal architecture of the materials. 275hese very attributes create unique challenges for these materials.First, once the desired material has been created, many applications would require the material to maintain its shape through transfer to other solvents or temperatures required for storage, shipping, or in its final application.Such changes elicit structural transitions in first-generation materials.Second, many applications require predictable behavior, so that the materials only respond to a specific desired event (e.g., ligand binding), and not to other factors such as changes in pH, ion concentration, or electric potential that can occur in vivo. 274One answer to this design issue is to use unnatural amino acid incorporation to photo-cross-link the materials, thus trapping them in the preferred form. 275.4.Predicting Fusion Success.Not all protein fusions are successful.As discussed above, the production and assembly of materials-generating proteins can only accommodate a fused functional protein on the N-or C-terminus.In addition, not all functional proteins are equally good candidates for fusion to a self-assembling protein.An interesting example is fusing the LOO-GFP family of proteins to Ubx.LOO-GFPs are biosensors in which OPT GFP has been circularly permuted and then terminally truncated to remove single β-strand from the GFP barrel.Different LOOs, with either strand 7, 8, or 9 missing, were fused to Ubx and expressed together with their missing peptide in E. coli.Large differences in protein accumulation for the different LOO-Ubx variants were observed, even though the appended proteins were nearly identical and linkers were used to reduce steric hindrances during folding. 246here is no obvious way to predict whether an N-terminal or C-terminal fusion is likely to be more successful.Based on the data collected in Tables 1 and 2, roughly twice as many Nterminal fusions have been constructed (50) relative to Cterminal fusions (27), both of which far outstrip fusions in the middle of the sequence (4) or to both termini (5).However, these differences likely reflect both protein capability and bias in fusion design.A meta-analysis of the existing information for macroscale proteins might reveal unexpected correlations between local sequence properties and success rate.Although many successful fusions with nanoparticle-forming proteins have been reported, a large number of scientists in this field report unsuccessful fusions.One possibility is that the fusion success rate is higher for proteins that form macroscale materials because these proteins often have intrinsically disordered regions, which both provide the flexibility required to avoid steric clashes and the extensibility required to accommodate the appended protein within the 3D structure.In support of this idea, the Ubx protein cannot tolerate fusions to its structured C-terminus, but can support many diverse fusions to its intrinsically disordered N-terminus (Table 2).
5.5.Patterning.Many applications may require the functional proteins to be patterned within the materials.For instance, tissues are composed of many cell types, and thus the proteins that instruct each cell type, as well as the target cells, will ultimately have to be patterned within the material.Similarly, the specificity and reliability of biosensors can be increased by multiplexing many sensing proteins that each bind a different single ligand.Techniques frequently utilized to produce silk-based scaffolds for tissue engineering can potentially be applied to create new micro-and nanoscale patterned silk materials, including nanoimprinting and 3D printing. 277Ubx-based materials can also be patterned to produce microscale spots, stripes, and gradients without the need of additional equipment or solution conditions that could harm the functional protein. 38Developing more general and versatile techniques for patterning proteins would significantly expand the number of protein materials that could be used in applications requiring patterning.
5.6.Ligand Diffusion.For materials designed to bind and/or deliver soluble ligands, the rate of ligand diffusion through the materials will impact binding and release kinetics.The structure, porosity, pore connectivity, surface chemistry and degradation rate of each material is expected to have a substantial impact on ligand diffusion; 278−281 however, little is currently known about the extent to which different materials permit or exclude diffusion of different types of ligands in protein-based materials.In one of the few studies in this area, Baxa et al. demonstrated that substrate diffusion through filaments only slightly reduces enzyme activity. 282The diffusion of molecules through nonprotein hydrogels has been more thoroughly studied., 241,246,[277][278][279][280][281]283,284 The main factors affecting diffusion rate include hydrogel pore size and electrostatic interactions between the hydrogel and the diffusing molecules.241 These hydrogel studies provide a useful guide for investigation of diffusion through protein matrices. The more chmically and topologically complex nature of a protein is expected to make similar studies in proteins complex, yet very interesting.5.7. Guet Protein Dynamics.Dynamics are a key aspect of the behavior of many proteins.For instance, enzyme motions often accompany chemical catalysis.285 Other proteins also require large-scale motions as part of their function.286 How does incorporation into materials impact the ability of these proteins to move and thus function?To what extent, if at all, is protein motion restricted in different materials?Can the sequence of a self-assembling protein be altered to enhance or suppress motions in a fused functional protein?5.8.Fusion Effects on Materials Properties. Litle is known about the impact of fusions on the properties of protein-based materials.In most cases, materials assembly and structure is generally only verified by microscopy. Thu the impact of fusions on assembly kinetics, protein structure, and mechanical properties of the host material are often ignored.However, fusion to EGFP decreases the extensibility and increases tensile strength of Ubx materials, 36 and the structural features of proteins fused to amyloids impact both the assembly and Young's modulus of the resulting materials.97 Forman et al. demonstrated that the conformation and position of proteins displayed on amyloid fibrils impact fibril morphology.146 Conversely, the materials could alter the function of the appended protein.For instance, interactions between the LOO8 biosensor protein and Ubx materials hampered analyte binding, a problem that was largely mitigated by altering the buffer.246 These results highlight the need to better investigate the ability of fused proteins to impact the properties of materials and vice versa.
5.9.Interfaces with Nonprotein Materials.Finally, several applications may require functionalized protein-based materials to interface with other types of materials.In tissue engineering, interfaces between tissue types (e.g., bone/ tendon) may be required to rebuild damaged bodies.Likewise, sensors must not only bind ligand, but also detect when ligand is present.For many approaches, this will require sensors to stably bind detection devices such as a quartz-crystal nanobalance. 41New proteins may need to be designed or evolved to meet these needs.
Addressing these issues in any material will substantially increase the utility of all protein-based materials.In addition, each protein-based material exhibits a unique set of beneficial attributes and engineering challenges.Identifying new selfassembling proteins amenable to protein fusion and defining the range of proteins that can be successfully fused will expand the toolbox, allowing materials to be selected that better match the needs of each application.Finally, developing creative assembly techniques has the potential to incorporate difficult proteins that cannot be immobilized by standard singlecomponent approaches. 39,287

Figure 2 .
Figure 2. Protein fusion does not significantly impact Ubx assembly into materials.A) Comparison of Ubx materials formation at room temperature for 16 h for unfused Ubx (left) and multiple Ubx peptide and protein fusions.The monomer concentration was 50 nM in all samples.Materials formation was quantified as the length of fiber drawn from the assembly solution, averaged over three experiments. 36B−E) Differential interference contrast microscopy of Ubx, VEGF-Ubx, MBP-Ubx, and PFK-Ubx fibers, respectively, demonstrates that protein fusions do not alter the structure of Ubx fibers even when the appended protein forms dimers (VEGF-Ubx) or tetramers (PFK-Ubx). 36F) Fibers formed from mixtures of Ubx and EGFP-Ubx fluoresce to different extents.The percent of EGFP-Ubx in each mixture is labeled on each panel.G) Fiber fluorescence (green bars) scales linearly with the percent of EGFP-Ubx in mixture of monomers, indicating neither Ubx nor EGFP-Ubx assembles better than the other protein.To account for differences in fiber size, the intensity of green fluorescence was normalized to fiber width.Blue bars represent the normalized blue fluorescence from dityrosine bonds present in all Ubx materials.90Reprinted with permission from Tsai et al., 2015.36 Figure 2. Protein fusion does not significantly impact Ubx assembly into materials.A) Comparison of Ubx materials formation at room temperature for 16 h for unfused Ubx (left) and multiple Ubx peptide and protein fusions.The monomer concentration was 50 nM in all samples.Materials formation was quantified as the length of fiber drawn from the assembly solution, averaged over three experiments. 36B−E) Differential interference contrast microscopy of Ubx, VEGF-Ubx, MBP-Ubx, and PFK-Ubx fibers, respectively, demonstrates that protein fusions do not alter the structure of Ubx fibers even when the appended protein forms dimers (VEGF-Ubx) or tetramers (PFK-Ubx). 36F) Fibers formed from mixtures of Ubx and EGFP-Ubx fluoresce to different extents.The percent of EGFP-Ubx in each mixture is labeled on each panel.G) Fiber fluorescence (green bars) scales linearly with the percent of EGFP-Ubx in mixture of monomers, indicating neither Ubx nor EGFP-Ubx assembles better than the other protein.To account for differences in fiber size, the intensity of green fluorescence was normalized to fiber width.Blue bars represent the normalized blue fluorescence from dityrosine bonds present in all Ubx materials.90Reprinted with permission from Tsai et al., 2015.36

Figure 3 .
Figure3.Schematic depicting the groups of self-assembling proteins that have been fused to each other to alter the structural or mechanical properties of the resulting materials.130,131,135−151Each fused pair of proteins is depicted as a line, and triangles represent 3 proteins that have been combined into a single fusion protein.The weight of the lines roughly corresponds to the number of reported fusions.Suckerin is also known as squid ring tooth protein.

4 . 6 .
Enhancing Vaccine Efficacy.Although the focus of this review is macroscale materials, important lessons can be learned from protein fusions used to create nanoscale particles for vaccines.Fougeroux et al. demonstrated that displaying the Receptor Binding Domain of SARS-CoV-2 Spike protein when

Figure 6 .
Figure 6.Analysis of the size fusion proteins reported in Tables1 and 2. The identity of the self-assembling protein in each fusion is indicated by color.

Figure 7 .
Figure 7. Materials formed using self-assembling macroscale and nanoscale proteins.A) Macroscale materials are linked to which types of materials have been formed.Generally, most macroscale materials can form more different morphologies of materials.36,39,48,123,151,153,220,247−267B) Nanoscale materials have been linked to material types which have been formed.58,68,268−273

Table 1 .
A Representative Subset of Protein Domains Genetically Fused to Self-Assembling Proteins That for Macroscale Materials Is Listed a

Table 2 .
Full-Length Proteins Incorporated into Macroscale Protein-Based Materials as Protein Fusions a CTo solve this problem, a broad range of organisms are being used as cell factories as an alternative to E. coli, including a recently developed strain of E. coli that only produces a truncated endotoxin, other types of bacteria, yeast, insect cells, and mammalian cells (especially Chinese hamster ovary cells), among other production systems.18−20

Table 2
a Linker refers to the 1 letter code for amino acids inserted between the functional protein domain and the self-assembling protein.Fusion Terminus indicates whether the functional protein was fused to the N-terminal side, the C-terminal side, or in the middle of the self-assembling protein.b Produced in Bombyx mori instead of E. coli.c ELP, elastin-like protein.d Silkworm silk derivative.e Spider silk derivative.

Table 3 .
Concentrations of Self-Assembling Proteins Typically Used to Form Materials