Precision Templated Bottom-Up Multiprotein Nanoassembly through Defined Click Chemistry Linkage to DNA

ABSTRACT: We demonstrate an approach that allows attachment of single-stranded DNA (ssDNA) to a defined residue in a protein of interest (POI) so as to provide optimal and well-defined multicomponent assemblies. Using an expanded genetic code system, azido-phenylalanine (azF) was incorporated at defined residue positions in each POI; copper-free click chemistry was used to attach exactly one ssDNA at precisely defined residues. When an appropriate residue was chosen, ssDNA conjugation had minimal impact on protein function, even when attached close to active sites. The protein-ssDNA conjugates were used to (i) assemble double-stranded DNA systems with optimal communication (energy transfer) between normally separate groups and (ii) generate multicomponent systems on DNA origami tiles, including those with enhanced enzyme activity when bound to the tile. Our approach allows any potential protein to be simply engineered to attach ssDNA or related biomolecules, creating conjugates for designed and highly precise multiprotein nanoscale assembly with tailored function.

reaction handles capable of biocompatible reactions that can be placed at precise and designed locations in a protein of interest (POI) irrespective of its starting amino acid composition. For example, protein farnesyltransferase (PFTase) has been used to label C-terminal tetrapeptide tagged proteins with an azide-modified isoprenoid diphosphate for copper catalyzed alkyne−azide click chemistry (CuAAC), 15 and nitrilotriacetic acid (NTA), which forms chelate complexes, has been placed on DNA to localize histidine (His) tagged proteins. 16,17 While these strategies address the problem of multiple functionalization, they also severely limit the diversity of the protein-ssDNA linkage sites to the termini of the protein only. Furthermore, CuAAC can adversely affect protein structure (both at the primary and tertiary levels) and hence function. 18−20 Proteins have been modified with azides for CuAAC modification with alkyne-DNA; multiple azides were introduced by global replacement of methionine with azido-homoalanine, leaving only nonburied azides to react with DNA. 21,22 This methodology is, however, limited with respect to site selectivity and by the requirement for a Cu catalyst.
Here we report how a reprogrammed genetic code 23−26 can be used to introduce a single nonbiological reaction handle at a defined site in proteins, to which ssDNA can be attached through a bio-orthogonal and biocompatible copper-free strained ring promoted alkyne−azide cycloaddition (SPAAC) reaction (Scheme 1). 27,28 This dramatically expands the sites on a protein available for useful attachment of large adducts such as ssDNA beyond what is currently available in terms of optimal attachment site, stoichiometry, protein orientation, and interspecies communication. This in turn increases the precision and control of assembly, helping to tailor the function of the nanoscale assemblies. We demonstrate that the defined protein attachment site, coupled with precise orientation on DNA origami surfaces, has a significant impact on function.

RESULTS AND DISCUSSION
To investigate the role of modification site on protein function, two disparate systems were modified with ssDNA at different residues (Figure 1). Superfolder green fluorescent protein (sfGFP) 29 and TEM β-lactamase (BL) 30 were selected as model energy capture and catalytic proteins, respectively. Both sfGFP and BL have recently been shown to be amenable to SPAAC, through the introduction of an azide handle into the protein via the noncanonical amino acid p-azido-L-phenylalanine (azF) in conjunction with a reprogrammed genetic code. We have shown that the position of the azF is important in terms of overall efficiency of modification 31 and effect on the structure−function relationship. 32,33 Compared to standard SPAAC adducts such as fluorescent dyes, ssDNA represents a much bigger adduct with distinct chemistry, so it is important that the site of labeling is optimized in terms of its effect on the POI's structure and function. The residues were selected based on their proximity to functional regions and relative surface accessibility (Figure 1a, b).
The three sfGFP variants, namely sfGFP 34azF, sfGFP 132azF, and sfGFP 204azF (Figure 1b), were successfully modified with a bicyclononyne (BCN) 5′-functionalized DNA (Figure 1c). SPAAC-based click conjugation of the variants with a 32mer DNA differed in efficiency, as validated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis (see Supporting Information for details): 66%, 35%, and 40% coupled product was observed for sfGFP 34azF, sfGFP 132azF, and sfGFP 204azF, respectively. These yields differ from those observed for SPAAC with the bulkier dibenzylcyclooctyne moiety, with higher modification efficiency observed for sfGFP 34azF and sfGFP 132azF but lower for sfGFP 204azF. 31 The overall fluorescence intensity of unmodified and modified sfGFP variants was very similar, indicating that the DNA does not greatly affect function (Figure S3).

Scheme 1. Protein-DNA Conjugation via SPAAC. The ssDNA contains a terminal strained bicyclononyne (BCN) and the protein contains a genetically encoded p-azido-L-phenylalanine (azF; blue) at a defined residue position. The triazole link between the two components is highlighted.

ACS Nano
Precision modification via genetically encoded non-natural reaction handles enables optimization of communication between normally separate functional centers. The sfGFP-ssDNA conjugates were used to control the distance and thus energy transfer to a Texas Red (TR) dye. A 3′-TR modified complementary strand was hybridized to the sfGFP-DNA conjugates, and relative Förster resonance energy transfer (FRET) was measured for each of the three sfGFP variants (Figure 2a). FRET efficiency was highest for sfGFP 204azF and sfGFP 34azF (∼90%); both these variants have DNA conjugation sites close to the sfGFP chromophore (Figure 1b), with an estimated chromophore separation of ∼29 Å (based on an R_0 of 43 Å using the equation $E = 1/[1 + (r/R_0)^6]$). 31 The lowest observed FRET efficiency was for sfGFP 132azF (∼75%), the variant with the attachment site furthest from the chromophore, which is estimated to have a longer interchromophore distance (∼36 Å). The individual spectra of ssDNA-labeled sfGFP bound to ssDNA with and without TR are shown in Figure S7. The overall FRET efficiencies are in keeping with the conjugation position on sfGFP and thus the predicted proximity of the two chromophores in the double-stranded DNA (dsDNA) complex. Overall, the system allows for the construction of well-defined and tunable supramolecular systems with optimal communication/coupling, in this case energy transfer.
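The relationship between FRET efficiency and interchromophore distance quoted above can be checked numerically. The following is a minimal sketch, assuming only the Förster equation and the R_0 of 43 Å given in the text; the function names are illustrative, and the efficiencies entered are the rounded values quoted above, so the distances agree with the reported ones only approximately.

```python
def fret_efficiency(r, r0):
    """Foerster equation: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def fret_distance(e, r0):
    """Invert the Foerster equation: r = R0 * (1/E - 1)^(1/6)."""
    return r0 * (1.0 / e - 1.0) ** (1.0 / 6.0)


R0 = 43.0  # Foerster radius in angstrom, as stated in the text

# Rounded efficiencies quoted in the text for the three variants.
for label, e in [("sfGFP 204azF / 34azF", 0.90), ("sfGFP 132azF", 0.75)]:
    r = fret_distance(e, R0)
    print(f"{label}: E = {e:.2f} -> r of roughly {r:.1f} angstrom")
```

Because E depends on the sixth power of r/R_0, a difference of only a few ångströms in attachment geometry near R_0 shifts the efficiency by more than ten percentage points, which is why the choice of conjugation residue is visible in the FRET readout.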
To demonstrate designed multicomponent addressable bottom-up self-assembly, the sfGFP variants were conjugated to different DNA sequences complementary to extended staple strands on a flat DNA origami tile (see Supporting Information for origami and DNA sequence design). 34 The design is such that sfGFP variants are located close to the surface of the DNA origami tile (Figure 2b), with the interprotein distance being 10 nm. All sfGFP-DNA conjugates hybridize well to the origami tile through a single extended staple strand, as shown by the retained fluorescence of the systems after purification (gel filtration) to remove excess protein (Figure 2b). AFM imaging showed that the proteins were located at the same position on each tile (Figure 2c). Using sfGFP 204azF as an exemplar, one, two, or three proteins were attached to the tiles in the combinations highlighted in Figure 2b. Fluorescence increased according to the number of different sfGFP 204azF -ssDNA conjugates incubated with the tile. Therefore, the purification of the origami-protein conjugates by gel filtration, which is not normally performed on DNA origami conjugates, does not seem to impact the stability of the system. Thus, the attached sfGFP can be used as an optical handle to detect the origami tiles at very low concentrations (below the absorbance detection limit). Hybridization of the third sfGFP 204azF generated a slightly larger than expected increase in fluorescence (3.5-fold versus the expected 3-fold) compared to one or two molecules of sfGFP. A similar effect was observed when three different sfGFP variants were attached to the tile; there was a clear enhancement in fluorescence above that for a simple linear titration from 1 to 3 molecules (Figure S8). The exact nature of the functional enhancement effect is currently unknown, but a similar observation was made for the enzymatic efficiency of BL (vide infra).
The data demonstrate the ability to assemble tiles containing separate protein variants at different positions and to assemble arrays of the same protein while sampling different orientations. Since fluorescence readout is strongly dependent on the dipole orientation of the chromophore, which is particularly important in energy transfer, this system will allow vectorizing protein function (input and output in defined orientations) in single molecule mode, for example, when adsorbed on surfaces.
A more general and important question arising from the current approach for making enzyme-DNA conjugates in particular is how attachment site and orientation on a surface influence activity. To address both questions, azF-containing variants of BL were constructed (Figure 1c). The variants sampled different regions of BL, including: close to (BL 105azF, BL 165azF, and BL 237azF) and far from (BL 201azF) the catalytic center; fully solvent exposed (BL 201azF), partially exposed (BL 105azF, BL 165azF), and largely buried (BL 237azF) side chains; and resident in helical (BL 201azF), strand (BL 237azF), or loop (BL 105azF, BL 165azF) secondary structure. The four BL variants were all active toward the colorimetric substrate nitrocefin prior to modification, although BL 237azF displayed significantly lower activity (Figure S9). This is not unexpected given its location with respect to the active site. Thus, this variant may be deemed nonoptimal. All four variants could be modified with BCN-ssDNA (Figure 1d) with varying degrees of efficiency (∼33% to ∼85%). Attachment of ssDNA had little impact on the activity of BL 165azF and BL 201azF but reduced the activity of both BL 105azF and the already compromised BL 237azF. This highlights the importance of conjugation position in terms of catalytic activity. Detailed enzyme kinetics of BL 165azF and BL 105azF confirmed the influence of modification with ssDNA (Table S5); the catalytic efficiency of BL 165azF appeared to be slightly enhanced on DNA attachment, mainly due to slightly improved substrate binding (lower K_M); BL 105azF was significantly lower in activity, primarily due to changes in catalysis (lower k_cat). Given the close proximity of residue 165 to the catalytic site (and the essential catalytic residue E166), and that the modification had little impact on function, BL 165azF was taken forward for further investigation as a model for enzyme assembly on the DNA origami tiles.
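The two kinetic effects described above (a lower K_M raising catalytic efficiency, a lower k_cat reducing it) follow directly from the Michaelis−Menten rate law. The sketch below illustrates this under stated assumptions: the parameter values are hypothetical placeholders chosen only to show the direction of each effect and are not the measured values in Table S5; the function names are likewise illustrative.

```python
def mm_rate(s, k_cat, k_m, e_total):
    """Michaelis-Menten initial rate: v = k_cat * [E] * [S] / (K_M + [S])."""
    return k_cat * e_total * s / (k_m + s)


def catalytic_efficiency(k_cat, k_m):
    """Catalytic efficiency k_cat / K_M (M^-1 s^-1)."""
    return k_cat / k_m


# HYPOTHETICAL parameters (k_cat in s^-1, K_M in M); not data from the paper.
reference = {"k_cat": 100.0, "k_m": 50e-6}
tighter_binding = {"k_cat": 100.0, "k_m": 40e-6}   # lower K_M only
slower_catalysis = {"k_cat": 30.0, "k_m": 50e-6}   # lower k_cat only

for name, p in [("reference", reference),
                ("lower K_M", tighter_binding),
                ("lower k_cat", slower_catalysis)]:
    eff = catalytic_efficiency(p["k_cat"], p["k_m"])
    v = mm_rate(50e-6, p["k_cat"], p["k_m"], e_total=500e-9)
    print(f"{name}: k_cat/K_M = {eff:.2e} M^-1 s^-1, v at 50 uM = {v:.2e} M/s")
```

Comparing k_cat/K_M rather than a single-point activity separates a change in substrate binding from a change in turnover, which is the distinction drawn between BL 165azF and BL 105azF in the text.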
We initially explored differences in BL enzyme activity when hybridized to the origami tile at different distances from the tile surface; a 3-fold excess of enzyme over origami tile was used (7 nM BL, 2.33 nM origami tile, 250 μM nitrocefin). BL is anchored to the tile through hybridization to an extended staple strand; by changing this staple to extend at either the 3′-end or the 5′-end, BL 165azF can be positioned either closely associated with the origami tile's surface in a relatively rigid, spatially restricted state (i.e., at touching distance, denoted as "down" orientation, Figure 3a) or protruding away from the tile into the solution in a potentially more dynamic "swinging arm" position 7,35,36 (maximum of ∼10 nm distance, denoted as "up" orientation). This approach allows the position of the protein and the volume sampled relative to the tile to be altered without the need to change the protein-ssDNA conjugate, and thus allows enzyme activity and multiprotein communication to be manipulated. The co-assembly yield of protein on the DNA origami tile was estimated to be ∼40% based on AFM imaging analysis of sfGFP 204azF (see Table S4 and Figure S5). While the average assembly yield of 40% is very respectable, it is lower than in previously reported enzyme origami tile assemblies, as we use only one extended staple strand for attachment compared to the use of up to four in other systems. 3 In this case, the assemblies were not purified by gel filtration but used directly for activity measurements. The raw activity (A_raw), based on the initial rate, was therefore determined for a mixture containing all components: BL 165azF -ssDNA co-assembled on DNA tiles (A_assem), unbound enzyme (A_unassem), and free DNA tiles (no activity), because protein assembly on the DNA origami tile is not 100% (vide supra).
Under these conditions, BL 165azF had similar activity in either configuration (Figure 3b), where the apparent enzyme activity was enhanced by ∼1.9-fold compared to the free enzyme-ssDNA conjugate in solution (Figure 3c and Supporting Information). The rate enhancement (fold enhancement of raw activity in Figure 3c) is calculated as the ratio of A_raw to A_unassem; the latter was determined from the BL-ssDNA conjugates in solution. To confirm the nitrocefin hydrolysis rate enhancement on enzyme binding to the tile, the functionally compromised BL 105azF -dsDNA variant was tested; the rate enhancement was even greater (∼4.5-fold) upon hybridization to the tile.

Figure 3. (c) Enhancement in nitrocefin hydrolysis activity (A) of protein assembly on DNA tile with respect to the BL-ssDNA conjugates, calculated as the ratio of initial rates. The shaded columns represent the observed apparent (raw, A_raw) activity of the unpurified system, and the clear columns represent the calibrated (assembled, A_assem) activity, which accounts for the co-assembly yield of enzyme on the DNA origami tile (∼40%) and with it the presence of unassembled (free, A_unassem) enzyme in solution; the red and blue columns represent the BL assembled in the down or up configuration, respectively. Assembled activity was calculated as described in eq 1 and Supporting Information, methods.
To obtain a more detailed insight into the impact of tile assembly, the activities determined from the initial rates were calibrated using eq 1 according to published procedures: 3

$$A_{\mathrm{raw}} = \frac{Y}{3}\,A_{\mathrm{assem}} + \frac{3 - Y}{3}\,A_{\mathrm{unassem}} \qquad (1)$$

Equation 1 was used to adjust the activities to account for the yield of co-assembly of the enzymes. In eq 1, the raw activity (A_raw) consists of contributions from both assembled BL-ssDNA (A_assem) and unassembled enzyme (A_unassem), where Y is the co-assembly yield of the enzymes on the origami tiles (Y = 0.4). Since a 3:1 ratio of enzymes to origami tiles was used for the assembly, the fraction of assembled enzymes was ∼(Y/3), while the fraction of unassembled enzymes was ∼((3 − Y)/3). The resulting calibrated activities are presented in Figure 3c as fold enhancement of assembled activity. The calibrated activities of tile-assembled BL 165azF -ssDNA show that the activity is enhanced 7.3 ± 2.0-fold for the up orientation and 8.2 ± 1.6-fold for the down orientation (Figure 3c). For BL 105azF -ssDNA, the enhancement was even larger, reaching 30.0 ± 8.5-fold for the up orientation and 25.1 ± 8.8-fold for the down orientation (Figure 3c). Thus, the combination of optimal conjugation position with assembly on the tile can lead to greatly improved enzymatic activity, at least in the case of TEM β-lactamase.
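The calibration can be illustrated with a short calculation. This is a minimal sketch, assuming that eq 1 is the weighted sum implied by the definitions in the text, A_raw = (Y/3)·A_assem + ((3 − Y)/3)·A_unassem, solved for A_assem; the function name and the use of the rounded fold values (∼1.9 and ∼4.5) as inputs are illustrative choices, so the outputs only approximate the per-orientation values reported in Figure 3c.

```python
def assembled_activity(a_raw, a_unassem, y, enzymes_per_tile=3.0):
    """Solve A_raw = f*A_assem + (1 - f)*A_unassem for A_assem.

    f = Y / enzymes_per_tile is the fraction of enzyme bound to tiles when
    enzyme is supplied in excess (3:1 in the text) and Y is the co-assembly
    yield per tile.
    """
    f_assem = y / enzymes_per_tile
    f_free = 1.0 - f_assem          # equals (3 - Y) / 3 for a 3:1 ratio
    return (a_raw - f_free * a_unassem) / f_assem


Y = 0.4  # co-assembly yield estimated by AFM, from the text

# Activities expressed relative to the free conjugate (A_unassem = 1),
# using the approximate raw fold enhancements quoted in the text.
for variant, raw_fold in [("BL 165azF", 1.9), ("BL 105azF", 4.5)]:
    fold = assembled_activity(raw_fold, 1.0, Y)
    print(f"{variant}: raw {raw_fold:.1f}-fold -> assembled about {fold:.1f}-fold")
```

With Y = 0.4 only about 13% of the enzyme is tile-bound, so a modest raw enhancement corresponds to a much larger enhancement for the assembled fraction; the same small denominator also amplifies the uncertainty, consistent with the sizeable error ranges reported.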
Multicomponent bottom-up assembly is feasible with BL 165azF fused to different addressing ssDNA sequences and in combination with sfGFP (Figure 4). The activity of assembled systems comprising one or two BL 165azF and sfGFPs on the DNA tile (after purification by gel filtration) is shown in Figure 4b, c. Assembly of two BL 165azF -ssDNA conjugates on the tile gave a ∼5-fold rather than the expected 2-fold increase in initial rate compared to single enzyme assembly (Figure 4c). The fluorescence from the single sfGFP on the tile can be used as an independent estimate to compare protein on the DNA tiles between different samples, while the absorbance at 260 nm can be used to determine bulk DNA concentration. The very similar emission intensities of the sfGFP on the BxG and BGx tiles indicate comparable concentrations of both tiles, thus confirming the observed rate enhancement of two BL over one BL. Attachment of two enzymes on either side of sfGFP has an additional positive effect on BL activity compared to just one enzyme (Figure 4). This is on top of the already observed enhanced activity of one BL on the origami tile compared to BL in solution.
In our case, the assembly of a defined BL-ssDNA moiety on a DNA tile platform generates a microenvironment resulting in improved catalysis. It is not currently known how enzymatic enhancement occurs. This may be through localization of substrate concentration, as seen for planar and positively charged substrates, 37 or through improvement in turnover. 38 Given the structure of nitrocefin (no positive charge and nonplanar), substrate localization is less likely; the important role of water activation and proton transfer in BL catalysis, coupled with the local charged environment of DNA (phosphate backbone), could provide a more likely rationale.

CONCLUSIONS
We have shown that proteins, including enzymes, can be modified with precisely one DNA strand at defined residues, including close to active sites, using an approach that can potentially be applied to any protein. The noncanonical amino acid p-azido-L-phenylalanine (azF) in conjunction with a reprogrammed genetic code can easily be introduced at any position. With the ever-increasing number of commercially available genetically encodable non-natural amino acids, coupled with a wide array of 5′-, 3′-, and internal ssDNA coupling chemistries, our approach has the potential to have a much broader application range in bionanotechnology. Attachment of ssDNA in turn enables addressable bottom-up nanoassembly on base materials such as DNA origami tiles with single-molecule control. This allows direct and simultaneous modulation of the optical response and enzyme activity of DNA origami systems and allows enzyme activity to be studied based on well-defined orientations, tile placement, and component stoichiometries. Moreover, with respect to BL, we observed that binding of the enzyme to the DNA tile can lead to significant rate enhancements, which are even more pronounced in multiple enzyme systems. Our design approach will allow fine-tuning of protein assemblies on DNA origami tiles to precisely study protein activity rather than having to rely on ill-defined systems. This will be of particular importance for biological systems where orientation is crucial, such as in membrane-bound proteins or in multienzyme assemblies.

MATERIALS AND METHODS
Protein Modification and Purification. All sfGFP variants 31 and TEM β-lactamase variants 33 were produced as described previously. Protein (1 equiv) was mixed with modified 5′-BCN DNA (Table S1, 5 equiv) in 100 mM sodium phosphate buffer (pH 8.0, reaction volume 250 μL) for 48 h at room temperature in the dark. The progress of the modification was monitored using denaturing gel mobility shift assays. The mixture was concentrated using an Amicon filter (10 kDa). Purification: sfGFP-DNA conjugates were purified on a Superdex 75 10/300 GL column using 50 mM HEPES and 75 mM NaCl buffer (pH 7.5). BL-DNA conjugates were purified using a GE HiTrap Q FF column: the protein was bound to the column in 50 mM Tris (pH 7.4) and eluted using a gradient of 50 mM Tris and 1 M NaCl (pH 7.4).
DNA Origami Preparation. DNA origami tiles were obtained as described previously using modified pKD1. 34 Single-stranded pKD1 (1 equiv) was mixed with a 10-fold molar excess of staple strands (see Table S2) and a 10-fold molar excess of extended staple strands (see Table S3) in 1× TA-Mg2+ buffer. The reaction mixture was annealed from 95 to 4°C over a gradient of −1°C per minute on a T100 Thermal Cycler. The excess staple strands were removed using Sephacryl S-300 HR micro biospin chromatography columns. Formation of the DNA origami tile was analyzed by 0.8% (w/v) agarose gel electrophoresis (Figure S1). The concentration was determined by UV absorbance measurement (260 nm) using ε = 7871756 L/(mol·cm). Origami assembly buffer: 1× TA-Mg2+ buffer (40 mM Tris base, 20 mM acetic acid, 12.5 mM magnesium acetate, pH 8.3).
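The concentration determination above is a direct application of the Beer−Lambert law, c = A/(ε·l). The sketch below assumes a 1 cm path-length-equivalent absorbance reading; the absorbance value used is a hypothetical example chosen to land near the 2.33 nM working concentration, not a measurement from this work, and the function name is illustrative.

```python
EPSILON_260 = 7_871_756  # L/(mol*cm), extinction coefficient given in the text


def origami_concentration_nM(a260, path_cm=1.0, epsilon=EPSILON_260):
    """Beer-Lambert law: c = A / (epsilon * l), returned in nM."""
    molar = a260 / (epsilon * path_cm)
    return molar * 1e9


# HYPOTHETICAL absorbance reading, for illustration only.
example_a260 = 0.0183
print(f"A260 = {example_a260} -> {origami_concentration_nM(example_a260):.2f} nM")
```

At nanomolar tile concentrations the expected A260 is of the order of 0.02, close to the practical detection limit of absorbance measurements, which is consistent with the use of sfGFP fluorescence as a more sensitive handle for the tiles.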
Protein-ssDNA Co-Assembly on DNA Origami Tile. DNA origami tile in 1× TA-Mg2+ buffer (2.33 nM) was mixed with sfGFP and/or BL conjugated with addressing ssDNA (see Table S1) at a tile-to-protein molar ratio of 1:3. The solution was annealed from 37 to 4°C over a gradient of −1°C per minute on a T100 Thermal Cycler. Samples were stored at 4°C. Where purification was required, the assemblies were filtered through Amicon Ultra-0.5 30K spin filters. Concentrations were determined using a NanoDrop spectrometer.
BL Activity Assay. The activity of BL variants after SPAAC modification with 5′-click-easy BCN CEP II DNA was analyzed using the nitrocefin (Becton Dickinson) hydrolysis assay. DNA-modified BL was diluted to a final concentration of 250 ng/μL in PBS (100 mM sodium phosphate, 300 mM NaCl, pH 8). Nitrocefin is a colorimetric BL substrate (molar absorbance coefficient at 485 nm of 14060 M^−1 cm^−1). Assays were measured in triplicate on a Varian Cary 300 Bio UV−vis spectrophotometer (Agilent Technologies) with a Hellma synthetic quartz glass cuvette. Reactions were initiated by addition of nitrocefin to a final concentration of between 50 and 100 μM and monitored by the increase in absorbance at 485 nm. For the detailed BL kinetics, DNA-modified protein was purified from the unmodified form by ion exchange chromatography using a GE HiTrap Q FF column (see above). Protein was bound to the column in 50 mM Tris (pH 7.4) and eluted using a gradient of 50 mM Tris and 1 M NaCl (pH 7.4). Protein concentration was standardized to 500 nM, and kinetics were assessed using nitrocefin as substrate over a concentration range of 10−100 μM. For analysis of BL assembly on DNA origami tiles, the assembly solutions were used directly (see above), and the reaction was initiated with 250 μM nitrocefin.
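Initial rates in this type of assay are obtained from the slope of the early, linear part of the absorbance trace, converted to concentration units with the nitrocefin absorbance coefficient. The following is a minimal sketch of that conversion, assuming a 1 cm path length; the time course is synthetic data generated for illustration only, and the function names are illustrative rather than taken from any analysis software used in this work.

```python
EPSILON_485 = 14060.0  # M^-1 cm^-1, hydrolyzed nitrocefin, from the text


def linear_slope(xs, ys):
    """Ordinary least-squares slope of y against x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def initial_rate_uM_per_s(times_s, absorbances, path_cm=1.0):
    """Convert dA485/dt into a product formation rate in uM/s."""
    slope = linear_slope(times_s, absorbances)  # absorbance units per second
    return slope / (EPSILON_485 * path_cm) * 1e6


# SYNTHETIC trace (not measured data): A485 rising by 0.002 per second.
times = [0, 10, 20, 30, 40]
trace = [0.100 + 0.002 * t for t in times]
rate = initial_rate_uM_per_s(times, trace)
print(f"initial rate of about {rate:.3f} uM/s")
```

The fold enhancements discussed in the Results are ratios of such initial rates, so the absorbance coefficient and path length cancel as long as both samples are measured under the same conditions.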

Supporting Information
The Supporting Information is available free of charge on the ACS Publications website at DOI: 10.1021/acsnano.7b01711.
Details of conjugation methodology, purification and analysis of the protein-DNA conjugates (gel electrophoresis, fluorescence spectroscopy, kinetic analysis); DNA and origami sequences and design, and origami formation methodology; AFM imaging of DNA origami tiles and protein-origami conjugates, including statistical analysis; spectroscopic and kinetic analysis of the protein-origami conjugates (PDF)