Double Nitroxide Labeling by Copper-Catalyzed Azide–Alkyne Cycloadditions with Noncanonical Amino Acids for Electron Paramagnetic Resonance Spectroscopy

Electron paramagnetic resonance spectroscopy in combination with site-directed spin labeling (SDSL) is an important tool to obtain long-range distance restraints for protein structural research. Here, we study a variety of azide- and alkyne-bearing noncanonical amino acids (ncAA) in terms of protein single- and double-incorporation efficiency via nonsense suppression, metabolic stability, yields of nitroxide labeling via copper-catalyzed [3 + 2] azide–alkyne cycloadditions (CuAAC), and spectroscopic properties in continuous-wave and double electron–electron resonance measurements. We identify para-ethynyl-L-phenylalanine and para-propargyloxy-L-phenylalanine as suitable ncAA for CuAAC-based SDSL that will complement current SDSL approaches, particularly in cases in which essential cysteines of a target protein prevent the use of sulfhydryl-reactive spin labels.

Electron paramagnetic resonance (EPR) spectroscopy techniques are powerful tools for studying the structure, dynamics, and interactions of proteins. 1,2 Pulsed techniques such as double electron−electron resonance (DEER) provide access to long-range distance distributions from 1.8 to 16 nm by measuring the dipole−dipole interactions between paramagnetic centers. 3,4 EPR spectroscopy on diamagnetic proteins requires the introduction of paramagnetic spin labels, such as transition-metal ions 5,6 or nitroxide radicals, which are attached to a given protein via site-directed spin labeling (SDSL) techniques. 7,8 The most popular labeling strategy uses site-directed mutagenesis to remove the naturally occurring cysteines of a protein and to introduce cysteine residues at the desired labeling positions. Spin labels are then attached with sulfhydryl-reactive reagents such as the methanethiosulfonate spin label (MTSSL) or by 1,4-additions with maleimide labels. 9 Although a vast range of proteins has been studied with this labeling approach, it does not allow studies of proteins with essential cysteines in their natural, functional state. Moreover, the approach lacks bioorthogonality in complex biological systems with abundant sulfhydryl groups, which complicates its application to EPR studies of proteins under physiologically relevant conditions. 7,8 More recently, alternative labeling strategies have been reported that make use of amber stop-codon suppression with noncanonical amino acids (ncAA). 10,11 In this approach, an orthogonal pair of an aminoacyl-tRNA-synthetase (aaRS) and an amber suppressor tRNA is co-expressed with the target protein bearing an in-frame amber codon, enabling the co-translational incorporation of the ncAA at user-defined positions directly in cells. 8,12 Here, ncAA can be incorporated that either already contain a paramagnetic center 13,14 or that contain a chemical handle for post-translational bioorthogonal conjugation reactions with a spin label, for example, by azide−alkyne cycloadditions 15−17 or by oxime formation. 18 Labeling by copper(I)-catalyzed azide−alkyne cycloadditions (CuAAC), 19 in particular, has been shown to proceed with high labeling efficiency, fast reaction kinetics, and simplicity in the context of nucleotides, lipids, sugars, and proteins. 20−23 CuAAC is orthogonal in the context of proteins and other cellular components, enabling applications in bacteria and mammalian cell lines. 24 With regard to protein SDSL, CuAAC has been applied to both alkyne- and azide-bearing ncAA, the latter generally being somewhat susceptible to intracellular reduction, which lowers labeling yields. 25 Paramagnetic lanthanide tags have been conjugated to azide-bearing ncAA for distance measurements and allowed the observation of pseudocontact shifts in several proteins. 26,27 Moreover, initial studies have reported CuAAC-based SDSL with nitroxide labels and azide- or alkyne-bearing ncAA. Although standard 2,2,5,5-tetramethyl-based nitroxide moieties are more sensitive to the reductive conditions of CuAAC than lanthanide tags, they offer unique advantages, such as a small, non-ionic scaffold with low perturbative potential, as well as narrow EPR spectra that are readily accessible for pulsed EPR techniques. However, only a single DEER experiment with an ncAA-bearing, CuAAC-labeled protein has been reported to date. 17 Steinhoff and co-workers expressed GFP carrying the flexible, lysine-based alkyne ncAA N-ε-propargyloxycarbonyl-L-lysine, followed by cellular CuAAC labeling. After purification and in vitro MTSSL labeling of the two endogenous cysteines of GFP, combined distance distributions between the three centers were recorded.
This interesting experiment demonstrates the applicability of CuAAC for SDSL and DEER distance measurements in complex biological systems. However, leveraging the full potential of CuAAC requires independence from cysteine labeling, achieved through the double incorporation of metabolically stable ncAA that ideally yield spin labels with low flexibility for optimal spectroscopic properties.
Here, we report a systematic evaluation of CuAAC-based SDSL with a series of phenylalanine-based nitroxide ncAA bearing alkyne or azide functionalities and varying flexibilities. We evaluate incorporation efficiencies of the ncAA with a generally applicable, polyspecific aaRS/tRNA pair, using the Escherichia coli oxidoreductase thioredoxin (TRX) as target protein, which contains two essential cysteines as part of its catalytic center. We characterize the spectroscopic properties of the resulting spin labels by both continuous-wave (cw) and DEER experiments and present a rotamer library, enabling simulations of the influence of linker flexibilities in DEER distance measurements.
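The distance range accessible to DEER follows from the inverse-cube dependence of the dipolar coupling on the spin−spin distance. The short Python sketch below illustrates this relation for two nitroxide-like spins. It assumes the commonly used coupling constant of about 52.04 MHz nm³ for two g ≈ 2.0023 electrons at the perpendicular orientation; the function names are illustrative and are not taken from any analysis software used in this work.

```python
# Dipolar coupling constant for two g ~ 2.0023 electron spins,
# in MHz * nm^3, at the perpendicular orientation (theta = 90 degrees).
# This value is an assumption of the sketch, not a number from the text.
DIPOLAR_CONSTANT_MHZ_NM3 = 52.04


def dipolar_frequency_mhz(distance_nm: float) -> float:
    """Return the perpendicular dipolar frequency (MHz) for a distance in nm."""
    if distance_nm <= 0:
        raise ValueError("distance must be positive")
    return DIPOLAR_CONSTANT_MHZ_NM3 / distance_nm ** 3


def dipolar_period_us(distance_nm: float) -> float:
    """Return the period of one dipolar oscillation in microseconds."""
    return 1.0 / dipolar_frequency_mhz(distance_nm)


def distance_from_frequency_nm(frequency_mhz: float) -> float:
    """Invert the relation: distance (nm) from a dipolar frequency (MHz)."""
    if frequency_mhz <= 0:
        raise ValueError("frequency must be positive")
    return (DIPOLAR_CONSTANT_MHZ_NM3 / frequency_mhz) ** (1.0 / 3.0)


if __name__ == "__main__":
    # Limits of the quoted DEER range plus one intermediate distance.
    for r in (1.8, 2.8, 16.0):
        print(f"r = {r:5.1f} nm  nu = {dipolar_frequency_mhz(r):8.4f} MHz  "
              f"period = {dipolar_period_us(r):8.3f} us")
```

Under this assumption, the coupling drops from roughly 8.9 MHz at 1.8 nm to about 13 kHz at 16 nm, so a single oscillation at the long end of the range lasts close to 80 µs. This is why long phase-memory times are needed to resolve long distances.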
ACS Chemical Biology Letters
Results and Discussion. To identify an optimal ncAA for CuAAC-based SDSL, we evaluated the four ncAA para-azido-L-phenylalanine (pAzF), 28 para-ethynyl-L-phenylalanine (pENF), 29 para-propargyloxy-L-phenylalanine (pPrF), 30 and para-O-pentynyl-L-tyrosine (p2yneY) 31 (Figures 1A and S1). To allow for the flexible incorporation of all ncAA with a single aaRS/tRNA pair, we tested a mutant Methanocaldococcus jannaschii tyrosyl aaRS/tRNA pair previously evolved for the ncAA para-cyanophenylalanine (pCNF-RS). 32 Similar to certain mutant pyrrolysyl aaRS, 33 this pair exhibits polyspecificity toward a range of phenylalanine-based ncAA. 33 We initially expressed TRX proteins bearing a single ncAA at the solvent-exposed helix position D14 34 under co-expression of the aaRS/tRNA pair from the pEVOL plasmid backbone 35 and purified the proteins by Ni-NTA affinity chromatography via a C-terminal His6-tag (Figures 1B and S1; note that in the absence of any ncAA, pCNF-RS also accepts canonical amino acids as substrate). 32,36 For pAzF, pENF, and pPrF, we obtained yields (24−32 mg L⁻¹) similar to that of wild-type TRX, whereas p2yneY led to a low yield, as expected for the larger tether in the para position. 31 Importantly, the use of pCNF-RS also enabled highly efficient and general double incorporation of pAzF, pENF, and pPrF at combinations of further solvent-accessible sites of TRX (D14, G34, Q51, and R74; Figure 1B shows results for D14/G34, with yields of 7−15 mg L⁻¹). Some of these positions have been studied in the context of other spin labels, providing an ideal reference for a comparison of spectroscopic properties. 13,14
In common CuAAC labeling approaches, 37 the catalytically active copper(I) species is generated in situ by mixing copper(II) sulfate, a stabilizing copper ligand, and ascorbate as reducing agent. By adding a large excess of reducing agent, quantitative labeling can be achieved with very low concentrations of copper. In the case of CuAAC-based spin labeling, however, spin-label integrity has to be taken into consideration because the nitroxide moiety is sensitive to reduction. Negative effects on nitroxide stability are minimized by adding copper(II) sulfate and ascorbate in equimolar amounts and limiting the reaction time to 1 h (Figure S2). Shorter reaction times resulted in reduced labeling yields in the case of TRX. However, labeling conditions may need to be tuned to the protein of interest and the accessibility of the labeling site. Effective and selective labeling was confirmed by EPR spectroscopy and mass spectrometry (Figures S3−S6). Labeling efficiencies were calculated as the ratio of the spin concentration to the protein concentration (Figure S7).
Because different labeling sites resulted in slightly different labeling yields, an overall labeling efficiency for each ncAA was determined as the mean value of the three different doubly labeled TRX mutants, each from three independent experiments. The degree of double labeling was further cross-checked by estimating the number of spins per macromolecule from DEER measurements. Because double incorporation was not achieved in the case of p2yneY, the labeling efficiency was instead calculated from the available singly labeled TRX mutants. Under the applied conditions, wild-type TRX remained unlabeled (Figure S8), while near-quantitative labeling yields (98%) were obtained for pPrF. For pENF, we observed a labeling yield of 57%, presumably because of the slightly decreased accessibility that results from the short linker length.
Only moderate labeling yields were obtained for pAzF and p2yneY (21% and 16%, respectively). The azide functionality of pAzF is prone to partial reduction during expression, 25 and the MS spectra of purified pAzF-containing TRX revealed additional peaks that corresponded to the expected mass of the reduced amino acid, which accounts for the decreased labeling efficiency. No significant differences were observed among the circular dichroism spectra of unlabeled TRX variants, labeled TRX variants, and the wild type, indicating that neither the ncAA nor the attached label had an impact on the secondary structure of TRX (Figure S9).
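The bookkeeping behind the reported labeling efficiencies can be sketched in a few lines of Python. The example below is a minimal illustration and not the evaluation script used for this work: the normalization by the number of labeling sites per protein is an assumption of the sketch, all function names are hypothetical, and the concentrations are placeholder values chosen only to show the arithmetic.

```python
from statistics import mean
from typing import Mapping, Sequence


def labeling_efficiency(spin_conc_um: float,
                        protein_conc_um: float,
                        sites_per_protein: int = 1) -> float:
    """Ratio of spin concentration to the concentration of labeling sites.

    Dividing by sites_per_protein (2 for a doubly labeled mutant) is an
    assumption of this sketch, made so that 1.0 means every site carries
    an intact nitroxide.
    """
    if protein_conc_um <= 0 or sites_per_protein < 1:
        raise ValueError("need a positive protein concentration and >= 1 site")
    return spin_conc_um / (protein_conc_um * sites_per_protein)


def overall_efficiency(replicates_by_mutant: Mapping[str, Sequence[float]]) -> float:
    """Mean over mutants of the per-mutant mean over independent experiments."""
    per_mutant = [mean(values) for values in replicates_by_mutant.values()]
    return mean(per_mutant)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, not measured data.
    example = {
        "D14/G34": [labeling_efficiency(92.0, 50.0, 2),
                    labeling_efficiency(95.0, 50.0, 2),
                    labeling_efficiency(97.0, 50.0, 2)],
        "G34/Q51": [0.90, 0.93, 0.96],
        "Q51/R74": [0.88, 0.91, 0.94],
    }
    print(f"overall labeling efficiency: {overall_efficiency(example):.2f}")
```

Averaging within each mutant first and across mutants second gives every labeling-site combination the same weight, even if the number of replicates differs between mutants.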
The EPR spectra of labeled TRX for all ncAA are shown in Figure 2. Spectral simulations were performed using EasySpin 38 under the assumption of isotropic diffusion (Figures S10 and S11). While this simplified motion model resulted in a slight deviation between simulation and experimental data, it enabled a direct comparison of the label mobility for the different ncAA. The simulations revealed rotational correlation times of 0.32 ns for p2yneY-L, 0.48 ns for pPrF-L, and 0.66 ns for both pAzF-L and pENF-L. These rotational correlation times agree with the trend expected from the linker flexibility: the highest mobility is found for p2yneY-L, which contains an extended carbon chain in the linker region between the phenyl ring and the alkyne functionality. Reducing the linker length results in a slightly slower rotation, as observed for D14pPrF-L TRX. The linker of the remaining two labels is expected to be even less flexible because of the preferred planar orientation of the phenyl and triazole rings formed during the cycloaddition.
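For isotropic Brownian rotational diffusion, the rotational correlation time τc and the rotational diffusion rate D are related by τc = 1/(6D). The Python sketch below applies this standard relation to the correlation times reported above and ranks the labels by mobility. It is a hedged illustration with hypothetical function names and does not reproduce the spectral simulations themselves.

```python
import math

# Rotational correlation times reported in the text, in nanoseconds.
TAU_C_NS = {
    "p2yneY-L": 0.32,
    "pPrF-L": 0.48,
    "pAzF-L": 0.66,
    "pENF-L": 0.66,
}


def rotational_diffusion_rate(tau_c_s: float) -> float:
    """Isotropic rotational diffusion rate D (s^-1) from tau_c (s): D = 1/(6 tau_c)."""
    if tau_c_s <= 0:
        raise ValueError("correlation time must be positive")
    return 1.0 / (6.0 * tau_c_s)


def correlation_time(diffusion_rate_per_s: float) -> float:
    """Inverse relation: tau_c (s) from D (s^-1)."""
    if diffusion_rate_per_s <= 0:
        raise ValueError("diffusion rate must be positive")
    return 1.0 / (6.0 * diffusion_rate_per_s)


def rank_by_mobility(tau_c_ns: dict) -> list:
    """Label names sorted from most mobile (shortest tau_c) to least mobile."""
    return sorted(tau_c_ns, key=lambda name: tau_c_ns[name])


if __name__ == "__main__":
    for name in rank_by_mobility(TAU_C_NS):
        rate = rotational_diffusion_rate(TAU_C_NS[name] * 1e-9)
        print(f"{name:9s} tau_c = {TAU_C_NS[name]:.2f} ns  "
              f"D = {rate:.3e} s^-1  log10(D) = {math.log10(rate):.2f}")
```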
For further characterization of CuAAC-based spin labeling, we investigated the suitability of these labels for EPR distance measurements. Spin−spin relaxation is an important parameter that limits the range of addressable distances and the sensitivity. We find that phase-memory times increase significantly when copper(II) ions bound to the protein are removed by EDTA treatment (Figure S12). DEER distance measurements were performed with doubly labeled TRX bearing pAzF-L, pENF-L, or pPrF-L at positions D14/G34, G34/Q51, and Q51/R74 (expected Cα−Cα distances of 2.8, 2.64, and 2.92 nm, respectively, as derived from the TRX crystal structure). 34 The ncAA p2yneY was excluded because of the low expression yields for double incorporation. We were able to obtain dipolar traces and corresponding distance distributions in all cases (Figures 3 and S13−S16). We expect the width of the obtained distance distributions to be determined mainly by the linker flexibility of each label because TRX behaves as a highly rigid protein in molecular dynamics simulations. 14 The observed widths follow the trend of increasing flexibility with increasing linker length: pAzF-L generally produces the narrowest distance distributions, closely followed by pENF-L, whereas pPrF-L exhibits broader distributions. For comparison with the "gold standard" MTSSL, we performed standard cysteine labeling with TRX bearing serine mutations at the catalytic cysteine residues. 14 The distance distributions obtained with MTSSL in the absence of the endogenous cysteines can be compared directly with those obtained with the ncAA in their presence because the nuclear magnetic resonance structures of oxidized and reduced TRX are very similar, 39 suggesting that the conformation is not stabilized by the disulfide bond in the reactive center.
MTSSL labeling resulted in distance distributions of similar width to those of pAzF-L and pENF-L, with the exception of the labeling position D14/G34, where a very narrow distribution with two maxima was observed that differed from that of pAzF-L (Figure 3D). The smallest mean distances were observed for MTSSL labeling, while CuAAC labeling generally resulted in larger values that differed between linkage types. Taken together, these findings clearly illustrate the similar flexibility of pENF and pAzF compared to MTSSL but also the impact of the label linkage on the obtained distance distribution and, thus, the need for precise modeling and simulations that predict the linker flexibility in individual measurements. We thus simulated the contribution of linker flexibility using a rotamer library approach for pENF-L. 40 We generated a conformer ensemble by randomly varying the four relevant torsion angles in the linker of pENF-L and selecting suitable angles based on their energy in a universal force field. 41,42 The ensemble was clustered, resulting in a population-weighted, averaged set of dihedral angles representing the ensemble (Figure S17). The obtained rotamers were attached to the crystal structure of TRX, and simulated distance distributions between labeling sites were compared to their experimental counterparts (Figure 4). The adopted orientations appear to be influenced mainly by the dihedral angles χ1 and χ2, as rotation around χ1 has the largest leverage because of the planar arrangement of the phenyl and triazole moieties. Variation of the remaining dihedral angles shifts the nitroxide moiety only within a relatively small range. Overall, the experimental results are in good agreement with the simulated distance distributions, and thus, the rotamer library for pENF-L will be a valuable tool in the choice of labeling sites when transferring the CuAAC approach to new proteins and biological questions.
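The general workflow of a rotamer library (random torsion sampling, energy-based weighting, clustering, and population-weighted averaging) can be sketched in Python as follows. This is a toy illustration under loudly stated assumptions and not the procedure used for pENF-L: a hypothetical threefold cosine potential replaces the universal force field, conformers are clustered simply by the nearest staggered well of each torsion, and all names and parameters are invented for the example.

```python
import math
import random

GAS_CONSTANT_KJ = 8.314462618e-3   # kJ mol^-1 K^-1
WELLS_DEG = (-60.0, 60.0, 180.0)   # minima of the toy potential below


def toy_torsion_energy(angles_deg, barrier_kj=8.0):
    """Hypothetical threefold cosine potential (kJ/mol); NOT a real force field."""
    return sum(0.5 * barrier_kj * (1.0 + math.cos(math.radians(3.0 * a)))
               for a in angles_deg)


def sample_conformers(n, n_torsions=4, seed=0):
    """Draw n conformers with uniformly random torsion angles in degrees."""
    rng = random.Random(seed)
    return [tuple(rng.uniform(-180.0, 180.0) for _ in range(n_torsions))
            for _ in range(n)]


def boltzmann_weights(energies_kj, temperature_k=298.15):
    """Normalized Boltzmann populations for a list of energies in kJ/mol."""
    kt = GAS_CONSTANT_KJ * temperature_k
    e_min = min(energies_kj)
    raw = [math.exp(-(e - e_min) / kt) for e in energies_kj]
    total = sum(raw)
    return [w / total for w in raw]


def nearest_well(angle_deg):
    """Assign a torsion angle to the closest well, respecting periodicity."""
    def circular_distance(a, b):
        return abs((a - b + 180.0) % 360.0 - 180.0)
    return min(WELLS_DEG, key=lambda w: circular_distance(angle_deg, w))


def weighted_circular_mean_deg(angles_deg, weights):
    """Weighted mean of periodic angles, computed on the unit circle."""
    s = sum(w * math.sin(math.radians(a)) for a, w in zip(angles_deg, weights))
    c = sum(w * math.cos(math.radians(a)) for a, w in zip(angles_deg, weights))
    return math.degrees(math.atan2(s, c))


def build_rotamer_library(conformers, weights):
    """Cluster by well pattern; return {pattern: (population, mean angles)}."""
    clusters = {}
    for angles, w in zip(conformers, weights):
        key = tuple(nearest_well(a) for a in angles)
        clusters.setdefault(key, []).append((angles, w))
    library = {}
    for key, members in clusters.items():
        member_weights = [w for _, w in members]
        mean_angles = tuple(
            weighted_circular_mean_deg([ang[i] for ang, _ in members],
                                       member_weights)
            for i in range(len(key)))
        library[key] = (sum(member_weights), mean_angles)
    return library


if __name__ == "__main__":
    conformers = sample_conformers(5000)
    weights = boltzmann_weights([toy_torsion_energy(c) for c in conformers])
    library = build_rotamer_library(conformers, weights)
    top = sorted(library.items(), key=lambda kv: -kv[1][0])[:3]
    for pattern, (population, angles) in top:
        print(pattern, f"{population:.3f}", [round(a, 1) for a in angles])
```

In a realistic setting, the toy potential would be replaced by a proper force-field energy of the full label, and each rotamer would be attached to the protein structure to compute inter-label distances. Those steps are beyond the scope of this sketch.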
In conclusion, we demonstrated CuAAC-based double spin labeling with genetically encoded ncAA and nitroxides, independent of cysteine labeling. We evaluated a set of four phenylalanine-based ncAA in terms of incorporation efficiencies with a polyspecific aaRS/tRNA pair, metabolic stability, labeling yields, and spectroscopic properties, using TRX as an exemplary protein. We identified two ncAA, pPrF and pENF, as suitable choices for CuAAC-based SDSL. While near-quantitative labeling yields were obtained for pPrF-L at the investigated labeling positions, pENF-L showed favorable spectroscopic properties in DEER measurements, comparable to those of the standard label MTSSL. Both ncAA will be useful tools that extend the range of existing labeling strategies, which is important given that spin labels need to be chosen carefully with regard to the particular application, the protein under investigation, and possible restrictions of the labeling site. Our CuAAC-based SDSL approach is especially suited to cases in which essential cysteines of a target protein prevent MTSSL labeling, and, given the compatibility of CuAAC with intracellular labeling, we expect that pENF and pPrF may facilitate in vivo EPR studies and thus enable new insights into protein structure and function under physiological conditions.