Bioinspired Strategy for the Ribosomal Synthesis of Thioether-Bridged Macrocyclic Peptides in Bacteria

Inspired by the biosynthetic logic of lanthipeptide natural products, a new methodology was developed to direct the ribosomal synthesis of macrocyclic peptides constrained by an intramolecular thioether bond. As a first step, a robust and versatile strategy was implemented to enable the cyclization of ribosomally derived peptide sequences via a chemoselective reaction between a genetically encoded cysteine and a cysteine-reactive unnatural amino acid (O-(2-bromoethyl)-tyrosine). Combination of this approach with intein-catalyzed protein splicing furnished an efficient route to achieve the spontaneous, post-translational formation of structurally diverse macrocyclic peptides in bacterial cells. The present peptide cyclization strategy was also found to be amenable to integration with split intein-mediated circular ligation, resulting in the intracellular synthesis of conformationally constrained peptides featuring a bicyclic architecture.

Methods for generating macrocyclic peptides, and combinatorial libraries thereof, have attracted considerable interest owing to the distinctive conformational and molecular recognition properties of this structural class and their promise toward addressing challenging drug targets. 1,2 To this end, one approach has involved the generation and manipulation of natural cyclopeptide scaffolds via the reconstruction and engineering of their respective biosynthetic pathways. 3−7 Following an alternative approach, other groups, including ours, have focused on implementing strategies to enable the macrocyclization of ribosomally derived polypeptides of arbitrary sequence. 8,9 Particularly attractive features of the latter are their high combinatorial potential and the possibility of interfacing the resulting libraries of constrained peptides with powerful display platforms. Despite significant contributions in this area, 10−18 these approaches have so far remained largely limited to the production of macrocyclic peptides in vitro. 8,9 A notable exception is the split intein-mediated circular ligation method (SICLOPPS) introduced by Benkovic and co-workers, 19 in which head-to-tail cyclic peptides are generated via circularization of an internal peptide sequence upon a trans splicing reaction involving flanking domains from the natural split intein DnaE (Supplementary Figure S1). By enabling the synthesis of cyclic peptides in living cells, this approach has provided a powerful tool for the discovery of cyclopeptide inhibitors against a variety of protein targets upon integration with an intracellular selection or reporter system. 20−22 A limitation of this method, however, is that it gives access to a single type of cyclic peptide topology (i.e., N-to-C-end cyclic peptide). Furthermore, cyclization efficiency is largely affected by the composition of the target peptide sequence. 22−24 Thus, more general and versatile methods to direct the ribosomal synthesis of macrocyclic peptides would be highly desirable in order to expand current capabilities toward exploring and exploiting peptide macrocycles for drug discovery and chemical biology applications.
To this end, we developed and report here an efficient strategy for the production of structurally diverse thioether-linked macrocyclic peptides in living bacterial cells (E. coli). Inspiration for this approach was drawn from the 'logic' underlying the biosynthesis of lanthipeptides, a class of ribosomally derived polycyclic peptides constrained by intramolecular thioether bonds. 25,26 As schematically illustrated in Figure 1A, these natural products are initially produced as linear precursor polypeptides via ribosomal synthesis. Recognition of the N-terminal leader sequence by dehydratase enzymes then mediates the conversion of Ser and Thr residues located within the core peptide into dehydroalanine (Dha) and dehydrobutyrine (Dhb), respectively. The signature thioether cross-links are subsequently formed via enzyme-assisted Michael addition of cysteine sulfhydryl groups onto these electrophilic α,β-unsaturated amino acid residues. Finally, removal of the leader peptide by a downstream protease releases the mature lanthipeptide. 25,26 Inspired by this reaction sequence, we sought to develop a strategy for the ribosomal generation of conformationally constrained peptides through the combination of peptide cyclization via an inter-side-chain thioether linkage with proteolytic release of the resulting macrocyclic peptide. Conceivably, achieving this goal in an enzyme-independent manner would involve the challenge of building the structural elements necessary for promoting these transformations directly into the genetically encoded, ribosomally produced precursor polypeptide. As outlined in Figure 1A, we reasoned this task could be achieved via (a) a chemoselective reaction between a cysteine and a ribosomally incorporated unnatural amino acid bearing a cysteine-reactive side-chain group and (b) spontaneous release of the macrocyclic peptide by means of an intein-based protein splicing element.
To implement this idea, we envisioned that the target unnatural amino acid (UAA) ought to satisfy several criteria. First, its side-chain electrophilic group should be sufficiently reactive to enable efficient peptide cyclization upon reaction with a proximal cysteine in intramolecular settings, but not so reactive as to undergo side reactions with competing nucleophiles present in the cellular environment (e.g., glutathione). In addition, it should be amenable to protein incorporation via an aminoacyl-tRNA synthetase (AARS)/tRNA pair with orthogonal reactivity. 27 With these considerations in mind, we decided that O-(2-bromoethyl)-tyrosine (O2beY, Figure 1B) could meet the aforementioned requirements, as suggested by its sluggish reactivity in the cyclization of cysteine-containing peptides in vitro 15 and its structural similarity to O-propargyl-tyrosine (OpgY), for which an amber suppressor AARS/tRNA pair was made available. 28 Accordingly, the envisioned approach would entail cyclization of a precursor polypeptide via a thioether bond-forming reaction between O2beY and a proximal cysteine, followed by intein-mediated release of the macrocyclic peptide (Figure 1B).
To evaluate the feasibility of this design, a first series of constructs was utilized (entries 1−9, Figure 2A), which encode 12- to 16-amino-acid-long target peptide sequences and in which the O2beY and Cys residues are spaced from each other by an increasing number of intervening residues (i.e., from Z+1 to Z+12). In the respective genes, an amber stop codon (TAG) was introduced after the initial Met-Gly to allow for the site-specific incorporation of O2beY into the target sequence. The latter was then genetically fused to an engineered variant of Mxe GyrA intein, whose C-terminal Asn198 was mutated to Ala to abolish its self-splicing activity while preserving its ability to form a thioester bond at the N-terminal end via N → S acyl transfer (dotted box, Figure 1B).
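The Z+n notation used for these constructs can be made concrete with a short helper. The sketch below is illustrative only: it assumes a one-letter peptide string in which `Z` stands for O2beY, and the example sequence is hypothetical rather than one of the entries in Figure 2A.

```python
def cys_offsets(peptide: str, uaa: str = "Z") -> list[int]:
    """Return the signed offset of every Cys relative to the single UAA position.

    Assumption: `uaa` marks the O2beY site (the residue encoded by the amber
    codon) and occurs exactly once in the peptide string.
    """
    if peptide.count(uaa) != 1:
        raise ValueError("expected exactly one UAA position in the peptide")
    z_index = peptide.index(uaa)
    return [i - z_index for i, residue in enumerate(peptide) if residue == "C"]


def position_label(offset: int) -> str:
    """Format an offset in the Z+n / Z-n style used in the text."""
    return f"Z{offset:+d}"


# Hypothetical target sequence: Met-Gly, then O2beY (Z), with Cys six residues downstream.
example = "MGZSKLAECGT"
labels = [position_label(o) for o in cys_offsets(example)]
```

For the hypothetical sequence above, `labels` holds a single entry, `Z+6`; a Cys placed upstream of the UAA would yield a negative offset such as `Z-8`.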
To identify a viable AARS/tRNA pair for O2beY incorporation into these constructs, we initially tested the Methanocaldococcus jannaschii tyrosyl-tRNA synthetase variant previously evolved for recognition of the structurally related OpgY (OpgY-RS). 28 This choice was motivated by the 'polyspecificity' often exhibited by orthogonal AARSs toward related UAA structures. 29,30 Gratifyingly, using OpgY-RS, O2beY could be successfully incorporated into a model protein consisting of Yellow Fluorescent Protein (YFP) with an N-terminal amber stop codon (Figure 3B). To improve its efficiency toward O2beY incorporation, OpgY-RS was subjected to further mutagenesis. To this end, a homology model of this enzyme was first generated on the basis of the available crystal structure of MjTyr-RS in complex with its native substrate, tyrosine. 31 Then, the unnatural amino acid O2beY was docked into the enzyme active site (Figure 3A). Inspection of the model suggested that an Ala32Gly mutation would expand the active site cavity to better accommodate the 2-bromoethoxy group in O2beY. Rewardingly, the resulting AARS variant (called O2beY-RS) was found to enable the ribosomal incorporation of O2beY with significantly higher efficiency compared to OpgY-RS, while maintaining discriminating selectivity against the natural amino acids (Figure 3B). Comparison of the expression yields of YFP(O2beY) versus wild-type YFP indicated that the efficiency of amber stop codon suppression with O2beY-RS was excellent (85%).
With a suitable AARS/tRNA pair for O2beY incorporation in hand, the constructs corresponding to entries 1−9 (Figure 2A) were produced in E. coli BL21(DE3) cells using a dual plasmid system (see Supporting Information for details). To better examine both the occurrence and efficiency of the thioether bond-forming reaction according to the strategy of Figure 1B, the aforementioned constructs were designed to contain a Thr residue at the position preceding the intein ('I−1 site'). This substitution minimizes premature hydrolysis of GyrA-fusion proteins during expression, 14,32 thereby facilitating analysis of the target peptide sequences after chemically induced splicing of the intein from the purified proteins in vitro (Figure 1B, path A). This procedure would also permit the isolation of any product resulting from the unselective reaction of O2beY with other nucleophiles in vivo. Accordingly, after purification, the proteins were treated with benzyl mercaptan to release the N-terminal peptides. The reaction mixtures were then analyzed by LC−MS to detect and quantify the amount of the desired thioether-linked macrocyclic product as well as that of the uncyclized linear peptide, as judged on the basis of the peak areas in the corresponding extracted-ion chromatograms (see Figure 4A and Supplementary Figures S3−S10). As summarized in Figure 2B, these studies revealed that the macrocyclization had occurred with very high efficiency (80−95%) across the constructs with Cys and O2beY being separated by one (Z+2) to seven (Z+8) residues. Increasing this distance (i.e., Cys at Z+10 and Z+12, entries 8 and 9 in Figure 2A) resulted in a noticeable increase of the acyclic product (50−80%, Figure 2B), thus defining the upper limits for the macrocycle size accessible through this method. Interestingly, when the Cys was located immediately adjacent to the unnatural amino acid (entry 1, Figure 2A), minimal cyclization (5%) was observed. A similar lack of reactivity was observed by Suga and co-workers in the context of in vitro translated peptides containing a cysteine-reactive N-terminal 2-chloroacetyl moiety, 33 and this result can be rationalized here on the basis of the unfavorable 14-membered macrocycle formed when the O2beY/Cys pair is in an i/i+1 relationship. For each construct tested, the identity of the macrocyclic product could be further confirmed by analysis of the corresponding MS/MS fragmentation spectrum as illustrated in Supplementary Figure S2.
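The peak-area quantification described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis script: it assumes that the cyclic and linear species ionize with comparable efficiency, so that the ratio of extracted-ion peak areas approximates the product ratio, and all function names are hypothetical. It also encodes the expected mass relationship between the two species, since thioether formation between the Cys thiol and the 2-bromoethyl group eliminates HBr.

```python
# Monoisotopic masses (Da): H = 1.007825, 79Br = 78.918338.
HBR_MONOISOTOPIC = 1.007825 + 78.918338


def cyclization_efficiency(area_cyclic: float, area_linear: float) -> float:
    """Percent macrocycle from extracted-ion chromatogram peak areas.

    Assumption: both species have a similar ionization response.
    """
    total = area_cyclic + area_linear
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * area_cyclic / total


def expected_cyclic_mass(linear_mass: float) -> float:
    """Monoisotopic mass of the thioether macrocycle, given the monoisotopic
    (79Br) mass of the uncyclized, bromine-containing linear peptide."""
    return linear_mass - HBR_MONOISOTOPIC
```

The linear peptide additionally shows the characteristic 79Br/81Br isotope doublet, which the macrocycle lacks; this provides an independent check when assigning the two peaks.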
Importantly, GyrA intein contains a Cys at its N-terminal end, which is crucial for mediating protein splicing in the context of our planned strategy for producing these peptide macrocycles inside the cells (Figure 1B). Since this residue is partially buried within the active site, 34 we did not expect it to readily react with the O2beY side-chain. Notably, quantitative splicing of the GyrA moiety upon treatment of all these constructs with benzyl mercaptan (Figure 4A and Supplementary Figure S17) indicated that no reaction occurred between O2beY and the catalytic Cys at the intein I+1 site. Furthermore, no adducts or dimers were observed for any of the constructs described above, including those undergoing only partial cyclization (i.e., entries 8 and 9, Figure 2A). Altogether, these results demonstrated the high chemo- and regioselectivity of the macrocyclization reaction.
To determine whether the thioether bond-forming reactivity is preserved if the order of Cys and O2beY is reversed, the two constructs corresponding to entries 10 and 11 in Figure 2A were prepared. Here, the reactive Cys is located upstream of the unnatural amino acid, specifically at positions Z-6 and Z-8. Analysis of these constructs according to the procedure described above (Supplementary Figures S11 and S12) revealed the occurrence of the desired cyclic peptide as the largely predominant product (>99%), i.e., with comparable (Z-6) or even higher (Z-8) efficiency than the corresponding Z+6 and Z+8 counterparts (Figure 2B). These data clearly showed that either arrangement of the Cys/O2beY pair within the target sequence is compatible with macrocyclization.
Having established the versatility of the approach toward obtaining structurally diverse macrocyclic peptides either linked to the N-terminus of a protein or in isolated form after intein splicing in vitro, we next investigated whether this strategy could be further evolved to permit the production of macrocyclic peptides in vivo. In previous studies, 18 we established that certain amino acid substitutions at the I−1 site, and in particular Asp and Lys, can strongly promote N-terminal splicing of GyrA intein during recombinant expression. This effect is likely due to the ability of these residues to favor hydrolysis of the intein-catalyzed thioester linkage through their nucleophilic side-chain groups. While undesirable in the context of our MOrPH synthesis methodologies, 18 we envisioned this reactivity could be leveraged here for mediating the spontaneous release of the macrocyclic peptide from the precursor protein as outlined in Figure 1B (path B). To test this idea, the constructs corresponding to entries 12 and 13 in Figure 2A were generated. In addition to an Asp residue at the I−1 site, these constructs were designed to encompass the sequence of two streptavidin-binding peptides previously isolated by phage display 35,36 as a way to facilitate the isolation of the target macrocyclic peptides from the cells. Accordingly, after expression of these constructs in E. coli, cells were lysed and the cell lysates were passed over streptavidin-coated agarose beads. To our delight, LC−MS analysis of the eluates revealed the occurrence of the expected peptide macrocycles, as illustrated by the chromatograms and MS/MS spectra in Figure 4B and Supplementary Figures S13 and S14. Since the uncyclized peptide could also be captured through this procedure, these analyses also showed that the desired macrocyclic product was formed with high efficiency in each case (i.e., >95% for Strep1-Z5C; 70% for Strep2-Z7C). Furthermore, both precursor polypeptides were found to have undergone complete splicing in vivo (Figure 4B and Supplementary Figure S18A,B). Since O2beY-mediated alkylation of the intein catalytic cysteine would prevent protein splicing, the latter results further highlighted the high degree of regioselectivity of the macrocyclization reaction. Collectively, the results obtained with the streptavidin-binding sequences of constructs Strep1-Z5C and Strep2-Z7C provide a proof-of-principle demonstration of the feasibility and efficiency of the strategy of Figure 1B for directing the synthesis of cyclopeptides in living bacterial cells. Interestingly, the cyclization yield observed with these sequences correlated very well with the reactivity trend measured across the previous constructs (Figure 2B), suggesting that this parameter is rather predictable on the sole basis of the Cys/O2beY distance, despite the difference in the composition of the target peptide sequence.
These positive results prompted us to test whether our bioinspired approach could be further extended to enable the ribosomal synthesis of bicyclic peptides via the integration of O2beY/Cys-mediated macrocyclization with split intein-catalyzed circular ligation. 19 If viable, this second strategy (Figure 1A(iii)) could provide the complementary capability of generating macrocyclic peptides that are constrained by means of an N-to-C-end cyclic backbone and an intramolecular inter-side-chain thioether linkage. Implementation of this design presented the challenge that two cysteine residues are involved in the trans splicing process leading to the head-to-tail cyclopeptide (referred to as Int C +1 and Int N +1 cysteine; see Supplementary Figure S1), which could potentially cross-react with O2beY. However, on the basis of the reactivity studies described in Figure 2B, we envisioned this challenge could be tackled by placing the unnatural amino acid in an i/i+1 arrangement with respect to the Int C +1 cysteine and by placing the reactive cysteine at a closer distance to O2beY compared to the Int N +1 cysteine, as schematically outlined in Figure 5.
According to these design principles, the cStrep3(C)-Z3C construct was prepared (entry 14, Figure 2A). A related construct where the Int C +1 cysteine is replaced with serine (entry 15, Figure 2A) was also prepared, as this substitution is compatible with split intein-catalyzed peptide cyclization. 23 In each case, the target peptide sequences were designed on the basis of a previously described HPQ motif-containing cyclopeptide capable of binding streptavidin 37 in order to facilitate isolation of the peptide products from the cells. Lysates from E. coli cells expressing these constructs were processed as described above for the other streptavidin-binding peptides. Notably, the desired bicyclic peptide was isolated as the largely predominant product in both cases (>95%), as determined by LC−MS. The bicyclic structure of these compounds was further evidenced by the corresponding MS/MS fragmentation spectra (Figure 4C and Supplementary Figures S15 and S16). Treatment of the bicyclic peptides with the thiol-alkylating agent iodoacetamide resulted in a 57 Da increase in molecular mass and a shift of the peptide retention time for the product of the cStrep3(C)-Z3C precursor protein but not for that of cStrep3(S)-Z3C, which is consistent with the presence of a free thiol (from the Int C +1 cysteine) in the former but not in the latter. To allow measurement of the extent of post-translational self-processing of these precursor polypeptides in vivo, a chitin-binding domain was included at the C-terminus of the Int N domain (Figure 2A). LC−MS analysis of the protein fraction eluted from chitin beads showed that the split intein-mediated cyclization had occurred quantitatively with both cStrep3(C)-Z3C (Figure 4C) and its serine-containing counterpart (Supplementary Figure S18D). Altogether, these results demonstrate the possibility of integrating our peptide cyclization strategy with split intein-mediated protein circularization to enable the formation of bicyclic peptides in E. coli.
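The iodoacetamide test above rests on simple mass arithmetic: each free thiol that is alkylated gains a carbamidomethyl group (C2H3NO, about 57.02 Da). The following sketch, with hypothetical function names and an assumed mass tolerance, shows how the number of free thiols could be inferred from the masses measured before and after treatment.

```python
# Monoisotopic mass added per alkylated thiol (carbamidomethyl, C2H3NO), in Da.
CARBAMIDOMETHYL = 57.021464


def alkylated_mass(peptide_mass: float, free_thiols: int) -> float:
    """Expected mass after every free thiol has reacted with iodoacetamide."""
    if free_thiols < 0:
        raise ValueError("free_thiols must be non-negative")
    return peptide_mass + free_thiols * CARBAMIDOMETHYL


def infer_free_thiols(mass_before: float, mass_after: float, tol: float = 0.05) -> int:
    """Infer the number of alkylated thiols from the observed mass shift.

    `tol` (Da) is an assumed instrument tolerance, not a value from the paper.
    """
    shift = mass_after - mass_before
    count = round(shift / CARBAMIDOMETHYL)
    if count < 0 or abs(shift - count * CARBAMIDOMETHYL) > tol:
        raise ValueError("mass shift is not a multiple of the carbamidomethyl mass")
    return count
```

Under this reading, a single +57 Da shift corresponds to one free thiol, as reported for the cStrep3(C)-Z3C product, whereas no shift corresponds to none, as for cStrep3(S)-Z3C, where the thioether-linked cysteine is no longer available for alkylation.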
In conclusion, we have developed two complementary, versatile methodologies to direct the production of 'natural product-like' macrocyclic peptides constrained by an intramolecular thioether bridge in bacterial cells. A key feature of these methods is that the structural elements and reactivity driving the peptide macrocyclization process are built into the genetically encoded polypeptide precursor. Using this approach, peptide macrocycles of various ring sizes and compositions could be prepared efficiently and with predictable regioselectivity. Importantly, the possibility of generating these macrocyclic peptides in protein-fused or isolated form makes the present methodologies directly amenable to integration with well-established display platforms (e.g., phage 38 or mRNA display 39 ) or intracellular selection systems, 20,21 respectively, for library screening. We also anticipate that the strategies outlined in Figures 1B and 5 can be of rather general value, that is, applicable to other cysteine-reactive unnatural amino acids. Work in this direction is currently ongoing in our group, which includes exploring the possibility of extending this approach to the synthesis of conformationally constrained peptides in eukaryotic cells.