Engineering a Highly Efficient Carboligase for Synthetic One-Carbon Metabolism

One of the biggest challenges in realizing a circular carbon economy is the synthesis of complex carbon compounds from one-carbon (C1) building blocks. Since the natural solution space of C1–C1 condensations is limited to highly complex enzymes, the development of simpler and more robust biocatalysts may facilitate the engineering of C1 assimilation routes. Thiamine diphosphate-dependent enzymes harbor great potential for this task, due to their ability to create C–C bonds. Here, we employed structure-guided iterative saturation mutagenesis to convert oxalyl-CoA decarboxylase (OXC) from Methylobacterium extorquens into a glycolyl-CoA synthase (GCS) that allows for the direct condensation of the two C1 units formyl-CoA and formaldehyde. A quadruple variant MeOXC4 showed a 100 000-fold switch between OXC and GCS activities, a 200-fold increase in the GCS activity compared to the wild type, and formaldehyde affinity that is comparable to natural formaldehyde-converting enzymes. Notably, MeOXC4 outcompetes all other natural and engineered enzymes for C1–C1 condensations by more than 40-fold in catalytic efficiency and is highly soluble in Escherichia coli. In addition to the increased GCS activity, MeOXC4 showed up to 300-fold higher activity than the wild type toward a broad range of carbonyl acceptor substrates. When applied in vivo, MeOXC4 enables the production of glycolate from formaldehyde, overcoming the current bottleneck of C1–C1 condensation in whole-cell bioconversions and paving the way toward synthetic C1 assimilation routes in vivo.


■ INTRODUCTION
The synthesis of complex molecules from one-carbon (C1) compounds is key to a circular economy. C1 compounds, in particular formate and methanol, can be derived directly from CO2 through several processes, including hydrogenation, photochemistry, electrochemistry, and biocatalysis, 1−5 and further serve as feedstock for the formation of value-added products through microbial fermentation. 6−9 Because natural methylo- and formatotrophic microorganisms are not well suited for large-scale biotechnological processes, current efforts aim at engineering well-established microbial platform organisms by implementing natural and new-to-nature C1-converting pathways into these microbes. 10−13 As highlighted by several recent studies, linear C1-converting pathways are particularly interesting 10,12,14 since they minimally interfere with the host metabolism and, unlike cyclic pathways, are robust toward the draining of intermediates. However, linear pathways require the direct condensation of two C1 units, which is chemically challenging because C1 nucleophiles are difficult to generate (in contrast to C1 electrophiles, which are readily available, notably in the form of CO2 and formaldehyde). Only two enzymes are known that catalyze direct C1−C1 condensation in nature: acetyl-CoA synthase (ACS) and glycine synthase (GS). They are the key enzymes of the reductive acetyl-CoA pathway and the reductive glycine pathway, respectively. 15,16 GS and ACS are structurally and mechanistically highly complex biocatalysts (the latter being limited to strictly anaerobic conditions), 17,18 which illustrates the challenge of direct C1−C1 condensations and their implementation in microbial platform organisms. Thus, the discovery or development of simpler C1−C1 carboligases could enable new-to-nature linear carbon fixation pathways and ultimately facilitate the engineering of whole-cell catalysts for efficient C1 conversions. 7,19
One way to generate nucleophilic (one-)carbon centers is to invert the reactivity of a carbonyl species through Umpolung, which enzymes can achieve through the cofactor thiamine diphosphate (ThDP). 20 In recent years, several ThDP-dependent enzymes have been engineered to catalyze C1-extension reactions, which underscores the general potential of these enzymes for synthetic carbon fixation pathways. The most prominent example is the new-to-nature enzyme formolase (FLS), which was derived from benzaldehyde lyase. 21 Similarly, glycolaldehyde synthase (GLS) was engineered from benzoylformate decarboxylase. 22 Although in both cases a very low initial activity was improved ∼100-fold by directed evolution, the final enzyme variants still exhibited low catalytic efficiencies (kcat/KM < 10 M−1 s−1), as well as poor affinity for highly toxic formaldehyde (KM ≥ 170 mM). While these engineered enzymes were indeed able to support new-to-nature linear C1 assimilation pathways in vitro, their low substrate affinity and activity have so far limited their implementation in vivo. 21,22
We recently identified another class of ThDP-dependent enzymes as potential formaldehyde carboligases. The members of the 2-hydroxyacyl-CoA lyase (HACL)/oxalyl-CoA decarboxylase (OXC) enzyme family catalyze the condensation of formyl-CoA with formaldehyde to produce glycolyl-CoA. This activity will be referred to as glycolyl-CoA synthase (GCS) (Figure 1A). 23,24 Notably, HACL from Rhodospirillales bacterium URHD0017 (RuHACL) displayed more than 10-fold higher catalytic efficiency (kcat/KM = 110 M−1 s−1) compared to the engineered FLS and GLS. However, efforts to further engineer HACL toward higher GCS activity have not been successful, 23 mainly because of the lack of structural information. So far, no HACL structure is available and homology models fail to accurately predict the structure of the C-terminal active site loop, which exhibits low sequence homology throughout the HACL/OXC family (Figure S1). Another limitation of HACLs is their poor production in E. coli. 23,24
To overcome these challenges with HACLs, we recently focused on repurposing OXCs, which naturally catalyze the decarboxylation of oxalyl-CoA, as formyl-CoA condensing enzymes. 24 Here, we aimed at improving the GCS activity of OXC to enable the production of glycolyl-CoA at high rates under physiologically relevant formaldehyde concentrations (<0.5 mM). Using structure-guided enzyme engineering, we converted OXC into a bona fide GCS through several rounds of iterative saturation mutagenesis (ISM) 25 and demonstrated its function in an E. coli whole-cell bioconversion system to improve current production limitations and pave the way toward synthetic C1 fixation pathways.

■ RESULTS
MeOXC Structure Identifies Targets for Mutagenesis. Previously, we showed that besides the decarboxylation of oxalyl-CoA into formyl-CoA, OXC from Methylorubrum extorquens (MeOXC) is also capable of catalyzing the acyloin condensation of formyl-CoA with various aldehydes, including formaldehyde. 24 However, the catalytic efficiency of MeOXC with formaldehyde (kcat/KM = 2 M−1 s−1) is far below that of RuHACL, which is the best performing HACL known to date (kcat/KM = 110 M−1 s−1). Additionally, the KM for both substrates is extremely high (formaldehyde: 100 mM and formyl-CoA: 3 mM; Table 1). We therefore sought to engineer MeOXC toward a more efficient GCS.
MeOXC is closely related to OXC from Oxalobacter formigenes (OfOXC, 63% identity) and E. coli (EcOXC, 61%), both of which were structurally characterized. 26−28 To gain insights into the specific active site topology of MeOXC, we solved the crystal structure of the enzyme at a resolution of 1.9 Å (Figure 1B). Overall, the structure is very similar to that of OfOXC (root-mean-square deviation, rmsd = 0.425 Å for 7230 aligned atoms). Because we could not observe the electron density after residue E567, we modeled the C-terminal part (16 residues) based on the OfOXC structure (PDB ID 2ji7). In OfOXC, this part is flexible and the electron density for the closed conformation was obtained only after soaking the enzyme with the substrate or the product. 28 During catalysis, OXC and HACL both form the same α-carbanion/enamine intermediate (Figures 1A and S2). In HACL, this intermediate is generated through proton abstraction from formyl-CoA or by cleaving off the aldehyde moiety from a 2-hydroxyacyl-CoA thioester, 29 whereas in OXC, this intermediate is formed by decarboxylation of oxalyl-CoA. 28 To increase the GCS activity of MeOXC, we sought to systematically alter the active site around the ThDP cofactor with ISM. 25 Based on the crystal structure, we identified eight residues in the proximity of the ThDP cofactor as targets: I48, Y134, E135, A415, Y497, E567, S568, and I571 (Figure 1C).
Establishing a High-Throughput Screen for GCS Activity. ISM requires the screening of thousands of variants. 30 We thus conceived a high-throughput screen based on the conversion of glycolyl-CoA to glycolate, which is subsequently oxidized to glyoxylate by glycolate oxidase (GOX) under stoichiometric production of H2O2 (Figure 2A). H2O2 is quantified by horseradish peroxidase (HRP), which catalyzes the oxidation of Ampliflu Red to the fluorophore resorufin.
To convert glycolyl-CoA into glycolate, we sought to establish a glycolyl-CoA:formate CoA-transferase (GFT). Such an enzyme would not only turn glycolyl-CoA into glycolate but would at the same time regenerate formyl-CoA from formate, thereby closing a catalytic cycle for the continuous conversion of formate and formaldehyde into glycolate. We screened formyl-CoA:oxalate CoA-transferase from Oxalobacter formigenes 31 (FRC), propionate CoA-transferase from Cupriavidus necator 32 (CnPCT) and Clostridium propionicum 33 (CpPCT), as well as 4-hydroxybutyrate CoA-transferase from Clostridium aminobutyricum 34 (AbfT). The latter showed the best GFT activity (kcat,app = 0.40 ± 0.01 s−1 and KM,app(glycolyl-CoA) = 12 ± 1 μM; Figures 2C,D and S3) and was chosen for the assay. We validated our screen with purified MeOXC, AbfT, human GOX, 35 and HRP (Figures 2B and S4) and further demonstrated that this assay could be used to quantify the MeOXC activity from E. coli lysates in 384-well plates.
[Table footnote: Errors reflect the standard deviation of three independent measurements. Michaelis−Menten plots are shown in Figure S7.]
Iterative Saturation Mutagenesis of MeOXC. Having established a high-throughput screen, we created saturation mutagenesis libraries of the eight identified active site residues using the 22c trick (Figure S5). 30 We employed subsaturating […] (Figure 3), suggesting that the isoleucines in both of these positions are critical for catalysis. Therefore, these residues were not screened in rounds R2−R6. The libraries of the remaining six residues mostly contained variants with WT-like activity but also some variants with significantly increased GCS activity, validating that these positions are potential targets to improve the enzyme activity (Figure 3). The best performing variant in R1 carried an alanine to cysteine substitution in position 415. Steady-state kinetics with the purified enzyme showed that in the A415C variant, KM for formaldehyde and formyl-CoA decreased 3- and 19-fold, respectively (Table 1), confirming that subsaturating formaldehyde concentrations could be used in our screens to identify enzyme variants with improved kinetics.
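The screening effort implied by a saturation mutagenesis library can be estimated with the standard oversampling formula T = −V ln(1 − P), where V is the number of equally likely codon variants and P is the desired probability of sampling any given variant. The following is a minimal sketch, assuming the 22 codons of the 22c trick are equally represented (real libraries are usually biased); the function name and the 95% target are illustrative choices, not values taken from this study.

```python
import math


def clones_for_coverage(n_codon_variants: int, coverage: float) -> int:
    """Number of clones to screen so that any given variant is sampled
    with probability `coverage`, assuming all variants are equally likely.

    Implements T = -V * ln(1 - P), rounded up to a whole clone.
    """
    if n_codon_variants < 1:
        raise ValueError("n_codon_variants must be positive")
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    return math.ceil(-n_codon_variants * math.log(1.0 - coverage))


# Codon counts per randomized position for common schemes.
CODONS_22C = 22   # 22c trick: 22 codons encoding 20 amino acids
CODONS_NNK = 32   # NNK degeneracy: 32 codons
CODONS_NNN = 64   # full randomization: 64 codons

if __name__ == "__main__":
    for name, codons in [("22c", CODONS_22C), ("NNK", CODONS_NNK), ("NNN", CODONS_NNN)]:
        print(name, clones_for_coverage(codons, 0.95))
    # Two positions randomized together with the 22c trick: 22 * 22 codon combinations.
    print("22c x 22c", clones_for_coverage(CODONS_22C ** 2, 0.95))
```

Under these assumptions, a single 22c position needs roughly a threefold oversampling, which is why the reduced codon set keeps single-site libraries within the reach of one microtiter plate.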
Based on the positive results of the first round, we continued with ISM using the best performing variant of each round as a template and saturating all remaining sites stepwise. In R2, the substitution S568G conferred an ∼3-fold improvement in formaldehyde affinity, prompting us to decrease the formaldehyde concentration to 25 mM in R3, in which we identified E135G. For R4, we further lowered the concentration of formaldehyde to 10 mM to identify the substitution Y497F. In R5, we screened the remaining residues 567 and 134; however, no more positive hits were found (Figure 3). We then screened a library in which positions 134 and 135 were combinatorially saturated (R6; 400 possible variants), but this library also contained no improved variants (Figure 3). Thus, after screening ∼3600 clones in six rounds, we obtained the final variant MeOXC4 carrying four substitutions: E135G, A415C, Y497F, and S568G.
Carboligation Activity is Improved at the Expense of Decarboxylation Activity in MeOXC1−4. Next, we characterized MeOXC4 and the intermediate variants from R1 to R3 (MeOXC1−3) in more detail. All variants were produced in E. coli at levels comparable to native EcOXC (Figures 4A and S6), indicating that their improvement was based on improved catalytic properties rather than on improved solubility and/or stability. The catalytic efficiency of the GCS activity improved over each round of ISM, ultimately resulting in 200- and 110-fold improvements in kcat/KM for formaldehyde and formyl-CoA, respectively, due to lowered KM for both substrates, in combination with a 10-fold increase in kcat (Table 1).
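The fold changes quoted for formaldehyde can be cross-checked from the rounded values given in the text: kcat/KM = 2 M−1 s−1 and KM = 100 mM for the wild type, and kcat/KM = 400 M−1 s−1 and KM = 5 mM for MeOXC4. The sketch below is only a consistency check on those rounded figures; the exact Table 1 entries are not reproduced here, and the helper name is hypothetical.

```python
import math


def kcat_from_efficiency(efficiency_per_M_s: float, km_mM: float) -> float:
    """Return kcat (s^-1) from kcat/KM (M^-1 s^-1) and KM (mM)."""
    return efficiency_per_M_s * (km_mM * 1e-3)


# Rounded formaldehyde parameters quoted in the text (assumed, not exact Table 1 data).
WT_EFFICIENCY, WT_KM_MM = 2.0, 100.0
V4_EFFICIENCY, V4_KM_MM = 400.0, 5.0

kcat_wt = kcat_from_efficiency(WT_EFFICIENCY, WT_KM_MM)
kcat_v4 = kcat_from_efficiency(V4_EFFICIENCY, V4_KM_MM)

fold_efficiency = V4_EFFICIENCY / WT_EFFICIENCY   # gain in kcat/KM
fold_km = WT_KM_MM / V4_KM_MM                     # reduction in KM
fold_kcat = kcat_v4 / kcat_wt                     # gain in kcat

if __name__ == "__main__":
    print(f"kcat WT     = {kcat_wt:.2f} s^-1")
    print(f"kcat MeOXC4 = {kcat_v4:.2f} s^-1")
    print(f"fold change: kcat/KM {fold_efficiency:.0f}x, KM {fold_km:.0f}x lower, kcat {fold_kcat:.0f}x")
    # The efficiency gain should factor into the KM and kcat contributions.
    assert math.isclose(fold_efficiency, fold_km * fold_kcat)
```

With these inputs, the 200-fold efficiency gain decomposes into a 20-fold lower KM and a 10-fold higher kcat, in line with the 10-fold kcat increase stated above.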
While GCS activity strongly increased over the course of ISM, our MeOXC variants gradually lost their native OXC activity (Figure 4C).
The final variant MeOXC4 retained less than 0.2% catalytic efficiency for oxalyl-CoA decarboxylation (kcat = 0.21 ± 0.01 s−1 and KM = 122 ± 16 μM), which was exclusively due to reduced kcat, as KM for oxalyl-CoA remained virtually identical to WT MeOXC (105 μM). 24 This finding is in line with previous studies on OfOXC and MeOXC, where residues Y134, E135, Y497, and S568 were replaced by alanine without significant changes to KM for oxalyl-CoA. 24,28 Interestingly, when added to the reaction, oxalyl-CoA did not affect the GCS activity of MeOXC4 (Figure S9), even at 1 mM, indicating that the enzyme's original substrate does not act as an inhibitor, despite still showing a very favorable apparent KM value. This is probably due to a decreased on-rate (kon) and/or an increased off-rate (koff) for oxalyl-CoA and an opposite trend for formyl-CoA. The latter was supported by further experiments, which showed that in the presence of formaldehyde, MeOXC4 preferred carboligation and converted oxalyl-CoA after decarboxylation directly into glycolyl-CoA (kobs = 0.15 s−1), releasing formyl-CoA only at a very slow rate of <0.01 s−1. This is in stark contrast to MeOXC WT, which shows a high formyl-CoA release rate of 98 s−1 after oxalyl-CoA decarboxylation, followed by slow carboligation (0.31 s−1) (Figure S9).
The four substitutions in MeOXC4 directly affected the catalytic activity, as well as the kon and koff rates of the different substrates, causing a specificity switch between native OXC (oxalyl-CoA decarboxylation) and GCS activity (formyl-CoA condensation with formaldehyde) of greater than 100 000-fold.
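The size of the specificity switch follows from two numbers reported above: the 200-fold gain in GCS catalytic efficiency and the retention of less than 0.2% of the OXC catalytic efficiency. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and 0.2% is treated as an upper bound, so the result is a lower bound on the switch.

```python
import math


def catalytic_efficiency(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/KM in M^-1 s^-1."""
    return kcat_per_s / km_molar


def specificity_switch(gain_in_new_activity: float, retained_fraction_native: float) -> float:
    """Fold change in the ratio (new activity / native activity).

    gain_in_new_activity: fold increase in efficiency of the new reaction.
    retained_fraction_native: fraction of native efficiency that remains (0..1).
    """
    if not 0.0 < retained_fraction_native <= 1.0:
        raise ValueError("retained fraction must lie in (0, 1]")
    return gain_in_new_activity / retained_fraction_native


# OXC parameters of MeOXC4 as quoted in the text: kcat = 0.21 s^-1, KM = 122 uM.
oxc_efficiency_meoxc4 = catalytic_efficiency(0.21, 122e-6)

# 200-fold GCS gain; less than 0.2% OXC efficiency retained (upper bound).
switch_lower_bound = specificity_switch(200.0, 0.002)

if __name__ == "__main__":
    print(f"OXC kcat/KM of MeOXC4 ~ {oxc_efficiency_meoxc4:.0f} M^-1 s^-1")
    print(f"specificity switch    > {switch_lower_bound:.0f}-fold")
    assert math.isclose(switch_lower_bound, 1e5)
```

Retaining 0.2% of the native efficiency corresponds to a 500-fold loss, and 200 × 500 gives the 100 000-fold figure; because less than 0.2% is retained, the true switch is larger.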
Reversible (De)carboxylation Activity of MeOXC is Lost in MeOXC4. To confirm that the switch in activity was achieved by suppressing the native OXC reaction, we sought to also test the reverse reaction of OXC (i.e., the carboxylation of formyl-CoA). We envisioned an enzyme cascade in which the product of the reverse reaction, oxalyl-CoA, is constantly removed from the equilibrium by further reduction into glycolate (Figure S10A). This setup would render the overall reaction thermodynamically favorable (ΔG ≈ −19 kJ mol−1) 36 and allow us to measure the reverse reaction, in contrast to earlier efforts, which had failed. 37 To convert oxalyl-CoA into glyoxylate, we characterized a putative oxalyl-CoA reductase PanE2 from M. extorquens 38 (Figure S10B). For the reduction of glyoxylate into glycolate, we employed GhrB from E. coli. 39 Combined with PanE2 and GhrB, MeOXC catalyzed the carboxylation of formyl-CoA at a rate of ∼0.5 min−1 (Figure S10C−E), while formyl-CoA carboxylation was not detectable in MeOXC4 (Figure S10D), confirming that this variant had indeed lost its native OXC activity.
Structural Basis for the Improved GCS Activity in MeOXC4. To rationalize the effect of the different substitutions on catalysis, we solved the crystal structure of MeOXC4 to a resolution of 2.4 Å, modeled its flexible C-terminus beyond E567 as described before, and compared it to MeOXC WT and OfOXC.
Substitution A415C, which replaced fully conserved A415 in the HACL/OXC family by cysteine, caused a decrease in the apparent KM value for formyl-CoA by more than an order of magnitude (Table 1). Structural analysis showed that C415 is in close proximity to S568 of the flexible C-terminal loop reaching the active site, suggesting that this substitution is important to (re-)organize the active site for the new substrate. Notably, no other amino acid substitution at this position has a beneficial effect on catalytic efficiency (Figure 3), indicating that the thiol moiety of cysteine is important for formyl-CoA accommodation. However, when testing the corresponding substitution A389C in RuHACL, GCS activity was decreased by almost 50% at subsaturating formyl-CoA concentrations (100 μM) (Figure S11), suggesting that formyl-CoA accommodation in MeOXC1−4 differs from RuHACL.
Substitution S568G lies within the flexible C-terminal loop, adjacent to A415C, and likely provides space to accommodate the bulkier side chain of C415. This is in line with the observation that S568G was only beneficial in combination with A415C and was not detected in the first ISM round (Figure 3).
The third substitution E135G caused the greatest structural change by creating an extra cavity at the active site (Figure 5). Notably, none of the ISM libraries at residue 134 contained a variant of improved activity (Figure 3), suggesting that Y134 is important for GCS catalysis, likely by facilitating the protonation of the glycolyl-CoA-ThDP intermediate (Figure S2). Thus, the E135G substitution does not serve in substrate accommodation but likely allows optimal positioning of Y134 during catalysis, which is supported by the fact that kcat increased approximately fourfold by the E135G substitution, while the apparent KM value for formyl-CoA slightly increased.
In the OfOXC active site, water molecules play a role during catalysis. Three ordered water molecules (W1−3) are observed in a structure with a trapped α-carbanion/enamine intermediate, with W2 (in close contact to W1 and W3) proposed to protonate the Cα of the intermediate (Figure S2). 28 W1 is hydrogen-bonded to residues corresponding to Y134 and E135, while Y497 and S568 form hydrogen bonds to W3. 28 In MeOXC4, this hydrogen bonding network to W1 and W3 is lost due to the E135G, Y497F, and S568G substitutions. This change in the water network likely helps to promote GCS activity at the expense of the OXC reaction. In summary, our structural analysis showed that the active site of MeOXC4 is reorganized for improved substrate accommodation and increased catalytic activity.
MeOXC4 Shows a Broad Substrate Scope for Different C1 Extensions. Having improved the catalytic efficiency of C1 extensions of formaldehyde by MeOXC4 by more than 2 orders of magnitude, we wondered whether the engineered enzyme would also promote the acyloin condensation of formyl-CoA with acceptor substrates other than formaldehyde. Indeed, MeOXC4 accepted a broad range of aldehyde substrates, including small hydrophilic and aliphatic aldehydes, bulky hydrophobic aldehydes, and acetone (Figure 4B). The activity for all tested substrates was significantly higher than that for WT MeOXC (up to a factor of 340 in the case of phenylacetaldehyde). This broad substrate scope makes MeOXC4 a versatile catalyst for the C1 extension of carbonyl acceptors that can be exploited for the biocatalytic production of valuable 2-hydroxy acids, such as lactic acid, mandelic acid, and 3-phenyllactic acid.
Application of MeOXC4 to One-Carbon Bioconversion in E. coli Whole Cells. Finally, we aimed at testing the performance of our engineered enzymes in vivo. We previously established an Escherichia coli whole-cell bioconversion system based on RuHACL. 23 When combined into one pathway with an acyl-CoA reductase from Listeria monocytogenes (LmACR) and E. coli aldehyde dehydrogenase AldA (EcAldA), whole cells converted formaldehyde into glycolate (Figure 6A). However, despite several attempts to optimize glycolate production, carbon flux was insufficient to support biotechnological applications, mainly due to the low abundance and the high apparent formaldehyde KM value of RuHACL, which posed a major bottleneck. 23 To test the effects of engineered MeOXC in vivo, we replaced RuHACL with variants MeOXC1−4 (Figure 6B). In line with the increasing GCS activities of the different MeOXC variants, glycolate production increased successively. The best mutant, MeOXC4, outperformed RuHACL in glycolate productivity twofold (Figure 6B), likely because of the enzyme's more favorable kinetics (kcat/KM = 400 M−1 s−1 vs 110 M−1 s−1 for RuHACL) as well as the improved expression of MeOXC4 compared to RuHACL, even when the latter was codon-optimized, as confirmed for the MG1655-derived host strain (Figure S12).
These results suggested that the other enzymes of the pathway, in particular LmACR (kcat/KM = 95 M−1 s−1), 23 became rate limiting. To optimize the concentration of all enzymes in the pathway, we tested different expression levels, using a two-plasmid system with independent inducible promoters. MeOXC4 was expressed from an IPTG-inducible T7 promoter, while LmACR and EcAldA were coexpressed from a cumate-inducible T5 promoter (Figure 6C). Combinatorial screening of several IPTG and cumate concentrations revealed that the highest glycolate production levels were reached at low induction levels of MeOXC4 ([IPTG] = 40 μM) and high induction levels of LmACR/EcAldA ([cumate] = 400 μM) (Figure 6C). This is in contrast to our previous results, where productivity was the highest at high RuHACL and low LmACR/EcAldA induction levels. 23 Additionally, glycolate productivity decreased more than 2-fold when we replaced LmACR with a different ACR, Rhodopseudomonas palustris PduP, 40 which shows a higher kcat value but also a higher KM for formyl-CoA (Figure S13). Taken together, these results support the hypothesis that the glycolate productivity of our system is currently limited not by C1−C1 condensation but by ACR activity in vivo. Thus, enhancing ACR activity will be key toward improving and further developing this C1 fixation pathway for biotechnological applications.

■ DISCUSSION
Through ISM, we successfully evolved OXC from M. extorquens into a bona fide GCS by switching the decarboxylation and carboligation activities of the enzyme by 100 000-fold (Figure 4C). Notably, none of the newly introduced amino acids of MeOXC4 is found in any HACL homolog (Figure S1), indicating that through directed evolution, an alternative (presumably local) maximum in the GCS activity landscape was found. This raises the question of how the reactivity of OXC and HACL is determined.
It has been proposed that in OXC after decarboxylation, the α-carbanion/enamine intermediate is nonplanar, rendering the Cα more basic and facilitating the rate-limiting protonation step, yielding formyl-CoA. 28 In contrast, an enamine-like planar structure was observed in ThDP-dependent carboligases that require a carbonyl acceptor substrate. 41−43 It is tempting to speculate that HACLs and MeOXC4 also stabilize the enamine state of the intermediate and thereby favor nucleophilic attack on the carbonyl acceptor substrate.
Notably, engineered MeOXC4 shows kinetic parameters that are comparable with natural formaldehyde-converting biocatalysts. Only two naturally occurring enzymes are known to be involved in formaldehyde fixation, 3-hexulose-6-phosphate synthase (HPS) in the ribulose monophosphate pathway and formaldehyde transketolase (FTK) in the dihydroxyacetone pathway. The reported values of the Michaelis constant for formaldehyde range from 0.15 to 3 mM and 0.4 to 1.9 mM for HPS and FTK, respectively. 44−47 With an apparent KM value of 5 mM for formaldehyde, engineered MeOXC4 is fully compatible with an in vivo application, in contrast to other (engineered) C1−C1 carboligases that show apparent KM values of 170 mM (GLS) and 29 mM (RuHACL) for formaldehyde, concentrations that are toxic to E. coli. Compared to ACS and GS, MeOXC4 is a low-complexity C1−C1 carboligase that is homotetrameric (i.e., requires only one gene), oxygen-insensitive, and requires only ThDP and Mg2+ as cofactors. Additionally, MeOXC4 can be produced at high levels in E. coli, which makes it a versatile tool for C1 extensions. Taken together, our results highlight the potential of enzyme engineering to create new-to-nature C1 enzymes and pathways for sustainable (bio)catalysis and biotechnology.
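The practical consequence of the lower KM can be illustrated by comparing per-enzyme turnover at a physiologically relevant formaldehyde concentration. The sketch below assumes simple single-substrate Michaelis−Menten behavior with saturating formyl-CoA and no inhibition, and it back-calculates kcat from the kcat/KM and KM values quoted in the text; those derived kcat values, the function names, and the chosen 0.5 mM concentration are illustrative assumptions rather than reported measurements.

```python
def turnover_per_enzyme(efficiency_per_M_s: float, km_mM: float, substrate_mM: float) -> float:
    """Michaelis-Menten turnover v/[E] in s^-1.

    kcat is derived as (kcat/KM) * KM; the rate is kcat * S / (KM + S).
    """
    kcat = efficiency_per_M_s * km_mM * 1e-3  # s^-1
    return kcat * substrate_mM / (km_mM + substrate_mM)


# Formaldehyde parameters quoted in the text (kcat/KM in M^-1 s^-1, KM in mM).
ENZYMES = {
    "MeOXC4": (400.0, 5.0),
    "RuHACL": (110.0, 29.0),
}

FORMALDEHYDE_MM = 0.5  # upper end of the physiologically relevant range named in the text

rates = {
    name: turnover_per_enzyme(eff, km, FORMALDEHYDE_MM)
    for name, (eff, km) in ENZYMES.items()
}
advantage = rates["MeOXC4"] / rates["RuHACL"]

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.3f} s^-1 at {FORMALDEHYDE_MM} mM formaldehyde")
    print(f"MeOXC4 / RuHACL turnover ratio: {advantage:.1f}")
```

Because 0.5 mM lies well below both KM values, the ratio stays close to the ratio of catalytic efficiencies; the comparison says nothing about expression level, which the whole-cell experiments indicate also differs between the two enzymes.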
Materials and Methods; strains and plasmids used in this study (Table S1); primers used in this study, where N = A, C, G, T; D = A, G, T; H = A, C, T; V = A, C, G (Table S2); data collection and refinement statistics (Table S3); concentrations of substrates and enzymes in determination of GCS steady-state parameters of MeOXC variants (Table S4); representative MSA of the HACL/OXC enzyme family (Figure S1); proposed catalytic cycle of GCS (Figure S2); screening CoA-transferases for GFT activity (Figure S3); validation of the GCS screen (Figure S4); […] condensation reactions of MeOXC and MeOXC4 (Figure S8); effect of oxalyl-CoA on GCS activity of MeOXC4 (Figure S9); reverse reaction of OXC (Figure S10); comparing the GCS activity of RuHACL G390N (WT) and RuHACL A389C G390N (Figure S11); expression analysis of formaldehyde to glycolate conversion pathway enzymes (Figure S12); effect of PduP as an ACR on the whole-cell conversion of formaldehyde to glycolate (Figure S13) (PDF)