Mechanism of Biocatalytic Friedel–Crafts Acylation by Acyltransferase from Pseudomonas protegens

Acyltransferases isolated from Pseudomonas protegens (PpATase) and Pseudomonas fluorescens (PfATase) have recently been reported to catalyze the Friedel–Crafts acylation, providing a biological version of this classical organic reaction. These enzymes catalyze the cofactor-independent acylation of monoacetylphloroglucinol (MAPG) to diacetylphloroglucinol (DAPG) and phloroglucinol (PG) and have been demonstrated to have a wide substrate scope, making them valuable for potential applications in biocatalysis. Herein, we present a detailed reaction mechanism of PpATase on the basis of quantum chemical calculations, employing a large model of the active site. The proposed mechanism is consistent with available kinetics, mutagenesis, and structural data. The roles of various active site residues are analyzed. Very importantly, the Asp137 residue, located more than 10 Å from the substrate, is predicted to be the proton source for the protonation of the substrate in the rate-determining step. This key prediction is corroborated by site-directed mutagenesis experiments. Based on the current calculations, the regioselectivity of PpATase and its specificity toward non-natural substrates can be rationalized.


INTRODUCTION
The Friedel−Crafts acylation is the Lewis or Brønsted acidcatalyzed electrophilic aromatic substitution for the synthesis of aromatic ketones using, for example, acyl chlorides or anhydrides as the acylating agents (Scheme 1a). 1 Recently, acyltransferases isolated from the bacteria Pseudomonas protegens (PpATase) and Pseudomonas fluorescens (PfATase) 2 have been reported to catalyze the Friedel−Crafts acylation, 3 providing a biological alternative to the traditional strategy. In contrast to the common O-, N-, and S-acylation reactions, the PpATase and PfATase enzymes catalyze C-acylation/deacyla-tion of monoacetylphloroglucinol (MAPG) to diacetylphloroglucinol (DAPG) and phloroglucinol (PG), 2,3 as shown in Scheme 1b. The reaction is cofactor-independent and does not require high energy-activated acyl donors, such as 1-Oacylglucosides, acyl carrier proteins, or quinic acid ester. These enzymes are thus of potential interest for the synthesis of aromatic ketones.
PpATase is a multimeric enzyme composed of three subunits, PhlA, PhlB, and PhlC, 2a,4 which are arranged in a Phl(A2C2)2B4 composition, according to the X-ray structures solved very recently. 5 Mutational analysis reveals that only the PhlC subunit is directly involved in the catalysis. 5 In the crystallographic structure, the Cys88 residue was found to be acetylated when MAPG was used as a substrate (Figure 1), and mutation of this residue eliminated the activity of the enzyme. Combining these experimental findings, Cys88 has been suggested to be crucial for the catalysis and is likely to act as a nucleophile to attack the acyl group of the substrate. 5 Also, mutations of other residues, such as His56, His144, Tyr298, His347, Ser349, and Asp352, showed extremely low or no activity. 5 The roles of these residues are, however, unclear.
The substrate scope of PpATase has been investigated in recent studies, which demonstrated that this enzyme is capable of catalyzing the Friedel−Crafts acetylation of various resorcinol derivatives by using a wide range of acyl donors, such as acetate derivatives, phenyl esters, and thioesters. 3a, 6,7 Importantly, the acetylation is found to be highly chemo-and regioselective. No O-acetylated, polysubstituted, or hydrolyzed product was observed. The acetyl group was exclusively added by the enzyme to the carbon in position 4 of the resorcinol substrates, thus in the ortho-position to one hydroxyl group and in para-position to the other hydroxyl group. PpATase can be regarded as a valuable biocatalyst for the selective synthesis of various aryl ketones by C−C bond formation. 8 Furthermore, this enzyme exhibits promiscuous activity in catalyzing the amide formation in aqueous buffer using aniline derivatives as substrates, 9 and very interestingly, it also enables the Fries rearrangement of resorcinol derivatives. 3a This increases further the biocatalytic potential of PpATase.
The enzyme is active on resorcinol-derived compounds as acetyl acceptors, but is inactive toward phenol, catechol, or hydroquinone. 3a This means that at least two hydroxyl groups in 1,3-position are required for the acylation.
Mechanistic insights into the reaction mechanism will be helpful in order to further improve the catalytic performance of PpATase, as only limited knowledge is known to date. In the present study, the detailed reaction mechanism of this enzyme is investigated using quantum chemical methodology. 10 A large model of the active site, consisting of more than 400 atoms, is designed on the basis of the crystal structure of the PG-bound enzyme with an acylated Cys88, 5 and various enzyme− substrate complexes with different orientations of the MAPG substrate are first considered. After establishing the binding mode of the substrate, the mechanistic investigation is conducted, and a reaction mechanism with feasible energy barriers is proposed. The calculations will make predictions as to the nature of key residues, and mutational experiments are conducted to examine these suggestions. The regioselectivity of PpATase and the specificity toward non-natural substrates will also be discussed and rationalized.
Here, it is important to stress that from the computational point of view, the present study is challenging in terms of the large size of the employed active site model and the large number of calculations required to determine the reaction pathway. The active site of the enzyme includes many polar residues, which can direct the substrate in different orientations by forming hydrogen bonds to the three hydroxyl groups and the acetyl group on the aromatic ring of the substrate. Thus, a large model has been necessary in order to accurately describe the protein environment and the different binding modes. For each binding mode, structures with different conformations and rotamers have to be considered in order to ensure that the lowest energy structures are located. Moreover, the protonation states of certain residues have also to be examined, increasing further the computational efforts.

COMPUTATIONAL DETAILS
2.1. Technical Details. The calculations were performed using the Gaussian 09 program 11 with the B3LYP hybrid density functional method. 12 Dispersion effects were described by the D3-BJ version of Grimme's empirical method 13 and were included in all calculations, including the geometry optimizations. The geometry optimizations were carried out with the 6-31G(d,p) basis set, and single-point energies were calculated at the same level of theory using the SMD solvation model 14 with the value of dielectric constant ε = 4. Frequency calculations were preformed to obtain zero-point energies (ZPEs). To get more accurate electronic energies, single-point calculations on the optimized structures were performed with the larger basis set 6-311+G(2d,2p). The values presented throughout the paper are thus the large basis set energies (which include dispersion effect) corrected for ZPE and solvation effects.
2.2. Active Site Model. A model of the active site was designed on the basis of the crystal structure of PpATase in complex with PG, in which the Cys88 residue is acetylated (PDB 5MG5). 5 To construct the active site model in the initial enzyme−substrate complex form, the acetylated Cys88 was modified back to the native form, and PG was replaced by MAPG. To arrive at the final active site model, preliminary models were first examined, with different sizes, substrate binding modes, hydrogen bonding patterns, and protonation states of various residues. These initial models were also used to scan possible reaction mechanisms. Here, we present the results of the largest model (shown in Figure 2), which yielded the most feasible energies and energy barriers. Apart from the MAPG substrate and Cys88, the model includes the residues forming hydrogen bonds with the substrate (His56, Tyr124, His144, Tyr298, and His347) and other residues forming the active site cavity (Glu58, Ala86, Asn87, Thr89, Glu116, Tyr127, Ile128, Ser130, Ser131, Thr132, Asp137, Thr145, Phe148, Leu209, Trp211, Leu300, Ala348, Ser349, Asp352, Leu383, Gly384, Gly385, Tyr386, and His389). The final active site model consists of 413 atoms, with an overall charge of −2. To avoid unrealistic movements during the geometry optimizations, a number of atoms were kept fixed, as shown in Figure 2. The fixed atoms are at the edge of this large model, allowing thus enough flexibility to the active site residues to adapt to the changes of geometries along the reaction. This procedure results in a number of small imaginary frequencies that can be ignored because they do not contribute significantly to the ZPE and thus to the final energy.
The protonation states of the various active site residues in the model are worth a comment here. First, it has been previously suggested that Cys88 is in the ionized form already before transformation of the substrate under the reaction condition because no base was found in the vicinity of this residue to affect its deprotonation. 5 However, we noticed that the Asp352 residue is in fact less than 5 Å away from the side chain of Cys88 and could possibly be the general base

ACS Catalysis
Research Article responsible for its activation. To test this, we optimized the geometry of the enzyme−substrate complex with either Cys88 or Asp352 in the deprotonated form and the other in the neutral form. Indeed, the calculations show that the activation of Cys88 by Asp352 is a plausible scenario because the energy of the enzyme−substrate complex in which Cys88 is in the deprotonated form, and Asp352 is neutral is more than 5 kcal/ mol lower than when Cys88 is neutral and Asp352 is ionized.
In the active site model, Cys88 was thus chosen to be in the ionized form and Asp352 in the neutral form. Because His347 forms a hydrogen bond to Asp352, this histidine residue was also modeled as neutral. Similarly, His144 and Asp137 residues, which are bridged by Thr132, were chosen to be in their neutral forms. Initially, the His56 and Glu58 residues, bridged by Tyr127, were also modeled as neutral species. However, a proton transfer from Glu58 to His56 took place spontaneously during the geometry optimization, resulting in an ionized Glu58 and a protonated His56 (Figure 2). The Glu116 residue is chosen to be in the ionized form because preliminary calculations showed that it acts as a general base to deprotonate the hydroxyl group of the substrate in the first step of the reaction. Finally, the other ionizable residue His389 is modeled in the neutral form because no negatively charged group is located in the vicinity.

RESULTS AND DISCUSSION
The overall reaction of PpATase consists of two half-reactions: the acylation of enzyme by the first MAPG substrate (first halfreaction) and the subsequent deacylation of enzyme by acyl transfer to the second MAPG substrate (second half-reaction, Scheme 2). As will be demonstrated below, the deacylation process follows the reverse of the acylation reaction in terms of the step sequence, with only some small differences in the energies. Therefore, a detailed discussion of the acylation halfreaction will be presented first, while the deacylation process will be only briefly discussed. Next, support for the suggested mechanism will be presented in the form of mutagenesis experiments. Finally, a number of alternative mechanistic scenarios that were examined and that were found to be associated with higher energies will be discussed briefly.
3.1. Reaction Mechanism. In the enzyme−substrate complex (called E:S, see Figure 2), the MAPG substrate is found to bind such that the acyl group points toward the area defined by His144, Phe148, and Leu383. The carbonyl oxygen of MAPG forms a hydrogen bond to His144 and an intramolecular hydrogen bond to the C3 ortho-hydroxyl group, which in turn interacts with His56. The C1 orthohydroxyl group forms a hydrogen bond to Tyr298, and the hydroxyl group at para-position (C5−OH) forms hydrogen bonds to His347 and Tyr124. The structure of E:S displays a high similarity to the X-ray structure of the active site (see Supporting Information for a superposition of the two structures).
Starting from E:S, the first step of the reaction is the deprotonation of C5−OH by Glu116, via Tyr124 (E:S→Int1, Scheme 3). The resulting intermediate Int1 is only 0.3 kcal/ mol lower than E:S, and this step is calculated to be essentially barrier-less 15 (see Figure 3 for the energy profile and Figure 4 for the optimized structures).
Next, nucleophilic attack of Cys88 on the carbonyl carbon takes place to form a covalent bond between the enzyme and the substrate (Int1 → [TS1] → Int2, Scheme 3). Concurrently with the formation of the C−S bond, a proton is transferred from the C3−OH of the substrate to the forming alkoxide. The calculated barrier for this step is 13.7 kcal/mol relative to Int1, and the formed tetrahedral intermediate (Int2) has the same energy as TS1 (Figure 3). At TS1, the distance of the forming C−S bond is 2.35 Å. The proton transfer event here is facilitated by a hydrogen bond from the positively charged His56 to the ortho-hydroxyl group. His56 is thus predicted to be important for catalysis, which is consistent with experimental site-directed mutagenesis showing that the replacement of this residue by either serine or alanine renders the enzyme inactive. 5 Upon formation of Int2, the reaction continues by a proton transfer from Asp137 to His144 via Thr132, resulting in the

ACS Catalysis
Research Article formation of intermediate Int3, which has a very similar energy compared to Int2 (Figure 3). His144 then protonates the C2 carbon of the substrate (Int3 → [TS2] → Int4). The calculations predict thus that the remote aspartic acid residue Asp137 located more than 10 Å away from the C2 is the source of the proton, which is shuttled to the substrate via Thr132 and His144 relays (Figure 4). The step of the C2 protonation is calculated to be rate-limiting for the first halfreaction, with an overall barrier of 17.9 kcal/mol relative to Int1, and the resulting intermediate (Int4) is 1.8 kcal/mol lower than Int1. An interesting feature here is that Int4 has low energy, although the aromaticity of the substrate is disrupted. The hydroxyl group at the para-position has a key role because its deprotonation in the previous step facilitates the formation of a relatively stable dienone intermediate, in which a significant degree of conjugation is maintained.
Next, C−C bond cleavage takes place to form the acylated enzyme and the PG product in the deprotonated form. In this

ACS Catalysis
Research Article step, a proton is also simultaneously transferred from the C7− OH back to the ortho-hydroxyl group (Int4 → [TS3] → Int5, Scheme 3). The barrier is calculated to be 13.2 kcal/mol relative to Int4 (Figure 3). At the corresponding transition state TS3, the breaking C−C bond is 2.06 Å (Figure 4). The first half-reaction is completed with the protonation of the para-hydroxyl group by Glu116 via Tyr124 (Int5 → Int6), and a step is calculated to be thermoneutral. The acylated enzyme Int6 is calculated to be 2.5 kcal/mol lower than E:S.
The optimized structure of Int6 from the calculations is very similar to the crystal structure (PDB 5MG5), 5 which represents the intermediate after the first half-reaction (see Supporting Information for a superposition of the two structures). This structural similarity lends further support to the obtained mechanism and the validity of the active site model used in the present study.
The reaction proceeds to the second half-reaction by a ligand exchange (Int6 → Int7, Scheme 3), in which the PG product of the first half-reaction exits the active site and a second MAPG substrate enters. The binding energy of MAPG relative to PG (ΔΔE) can be estimated using the following equation

Research Article
where E(PG) aq and E(MAPG) aq refer to the energies of PG and MAPG in aqueous solution, respectively. Using this equation, Int7 is calculated to be 2.5 kcal/mol lower than Int6 (and thus 5.0 kcal/mol lower than E:S). Although the procedure used to estimate this energy is simple, compared to for instance the much more expensive free energy perturbation technique, the obtained value is quite reasonable, considering the high similarity between the PG and MAPG compounds. In Int7 (Figure 5), which is the lowest energy binding mode of MAPG to the active site of the acylated enzyme, the acyl group of MAPG points toward the area defined by Trp211, Leu209, and Leu300. The C5−OH forms a hydrogen bond with His144, and the C3−OH interacts with Tyr124. By overlapping the structures of Int6 and Int7, some small changes of the geometries of the various active site residues are observed, indicating that the active site model has enough flexibility to accommodate the second MAPG substrate, which is somewhat larger than the leaving PG (see Supporting Information). We have considered a number of different binding modes for the second substrate, in particular one in which it binds in a similar fashion as the first substrate in E:S. Although the energy of this alternative binding mode is not high compared to Int7 (+1.6 kcal/mol), the calculations show that this structure is unproductive because the following intermediate is very high in energy (see Supporting Information for details).
Starting from Int7, the reaction follows the reverse of the first half-reaction (see Supporting Information for the optimized structures). A proton is first transferred from the C3−OH to Glu116 via Tyr124 (Int7 → Int8, Scheme 3), followed by a C−C bond formation between the C6 carbon of the substrate and the acyl group covalently bonded to Cys88 (Int8 → [TS4] → Int9). The C−C bond formation is concurrent with a proton transfer from C5−OH to the carbonyl oxygen, highlighting again the importance of a hydroxyl group at the ortho-position.
Next, the proton at the C6 position of the resulting dienone intermediate is abstracted by Asp137 via the Thr132 and His144 relays (Int9 → [TS5] → Int10 → Int11). The deprotonation of C6−H is calculated to be the rate-limiting step of the second half-reaction, with a barrier of 19.5 kcal/mol relative to Int7 (Figure 3). The final C−S bond cleavage occurs readily (Int11 → [TS6] → E:P) with a barrier of only 3.5 kcal/mol relative to Int11, and the final enzyme−product complex (E:P) product is 4.6 kcal/mol lower than E:S.
Although the sequence of steps of the second half-reaction corresponds to the reverse of the first half-reaction (see Scheme 3), there are some differences in the individual energies between the two. For example, the barrier from Int9 to Int10 via TS5 is 15.5 kcal/mol, while the corresponding barrier in the reverse direction of the first half-reaction, that is, from Int4 to Int3 via TS2, is 19.7 kcal/mol. The experimentally measured k cat for the formation of DAPG is 1.2 s −1 , 2a which corresponds to a barrier of ca 17 kcal/mol. In the present study, the protonation of the first substrate (Int3 → [TS2] → Int4, Scheme 3) and the deprotonation of C6−H of the second substrate (Int9 → [TS5] → Int10) are the rate-limiting steps of the first and second half-reactions, respectively, with calculated barriers of 17.9 and 19.5 kcal/mol, respectively. The calculated ratelimiting barrier for the overall reaction is thus 19.5 kcal/mol, which is in a good agreement with the experimental value.
An important finding here is that a proton transfer to the forming alkoxide takes place concurrently with the formation of enzyme−substrate adducts by either a C−S bond formation in the first half-reaction (Int1 → [TS1] → Int2) or a C−C bond formation in the second half-reaction (Int8 → [TS4] → Int9). In both cases, the proton source is a hydroxyl group in the ortho-position, which is C3−OH of the first MAPG substrate and C5−OH of the second MAPG substrate, respectively. This can be applied to explain the experimentally observed regioselectivity and substrate specificity on nonnatural acyl acceptors. 3a Namely, the enzyme is inactive toward phenol, catechol, or hydroquinonebut is active on various resorcinol-derived compounds. Thus, at least two hydroxyl groups are required for the acylation and these two groups have to be in 1,3-position relative to each other. The acyl group is exclusively added to the carbon at the ortho-position of one hydroxyl group and the para-position of the other hydroxyl group. According to the calculations on the natural reaction discussed above, the para-hydroxyl group is crucial for the reaction, as it is the key for the formation of the dienone intermediate (Int4 and Int9). The need of the ortho-hydroxyl group is explained by that a proton source is required for the protonation of the forming alkoxide in the reaction.
Here, it is important to mention that a number of alternative mechanistic scenarios have also been examined in the present study, all of which were calculated to be associated with high energies. Although negative, these results provide valuable information for the understanding of the mechanism and will therefore be described briefly.
According to the mechanism proposed in Scheme 3, the first step of the entire reaction is the deprotonation of the parahydroxyl group (C5−OH) by Glu116, with Tyr124 acting as a relay. This proton transfer process is important for the formation of the key dienone intermediate with a C5O double bond (Int4) in the first half-reaction. However, we notice that even without the deprotonation of C5−OH taking place, a dienone intermediate could still be formed with the C3O double bond instead because the C3-hydroxyl group is deprotonated concertedly with the C−S bond formation (TS1). We have examined the energetic feasibility of the first

ACS Catalysis
Research Article half-reaction without involving the deprotonation of C5−OH (see Supporting Information for details). The calculations show that the energy of TS1 is very similar, but the barriers of the following two steps are considerably higher. Very importantly, the protonation of the aromatic ring (TS2) has a prohibitively high barrier of 24.7 kcal/mol. These results clearly show that the deprotonation of the para-hydroxyl group of the substrate in the first half-reaction is prerequisite for the catalysis.
Another mechanistic scenario that has been suggested previously is that His347 is the general base that deprotonates the C5−OH group 5 and not Glu116 as in the proposed mechanism shown in Scheme 3. In E:S, the proton of the imidazole ring of His347 is at the Nε nitrogen, engaged in a hydrogen bond to the C5−OH group. For His347 to act as the base, the proton must therefore be at the Nδ position. We have optimized the structure of this tautomer (see Supporting Information), and the energy was found to be ca 15 kcal/mol higher than E:S. This result shows that the scenario of His347 being the general base can be ruled out.
3.2. Experimental Mutational Analysis. As discussed above, one important prediction from the calculations is that the remote Asp137 residue acts as the general acid/base in the rate-limiting protonation and deprotonation of the substrate (Scheme 3). These proton transfer events are mediated by Thr132 and His144. Replacement of His144 by alanine or serine has already been shown to eliminate the activity of the enzyme, 5 supporting the suggested role of this residue.
To verify the role of Asp137, this residue was exchanged to an asparagine experimentally. Indeed, the resulting D137N variant did not lead to any detectable product formation (< 0.1%), although the enzyme was successfully expressed in the soluble form (see Supporting Information for details on the experiments). This strongly supports the suggested function of Asp137.

CONCLUSIONS
In the present work, we have used quantum chemical calculations to investigate the mechanism of the Friedel− Crafts reaction catalyzed by acyltransferase from P. protegens. A very large model of the active site, consisting of more than 400 atoms, was designed on the basis of the crystal structure, and a detailed energy profile was produced for the acylation of monoacetylphloroglucinol (MAPG) to diacetylphloroglucinol (DAPG) and phloroglucinol (PG).
The reaction mechanism proposed on the basis of the calculations is shown in Scheme 3, and the obtained energy graph is given in Figure 3. The overall reaction consists of two half-reactions, as shown in Scheme 2: (1) the acylation of the enzyme by the first MAPG substrate and (2) the acyl transfer from the acylated enzyme to the second MAPG substrate. The calculations show that the second half-reaction follows the reverse of the first half-reaction in terms of the elementary steps, with some small differences in the calculated energies and barriers.
As indicated in Scheme 3, the first step is the deprotonation of the para-hydroxyl group of the substrate by Glu116 (via Tyr124). This is demonstrated to be prerequisite for the catalysis because it allows for the formation of a stable dienone intermediate in the following steps.
Next, the calculations show that the nucleophilic attack by Cys88 on the carbonyl carbon of the substrate takes place concurrently with a proton transfer from the ortho-hydroxyl to the forming alkoxide. This proton transfer is facilitated by hydrogen bonding to His56, which rationalizes the previously observed important role of this residue in the catalysis. 5 The involvement of the two hydroxyl groups in the catalysis helps thus to rationalize the experimentally observed regioselectivity of PpATase and also its specificity on nonnatural acyl acceptor substrates. It has namely been demonstrated that the substrate has to bear two hydroxyl groups and they have to be in a 1,3-position. Also, it has been shown that the acyl group will be added exclusively to the carbon ortho to one hydroxyl group and para to the other. These observations are hence consistent with the proposed mechanism.
An important prediction of the calculations is that the Asp137 is responsible for the protonation of the substrate, through a hydrogen bonding network involving His144 and Thr132. This is very interesting, considering that Asp137 is located more than 10 Å away from the substrate. Such longrange proton transfer chains through hydrogen bonding networks are not uncommon in enzymes. Important examples include the proton pumping in cytochrome c oxidase 16 and the activation of the pyridoxal 5′-phosphate (PLP) cofactor in aspartate aminotransferase. 17 The involvement of Asp137 in the reaction was validated by site-directed mutagenesis, in which the D137N variant was found to be completely inactive.
The rate-limiting step of the overall reaction is calculated to be the deprotonation of the CH group of the substrate by Asp137 in the second half-reaction, with a barrier of 19.5 kcal/ mol, in good agreement with the experimental k cat value.
The insights provided by the current calculations are valuable for the understanding of the required substrate pattern and for the rational design of PpATase to ensure the maintenance of the required catalytic machinery during protein engineering, not only in the first sphere of amino acids around the active site but also in the second sphere.

* S Supporting Information
The Supporting Information is available free of charge at https://pubs.acs.org/doi/10.1021/acscatal.9b04208.
Overlap of optimized structures with the X-ray structure; alternative binding mode of Int7; structures of intermediates and transition states of the second halfreaction; results concerning the alternative mechanism with the para-hydroxyl group being neutral throughout; experimental details; and relative energies and Cartesian coordinates of the intermediates and transition states of the proposed mechanism (PDF)

■ ACKNOWLEDGMENTS
The Swedish Research Council is acknowledged for financial support. A. Z.-D. acknowledges the Austrian Science Fund (FWF) Lise Meitner Fellowship grant M 2172-B21.