Deciphering the Unconventional Reduction of C=N Bonds by Old Yellow Enzymes Using QM/MM

The reduction of C=X (X = N, O) bonds is a cornerstone in both synthetic organic chemistry and biocatalysis. Conventional reduction mechanisms usually involve a hydride ion targeting the less electronegative carbon atom. In a departure from this paradigm, our investigation into Old Yellow Enzymes (OYEs) reveals a mechanism involving transfer of hydride to the formally more electronegative nitrogen atom within a C=N bond. Beyond their known ability to reduce electronically activated C=C double bonds, e.g., in α, β-unsaturated ketones, these enzymes have recently been shown to reduce α-oximo-β-ketoesters to the corresponding amines. It has been proposed that this transformation involves two successive reduction steps and proceeds via imine intermediates formed by the reductive dehydration of the oxime moieties. We employ advanced quantum mechanics/molecular mechanics (QM/MM) simulations, enriched by a two-tiered approach incorporating QM/MM (UB3LYP-6-31G*/OPLS2005) geometry optimization, QM/MM (B3LYP-6-31G*/amberff19sb) steered molecular dynamics simulations, and detailed natural-bond-orbital analyses to decipher the unconventional hydride transfer to nitrogen in both reduction steps and to delineate the role of active site residues as well as of substituents present in the substrates. Our computational results confirm the proposed mechanism and agree well with experimental mutagenesis and enzyme kinetics data. According to our model, the catalysis of OYE involves hydride transfer from the flavin cofactor to the nitrogen atom in oximoketoesters as well as iminoketoesters followed by protonation at the adjacent oxygen or carbon atoms by conserved tyrosine residues and active site water molecules. Two histidine residues play a key role in the polarization and activation of the C=N bond, and conformational changes of the substrate observed along the reaction coordinate underline the crucial importance of dynamic electron delocalization for efficient catalysis.

(c) Polar interactions between the substrate and the active site residues have been shown in green dashed lines.The active site residues are shown in a stick representation with grey-coloured carbons, while the lumiflavin (LuF) carbons are coloured yellow.The alignment was performed using the pair-fitting method available in Pymol, 1 where the corresponding pairs of heavy atoms were selected from the side chains of the active site residues and LuF.For clarity, the ribitylphosphate tail of the flavin cofactor and nonpolar hydrogen atoms of the enzyme are hidden.Table S2: Summary of key geometric parameters of the QM/MM optimised binding pose of imine 2. Data were obtained from 3 independent 5 ps QM/MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.

Detailed Methodology Protein Structure Preparation
Complex of XenA and Oxime 1 The crystal structure (PDB-ID: 8AU8) 2 of XenA in complex with ethyl-(Z)-2-(hydroxyimino)-3-oxopentanoate (1) was used as the starting point for the calculations.In the crystal, XenA is a homodimer in which residue W358 of chain B interacts with the active site of chain A and vice versa. 3Therefore, both chains were considered in this study.The protein structure was further prepared using the protein preparation wizard 4 in Maestro. 5ecifically, the two active site histidines, H178 and H181, were modelled as neutral and singly protonated at the ϵ-and δ-nitrogen atoms, respectively.Our previous study suggested that the reduced enzyme would favour the neutral form of 1; therefore, the hydroxylimino oxygen was modelled as protonated. 2 All crystal waters and ligands were removed except 1 and the flavin (FMN) cofactor.FMN was modelled in its negatively charged (N1 − ) reduced state.Finally, the complex structure was subjected to an energy minimisation step in which only the hydrogen atoms were allowed to move.

Complex of XenA and Imine 2
The starting structure for the complex of XenA with ethyl-2-imino-3-oxopentanoate 2 resulted from the modelling calculations of the reductive dehydration of 1 as described in Scheme 1a in the main manuscript.FMN was reverted to its reduced state (FMNH − ), and this modelled complex was then used for further computations as follows.
Additionally, we have performed docking of 2 with XenA using DiffDock, 6 where we have generated 10 docking poses using the default settings of the DiffDock algorithm.We found that the orientation of the substrate 2 in the docked poses is also very similar to the binding pose of substrate 1 in the active site of XenA.

Molecular Simulations Modeling Simulation System
After the preparatory steps listed above, the prepared flavoenzyme-substrate complex was used as input in the Amber20 7 program, where the complex was solvated in an octahedral box such that any enzyme atom should be at least 10 Å away from the box boundaries.The system was neutralised by adding the required number of N a + or Cl − ions.The molecular mechanics potential energy of the whole system was then derived using Amber ff19SB 8 for the protein, and its compatible version of the Amber general atom force-field 9 (GAFF2) for the ligands (flavin and substrate), the Joung and Cheatham parameters 10 for N a + , and the SPCe 11 water model for water molecules.

Minimisation and Equilibration Using Molecular Mechanics (MM)
The system was then subjected to 2000 steps of energy minimisation using the steepest descent algorithm, and if this did not converge, followed by 2000 steps of conjugate gradient optimisation.Later, the temperature of the whole system was linearly increased from 0K to 298K for 20 ps, followed by a 10 ps NVT run at 298K.The volume of the system was then allowed to equilibrate during a 100 ps NPT run.During all these MM minimisation and equilibration steps, we kept a position restraint on each solute-heavy atom using a force constant of 10 kcal/mol/ Å2 .Starting with a high-resolution crystal structure of the enzymesubstrate complex, our objective was to selectively optimise the positions of introduced hydrogen atoms and solvent molecules.Therefore, the chosen equilibration time was based on the recommended relaxation time specific to this context. 12Also, periodic boundary conditions (PBC) were employed with bonds containing hydrogen atoms held rigid with SHAKE. 13Temperature control was done with the Langevin thermostat, 14,15 and pressure control was achieved with the Berendsen barostat, 16 each with a relaxation time of 2.0 ps.
A cutoff radius of 8 Å was used for non-bonded interactions.[19] Minimisation Using Hybrid Method (QM/MM) The system was then divided into two parts to employ the hybrid QM/MM Hamiltonian. 20The atoms described in the QM3 region (Figure S25) were considered for the QM layer, whereas the remaining structure was treated by molecular mechanics.Subsequently, each system underwent further energy minimisation using the semi-empirical quantum mechanics/molecular mechanics method (SQM/MM), followed by the quantum mechanics/molecular mechanics (QM/MM) method without restraints.We have used the PM6-DH+ 21,22 functional for SQM and the B3LYP [23][24][25] DFT functional with the 6-31G* 26 basis set for the QM caclulations.The electrostatic embedding scheme described in 27 was used to compute the interactions of MM point charges with the QM system.Hydrogen link atoms 28 were used to treat the QM/MM bonding interfaces.As we were dealing with hydride and proton transfer reactions, the SHAKE 13 algorithm was turned off for the QM atoms, which is otherwise used to constrain bonds involving hydrogen atoms.

QM Region
Selection of Residues for QM Region One of the pertaining questions that arise when modelling enzymatic reactions using QM/MM is the composition of QM regions and which residues should be included.To address this particular question, we considered modelling different QM regions.We utilised the existing chemical information on XenA active site residues, such as the pair of histidine residues (H178 and H181) proposed to stabilise the delocalisation of charges on the substrate, Y183 as a potential proton donor, and Y27 forming H-bonds with the substrate.In addition, we incorporated active site water molecules in the QM region.Overall, we considered six different QM regions as depicted in detail in Figure S25 and summarised in Table S3.Starting from a model that consisted only of 1 and lumiflavin (LuF) (QM1), QM2 additionally has two histidines (H178 and H181).The two tyrosine residues, Y27 and Y183, were incorporated in QM3, whereas we added one and two close water molecules in QM4 and QM5, respectively.Finally, QM6 was similar to QM4 but included the complete flavin (FMNH − ) instead of only LuF.At the beginning of each simulation, the residue ID of the nearest water molecule was identified and then explicitly specified for the respective QM region throughout the simulation.
Table S3: Summary of QM regions used in the QM/MM computations.Each QM region bears a net charge of -1, except QM6, which has a net charge of -3.

Selection of QM Method
[38] Our previous experimental investigation estimated that the activation energy (∆E ‡ a ) for hydride transfer to substrate 1 is 15.4 kcal/mol. 2Considering that we have computed the ∆E ‡ a by performing single-point energy calculations on UB3LYP/6-31G*//OPLS-2005 optimised structures of (RS 1 H − ), (TS 1 H − ) using atoms mentioned in QM4 region.First, we tested the B3LYP with different Gaussian basis sets without considering the dispersion correction term, and then we incorporated the D3 39 dispersion correction term; the computed ∆E ‡ a is shown in Table S4.Adding the dispersion correction term to B3LYP has contributed around 1  kcal/mol more to the ∆E ‡ a for each unique combination of B3LYP and Gaussian basis sets.
However, adding more diffuse and/or polarization has lowered the ∆E ‡ a .We have also tested M06-2X density functional, which has been recommended for better accountability of noncovalent interactions across biological systems.

Convergence of Partial Charge on Substrate
In the past, the convergence of the total partial charge on the substrate has been evaluated to choose an optimum number of QM atoms. 35,42We have monitored this parameter for the substrate in the various QM regions and found that the difference between the total partial charge for the substrate converges as the number of atoms in the QM region increases (Fig-

ure S26
).Interestingly, neither the number of water molecules significantly impacted the substrate's total partial charge nor the ribityl-phosphate tail of the flavin cofactor.Therefore, we used QM4 for all QM/MM computations.3][44] However, we did not experience this gap-closing problem.All of our evaluated QM regions maintained a constant 2 eV HOMO-LUMO gap (Figure S26).For our subsequent investigations, we chose the B3LYP functional and a 6-31G* basis set for QM computation.Each of the above-described six QM regions underwent 5 independent 200 ps QM/MM simulations with a time step of 1 fs.The resulting 1000 frames of this 1 ps production run were used for both partial charge and HOMO-LUMO gap analyses.

QM/MM Production Run
For the production runs, we employed the Sander module of AMBER20 7 with Terachem 45,46 as an external QM package.As mentioned above, QM computations were done at the B3LYP/6-31G* level of theory, while the MM potential was derived from AMBER ff19SB. 8ectrostatic embedding was employed to treat interactions between QM and MM regions, and a hydrogen link atom approach was used.The SHAKE 13 algorithm was turned off for all QM4 atoms in all subsequent QM/MM simulations.Interaction cutoffs, and thermostat and barostat parameters remained unchanged from the values given in the section (QM/MM Minimisation).Although we had included side chains of four protein residues in the QM4 region, we still needed to account for the fact that a protein residue outside of the defined QM region can affect ligand polarisation.We were using an electrostatic embedding scheme, 27 in which MM atoms that fall within the defined non-bonded interaction cutoff were considered as point charges, to model the electrostatic interactions with QM atoms and the polarisation of the QM electron density.Considering this, we have chosen the non-bonded cutoff of 8 Å, which is an optimum choice based on the findings of Willow et al. 47 Finally, using the QM/MM energy minimised structure as input, we run a continuous 5000 steps QM/MM MD run with a time-step of 1 fs, yielding a total simulation time of 5 ps.Snapshots of the trajectory were saved at each step.

Reaction Modeling Computation of Reaction Coordinates
The whole 5 ps trajectory from the last step was loaded into VMD, 48 and the centre of the PBC box was re-aligned to the centre of mass of the protein.Solvent molecules not within 5 Å of any protein atom were removed, and the remaining atomic coordinates were saved as a PDB file.The generated PDB file was then opened in Maestero, 5 and input for Q-Site [49][50][51] (the QM/MM module of the Schrödinger package) was prepared by specifying the QM atoms (atoms in the QM4 region), QM charge (-1), QM method and basis set (UB3LYP/6-31G*), and the MM potential (OPLS-2005 52 ).Additionally, MM atoms more than 5 Å away from the QM region were fixed.Hydrogen atoms were used to cap covalent bonds partitioned at the QM/MM boundaries.Reactions were modelled by running a distance scan at a resolution of 0.1 Å against appropriately selected reaction coordinates, as depicted in Figure S27.From there, transition states were modelled using the quadratic synchronous transit (QST) method and further confirmed by the presence of a negative frequency corresponding to the selected reaction coordinates.

QM/MM Steered Molecular Dynamics
In addition to QM/MM geometry optimisation, we employed QM/MM Steered Molecular Dynamics (SMD) simulations to achieve a more dynamic and comprehensive understanding of the underlying reaction pathways.QM/MM SMD simulations facilitated a more rigorous sampling of structural changes leading to a specific reaction coordinate, providing valuable information about the energy barriers and intermediates involved in the reaction.Using the reaction coordinates (CV) defined in Figure S28 for hydride and proton transfer, we performed 100 independent 1 ps SMD simulations for each CV.Out of the 100 independent QM/MM SMD runs, 5 runs were coupled with the Natural Bond Orbital (NBO) 53 program.
Incorporating the NBO program with the QM/MM SMD simulations contributed significantly to a more detailed and precise analysis of the electronic changes during the chemical transformations.The velocity of pulling a hydrogen atom (a hydride or a proton, depending on the respective reaction coordinate) was 1 Å/ps, consistent with the best practices for QM/MM simulations of biological systems. 35All simulation parameters remained the same as mentioned in the section (QM/MM Production Run).Finally, free energy changes along the defined CV were calculated by the fluctuation-dissipation (FD) estimator (see equation 1). 54,55Mean square errors (MSE) as suggested by Gore et al. were computed according to equation 2 in order to assess the quality of the free energy estimator. 56 In the above equations, γ denotes the reaction coordinate, and N is the total number of

Tutorial
In addition, we have a publicly available detailed tutorial explaining the whole methodology discussed here on GitHub (https://github.com/hopanoid/Enzyme-Reaction-Dynamics-Tutorial).

Figure S1 :
Figure S1: Active site of XenA in complex with 1.(a) Crystal structure of the complex (PDB entry: 8AU8) with the key atoms of 1 labelled in green.(b) Structural alignment of the binding poses of 1 as observed in the crystal structure (stick representation with green carbons) and obtained by QM /MM (UB3LYP-6-31G*/OPLS2005) optimisation (yellow carbons).(c)Polar interactions between the substrate and the active site residues have been shown in green dashed lines.The active site residues are shown in a stick representation with grey-coloured carbons, while the lumiflavin (LuF) carbons are coloured yellow.The alignment was performed using the pair-fitting method available in Pymol,1 where the corresponding pairs of heavy atoms were selected from the side chains of the active site residues and LuF.For clarity, the ribitylphosphate tail of the flavin cofactor and nonpolar hydrogen atoms of the enzyme are hidden.

Figure S2 :
Figure S2: 2D potential energy surface for the conversion of 1 to 2 by XenA from Pseudomonas putida (calculated at the UB3LYP-6-31G*/OPLS2005 level).We performed a scan along the reaction coordinate of the hydride transfer (S 1 H − ).Then, each optimised structure was used as a starting point for the proton transfer reaction coordinate (S 1 H + ) with the hydride transfer coordinate fixed.The reaction coordinates were varied in 0.1 Å steps.The details of S 1 H − and S 1 H + are shown in Supplementary Figure S27a.The figure was drawn with the imshow function of Matplotlib using the Gaussian interpolation method.

Figure S3 :Figure S5 :
Figure S3: Dynamics of geometric changes across the reaction coordinates during the hydride transfer step for 1.(a) Intermolecular distances between 1 and XenA, (b) intramolecular distances within 1, and (c) selected angles and dihedrals between 1 and XenA.The plots were generated using data from 100 independent 1 ps QM /MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.The standard deviations are shown as semi-transparent bands.

2 [ 2 [
Figure S6: Role of H178 and H181 in stabilising the accumulated electron density on 1 during the hydride transfer reaction.(a) A snapshot from a QM/MM (B3LYP-6-31G*/amber19ffsb) SMD trajectory showing the proximity of H181 to the lone pair of O3.(b) The delocalisation energies E(2) are plotted together with the standard deviations shown as semi-transparent bands.

Figure S7 :Figure S8 :
Figure S7: NBO analysis across the reaction coordinates during the hydride transfer step for 1. Plot of the delocalisation energy (E 2 ) between the donor LP(O1) and the acceptor σ * (Hw-Ow) NBOs.The standard deviations are shown as semi-transparent bands, while the two insets show the extent to which the LP(O1) donor NBO is perturbed due to delocalisation.
Figure S9: QM/MM (UB3LYP-6-31G*/OPLS2005) geometry optimised structures of the active site of XenA in the (RS 1 H + ), (TS 1 H + ) and (PS 1 H + ) configurations with 1.The configurations shown are the optimised molecular structures corresponding to stationary points along the lowest energy reaction coordinate for the direct proton transfer from Y27.

Figure S10 : 2 [
Figure S10: Dynamics of geometric changes across the reaction coordinates during the proton transfer step for 1.(a) Intramolecular distances within 1, and (b) selected angles and dihedrals between 1 and XenA.The plots were generated using data from 100 independent 1 ps QM/MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.The standard deviations are shown as semi-transparent bands.

24 -Figure S12 :
Figure S12: Dynamics of atom-centred charges across the reaction coordinates during the proton transfer step for 1.The plots were generated using data obtained with the NBO program from 5 independent 1 ps QM/MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.The standard deviations are shown as semi-transparent bands.

Figure S13 :
Figure S13: Active site of XenA in complex with 2. (a) Structural alignment of the docking pose of 2 (stick representation with green carbons) and obtained by QM /MM (UB3LYP-6-31G*/OPLS2005) optimisation (yellow carbons).(b) Polar interactions between the substrate and the active site residues have been shown in green dashed lines.The active site residues are shown in a stick representation with grey-coloured carbons, while the lumiflavin (LuF) carbons are coloured yellow.The alignment was performed using the pair-fitting method available in Pymol,1 where the corresponding pairs of heavy atoms were selected from the side chains of the active site residues and LuF.For clarity, the ribitylphosphate tail of the flavin cofactor and nonpolar hydrogen atoms of the enzyme are hidden.

Figure S14 :
Figure S14: 2D potential energy surface for the conversion of 2 to 3 by XenA from Pseudomonas putida (calculated at the UB3LYP-6-31G*/OPLS2005 level).We performed a scan along the hydride transfer reaction coordinate (S 2 H − ).Then, each optimised structure was used as a starting point for the proton transfer reaction coordinate (S 2 H + ) with the hydride transfer coordinate fixed.The reaction coordinates were varied in 0.1 Å steps.The details of S 2 H − and S 2 H + are mentioned in supplementary figure S27b.The figure was drawn with the imshow function of Matplotlib using the Gaussian interpolation method.

Figure S15 :
Figure S15: Dynamics of geometric changes across the reaction coordinates during the hydride transfer step for 2. (a) Intermolecular distances between 2 and XenA, (b) Intramolecular distances within 2, and (c) selected and dihedrals between 2 and XenA.The plots were generated using data obtained from 100 independent 1 ps QM/MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.The standard deviations are shown as semitransparent bands.

2 [ 14 -Figure S18 :
Figure S16: Potential of mean force (PMF) across the reaction coordinates during the hydride transfer step for 2. The data were derived from the 100 independent 1 ps QM /MM SMD simulations.The mean square errors are shown as vertical bars.

2 [
Figure S19: NBO analysis across the reaction coordinates during the proton transfer step for 2. Plot of the delocalization energy (E 2 ) between the donor LP(C1) and acceptor σ * (Hw-Ow) NBOs.The standard deviations are shown as semi-transparent bands, while the inset image shows the extent to which the LP(C1) donor NBO is perturbed due to delocalization.
Figure S21: QM/MM (UB3LYP-6-31G*/OPLS2005) geometry optimised structures of the active site of XenA in the (RS 2 H + ),(TS 2 H + ), and (PS 2 H + ) configurations with 2. The configurations shown are the optimised molecular structures corresponding to stationary points along the lowest energy reaction coordinate for the direct proton transfer from Y183.

Figure S22 : 2 [Figure S24 :
Figure S22: Dynamics of geometric changes across the reaction coordinates during the proton transfer step for 2. (a) Intra-molecular distances within 2, and (b) selected angles and dihedrals between 2 and XenA.The plots were generated using data from 100 independent 1 ps QM/MM (B3LYP-6-31G*/amber19ffsb) SMD simulations.The standard deviations are shown as semi-transparent bands.

Figure S25 :
Figure S25: Atoms in the different QM regions: (a) QM1: lumiflavin (LuF) and 1, (b) QM2: QM1 plus sidechains of H178 and H181, (c) QM3: QM2 plus sidechains of Y27 and Y183, (d) QM4: QM3 plus one closest water molecule, (e) QM5: QM4 plus the second closest water molecule, (f) QM6: the same as QM4 but with a complete flavin (FMNH − ) instead of LuF.The carbon atoms in LuF/FMNH − are shown in yellow, those of the substrate in grey, and those belonging to protein residues in green.Oxygen and nitrogen atoms are coloured red and blue, respectively.Polar hydrogen atoms are shown in white.Water molecules are shown in a ball-and-stick representations, while the remaining atoms are shown in stick representations.23

Figure S26 :
Figure S26: Violin plots showing (a) the sums of partial charges on 1 and (b) the HOMO-LUMO gaps when employing different QM regions.Charges and band gaps were computed using data obtained from 5 independent 200 ps QM/MM simulations.

Figure S27 :
Figure S27: Definition of reaction coordinates (collective variables, CVs) used in QM/MM scans for reaction modelling.(a) For 1, S1 H− is the hydride transfer reaction CV, and S2 H+ refers to a direct proton transfer from water.(b) For 2, S2 H− is the hydride transfer reaction CV, and S2 H+ to direct proton transfer from Y183.The depicted atoms were included in the QM/MM calculations.Dotted lines represent the reaction coordinates, and wavy lines depict the QM/MM boundary.

Figure S28 :
Figure S28: Definition of reaction coordinates (collective variables, CVs) used in the QM/MM SMD simulation for (a) 1 and (b) 2. Both CVs are linear combinations of distances (LCOD).The values of the hydride (H − CV ) and proton transfer (H + CV ) CVs are (d 1 -d 2 ) and (d 3 -d 4 ), respectively.The depicted atoms were included in the QM/MM calculations.Dotted lines represent reaction coordinates, and wavy lines depict the QM/MM boundary.

Table S4 :
38,40Indeed, the computed ∆E ‡ Benchmarking of QM methods to be used in the QM/MM computations.