Bottom-Up Nonempirical Approach To Reducing Search Space in Enzyme Design Guided by Catalytic Fields

Currently developed protocols of theozyme design still lead to biocatalysts with much lower catalytic activity than enzymes existing in nature, and, so far, the only avenue of improvement was the in vitro laboratory-directed evolution (LDE) experiments. In this paper, we propose a different strategy based on “reversed” methodology of mutation prediction. Instead of common “top-down” approach, requiring numerous assumptions and vast computational effort, we argue for a “bottom-up” approach that is based on the catalytic fields derived directly from transition state and reactant complex wave functions. This enables direct one-step determination of the general quantitative angular characteristics of optimal catalytic site and simultaneously encompasses both the transition-state stabilization (TSS) and ground-state destabilization (GSD) effects. We further extend the static catalytic field approach by introducing a library of atomic multipoles for amino acid side-chain rotamers, which, together with the catalytic field, allow one to determine the optimal side-chain orientations of charged amino acids constituting the elusive structure of a preorganized catalytic environment. Obtained qualitative agreement with experimental LDE data for Kemp eliminase KE07 mutants validates the proposed procedure, yielding, in addition, a detailed insight into possible dynamic and epistatic effects.


■ INTRODUCTION
In recent years, intense research has been conducted in pursuit of harnessing the versatile protein capabilites for application in industrial biocatalysis.
The major challenge arises from the fact that living organisms produce enzymes tailored for sustaining their own existence, rather than the needs of human society. Therefore, the key to the future of rational biocatalyst engineering lays in the redesigning of existing enzymes and, ultimately, de novo design with the aid of new computational methods. 1−4 Major breakthroughs in the theoretical design of enzymes for new, non-natural reactions (so-called "theozymes") have occurred in the past two decades. 5−12 However, after initial success, the progress stalled, 13,14 because theozymes that were designed to maximize transition-state stabilization (TSS) displayed rather low catalytic activity and conventional topdown models requiring numerous assumptions were not able to fully explain the role of additional mutations in the second coordination sphere, introduced by LDE experiments. 13,14 Therefore, such LDE-mutated theozymes represent an excellent proving ground for testing new methodologies and interpreting the deficiencies of older methods. This problem was investigated by several research groups with various methods, including extensive molecular dynamics, 15,16 QM/MM, 17−19 and EVB 20−23 simulations. All previously mentioned methods belonging to the top-down category require consideriation of the entire protein and numerous assumptions regarding protonation states, QM/MM boundaries, always arbitrary selection of force field, etc. Although commonly used methods are able to explain some of the differences between designed and evolved enzymes, they are rather unsuitable for efficient scan over vast combinatorial space. Therefore, a method simple enough for high-throughput prescreening and yet deeply rooted in the physics of the problem without involving any empirical factors is needed to improve the existing protocols.
The catalytic field technique that has been used since we developed it in our laboratory in 1981 24−26 has been recently supplemented by Dittner and Hartke with extensive optimization techniques, allowing one to generate and explore various characteristics of the optimal catalytic environment. 27,28 However, extensive brute force optimization of all variables involved would still make theoretical design of theozyme composed of several hundreds of amino acids extremely costly and impractical. We believe that systematic nonempirical analysis of the physical nature of interactions involved in the catalysis and development of additional techniques specific for enzymes presented in this contribution could lead to rational construction of simpler models useful in theozyme design. Other rare inverse catalyst design attempts that have been recently summarized in a review 29 are mostly based on artificial intelligence or data mining. Although such methods could be very useful in the automated scanning of vast chemical spaces, it would be surprising to obtain a deeper understanding of physical phenomena responsible for extraordinary catalytic activity of enzymes this way, while many essential structural details are missing in currently available databases.
Recent experimental results that employ Stark effects for a specific band of infrared (IR) spectra for mutated enzymes indicate the electrostatic nature of KSI catalytic activity. 30 This permits one to monitor electric fields exerted by specific mutations on certain reactant bonds in top-down fashion, yielding valuable but still limited information. 16,31 This perspective is reversed here in bottom-up fashion by the use of a catalytic field derived from the reactant and transition state wave functions, yielding general characteristics of charge distribution of the optimal catalytic environment for every point in space outside reagents and enabling one to monitor lowering of activation barrier directly, as a result of amino acid mutations, side-chain rotations, rate-promoting vibrations, or proton transfers in hydrogen bond chains.
Although the catalytic field technique has been derived in our laboratory 24−26 using the theory of intermolecular interactions, it could be also applied to systems involving intramolecular interactions. The ability to predict catalytic effects resulting from hydrogen to fluorine (H → F) substitutions has been illustrated for two simple model organic reactions, 32 which demonstrates qualitative applicability of this technique to model substrateassisted catalysis. 33 The present paper constitutes the first application of a catalytic field approach to real case intermolecular catalysis. For this purpose, here, we have chosen the artificial Kemp eliminase KE07 and its mutants. 9 Among other Kemp eliminases based on different scaffolds, this system has the largest number of mutants with gradually increasing activities. Several KE07 mutants introduced by LDE experiments, which apparently were missed in earlier theozyme designs, 13,14 constitute a suitable benchmark to examine new biocatalyst design methodologies.
The mutants obtained from LDE experiments differ mainly by the number of charged amino acids located in the second coordination sphere, 13,14 which may change conformation of its side chain upon docking with the substrate or nearby mutation. 34,35 The importance of considering amino acid sidechain rotamers in biocatalysis has been recently demonstrated for Kemp eliminase 36 and triosephosphate isomerase mutants. 37 A more general catalytic field technique, 25,26 considering simultaneously both TSS 8,9 and ground-state destabilization, 23 combined with the atomic multipole rotamer library scan, allows one to obtain valuable insight into active site preorganization.
This contribution also constitutes the first attempt to test and validate the catalytic field technique and apply it to explore possible mutation nonadditivity, as well as dynamic and epistatic effects in biocatalysis resulting from side-chain rotations.

■ METHODS
Here, we present a bottom-up strategy based on a concept of catalytic fields derived from the DTSS approach. 25,26 Within this ab initio-based framework, activation barrier lowering is partitioned into intrinsic (vacuum in the original paper 25 ) contribution of reactants and the difference of enzyme-transition state(TS) and enzyme−substrate(S) interaction energies E DTSS .
Whenever the DTSS approach is applied to enzyme reactions, the term "substrate" refers to the ground state (alternatively, the reactant complex or Michaelis complex). In principle, the DTSS approach is more general and could be applied to other catalytic systems, such as zeolites. 38 Technically speaking, the "intrinsic" contribution is a barrier calculated for reaction path coordinates as they are in catalysts, but without inclusion of molecular environment in quantum mechanics (QM) calculation. Despite the virtual character of this reference state and the DTSS energy for a single catalyst, it can be used as the reference in quantitative comparison of a sequence of mutants of the same enzyme. Provided that mutation does not change the reaction mechanism, taking the difference between mutant and wild-type (WT) DTSS energies cancels out the "intrinsic" contribution, leaving a value corresponding to the difference of reaction barriers of mutated and WT enzyme (see Figure 1): Furthermore, DTSS could be partitioned into the electrostatic multipole ΔE EL,MTP , electrostatic penetration ΔE EL,PEN exchange ΔE EX , delocalization (induction) ΔE DEL , and correlation (dispersion) ΔE CORR terms defined within symmetry-adapted perturbation or hybrid variation-perturbation theories of intermolecular interactions. 39,40 This provides not only insight into the physical nature of catalytic interactions induced upon mutation, but it also allows one to derive, in a systematic way, more approximate methods based on leading terms. Recent experimental results 30 and ab initio analysis of several enzyme systems 26,41,42 indicate clearly the dominant contribution of electrostatic term ΔE EL and, in particular, its long-range multipolar electrostatic component ΔE EL,MTP accurately reflecting, at an atomic level, the molecular charge redistribution during the progress of the reaction. 43 The electrostatic multipolar DTSS term, 26 constituting a more-general approach, represents TSS 44 and, at the same time, ground-state destabilization (GSD) 23 effects, postulated earlier as the   46 in agreement with recent experimental data, 30 confirming the electrostatic nature of catalytic activity.

■ CATALYTIC FIELDS
Further consideration of electrostatic component of DTSS energy leads to the static catalytic field Δ S (r⃗ i ), which is defined 25,26 as the difference between electrostatic potentials V(r⃗ i ) of TS and substrate S (see eq 1). Its value denotes activation barrier change resulting from the presence of unit point charge in any location of catalytic site outside reactants. Because of the additive nature of electrostatic interactions, catalytic field values obtained in one computational step constitute the solution of inverse catalysis problem in the form of Δ S (r⃗ i ), i.e., complementary charge distribution of optimal molecular environment q i (r⃗ i ). 25,26 Therefore, a catalytic field may serve as a general guide for rational selection of mutations or side chain orientations or ratepromoting vibrations by providing clues about optimal static and dynamic charge distribution of catalytic environment reducing the activation barrier. The values of Δ S (r⃗ i ) could be rapidly calculated from CAMM derived directly from transition-state and substrate quantum-chemical wave functions. 46 Therefore, calculations involving catalytic field scale as O(N R · N A · N rot ) only, where N R and N A indicate the number of atoms in reactants and considered mutated amino acid residue, respectively. N rot denotes the average number of rotamers per mutation site (typically a few dozens). In the case when charged amino acid is represented by single unit charge located at the charge gravity center (for example, located near the terminal side chain atom of lysine or arginine), scaling could be further reduced to O(N R · N rot ), which must be calculated only once for the mutation site. The validity of such approximation is illustrated in Figure 2.
This way, the search space in biocatalyst design is considerably reduced within purely nonempirical approach. The use of atomic multipoles is essential to represent the anisotropy of molecular charge distribution properly at the atomic level, in particular for transition states, where bonds are formed or broken. 43 In addition, always arbitrary atomic charge definitions are compensated by the use of higher-order atomic multipoles, considerably reducing the basis set dependency of atomic charges. 46 Note that Δ S gradient field defined as dynamic catalytic field Δ D , indicates direction of the favorable unit probe charge q i (r⃗ i ) movement coupled with the reaction coordinate and resulting in additional decrease of the activation barrier. 47 Such coupling may involve proton dislocations in hydrogen-bonded networks. Recent application of the dynamic electrostatic catalytic field (DECF) approach allowed us to determine that each of two steps of reaction proceeding in ketosteroid isomerase (KSI) is catalyzed by concerted proton dislocations proceeding in reverse directions in the network of six hydrogen bonds. 48 Therefore, observed exceptional KSI catalytic activity could be related to ultrafast proton switching. 48

■ EXPLORATION OF ROTAMER SPACE
In this study, we will concentrate on some unexplained mutations introduced by LDE into artificial Kemp eliminase, which are not in direct contact with the substrate. 13,14 According to the aformentioned assumption about second-coordination sphere mutations (that is, their negligible effect on the reaction mechanism), the following calculations base on transition state obtained for KE07 enzyme with Gaussian09 49 ONIOM implementation 50 (details are provided in section S2 of the Supporting Information). The QM region of that simulation, consisting of 5-nitrobenzoixazole (5NBI) and acetate representing part of Glu101 residue, was superimposed on 5NBI-enzyme complexes of the mutants.
The space of possible amino acid conformations was explored in two ways. In the first, conventional molecular dynamics (MD) has been used with solvent effects modeled within the Molecular Mechanics Poisson−Boltzmann Surface Area (MMPBSA) model, 51 which is regarded as a reliable free-energy simulation method to investigate protein−ligand interactions. This provides a baseline representing a commonly used approach to the problem. In the second one, the library of atomic multipoles for amino acid side chain rotamers has been scanned using catalytic fields, providing a proof of concept of a proposed approach (note that, with proper implementation, this involves much less computational time than MD simulation and avoids one having to always use arbitrary force fields).

■ MMPBSA APPROACH
First, 30-ns MD trajectories were prepared for each of eight mutants, followed by MMPBSA calculation with the usage of the gMMPBSA program 51 MD simulations were performed in Gromacs 4.6. 52 For the sake of consistency with ONIOM calculations, Amber94 force field with TIP3 water model was used. Mutant structures were generated from KE07 by a homemade script, according to published data 9 (see Table S1 in the Supporting Information). All structures were submerged in water box at least 1 nm from the protein atoms, with Na and Cl ions for charge neutralization. In all simulations, periodic Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article boundary conditions and Particle Mesh Ewald long-range electrostatics (1 nm cutoff) were used. For MD simulations, a leapfrog algorithm was used, and the temperature and pressure were set to 300 K and 1 bar, respectively. After initial steepest descent minimization, NVT and NPT equilibration stages were performed, each one 200 ps long. In this phase, a modified Berendsen thermostat and a Parinello-Rahman isotropic barostat were chosen. In subsequent 30 ns production run, distance restraints between Glu101 OE2 and ligand C3 and H3 atoms were used, in order to prevent random dissociation of the complex. Then, trajectories ware used in MMPBSA calculations done with gMMPBSA program. 51 MD simulations performed for KE07 and its mutants concerned only the structure of the enzyme−substrate complex. It was assumed that the reaction event, when it occurs, is much faster than conformational changes of the surrounding amino acid residues. Therefore, these simulations were supposed to provide an ensemble of protein conformations rather than to simulate the reaction itself. Furthermore, the application of catalytic field Δ S could be tested in this way. Taking advantage of the minor changes in geometry between reactant and TS (see Figure S3b in the Supporting Information) and keeping the assumption mentioned above, the same trajectory was assumed for TS. Different energies were obtained by substitution of charges assigned to the "QM region" (as defined in ONIOM calculations) in the topology files. Since g_mmpbsa evaluates electrostatic and implicit solvation contributions to binding energy (we included only electrostatic components), DTSS energy with implicit solvation was obtained by putting differential charges (the difference between each atomic charge in TS and the reactant), which effectively are an atomic monopole representation of Δ S . Atomic charges for RS and TS were calculated using the Merz−Kollman scheme 53 in the same basis as that used in ONIOM calculation. The contribution of QM region to solvation energy was subtracted from the result, since this method of calculation provides incorrect value of this term; on the other hand, it may be assumed to be negligible or at least constant among mutants (for a more-detailed derivation, see Section S3 of the Supporting Information).

■ SCANNING THE CAMM LIBRARY OF ROTAMERS
As the major approach in this study, we used the Dynameonics Rotamer Database (backbone-independent, updated in 2015) 54 to prepare a library of CAMM for each amino acid rotamer.
All CAMMs were calculated with GAMESS (US) 55 with HF method and 6-311G(3d,2p) basis set. CAMMs for the QM layer in both RS and TS states were obtained separately. The CAMM expansion was then used in the calculation of catalytic fields and interaction energies. For each mutation, rotamers with the best and worst TSS/DTSS contribution were determined (hereafter, we will refer to the rotamer with the best DTSS contribution as optimal). Finally, we explored the configuration space resulting from combining rotamers at different sites (mathematically speaking, a Cartesian product of individual site rotamer spaces in a given mutant).
In order to test the importance of solvation effects on electrostatic interaction energies, we examine the version of library incorporating Generalized Born (GB) model, where the solvation energy is modeled according to eq 3. The Born radii were calculated using bornRadius software developed by Onufriev's group, utilizing an improved algorithm described in refs 56 and 57. The atomic charges for amino acid residues were taken from Amber94 force field, whereas charges on the QM region were calculated with the Merz−Kollman method for the HF/6-311G(3d,2p) wavefunction.
■ ALGORITHMS FOR THE EXPLORATION OF ROTAMER CONFIGURATIONAL SPACE Since the number of rotamer combinations grows exponentialy with the number of amino acids, it is important to restrict the size search space, especially in the early steps of calculation. In particular, elimination of rotamers that are unlikely to participate in final combination is of great importance. In our experiments, we found that the average number of rotamers per site is in the range of 60−80. Thus, elimination of highly improbable rotamer (e.g., having short contact with the protein backbone) reduces the size of search space by 70(N − 1) configurations (with N being the number of included residues). Here, two criteria were applied: (i) geometric and (ii) energetic. The first one basically involves exclusion of rotamers which side chains have close contanct (≤1 Å) with protein backbone (neglecting, obviously, contacts corresponding to chemical bonds). The second one is the well-known Goldstein criterion for singles elimination. 58 Here, subscripts k,l ∈ [1,N] denote the position in the amino acid sequence, whereas superscripts "A" and "B" indicate the rotamer at a particular site. However, the value ϵ in the original work was equal to zero; by choosing a positive nonzero value, one can "soften" the criterion and thus obtain a larger final space. Such an approach may be useful when one is interested in a set of low-energy conformers, which may be particularly desirable when considering dynamic effects. The search algorithm generally can be described by the following steps.
(1) Loading of the amino acid rotamers into specified positions in a protein structure. At this stage, criterion (i) is applied. (2) Calculation of an interaction energy table.
(3) Elimination of the rotamers according to different criteria (e.g., criterion (ii)). (4) Systematic search over resulting space. Throughout the presented work, we refer to the method without additional elimination criteria in point 3 as "multiscan". The second algorithm tested, denoted as Dead End Elimination (DEE), involves criterion (ii) with negligence of the rotamer self-energy (E k (r k A )) and ϵ set to either 0.0 or 0.005 hartree. The performance of both approaches was tested on P 6 mutant represented by five charged residues: Asp7, Glu19, Lys146, Arg202, and Asp224. A potential bottleneck may be the calculation of CAMM interaction energies (which can be several times slower than that observed for the point-charge model) or loading CAMM from library (since the rotation of large multipole arrays may be computationally expensive). Therefore, we compared the performance of the RESP model Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article (Table S1 in the Supporting Information) and CAMM (Table  S2 in the Supporting Information). We ensured that both approaches lead to the same rotamer configuration. Two major observations can be drawn from the presented results.
• Elimination of rotamers according to Goldstein criterion reduces the search space by 2−4 orders of magnitude, depending on the assumed threshold. This, in turn, leads to a similar reduction of time spent on scanning the rotamer space. • Application of CAMM mutipoles leads to an ∼2-fold increase of loading time (due to the required transformation of multipole arrays) and an ∼6-fold increase in the time spent on the computation of interaction energy table. In fact, this time is comparable to that required to scan the entire (unreduced) rotamer space.
However, for larger sets of amino acid residues, the problems with application of CAMM expansion can be tackled by initial reduction of search space, based on atomic charges, maybe with a nonzero epsilon. This should allow one to avoid situations when inclusion of the charge anisotropy leads to different configurations.
Furthermore, there are at least two possibilities, with regard to further improvement of the presented algorithm: • The elimination of unlikely pairs from the search space (analogous to the Goldstein criterion for singles  The Catalytic Field and Optimal Locations of Crucial Amino Acids. The three charged residuesAsp7, Lys146, and Arg202, conserved in directed evolution 9 (see Table S1)seem to play the most important role. What's even more remarkable, these findings are consistent with the catalytic field corresponding to this reaction. As can be seen in Figure 3, charged amino acids conserved in directed evolution reside exactly where they should be located according to catalytic field Δ S , and the range of their possible rotamers fits the corresponding "basins" well. The position of Asp7 on the edge between the two areas of Δ S Figure 3. Catalytic field in Kemp eliminase. (a) 3D view of surface colored with the value of Δ S (red indicates that positively charged catalyst residues will be beneficial, blue indicates that negatively charged residues will be beneficial); (b) Kohonen map 60 of the catalytic field (values given in kcal/(mol e)) (positions of amino acids at a distance of 7 Å above the van der Waals surface are marked with circles (optimal DTSS) and squares (the worst DTSS); residues predicted to provide further improvement are marked with stars (optimum DTSS) and diamonds (the worst DTSS); the arrows indicate the transition from the position with the worst DTSS to that with the best one.
Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article provides a straightforward exploration of both its low contribution and range varying from positive to negative (the best and worst DTSS from the rotamer scan are −1.1 kcal/mol and 0.0 kcal/mol, respectively). Figure 3a presents catalytic field Δ S (r⃗ i ) for the Kemp eliminase, calculated according to eq 2 on the van der Waals surface of reactants, as well as optimal rotamers of crucial charged residues.
Side-Chain Movements and the Ranges of DTSS Contributions. Charged amino acid side chains, especially of lysines and arginines, belong to most flexible upon ligand binding, 34,35 so it is necessary to consider entire range of their dislocations in more detail. Results obtained using library of atomic multipoles for amino acid side chains indicate that activation energy changes may vary between 1 kcal/mol and 4 kcal/mol for charged residues, in contrast to neutral amino acids (0−0.5 kcal/mol). The possible movements of charged amino acid terminal atoms representing extreme values of catalytic activity measured by DTSS are illustrated on a 2D Kohonen map of the Kemp eliminase catalytic field in Figure 3b.
In order to make a 2D representation of catalytic field, we used a self-organizing map (SOM) based on the Kohonen neural network. 60 First, a surface at a distance of 7 Å from the vdW surface of TS was generated with the point density set to 25 points/Å 2 . It was then used to train a neural network consisting of 3600 neurons arranged into a square grid with a toroidal topology. This special topology means that the grid (and thus the resulting map) does not have an edge: the "rightmost" point is topologically adjacent to the "leftmost" point, while the "topmost" point neighbors the "bottom-most" point. Another consequence of this fact is that replicas of such map may be placed next to each other, similar to tiles, showing the topological surroundings of each point. After training, the map was colored according to the value of catalytic field Δ S , and terminal atoms of amino acids (chosen in the same manner as presented on Figure 2) were projected onto resulting SOM.
These side chain conformation changes could be regarded as the important part of a preorganized catalytic environment. Relatively large activation barrier changes resulting from amino acid side chain movements reinforce the conclusions from previous experimental and computational studies, indicating the importance of the precise positioning of catalytic residues. 61 Multiscan Results Including Solvent EffectsComparison with Experiment. Enzyme catalytic activity could be expressed in terms of TS and S interaction energies with the enzyme treated as a rigid body. However, the docking substrate may induce significant changes in side-chain conformation of charged amino acids, 34,35 which may strongly interact between themselves as well as with TS and S. Whereas the interaction energy for typical neutral hydrogen-bonded dimer is 3−5 kcal/ mol, it can be 1 or 2 orders of magnitude higher, if at least one of the monomers bear nonzero electric charge (specifically, the absolute value of interaction energy may reach 20 kcal/mol in the case of neutral molecules bonded to a charged one or even 100 kcal/mol when both partners are charged). Therefore, the interactions involving charged amino acids are expected to be dominant and a model relying solely on this contribution seems to be a reasonable first approximation. In the case of Kemp eliminase, six positions varied in LDE experiments involved charged residues: 7, 19, 123, 146, 202, and 224 as shown in Tables 1 and 2. The change of the charged state of these residues located beyond the first coordination sphere and directly involved in reactant binding has the greatest chance to modulate a catalytic field without affecting the reactant binding pose. In addition, they may strongly interact with each other and substrate binding may affect the conformation of their charged side chains. Therefore, multiscan procedure seems to a reasonable way to model these effects, which could not be determined by currently available conventional experimental or theoretical methods.
For all mutants, residues at these positions have been taken into account in a search for the lowest-energy rotamer configuration using a CAMM rotamer library (http://camm. pwr.edu.pl/CAMM).
In order to take into account solvent presence, the GB model has been used with selection of effective Born radii computed with the bornRadius program developed by Onufriev and coworkers. 56,62 With this energy function (denoted henceforth as GB), a search over rotamer space was performed in exactly the same way as the CAMM model. Interactions between amino acid residues were included.
Finally, we compare those results to MMPBSA 51 calculations based on 30-ns MD simulations performed for each mutant separately (more details are given in the section devoted to the MMPBSA approach). Experimental activation free energies for Kemp eliminase mutants are compared in Figure 4 with all three aforementioned models. Remarkably, both models involving a scan over rotamer spacethat is, DTSS(CAMM) and DTSS-(GB)outperform the DTSS(MMPBSA) approach, in terms of the number of correctly ordered pairs N pred (See Tables 1 and  2). N pred of a model 63 is a ratio of correctly predicted pairs to all pairs in a test set. A "correctly predicted" or concordant pair is a pair of mutants P,Q for which E DTSS P < E DTSS Q and k cat P > k cat Q or E DTSS P > E DTSS Q and k cat P < k cat Q ; otherwise, such a pair is counted as nonconcordant. Given the total number of pairs N tot in a test set consisting of N enzymes (eq 5), the value of N pred (usually  expressed as a percentage) is given by eq 6. It is related to Kendall's tau τ K by the relation described in eq 7).  23 This was probably one of the reasons of limited success of initial theozyme design. 13,14 Because of the approximations involved, the correspondence is not linear; however, this does not disqualify the overall approach, which is intended to determine the ranking or narrow the large set of possible mutations, so it would be manageable for more sophisticated methods. Thus, it seems to be sufficient to indicate a general qualitative trend, favoring the DTSS model over the TSS model.

■ NONADDITIVE EFFECTS OF MUTATIONS
Activation barrier changes (DTSS) originating from specific residue could sometimes vary as much as 1.25 kcal/mol, indicating possible side-chain conformation changes, resulting from interactions between other residues (see Table S1 in the Supporting Information). Such variations responsible for nonadditivity of mutations could be examined in detail using the multiscan technique presented here. Let us discuss in detail ASP7 catalytic activity in P3, P4, P5, P6, and P7 mutants. Figure  5a−e presents schematic side-chain orientations obtained using a multidimensional scaling technique, preserving Euclidean distances in the space of lower dimensionality. Mutation GLU146LYS in P4 forces change of ASP7 conformation in P3 and, as a result, a loss of its activity measured by DTSS from −1.12 kcal/mol to +0.02 kcal/mol. Then, mutations LYS19-GLU and LYS146XX (where XX rerpesents a neutral amino acid) introduced into P4 restore ASP7 activity in P5. In P6, the previous mutation is reversed, resulting in the same activity loss as that observed in P4. Finally, the LYS19XX mutation in P6 restores ASP7 activity in P7 again. All of these results indicate the essential role of molecular context, which can be explored by the multiscan technique presented in the previous chapter. Additional analysis of nonadditive effects of LYS146 and Arg202 is presented in section S4 of the Supporting Information.

■ DISCUSSION
The presented scheme correctly explains the effect of mutations leading from KE07 to R7−R10/11G, providing an atomistic insight into the possible role of each residue contribution to the activation barrier lowering, emerging from first principles.  8 In the CAMM approach, reactant−amino acid side-chain interactions were calculated by employing cumulative atomic multipole moment expansion. In the GB approach, corresponding ESP atomic charges have been used, including solvent effects estimated from the Generalized Born method. MMPBSA denote mean interaction energies using MD trajectories where solvent effects were obrained from the Poisson−Boltzmann model. 51 Journal of Chemical Theory and Computation pubs.acs.org/JCTC Article The most significant result of this analysis is an effective approximate solution of the "inverse catalysis problem" confronted and confirmed for the first time by experimental results. We have shown that, based on the information provided by the catalytic field, the search space could be narrowed down to a limited number of computational steps. The underlying electrostatic differential transition state stabilization DTSS concept 25,26 has been confirmed later in recent independent studies. 18,23,31 Jindal and co-workers analyzed the differences between evolutionary pathways of two Kemp eliminase families: descendants of KE07 and HG3. 23 They noted that, in the first group, the improvement arises from ground-state destabilization, whereas, in the second one, te improvement results from an increase in transition-state stabilization (TSS). This is consistent with our more general DTSS and catalytic field approach, which covers both effects simultaneously. 25,26 Moreover, some elements of differential stabilization of the transition state approach appeared later in two other works, 18,22 which provides further reinforcement for this idea.
In summary, the role of residues 7, 19, 123, 146, 202, and 224 modified in laboratory-directed evolution experiments, which are not in direct contact with reactants, seems to be modulation of the catalytic field to minimize activation barrier, without changing the reactant binding pose determined primarily by the amino acids constituting the first coordination sphere.
The essential advantage of the present approach is the use of cumulative atomic multipole moments to describe molecular charge redistribution during the progress of the reaction, capturing the anisotropic character of molecular electrostatic potentials, 64 which may be more significant in neutral systems. 43 It may be still crucial in proper description of the catalytic field, since Δ S has at least dipole character, because of the charge conservation principle. Another important feature of the catalytic field is that, despite special emphasis placed on charged residues in the present analysis, it can provide more general information; by considering its gradient Δ ⃗ d , one gains insight into the optimal distribution of catalytic dipoles in a way analogous to that presented above for charges (see Figures S1 and S2 in the Supporting Information). Finally, as mentioned earlier, whenever necessary, the DTSS energy can always be generalized to include remaining interaction energy terms defined in interaction energy decomposition schemes, such as SAPT 39 or HVPT. 40 For example, exchange repulsion and dispersion terms could be estimated from the corresponding nonempirical atom−atom functions approximating accurate SAPT results. 65,66 An important issue is a nonadditivity of mutations reported recently. 23 Within the presented framework, it could be included by exploration of the effect of mutation on rotamer distribution of other residues (and corresponding DTSS energy changes). An example of such analysis is presented in Figure 5, where we found that considering the mutual interactions of residues Asp7, Lys19, Lys146, Arg202, and Arg2246 in P3, P4, P5, P6, and P7 mutants changes the optimal DTSS contribution of Asp7 several times from −1.1 (P3), 0.0 (P4), −1.1 (P5), 0.0 (P6), and −1.1 (P7) kcal/mol.
Such an approach can be effectively used to obtain rapid estimates of nonadditive effects in multiple mutants. The general idea presented in this work may be easily introduced into other existing protocols, such as an ensemble of short MD 67 or Monte Carlo simulations. 36 Application of static catalytic field gradients (dynamic electrostatic catalytic fields DECF) 47