Stable and Functionally Diverse Versatile Peroxidases Designed Directly from Sequences

White-rot fungi secrete a repertoire of high-redox potential oxidoreductases to efficiently decompose lignin. Of these enzymes, versatile peroxidases (VPs) are the most promiscuous biocatalysts. VPs are attractive enzymes for research and industrial use but their recombinant production is extremely challenging. To date, only a single VP has been structurally characterized and optimized for recombinant functional expression, stability, and activity. Computational enzyme optimization methods can be applied to many enzymes in parallel but they require accurate structures. Here, we demonstrate that model structures computed by deep-learning-based ab initio structure prediction methods are reliable starting points for one-shot PROSS stability-design calculations. Four designed VPs encoding as many as 43 mutations relative to the wildtype enzymes are functionally expressed in yeast, whereas their wildtype parents are not. Three of these designs exhibit substantial and useful diversity in their reactivity profiles and tolerance to environmental conditions. The reliability of the new generation of structure predictors and design methods increases the scale and scope of computational enzyme optimization, enabling efficient discovery and exploitation of the functional diversity in natural enzyme families directly from genomic databases.


Computational methods
Materials and experimental procedures
Amino acid sequences of characterized VP designs
Table S1. Selected VP origins, protein lengths and number of mutations in each design
Table S2. Selected VPs homology
Table S3. Most-active VP designs sequence homology (in percentage)
Table S4. Kinetic parameters with various substrates
Figure S1. Accuracy of trRosetta models
Figure S2. Sequence alignment of 5H, 8H and 11H to their wildtype progenitor
References

Materials and Experimental Procedures
Reagents. The protease-deficient S. cerevisiae strain BJ5465 (α ura3-52 trp1 leu2Δ1 his3Δ200 pep4::HIS3 prb1Δ1.6R can1 GAL) was obtained from LGC Promochem (Barcelona, Spain). The uracil-independent, ampicillin-resistance shuttle vector pJRoC30 was obtained from the California Institute of Technology (Caltech, Pasadena, CA). The α-factor prepro-leader sequence and all the VP gene sequences were ordered from Twist Biosciences (San Francisco, CA). The BamHI and XhoI restriction enzymes were ordered from New England Biolabs (NEB, Rehovot, Israel). ABTS, VA, RB5, and the S. cerevisiae transformation kit were purchased from Sigma-Aldrich (Rehovot, Israel). DMP was purchased from Acros Organics (Geel, Belgium) and hemoglobin from bovine erythrocytes from EMD Millipore Corp (Billerica, MA, USA).
Cloning of VP genes. All VP genes were cloned using the S. cerevisiae homologous recombination machinery 14. The pJRoC30-AAO (aryl-alcohol oxidase) expression shuttle vector, previously constructed in the Alcalde lab 15, was digested with BamHI and XhoI restriction enzymes to remove the signal peptide and the AAO gene it carried. The α-factor prepro-leader DNA sequence that was used in a previous directed evolution campaign of VPL (including an additional restriction site at its 3' end that encodes a Glu-Phe dipeptide at the N-terminus of the mature proteins) 16 was ordered as a gene fragment with a 40 bp overlap to the linearized plasmid, and each VP gene was ordered with 40 bp overlaps to the signal peptide sequence and to the linearized plasmid. The 40 bp overlapping regions between the three fragments (plasmid, signal peptide, VP gene) allowed the recombination machinery of the protease-deficient S. cerevisiae strain BJ5465 to fuse the three DNA elements after transformation and to form the pJRoC30-SignalPeptide-VPgene expression shuttle vector. pJRoC30-VPL-WT, -R4, and -2-1B were constructed previously in the Alcalde lab 16. All S. cerevisiae-transformed cells were plated on synthetic complete (SC) drop-out plates, and from each plate, selected colonies were picked and sequenced to verify correct assembly and gene sequence.

Culture media. Minimal medium is composed of 6.7 g/L yeast nitrogen base, 1.92 g/L amino acid supplements (yeast synthetic drop-out medium supplements without uracil), 2% raffinose and 25 mg/L chloramphenicol. VP expression medium is composed of 1.11x YP medium (22.2 g/L bacto peptone and 11.1 g/L yeast extract), 67 mM KH2PO4 buffer at pH 6.0, 25 g/L ethanol, 22.2 g/L D-galactose, 500 mg/L bovine hemoglobin, 1 mM CaCl2 and 25 mg/L chloramphenicol. SC drop-out plates are composed of 6.7 g/L yeast nitrogen base, 1.92 g/L amino acid supplements (yeast synthetic drop-out medium supplements without uracil), 2% glucose, 20 g/L Bacto agar and 25 mg/L chloramphenicol.
Screening for active variants. A colony from each S. cerevisiae clone containing the parental or mutant VP gene was picked from an SC drop-out plate, inoculated in 2 mL minimal medium in a 14 mL culture tube, and incubated for 48 hours at 30 °C and 225 rpm. An aliquot of cells was removed and used to inoculate 2 mL of minimal medium in a new 14 mL culture tube to an OD600nm of 0.25-0.30, under the same conditions. The cells completed two growth phases (8-10 hours, reaching OD600nm ~ 1), then the expression medium (2.7 mL) was inoculated with 0.3 mL of the pre-culture in a new 14 mL culture tube (OD600nm ~ 0.1). Cells were incubated for a further ~38-40 hours at 30 °C and 225 rpm and then centrifuged at 4000 g for 20 min at 4 °C. The supernatant was transferred into new tubes for further analysis. The expression protocol was run in triplicate, with an empty vector (containing only the signal peptide sequence) as the negative control and the VPL-R4 and -2-1B variants as positive controls. An ABTS-based colorimetric assay was conducted to assess the variants' activity: 20 μL of supernatant were transferred into 96-well activity plates (Greiner Bio-One GmbH, Kremsmünster, Austria), 180 μL of the reaction mixture were then added to each row in the plate, and absorption at 418 nm was recorded immediately in kinetic mode in a plate reader at 25 °C (Cytation 5 or Synergy HTX plate readers, Bio-Tek, Bad Friedrichshall, Germany). The reaction mixture contained 100 mM citrate-phosphate buffer (pH 4.0), 2 mM ABTS and 0.1 mM H2O2. The activities were recorded in triplicate.
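The volumes quoted in this protocol imply simple dilution factors, which can be checked with the short sketch below. It assumes ideal mixing and additive volumes; the function name `dilute` is hypothetical and is not part of the original protocol.

```python
def dilute(value, v_sample_ml, v_diluent_ml):
    """Return a concentration (or OD) after mixing a sample with diluent.

    Assumes additive volumes and that the diluent contributes none of the
    measured quantity (both are simplifying assumptions).
    """
    total = v_sample_ml + v_diluent_ml
    if total <= 0:
        raise ValueError("total volume must be positive")
    return value * v_sample_ml / total


# Pre-culture at OD600 ~ 1: 0.3 mL added to 2.7 mL of expression medium.
start_od = dilute(1.0, 0.3, 2.7)

# Assay: 20 uL of supernatant plus 180 uL of reaction mixture.
assay_fold = 1.0 / dilute(1.0, 0.020, 0.180)

print(f"starting OD600 ~ {start_od:.2f}, assay dilution ~ {assay_fold:.0f}-fold")
```

Under these assumptions the inoculation is consistent with the stated starting OD600nm of about 0.1, and the supernatant is diluted roughly 10-fold in the activity assay.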
Small-scale production of active variants. A colony from each S. cerevisiae clone containing the parental or mutant VP gene was picked from an SC drop-out plate, inoculated in 2 mL minimal medium in a 14 mL culture tube, and incubated for 48 hours at 30 °C and 225 rpm. An aliquot of cells was removed and used to inoculate 2.7 mL of minimal medium in a new 14 mL culture tube to an OD600nm of 0.25-0.30, under the same conditions. The cells completed two growth phases (8-10 hours, reaching OD600nm ~ 1), then the expression medium (9 mL) was inoculated with 1 mL of the pre-culture in a 50 mL Falcon tube (OD600nm ~ 0.1). Cells were incubated for a further ~60 hours at 30 °C and 225 rpm and then centrifuged at 4000 g for 20 min at 4 °C. The supernatant was transferred into new tubes for further characterization of the VPs in supernatant.
Thermostability assay (T50). Aliquots of 30 μL of selected variants' supernatant at appropriate dilutions (with 20 mM piperazine pH=5.5, buffer A, to achieve a linear response in kinetic-mode activity reads) were used for each incubation temperature. The samples were incubated for 10 or 15 minutes in a thermocycler pre-heated to a specific temperature (every 5 °C in a gradient scale ranging from 25 to 80 °C) and then removed and chilled on ice for 10 min. Thereafter, samples were removed from ice and incubated for at least 5 min at room temperature. Activity at each temperature was measured using the ABTS-based colorimetric assay described above and was normalized to the activity at 25 °C for residual activity calculations. All incubations and activity assays were conducted in triplicate. T50 values were calculated by a sigmoidal fit to the residual-activity data of 5H, 8H, 11H and R4 (15-minute incubation).
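The text does not specify the sigmoidal model or the fitting software, so the following is only a minimal sketch of how a T50 value could be extracted from residual-activity data. It assumes a two-state sigmoid, residual = 1 / (1 + exp((T - T50) / s)), and fits it by logit linearization rather than nonlinear least squares. The function name `fit_t50` and the data are illustrative, not taken from the study.

```python
import math


def fit_t50(temps_c, residual):
    """Estimate (T50, s) from residual activities by a logit-linearized fit.

    Assumed model: residual = 1 / (1 + exp((T - T50) / s)), so that
    ln(1/residual - 1) = (T - T50) / s is a straight line in T.
    Points with residual <= 0 or >= 1 carry no logit and are skipped.
    """
    pts = [(t, math.log(1.0 / r - 1.0))
           for t, r in zip(temps_c, residual) if 0.0 < r < 1.0]
    if len(pts) < 2:
        raise ValueError("need at least two points with 0 < residual < 1")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    slope = sxy / sxx                     # equals 1 / s
    intercept = mean_y - slope * mean_t   # equals -T50 / s
    return -intercept / slope, 1.0 / slope


# Synthetic, noise-free data generated from the assumed model (not measured).
temps = list(range(25, 85, 5))
activity = [1.0 / (1.0 + math.exp((t - 62.0) / 3.0)) for t in temps]
t50, width = fit_t50(temps, activity)
print(f"T50 = {t50:.1f} C, transition width s = {width:.1f} C")
```

With real triplicate data, a nonlinear fit that weights points by their standard deviation would usually be preferable; the linearized form is used here only to keep the example dependency-free.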
Kinetic thermostability (t1/2). Aliquots of 30 μL of selected variants' supernatant at appropriate dilutions (with buffer A, to achieve a linear response in kinetic-mode activity reads) were used for each incubation time point. The samples were incubated in a thermocycler (S1000 thermocycler, Bio-Rad, Rishon LeZion, Israel) pre-heated to 60 °C or 65 °C, removed at different times (after 0, 2, 5, 7, 10, 15, 20, 30, 45, 60, 90 and 120 min), chilled on ice for 10 min and further incubated at room temperature for at least 10 min. Activity at each time point was measured using the ABTS-based colorimetric assay described above and was normalized to the activity at time 0 for residual activity calculations. All incubations and activity assays were conducted in triplicate.
pH stability. Supernatants of the selected variants were diluted to reach a final concentration of 100 mM citrate-phosphate-borate buffer at pH values ranging from 2 to 9. Aliquots of 20 μL were removed at different times (time 0, 4, 25, 50, 75 and 165 hours) and measured in the regular ABTS-based colorimetric assay described above, but here in the presence of 180 μL of the following reaction mixture: 111.11 mM citrate-phosphate-borate buffer (pH 4.0), 2.22 mM ABTS and 0.111 mM H2O2. For pH 2, an additional experiment was conducted under the same procedure but with aliquots removed at shorter time points (time 0, 10, 20, 30, 45, 60, 75, 90, 120, and 150 min). For the assay in the pH range 2-9, activities were normalized to the activity at time 0 at pH 3 for residual activity calculations. All incubations and activity assays were conducted in triplicate.
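For the kinetic thermostability time courses described above, the text does not state how t1/2 was derived from the residual activities. One common approach, shown here purely as an assumption, is to treat inactivation as first-order decay, residual = exp(-k t), and compute t1/2 = ln 2 / k. The function name `half_life_min` and the data are illustrative.

```python
import math


def half_life_min(times_min, residual):
    """Estimate a half-life assuming first-order inactivation.

    Fits ln(residual) = -k * t by least squares through the origin, which
    matches residual activities normalized to 1 at time zero.
    """
    pts = [(t, math.log(r)) for t, r in zip(times_min, residual) if r > 0]
    den = sum(t * t for t, _ in pts)
    if den == 0:
        raise ValueError("need at least one time point with t > 0")
    k = -sum(t * y for t, y in pts) / den   # rate constant, per minute
    if k <= 0:
        raise ValueError("no decay detected; half-life is undefined")
    return math.log(2.0) / k


# Time points from the protocol; activities are synthetic (20 min half-life).
times = [0, 2, 5, 7, 10, 15, 20, 30, 45, 60, 90, 120]
activity = [0.5 ** (t / 20.0) for t in times]
t_half = half_life_min(times, activity)
print(f"t1/2 = {t_half:.1f} min")
```

If the decay is visibly biphasic or plateaus above zero, a single exponential is not an adequate model and the half-life is better read from the curve directly.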
VP production and purification. A colony from the S. cerevisiae clone containing the VP gene (5H, 8H, 11H, VPL_R4, VPL_2-1B) was picked from an SC drop-out plate, inoculated in 25 mL minimal medium in a 250 mL flask, and incubated for 48 hours at 30 °C and 225 rpm. An aliquot of cells was removed and used to inoculate 100 mL of minimal medium in a 1 L flask to an OD600nm of 0.25-0.30, under the same conditions. The cells completed two growth phases (8-10 hours, reaching OD600nm ~ 1-1.5), then the expression medium (450 mL) was inoculated with 50 mL of the pre-culture in a 2 L flask (OD600nm ~ 0.1). Cells were incubated for a further ~60 hours at 30 °C and 225 rpm. Thereafter, cells were centrifuged at 6000 g for 15 min at 4 °C, and the supernatant was collected and filtered with a 0.2 μm filter bottle.
Filtrates were subjected to fractional precipitation with ammonium sulfate in two steps: a first cut of 50%, followed by centrifugation and elimination of the precipitates, and a second cut of 70%. Buffer A was used to dissolve the pellet of the second cut, and the dissolved protein solution was shaken overnight at 4 °C for maximal recovery. The dissolved fraction was then centrifuged, filtered, concentrated, and subjected to overnight dialysis against buffer A. Filtered fractions of the VP proteins after dialysis were loaded onto a HiTrap Q HP column (GE Healthcare Bio-Sciences AB, Uppsala, Sweden) pre-equilibrated with buffer A, using an ÄKTA pure protein purification system (GE Healthcare Bio-Sciences AB). Proteins were eluted in a two-step linear gradient from 0 to 1 M NaCl at a flow rate of 1 mL/min: a first phase of 0-25% over 15 column volumes (75 min) and a second phase of 25-100% over 2 column volumes (10 min). The fractions of the peak with the highest VP activity (and absorption at 407 nm) were pooled, concentrated, and dialyzed against 20 mM piperazine buffer pH=5.5 and 150 mM NaCl (buffer B). Protein fractions were then loaded onto a Superdex 75 Increase 10/300 GL column (GE Healthcare Bio-Sciences AB) pre-equilibrated with buffer B, using the ÄKTA pure system. The fractions of the peak with the highest VP activity (and absorption at 407 nm) were pooled and dialyzed against buffer A. Pure protein samples were stored at 4 °C. Protein concentration was determined using the BCA assay with bovine serum albumin as a standard. The obtained Reinheitszahl values (Rz: Abs407 nm/Abs280 nm), which indicate the purity of peroxidases, were 1 for 8H (due to its high tryptophan content and therefore high extinction coefficient at 280 nm) and above 2 for 5H, 11H, R4 and 2-1B.
Hydrogen peroxide stability. Purified 5H, 8H, 11H, R4 and 2-1B at 250 nM concentration (diluted with buffer A, to achieve linear response in kinetic mode measurements in activity reads) were incubated for 50 minutes at room temperature with 750 μM H2O2 (1:3,000 molar ratio). An aliquot of 20 μL was removed at times 0, 3, 7, 12, 18, 28, 38 and 48 minutes. Activity was immediately measured using the ABTS-based colorimetric assay described above and was normalized to the activity at time zero for residual activity calculations. All incubations and activity assays were conducted in triplicate.
pH activity profiles. For purified 5H, 8H, 11H, and R4, 20 μL protein samples (diluted in buffer A) were transferred into 96-well activity plates (UV-Star plates in the case of VA and MnSO4; Greiner Bio-One GmbH, Kremsmünster, Austria), 180 μL of the reaction mixture were then added to each row in the plate, and absorption at the appropriate (substrate-dependent) wavelength was recorded immediately in kinetic mode in a plate reader at 25 °C. The reaction mixtures contained a specific substrate in 100 mM citrate-phosphate-borate buffer (pH 2, 3, 3. [...]
For ABTS and DMP, using a semi-logarithmic [S]-axis scale, double hyperbolic curves could be obtained, implying two oxidation sites (of high and low efficiency, in the μM and mM ranges). Each of the curve regions was fitted separately to the Michaelis-Menten model, which enabled the calculation of the two sets of kinetic constants. 20 μL purified protein samples (diluted in buffer A to an appropriate concentration, [E]<<[S]) were transferred into 96-well activity plates (UV-Star plates in the case of VA and MnSO4), 180 μL of the reaction mixture were then added to each row in the plate, and absorption at the appropriate (substrate-dependent) wavelength was recorded immediately in kinetic mode in a plate reader at 25 °C. The reaction mixtures contained substrates at varying concentrations in 100 mM citrate-phosphate-borate buffer at optimum pH (for manganese, sodium tartrate buffer was used) and optimum H2O2 concentration (0.4 mM for 5H, 0.2 mM for 8H, 0.1 mM for 11H and 1 mM for R4; approximately double the KM values were used to gain high activity with minimal inhibition). H2O2 kinetics were measured using 2 mM (5H, 11H and R4) or 3 mM (8H) ABTS in 100 mM citrate-phosphate-borate buffer at the optimum pH for ABTS activity.
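The fitting procedure itself is not described beyond "fitted separately to the Michaelis-Menten model", so the sketch below is only one way to obtain KM and Vmax for a single curve region. It uses the Hanes-Woolf linearization instead of nonlinear regression, and the function name `fit_michaelis_menten`, the units, and the data are illustrative assumptions.

```python
def fit_michaelis_menten(conc, rate):
    """Estimate (Km, Vmax) for one curve region by Hanes-Woolf linearization.

    Rearranging v = Vmax*[S] / (Km + [S]) gives
    [S]/v = [S]/Vmax + Km/Vmax, a straight line in [S].
    """
    pts = [(s, s / v) for s, v in zip(conc, rate) if s > 0 and v > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points with positive [S] and v")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx                     # equals 1 / Vmax
    intercept = mean_y - slope * mean_x   # equals Km / Vmax
    vmax = 1.0 / slope
    return intercept * vmax, vmax


# Synthetic, noise-free data for one region (arbitrary units, not measured).
substrate = [5, 10, 25, 50, 100, 250, 500]
velocity = [2.0 * s / (50.0 + s) for s in substrate]
km, vmax = fit_michaelis_menten(substrate, velocity)
print(f"Km = {km:.1f}, Vmax = {vmax:.2f}")
```

For a double hyperbolic curve, the function would be called once on the points of the low-concentration region and once on those of the high-concentration region; where to split the two regions is a judgment call that this sketch does not automate. Dividing Vmax by the enzyme concentration then gives kcat.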
The following molar extinction coefficients were used to calculate the substrate/product concentration: ABTS, ε418 nm = 36,000 M−1 cm−1; DMP, ε469 nm = 27,500 M−1 cm−1; RB5, ε598 nm = 30,000 M−1 cm−1; VA, ε310 nm = 9300 M−1 cm−1; Mn3+-tartrate, ε238 nm = 6500 M−1 cm−1. All activities were recorded in triplicate and the average velocity was used for the kinetic constants calculations. Table S4 (of the low-efficiency site that dominates the reaction at the used ABTS concentration). The activity assay was performed at pH=4.0, therefore the initial activities were normalized to the activity at optimal pH (using the data from pH-dependent activity assay; Figure S5).
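The conversion from a kinetic-mode absorbance slope to a reaction rate follows the Beer-Lambert law, rate = (dA/dt) / (ε · l). The sketch below uses the extinction coefficients listed above; the optical path length `path_cm` is not given in the text and depends on the plate and fill volume, so it is left as a required input. The function name `rate_molar_per_min` is hypothetical.

```python
# Molar extinction coefficients quoted in the text (M^-1 cm^-1).
EXTINCTION = {
    "ABTS": 36_000,          # 418 nm
    "DMP": 27_500,           # 469 nm
    "RB5": 30_000,           # 598 nm
    "VA": 9_300,             # 310 nm
    "Mn3+-tartrate": 6_500,  # 238 nm
}


def rate_molar_per_min(substrate, slope_abs_per_min, path_cm):
    """Convert an absorbance slope (A/min) into a rate in M/min.

    path_cm is the optical path length of the well; it is an assumption
    the user must supply, since the text does not state it.
    """
    if path_cm <= 0:
        raise ValueError("path length must be positive")
    return slope_abs_per_min / (EXTINCTION[substrate] * path_cm)


# Illustrative numbers only: 0.36 A/min for ABTS with a 1 cm path length.
example = rate_molar_per_min("ABTS", 0.36, 1.0)
print(f"{example * 1e6:.1f} uM/min")
```

Dividing the resulting rate by the molar enzyme concentration in the well gives a turnover rate in min−1.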

Amino acid sequences of characterized VP designs:
5WT:
# mut refers to the number of mutations in each designed variant: H, high mutational load; M, medium mutational load; L, low mutational load.
Figure S1. Accuracy of trRosetta models. (A) The four most reliable (top-ranked) models of a representative VP (VP5, gray), superimposed onto one another, demonstrate that most of the protein structural elements converge and that small discrepancies occur only in peripheral loops (brown). (B) Best model of VP5 (gray) overlaid onto the wildtype VPL crystal structure (PDB entry 3FJW; light green). VPL calcium and manganese ions are presented as blue and pink spheres, respectively, and the heme group as pink sticks. (C) Close-up view of the VP5 residues that face all ligands and ions.
Figure S2. Sequence alignment of 5H, 8H and 11H to their wildtype progenitor. PROSS-mutated positions are highlighted in green. Design in positions highlighted in gray was disallowed due to: proximity to one of the active sites or structural calcium ions, putative disulfide-bond forming cysteines, or structural inconsistency in the best five models calculated by trRosetta. Mutations calculated by PROSS in positions highlighted in yellow were omitted after visual inspection for one of the following reasons: low homologous-sequence data in the mutation region (PSSM with fewer than 10 sequences), formation or depletion of a possible N-glycosylation site, radical mutation in the protein core (large hydrophobic to small hydrophobic, hydrophilic to hydrophobic substitutions, etc.), mutation in the heme's substrate pocket, or mutation in possible contact with structurally inconsistent regions.
In all panels, the PROSS-design model (purple) is superimposed onto the trRosetta-generated wildtype model (gray). Significant mutations, residues in their vicinity and the hydrogen bonds they form are presented as gray and purple sticks for the wildtype and designed models, respectively.
Figure S4. Thermal stabilities of selected VP designs. VPs (5H, 8H, 11H, VPL-R4 and 2-1B) were incubated for (A) 15 or (B) 10 minutes at temperatures ranging from 30 to 80 °C (in gradient steps of 5 °C), and their residual activity compared to the activity at 25 °C was measured (data not shown for 2-1B in A). (C) Kinetic thermostability (t1/2) profiles were determined by incubating VP supernatants at 65 °C and measuring their residual activity at times 0-120 minutes, compared to the activity at time zero. All results are the means ± S.D. from three independent experiments.
Figure S5. pH stabilities of selected VP designs. (A) VPs (5H, 8H, 11H, VPL-R4 and 2-1B) were incubated at pH values ranging from 2 to 9 using 100 mM citrate-phosphate-borate buffer, and their residual activity at times 0-165 hours, compared to the activity at pH=3 at time zero, was measured. (B) VPs were incubated in 100 mM citrate-phosphate-borate buffer at pH=2, and their activity at times 0-150 minutes was measured. All results are the means ± S.D. from three independent experiments.
Figure S6. pH activity profiles of selected VP designs with versatile substrates. Purified 5H, 8H, 11H and R4 were assayed for activity over a range of pH values (pH 2-9 for all substrates except Mn2+, which was tested at pH 3-5): ABTS, DMP, Mn2+, VA and RB5. VP activity was normalized to the activity at the optimal pH for each protein-substrate pair. All results are the means ± S.D. from three independent experiments.
Figure S7. 2020's evolution of deep-learning-based ab initio structure prediction methods. (A) Most reliable (top-ranked) models of a representative VP (VP11), superimposed onto one another and onto the wildtype VPL crystal structure (PDB entry 3FJW; gray): trRosetta V1 9 (released in January 2020), trRosetta V2 10 (DeepAccNet-Rosetta, which became available via the Robetta server in December 2020) and AlphaFold2 13 (made available through a Google Colab notebook in July 2021; green). VPL calcium and manganese ions are presented as blue and pink spheres, respectively, and the heme group as light pink sticks. (B) Close-up view of the active-site packing of the wildtype VPL crystal structure and the VP11 trRosetta V1, trRosetta V2 and AlphaFold2 models. The VPL manganese ions and heme group are presented as in (A). Position identities and numbers are relative to PDB entry 3FJW. Note the close correspondence between trRosetta V2, AlphaFold2 and the VPL crystal structure in the core of the enzyme domain and within the heme binding pocket, in contrast to the divergence of the trRosetta V1 model.