The Peptide Ligase Activity of Human Legumain Depends on Fold Stabilization and Balanced Substrate Affinities

Protein modification by enzymatic breaking and forming of peptide bonds significantly expands the repertoire of genetically encoded protein sequences. The dual protease-ligase legumain exerts the two opposing activities within a single protein scaffold. Primarily localized to the endolysosomal system, legumain represents a key enzyme in the generation of antigenic peptides for subsequent presentation on the MHCII complex. Here we show that human legumain catalyzes the ligation and cyclization of linear peptides at near-neutral pH conditions, where legumain is intrinsically unstable. Conformational stabilization significantly enhanced legumain’s ligase activity, which further benefited from engineering the prime substrate recognition sites for improved affinity. Additionally, we provide evidence that specific legumain activation states allow for differential regulation of its activities. Together these results set the basis for engineering legumain proteases and ligases with applications in biotechnology and drug development.


Protein preparation
Human legumain was expressed, purified and activated as described previously [1]. Point mutants were generated using round-the-horn site-directed mutagenesis, which is based on the inverse PCR method, as described earlier [2, 3]. Caspase-9 was cloned into the pET22b vector using XbaI and XhoI.
The expression construct carried the catalytic domain of human caspase-9 and a C-terminal His6-tag. ∆CARD-caspase-9 was expressed in BL21(DE3) cells as described elsewhere [4]. Briefly, BL21(DE3) cells containing the ∆CARD-caspase-9 expression construct were grown in LB medium at 37 °C until an OD600 of 0.5-0.7 was reached. Cells were then transferred to 25 °C and expression was induced by adding 0.4 mM IPTG. After 4 hours of expression, cells were harvested by centrifugation at 4000 rpm and 4 °C for 10 min. The pellet was resuspended in lysis buffer composed of 50 mM Hepes pH 7.5 and 100 mM NaCl. Cells were lysed by sonication (45 seconds, 40% power, 50% cycle, 3 times) and the supernatant containing soluble caspase-9 was harvested by centrifugation (17,500 × g, 15 min, 4 °C). The cleared supernatant was incubated with Ni²⁺ resin for 15 min at 4 °C. After protein binding, the beads were washed with 5 column volumes of lysis buffer, followed by one column volume each of lysis buffer supplemented with 5 mM, 10 mM and 15 mM imidazole. Bound protein was eluted with lysis buffer supplemented with 250 mM imidazole. Elution fractions were concentrated using Amicon Ultra centrifugal filter units (cutoff: 3-5 kDa) and subjected to size exclusion chromatography on an S75 column preequilibrated in 20 mM Hepes pH 7.5 and 100 mM NaCl. The final protein migrated as 3 distinct bands on SDS-PAGE, corresponding to the p19, p18 and p12 subdomains. Fractions containing pure caspase-9 were pooled, concentrated, and frozen in aliquots at -20 °C.

Fluorescent activity assay
Legumain protease activity was tested using the fluorescent peptide Z-Ala-Ala-Asn-7-amino-4-methylcoumarin (AAN-AMC; Bachem) at an enzyme concentration of 4 nM, unless stated otherwise. Activity of the wild-type enzyme and the different mutants was assayed at pH 5.5, unless stated otherwise, in assay buffer composed of 50 mM citric acid, 100 mM NaCl, 0.05% Tween-20, and 50 µM AAN-AMC. To test the pH stability of legumain, activity was measured in buffer composed of 50 mM Hepes pH 7.0, 100 mM NaCl, and 0.05% Tween-20. The increase in fluorescence signal was monitored at 460 nm after excitation at 380 nm in an Infinite M200 Plate Reader (Tecan) at 37 °C.
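The text does not state how activity values were derived from the fluorescence traces. A common choice, assumed here, is the slope of the initial linear phase, with mutants expressed relative to wild-type. The following Python sketch illustrates that assumption; the function names and readings are hypothetical and are not data from this study.

```python
def initial_rate(times_s, fluorescence):
    """Least-squares slope of fluorescence versus time (signal units per second)."""
    n = len(times_s)
    if n != len(fluorescence) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_s) / n
    mean_f = sum(fluorescence) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_s, fluorescence))
    return sxy / sxx


def relative_activity(rate, reference_rate):
    """Activity of a variant as a fraction of the reference (e.g. wild-type) rate."""
    return rate / reference_rate


# Made-up example readings (assumption, for illustration only).
times = [0, 60, 120, 180]
wild_type = [100, 160, 220, 280]
mutant = [100, 130, 160, 190]

wt_rate = initial_rate(times, wild_type)
mut_rate = initial_rate(times, mutant)
print(relative_activity(mut_rate, wt_rate))
```

Only the linear part of a trace should be passed to `initial_rate`; substrate depletion or enzyme inactivation at pH 7.0 would otherwise bias the slope.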

Chemical surface modification
Active legumain was buffer exchanged to 100 mM MES pH 6.0 and 100 mM NaCl at a final protein concentration of 0.05 mg/ml. EDC and NHS stock solutions were prepared freshly by dissolving appropriate amounts of powder in ddH2O. The stock concentrations were 400 mM for EDC and 100 mM for NHS. EDC and NHS were mixed in a 1:1 ratio and added to legumain at final concentrations of 80 mM and 20 mM, respectively. The reaction was incubated for 10 min at 22 °C. Ethanolamine was then added to a final concentration of 166 mM and the reaction was incubated for another 10 min at 22 °C. The 1 M ethanolamine stock solution used for this step was prepared by diluting the 16.5 M stock solution with 1 M citric acid and adjusting the pH to 6.2. After completion of the reaction, the buffer was exchanged to 20 mM citric acid pH 4.0 and 100 mM NaCl using a NAP-5™ column and the protein was further concentrated using a Vivaspin concentrator (Sartorius; MW cutoff: 10 kDa). A control reaction was prepared in parallel and treated in essentially the same way, but was supplemented with ddH2O instead of EDC/NHS and ethanolamine.
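The concentrations given for the EDC/NHS step can be cross-checked with simple dilution arithmetic. The Python sketch below uses hypothetical helper names and an assumed reaction volume of 100 µl (the actual volumes are not stated in the text).

```python
def premix_1_to_1(stock_a_mM, stock_b_mM):
    """Concentrations after mixing equal volumes of two stock solutions."""
    return stock_a_mM / 2, stock_b_mM / 2


def volume_fraction(source_mM, final_mM):
    """Fraction of the final volume that must come from the source solution."""
    if final_mM > source_mM:
        raise ValueError("final concentration cannot exceed the source concentration")
    return final_mM / source_mM


def spike_volume(stock_mM, target_mM, current_volume_ul):
    """Volume of stock to add so that the mixture reaches target_mM.

    Solves stock * v / (current + v) = target for v.
    """
    return current_volume_ul * target_mM / (stock_mM - target_mM)


# Stocks from the text: 400 mM EDC and 100 mM NHS, mixed 1:1.
edc_premix, nhs_premix = premix_1_to_1(400, 100)

# Final concentrations from the text: 80 mM EDC and 20 mM NHS.
edc_fraction = volume_fraction(edc_premix, 80)
nhs_fraction = volume_fraction(nhs_premix, 20)

# 1 M ethanolamine added to an assumed 100 µl reaction to reach 166 mM.
ethanolamine_ul = spike_volume(1000, 166, 100)
print(edc_fraction, nhs_fraction, round(ethanolamine_ul, 1))
```

Both reagents give the same volume fraction (0.4), so the stated stock and final concentrations are mutually consistent. The ethanolamine step corresponds to adding roughly one fifth of the reaction volume from the 1 M stock.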

Cyclisation assays
SFTI-derived peptides were synthesized and analyzed as described previously [5].

ACP stability assay
To test whether the two-chain complex of legumain (ACP; asparaginyl carboxypeptidase), which is composed of the AEP domain and the LSAM domain, is stable in the presence of the SFTI(N14)-GL substrate, we subjected it to size exclusion chromatography. Specifically, we incubated 10 µM ACP with 25 µM SFTI(N14)-GL peptide in a buffer composed of 50 mM citric acid pH 6.0 and 100 mM NaCl for 2 hours at 37 °C. Subsequently, we loaded the reaction on an S75 size exclusion column preequilibrated in reaction buffer at pH 6.0. Peak fractions were analyzed by SDS-PAGE.

Covariance analysis
We used CoeViz (4) to perform a pairwise coevolution analysis of amino acid residues. A multiple sequence alignment was generated using PSI-BLAST and coevolution scores were computed using different covariance metrics. We identified closely related amino acid residues by applying a cutoff of ≥ 0.3 to the χ² scores.
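The cutoff step can be sketched as a simple filter over pairwise scores. The Python snippet below is illustrative only: the residue positions and scores are invented, and the dictionary layout is an assumption rather than the CoeViz output format.

```python
def filter_coupled_pairs(scores, cutoff=0.3):
    """Keep residue pairs whose score is >= cutoff, strongest first."""
    kept = [(pair, score) for pair, score in scores.items() if score >= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical chi-squared-based scores keyed by (residue_i, residue_j).
example_scores = {
    (10, 42): 0.45,
    (10, 77): 0.12,
    (42, 77): 0.30,
    (5, 90): 0.29,
}

coupled = filter_coupled_pairs(example_scores)
print(coupled)
```

Note that the comparison is inclusive, so a pair scoring exactly 0.3 is retained, matching the "≥ 0.3" criterion.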

Thermal stability assay
The thermal stability of different legumain variants was assayed using differential scanning fluorimetry. The respective protein was diluted to a final concentration of 0.2 mg/ml into assay buffer containing 100 mM NaCl and either 100 mM Hepes pH 7.0 or 100 mM citric acid pH 4.0, and supplemented with 10× SYPRO Orange dye (Sigma). Thermal unfolding was measured in a 7500 Real Time PCR System (Applied Biosystems) by increasing the temperature by 1 °C per min from 20 °C to 95 °C while recording the fluorescence signal. Fluorescence data were normalized to peak values and melting curves were evaluated as described elsewhere [6].
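The evaluation procedure itself is given only by reference. As a rough illustration, the Python sketch below normalizes a trace to its peak value and estimates the melting temperature as the temperature of the steepest fluorescence increase. The first-derivative criterion is an assumption made here and may differ from the cited method; the curve is synthetic.

```python
import math


def normalize_to_peak(fluorescence):
    """Scale a fluorescence trace so that its maximum equals 1."""
    peak = max(fluorescence)
    if peak <= 0:
        raise ValueError("peak fluorescence must be positive")
    return [f / peak for f in fluorescence]


def melting_temperature(temps_c, fluorescence):
    """Temperature at the steepest increase, using central differences."""
    if len(temps_c) != len(fluorescence) or len(temps_c) < 3:
        raise ValueError("need at least three paired readings")
    norm = normalize_to_peak(fluorescence)
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(norm) - 1):
        slope = (norm[i + 1] - norm[i - 1]) / (temps_c[i + 1] - temps_c[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps_c[i], slope
    return best_temp


# Synthetic two-state unfolding curve centered at 50 °C, sampled every 1 °C
# from 20 °C to 95 °C as in the temperature ramp described above.
temps = list(range(20, 96))
signal = [1000 / (1 + math.exp(-(t - 50) / 2)) for t in temps]

tm = melting_temperature(temps, signal)
print(tm)
```

Real traces often decline after the peak because of dye dissociation or aggregation, so fitting is usually restricted to the region up to the peak.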

PICS experiments
To test the substrate specificity of human legumain, we carried out Proteomic Identification of protease Cleavage Sites (PICS) assays using peptide libraries generated from Escherichia coli BL21 proteome extracts as described previously [7][8][9]. The proteome was dissolved in 100 mM Hepes [...]. Desalted peptides were analyzed using a nano HPLC (Ultimate 3000 RSLC, Thermo, Dreieich, Germany) operated in a two-column setup and coupled to a high-resolution Q-TOF mass spectrometer (ImpactII, Bruker, Bremen, Germany). MaxQuant v1.6.0.16 [10] was used to match spectra to peptides from the UniProt E. coli K12 proteome library (downloaded Nov 2015, 4313 entries) with appended MaxQuant standard contamination entries. Trypsin was set as semi-specific digestion enzyme (i.e. only one side of the peptide was required to match the trypsin specificity). Label multiplicity was set to two, considering light dimethylation (+28.0313 Da) and heavy dimethylation (+34.0631 Da) as peptide N-terminal and lysine labels. Carbamidomethylation of cysteine residues (+57.0215 Da) was set as fixed modification; methionine oxidation (+15.9949 Da) and protein N-terminal acetylation (+42.0106 Da) were considered as variable modifications. The PSM false discovery rate was set to 0.01. Identified peptides that showed at least a fourfold increase in intensity after protease treatment compared to the control treatment, or that were exclusively present in the protease-treated condition, were considered putative cleavage products.
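The selection criterion for putative cleavage products can be written down compactly. The Python sketch below is a minimal illustration under assumed inputs: the peptide names and intensities are placeholders, and the tuple layout is hypothetical rather than the MaxQuant output format.

```python
def putative_cleavage_products(peptides, min_fold=4.0):
    """Select peptides enriched at least min_fold after protease treatment,
    or detected only in the protease-treated condition.

    peptides maps a sequence to (protease_intensity, control_intensity);
    None or a non-positive value means "not detected".
    """
    selected = []
    for sequence, (protease, control) in peptides.items():
        if protease is None or protease <= 0:
            continue  # not detected after protease treatment
        if control is None or control <= 0:
            selected.append(sequence)  # exclusive to the protease-treated sample
        elif protease / control >= min_fold:
            selected.append(sequence)  # at least fourfold increase
    return selected


# Placeholder peptides and intensities (assumption, not data from this study).
example = {
    "PEPTIDEA": (8.0e5, 1.0e5),  # 8-fold increase: kept
    "PEPTIDEB": (3.0e5, 1.0e5),  # 3-fold increase: dropped
    "PEPTIDEC": (2.0e5, None),   # protease-treated only: kept
    "PEPTIDED": (None, 5.0e4),   # control only: dropped
}

hits = putative_cleavage_products(example)
print(hits)
```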
An in-house Perl script (https://sourceforge.net/projects/pincis/) was used to remove putative library peptides (trypsin specificity on both sides of the identified peptide) and to reconstruct the full cleavage windows from the identified cleavage products as described [7]. Aligned validated cleavage windows were visualized using the iceLogo software version 1.3.8 [11], displaying site-specific differential amino acid abundance compared to the E. coli K12 proteome as reference set.
Formation of ligated product was measured via mass spectrometry experiments. When CIP was used as primed side substrate, samples were reduced/alkylated and analyzed by nanoHPLC (Dionex Ultimate 3000, Thermo Fisher Scientific, Bremen, Germany) coupled via nano electrospray to a Q Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific). When GG and GIP were used as primed side substrates, samples were directly infused into the Q Exactive mass spectrometer. Data analysis and de novo sequencing were done with PEAKS Studio X (Bioinformatics Solutions, Waterloo, Canada).
To confirm covalent modification of legumain with the CIP tripeptide, we incubated legumain with CIP as described above and tested its activity towards the AAN-AMC substrate at pH 5.5. In a control reaction, we added DMSO instead of the CIP peptide.
[...] were soaked in crystallization buffer supplemented with 20% glycerol. X-ray data were collected at the ESRF (Grenoble) on beamline ID23-2 at 100 K. Crystals diffracted to a resolution of 2.0 Å.
Data processing was performed using iMOSFLM and Aimless from the CCP4 program suite [12, 13]. PDB entry code 4awa was used as a search model for molecular replacement using PHASER [14]. Iterative cycles of model building in COOT followed by refinement in phenix.refine were carried out [15, 16]. The final structure was analyzed using PROCHECK and MolProbity, and coordinates and structure factors were deposited with the PDB under the entry code 7O50. Molecular graphics were prepared with PyMOL. Electrostatic surface potentials were prepared using APBS after assigning charges at pH 7.0 with PDB2PQR.
Figure legend (fragment): Autocatalytic processing sites are indicated with arrows, catalytic residues with red stars, residues close to the active site with pink stars, glycosylation sites with green triangles, S1-specificity residues with blue diamonds. The sequence numbering corresponds to human legumain.

[Figure panels A and B; axis label: Relative activity]

Table S1. Cyclic product formed using the indicated precursor peptides, enzyme concentrations and pH values. The % values are based on the relative abundance of the three species detected by mass spectrometry: precursor, processed linear product and cyclic product. AtLEGγ-TC: Arabidopsis thaliana legumain isoform γ in the two-chain activation state.

The structure was determined from a single crystal.
[a] Highest resolution shell is shown in parentheses.