Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence–Function Space and Genome Context to Discover Novel Functions

Departments of Biochemistry and Chemistry, Institute for Genomic Biology, University of Illinois, Urbana-Champaign Urbana, Illinois 61801, United States
Cite this: Biochemistry 2017, 56, 33, 4293–4308
Publication Date (Web):August 22, 2017
https://doi.org/10.1021/acs.biochem.7b00614

Copyright © 2017 American Chemical Society. This publication is licensed under these Terms of Use.

  • Open Access
  • Editors Choice

Abstract

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of “genomic enzymology” web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence–function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.

In 2001 Patricia Babbitt and I discussed nature’s strategies for divergent evolution of new enzymatic functions from a common progenitor to yield mechanistically diverse enzyme superfamilies (conserved active site architectures that catalyze reactions with shared partial reactions, intermediates, or transition states) and functionally diverse suprafamilies (conserved active site architectures that catalyze mechanistically distinct reactions). (1) When our review was published, only a few superfamilies/suprafamilies had been recognized, including the enolase, amidohydrolase, thiyl radical, enoyl-CoA hydratase (crotonase), vicinal-oxygen-chelate superfamilies, and the orotidine 5′-monophosphate (OMP) decarboxylase suprafamily, not surprising because the UniProt database then contained only 571 804 protein sequences (July 2001) (http://www.uniprot.org/; see Table 1 for a summary of abbreviations). Despite, in retrospect, a meager number of sequences, we concluded that enzymologists were positioned to expand their interests beyond studies of single enzymes to encompass entire enzyme families. We proposed that sequenced genomes (1) provided a rapidly expanding source of new proteins for investigation and (2) allowed genomic context to be used to infer novel enzymatic functions and, therefore, better understand the evolution of functional diversity in enzyme superfamilies. We suggested the term genomic enzymology to describe the expansive strategy of using protein families and genome context to focus studies of enzyme mechanisms, discover new functions, and more accurately describe the evolution of enzyme function in molecular terms (sequence and structure). However, we did not propose how the protein and genome sequence databases could be leveraged and used by the experimental community.
Table 1. List of Abbreviations
Abbreviation  Definition
ABC           ATP-binding cassette
AGeNNT        Automatically Generates refined Neighborhood NeTworks
antiSMASH     Antibiotics & Secondary Metabolite Analysis SHell
BGC           biosynthetic gene cluster
BLAST         Basic Local Alignment Search Tool
DSF           differential scanning fluorimetry
DUF           domain of unknown function
EFI           Enzyme Function Initiative
EFI-EST       EFI-Enzyme Similarity Tool
EFI-GNT       EFI-Genome Neighborhood Tool
ENA           European Nucleotide Archive
GNN           Genome Neighborhood Network
GRE           glycyl radical enzyme
InterPro      Integrated Protein Database
JGI-IMG/M     Joint Genome Institute-Integrated Microbial Genomes/Metagenomes
MSA           multiple sequence alignment
NCBI          National Center for Biotechnology Information
NRPS          nonribosomal peptide synthase
OMP           orotidine 5′-monophosphate
orf           open reading frame
P5C           Δ1-pyrroline-5-carboxylate
Pfam          Protein Family Database
PKS           polyketide synthase
PN            proteome network
PRISM         PRediction Informatics for Secondary Metabolomes
RLP           RuBisCO-like protein
RODEO         rapid ORF description and evaluation online
RuBisCO       ribulose bisphosphate carboxylase/oxygenase
SBP           solute binding protein
SFLD          Structure–Function Linkage Database
ShortBRED     “Short, Better Representative Extract Data Set”
SSN           Sequence Similarity Network
TCT           tricarboxylate transport
TRAP          tripartite ATP-independent periplasmic transporter
TRN           Taxonomic Rank Network
UniProt       Universal Protein Resource
UniProtKB     UniProt Knowledgebase
Sixteen years later, the UniProt database contains 88 588 026 nonredundant sequences (Figure 1; Release 2017_07); the number of sequences is increasing at the rate of 2.4% per month (doubling time 2.5 years), largely the result of microbial genome projects. The challenge is to devise “user friendly” methods to interrogate the massive amount of data so that hypotheses can be generated that direct experimental determination of in vitro activities and in vivo metabolic functions of uncharacterized enzymes. For example, 379 mechanistically diverse superfamilies and functionally diverse suprafamilies have been described; (2) additional superfamilies and suprafamilies must be present in (1) genomic “dark matter” that has not been curated by databases such as Pfam and (2) the genomes of phylogenetically diverse bacterial species that have not yet been systematically sequenced. (3) This large, and growing for the foreseeable future, set of superfamilies includes members that catalyze novel reactions in novel pathways, a boon to enzymologists.
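The quoted growth rate and doubling time can be checked against each other with a short calculation. The sketch below assumes that the 2.4% figure compounds monthly (the text does not state the compounding convention), and the function name is ours.

```python
import math


def doubling_time_years(monthly_growth_rate: float) -> float:
    """Doubling time in years for a quantity that compounds once per month.

    Solves (1 + r)^m = 2 for m (months), then converts months to years.
    """
    months = math.log(2) / math.log(1 + monthly_growth_rate)
    return months / 12


# 2.4% per month is the growth rate quoted for UniProt Release 2017_07.
print(f"{doubling_time_years(0.024):.1f} years")
```

Under this assumption the result is a little over 2.4 years, in line with the ∼2.5 year doubling time quoted above.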

Figure 1. Growth of the UniProt protein sequence database (Release 2017_07). The blue line represents the EMBL/TrEMBL sequences with automated annotations; the red line represents the EMBL/SwissProt with manually curated annotations. Currently, the doubling time is ∼2.5 years. The number of sequences decreased by ∼50% in April 2015 when UniProt identified reference proteomes for closely related species and archived the redundant proteomes.

Approximately 50% of the proteins in the databases have incorrect, uncertain, or unknown functional annotations. (4) The UniProt Knowledgebase (UniProtKB) is composed of two sections, UniProtKB/SwissProt and UniProtKB/TrEMBL. The annotations in UniProtKB/SwissProt are manually curated; the functional annotations in UniProtKB/TrEMBL are computationally assigned based on the function of the “closest” homologue. In the most recent UniProt release (2017_07), only 0.63% of the sequences are in the UniProtKB/SwissProt section (Figure 1); this fraction continues to decrease because the total number of sequences added in each release greatly exceeds the number of new sequences with SwissProt-curated, experimentally verified annotations. In principle, curated annotations might be extended to orthologues; however, the sequence boundaries between functions are unknown, so homology-based approaches for functional assignment are risky. Therefore, incorrect, uncertain, or unknown annotations will continue to propagate, compromising their utility to allow the discovery of new enzymatic functions, metabolic pathways, metabolites, and biology.
Khosla recently summarized this challenge: (5) “Although enzymology will remain a predominantly experimental science for the foreseeable future, one cannot avoid a sense of helplessness when one considers the huge (and growing) deficit in functionally annotated sequences. By now, there are approximately 100 million nonredundant protein sequence entries in GenBank, but a reliably curated protein database such as SwissProt contains fewer than 1 million entries. This is a quintessential ‘big data’ problem, where the rate at which data is generated continues to outpace the rate at which it is curated. It is unlikely that more resource-intensive curation alone can solve the problem. As the proverb says, this may be a situation where the most desirable approach will involve user-friendly tools that teach a novice how to fish instead of serving fish. Such tools could ideally capture the essence of an enzymologist’s judgment in layers of increasing sophistication, depending on the user’s actual needs.”
This Perspective describes “genomic enzymology” web tools that initially were developed by the Enzyme Function Initiative (EFI) (6) and provides examples of their applications.

Web Tools for Natural Product Discovery

In parallel with the development of genomic enzymology, the natural products community discovered that genes encoding biosynthetic pathways for natural products often are organized in “biosynthetic gene clusters” (BGCs). (7-9) Given the structural complexity of natural products and the need to identify the enzymes that assemble their backbones, e.g., terpene synthases, nonribosomal peptide synthases (NRPSs), and polyketide synthases (PKSs), as well as the enzymes that catalyze “tailoring” reactions, e.g., glycosylases, methylases, and redox enzymes, the genomic colocalization of the biosynthetic genes facilitates pathway discovery and experimental characterization. Although the type of scaffold may be apparent from the annotations in the BGCs, the structure of the natural product is not trivial to predict. Indeed, many enzymes (backbone-forming and tailoring) are novel members of diverse enzyme superfamilies. Nonetheless, the discovery of a BGC facilitates enzyme identification so that they can be experimentally tested for sequential activities in the biosynthetic pathway.
The number of natural products is estimated to be extremely large; (10, 11) therefore, identification of BGCs is an attractive strategy for their discovery. In the past several years, bioinformatic tools have been developed for discovering BGCs in sequenced genomes, (12, 13) including antiSMASH (Antibiotics & Secondary Metabolite Analysis SHell (14)), PRISM (PRediction Informatics for Secondary Metabolomes (15)), and RODEO (Rapid ORF Description and Evaluation Online (16)). These tools are widely used by the natural products/synthetic biology community, e.g., more than 300 000 jobs have been processed by the antiSMASH server (https://antismash.secondarymetabolites.org/). Although these tools enable the discovery of BGCs, the annotations of the uncharacterized enzymes in the BGCs are limited to their membership in protein families, an overview that often is insufficient to restrict substrate specificities and/or reaction identities/mechanisms. Therefore, many of the challenges in BGC characterization are the same as those encountered by enzymologists focused on small-molecule metabolic pathways (vide infra).

What Should Genomic Enzymology Tools Provide?

Genomic enzymology focuses on the discovery of function in the context of entire enzyme families: this approach allows recognition of sequence and structure attributes that are conserved for specific functions. Babbitt developed the Structure–Function Linkage Database (SFLD; http://sfld.rbvi.ucsf.edu/) to generate and disseminate sequence–structure relationships that associate specific functional properties with specific sequence and structure motifs in functionally diverse enzyme superfamilies. (17) As an early example of the use of genomic enzymology to obtain mechanistic insights, the recognition that (1) the reactions catalyzed by mandelate racemase and muconate lactonizing enzyme in the enolase superfamily require stabilization of an enolate anion intermediate and (2) their sequences have conserved motifs for binding an active site Mg2+ defined the catalytic strategy for the superfamily. (1, 18, 19) The functional diversity in the superfamily, including dehydration, deamination, cycloisomerization, racemization, and epimerization of carboxylate-anion substrates, could be explained by divergent evolution selecting (1) acid/base catalysts for both generating the enolate anion intermediate and directing it to products and (2) specificity determinants for binding different substrates in productive geometries relative to the acid/base catalysts. (20, 21) This same strategy for evolution of new enzymatic functions applies to many mechanistically diverse superfamilies. (2)
The challenges for genomic enzymology are developing and applying large-scale methods for (1) grouping members of mechanistically diverse superfamilies and functionally diverse suprafamilies in isofunctional families, e.g., identifying acid/base catalysts and placing restrictions on reaction mechanisms and substrate specificities, and (2) analyzing the genome contexts for the members of isofunctional families so that their roles in metabolic pathways can be deduced, e.g., predicting substrates, intermediates, and products.

Sequence Similarity Networks (SSNs)

Evolutionary biologists typically use phylogenetics-based approaches to distinguish orthologues from paralogues. (22, 23) Phylogenetic trees are constructed from multiple sequence alignments (MSAs); however, MSAs are difficult to generate for large protein families. (23) Many superfamilies and suprafamilies are large: >15 K sequences in the glycyl-radical enzyme superfamily, >22 K sequences in the OMP decarboxylase suprafamily, >44 K sequences in the enolase superfamily, >122 K sequences in the enoyl-CoA hydratase (crotonase) superfamily, and >250 K sequences in the radical SAM superfamily. In addition to being difficult to construct, trees for large families also are difficult to interpret because of their complexity. (24) Trees do not provide immediate access to all sequences in a family—representative sequences usually are selected in the construction of the tree. Instead, what is needed is a large-scale approach that allows easy visualization and analyses for all sequences in a family, recognizing that it must be “user friendly”, i.e., intuitive and fast.
Atkinson and Babbitt introduced sequence similarity networks (SSNs) to enable large-scale analyses of sequence–function relationships in protein families. (25) An SSN displays pairwise relationships obtained from an all-by-all sequence comparison, e.g., BLAST. Although the use of BLAST can be criticized because it provides a measure of overall sequence similarity and, therefore, may be insensitive to different domain architectures important in determining molecular function, it is (1) fast, a requirement for routine all-by-all comparisons of the sequences of members of increasingly large protein families (each sequence must be compared with every other sequence so the time required increases with the square of the number of sequences), and (2) familiar to experimentalists. An SSN contains “nodes” for sequences; “edges” that quantitate sequence similarity (pairwise sequence identity) connect nodes that share sequence similarity that exceeds a user-specified level (Figure 2). As the sequence similarity required to connect nodes with edges is increased, the nodes segregate into clusters; the goal is to select a level of sequence similarity that segregates the nodes/members of the family into isofunctional clusters (Figure 3).
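The way clusters emerge as the edge threshold is raised can be illustrated with a small sketch. It is not the EFI-EST implementation: the node names and scores are invented toy data, the function name is ours, and clusters are taken to be the connected components of the thresholded network.

```python
from collections import defaultdict


def ssn_clusters(nodes, pair_scores, threshold):
    """Group nodes into clusters linked by edges with score >= threshold.

    pair_scores maps (node_a, node_b) -> similarity score; pairs that are
    absent are treated as having no detectable similarity.
    """
    neighbors = defaultdict(set)
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    clusters, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first walk collects every node reachable from `start`.
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbors[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


# Hypothetical toy family of five sequences with made-up scores.
nodes = ["A", "B", "C", "D", "E"]
pair_scores = {
    ("A", "B"): 95, ("A", "C"): 40, ("B", "C"): 45,
    ("C", "D"): 30, ("D", "E"): 80, ("A", "E"): 12,
}

print(ssn_clusters(nodes, pair_scores, 25))  # [['A', 'B', 'C', 'D', 'E']]
print(ssn_clusters(nodes, pair_scores, 42))  # [['A', 'B', 'C'], ['D', 'E']]
print(ssn_clusters(nodes, pair_scores, 90))  # [['A', 'B'], ['C'], ['D'], ['E']]
```

In this toy case a permissive threshold leaves a single cluster, an intermediate one separates two groups, and a stringent one leaves mostly singletons, which is the kind of progression shown for a real family in Figure 3.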

Figure 2. A sequence similarity network (SSN) showing the protein sequence nodes and pairwise sequence similarity edges.

Figure 3. SSNs for sequences from the proline racemase family (Pfam family PF05544). (A) Alignment score ≥15, ≥22% pairwise sequence identity. (B) Alignment score ≥20, ≥25% pairwise sequence identity. (C) Alignment score ≥50, ≥35% sequence identity. (D) Alignment score ≥70, ≥40% sequence identity. (E) Alignment score ≥90, ≥48% sequence identity. (F) Alignment score ≥110, ≥58% sequence identity. The colors in panel F are used to color the nodes in panels A–E.

SSNs contain “node attributes”, including functional and phylogenetic information associated with each sequence/node, that assist the user in analyzing sequence–function relationships, including choosing sequence similarity thresholds for drawing edges and segregating the families into isofunctional clusters. Atkinson and Babbitt compared SSNs with phylogenetic trees and concluded “the most valuable feature of SSNs is not the optimal or most accurate display of sequence similarity, but rather the flexible visualization of many alternate protein attributes for all or nearly all sequences in a superfamily”. (25)
SSNs are viewed using Cytoscape (http://cytoscape.org/), “an open source platform for visualizing complex networks and integrating these with attribute data”. (26) Although Cytoscape has a steep “learning curve”, it provides Control Panels to select nodes based on the node attributes and to filter and color the networks to enable visual analyses. With node attributes and the Control Panels, SSNs viewed with Cytoscape satisfy Khosla’s vision that genomic enzymology tools “could ideally capture the essence of an enzymologist’s judgment in layers of increasing sophistication, depending on the user’s actual needs”. (5)
The SFLD provides SSNs for several functionally diverse superfamilies with manually curated (labor intensive and expensive) annotations/node attributes; (17) these SSNs serve as “gold standards” for functional annotation in both the bioinformatics and enzymology communities. (27) However, with the large number of superfamilies/suprafamilies (vide infra) and families that provide additional metabolic enzymes, e.g., dehydrogenases, kinases, and aldolases, community-initiated generation of SSNs is necessary. The SFLD does not provide this capability; Pythoscape was developed by the SFLD for generating large SSNs, but it is not “user friendly” for most experimentalists because it requires access to a computer cluster and programming expertise. (28)
In principle, the construction of SSNs is “simple”, i.e., connecting sequences with edges that quantitate similarity. However, most experimentalists would be hard-pressed to develop their own programs for generating SSNs. And, other web tools that construct SSNs, e.g., Pclust (29) and CLANS, (30) use a limited number of sequences and/or node attributes.
The EFI developed a web tool, the Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST; http://efi.igb.illinois.edu/efi-est/), (31) to generate SSNs for large protein families. To date, >1600 unique users have submitted jobs to EFI-EST, and >50 publications have appeared that reference the use of EFI-EST. (13, 14, 32-78) EFI-EST uses sequences and node attribute information from UniProt: in contrast to the NCBI database, annotations in the UniProt database can be changed with data provided by any member of the community, allowing important corrections and additions that diminish propagation of annotation errors.
EFI-EST now provides four options for selecting sequences to be included in the SSN: Option A, a single user-supplied sequence is used to collect homologues with BLAST from the UniProt database (maximum 10 000 sequences); Option B, the user specifies one or more UniProt and/or InterPro families [currently limited to ≤255,000 sequences to allow the SSN for the radical SAM superfamily (Pfam family PF04055) to be generated]; Option C (enhanced in the most recent update), the user provides a FASTA file of sequences and selects whether accession IDs in the headers are used to retrieve node attributes from UniProt; and Option D (new in the most recent update), the user provides a list of UniProt and/or NCBI accession IDs. After the all-by-all comparison using BLAST, the user selects an “alignment score” based on pairwise percent identity to filter the edges (the threshold for drawing edges to connect nodes). The user then downloads the SSN for analysis with Cytoscape.
EFI-EST now provides a “Color SSN Utility” to facilitate analyses of SSNs by (1) coloring each cluster in an input SSN with a unique color, (2) providing a file with color information that allows the user to color SSNs of the same sequences generated with lower similarity (pairwise identity) to track segregation of clusters (e.g., Figure 3), and (3) providing FASTA files for the sequences in each cluster to facilitate the generation of MSAs.
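The idea of carrying colors from a stringent SSN back to a more permissive one can be sketched as follows. This is an illustration only, not the Color SSN Utility itself: the function names, the three-color palette, and the toy clusters are all assumptions.

```python
def color_nodes(clusters, palette):
    """Give every node the color of its cluster in the stringent SSN.

    Colors are reused cyclically if there are more clusters than colors.
    """
    return {
        node: palette[i % len(palette)]
        for i, cluster in enumerate(clusters)
        for node in cluster
    }


def colors_in_cluster(cluster, node_colors):
    """List the distinct stringent-SSN colors found in one permissive cluster."""
    return sorted({node_colors[node] for node in cluster})


# Hypothetical clusters from a stringent threshold and a toy palette.
stringent_clusters = [["A", "B"], ["C"], ["D", "E"]]
node_colors = color_nodes(stringent_clusters, ["red", "blue", "green"])

# A cluster from a more permissive threshold that merges two stringent ones.
print(colors_in_cluster(["A", "B", "C"], node_colors))  # ['blue', 'red']
```

A permissive cluster that contains more than one color is one that splits when the threshold is raised, which is how the coloring helps track segregation.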

Applications of SSNs

The EFI used SSNs from the SFLD to characterize sequence–function space in targeted functionally diverse superfamilies (amidohydrolase, (79-85) enolase, (19, 86-92) glutathione S-transferase, (93) haloalkanoate dehalogenase, (94) and isoprenoid synthase (95, 96)) and select targets for functional discovery. Then, when EFI-EST became available, both the EFI and the community began to use SSNs to characterize sequence–function space in a wide range of protein families.
SSNs generated by the community using EFI-EST (13, 14, 32-78) have been used to identify and describe potential isofunctional families within enzyme families, e.g., clusters with different (but unknown) substrate specificities, thereby providing an overview of sequence–function space in specificity diverse superfamilies (different substrates but same type of overall reaction) and functionally diverse superfamilies (different substrates and different reaction mechanisms, although a partial reaction may be conserved). SSNs also provide the ability to survey the members of a protein family for different domain architectures that may suggest different functional contexts, i.e., fusion proteins in different pathways. And, the pathway for cluster segregation as sequence similarity increases (Figure 3) may suggest functional linkages between clusters. Several community-generated SSNs from the recent literature that illustrate their use are shown in Figure 4; readers are referred to the publications for detailed descriptions. (13, 14, 32-78)

Figure 4. Examples of SSNs generated with EFI-EST that were included in recent publications. (A) SSN for isopeptidases involved in lasso peptide synthesis. (43) (B) SSN of precursor peptides for microviridin synthesis. (60) (C) SSN of LanMs in lantibiotic synthesis. (76) (D) SSN for ferredoxins compared with a phylogenetic tree. (40) (E) SSN for IspH in isoprenoid biosynthesis. (56) (F) SSNs for members of the DRE-TIM metallolyase superfamily. (52) Figures reproduced with permission from refs 40, 43, 52, 56, 60, and 76.

Genome Neighborhood Networks (GNNs)

With the potential to segregate protein families into isofunctional clusters using SSNs, the second genomic enzymology challenge is to place these clusters in a functional context, e.g., identify the small-molecule metabolic pathways in which uncharacterized enzymes participate. In eubacteria, archaea, and fungi, the enzymes in a metabolic pathway often are encoded by a gene cluster or operon (just as the biosynthetic pathways for natural products are encoded by BGCs). Therefore, the proteins encoded by the genes proximal to those that encode members of an isofunctional cluster (orthologues) may allow the number and types of reactions in the metabolic pathway to be determined if these are conserved by the members of the cluster.
Genome neighborhoods for homologues can be examined using web resources such as JGI-IMG/M (https://img.jgi.doe.gov/cgi-bin/m/main.cgi); however, complete pathways are not always encoded by a single genome neighborhood. Large-scale mining of genome neighborhoods for all orthologues in an SSN cluster is advantageous because operon/gene cluster organization may not be preserved across species; i.e., the sequences in an isofunctional SSN cluster may have diverse genome neighborhoods and pathway neighbors, but surveying all of the neighborhoods provides the potential to identify all of the functionally linked genes/enzymes that can be assembled into a metabolic pathway.
In 2014, the EFI described a genome neighborhood analysis that was applied to the proline racemase family (Pfam family PF05544) using an all-by-all comparison (with BLAST) of the neighbors to generate a network (the genome neighborhood network, GNN); (97) the neighbors were segregated into protein families using an e-value >20 for the edges in the SSN. By assigning unique colors to the clusters in the SSN (Figure 5A) and coloring the neighbors in the GNN with the same color, the neighbors for the sequences in each cluster were identified (Figure 5B). Then, candidates for functionally linked enzymes were recognized and potential pathways were predicted. This analysis allowed in vitro enzymatic activities and in vivo metabolic functions (the three pathways shown in Figure 5C) to be assigned to 85% of the sequences in the family [2333 sequences in InterPro Release 43.0 (July 2013)].

Figure 5. (A) A colored SSN for the proline racemase family (PF05544; InterPro Release 43.0). (B) The GNN generated by an all-by-all BLAST of the genome neighbors. (C) Three pathways catalyzed by members of the proline racemase family. The nodes in the GNN (panel B) are colored using the color clusters in the SSN (Panel A). Figures reproduced with permission from ref 97.

The EFI subsequently developed the Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT; http://efi.igb.illinois.edu/efi-gnt/) to provide a “user friendly” interface for generating GNNs to facilitate the identification of pathway/metabolic context for isofunctional clusters in SSNs. Although EFI-GNT has not yet been “officially” announced with a detailed publication (a manuscript describing the updated version of EFI-EST and EFI-GNT is in preparation for publication later this year), >250 unique users have accessed the web tool that is available for community use.
An SSN generated by EFI-EST is the input for EFI-GNT [Figure 6A; 6419 sequences in the proline racemase family in InterPro Release 63.0 (May 2017)]. EFI-GNT assigns a unique color (from a palette of 1513 colors) to each cluster (Figure 6B). It then interrogates the European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) database for the neighbors of each sequence in each cluster in the input SSN (for eubacteria, archaea, and fungi), and the neighbors are associated with their Pfam families. The co-occurrence frequencies of the queries in the SSN cluster with the neighbors as well as the absolute values of the distances in open reading frames (orfs) between the queries and neighbors are calculated. Functionally linked genes encoding a pathway are expected to have (1) large query-neighbor co-occurrence frequencies (diminished if operon/gene cluster organization is phylogenetically diverse) and (2) short distances between the queries and neighbors.
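The two statistics described here can be illustrated with a minimal sketch. It is not the EFI-GNT code: the input layout, the function name, and the toy neighborhoods are hypothetical, and co-occurrence frequency is assumed to mean the fraction of queries in the cluster that have at least one neighbor from a given Pfam family within the window.

```python
from collections import defaultdict
from statistics import median


def neighbor_stats(neighborhoods, window=10):
    """Summarize genome neighbors for the queries of one SSN cluster.

    neighborhoods maps query ID -> list of (pfam_family, signed orf offset).
    Returns pfam_family -> (co-occurrence frequency, median |distance| in orfs).
    """
    queries_with_family = defaultdict(set)
    distances = defaultdict(list)
    for query, neighbors in neighborhoods.items():
        for pfam, offset in neighbors:
            # Skip the query itself and genes outside the +/- window.
            if offset == 0 or abs(offset) > window:
                continue
            queries_with_family[pfam].add(query)
            distances[pfam].append(abs(offset))

    n_queries = len(neighborhoods)
    return {
        pfam: (len(queries) / n_queries, median(distances[pfam]))
        for pfam, queries in queries_with_family.items()
    }


# Hypothetical neighborhoods for four queries (offsets are made up).
neighborhoods = {
    "Q1": [("PF01266", -1), ("PF00701", 2), ("PF00171", 3)],
    "Q2": [("PF01266", 1), ("PF00701", -2)],
    "Q3": [("PF01266", -1), ("PF00005", 4)],
    "Q4": [("PF00701", 2), ("PF00171", -12)],
}

for pfam, (freq, dist) in sorted(neighbor_stats(neighborhoods).items()):
    print(pfam, freq, dist)
```

In the toy data, the PF00171 neighbor of Q4 lies 12 orfs away and is ignored with the default ±10 orf window, so that family co-occurs with only one of the four queries.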

Figure 6. (A) SSN for the proline racemase family (PF05544, InterPro Release 63.0) segregated with an alignment score of ≥110 (≥58% pairwise sequence identity). (B) Colored SSN generated by the EFI-GNT web tool. (C, D) GNN with SSN cluster hub-nodes and Pfam family spoke-nodes. (E, F) GNN with Pfam family hub-nodes and SSN cluster spoke-nodes. The GNNs were generated with a ±10 orf genome neighborhood window and a query-neighbor co-occurrence threshold of 20%.

EFI-GNT provides GNNs in two formats. In one format (Figure 6C,D), a cluster is present for each SSN cluster: the hub-node represents the sequences in the SSN cluster (colored with a unique color so that it can be easily identified in a colored version of the input SSN that is generated), and the spoke-nodes represent the neighbor Pfam families; this format allows the user to identify the pathway enzymes. In the second format, a cluster is present for each neighbor Pfam family: the hub-node represents the Pfam family, and the spoke-nodes represent the SSN clusters that identified the neighbors (Figure 6E,F); this format allows the user to assess whether the similarity (edge) threshold used to generate the input SSN was too large (pairwise identity too large) so that orthologues are segregated in multiple clusters, with these identifying the same Pfam family neighbors and pathway.
In both GNN formats, the co-occurrence frequencies of the SSN queries and neighbors are the values of the edges between the hub- and spoke-nodes: if the co-occurrence frequency exceeds a user-specified threshold, the edge and spoke-node are present. From the co-occurrence frequencies, the user can identify neighbors that “always” occur with the query (the same conserved operon/gene cluster) as well as those that are less frequently associated (operon/gene cluster in some species; dispersed genes in other species).
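Both hub-and-spoke views can be derived from the same table of co-occurrence frequencies, as the sketch below illustrates. The cluster numbers, frequencies, and function name are hypothetical, and an edge is kept when the frequency is at least the threshold (whether the tool treats the boundary as inclusive is an assumption here).

```python
from collections import defaultdict


def build_gnns(cooccurrence, threshold):
    """Build both GNN views from (ssn_cluster, pfam_family) -> frequency.

    Returns (cluster_hubs, pfam_hubs); each maps a hub to its spokes and the
    co-occurrence frequency stored on the connecting edge.
    """
    cluster_hubs = defaultdict(dict)
    pfam_hubs = defaultdict(dict)
    for (cluster, pfam), freq in cooccurrence.items():
        if freq >= threshold:
            cluster_hubs[cluster][pfam] = freq
            pfam_hubs[pfam][cluster] = freq
    return dict(cluster_hubs), dict(pfam_hubs)


# Hypothetical frequencies for two SSN clusters.
cooccurrence = {
    (1, "PF01266"): 0.90, (1, "PF00701"): 0.80,
    (2, "PF01266"): 0.85, (2, "PF00392"): 0.10,
}

cluster_hubs, pfam_hubs = build_gnns(cooccurrence, threshold=0.20)
print(cluster_hubs)  # {1: {'PF01266': 0.9, 'PF00701': 0.8}, 2: {'PF01266': 0.85}}
print(pfam_hubs)     # {'PF01266': {1: 0.9, 2: 0.85}, 'PF00701': {1: 0.8}}
```

In the second view, a Pfam hub with several SSN clusters as spokes (PF01266 in the toy data) is the pattern that may indicate orthologues were split across clusters by an overly stringent SSN threshold.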
EFI-GNT also provides files with the UniProt IDs for the sequences in each neighbor Pfam family that can be used to identify the neighbors in the SSNs for their families. This mapping (1) assists the selection of alignment score thresholds for segregating the neighbor SSNs into isofunctional clusters/families and (2) provides useful context about possible functional (substrate specificity and reaction mechanism) relationships that may be useful in deducing in vitro activities and in vivo metabolic functions.

Integrated Use of SSNs and GNNs To Discover Metabolic Pathways

The synergistic “power” of the EFI-EST and EFI-GNT web tools for functional annotation of bacterial and fungal enzymes is the ability to (1) segregate protein families into isofunctional clusters in an SSN using EFI-EST (the sequences in a cluster have the same genome context) and (2) use the SSN as the input for EFI-GNT to interrogate and visualize genome neighborhood context for the isofunctional clusters in the GNN. To the best of our knowledge, no other web tools provide this integrated capability.
The GNN format in which the hub-node represents the SSN cluster and the spoke-nodes represent the Pfam families (Figure 6C,D) can be used to identify the enzymes, transcriptional regulators, and transporters in a metabolic pathway. For example, continuing with the proline racemase family (PF05544; SSN in Figure 6A,B), the enzymes in a catabolic pathway for the conversion of trans-4-hydroxyproline to α-ketoglutarate (middle pathway in Figure 5C) can be identified for cluster 16 in the input SSN (Figure 6D, 792 sequences with genome neighborhoods in the ENA files). In addition to 4-hydroxyproline epimerase (the queries in cluster 16 and the SSN hub-node in the GNN cluster in Figure 6D), the Pfam family spoke-nodes of the GNN cluster identify the three remaining enzymes in the pathway: (1) cis-4-hydroxyproline oxidase, a member of the d-amino acid oxidase family (“DAO” in Figure 6D; PF01266, co-occurrence frequency, 0.91, median distance 1.0 orfs); (2) cis-4-hydroxyproline imino acid dehydratase/deaminase, a member of the dihydrodipicolinate synthase family (“DHDPS”; PF00701, co-occurrence frequency, 0.82, median distance 2.0 orfs); and (3) α-ketoglutarate semialdehyde dehydrogenase, a member of the aldehyde dehydrogenase family (“Aldedh”; PF00171, co-occurrence frequency, 0.66, median distance 2.0 orfs). The curations provided by Pfam provide essential clues for deducing the identities of the reactions catalyzed by the various neighboring enzymes (conserved reaction mechanisms).
The GNN in Figure 6D also includes (1) the ATP-binding component of an ABC transport system (“ABC_trans”, PF00005, co-occurrence frequency, 0.35, median distance 4.0 orfs), (2) an additional membrane component of the ABC transport system (“BPD_transp_1”, PF00528, co-occurrence frequency, 0.31, median distance 3.0 orfs), and (3) a bidomain transcriptional regulator (“GntR-FCD”, PF00392 and PF07729, co-occurrence frequency, 0.67, median distance 3.0 orfs).
The GNN analysis also recognizes genome neighbors that are not associated with any Pfam family (“none” in Figure 6D; ∼15% of the proteins in UniProt are not associated with a Pfam family). These sequences can contain protein families currently not curated by Pfam; these families can be defined by generating SSNs for these sequences using Option D of EFI-EST.
The GNN in Figure 6D was generated with a minimum co-occurrence frequency of 0.30. At lower co-occurrence frequencies (Figure 7), members of four families of solute binding proteins [SBPs; Peripla_BP_6 (PF13458), SBP_bac_3 (PF00497), Peripl_BP_8 (PF13416), and SBP_bac_5 (PF00496)] for ABC transport systems also are genome proximal to the SSN queries with co-occurrence frequencies of 0.16, 0.11, 0.07, and 0.03, respectively, and median distances of 6.0, 5.0, 2.0, and 6.0 orfs, respectively. Also members of the major facilitator superfamily (MFS_1, PF07690) and an amino acid permease family (AA_permease_2 family, PF13520) are genome proximal to the SSN queries with co-occurrence frequencies of 0.15 and 0.11, respectively, and median distances of 9.0 and 2.0 orfs, respectively. The enzymes in metabolic pathways usually are conserved (orthologues instead of analogues; vide infra), but transport systems and transcriptional regulators often are not conserved, so members of multiple families of transporters and regulators may be genome proximal to the queries in the SSN cluster.
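The two statistics reported for each Pfam spoke-node, and the effect of the minimum co-occurrence frequency, can be illustrated with a minimal Python sketch. It assumes that co-occurrence frequency is the fraction of queries in the SSN cluster with at least one neighbor from a given Pfam family, and that median distance is the median absolute gene distance in orfs; the function name `gnn_spokes`, the input layout, and the toy data are hypothetical and are not taken from EFI-GNT or from Figure 6D.

```python
from collections import defaultdict
from statistics import median


def gnn_spokes(neighborhoods, min_cooccurrence=0.30):
    """Summarize the Pfam neighbors of the queries in one SSN cluster.

    neighborhoods: one list per query; each list holds
        (pfam_id, signed_distance_in_orfs) tuples for its genome neighbors.
    Returns {pfam_id: (co-occurrence frequency, median |distance|)} for the
    families that reach min_cooccurrence.
    """
    n_queries = len(neighborhoods)
    queries_with = defaultdict(int)   # queries having >= 1 neighbor in family
    distances = defaultdict(list)     # all absolute distances per family
    for neighbors in neighborhoods:
        seen = set()
        for pfam, dist in neighbors:
            distances[pfam].append(abs(dist))
            seen.add(pfam)
        for pfam in seen:
            queries_with[pfam] += 1

    spokes = {}
    for pfam, count in queries_with.items():
        freq = count / n_queries
        if freq >= min_cooccurrence:
            spokes[pfam] = (freq, median(distances[pfam]))
    return spokes


# Hypothetical toy neighborhoods for four queries (illustration only).
toy = [
    [("PF01266", 1), ("PF00701", -2)],
    [("PF01266", -1), ("PF00005", 4)],
    [("PF01266", 1), ("PF00701", 2)],
    [("PF00171", 3)],
]
print(gnn_spokes(toy, min_cooccurrence=0.30))
print(gnn_spokes(toy, min_cooccurrence=0.10))
```

In this toy case, lowering the threshold from 0.30 to 0.10 adds the two families that occur near only one of the four queries, mirroring how additional transporter families appear at lower co-occurrence frequencies.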

Figure 7. GNN for SSN cluster 16 presented at different query-neighbor co-occurrence frequencies. (A) 3%. (B) 5%. (C) 10%. (D) 12%. (E) 15%. (F) 20%.

Figure 7 illustrates the ability of GNNs to analyze genome neighborhoods as a function of co-occurrence frequency, thereby allowing the identification of pathways that may be encoded by single genome neighborhoods in some species and multiple genome neighborhoods in other species. An example of the utility of this capability is described in the next section. (34)

Use of Transport System SBPs To Anchor Pathway Prediction Using SSNs and GNNs

For uncharacterized pathways, pathway prediction is facilitated by independent information about the substrate for the first enzyme in the pathway. For microbial enzymes in catabolic pathways, such information can be obtained from the identity of the solute for the transporter (or the ligand for a transcriptional regulator). For ABC, TRAP, and TCT transport systems, the solute is conveyed to the membrane components by a soluble extracellular (Gram-positive)/periplasmic (Gram-negative) solute binding protein (SBP); SBPs can be purified on a large scale and subjected to ligand screening with differential scanning fluorimetry (DSF)/ThermoFluor using a physical library of small molecules. (98) These ligand specificities anchor the pathway by identifying the substrate for the first enzyme; the Pfam families of the neighbors allow the reactions to be predicted. Experiments, both in vitro and in vivo, are required to validate the pathway.
Using this strategy, which combines experimentally determined ligands for SBPs with the synergistic use of SSNs and GNNs to identify pathway components, the EFI identified several novel catabolic pathways. A particularly informative example is the discovery of catabolic pathways for the three tetritols, d-threitol, l-threitol, and erythritol, in Mycobacterium smegmatis. (34) Ligand screening identified one SBP for an ABC transporter that bound d-threitol; a genome-proximal dehydrogenase catalyzed its oxidation; however, other catabolic enzymes were encoded elsewhere in the genome (Figure 8A). These “missing” enzymes were discovered by first constructing the SSN for the d-threitol dehydrogenase and then the GNN for the cluster containing the dehydrogenase; this identified a d-erythrulose kinase that was encoded by a gene cluster distal to the one containing the SBP and d-threitol dehydrogenase in M. smegmatis (but not in other species that encode the pathway). The SSN for the kinase family was then constructed, and the cluster containing the d-erythrulose kinase was used to construct the GNN; this identified a second gene cluster, distal to both the one containing the SBP and d-threitol dehydrogenase and the one containing the d-erythrulose kinase, that contained isomerases to complete the d-threitol pathway. Investigation of other genes in both distal clusters allowed identification of the remaining enzymes in the pathway for d-threitol catabolism as well as the enzymes in the pathways for l-threitol and erythritol catabolism (Figure 8B). The ligand specificity of a single SBP was sufficient to identify enzymes for three catabolic pathways encoded by three distal gene clusters.

Figure 8. (A) Strategy for discovering catabolic pathways for d-threitol, l-threitol, and erythritol in M. smegmatis using differential scanning fluorimetry (DSF) to screen the ligand specificities of SBPs and the integrated use of SSNs and GNNs to discover the pathway enzymes. (B) Catabolic pathways for d-threitol, l-threitol, and erythritol. (C) Catabolic pathways for d-threonate, l-threonate, and d-erythronate in R. eutropha H16. (59) Figures in panels A and B reproduced with permission from ref 34; figure in panel C reproduced with permission from ref 59.

The EFI also used this strategy to assign functions to members of Domain of Unknown Function 1537 (DUF 1537; approximately 20% of the 16 712 Pfam families in Release 31.0 are families of DUFs or proteins of unknown function). (59) Using the specificities for four SBPs for TRAP transport systems for four-carbon acid sugars, including d-erythronate and l-erythronate, SSNs and GNNs were used to identify two genome neighborhoods in Ralstonia eutropha H16 that encode enzymes in catabolic pathways for d-threonate, l-threonate, and d-erythronate (Figure 8C). Members of the DUF1537 family (Pfam families PF07005 and PF17402) were determined to be kinases for four-carbon acid sugars, identifying a previously uncharacterized family of kinases. In addition, members of the PdxA2 family (PF04166) were determined to be oxidative decarboxylases that generate dihydroxyacetone phosphate (DHAP) and CO2.
In unpublished work, the specificities of three ABC SBPs for d-apiose, a branched-chain pentose found in plant cell walls, and the iterative use of SSNs and GNNs have been used to discover five catabolic pathways for d-apiose, two of which are found in species in the human gut microbiome (humans ingest plant cell walls; species of Bacteroides can degrade the rhamnogalacturonan-II component that contains d-apiose, releasing d-apiose that can be catabolized (99)). Two pathways include novel RuBisCO-like proteins (RLPs) from the RuBisCO superfamily: one catalyzes a β-ketoacid decarboxylation, and the second catalyzes a “transcarboxylation” in which the substrate is decarboxylated (β-ketoacid decarboxylation), the sequestered CO2 is used to carboxylate the enediolate intermediate on the adjacent carbon, and the resulting isomeric β-ketoacid undergoes hydrolysis as in the canonical RuBisCO reaction. The experimentally determined specificities of three SBPs anchored the discovery of five pathways by identifying the substrates; the iterative use of SSNs and GNNs identified the enzymes.

Comments

The success of the integrated application of SSNs and GNNs to discover metabolic pathways is limited by the proximities of the genes encoding the pathway components, so this analysis may not be successful for all functional assignment problems. However, the large-scale nature of the analyses provides the potential to determine whether colocalization of genes is due to limited genetic drift among similar genomes or pathway conservation among phylogenetically diverse genomes; it also allows identification of low co-occurrence frequency but significant clustering of the genes encoding multiple pathway components that would be tedious to discover by examination of large numbers of individual genome neighborhoods. (34)
Also, SSNs provide the ability to segregate members of mechanistically diverse superfamilies and functionally diverse suprafamilies into isofunctional clusters (families). For enzymes an important test of isofunctionality is that the GNN generated for an SSN cluster identifies the components of a single pathway. The iterative use of SSNs and GNNs not only provides a test of isofunctionality but also a method for determining the minimum SSN alignment score required to achieve isofunctionality. If the GNN for an SSN cluster identifies “too many” components for a single pathway, further segregation of the cluster with a larger alignment score into “daughter” clusters may allow the resolution of the pathways. The reader should recognize that achieving isofunctional clusters in an SSN may not be straightforward, e.g., even within the same superfamily different alignment scores may be required to achieve isofunctional clusters. However, the integration of SSNs and GNNs using EFI-EST and EFI-GNT provides a powerful strategy for assessing and achieving isofunctional clusters.
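The segregation of a cluster into “daughter” clusters as the alignment score is raised can be sketched as a connected-components calculation on the edges that survive the threshold. The following Python sketch is only an illustration under simple assumptions: the SSN is supplied as a node list plus scored edges, and the function name `ssn_clusters` and the toy network are hypothetical rather than part of EFI-EST.

```python
def ssn_clusters(nodes, edges, min_score):
    """Connected components of an SSN using edges with score >= min_score.

    nodes: iterable of sequence identifiers.
    edges: iterable of (node_a, node_b, alignment_score) tuples.
    Returns a list of sets, largest cluster first.
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        if score >= min_score:
            parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical toy network: five sequences joined by four scored edges.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 120), ("B", "C", 60), ("C", "D", 110), ("D", "E", 45)]

print(ssn_clusters(nodes, edges, min_score=40))   # one parent cluster
print(ssn_clusters(nodes, edges, min_score=100))  # three daughter clusters
```

Generating a GNN for each daughter cluster at successive thresholds would then show the score at which each cluster maps to a single pathway, which is the test of isofunctionality described above.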

Chemically Guided Functional Profiling: Building on EFI-EST

With ∼50% of the proteins in the sequence databases having incorrect, uncertain, or unknown functions, devising a target selection strategy is a major challenge for functional assignment. The SSNs for functionally diverse enzyme families often have many uncharacterized clusters—the problem is deciding which are worth experimental characterization. One approach is to select those that are most biologically relevant, but how is that achieved in the absence of knowledge of their functions?
Balskus and Huttenhower recently described a strategy for choosing biologically relevant targets termed “chemically guided functional profiling”. (72) This strategy involves (1) construction of the SSN for a targeted protein family segregated into isofunctional families and (2) mapping the abundance of metagenome reads to the clusters in the SSN, with the uncharacterized clusters that have the largest number of metagenome markers receiving the highest priority for functional characterization (Figure 9A). ShortBRED (100) provides a fast and accurate method to profile metagenome samples and uses sequence fragments from the clusters in the SSN (“markers”) to identify homologous sequences in the metagenome reads; their abundance is then mapped to the SSN clusters to accomplish target selection.
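The target selection step, mapping marker abundance onto SSN clusters and ranking the uncharacterized ones, can be sketched in a few lines of Python. This is a simplified illustration, not the published workflow and not ShortBRED's interface: the function name `rank_targets`, the marker identifiers, the cluster assignments, and the abundance values are all hypothetical.

```python
def rank_targets(marker_to_cluster, marker_abundance, characterized):
    """Rank uncharacterized SSN clusters by summed marker abundance.

    marker_to_cluster: {marker_id: ssn_cluster_id}
    marker_abundance:  {marker_id: abundance in the metagenome samples}
    characterized:     set of cluster ids that already have assigned functions
    Returns [(cluster_id, total_abundance), ...], most abundant first.
    """
    totals = {}
    for marker, cluster in marker_to_cluster.items():
        # Markers with no hits in the reads contribute zero abundance.
        totals[cluster] = totals.get(cluster, 0.0) + marker_abundance.get(marker, 0.0)
    candidates = [(c, a) for c, a in totals.items() if c not in characterized]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Hypothetical markers, clusters, and abundances (illustration only).
marker_to_cluster = {"m1": 1, "m2": 15, "m3": 15, "m4": 16, "m5": 20}
marker_abundance = {"m1": 9.0, "m2": 4.0, "m3": 2.5, "m4": 3.0}
characterized = {1}

print(rank_targets(marker_to_cluster, marker_abundance, characterized))
```

Clusters at the top of the returned list would be the first candidates for experimental characterization; a real analysis would also need to normalize abundances across samples, which this sketch omits.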

Figure 9. (A) Strategy for chemically guided functional profiling. (B) SSN for the glycyl radical enzyme superfamily showing clusters with previously assigned functions as well as clusters (15 and 16) for which chemically guided functional profiling was used to leverage experimental functional assignment. Figures reproduced with permission from ref 72.

The utility of chemically guided functional profiling was demonstrated using the glycyl radical enzyme (GRE) superfamily; the reactions are initiated by abstraction of a hydrogen atom from the substrate by a glycine-centered backbone radical (generated by an activase from the S-adenosyl methionine superfamily). The metagenome samples used for target selection were from the human gut microbiome, so uncharacterized members of the GRE superfamily are likely involved in reactions that allow the microbiome to utilize small molecules in the gut. Balskus previously had identified choline trimethylamine-lyase (CutC) in human gut microbiome species; CutC catalyzes the cleavage of choline to acetaldehyde and trimethylamine, the latter involved in the production of methane as well as implicated in human diseases via its N-oxide. (101, 102)
The SSN for the GRE family is shown in Figure 9B. The functionally assigned clusters are colored, as are two clusters (15 and 16) that were identified as abundant in the human gut microbiome. Both of the latter clusters were hypothesized to be dehydratases based on conserved active site residues associated with known dehydratase reactions. Cluster 15 was characterized as a 4-hydroxyproline dehydratase; again, genome context was used to predict the substrate because of its proximity to Δ1-pyrroline-5-carboxylate (P5C) reductase that reduces P5C that would be derived from dehydration of 4-hydroxyproline to proline. Cluster 16 was characterized as a novel (S)-1,2-propanediol dehydratase (a previously characterized analogue is an adenosylcobalamin-dependent enzyme); the identity of the substrate was suggested from genome analysis because the enzyme is found in Roseburia inulinivorans that catabolizes l-fucose but lacks the adenosylcobalamin-dependent dehydratase.
A “user friendly” web tool is not yet available to allow the community to use “chemically guided functional profiling” with their favorite families. However, the development of such a web tool is a high-priority goal given its ability to identify important targets for functional characterization.

AGeNNT and Refined GNNs: Building on EFI-GNT

EFI-GNT provides GNNs in two formats that summarize (1) the Pfam families identified by each SSN cluster (edges between SSN cluster hub-nodes and Pfam family spoke-nodes), providing information about the reactions in metabolic pathways, and (2) the SSN clusters that identify each Pfam family (edges between Pfam family hub-nodes and SSN cluster spoke-nodes), providing information about whether multiple clusters may contain orthologues.
Merkl and co-workers recently described AGeNNT (Automatically Generates refined Neighborhood NeTworks), a Java application that uses the GNNs provided by EFI-GNT to generate a third format (“refined GNN”) in which all of the SSN cluster and Pfam family nodes are connected by edges. (71) Clusters that contain orthologues, identified when they share the same genome neighbors, can be distinguished from clusters that have different genome contexts. An SSN is submitted to the EFI-GNT web tool. AGeNNT then generates the refined GNN. Several options are provided, including (1) eliminating overrepresented phylogenetically related subspecies from the input SSN to reduce redundancy in the GNN and (2) using a user-defined “whitelist” of Pfam families to include in the refined GNN. For example, only Pfam families for enzymes can be included in the refined GNN so Pfam cluster connections between SSN clusters that involve transporters and transcriptional regulators are eliminated (in contrast to pathway enzymes, transporters and transcriptional regulators are not conserved).
Continuing again with the proline racemase family (PF05544) to provide an example, several major clusters from the SSN were selected for generation of GNNs using EFI-GNT and the refined GNN using AGeNNT (Figure 10). The colored SSN is shown in Figure 10A, the SSN cluster hub-node GNN format is shown in Figure 10B, the Pfam family hub-node GNN format is shown in Figure 10C, and the refined GNN is shown in Figure 10D (Pfam families for transport systems and transcriptional regulators are deleted in the GNNs; because these families are not conserved in pathways (vide supra), their inclusion in the refined GNN can complicate the analysis). Comparison of the refined GNN with the GNNs establishes the utility of the refined GNN in identifying orthologous SSN clusters: clusters 2, 4, 5, and 6 are orthologous 4-hydroxyproline epimerases; clusters 1 and 3 are orthologous trans-3-hydroxyproline dehydratases; and cluster 7 is proline racemase (using functional assignments based on experimental verification (97)). Building on EFI-EST and EFI-GNT, AGeNNT links SSN clusters that share pathway context, potentially identifying interrelations of subfamilies within a protein family.
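The idea behind the refined GNN, connecting SSN clusters that share Pfam neighbors after restricting the neighbors to a whitelist, can be illustrated with a short Python sketch. This is not AGeNNT's implementation (AGeNNT is a Java application); the function name `refined_links`, the cluster labels, and the neighbor assignments are hypothetical and do not reproduce Figure 10.

```python
from itertools import combinations


def refined_links(cluster_neighbors, whitelist=None):
    """Link SSN clusters that share Pfam neighbor families.

    cluster_neighbors: {ssn_cluster_id: set of Pfam ids among its neighbors}
    whitelist: optional set of Pfam ids to keep (e.g., enzyme families only).
    Returns {(cluster_a, cluster_b): shared Pfam ids} for every pair of
    clusters that shares at least one retained family.
    """
    kept = {
        cluster: (set(fams) & whitelist if whitelist is not None else set(fams))
        for cluster, fams in cluster_neighbors.items()
    }
    links = {}
    for a, b in combinations(sorted(kept), 2):
        shared = kept[a] & kept[b]
        if shared:
            links[(a, b)] = shared
    return links


# Hypothetical neighbor sets for three SSN clusters (illustration only).
cluster_neighbors = {
    "A": {"PF01266", "PF00701", "PF00005"},
    "B": {"PF01266", "PF00701", "PF00392"},
    "C": {"PF00005", "PF00392"},
}
enzyme_whitelist = {"PF01266", "PF00701", "PF00171"}

print(refined_links(cluster_neighbors))                    # all shared families
print(refined_links(cluster_neighbors, enzyme_whitelist))  # enzymes only
```

Without the whitelist, the transporter and regulator families link all three toy clusters; with it, only clusters A and B remain connected, which is the pattern expected for candidate orthologues.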

Figure 10. (A) Colored SSN generated by EFI-GNT for selected clusters in the proline racemase family (PF05544). (B) GNN with SSN cluster hub-nodes and Pfam family spoke-nodes. (C) GNN with Pfam family hub-nodes and SSN cluster spoke-nodes. (D) Refined GNN showing identification of three different functions as deduced by connections (or lack thereof) between SSN cluster and Pfam family nodes.

Future Directions

EFI-EST and EFI-GNT provide experimentalists with otherwise inaccessible but essential perspectives on sequence–function space in protein families and genome context that facilitate the assignment of functions to uncharacterized enzymes. Other web tools are available for smaller scale analysis of protein families, but genomic enzymology “requires” large-scale analyses to provide the maximum amount of context.
Other large-scale web tools can be imagined. For example, the proteome of an organism (or of a community) determines its metabolic capabilities; therefore, an easy-to-construct overview of the metabolic potential would be useful and could be provided by a “proteome network” (PN) tool. A PN would include a node for each protein encoded by a genome (or community) and collected into Pfam family clusters (Pfam family hub-node and protein spoke nodes). The PN would identify the catalytic capabilities via the identities of the Pfam families and, also, the locations of the proteins (spoke nodes) in the SSNs for their families. For a community PN, identification of species-specific Pfam families could provide the potential to identify syntrophic metabolic pathways, e.g., different organisms contribute different metabolic capabilities to synthesize a natural product or degrade an energy source. In analogy with chemically guided functional profiling, mapping transcriptome abundance to the PN would provide a visually powerful approach for identifying enzymes in novel pathways.
Also, the Pfam families that contribute enzymes to a pathway often are conserved in phylogenetically diverse organisms; however, we have observed that one or more reactions in a metabolic pathway can be catalyzed by analogues (nonorthologous gene replacements) in different taxonomic ranks, e.g., phyla, class, order, or family. The ability to discover analogues may be enhanced by clustering members of a protein family by taxonomic rank instead of pairwise sequence identity (SSNs). Because the node attributes that are provided by EFI-EST for sequences include taxonomic ranking, a taxonomic rank network (“TRN”) would be easy to construct. Subsequent generation of sequence similarity-based SSNs for individual clusters in the TRN would be accomplished with Option D of EFI-EST, thereby providing the ability to further segregate and analyze the clusters by sequence homology.
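Because the TRN is only proposed here, the following Python sketch is purely illustrative of the grouping step: sequences are collected into clusters by a chosen taxonomic rank instead of by pairwise sequence similarity. The function name `taxonomic_rank_network`, the record layout, and the example records are hypothetical; in particular, the attribute names are assumptions and are not the node attribute names used by EFI-EST.

```python
from collections import defaultdict


def taxonomic_rank_network(sequences, rank="phylum"):
    """Group sequence identifiers by the value of one taxonomic rank.

    sequences: iterable of dicts with an "id" and a "taxonomy" mapping
        (rank name -> taxon name); the layout is assumed for illustration.
    Returns {taxon_name: [sequence ids]}; sequences lacking the rank are
    collected under "unclassified".
    """
    clusters = defaultdict(list)
    for seq in sequences:
        taxon = seq["taxonomy"].get(rank, "unclassified")
        clusters[taxon].append(seq["id"])
    return dict(clusters)


# Hypothetical records (illustration only).
sequences = [
    {"id": "seq1", "taxonomy": {"phylum": "Proteobacteria", "class": "Gammaproteobacteria"}},
    {"id": "seq2", "taxonomy": {"phylum": "Actinobacteria", "class": "Actinomycetia"}},
    {"id": "seq3", "taxonomy": {"phylum": "Proteobacteria", "class": "Alphaproteobacteria"}},
    {"id": "seq4", "taxonomy": {}},
]

print(taxonomic_rank_network(sequences, rank="phylum"))
print(taxonomic_rank_network(sequences, rank="class"))
```

Each resulting cluster could then be resubmitted for a similarity-based SSN, as suggested above for Option D of EFI-EST.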
Finally, although the generation of an SSN is straightforward, Release 31.0 of the Pfam database defines 16 712 families. Immediate access to a library of precomputed SSNs for all Pfam families would provide the biological and biomedical communities, including users of web tools that identify BGCs (vide supra), with the ability to quickly place their favorite enzymes in the context of sequence–function relationships for their protein families. This library of SSNs should be regularly updated to provide current information (perhaps in parallel with releases of the InterPro database), but its construction requires considerable computational resources. We have demonstrated that the calculation of this database is feasible, although we have not yet been able to initiate the production phase of this effort.
I encourage the readers to (1) try the EFI-EST and EFI-GNT web tools, (2) imagine new applications for SSNs and GNNs, and (3) identify additional large-scale data visualization and analysis challenges that would be amenable to solution by community-accessible web tools. Like the natural products community, the enzymology community needs to recognize the essential role of web tools that allow the protein and genome sequence databases to be leveraged for the solution of biological problems.

Author Information

    • Author
    • Funding

      This work was supported by U54GM093442 and P01GM118303 from the National Institutes of Health.

    • Notes
      The author declares no competing financial interest.

    Acknowledgment

    I thank Mr. Daniel Davidson and Mr. Nils Oberg for writing the software that supports EFI-EST and EFI-GNT. I also thank Drs. Tyler Stack and Rémi Zallot for their critical comments on the manuscript.

    References

    This article references 102 other publications.

    1. Gerlt, J. A. and Babbitt, P. C. (2001) Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies Annu. Rev. Biochem. 70, 209–246 DOI: 10.1146/annurev.biochem.70.1.209
    2. Furnham, N., Dawson, N. L., Rahman, S. A., Thornton, J. M., and Orengo, C. A. (2016) Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies J. Mol. Biol. 428, 253–267 DOI: 10.1016/j.jmb.2015.11.010
    3. Mukherjee, S., Seshadri, R., Varghese, N. J., Eloe-Fadrosh, E. A., Meier-Kolthoff, J. P., Goker, M., Coates, R. C., Hadjithomas, M., Pavlopoulos, G. A., Paez-Espino, D., Yoshikuni, Y., Visel, A., Whitman, W. B., Garrity, G. M., Eisen, J. A., Hugenholtz, P., Pati, A., Ivanova, N. N., Woyke, T., Klenk, H. P., and Kyrpides, N. C. (2017) 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life Nat. Biotechnol. 35, 676–683 DOI: 10.1038/nbt.3886
    4. Schnoes, A. M., Brown, S. D., Dodevski, I., and Babbitt, P. C. (2009) Annotation error in public databases: misannotation of molecular function in enzyme superfamilies PLoS Comput. Biol. 5, e1000605 DOI: 10.1371/journal.pcbi.1000605
    5. Khosla, C. (2015) Quo vadis, enzymology? Nat. Chem. Biol. 11, 438–441 DOI: 10.1038/nchembio.1844
    6. Gerlt, J. A., Allen, K. N., Almo, S. C., Armstrong, R. N., Babbitt, P. C., Cronan, J. E., Dunaway-Mariano, D., Imker, H. J., Jacobson, M. P., Minor, W., Poulter, C. D., Raushel, F. M., Sali, A., Shoichet, B. K., and Sweedler, J. V. (2011) The Enzyme Function Initiative Biochemistry 50, 9950–9962 DOI: 10.1021/bi201312u
    7. Ikeda, H., Nonomiya, T., Usami, M., Ohta, T., and Omura, S. (1999) Organization of the biosynthetic gene cluster for the polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis Proc. Natl. Acad. Sci. U. S. A. 96, 9509–9514 DOI: 10.1073/pnas.96.17.9509
    8. Bentley, S. D., Chater, K. F., Cerdeno-Tarraga, A. M., Challis, G. L., Thomson, N. R., James, K. D., Harris, D. E., Quail, M. A., Kieser, H., Harper, D., Bateman, A., Brown, S., Chandra, G., Chen, C. W., Collins, M., Cronin, A., Fraser, A., Goble, A., Hidalgo, J., Hornsby, T., Howarth, S., Huang, C. H., Kieser, T., Larke, L., Murphy, L., Oliver, K., O’Neil, S., Rabbinowitsch, E., Rajandream, M. A., Rutherford, K., Rutter, S., Seeger, K., Saunders, D., Sharp, S., Squares, R., Squares, S., Taylor, K., Warren, T., Wietzorrek, A., Woodward, J., Barrell, B. G., Parkhill, J., and Hopwood, D. A. (2002) Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2) Nature 417, 141–147 DOI: 10.1038/417141a
    9. Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M., and Omura, S. (2003) Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis Nat. Biotechnol. 21, 526–531 DOI: 10.1038/nbt820
    10. Yu, X., Doroghazi, J. R., Janga, S. C., Zhang, J. K., Circello, B., Griffin, B. M., Labeda, D. P., and Metcalf, W. W. (2013) Diversity and abundance of phosphonate biosynthetic genes in nature Proc. Natl. Acad. Sci. U. S. A. 110, 20759–20764 DOI: 10.1073/pnas.1315107110
    11. Ju, K. S., Gao, J., Doroghazi, J. R., Wang, K. K., Thibodeaux, C. J., Li, S., Metzger, E., Fudala, J., Su, J., Zhang, J. K., Lee, J., Cioni, J. P., Evans, B. S., Hirota, R., Labeda, D. P., van der Donk, W. A., and Metcalf, W. W. (2015) Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes Proc. Natl. Acad. Sci. U. S. A. 112, 12175–12180 DOI: 10.1073/pnas.1500873112
    12. Medema, M. H. and Fischbach, M. A. (2015) Computational approaches to natural product discovery Nat. Chem. Biol. 11, 639–648 DOI: 10.1038/nchembio.1884
    13. Tietz, J. I. and Mitchell, D. A. (2016) Using Genomics for Natural Product Structure Elucidation Curr. Top. Med. Chem. 16, 1645–1694 DOI: 10.2174/1568026616666151012111439
    14. Blin, K., Wolf, T., Chevrette, M. G., Lu, X., Schwalen, C. J., Kautsar, S. A., Suarez Duran, H. G., de Los Santos, E. L. C., Kim, H. U., Nave, M., Dickschat, J. S., Mitchell, D. A., Shelest, E., Breitling, R., Takano, E., Lee, S. Y., Weber, T., and Medema, M. H. (2017) antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification Nucleic Acids Res. 45, W36–W41 DOI: 10.1093/nar/gkx319
    15. Skinnider, M. A., Merwin, N. J., Johnston, C. W., and Magarvey, N. A. (2017) PRISM 3: expanded prediction of natural product chemical structures from microbial genomes Nucleic Acids Res. 45, W49–W54 DOI: 10.1093/nar/gkx320
    16. Tietz, J. I., Schwalen, C. J., Patel, P. S., Maxson, T., Blair, P. M., Tai, H. C., Zakai, U. I., and Mitchell, D. A. (2017) A new genome-mining tool redefines the lasso peptide biosynthetic landscape Nat. Chem. Biol. 13, 470–478 DOI: 10.1038/nchembio.2319
    17. Akiva, E., Brown, S., Almonacid, D. E., Barber, A. E., 2nd, Custer, A. F., Hicks, M. A., Huang, C. C., Lauck, F., Mashiyama, S. T., Meng, E. C., Mischel, D., Morris, J. H., Ojha, S., Schnoes, A. M., Stryke, D., Yunes, J. M., Ferrin, T. E., Holliday, G. L., and Babbitt, P. C. (2014) The Structure-Function Linkage Database Nucleic Acids Res. 42, D521–530 DOI: 10.1093/nar/gkt1130
    18. Babbitt, P. C., Hasson, M. S., Wedekind, J. E., Palmer, D. R., Barrett, W. C., Reed, G. H., Rayment, I., Ringe, D., Kenyon, G. L., and Gerlt, J. A. (1996) The enolase superfamily: a general strategy for enzyme-catalyzed abstraction of the alpha-protons of carboxylic acids Biochemistry 35, 16489–16501 DOI: 10.1021/bi9616413
    19. Gerlt, J. A., Babbitt, P. C., Jacobson, M. P., and Almo, S. C. (2012) Divergent evolution in enolase superfamily: strategies for assigning functions J. Biol. Chem. 287, 29–34 DOI: 10.1074/jbc.R111.240945
    20. Schmidt, D. M., Mundorff, E. C., Dojka, M., Bermudez, E., Ness, J. E., Govindarajan, S., Babbitt, P. C., Minshull, J., and Gerlt, J. A. (2003) Evolutionary potential of (beta/alpha)8-barrels: functional promiscuity produced by single substitutions in the enolase superfamily Biochemistry 42, 8387–8393 DOI: 10.1021/bi034769a
    21. Vick, J. E., Schmidt, D. M., and Gerlt, J. A. (2005) Evolutionary potential of (beta/alpha)8-barrels: in vitro enhancement of a “new” reaction in the enolase superfamily Biochemistry 44, 11722–11729 DOI: 10.1021/bi050963g
    22. Engelhardt, B. E., Jordan, M. I., Repo, S. T., and Brenner, S. E. (2009) Phylogenetic molecular function annotation J. Phys.: Conf. Ser. 180, 012024 DOI: 10.1088/1742-6596/180/1/012024
    23. Liu, K., Raghavan, S., Nelesen, S., Linder, C. R., and Warnow, T. (2009) Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees Science 324, 1561–1564 DOI: 10.1126/science.1171243
    24. Price, M. N., Dehal, P. S., and Arkin, A. P. (2010) FastTree 2 – approximately maximum-likelihood trees for large alignments PLoS One 5, e9490 DOI: 10.1371/journal.pone.0009490
    25. Atkinson, H. J., Morris, J. H., Ferrin, T. E., and Babbitt, P. C. (2009) Using sequence similarity networks for visualization of relationships across diverse protein superfamilies PLoS One 4, e4345 DOI: 10.1371/journal.pone.0004345
    26. Kohl, M., Wiese, S., and Warscheid, B. (2011) Cytoscape: software for visualization and analysis of biological networks Methods Mol. Biol. 696, 291–303 DOI: 10.1007/978-1-60761-987-1_18
    27. Brown, S. D., Gerlt, J. A., Seffernick, J. L., and Babbitt, P. C. (2006) A gold standard set of mechanistically diverse enzyme superfamilies Genome Biol. 7, R8 DOI: 10.1186/gb-2006-7-1-r8
    28. Barber, A. E., 2nd and Babbitt, P. C. (2012) Pythoscape: a framework for generation of large protein similarity networks Bioinformatics 28, 2845–2846 DOI: 10.1093/bioinformatics/bts532
    29. Li, W., Kinch, L. N., and Grishin, N. V. (2013) Pclust: protein network visualization highlighting experimental data Bioinformatics 29, 2647–2648 DOI: 10.1093/bioinformatics/btt451
    30. Frickey, T. and Lupas, A. (2004) CLANS: a Java application for visualizing protein families based on pairwise similarity Bioinformatics 20, 3702–3704 DOI: 10.1093/bioinformatics/bth444
    31. Gerlt, J. A., Bouvier, J. T., Davidson, D. B., Imker, H. J., Sadkhin, B., Slater, D. R., and Whalen, K. L. (2015) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks Biochim. Biophys. Acta, Proteins Proteomics 1854, 1019–1037 DOI: 10.1016/j.bbapap.2015.04.015
    32. Colin, P. Y., Kintses, B., Gielen, F., Miton, C. M., Fischer, G., Mohamed, M. F., Hyvonen, M., Morgavi, D. P., Janssen, D. B., and Hollfelder, F. (2015) Ultrahigh-throughput discovery of promiscuous enzymes by picodroplet functional metagenomics Nat. Commun. 6, 10008 DOI: 10.1038/ncomms10008
    33. Cox, C. L., Doroghazi, J. R., and Mitchell, D. A. (2015) The genomic landscape of ribosomal peptides containing thiazole and oxazole heterocycles BMC Genomics 16, 778 DOI: 10.1186/s12864-015-2008-0
    34. Huang, H., Carter, M. S., Vetting, M. W., Al-Obaidi, N., Patskovsky, Y., Almo, S. C., and Gerlt, J. A. (2015) A General Strategy for the Discovery of Metabolic Pathways: d-Threitol, l-Threitol, and Erythritol Utilization in Mycobacterium smegmatis J. Am. Chem. Soc. 137, 14570–14573 DOI: 10.1021/jacs.5b08968
    35. Petronikolou, N. and Nair, S. K. (2015) Biochemical Studies of Mycobacterial Fatty Acid Methyltransferase: A Catalyst for the Enzymatic Production of Biodiesel Chem. Biol. 22, 1480–1490 DOI: 10.1016/j.chembiol.2015.09.011
    36. Rao, G., O’Dowd, B., Li, J., Wang, K., and Oldfield, E. (2015) IspH-RPS1 and IspH-UbiA: “Rosetta Stone” Proteins Chem. Sci. 6, 6813–6822 DOI: 10.1039/C5SC02600H
    37. Roche, D. B., Brackenridge, D. A., and McGuffin, L. J. (2015) Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods Int. J. Mol. Sci. 16, 29829–29842 DOI: 10.3390/ijms161226202
    38. Wichelecki, D. J., Vetting, M. W., Chou, L., Al-Obaidi, N., Bouvier, J. T., Almo, S. C., and Gerlt, J. A. (2015) ATP-binding Cassette (ABC) Transport System Solute-binding Protein-guided Identification of Novel d-Altritol and Galactitol Catabolic Pathways in Agrobacterium tumefaciens C58 J. Biol. Chem. 290, 28963–28976 DOI: 10.1074/jbc.M115.686857
    39. Ahmed, F. H., Mohamed, A. E., Carr, P. D., Lee, B. M., Condic-Jurkic, K., O’Mara, M. L., and Jackson, C. J. (2016) Rv2074 is a novel F420 H2-dependent biliverdin reductase in Mycobacterium tuberculosis Protein Sci. 25, 1692–1709 DOI: 10.1002/pro.2975
    40. Atkinson, J. T., Campbell, I., Bennett, G. N., and Silberg, J. J. (2016) Cellular Assays for Ferredoxins: A Strategy for Understanding Electron Flow through Protein Carriers That Link Metabolic Pathways Biochemistry 55, 7047–7064 DOI: 10.1021/acs.biochem.6b00831
    41. 41
      Baier, F., Copp, J. N., and Tokuriki, N. (2016) Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships Biochemistry 55, 6375 6388 DOI: 10.1021/acs.biochem.6b00723
    42. 42
      Bhandari, D. M., Fedoseyenko, D., and Begley, T. P. (2016) Tryptophan Lyase (NosL): A Cornucopia of 5′-Deoxyadenosyl Radical Mediated Transformations J. Am. Chem. Soc. 138, 16184 16187 DOI: 10.1021/jacs.6b06139
    43. 43
      Chekan, J. R., Koos, J. D., Zong, C., Maksimov, M. O., Link, A. J., and Nair, S. K. (2016) Structure of the Lasso Peptide Isopeptidase Identifies a Topology for Processing Threaded Substrates J. Am. Chem. Soc. 138, 16452 16458 DOI: 10.1021/jacs.6b10389
    44.
      Davey, L., Halperin, S. A., and Lee, S. F. (2016) Thiol-Disulfide Exchange in Gram-Positive Firmicutes Trends Microbiol. 24, 902–915 DOI: 10.1016/j.tim.2016.06.010
    45.
      Desai, J., Liu, Y. L., Wei, H., Liu, W., Ko, T. P., Guo, R. T., and Oldfield, E. (2016) Structure, Function, and Inhibition of Staphylococcus aureus Heptaprenyl Diphosphate Synthase ChemMedChem 11, 1915–1923 DOI: 10.1002/cmdc.201600311
    46.
      Ding, W., Li, Q., Jia, Y., Ji, X., Qianzhu, H., and Zhang, Q. (2016) Emerging Diversity of the Cobalamin-Dependent Methyltransferases Involving Radical-Based Mechanisms ChemBioChem 17, 1191–1197 DOI: 10.1002/cbic.201600107
    47.
      Gerlt, J. A. (2016) Tools and strategies for discovering novel enzymes and metabolic pathways Perspectives in Science 9, 24–32 DOI: 10.1016/j.pisc.2016.07.001
    48.
      Ghodge, S. V., Biernat, K. A., Bassett, S. J., Redinbo, M. R., and Bowers, A. A. (2016) Post-translational Claisen Condensation and Decarboxylation en Route to the Bicyclic Core of Pantocin A J. Am. Chem. Soc. 138, 5487–5490 DOI: 10.1021/jacs.5b13529
    49.
      Hao, Y., Pierce, E., Roe, D., Morita, M., McIntosh, J. A., Agarwal, V., Cheatham, T. E., 3rd, Schmidt, E. W., and Nair, S. K. (2016) Molecular basis for the broad substrate selectivity of a peptide prenyltransferase Proc. Natl. Acad. Sci. U. S. A. 113, 14037–14042 DOI: 10.1073/pnas.1609869113
    50.
      Ji, X., Li, Y., Xie, L., Lu, H., Ding, W., and Zhang, Q. (2016) Expanding Radical SAM Chemistry by Using Radical Addition Reactions and SAM Analogues Angew. Chem., Int. Ed. 55, 11845–11848 DOI: 10.1002/anie.201605917
    51.
      Ji, X., Liu, W. Q., Yuan, S., Yin, Y., Ding, W., and Zhang, Q. (2016) Mechanistic study of the radical SAM-dependent amine dehydrogenation reactions Chem. Commun. 52, 10555–10558 DOI: 10.1039/C6CC05661J
    52.
      Kumar, G., Johnson, J. L., and Frantom, P. A. (2016) Improving Functional Annotation in the DRE-TIM Metallolyase Superfamily through Identification of Active Site Fingerprints Biochemistry 55, 1863–1872 DOI: 10.1021/acs.biochem.5b01193
    53.
      Li, D., Moorman, R., Vanhercke, T., Petrie, J., Singh, S., and Jackson, C. J. (2016) Classification and substrate head-group specificity of membrane fatty acid desaturases Comput. Struct. Biotechnol. J. 14, 341–349 DOI: 10.1016/j.csbj.2016.08.003
    54.
      Molloy, E. M., Tietz, J. I., Blair, P. M., and Mitchell, D. A. (2016) Biological characterization of the hygrobafilomycin antibiotic JBIR-100 and bioinformatic insights into the hygrolide family of natural products Bioorg. Med. Chem. 24, 6276–6290 DOI: 10.1016/j.bmc.2016.05.021
    55.
      Plach, M. G., Reisinger, B., Sterner, R., and Merkl, R. (2016) Long-Term Persistence of Bi-functionality Contributes to the Robustness of Microbial Life through Exaptation PLoS Genet. 12, e1005836 DOI: 10.1371/journal.pgen.1005836
    56.
      Rao, G. and Oldfield, E. (2016) Structure and Function of Four Classes of the 4Fe-4S Protein, IspH Biochemistry 55, 4119–4129 DOI: 10.1021/acs.biochem.6b00474
    57.
      Thotsaporn, K., Tinikul, R., Maenpuen, S., Phonbuppha, J., Watthaisong, P., Chenprakhon, P., and Chaiyen, P. (2016) Enzymes in the p-hydroxyphenylacetate degradation pathway of Acinetobacter baumannii J. Mol. Catal. B: Enzym. 134, 353–366 DOI: 10.1016/j.molcatb.2016.09.003
    58.
      Zallot, R., Harrison, K. J., Kolaczkowski, B., and de Crecy-Lagard, V. (2016) Functional Annotations of Paralogs: A Blessing and a Curse Life 6, 39 DOI: 10.3390/life6030039
    59.
      Zhang, X., Carter, M. S., Vetting, M. W., San Francisco, B., Zhao, S., Al-Obaidi, N. F., Solbiati, J. O., Thiaville, J. J., de Crecy-Lagard, V., Jacobson, M. P., Almo, S. C., and Gerlt, J. A. (2016) Assignment of function to a domain of unknown function: DUF1537 is a new kinase family in catabolic pathways for acid sugars Proc. Natl. Acad. Sci. U. S. A. 113, E4161–4169 DOI: 10.1073/pnas.1605546113
    60.
      Ahmed, M. N., Reyna-Gonzalez, E., Schmid, B., Wiebach, V., Sussmuth, R. D., Dittmann, E., and Fewer, D. P. (2017) Phylogenomic Analysis of the Microviridin Biosynthetic Pathway Coupled with Targeted Chemo-Enzymatic Synthesis Yields Potent Protease Inhibitors ACS Chem. Biol. 12, 1538 DOI: 10.1021/acschembio.7b00124
    61.
      Bearne, S. L. (2017) The interdigitating loop of the enolase superfamily as a specificity binding determinant or ‘flying buttress’ Biochim. Biophys. Acta, Proteins Proteomics 1865, 619–630 DOI: 10.1016/j.bbapap.2017.02.006
    62.
      Benjdia, A., Guillot, A., Ruffie, P., Leprince, J., and Berteau, O. (2017) Post-translational modification of ribosomally synthesized peptides by a radical SAM epimerase in Bacillus subtilis Nat. Chem. 9, 698–707 DOI: 10.1038/nchem.2714
    63.
      Erb, T. J., Jones, P. R., and Bar-Even, A. (2017) Synthetic metabolism: metabolic engineering meets enzyme design Curr. Opin. Chem. Biol. 37, 56–62 DOI: 10.1016/j.cbpa.2016.12.023
    64.
      Estrada, P., Manandhar, M., Dong, S. H., Deveryshetty, J., Agarwal, V., Cronan, J. E., and Nair, S. K. (2017) The pimeloyl-CoA synthetase BioW defines a new fold for adenylate-forming enzymes Nat. Chem. Biol. 13, 668–674 DOI: 10.1038/nchembio.2359
    65.
      Giessen, T. W. and Silver, P. A. (2017) Widespread distribution of encapsulin nanocompartments reveals functional diversity Nat. Microbiol. 2, 17029 DOI: 10.1038/nmicrobiol.2017.29
    66.
      Glasner, M. E. (2017) Finding enzymes in the gut metagenome Science 355, 577–578 DOI: 10.1126/science.aam7446
    67.
      Hetrick, K. J. and van der Donk, W. A. (2017) Ribosomally synthesized and post-translationally modified peptide natural product discovery in the genomic era Curr. Opin. Chem. Biol. 38, 36–44 DOI: 10.1016/j.cbpa.2017.02.005
    68.
      Holliday, G. L., Brown, S. D., Akiva, E., Mischel, D., Hicks, M. A., Morris, J. H., Huang, C. C., Meng, E. C., Pegg, S. C., Ferrin, T. E., and Babbitt, P. C. (2017) Biocuration in the structure-function linkage database: the anatomy of a superfamily Database DOI: 10.1093/database/bax045
    69.
      Jia, B., Jia, X., Hyun Kim, K., Ji Pu, Z., Kang, M. S., and Ok Jeon, C. (2017) Evolutionary, computational, and biochemical studies of the salicylaldehyde dehydrogenases in the naphthalene degradation pathway Sci. Rep. 7, 43489 DOI: 10.1038/srep43489
    70.
      Jia, B., Jia, X., Kim, K. H., and Jeon, C. O. (2017) Integrative view of 2-oxoglutarate/Fe(II)-dependent oxygenase diversity and functions in bacteria Biochim. Biophys. Acta, Gen. Subj. 1861, 323–334 DOI: 10.1016/j.bbagen.2016.12.001
    71.
      Kandlinger, F., Plach, M. G., and Merkl, R. (2017) AGeNNT: annotation of enzyme families by means of refined neighborhood networks BMC Bioinf. 18, 274 DOI: 10.1186/s12859-017-1689-6
    72.
      Levin, B. J., Huang, Y. Y., Peck, S. C., Wei, Y., Martinez-Del Campo, A., Marks, J. A., Franzosa, E. A., Huttenhower, C., and Balskus, E. P. (2017) A prominent glycyl radical enzyme in human gut microbiomes metabolizes trans-4-hydroxy-l-proline Science 355, eaai8386 DOI: 10.1126/science.aai8386
    73.
      Ney, B., Ahmed, F. H., Carere, C. R., Biswas, A., Warden, A. C., Morales, S. E., Pandey, G., Watt, S. J., Oakeshott, J. G., Taylor, M. C., Stott, M. B., Jackson, C. J., and Greening, C. (2017) The methanogenic redox cofactor F420 is widely synthesized by aerobic soil bacteria ISME J. 11, 125–137 DOI: 10.1038/ismej.2016.100
    74.
      Ortega, M. A., Cogan, D. P., Mukherjee, S., Garg, N., Li, B., Thibodeaux, G. N., Maffioli, S. I., Donadio, S., Sosio, M., Escano, J., Smith, L., Nair, S. K., and van der Donk, W. A. (2017) Two Flavoenzymes Catalyze the Post-Translational Generation of 5-Chlorotryptophan and 2-Aminovinyl-Cysteine during NAI-107 Biosynthesis ACS Chem. Biol. 12, 548–557 DOI: 10.1021/acschembio.6b01031
    75.
      Pimviriyakul, P., Thotsaporn, K., Sucharitakul, J., and Chaiyen, P. (2017) Kinetic Mechanism of the Dechlorinating Flavin-dependent Monooxygenase HadA J. Biol. Chem. 292, 4818–4832 DOI: 10.1074/jbc.M116.774448
    76.
      Repka, L. M., Chekan, J. R., Nair, S. K., and van der Donk, W. A. (2017) Mechanistic Understanding of Lanthipeptide Biosynthetic Enzymes Chem. Rev. 117, 5457–5520 DOI: 10.1021/acs.chemrev.6b00591
    77.
      Schwalen, C. J., Feng, X., Liu, W., O’Dowd, B., Ko, T. P., Shin, C. J., Guo, R. T., Mitchell, D. A., and Oldfield, E. (2017) Head-to-Head Prenyl Synthases in Pathogenic Bacteria ChemBioChem 18, 985–991 DOI: 10.1002/cbic.201700099
    78.
      Zallot, R., Yuan, Y., and de Crecy-Lagard, V. (2017) The Escherichia coli COG1738 Member YhhQ Is Involved in 7-Cyanodeazaguanine (preQ(0)) Transport Biomolecules 7, 12 DOI: 10.3390/biom7010012
    79.
      Xiang, D. F., Kolb, P., Fedorov, A. A., Xu, C., Fedorov, E. V., Narindoshvili, T., Williams, H. J., Shoichet, B. K., Almo, S. C., and Raushel, F. M. (2012) Structure-based function discovery of an enzyme for the hydrolysis of phosphorylated sugar lactones Biochemistry 51, 1762–1773 DOI: 10.1021/bi201838b
    80.
      Fan, H., Hitchcock, D. S., Seidel, R. D., 2nd, Hillerich, B., Lin, H., Almo, S. C., Sali, A., Shoichet, B. K., and Raushel, F. M. (2013) Assignment of pterin deaminase activity to an enzyme of unknown function guided by homology modeling and docking J. Am. Chem. Soc. 135, 795–803 DOI: 10.1021/ja309680b
    81.
      Goble, A. M., Toro, R., Li, X., Ornelas, A., Fan, H., Eswaramoorthy, S., Patskovsky, Y., Hillerich, B., Seidel, R., Sali, A., Shoichet, B. K., Almo, S. C., Swaminathan, S., Tanner, M. E., and Raushel, F. M. (2013) Deamination of 6-aminodeoxyfutalosine in menaquinone biosynthesis by distantly related enzymes Biochemistry 52, 6525–6536 DOI: 10.1021/bi400750a
    82.
      Hitchcock, D. S., Fan, H., Kim, J., Vetting, M., Hillerich, B., Seidel, R. D., Almo, S. C., Shoichet, B. K., Sali, A., and Raushel, F. M. (2013) Structure-guided discovery of new deaminase enzymes J. Am. Chem. Soc. 135, 13927–13933 DOI: 10.1021/ja4066078
    83.
      Ornelas, A., Korczynska, M., Ragumani, S., Kumaran, D., Narindoshvili, T., Shoichet, B. K., Swaminathan, S., and Raushel, F. M. (2013) Functional annotation and three-dimensional structure of an incorrectly annotated dihydroorotase from cog3964 in the amidohydrolase superfamily Biochemistry 52, 228–238 DOI: 10.1021/bi301483z
    84.
      Barelier, S., Cummings, J. A., Rauwerdink, A. M., Hitchcock, D. S., Farelli, J. D., Almo, S. C., Raushel, F. M., Allen, K. N., and Shoichet, B. K. (2014) Substrate deconstruction and the nonadditivity of enzyme recognition J. Am. Chem. Soc. 136, 7374–7382 DOI: 10.1021/ja501354q
    85.
      Korczynska, M., Xiang, D. F., Zhang, Z., Xu, C., Narindoshvili, T., Kamat, S. S., Williams, H. J., Chang, S. S., Kolb, P., Hillerich, B., Sauder, J. M., Burley, S. K., Almo, S. C., Swaminathan, S., Shoichet, B. K., and Raushel, F. M. (2014) Functional annotation and structural characterization of a novel lactonase hydrolyzing D-xylono-1,4-lactone-5-phosphate and L-arabino-1,4-lactone-5-phosphate Biochemistry 53, 4727–4738 DOI: 10.1021/bi500595c
    86.
      Lukk, T., Sakai, A., Kalyanaraman, C., Brown, S. D., Imker, H. J., Song, L., Fedorov, A. A., Fedorov, E. V., Toro, R., Hillerich, B., Seidel, R., Patskovsky, Y., Vetting, M. W., Nair, S. K., Babbitt, P. C., Almo, S. C., Gerlt, J. A., and Jacobson, M. P. (2012) Homology models guide discovery of diverse enzyme specificities among dipeptide epimerases in the enolase superfamily Proc. Natl. Acad. Sci. U. S. A. 109, 4122–4127 DOI: 10.1073/pnas.1112081109
    87.
      Wichelecki, D. J., Balthazor, B. M., Chau, A. C., Vetting, M. W., Fedorov, A. A., Fedorov, E. V., Lukk, T., Patskovsky, Y. V., Stead, M. B., Hillerich, B. S., Seidel, R. D., Almo, S. C., and Gerlt, J. A. (2014) Discovery of function in the enolase superfamily: D-mannonate and D-gluconate dehydratases in the D-mannonate dehydratase subgroup Biochemistry 53, 2722–2731 DOI: 10.1021/bi500264p
    88.
      Wichelecki, D. J., Froese, D. S., Kopec, J., Muniz, J. R., Yue, W. W., and Gerlt, J. A. (2014) Enzymatic and structural characterization of rTSgamma provides insights into the function of rTSbeta Biochemistry 53, 2732–2738 DOI: 10.1021/bi500349e
    89.
      Wichelecki, D. J., Graff, D. C., Al-Obaidi, N., Almo, S. C., and Gerlt, J. A. (2014) Identification of the in vivo function of the high-efficiency D-mannonate dehydratase in Caulobacter crescentus NA1000 from the enolase superfamily Biochemistry 53, 4087–4089 DOI: 10.1021/bi500683x
    90.
      Wichelecki, D. J., Vendiola, J. A., Jones, A. M., Al-Obaidi, N., Almo, S. C., and Gerlt, J. A. (2014) Investigating the physiological roles of low-efficiency D-mannonate and D-gluconate dehydratases in the enolase superfamily: pathways for the catabolism of L-gulonate and L-idonate Biochemistry 53, 5692–5699 DOI: 10.1021/bi500837w
    91.
      Ghasempur, S., Eswaramoorthy, S., Hillerich, B. S., Seidel, R. D., Swaminathan, S., Almo, S. C., and Gerlt, J. A. (2014) Discovery of a novel L-lyxonate degradation pathway in Pseudomonas aeruginosa PAO1 Biochemistry 53, 3357–3366 DOI: 10.1021/bi5004298
    92.
      Groninger-Poe, F. P., Bouvier, J. T., Vetting, M. W., Kalyanaraman, C., Kumar, R., Almo, S. C., Jacobson, M. P., and Gerlt, J. A. (2014) Evolution of enzymatic activities in the enolase superfamily: galactarate dehydratase III from Agrobacterium tumefaciens C58 Biochemistry 53, 4192–4203 DOI: 10.1021/bi5005377
    93.
      Mashiyama, S. T., Malabanan, M. M., Akiva, E., Bhosle, R., Branch, M. C., Hillerich, B., Jagessar, K., Kim, J., Patskovsky, Y., Seidel, R. D., Stead, M., Toro, R., Vetting, M. W., Almo, S. C., Armstrong, R. N., and Babbitt, P. C. (2014) Large-scale determination of sequence, structure, and function relationships in cytosolic glutathione transferases across the biosphere PLoS Biol. 12, e1001843 DOI: 10.1371/journal.pbio.1001843
    94.
      Huang, H., Pandya, C., Liu, C., Al-Obaidi, N. F., Wang, M., Zheng, L., Toews Keating, S., Aono, M., Love, J. D., Evans, B., Seidel, R. D., Hillerich, B. S., Garforth, S. J., Almo, S. C., Mariano, P. S., Dunaway-Mariano, D., Allen, K. N., and Farelli, J. D. (2015) Panoramic view of a superfamily of phosphatases through substrate profiling Proc. Natl. Acad. Sci. U. S. A. 112, E1974–1983 DOI: 10.1073/pnas.1423570112
    95.
      Tian, B. X., Wallrapp, F. H., Holliday, G. L., Chow, J. Y., Babbitt, P. C., Poulter, C. D., and Jacobson, M. P. (2014) Predicting the functions and specificity of triterpenoid synthases: a mechanism-based multi-intermediate docking approach PLoS Comput. Biol. 10, e1003874 DOI: 10.1371/journal.pcbi.1003874
    96.
      Wallrapp, F. H., Pan, J. J., Ramamoorthy, G., Almonacid, D. E., Hillerich, B. S., Seidel, R., Patskovsky, Y., Babbitt, P. C., Almo, S. C., Jacobson, M. P., and Poulter, C. D. (2013) Prediction of function for the polyprenyl transferase subgroup in the isoprenoid synthase superfamily Proc. Natl. Acad. Sci. U. S. A. 110, E1196–1202 DOI: 10.1073/pnas.1300632110
    97.
      Zhao, S., Sakai, A., Zhang, X., Vetting, M. W., Kumar, R., Hillerich, B., San Francisco, B., Solbiati, J., Steves, A., Brown, S., Akiva, E., Barber, A., Seidel, R. D., Babbitt, P. C., Almo, S. C., Gerlt, J. A., and Jacobson, M. P. (2014) Prediction and characterization of enzymatic activities guided by sequence similarity and genome neighborhood networks eLife 3, e03275 DOI: 10.7554/eLife.03275
    98.
      Vetting, M. W., Al-Obaidi, N., Zhao, S., San Francisco, B., Kim, J., Wichelecki, D. J., Bouvier, J. T., Solbiati, J. O., Vu, H., Zhang, X., Rodionov, D. A., Love, J. D., Hillerich, B. S., Seidel, R. D., Quinn, R. J., Osterman, A. L., Cronan, J. E., Jacobson, M. P., Gerlt, J. A., and Almo, S. C. (2015) Experimental strategies for functional annotation and metabolism discovery: targeted screening of solute binding proteins and unbiased panning of metabolomes Biochemistry 54, 909–931 DOI: 10.1021/bi501388y
    99.
      Ndeh, D., Rogowski, A., Cartmell, A., Luis, A. S., Basle, A., Gray, J., Venditto, I., Briggs, J., Zhang, X., Labourel, A., Terrapon, N., Buffetto, F., Nepogodiev, S., Xiao, Y., Field, R. A., Zhu, Y., O’Neill, M. A., Urbanowicz, B. R., York, W. S., Davies, G. J., Abbott, D. W., Ralet, M. C., Martens, E. C., Henrissat, B., and Gilbert, H. J. (2017) Complex pectin metabolism by gut bacteria reveals novel catalytic functions Nature 544, 65–70 DOI: 10.1038/nature21725
    100.
      Kaminski, J., Gibson, M. K., Franzosa, E. A., Segata, N., Dantas, G., and Huttenhower, C. (2015) High-Specificity Targeted Functional Profiling in Microbial Communities with ShortBRED PLoS Comput. Biol. 11, e1004557 DOI: 10.1371/journal.pcbi.1004557
    101.
      Craciun, S. and Balskus, E. P. (2012) Microbial conversion of choline to trimethylamine requires a glycyl radical enzyme Proc. Natl. Acad. Sci. U. S. A. 109, 21307–21312 DOI: 10.1073/pnas.1215689109
    102.
      Craciun, S., Marks, J. A., and Balskus, E. P. (2014) Characterization of choline trimethylamine-lyase expands the chemistry of glycyl radical enzymes ACS Chem. Biol. 9, 1408–1413 DOI: 10.1021/cb500113p

    Cited By

    This article is cited by 145 publications.

    1. Li Jiang, Yiqian Yang, Lin Huang, Yan Zhou, Junwei An, Yuchun Zheng, Yiwei Chen, Yanhong Liu, Jianhui Huang, Ee Lui Ang, Suwen Zhao, Huimin Zhao, Rongzhen Liao, Yifeng Wei, Yan Zhang. Glycyl Radical Enzymes Catalyzing the Dehydration of Two Isomers of N-Methyl-4-hydroxyproline. ACS Catalysis 2024, 14 (7) , 4407-4422. https://doi.org/10.1021/acscatal.4c00216
    2. Vasiliki T. Chioti, Kenzie A. Clark, Jack G. Ganley, Esther J. Han, Mohammad R. Seyedsayamdost. N–Cα Bond Cleavage Catalyzed by a Multinuclear Iron Oxygenase from a Divergent Methanobactin-like RiPP Gene Cluster. Journal of the American Chemical Society 2024, 146 (11) , 7313-7323. https://doi.org/10.1021/jacs.3c11740
    3. Meng-Xue Guo, Meng-Meng Zhang, Ke Sun, Jiao-Jiao Cui, Yi-Cheng Liu, Kun Gao, Shi-Hui Dong, Shangwen Luo. Genome Mining of Linaridins Provides Insights into the Widely Distributed LinC Oxidoreductases. Journal of Natural Products 2023, 86 (10) , 2333-2341. https://doi.org/10.1021/acs.jnatprod.3c00527
    4. Jiayi Tian, Alejandro Arcadio Garcia, Patrick H. Donnan, Jennifer Bridwell-Rabb. Leveraging a Structural Blueprint to Rationally Engineer the Rieske Oxygenase TsaM. Biochemistry 2023, 62 (11) , 1807-1822. https://doi.org/10.1021/acs.biochem.3c00150
    5. Lide Cha, Jared C. Paris, Brady Zanella, Martha Spletzer, Angela Yao, Yisong Guo, Wei-chen Chang. Mechanistic Studies of Aziridine Formation Catalyzed by Mononuclear Non-Heme Iron Enzymes. Journal of the American Chemical Society 2023, 145 (11) , 6240-6246. https://doi.org/10.1021/jacs.2c12664
    6. Yu Li, Zifei Xu, Ping Chen, Chen Zuo, Liyifan Chen, Wei Yan, Ruihua Jiao, Yonghao Ye. Genome Mining and Heterologous Expression Guided the Discovery of Antimicrobial Naphthocyclinones from Streptomyces eurocidicus CGMCC 4.1086. Journal of Agricultural and Food Chemistry 2023, 71 (6) , 2914-2923. https://doi.org/10.1021/acs.jafc.2c06928
    7. Ichiro Matsumura, Wayne M. Patrick. Dan Tawfik’s Lessons for Protein Engineers about Enzymes Adapting to New Substrates. Biochemistry 2023, 62 (2) , 158-162. https://doi.org/10.1021/acs.biochem.2c00230
    8. Syam Sundar Neti, Debangsu Sil, Douglas M. Warui, Olga A. Esakova, Amy E. Solinski, Dante A. Serrano, Carsten Krebs, Squire J. Booker. Characterization of LipS1 and LipS2 from Thermococcus kodakarensis: Proteins Annotated as Biotin Synthases, which Together Catalyze Formation of the Lipoyl Cofactor. ACS Bio & Med Chem Au 2022, 2 (5) , 509-520. https://doi.org/10.1021/acsbiomedchemau.2c00018
    9. Kenzie A. Clark, Mohammad R. Seyedsayamdost. Bioinformatic Atlas of Radical SAM Enzyme-Modified RiPP Natural Products Reveals an Isoleucine–Tryptophan Crosslink. Journal of the American Chemical Society 2022, 144 (39) , 17876-17888. https://doi.org/10.1021/jacs.2c06497
    10. Zeng-Fei Pei, Lingyang Zhu, Raymond Sarksian, Wilfred A. van der Donk, Satish K. Nair. Class V Lanthipeptide Cyclase Directs the Biosynthesis of a Stapled Peptide Natural Product. Journal of the American Chemical Society 2022, 144 (38) , 17549-17557. https://doi.org/10.1021/jacs.2c06808
    11. Kenzie A. Clark, Leah B. Bushin, Mohammad R. Seyedsayamdost. RaS-RiPPs in Streptococci and the Human Microbiome. ACS Bio & Med Chem Au 2022, 2 (4) , 328-339. https://doi.org/10.1021/acsbiomedchemau.2c00004
    12. Qiongxiang Yan, Hua Huang, Xinshuai Zhang. In Vitro Reconstitution of a Bacterial Ergothioneine Sulfonate Catabolic Pathway. ACS Catalysis 2022, 12 (9) , 4825-4832. https://doi.org/10.1021/acscatal.2c00169
    13. Spencer S. Macdonald, Jose H. Pereira, Feng Liu, Gregor Tegl, Andy DeGiovanni, Jacob F. Wardman, Samuel Deutsch, Yasuo Yoshikuni, Paul D. Adams, Stephen G. Withers. A Synthetic Gene Library Yields a Previously Unknown Glycoside Phosphorylase That Degrades and Assembles Poly-β-1,3-GlcNAc, Completing the Suite of β-Linked GlcNAc Polysaccharides. ACS Central Science 2022, 8 (4) , 430-440. https://doi.org/10.1021/acscentsci.1c01570
    14. Matthew R. Dent, Madeleine G. Roberts, Hannah E. Bowman, Brian R. Weaver, Darrell R. McCaslin, Judith N. Burstyn. Quaternary Structure and Deoxyribonucleic Acid-Binding Properties of the Heme-Dependent, CO-Sensing Transcriptional Regulator PxRcoM. Biochemistry 2022, 61 (8) , 678-688. https://doi.org/10.1021/acs.biochem.2c00086
    15. Sha-Sha Zhang, Jiang Xiong, Jiao-Jiao Cui, Kai-Liang Ma, Wen-Liang Wu, Ya Li, Shangwen Luo, Kun Gao, Shi-Hui Dong. Lanthipeptides from the Same Core Sequence: Characterization of a Class II Lanthipeptide Synthetase from Microcystis aeruginosa NIES-88. Organic Letters 2022, 24 (11) , 2226-2231. https://doi.org/10.1021/acs.orglett.2c00573
    16. Nils Oberg, Timothy W. Precord, Douglas A. Mitchell, John A. Gerlt. RadicalSAM.org: A Resource to Interpret Sequence-Function Space and Discover New Radical SAM Enzyme Chemistry. ACS Bio & Med Chem Au 2022, 2 (1) , 22-35. https://doi.org/10.1021/acsbiomedchemau.1c00048
    17. Sangeetha Ramesh, Xiaorui Guo, Adam J. DiCaprio, Ashley M. De Lio, Lonnie A. Harris, Bryce L. Kille, Taras V. Pogorelov, Douglas A. Mitchell. Bioinformatics-Guided Expansion and Discovery of Graspetides. ACS Chemical Biology 2021, 16 (12) , 2787-2797. https://doi.org/10.1021/acschembio.1c00672
    18. Matthew A. Hostetler, Chloe Smith, Samantha Nelson, Zachary Budimir, Ramya Modi, Ian Woolsey, Autumn Frerk, Braden Baker, Jessica Gantt, Elizabeth I. Parkinson. Synthetic Natural Product Inspired Cyclic Peptides. ACS Chemical Biology 2021, 16 (11) , 2604-2611. https://doi.org/10.1021/acschembio.1c00641
    19. Yuping Liu, Siting Pan, Xinshuai Zhang, Hua Huang. In Vitro Reconstitution of the Pantothenic Acid Degradation Pathway in Ochrobactrum anthropi. ACS Chemical Biology 2021, 16 (8) , 1350-1353. https://doi.org/10.1021/acschembio.1c00492
    20. Melanie A. Higgins, Gregor Tegl, Spencer S. MacDonald, Gregory Arnal, Harry Brumer, Stephen G. Withers, Katherine S. Ryan. N-Glycan Degradation Pathways in Gut- and Soil-Dwelling Actinobacteria Share Common Core Genes. ACS Chemical Biology 2021, 16 (4) , 701-711. https://doi.org/10.1021/acschembio.0c00995
    21. Joanna A. Quaye, Giovanni Gadda. Kinetic and Bioinformatic Characterization of d-2-Hydroxyglutarate Dehydrogenase from Pseudomonas aeruginosa PAO1. Biochemistry 2020, 59 (51) , 4833-4844. https://doi.org/10.1021/acs.biochem.0c00832
    22. Yuan Zhi, Dao Feng Xiang, Tamari Narindoshvili, Helene Andrews-Polymenis, Frank M. Raushel. Deciphering the Aldolase Function of STM3780 from a Bovine Enteric Infection-Related Gene Cluster in Salmonella enterica Serotype Typhimurium. Biochemistry 2020, 59 (48) , 4573-4580. https://doi.org/10.1021/acs.biochem.0c00768
    23. Alexander J. Stirling, Stephanie E. Gilbert, Megan Conner, Evan Mallette, Matthew S. Kimber, Stephen Y. K. Seah. A Key Glycine in Bacterial Steroid-Degrading Acyl-CoA Dehydrogenases Allows Flavin-Ring Repositioning and Modulates Substrate Side Chain Specificity. Biochemistry 2020, 59 (42) , 4081-4092. https://doi.org/10.1021/acs.biochem.0c00568
    24. Leah B. Bushin, Brett C. Covington, Britta E. Rued, Michael J. Federle, Mohammad R. Seyedsayamdost. Discovery and Biosynthesis of Streptosactin, a Sactipeptide with an Alternative Topology Encoded by Commensal Bacteria in the Human Microbiome. Journal of the American Chemical Society 2020, 142 (38) , 16265-16275. https://doi.org/10.1021/jacs.0c05546
    25. Joshua B. Pyser, Summer A. Baker Dockrey, Attabey Rodríguez Benítez, Leo A. Joyce, Ren A. Wiscons, Janet L. Smith, Alison R. H. Narayan. Stereodivergent, Chemoenzymatic Synthesis of Azaphilone Natural Products. Journal of the American Chemical Society 2019, 141 (46) , 18551-18559. https://doi.org/10.1021/jacs.9b09385
    26. Rémi Zallot, Nils Oberg, John A. Gerlt. The EFI Web Resource for Genomic Enzymology Tools: Leveraging Protein, Genome, and Metagenome Databases to Discover Novel Enzymes and Metabolic Pathways. Biochemistry 2019, 58 (41) , 4169-4182. https://doi.org/10.1021/acs.biochem.9b00735
    27. Jamison P. Huddleston, Frank M. Raushel. Biosynthesis of GDP-d-glycero-α-d-manno-heptose for the Capsular Polysaccharide of Campylobacter jejuni. Biochemistry 2019, 58 (37) , 3893-3902. https://doi.org/10.1021/acs.biochem.9b00548
    28. Yohei Morishita, Huiping Zhang, Tohru Taniguchi, Keiji Mori, Teigo Asai. The Discovery of Fungal Polyene Macrolides via a Postgenomic Approach Reveals a Polyketide Macrocyclization by trans-Acting Thioesterase in Fungi. Organic Letters 2019, 21 (12) , 4788-4792. https://doi.org/10.1021/acs.orglett.9b01674
    29. Nilkamal Mahanta, Katherine A. Hicks, Saad Naseem, Yang Zhang, Dmytro Fedoseyenko, Steven E. Ealick, Tadhg P. Begley. Menaquinone Biosynthesis: Biochemical and Structural Studies of Chorismate Dehydratase. Biochemistry 2019, 58 (14) , 1837-1840. https://doi.org/10.1021/acs.biochem.9b00105
    30. Leah B. Bushin, Kenzie A. Clark, István Pelczer, Mohammad R. Seyedsayamdost. Charting an Unexplored Streptococcal Biosynthetic Landscape Reveals a Unique Peptide Cyclization Motif. Journal of the American Chemical Society 2018, 140 (50) , 17674-17684. https://doi.org/10.1021/jacs.8b10266
    31. Thibault Annaval, Lu Han, Jeffrey D. Rudolf, Guangbo Xie, Dong Yang, Chin-Yuan Chang, Ming Ma, Ivana Crnovcic, Mitchell D. Miller, Jayashree Soman, Weijun Xu, George N. Phillips, Jr., Ben Shen. Biochemical and Structural Characterization of TtnD, a Prenylated FMN-Dependent Decarboxylase from the Tautomycetin Biosynthetic Pathway. ACS Chemical Biology 2018, 13 (9) , 2728-2738. https://doi.org/10.1021/acschembio.8b00673
    32. Janine N. Copp, Eyal Akiva, Patricia C. Babbitt, Nobuhiko Tokuriki. Revealing Unexplored Sequence-Function Space Using Sequence Similarity Networks. Biochemistry 2018, 57 (31) , 4651-4662. https://doi.org/10.1021/acs.biochem.8b00473
    33. Michael A. Welsh, Atsushi Taguchi, Kaitlin Schaefer, Daria Van Tyne, François Lebreton, Michael S. Gilmore, Daniel Kahne, Suzanne Walker. Identification of a Functionally Unique Family of Penicillin-Binding Proteins. Journal of the American Chemical Society 2017, 139 (49) , 17727-17730. https://doi.org/10.1021/jacs.7b10170
    34. Meng Wang, Wen-Wei Li, Zhe Cao, Jianong Sun, Jiang Xiong, Si-Qin Tao, Tinghong Lv, Kun Gao, Shangwen Luo, Shi-Hui Dong. Genome mining of sulfonated lanthipeptides reveals unique cyclic peptide sulfotransferases. Acta Pharmaceutica Sinica B 2024, 20 https://doi.org/10.1016/j.apsb.2024.02.016
    35. Annelise L. Goldman, Emily M. Fulk, Lily M. Momper, Clinton Heider, John Mulligan, Magdalena Osburn, Caroline A. Masiello, Jonathan J. Silberg. Microbial sensor variation across biogeochemical conditions in the terrestrial deep subsurface. mSystems 2024, 9 (1) https://doi.org/10.1128/msystems.00966-23
    36. Chin-Soon Phan, Brandon I. Morinaka. Bacterial cyclophane-containing RiPPs from radical SAM enzymes. Natural Product Reports 2024, 29 https://doi.org/10.1039/D3NP00030C
    37. Wisely Chua, Carl O. Marsh, Si En Poh, Winston LC. Koh, Melody Li Ying Lee, Li Fang Koh, Xin-Zi Emily Tang, Peter See, Zheng Ser, Shi Mei Wang, Radoslaw M. Sobota, Thomas L. Dawson, Yik Weng Yew, Steven Thng, Anthony J. O’Donoghue, Hazel H. Oon, John E. Common, Hao Li. A Malassezia pseudoprotease dominates the secreted hydrolase landscape and is a potential allergen on skin. Biochimie 2024, 216 , 181-193. https://doi.org/10.1016/j.biochi.2023.09.023
    38. Liangzhi Li, Zhenghua Liu, Delong Meng, Yongjun Liu, Tianbo Liu, Chengying Jiang, Huaqun Yin. Sequence similarity network and protein structure prediction offer insights into the evolution of microbial pathways for ferrous iron oxidation. mSystems 2023, 8 (5) https://doi.org/10.1128/msystems.00720-23
    39. Ulrike Vogel, Matthieu Da Costa, Carlos Alvarez Quispe, Robin Stragier, Henk‐Jan Joosten, Koen Beerens, Tom Desmet. The Conversion of UDP‐Glc to UDP‐Man: In Silico and Biochemical Exploration To Improve the Catalytic Efficiency of CDP‐Tyvelose C2‐Epimerases. ChemBioChem 2023, 446 https://doi.org/10.1002/cbic.202300549
    40. Jiayi Tian, David G. Boggs, Patrick H. Donnan, Gage T. Barroso, Alejandro Arcadio Garcia, Daniel P. Dowling, Joshua A. Buss, Jennifer Bridwell-Rabb. The NADH recycling enzymes TsaC and TsaD regenerate reducing equivalents for Rieske oxygenase chemistry. Journal of Biological Chemistry 2023, 299 (10) , 105222. https://doi.org/10.1016/j.jbc.2023.105222
    41. Liangzhi Li, Lei Zhou, Chengying Jiang, Zhenghua Liu, Delong Meng, Feng Luo, Qiang He, Huaqun Yin. AI-driven pan-proteome analyses reveal insights into the biohydrometallurgical properties of Acidithiobacillia. Frontiers in Microbiology 2023, 14 https://doi.org/10.3389/fmicb.2023.1243987
    42. Nils Oberg, Rémi Zallot, John A. Gerlt. EFI-EST, EFI-GNT, and EFI-CGFP: Enzyme Function Initiative (EFI) Web Resource for Genomic Enzymology Tools. Journal of Molecular Biology 2023, 435 (14) , 168018. https://doi.org/10.1016/j.jmb.2023.168018
    43. Adam Begeman, Artem Babaian, Samantha C. Lewis. Metatranscriptomic analysis uncovers prevalent viral ORFs compatible with mitochondrial translation. mSystems 2023, 8 (3) https://doi.org/10.1128/msystems.01002-22
    44. Mariia A. Beliaeva, Matthias Wilmanns, Michael Zimmermann. Decipher enzymes from human microbiota for drug discovery and development. Current Opinion in Structural Biology 2023, 80 , 102567. https://doi.org/10.1016/j.sbi.2023.102567
    45. Eva Kaulich, Patrick T. N. McCubbin, William R. Schafer, Denise S. Walker. Physiological insight into the conserved properties of Caenorhabditis elegans acid‐sensing degenerin/epithelial sodium channels. The Journal of Physiology 2023, 601 (9) , 1625-1653. https://doi.org/10.1113/JP283238
    46. Miriam Kronen, Xabier Vázquez-Campos, Marc R. Wilkins, Matthew Lee, Michael J. Manefield. Evidence for a Putative Isoprene Reductase in Acetobacterium wieringae. mSystems 2023, 8 (2) https://doi.org/10.1128/msystems.00119-23
    47. Minshik Jo, Madison Knapp, David G. Boggs, Marley Brimberry, Patrick H. Donnan, Jennifer Bridwell-Rabb. A structure-function analysis of chlorophyllase reveals a mechanism for activity regulation dependent on disulfide bonds. Journal of Biological Chemistry 2023, 299 (3) , 102958. https://doi.org/10.1016/j.jbc.2023.102958
    48. Hayley L. Knox, Karen N. Allen. Expanding the viewpoint: Leveraging sequence information in enzymology. Current Opinion in Chemical Biology 2023, 72 , 102246. https://doi.org/10.1016/j.cbpa.2022.102246
    49. Fabian Thomas, Oliver Kayser. Improving CBCA synthase activity through rational protein design. Journal of Biotechnology 2023, 363 , 40-49. https://doi.org/10.1016/j.jbiotec.2023.01.004
    50. Yu Meng Yang, Er Juan Zhao, Wanqing Wei, Zi Fei Xu, Jing Shi, Xuan Wu, Bo Zhang, Yasuhiro Igarashi, Rui Hua Jiao, Yong Liang, Ren Xiang Tan, Hui Ming Ge. Cytochrome P450 Catalyzes Benzene Ring Formation in the Biosynthesis of Trialkyl‐Substituted Aromatic Polyketides. Angewandte Chemie International Edition 2023, 62 (5) https://doi.org/10.1002/anie.202214026
    51. Yu Meng Yang, Er Juan Zhao, Wanqing Wei, Zi Fei Xu, Jing Shi, Xuan Wu, Bo Zhang, Yasuhiro Igarashi, Rui Hua Jiao, Yong Liang, Ren Xiang Tan, Hui Ming Ge. Cytochrome P450 Catalyzes Benzene Ring Formation in the Biosynthesis of Trialkyl‐Substituted Aromatic Polyketides. Angewandte Chemie 2023, 135 (5) https://doi.org/10.1002/ange.202214026
    52. Patricia Molina-Espeja, Laura Fernandez-Lopez, Peter N. Golyshin, Manuel Ferrer. Assigning Functions of Unknown Enzymes by High-Throughput Enzyme Characterization. 2023, 181-194. https://doi.org/10.1007/978-1-0716-2795-2_13
    53. Vesna Simunović, Ivan Grubišić. Amino acid (acyl carrier protein) ligase-associated biosynthetic gene clusters reveal unexplored biosynthetic potential. Molecular Genetics and Genomics 2023, 298 (1) , 49-65. https://doi.org/10.1007/s00438-022-01962-7
    54. Maybelle Kho Go, Tingting Zhu, Kevin Jie Han Lim, Yossa Dwi Hartono, Bo Xue, Hao Fan, Wen Shan Yew. Cannabinoid Biosynthesis Using Noncanonical Cannabinoid Synthases. International Journal of Molecular Sciences 2023, 24 (2) , 1259. https://doi.org/10.3390/ijms24021259
    55. Evan Mann, Shahrokh Shekarriz, Michael G. Surette. Human Gut Metagenomes Encode Diverse GH156 Sialidases. Applied and Environmental Microbiology 2022, 88 (23) https://doi.org/10.1128/aem.01755-22
    56. Bin Li, Minshik Jo, Jianxin Liu, Jiayi Tian, Robert Canfield, Jennifer Bridwell-Rabb. Structural and mechanistic basis for redox sensing by the cyanobacterial transcription regulator RexT. Communications Biology 2022, 5 (1) https://doi.org/10.1038/s42003-022-03226-x
    57. Vida M. B. Leite, Leandro M. Garrido, Marcelo M. P. Tangerina, Leticia V. Costa-Lotufo, Marcelo J. P. Ferreira, Gabriel Padilla. Genome mining of Streptomyces sp. BRB081 reveals the production of the antitumor pyrrolobenzodiazepine sibiromycin. 3 Biotech 2022, 12 (10) https://doi.org/10.1007/s13205-022-03305-0
    58. Leah B. Bushin, Brett C. Covington, Kenzie A. Clark, Alessio Caruso, Mohammad R. Seyedsayamdost. Bicyclostreptins are radical SAM enzyme-modified peptides with unique cyclization motifs. Nature Chemical Biology 2022, 18 (10) , 1135-1143. https://doi.org/10.1038/s41589-022-01090-8
    59. Burhan Hamid, Zaffar Bashir, Ali Mohd Yatoo, Fayaz Mohiddin, Neesa Majeed, Monika Bansal, Peter Poczai, Waleed Hassan Almalki, R. Z. Sayyed, Ali A. Shati, Mohammad Y. Alfaifi. Cold-Active Enzymes and Their Potential Industrial Applications—A Review. Molecules 2022, 27 (18) , 5885. https://doi.org/10.3390/molecules27185885
    60. Yifan Zhang, Julia E. Martin, Katherine A. Edmonds, Malcolm E. Winkler, David P. Giedroc. SifR is an Rrf2-family quinone sensor associated with catechol iron uptake in Streptococcus pneumoniae D39. Journal of Biological Chemistry 2022, 298 (7) , 102046. https://doi.org/10.1016/j.jbc.2022.102046
    61. Zhifeng Zeng, Yu Chen, Rafael Pinilla-Redondo, Shiraz A. Shah, Fen Zhao, Chen Wang, Zeyu Hu, Chang Wu, Changyi Zhang, Rachel J. Whitaker, Qunxin She, Wenyuan Han. A short prokaryotic Argonaute activates membrane effector to confer antiviral defense. Cell Host & Microbe 2022, 30 (7) , 930-943.e6. https://doi.org/10.1016/j.chom.2022.04.015
    62. Tingting Huang, Zihua Zhou, Maolong Wei, Lin Chen, Zhihong Xiao, Zixin Deng, Shuangjun Lin. Characterization of Pyridomycin B Reveals the Formation of Functional Groups in Antimycobacterial Pyridomycin. Applied and Environmental Microbiology 2022, 88 (6) https://doi.org/10.1128/aem.02035-21
    63. He Li, Junfeng Zhao, Wei Ding, Qi Zhang. Glucuronyl C4 dehydrogenation by the radical SAM enzyme BlsE involved in blasticidin S biosynthesis. Chemical Communications 2022, 58 (21) , 3561-3564. https://doi.org/10.1039/D1CC07132G
    64. Hayley L. Knox, Erica K. Sinner, Craig A. Townsend, Amie K. Boal, Squire J. Booker. Structure of a B12-dependent radical SAM enzyme in carbapenem biosynthesis. Nature 2022, 602 (7896) , 343-348. https://doi.org/10.1038/s41586-021-04392-4
    65. Baolei Jia, Xiao Han, Kyung Hyun Kim, Che Ok Jeon. Discovery and mining of enzymes from the human gut microbiome. Trends in Biotechnology 2022, 40 (2) , 240-254. https://doi.org/10.1016/j.tibtech.2021.06.008
    66. Yongxin Li, Hua Huang, Xinshuai Zhang. Identification of catabolic pathway for 1-deoxy-D-sorbitol in Bacillus licheniformis. Biochemical and Biophysical Research Communications 2022, 586 , 81-86. https://doi.org/10.1016/j.bbrc.2021.11.072
    67. Letícia Ferreira Lima, André Quintanilha Torres, Rodrigo Jardim, Rafael Dias Mesquita, Renata Schama. Evolution of Toll, Spatzle and MyD88 in insects: the problem of the Diptera bias. BMC Genomics 2021, 22 (1) https://doi.org/10.1186/s12864-021-07886-7
    68. Jolyn Pan, Kjersti Lian, Aili Sarre, Hanna-Kirsti S. Leiros, Adele Williamson. Bacteriophage origin of some minimal ATP-dependent DNA ligases: a new structure from Burkholderia pseudomallei with striking similarity to Chlorella virus ligase. Scientific Reports 2021, 11 (1) https://doi.org/10.1038/s41598-021-98155-w
    69. Michael P. Andreas, Tobias W. Giessen. Large-scale computational discovery and analysis of virus-derived microbial nanocompartments. Nature Communications 2021, 12 (1) https://doi.org/10.1038/s41467-021-25071-y
    70. Zuodong Sun, Bing Xu, Shaun Spisak, Jennifer M. Kavran, Steven E. Rokita. The minimal structure for iodotyrosine deiodinase function is defined by an outlier protein from the thermophilic bacterium Thermotoga neapolitana. Journal of Biological Chemistry 2021, 297 (6) , 101385. https://doi.org/10.1016/j.jbc.2021.101385
    71. Jaire A. Ferreira Filho, Rafaela R. Rosolen, Deborah A. Almeida, Paulo Henrique C. de Azevedo, Maria Lorenza L. Motta, Alexandre H. Aono, Clelton A. dos Santos, Maria Augusta C. Horta, Anete P. de Souza. Trends in biological data integration for the selection of enzymes and transcription factors related to cellulose and hemicellulose degradation in fungi. 3 Biotech 2021, 11 (11) https://doi.org/10.1007/s13205-021-03032-y
    72. Roland Wohlgemuth. Bio-based resources, bioprocesses and bioproducts in value creation architectures for bioeconomy markets and beyond – What really matters. EFB Bioeconomy Journal 2021, 1 , 100009. https://doi.org/10.1016/j.bioeco.2021.100009
    73. Yavuz Öztürk, Crysten E. Blaby-Haas, Noel Daum, Andreea Andrei, Juna Rauch, Fevzi Daldal, Hans-Georg Koch. Maturation of Rhodobacter capsulatus Multicopper Oxidase CutO Depends on the CopA Copper Efflux Pathway and Requires the cutF Product. Frontiers in Microbiology 2021, 12 https://doi.org/10.3389/fmicb.2021.720644
    74. Yue Yin, Xinjian Ji, Qi Zhang. The Promiscuous Activity of the Radical SAM Enzyme NosL toward Two Unnatural Substrates. Chinese Journal of Chemistry 2021, 39 (9) , 2417-2421. https://doi.org/10.1002/cjoc.202100304
    75. Alex S. Grossman, Terra J. Mauer, Katrina T. Forest, Heidi Goodrich-Blair. A Widespread Bacterial Secretion System with Diverse Substrates. mBio 2021, 12 (4) https://doi.org/10.1128/mBio.01956-21
    76. Vesna Simunović. Genomic and molecular evidence reveals novel pathways associated with cell surface polysaccharides in bacteria. FEMS Microbiology Ecology 2021, https://doi.org/10.1093/femsec/fiab119
    77. Katherine A Edmonds, Matthew R Jordan, David P Giedroc. COG0523 proteins: a functionally diverse family of transition metal-regulated G3E P-loop GTP hydrolases from bacteria to man. Metallomics 2021, 13 (8) https://doi.org/10.1093/mtomcs/mfab046
    78. Tristan de Rond, Julia E. Asay, Bradley S. Moore. Co-occurrence of enzyme domains guides the discovery of an oxazolone synthetase. Nature Chemical Biology 2021, 17 (7) , 794-799. https://doi.org/10.1038/s41589-021-00808-4
    79. Anastasia C. Manesis, Richard J. Jodts, Brian M. Hoffman, Amy C. Rosenzweig. Copper binding by a unique family of metalloproteins is dependent on kynurenine formation. Proceedings of the National Academy of Sciences 2021, 118 (23) https://doi.org/10.1073/pnas.2100680118
    80. Ricardo Valencia, Valentina González, Agustina Undabarrena, Leonardo Zamora-Leiva, Juan A. Ugalde, Beatriz Cámara. An Integrative Bioinformatic Analysis for Keratinase Detection in Marine-Derived Streptomyces. Marine Drugs 2021, 19 (6) , 286. https://doi.org/10.3390/md19060286
    81. Remi Zallot, Nils Oberg, John A Gerlt. Discovery of new enzymatic functions and metabolic pathways using genomic enzymology web tools. Current Opinion in Biotechnology 2021, 69 , 77-90. https://doi.org/10.1016/j.copbio.2020.12.004
    82. Yan Zhou, Xuexia Xu, Yifeng Wei, Yu Cheng, Yu Guo, Ivan Khudyakov, Fuli Liu, Ping He, Zhangyue Song, Zhi Li, Yan Gao, Ee Lui Ang, Huimin Zhao, Yan Zhang, Suwen Zhao. A widespread pathway for substitution of adenine by diaminopurine in phage genomes. Science 2021, 372 (6541) , 512-516. https://doi.org/10.1126/science.abe4882
    83. Robert J Nichols, Benjamin LaFrance, Naiya R Phillips, Devon R Radford, Luke M Oltrogge, Luis E Valentin-Alvarado, Amanda J Bischoff, Eva Nogales, David F Savage. Discovery and characterization of a novel family of prokaryotic nanocompartments involved in sulfur metabolism. eLife 2021, 10 https://doi.org/10.7554/eLife.59288
    84. Chelsea J. Vickers, Dean Fraga, Wayne M. Patrick. Quantifying the taxonomic bias in enzymology. Protein Science 2021, 30 (4) , 914-921. https://doi.org/10.1002/pro.4041
    85. Jinduo Cheng, Wenjuan Ji, Suze Ma, Xinjian Ji, Zixin Deng, Wei Ding, Qi Zhang. Characterization and Mechanistic Study of the Radical SAM Enzyme ArsS Involved in Arsenosugar Biosynthesis. Angewandte Chemie 2021, 133 (14) , 7648-7653. https://doi.org/10.1002/ange.202015177
    86. Jinduo Cheng, Wenjuan Ji, Suze Ma, Xinjian Ji, Zixin Deng, Wei Ding, Qi Zhang. Characterization and Mechanistic Study of the Radical SAM Enzyme ArsS Involved in Arsenosugar Biosynthesis. Angewandte Chemie International Edition 2021, 60 (14) , 7570-7575. https://doi.org/10.1002/anie.202015177
    87. Jing Shi, Xiang Xu, Pei Yi Liu, Yi Ling Hu, Bo Zhang, Rui Hua Jiao, Ghader Bashiri, Ren Xiang Tan, Hui Ming Ge. Discovery and biosynthesis of guanipiperazine from a NRPS-like pathway. Chemical Science 2021, 12 (8) , 2925-2930. https://doi.org/10.1039/D0SC06135B
    88. Priyam Raut, Jennifer B. Glass, Raquel L. Lieberman. Archaeal roots of intramembrane aspartyl protease siblings signal peptide peptidase and presenilin. Proteins: Structure, Function, and Bioinformatics 2021, 89 (2) , 232-241. https://doi.org/10.1002/prot.26009
    89. Jonathan Chiu-Chun Chou, Veronica E. Stafford, Grace E. Kenney, Laura M.K. Dassama. The enzymology of oxazolone and thioamide synthesis in methanobactin. 2021, 341-373. https://doi.org/10.1016/bs.mie.2021.04.008
    90. Zhifeng Zeng, Yu Chen, Rafael Pinilla-Redondo, Shiraz A Shah, Fen Zhao, Chen Wang, Zeyu Hu, Changyi Zhang, Rachel J. Whitaker, Qunxin She, Wenyuan Han. A Short Prokaryotic Argonaute Activates Membrane Effector to Confer Antiviral Defense. SSRN Electronic Journal 2021, 353 https://doi.org/10.2139/ssrn.3988392
    91. Liangzhi Li, Zhenghua Liu, Min Zhang, Delong Meng, Xueduan Liu, Pei Wang, Xiutong Li, Zhen Jiang, Shuiping Zhong, Chengying Jiang, Huaqun Yin. Insights into the Metabolism and Evolution of the Genus Acidiphilium, a Typical Acidophile in Acid Mine Drainage. mSystems 2020, 5 (6) https://doi.org/10.1128/mSystems.00867-20
    92. Ady Berenice Meléndez, Daniel Valencia, Erik Thomas Yukl. Specificity of Interactions between Components of Two Zinc ABC Transporters in Paracoccus denitrificans. International Journal of Molecular Sciences 2020, 21 (23) , 9098. https://doi.org/10.3390/ijms21239098
    93. Benjamin Dose, Claudia Ross, Sarah P. Niehs, Kirstin Scherlach, Johanna P. Bauer, Christian Hertweck. Food‐Poisoning Bacteria Employ a Citrate Synthase and a Type II NRPS To Synthesize Bolaamphiphilic Lipopeptide Antibiotics. Angewandte Chemie International Edition 2020, 59 (48) , 21535-21540. https://doi.org/10.1002/anie.202009107
    94. Benjamin Dose, Claudia Ross, Sarah P. Niehs, Kirstin Scherlach, Johanna P. Bauer, Christian Hertweck. Food‐Poisoning Bacteria Employ a Citrate Synthase and a Type II NRPS To Synthesize Bolaamphiphilic Lipopeptide Antibiotics. Angewandte Chemie 2020, 132 (48) , 21719-21724. https://doi.org/10.1002/ange.202009107
    95. Thi Quynh Ngoc Nguyen, Yi Wei Tooh, Ryosuke Sugiyama, Thi Phuong Diep Nguyen, Mugilarasi Purushothaman, Li Chuan Leow, Karyna Hanif, Rubin How Sheng Yong, Irene Agatha, Fernaldo R. Winnerdy, Muriel Gugger, Anh Tuân Phan, Brandon I. Morinaka. Post-translational formation of strained cyclophanes in bacteria. Nature Chemistry 2020, 12 (11) , 1042-1053. https://doi.org/10.1038/s41557-020-0519-z
    96. Adele Williamson, Hanna-Kirsti S Leiros. Structural insight into DNA joining: from conserved mechanisms to diverse scaffolds. Nucleic Acids Research 2020, 48 (15) , 8225-8242. https://doi.org/10.1093/nar/gkaa307
    97. Hila Levy, Rafaela S. Fontenele, Ciara Harding, Crystal Suazo, Simona Kraberger, Kara Schmidlin, Anni Djurhuus, Caitlin E. Black, Tom Hart, Adrian L. Smith, Arvind Varsani. Identification and Distribution of Novel Cressdnaviruses and Circular Molecules in Four Penguin Species in South Georgia and the Antarctic Peninsula. Viruses 2020, 12 (9) , 1029. https://doi.org/10.3390/v12091029
    98. Ian J. Campbell, Jose Luis Olmos, Weijun Xu, Dimithree Kahanda, Joshua T. Atkinson, Othneil Noble Sparks, Mitchell D. Miller, George N. Phillips, George N. Bennett, Jonathan J. Silberg. Prochlorococcus phage ferredoxin: structural characterization and electron transfer to cyanobacterial sulfite reductases. Journal of Biological Chemistry 2020, 295 (31) , 10610-10623. https://doi.org/10.1074/jbc.RA120.013501
    99. Selamawit M. Ghebreamlak, Steven O. Mansoorabadi. Divergent Members of the Nitrogenase Superfamily: Tetrapyrrole Biosynthesis and Beyond. ChemBioChem 2020, 21 (12) , 1723-1728. https://doi.org/10.1002/cbic.201900782
    100. Amedea Perfumo, Georg Johannes Freiherr von Sass, Eva-Lena Nordmann, Nediljko Budisa, Dirk Wagner. Discovery and Characterization of a New Cold-Active Protease From an Extremophilic Bacterium via Comparative Genome Analysis and in vitro Expression. Frontiers in Microbiology 2020, 11 https://doi.org/10.3389/fmicb.2020.00881
    • Abstract

      Figure 1. Growth of the UniProt protein sequence database (Release 2017_07). The blue line represents the EMBL/TrEMBL sequences with automated annotations; the red line represents the EMBL/SwissProt with manually curated annotations. Currently, the doubling time is ∼2.5 years. The number of sequences decreased by ∼50% in April 2015 when UniProt identified reference proteomes for closely related species and archived the redundant proteomes.

      Figure 2. A sequence similarity network (SSN) showing the protein sequence nodes and pairwise sequence similarity edges.

      Figure 3. SSNs for sequences from the proline racemase family (Pfam family PF05544). (A) Alignment score ≥15, ≥22% pairwise sequence identity. (B) Alignment score ≥20, ≥25% pairwise sequence identity. (C) Alignment score ≥50, ≥35% sequence identity. (D) Alignment score ≥70, ≥40% sequence identity. (E) Alignment score ≥90, ≥48% sequence identity. (F) Alignment score ≥110, ≥58% sequence identity. The colors in panel F are used to color the nodes in panels A–E.
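
      The effect of raising the alignment score threshold, as in panels A–F of Figure 3, can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the node names, the scores, and the function `ssn_clusters` are hypothetical and are not part of EFI-EST. Edges below the threshold are discarded, and the remaining connected components are reported as clusters.

```python
from collections import defaultdict


def ssn_clusters(nodes, scored_edges, min_score):
    """Return clusters (connected components) using only edges with score >= min_score.

    nodes: iterable of sequence identifiers (hypothetical names here).
    scored_edges: iterable of (node_a, node_b, alignment_score) tuples.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Union-find lookup with path halving.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b, score in scored_edges:
        if score >= min_score:
            parent[find(a)] = find(b)  # merge the two components

    clusters = defaultdict(set)
    for n in nodes:
        clusters[find(n)].add(n)
    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Hypothetical toy network: five sequences and four scored edges.
nodes = ["P1", "P2", "P3", "P4", "P5"]
edges = [("P1", "P2", 120), ("P2", "P3", 60), ("P3", "P4", 115), ("P4", "P5", 18)]

loose = ssn_clusters(nodes, edges, 15)    # permissive threshold: one large cluster
strict = ssn_clusters(nodes, edges, 110)  # stringent threshold: family segregates
print(loose)
print(strict)
```

      With the permissive threshold all five toy sequences fall into one cluster; with the stringent threshold they separate into two pairs and a singleton, mirroring the progressive segregation seen across the panels.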

      Figure 4. Examples of SSNs generated with EFI-EST that were included in recent publications. (A) SSN for isopeptidases involved in lasso peptide synthesis. (43) (B) SSN of precursor peptides for microviridin synthesis. (60) (C) SSN of LanMs in lantibiotic synthesis. (76) (D) SSN for ferredoxins compared with a phylogenetic tree. (40) (E) SSN for IspH in isoprenoid biosynthesis. (56) (F) SSNs for members of the DRE-TIM metallolyase superfamily. (52) Figures reproduced with permission from refs 40, 43, 52, 56, 60, and 76.

      Figure 5. (A) A colored SSN for the proline racemase family (PF05544; InterPro Release 43.0). (B) The GNN generated by an all-by-all BLAST of the genome neighbors. (C) Three pathways catalyzed by members of the proline racemase family. The nodes in the GNN (panel B) are colored using the cluster colors in the SSN (panel A). Figures reproduced with permission from ref 97.

      Figure 6. (A) SSN for the proline racemase family (PF05544, InterPro Release 63.0) segregated with an alignment score of ≥110 (≥58% pairwise sequence identity). (B) Colored SSN generated by the EFI-GNT web tool. (C, D) GNN with SSN cluster hub-nodes and Pfam family spoke-nodes. (E, F) GNN with Pfam family hub-nodes and SSN cluster spoke-nodes. The GNNs were generated with a ±10 orf genome neighborhood window and a query-neighbor co-occurrence threshold of 20%.

      Figure 7. GNN for SSN cluster 16 presented at different query-neighbor co-occurrence frequencies. (A) 3%. (B) 5%. (C) 10%. (D) 12%. (E) 15%. (F) 20%.
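
      The query-neighbor co-occurrence threshold varied in Figure 7 can be sketched in Python as follows. This is a hedged illustration, not the EFI-GNT implementation: it assumes co-occurrence is the fraction of query sequences in a cluster whose genome neighborhood window contains at least one member of a given Pfam family, and the identifiers (`Q1`, `PF_A`, and so on) and function names are hypothetical.

```python
def cooccurrence(neighborhoods):
    """Fraction of queries whose neighborhood contains each Pfam family.

    neighborhoods: dict mapping a query ID to the set of Pfam family IDs
    found within its genome neighborhood window (assumed precomputed).
    """
    total = len(neighborhoods)
    counts = {}
    for pfams in neighborhoods.values():
        for pfam in pfams:  # a set, so each family counts once per query
            counts[pfam] = counts.get(pfam, 0) + 1
    return {pfam: n / total for pfam, n in counts.items()}


def filter_neighbors(neighborhoods, min_fraction):
    """Keep only Pfam families at or above the co-occurrence threshold."""
    freqs = cooccurrence(neighborhoods)
    return sorted(p for p, f in freqs.items() if f >= min_fraction)


# Hypothetical cluster of four queries and the families near each one.
cluster = {
    "Q1": {"PF_A", "PF_B", "PF_C"},
    "Q2": {"PF_A", "PF_B"},
    "Q3": {"PF_A"},
    "Q4": {"PF_A"},
}

at_20_percent = filter_neighbors(cluster, 0.20)  # PF_C (25%) is retained
at_30_percent = filter_neighbors(cluster, 0.30)  # PF_C is filtered out
print(at_20_percent)
print(at_30_percent)
```

      Lowering the threshold retains rarer neighbor families at the cost of a busier network, which is the trade-off the panels of Figure 7 display.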

      Figure 8. (A) Strategy for discovering catabolic pathways for d-threitol, l-threitol, and erythritol in M. smegmatis using differential scanning fluorimetry (DSF) to screen the ligand specificities of SBPs and the integrated use of SSNs and GNNs to discover the pathway enzymes. (B) Catabolic pathways for d-threitol, l-threitol, and erythritol. (C) Catabolic pathways for d-threonate, l-threonate, and d-erythronate in R. eutropha H16. (59) Figures in panels A and B reproduced with permission from ref 34; figure in panel C reproduced with permission from ref 59.

      Figure 9. (A) Strategy for chemically guided functional profiling. (B) SSN for the glycyl radical enzyme superfamily showing clusters with previously assigned functions as well as clusters (15 and 16) for which chemically guided functional profiling was used to leverage experimental functional assignment. Figures reproduced with permission from ref 72.

      Figure 10. (A) Colored SSN generated by EFI-GNT for selected clusters in the proline racemase family (PF05544). (B) GNN with SSN cluster hub-nodes and Pfam family spoke-nodes. (C) GNN with Pfam family hub-nodes and SSN cluster spoke-nodes. (D) Refined GNN showing identification of three different functions as deduced by connections (or lack thereof) between SSN cluster and Pfam family nodes.

    • References

      This article references 102 other publications.

      1.
        Gerlt, J. A. and Babbitt, P. C. (2001) Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies Annu. Rev. Biochem. 70, 209–246 DOI: 10.1146/annurev.biochem.70.1.209
      2.
        Furnham, N., Dawson, N. L., Rahman, S. A., Thornton, J. M., and Orengo, C. A. (2016) Large-Scale Analysis Exploring Evolution of Catalytic Machineries and Mechanisms in Enzyme Superfamilies J. Mol. Biol. 428, 253–267 DOI: 10.1016/j.jmb.2015.11.010
      3.
        Mukherjee, S., Seshadri, R., Varghese, N. J., Eloe-Fadrosh, E. A., Meier-Kolthoff, J. P., Goker, M., Coates, R. C., Hadjithomas, M., Pavlopoulos, G. A., Paez-Espino, D., Yoshikuni, Y., Visel, A., Whitman, W. B., Garrity, G. M., Eisen, J. A., Hugenholtz, P., Pati, A., Ivanova, N. N., Woyke, T., Klenk, H. P., and Kyrpides, N. C. (2017) 1,003 reference genomes of bacterial and archaeal isolates expand coverage of the tree of life Nat. Biotechnol. 35, 676–683 DOI: 10.1038/nbt.3886
      4.
        Schnoes, A. M., Brown, S. D., Dodevski, I., and Babbitt, P. C. (2009) Annotation error in public databases: misannotation of molecular function in enzyme superfamilies PLoS Comput. Biol. 5, e1000605 DOI: 10.1371/journal.pcbi.1000605
      5.
        Khosla, C. (2015) Quo vadis, enzymology? Nat. Chem. Biol. 11, 438–441 DOI: 10.1038/nchembio.1844
      6.
        Gerlt, J. A., Allen, K. N., Almo, S. C., Armstrong, R. N., Babbitt, P. C., Cronan, J. E., Dunaway-Mariano, D., Imker, H. J., Jacobson, M. P., Minor, W., Poulter, C. D., Raushel, F. M., Sali, A., Shoichet, B. K., and Sweedler, J. V. (2011) The Enzyme Function Initiative Biochemistry 50, 9950–9962 DOI: 10.1021/bi201312u
      7.
        Ikeda, H., Nonomiya, T., Usami, M., Ohta, T., and Omura, S. (1999) Organization of the biosynthetic gene cluster for the polyketide anthelmintic macrolide avermectin in Streptomyces avermitilis Proc. Natl. Acad. Sci. U. S. A. 96, 9509–9514 DOI: 10.1073/pnas.96.17.9509
      8.
        Bentley, S. D., Chater, K. F., Cerdeno-Tarraga, A. M., Challis, G. L., Thomson, N. R., James, K. D., Harris, D. E., Quail, M. A., Kieser, H., Harper, D., Bateman, A., Brown, S., Chandra, G., Chen, C. W., Collins, M., Cronin, A., Fraser, A., Goble, A., Hidalgo, J., Hornsby, T., Howarth, S., Huang, C. H., Kieser, T., Larke, L., Murphy, L., Oliver, K., O’Neil, S., Rabbinowitsch, E., Rajandream, M. A., Rutherford, K., Rutter, S., Seeger, K., Saunders, D., Sharp, S., Squares, R., Squares, S., Taylor, K., Warren, T., Wietzorrek, A., Woodward, J., Barrell, B. G., Parkhill, J., and Hopwood, D. A. (2002) Complete genome sequence of the model actinomycete Streptomyces coelicolor A3(2) Nature 417, 141–147 DOI: 10.1038/417141a
      9.
        Ikeda, H., Ishikawa, J., Hanamoto, A., Shinose, M., Kikuchi, H., Shiba, T., Sakaki, Y., Hattori, M., and Omura, S. (2003) Complete genome sequence and comparative analysis of the industrial microorganism Streptomyces avermitilis Nat. Biotechnol. 21, 526–531 DOI: 10.1038/nbt820
      10.
        Yu, X., Doroghazi, J. R., Janga, S. C., Zhang, J. K., Circello, B., Griffin, B. M., Labeda, D. P., and Metcalf, W. W. (2013) Diversity and abundance of phosphonate biosynthetic genes in nature Proc. Natl. Acad. Sci. U. S. A. 110, 20759–20764 DOI: 10.1073/pnas.1315107110
      11.
        Ju, K. S., Gao, J., Doroghazi, J. R., Wang, K. K., Thibodeaux, C. J., Li, S., Metzger, E., Fudala, J., Su, J., Zhang, J. K., Lee, J., Cioni, J. P., Evans, B. S., Hirota, R., Labeda, D. P., van der Donk, W. A., and Metcalf, W. W. (2015) Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes Proc. Natl. Acad. Sci. U. S. A. 112, 12175–12180 DOI: 10.1073/pnas.1500873112
      12.
        Medema, M. H. and Fischbach, M. A. (2015) Computational approaches to natural product discovery Nat. Chem. Biol. 11, 639–648 DOI: 10.1038/nchembio.1884
      13.
        Tietz, J. I. and Mitchell, D. A. (2016) Using Genomics for Natural Product Structure Elucidation Curr. Top. Med. Chem. 16, 1645–1694 DOI: 10.2174/1568026616666151012111439
      14.
        Blin, K., Wolf, T., Chevrette, M. G., Lu, X., Schwalen, C. J., Kautsar, S. A., Suarez Duran, H. G., de Los Santos, E. L. C., Kim, H. U., Nave, M., Dickschat, J. S., Mitchell, D. A., Shelest, E., Breitling, R., Takano, E., Lee, S. Y., Weber, T., and Medema, M. H. (2017) antiSMASH 4.0-improvements in chemistry prediction and gene cluster boundary identification Nucleic Acids Res. 45, W36–W41 DOI: 10.1093/nar/gkx319
      15.
        Skinnider, M. A., Merwin, N. J., Johnston, C. W., and Magarvey, N. A. (2017) PRISM 3: expanded prediction of natural product chemical structures from microbial genomes Nucleic Acids Res. 45, W49–W54 DOI: 10.1093/nar/gkx320
      16.
        Tietz, J. I., Schwalen, C. J., Patel, P. S., Maxson, T., Blair, P. M., Tai, H. C., Zakai, U. I., and Mitchell, D. A. (2017) A new genome-mining tool redefines the lasso peptide biosynthetic landscape Nat. Chem. Biol. 13, 470–478 DOI: 10.1038/nchembio.2319
      17.
        Akiva, E., Brown, S., Almonacid, D. E., Barber, A. E., 2nd, Custer, A. F., Hicks, M. A., Huang, C. C., Lauck, F., Mashiyama, S. T., Meng, E. C., Mischel, D., Morris, J. H., Ojha, S., Schnoes, A. M., Stryke, D., Yunes, J. M., Ferrin, T. E., Holliday, G. L., and Babbitt, P. C. (2014) The Structure-Function Linkage Database Nucleic Acids Res. 42, D521–530 DOI: 10.1093/nar/gkt1130
      18.
        Babbitt, P. C., Hasson, M. S., Wedekind, J. E., Palmer, D. R., Barrett, W. C., Reed, G. H., Rayment, I., Ringe, D., Kenyon, G. L., and Gerlt, J. A. (1996) The enolase superfamily: a general strategy for enzyme-catalyzed abstraction of the alpha-protons of carboxylic acids Biochemistry 35, 16489–16501 DOI: 10.1021/bi9616413
      19.
        Gerlt, J. A., Babbitt, P. C., Jacobson, M. P., and Almo, S. C. (2012) Divergent evolution in enolase superfamily: strategies for assigning functions J. Biol. Chem. 287, 29–34 DOI: 10.1074/jbc.R111.240945
      20.
        Schmidt, D. M., Mundorff, E. C., Dojka, M., Bermudez, E., Ness, J. E., Govindarajan, S., Babbitt, P. C., Minshull, J., and Gerlt, J. A. (2003) Evolutionary potential of (beta/alpha)8-barrels: functional promiscuity produced by single substitutions in the enolase superfamily Biochemistry 42, 8387–8393 DOI: 10.1021/bi034769a
      21.
        Vick, J. E., Schmidt, D. M., and Gerlt, J. A. (2005) Evolutionary potential of (beta/alpha)8-barrels: in vitro enhancement of a “new” reaction in the enolase superfamily Biochemistry 44, 11722–11729 DOI: 10.1021/bi050963g
      22.
        Engelhardt, B. E., Jordan, M. I., Repo, S. T., and Brenner, S. E. (2009) Phylogenetic molecular function annotation J. Phys.: Conf. Ser. 180, 012024 DOI: 10.1088/1742-6596/180/1/012024
      23.
        Liu, K., Raghavan, S., Nelesen, S., Linder, C. R., and Warnow, T. (2009) Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees Science 324, 1561–1564 DOI: 10.1126/science.1171243
      24.
        Price, M. N., Dehal, P. S., and Arkin, A. P. (2010) FastTree 2--approximately maximum-likelihood trees for large alignments PLoS One 5, e9490 DOI: 10.1371/journal.pone.0009490
      25.
        Atkinson, H. J., Morris, J. H., Ferrin, T. E., and Babbitt, P. C. (2009) Using sequence similarity networks for visualization of relationships across diverse protein superfamilies PLoS One 4, e4345 DOI: 10.1371/journal.pone.0004345
      26.
        Kohl, M., Wiese, S., and Warscheid, B. (2011) Cytoscape: software for visualization and analysis of biological networks Methods Mol. Biol. 696, 291–303 DOI: 10.1007/978-1-60761-987-1_18
      27.
        Brown, S. D., Gerlt, J. A., Seffernick, J. L., and Babbitt, P. C. (2006) A gold standard set of mechanistically diverse enzyme superfamilies Genome Biol. 7, R8 DOI: 10.1186/gb-2006-7-1-r8
      28.
        Barber, A. E., 2nd and Babbitt, P. C. (2012) Pythoscape: a framework for generation of large protein similarity networks Bioinformatics 28, 2845–2846 DOI: 10.1093/bioinformatics/bts532
      29.
        Li, W., Kinch, L. N., and Grishin, N. V. (2013) Pclust: protein network visualization highlighting experimental data Bioinformatics 29, 2647–2648 DOI: 10.1093/bioinformatics/btt451
      30.
        Frickey, T. and Lupas, A. (2004) CLANS: a Java application for visualizing protein families based on pairwise similarity Bioinformatics 20, 3702–3704 DOI: 10.1093/bioinformatics/bth444
      31.
        Gerlt, J. A., Bouvier, J. T., Davidson, D. B., Imker, H. J., Sadkhin, B., Slater, D. R., and Whalen, K. L. (2015) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST): A web tool for generating protein sequence similarity networks Biochim. Biophys. Acta, Proteins Proteomics 1854, 1019–1037 DOI: 10.1016/j.bbapap.2015.04.015
      32.
        Colin, P. Y., Kintses, B., Gielen, F., Miton, C. M., Fischer, G., Mohamed, M. F., Hyvonen, M., Morgavi, D. P., Janssen, D. B., and Hollfelder, F. (2015) Ultrahigh-throughput discovery of promiscuous enzymes by picodroplet functional metagenomics Nat. Commun. 6, 10008 DOI: 10.1038/ncomms10008
      33.
        Cox, C. L., Doroghazi, J. R., and Mitchell, D. A. (2015) The genomic landscape of ribosomal peptides containing thiazole and oxazole heterocycles BMC Genomics 16, 778 DOI: 10.1186/s12864-015-2008-0
      34.
        Huang, H., Carter, M. S., Vetting, M. W., Al-Obaidi, N., Patskovsky, Y., Almo, S. C., and Gerlt, J. A. (2015) A General Strategy for the Discovery of Metabolic Pathways: d-Threitol, l-Threitol, and Erythritol Utilization in Mycobacterium smegmatis J. Am. Chem. Soc. 137, 14570–14573 DOI: 10.1021/jacs.5b08968
      35.
        Petronikolou, N. and Nair, S. K. (2015) Biochemical Studies of Mycobacterial Fatty Acid Methyltransferase: A Catalyst for the Enzymatic Production of Biodiesel Chem. Biol. 22, 1480–1490 DOI: 10.1016/j.chembiol.2015.09.011
      36.
        Rao, G., O’Dowd, B., Li, J., Wang, K., and Oldfield, E. (2015) IspH-RPS1 and IspH-UbiA: ″Rosetta Stone″ Proteins Chem. Sci. 6, 6813–6822 DOI: 10.1039/C5SC02600H
      37.
        Roche, D. B., Brackenridge, D. A., and McGuffin, L. J. (2015) Proteins and Their Interacting Partners: An Introduction to Protein-Ligand Binding Site Prediction Methods Int. J. Mol. Sci. 16, 29829–29842 DOI: 10.3390/ijms161226202
      38.
        Wichelecki, D. J., Vetting, M. W., Chou, L., Al-Obaidi, N., Bouvier, J. T., Almo, S. C., and Gerlt, J. A. (2015) ATP-binding Cassette (ABC) Transport System Solute-binding Protein-guided Identification of Novel d-Altritol and Galactitol Catabolic Pathways in Agrobacterium tumefaciens C58 J. Biol. Chem. 290, 28963–28976 DOI: 10.1074/jbc.M115.686857
      39.
        Ahmed, F. H., Mohamed, A. E., Carr, P. D., Lee, B. M., Condic-Jurkic, K., O’Mara, M. L., and Jackson, C. J. (2016) Rv2074 is a novel F420 H2-dependent biliverdin reductase in Mycobacterium tuberculosis Protein Sci. 25, 1692–1709 DOI: 10.1002/pro.2975
      40.
        Atkinson, J. T., Campbell, I., Bennett, G. N., and Silberg, J. J. (2016) Cellular Assays for Ferredoxins: A Strategy for Understanding Electron Flow through Protein Carriers That Link Metabolic Pathways Biochemistry 55, 7047–7064 DOI: 10.1021/acs.biochem.6b00831
      41. 41
        Baier, F., Copp, J. N., and Tokuriki, N. (2016) Evolution of Enzyme Superfamilies: Comprehensive Exploration of Sequence-Function Relationships Biochemistry 55, 6375 6388 DOI: 10.1021/acs.biochem.6b00723
      42. 42
        Bhandari, D. M., Fedoseyenko, D., and Begley, T. P. (2016) Tryptophan Lyase (NosL): A Cornucopia of 5′-Deoxyadenosyl Radical Mediated Transformations J. Am. Chem. Soc. 138, 16184 16187 DOI: 10.1021/jacs.6b06139
      43. 43
        Chekan, J. R., Koos, J. D., Zong, C., Maksimov, M. O., Link, A. J., and Nair, S. K. (2016) Structure of the Lasso Peptide Isopeptidase Identifies a Topology for Processing Threaded Substrates J. Am. Chem. Soc. 138, 16452 16458 DOI: 10.1021/jacs.6b10389
      44. 44
        Davey, L., Halperin, S. A., and Lee, S. F. (2016) Thiol-Disulfide Exchange in Gram-Positive Firmicutes Trends Microbiol. 24, 902 915 DOI: 10.1016/j.tim.2016.06.010
      45. 45
        Desai, J., Liu, Y. L., Wei, H., Liu, W., Ko, T. P., Guo, R. T., and Oldfield, E. (2016) Structure, Function, and Inhibition of Staphylococcus aureus Heptaprenyl Diphosphate Synthase ChemMedChem 11, 1915 1923 DOI: 10.1002/cmdc.201600311
      46.
        Ding, W., Li, Q., Jia, Y., Ji, X., Qianzhu, H., and Zhang, Q. (2016) Emerging Diversity of the Cobalamin-Dependent Methyltransferases Involving Radical-Based Mechanisms ChemBioChem 17, 1191–1197 DOI: 10.1002/cbic.201600107
      47.
        Gerlt, J. A. (2016) Tools and strategies for discovering novel enzymes and metabolic pathways Perspectives in Science 9, 24–32 DOI: 10.1016/j.pisc.2016.07.001
      48.
        Ghodge, S. V., Biernat, K. A., Bassett, S. J., Redinbo, M. R., and Bowers, A. A. (2016) Post-translational Claisen Condensation and Decarboxylation en Route to the Bicyclic Core of Pantocin A J. Am. Chem. Soc. 138, 5487–5490 DOI: 10.1021/jacs.5b13529
      49.
        Hao, Y., Pierce, E., Roe, D., Morita, M., McIntosh, J. A., Agarwal, V., Cheatham, T. E., 3rd, Schmidt, E. W., and Nair, S. K. (2016) Molecular basis for the broad substrate selectivity of a peptide prenyltransferase Proc. Natl. Acad. Sci. U. S. A. 113, 14037–14042 DOI: 10.1073/pnas.1609869113
      50.
        Ji, X., Li, Y., Xie, L., Lu, H., Ding, W., and Zhang, Q. (2016) Expanding Radical SAM Chemistry by Using Radical Addition Reactions and SAM Analogues Angew. Chem., Int. Ed. 55, 11845–11848 DOI: 10.1002/anie.201605917
      51.
        Ji, X., Liu, W. Q., Yuan, S., Yin, Y., Ding, W., and Zhang, Q. (2016) Mechanistic study of the radical SAM-dependent amine dehydrogenation reactions Chem. Commun. 52, 10555–10558 DOI: 10.1039/C6CC05661J
      52.
        Kumar, G., Johnson, J. L., and Frantom, P. A. (2016) Improving Functional Annotation in the DRE-TIM Metallolyase Superfamily through Identification of Active Site Fingerprints Biochemistry 55, 1863–1872 DOI: 10.1021/acs.biochem.5b01193
      53.
        Li, D., Moorman, R., Vanhercke, T., Petrie, J., Singh, S., and Jackson, C. J. (2016) Classification and substrate head-group specificity of membrane fatty acid desaturases Comput. Struct. Biotechnol. J. 14, 341–349 DOI: 10.1016/j.csbj.2016.08.003
      54.
        Molloy, E. M., Tietz, J. I., Blair, P. M., and Mitchell, D. A. (2016) Biological characterization of the hygrobafilomycin antibiotic JBIR-100 and bioinformatic insights into the hygrolide family of natural products Bioorg. Med. Chem. 24, 6276–6290 DOI: 10.1016/j.bmc.2016.05.021
      55.
        Plach, M. G., Reisinger, B., Sterner, R., and Merkl, R. (2016) Long-Term Persistence of Bi-functionality Contributes to the Robustness of Microbial Life through Exaptation PLoS Genet. 12, e1005836 DOI: 10.1371/journal.pgen.1005836
      56.
        Rao, G. and Oldfield, E. (2016) Structure and Function of Four Classes of the 4Fe-4S Protein, IspH Biochemistry 55, 4119–4129 DOI: 10.1021/acs.biochem.6b00474
      57.
        Thotsaporn, K., Tinikul, R., Maenpuen, S., Phonbuppha, J., Watthaisong, P., Chenprakhon, P., and Chaiyen, P. (2016) Enzymes in the p-hydroxyphenylacetate degradation pathway of Acinetobacter baumannii J. Mol. Catal. B: Enzym. 134, 353–366 DOI: 10.1016/j.molcatb.2016.09.003
      58.
        Zallot, R., Harrison, K. J., Kolaczkowski, B., and de Crecy-Lagard, V. (2016) Functional Annotations of Paralogs: A Blessing and a Curse Life 6, 39 DOI: 10.3390/life6030039
      59.
        Zhang, X., Carter, M. S., Vetting, M. W., San Francisco, B., Zhao, S., Al-Obaidi, N. F., Solbiati, J. O., Thiaville, J. J., de Crecy-Lagard, V., Jacobson, M. P., Almo, S. C., and Gerlt, J. A. (2016) Assignment of function to a domain of unknown function: DUF1537 is a new kinase family in catabolic pathways for acid sugars Proc. Natl. Acad. Sci. U. S. A. 113, E4161–E4169 DOI: 10.1073/pnas.1605546113
      60.
        Ahmed, M. N., Reyna-Gonzalez, E., Schmid, B., Wiebach, V., Sussmuth, R. D., Dittmann, E., and Fewer, D. P. (2017) Phylogenomic Analysis of the Microviridin Biosynthetic Pathway Coupled with Targeted Chemo-Enzymatic Synthesis Yields Potent Protease Inhibitors ACS Chem. Biol. 12, 1538 DOI: 10.1021/acschembio.7b00124
      61.
        Bearne, S. L. (2017) The interdigitating loop of the enolase superfamily as a specificity binding determinant or ‘flying buttress’ Biochim. Biophys. Acta, Proteins Proteomics 1865, 619–630 DOI: 10.1016/j.bbapap.2017.02.006
      62.
        Benjdia, A., Guillot, A., Ruffie, P., Leprince, J., and Berteau, O. (2017) Post-translational modification of ribosomally synthesized peptides by a radical SAM epimerase in Bacillus subtilis Nat. Chem. 9, 698–707 DOI: 10.1038/nchem.2714
      63.
        Erb, T. J., Jones, P. R., and Bar-Even, A. (2017) Synthetic metabolism: metabolic engineering meets enzyme design Curr. Opin. Chem. Biol. 37, 56–62 DOI: 10.1016/j.cbpa.2016.12.023
      64.
        Estrada, P., Manandhar, M., Dong, S. H., Deveryshetty, J., Agarwal, V., Cronan, J. E., and Nair, S. K. (2017) The pimeloyl-CoA synthetase BioW defines a new fold for adenylate-forming enzymes Nat. Chem. Biol. 13, 668–674 DOI: 10.1038/nchembio.2359
      65.
        Giessen, T. W. and Silver, P. A. (2017) Widespread distribution of encapsulin nanocompartments reveals functional diversity Nat. Microbiol. 2, 17029 DOI: 10.1038/nmicrobiol.2017.29
      66.
        Glasner, M. E. (2017) Finding enzymes in the gut metagenome Science 355, 577–578 DOI: 10.1126/science.aam7446
      67.
        Hetrick, K. J. and van der Donk, W. A. (2017) Ribosomally synthesized and post-translationally modified peptide natural product discovery in the genomic era Curr. Opin. Chem. Biol. 38, 36–44 DOI: 10.1016/j.cbpa.2017.02.005
      68.
        Holliday, G. L., Brown, S. D., Akiva, E., Mischel, D., Hicks, M. A., Morris, J. H., Huang, C. C., Meng, E. C., Pegg, S. C., Ferrin, T. E., and Babbitt, P. C. (2017) Biocuration in the structure-function linkage database: the anatomy of a superfamily Database DOI: 10.1093/database/bax045
      69.
        Jia, B., Jia, X., Hyun Kim, K., Ji Pu, Z., Kang, M. S., and Ok Jeon, C. (2017) Evolutionary, computational, and biochemical studies of the salicylaldehyde dehydrogenases in the naphthalene degradation pathway Sci. Rep. 7, 43489 DOI: 10.1038/srep43489
      70.
        Jia, B., Jia, X., Kim, K. H., and Jeon, C. O. (2017) Integrative view of 2-oxoglutarate/Fe(II)-dependent oxygenase diversity and functions in bacteria Biochim. Biophys. Acta, Gen. Subj. 1861, 323–334 DOI: 10.1016/j.bbagen.2016.12.001
      71.
        Kandlinger, F., Plach, M. G., and Merkl, R. (2017) AGeNNT: annotation of enzyme families by means of refined neighborhood networks BMC Bioinf. 18, 274 DOI: 10.1186/s12859-017-1689-6
      72.
        Levin, B. J., Huang, Y. Y., Peck, S. C., Wei, Y., Martinez-Del Campo, A., Marks, J. A., Franzosa, E. A., Huttenhower, C., and Balskus, E. P. (2017) A prominent glycyl radical enzyme in human gut microbiomes metabolizes trans-4-hydroxy-l-proline Science 355, eaai8386 DOI: 10.1126/science.aai8386
      73.
        Ney, B., Ahmed, F. H., Carere, C. R., Biswas, A., Warden, A. C., Morales, S. E., Pandey, G., Watt, S. J., Oakeshott, J. G., Taylor, M. C., Stott, M. B., Jackson, C. J., and Greening, C. (2017) The methanogenic redox cofactor F420 is widely synthesized by aerobic soil bacteria ISME J. 11, 125–137 DOI: 10.1038/ismej.2016.100
      74.
        Ortega, M. A., Cogan, D. P., Mukherjee, S., Garg, N., Li, B., Thibodeaux, G. N., Maffioli, S. I., Donadio, S., Sosio, M., Escano, J., Smith, L., Nair, S. K., and van der Donk, W. A. (2017) Two Flavoenzymes Catalyze the Post-Translational Generation of 5-Chlorotryptophan and 2-Aminovinyl-Cysteine during NAI-107 Biosynthesis ACS Chem. Biol. 12, 548–557 DOI: 10.1021/acschembio.6b01031
      75.
        Pimviriyakul, P., Thotsaporn, K., Sucharitakul, J., and Chaiyen, P. (2017) Kinetic Mechanism of the Dechlorinating Flavin-dependent Monooxygenase HadA J. Biol. Chem. 292, 4818–4832 DOI: 10.1074/jbc.M116.774448
      76.
        Repka, L. M., Chekan, J. R., Nair, S. K., and van der Donk, W. A. (2017) Mechanistic Understanding of Lanthipeptide Biosynthetic Enzymes Chem. Rev. 117, 5457–5520 DOI: 10.1021/acs.chemrev.6b00591
      77.
        Schwalen, C. J., Feng, X., Liu, W., O’Dowd, B., Ko, T. P., Shin, C. J., Guo, R. T., Mitchell, D. A., and Oldfield, E. (2017) Head-to-Head Prenyl Synthases in Pathogenic Bacteria ChemBioChem 18, 985–991 DOI: 10.1002/cbic.201700099
      78.
        Zallot, R., Yuan, Y., and de Crecy-Lagard, V. (2017) The Escherichia coli COG1738 Member YhhQ Is Involved in 7-Cyanodeazaguanine (preQ(0)) Transport Biomolecules 7, 12 DOI: 10.3390/biom7010012
      79.
        Xiang, D. F., Kolb, P., Fedorov, A. A., Xu, C., Fedorov, E. V., Narindoshvili, T., Williams, H. J., Shoichet, B. K., Almo, S. C., and Raushel, F. M. (2012) Structure-based function discovery of an enzyme for the hydrolysis of phosphorylated sugar lactones Biochemistry 51, 1762–1773 DOI: 10.1021/bi201838b
      80.
        Fan, H., Hitchcock, D. S., Seidel, R. D., 2nd, Hillerich, B., Lin, H., Almo, S. C., Sali, A., Shoichet, B. K., and Raushel, F. M. (2013) Assignment of pterin deaminase activity to an enzyme of unknown function guided by homology modeling and docking J. Am. Chem. Soc. 135, 795–803 DOI: 10.1021/ja309680b
      81.
        Goble, A. M., Toro, R., Li, X., Ornelas, A., Fan, H., Eswaramoorthy, S., Patskovsky, Y., Hillerich, B., Seidel, R., Sali, A., Shoichet, B. K., Almo, S. C., Swaminathan, S., Tanner, M. E., and Raushel, F. M. (2013) Deamination of 6-aminodeoxyfutalosine in menaquinone biosynthesis by distantly related enzymes Biochemistry 52, 6525–6536 DOI: 10.1021/bi400750a
      82.
        Hitchcock, D. S., Fan, H., Kim, J., Vetting, M., Hillerich, B., Seidel, R. D., Almo, S. C., Shoichet, B. K., Sali, A., and Raushel, F. M. (2013) Structure-guided discovery of new deaminase enzymes J. Am. Chem. Soc. 135, 13927–13933 DOI: 10.1021/ja4066078
      83.
        Ornelas, A., Korczynska, M., Ragumani, S., Kumaran, D., Narindoshvili, T., Shoichet, B. K.