Deciphering Human Microbiota–Host Chemical Interactions

Our gut harbors more microbes than any other body site, and accumulating evidence suggests that these organisms have a sizable impact on human health. Though efforts to classify the metabolic activities that define this microbial community have transformed the way we think about health and disease, our knowledge of gut microbially produced small molecules and their effects on host biology remains in its infancy. This Outlook surveys a range of approaches, hurdles, and advances in defining the chemical repertoire of the gut microbiota, drawing on examples with particularly strong links to human health. Progress toward understanding and manipulating this chemical language is being made with diverse chemical and biological expertise and could hold the key for combatting certain human diseases.


INTRODUCTION
In the late 1800s, the first germ-free (GF) guinea pig was delivered by aseptic caesarian section, 1 marking the first time a mammal, normally inoculated with billions of microbes upon birth, emerged without any. Seventy years later, astronauts exiting the Apollo 11 lunar module brought life where no life had ever existed. Neil Armstrong and Buzz Aldrin spent ∼22 h on the moon's surface, 2 and the guinea pig was kept GF for 2 weeks. 3 While the occupancy of the lunar body and sterility of the rodent's were relatively short-lived, the two cases represent examples of major scientific and technological breakthroughs that enabled small inhabitants of a larger system to be added or eliminated from places where life was considered otherwise impossible or impossible to remove. In the case of GF animals, this advance supported a field of study that continues to illuminate the myriad ways in which an animal's microbial inhabitants shape host physiology and behavior. Understanding the roles of the metabolites produced, consumed, and transformed by these microbes is vital to this process and presents an exciting opportunity for chemists.
The collection of microbes we carry in and on us is called the human microbiota. In this Outlook, this microbial community is referred to as the microbiota, and the genes associated with the community are referred to as the microbiome. Although our microbiota constitutes a minor fraction of our total mass (∼2 kg), 4 it accounts for the bulk of our genetic diversity. Two individuals having >99.5% identity across their 46 chromosomes can have microbiomes that are >90% different. 5,6 Moreover, the gene content of our microbiome outnumbers that of our chromosomes 100-fold. 7,8 The vast majority of our microbes inhabit the human gastrointestinal tract, which harbors trillions of bacteria from hundreds to thousands of species. 8−10 Over the past two decades, advances in DNA sequencing have allowed us to identify the organisms and genes present in the human gut microbiota, revealing large interindividual variation in community composition and correlations with health and disease. Colonization of GF model organisms with both complex gut microbiotas and simplified, defined consortia has also suggested that gut microbes can play causal roles in altering host phenotypes. Despite these advances, the molecular mechanisms underlying the human gut microbiota's influences on host biology have largely remained elusive.
Such efforts, which have provided a descriptive understanding of this community, have underscored the importance of elucidating the functional contributions of gut microbes. Indeed, growing evidence supports a key role for microbially derived small molecules in mediating the biological effects of the gut microbiota through interactions with host targets and pathways. In this Outlook, we discuss efforts to decipher the chemical repertoire of the gut microbiota and emphasize the integral role of chemistry in enabling the discovery and characterization of bioactive microbial metabolites. The approaches, challenges, and opportunities we explore apply not just to other human-associated microbial communities but also to microbial and chemical ecology, more generally.

BIOACTIVE SMALL MOLECULES FROM THE HUMAN GUT MICROBIOTA
One of the pioneering untargeted MS-based metabolomic experiments, 11 a comparison of blood from GF versus conventional mice, exposed numerous metabolites that were either present only in conventional mice or found in significantly different abundance between the two groups. A more recent study 12 examining the metabolomes of GF and specific-pathogen-free (SPF) mice on an organ-by-organ basis revealed that the presence of gut microbes resulted in location-specific differences in the number of unique features at every body site. Notably, in this work, ∼20% of metabolites in the brain differed between GF and SPF mice. 12 While not all of these molecules may be derived directly from microbial metabolism, it is nonetheless clear that this community greatly expands the chemical diversity of the human body. 13−15 These and other studies also reveal extensive microbial variability in terms of total gene content and metabolic capabilities across individuals, suggesting potential differences in metabolic activities and downstream biological effects. 16−18 Moreover, a large fraction of the features detected do not correspond to known compounds, indicating a wealth of microbial chemistry still awaiting discovery. Principally, all gut microbiota-derived metabolites are produced in one of three ways: directly from ingested compounds, from host-derived substrates, or de novo from primary metabolites (Figure 1). There is evidence for a strong causal relationship between small molecules produced via each of these scenarios and host health. For example, short-chain fatty acids (SCFAs), which are derived from gut bacterial fermentation of complex dietary polysaccharides, are an important nutrient source for the colonic epithelium and can reach millimolar concentrations in systemic circulation. 19,20 SCFAs have been shown to modulate a variety of essential host processes from energy balance to immune function. 21,22
Among the most well-characterized host-derived metabolites modified by gut microbes are bile acids (BAs). BAs are synthesized from cholesterol via host enzymes in the liver, and their steroid scaffolds are linked to either glycine or taurine. 23 They are transported and stored in the gallbladder before being secreted to the small intestine, recaptured, and recirculated. 24 At every cycle, occurring multiple times per day, an estimated 5% of BAs skip reabsorption and are instead subject to gut microbial metabolism. 25−27 Known microbial BA biotransformations include oxidation and epimerization of hydroxy groups, dehydroxylation, and conjugation and deconjugation of the amino acid. 12,28−30 In addition to acting as a surfactant to aid in the absorption of dietary fats and lipophilic vitamins, 31 BAs also act as ligands in host signaling processes. BA receptors, such as the farnesoid X receptor (FXR) and G protein-coupled BA receptor (GPBAR1 or TGR5), regulate important processes including BA, lipid, and glucose homeostasis as well as immune and barrier functions. 24,32,33 Finally, gut microbes synthesize uniquely microbial compounds, some of which are detected by the host innate immune system. For example, peptidoglycan (PG) is a bacterially produced polymer of peptide-linked glycan chains which serves as the major component of the cell wall of most bacteria. 34 Release of PG fragments called muropeptides occurs during bacterial cell division, and these molecules engage the host immune system via host-encoded pattern recognition receptors nucleotide oligomerization domain 1 and 2 (NOD1 and NOD2). 35,36 When activated by their respective PG-derived ligands, isoglutamyl-meso-diaminopimelic acid and muramyl dipeptide, NOD1 and NOD2 trigger intracellular signaling pathways leading to the production of proinflammatory cytokines. 37−39
These examples confirm the biological relevance of gut microbial molecules to host biology but belie the complexity of microbial chemistry that remains uncharacterized. In addition to discovering new molecules, major gaps in our understanding of gut microbial metabolite production include limited knowledge of which gut microbes synthesize specific small molecules, the genes and enzymes involved, and the biological consequences for the host. Below, we highlight recent efforts that demonstrate how chemical knowledge and approaches can address these challenges.

USING CHEMISTRY TO DECIPHER GUT MICROBIAL METABOLISM
3.A. Understanding Known Metabolic Activities.
A large body of research has already linked many metabolic activities in the body to the gut microbiota; however, our understanding of the biological significance of these transformations remains largely incomplete. One of the major reasons for this gap stems from uncertainty regarding which enzymes and organisms are responsible for a given activity and the correspondingly difficult task of accurately identifying these functions within microbiome sequencing data.
Recent efforts to understand gut microbial cholesterol metabolism illustrate the power of combining chemical knowledge with multiomics analyses (metagenomics, metatranscriptomics, metabolomics). Cholesterol is an essential component of mammalian cell membranes; it is produced endogenously in the liver and is also obtained from animal products in the diet. While cholesterol levels are normally maintained as a balance between the two sources, high serum cholesterol levels are associated with the development of cardiovascular disease. Given that cholesterol is found in the diet, it has long been proposed that gut microbial metabolism could influence total levels. Indeed, cholesterol metabolism by this community has been documented for more than a century. 40 The presumed in vivo transformation, an overall reduction leading to a molecule called coprostanol, was reported from human stool at least 90 years ago. 41 The activity was later documented in bacteria isolated from rat cecal contents, 42 baboon intestines, 43 and human feces and has been proposed to lower cholesterol levels by reducing absorption from the gut. 44 However, the unavailability of these original host-associated isolates precluded mechanistic follow-up. As such, the genetic and biochemical basis of coprostanol formation has remained largely unknown.
Kenny, Plichta, et al. recently identified cholesterol-metabolizing human gut bacteria and enzymes (Figure 2). 45 They began by studying coprostanol formation in the only strain available, an uncharacterized, unsequenced bacterium called Eubacterium coprostanoligenes, isolated from a hog sewage lagoon. 46 By detecting intermediates involved in coprostanol formation in E. coprostanoligenes lysates, the authors found that the activity of the initial oxidation of cholesterol to cholestenone required a cofactor (NADP+) and did not require oxygen. 45 This information allowed the investigators to rule out the involvement of oxygen-dependent, cholesterol oxidase-type enzymes and focus their attention on hydroxysteroid dehydrogenases (HSDs). Although HSDs had not previously been shown to act on cholesterol, they perform related reactions on BAs. Using domain predictions, transcriptional profiling, and heterologous expression, Kenny, Plichta, et al. identified a single HSD enzyme, IsmA, in E. coprostanoligenes that catalyzes the first and last steps in coprostanol formation. By examining studies with matched gut metagenomics and metabolomics data, they found that ismA genes were strongly correlated with coprostanol and, using de novo gene assembly and phylogenetic approaches, showed that they are found only in uncultivated gut bacteria. 45 Finally, the authors showed that patients whose microbiota encoded ismA had reduced total serum cholesterol levels. This effect was larger than the effects associated with variations in human genes (HMGCR and PCSK9) that are known predictors of cholesterol levels and the basis of several FDA-approved, cholesterol-lowering medications. These results suggest that understanding the genetic and biochemical origins of microbial metabolism can unlock new diagnostic or therapeutic options for diseases that have conventionally been treated only with host-targeted drugs.
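To make the paired metagenomic−metabolomic comparison above concrete, the following is a minimal sketch of how one might test whether carriage of a gene tracks with a metabolite level across samples. It is not the authors' analysis: the sample records, field names, and values are invented purely for illustration, and a point-biserial (Pearson) correlation is assumed here as a simple stand-in for whatever statistics a real study would use.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient.

    When xs is a 0/1 indicator (gene absent/present), this equals the
    point-biserial correlation between gene carriage and the metabolite level.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy samples (invented values, NOT data from the study):
# gene_detected = 1 if the gene was found in the sample's metagenome,
# metabolite = relative abundance of the metabolite in the paired metabolome.
toy_samples = [
    {"id": "S1", "gene_detected": 1, "metabolite": 8.0},
    {"id": "S2", "gene_detected": 1, "metabolite": 7.0},
    {"id": "S3", "gene_detected": 1, "metabolite": 9.0},
    {"id": "S4", "gene_detected": 0, "metabolite": 1.0},
    {"id": "S5", "gene_detected": 0, "metabolite": 2.0},
    {"id": "S6", "gene_detected": 0, "metabolite": 1.5},
]

carriage = [s["gene_detected"] for s in toy_samples]
levels = [s["metabolite"] for s in toy_samples]
r = pearson(carriage, levels)

if __name__ == "__main__":
    print(f"point-biserial r = {r:.2f}")
```

A positive coefficient on real data would only indicate association; as the text notes, establishing which organisms and enzymes are responsible still requires biochemical validation.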
3.B. Discovering New Metabolic Activities. The expanded genetic potential and metabolic output associated with the gut microbiota predict that many bioactive metabolites remain to be identified. The expertise of chemists will be critical for understanding these new compounds, as illustrated by the discovery and characterization of colibactin.
It has been known for ∼15 years that certain gut isolates, mostly of Escherichia coli, are genotoxic toward human cells (Figure 3A). 47 Transposon-based mutagenesis of these isolates identified a 54-kilobase genomic island, called pks, that encodes a hybrid nonribosomal peptide synthetase-polyketide synthase (NRPS-PKS) assembly line, as the source of this effect. 47 Despite its seemingly transparent genetic basis, the natural product generated by this gene cluster, colibactin, eluded isolation via traditional methods, and its complete structure has only recently been proposed. In particular, years of studying colibactin biosynthesis, enabled by an understanding of the biosynthetic logic of PKS-NRPS assembly lines, 48−51 were critical in providing initial structural information and guiding experimental efforts that ultimately resulted in the prediction of an active colibactin structure containing two reactive aminocyclopropane "warheads". 52,53 The difficulty associated with characterizing colibactin serves as a reminder that critical microbial metabolites may not be readily detected in metabolomics experiments due to chemical instability or lack of accumulation.
Figure 2. A multiomics approach leveraged a cholesterol metabolizing environmental bacterium to understand this metabolic activity in the human gut microbiota. Hog sewage lagoon bacterium E. coprostanoligenes converts cholesterol to coprostanol. Characterization of this activity in cultures revealed the chemical logic of this pathway and led to the discovery of a critical enzyme, IsmA. Homologs of ismA were classified phylogenetically and identified in as-yet-uncultured human gut bacteria. Paired metagenomic−metabolomic samples revealed that ismA+ encoders have less fecal cholesterol and more coprostanol than nonencoders. ismA is inversely correlated with serum cholesterol levels, suggesting that gut bacterial cholesterol metabolism may reduce host cholesterol levels and risk for cardiovascular disease.
Chemical knowledge was also essential in deciphering colibactin's mode of action. Specifically, the presence of a strained, potentially reactive cyclopropane ring in shunt products isolated from mutant strains was a critical clue that the natural product might alkylate DNA (Figure 3B). 54 This inspired efforts to identify colibactin-derived DNA adducts. The structures of these adducts, which are likely derived from a larger interstrand cross-link, revealed that colibactin alkylates at N3 of adenine, and they may serve as biomarkers of exposure to this genotoxin. More recently, the use of organoids, paired with whole-genome sequencing, enabled the identification of mutagenic signatures arising from colibactin. 55 These signatures also appear in cancer genomes. Notably, pks+ E. coli are present in approximately two-thirds of patients with colorectal cancer and contribute to tumorigenesis in mouse models of this disease. 56,57 Collectively, these data may support a causal role for colibactin in cancer, something which has proven challenging to establish for any disease-associated gut microbial metabolic activity. 58 The notion that gut inhabitants encode a natural product that compromises the integrity of the human genome highlights the diversity of microbial metabolic processes and foretells that uncharacterized microbial metabolites may act in ways we do not yet anticipate.

FUTURE OPPORTUNITIES AND CHALLENGES
Microbiome research has undoubtedly broadened our appreciation of the chemistry taking place within us. The above examples, implicating gut microbes in diverse aspects of health, also illustrate how age-old medical observations, guided by new genomic information and chemical approaches, are changing our understanding of the causes of certain human diseases. At the same time, given the vast interindividual differences in our microbiota, its complexity, the fact that many gut microbes have not been cultured, and that many remain genetically intractable, the prospect of finding and characterizing new microbiome-encoded activities is daunting.
Based on knowledge of human biology and recent advances in our understanding of how this community shapes host phenotypes, we highlight potentially fertile settings for characterizing new metabolic interactions. At the same time, this area of human microbiota research could greatly benefit from the development of new approaches and tools and needs to leverage the expertise of chemists.
4.A. Where Should We Look?
4.A.i. Immune System. Our small and large intestines comprise more immune cells than any other tissue in the body. 59 Exactly how the gut microbiota interfaces with the host immune system across this changing landscape and why certain diseases localize to particular intestinal regions are areas of active exploration. 60−62 Recently, James et al. undertook a multifaceted approach to relate the regional makeup of bacteria in the colon to the composition and cell-type-specific activities of the immune cells in their vicinity, first by generating a map of the bacterial communities found in distinct parts of the colon and then by transcriptionally profiling immune cells from these locations using single-cell RNA sequencing. 63 While the study reveals previously unrecognized correlations in microbiota composition and the presence and activity of T cells and B cells along the intestinal tract, it will be interesting in the future to consider bacterial transcriptional activity and to overlay these regional patterns with the presence of host-produced compounds and microbial metabolites that may mediate local host−microbe interactions. As an example in this direction, a recent study showed that secondary BA production by the murine gut microbiota mediates the regionalization of mammalian enteric viruses by signaling to immune-regulating BA receptors that are expressed in distinct regions along the intestinal tract. 64 Another recent study showed that the lifespan in a mouse model of amyotrophic lateral sclerosis (ALS) depends on the presence of immune-stimulating bacteria found in the animal's gut microbiota. 65 While many factors in immune training remain to be resolved, these analyses and others may help prioritize particular microbes to screen for new immunomodulatory metabolites.
Figure 3 (partial caption). The potential reactivity of cyclopropanes found in metabolites isolated from pks mutants suggested that colibactin was a DNA alkylating agent and inspired efforts to identify colibactin-derived DNA adducts. DNA adductomics revealed adenine adducts that were proposed to derive from larger, interstrand cross-links. Injection of organoids with pks+ and pks− E. coli revealed a mutagenic signature specific to colibactin. This signature further suggests that colibactin forms interstrand cross-links to deoxyadenosine residues and is found in human colorectal cancer patients.
4.A.ii. Endocrine System. Among the most compelling evidence for disease causation by the gut microbiota are studies in which microbiota transplantation confers the donor's phenotype (e.g., obesity) on a GF mouse. 66,67 In addition to indicating that some features of the microbiota are transferable, these studies suggest that the microbiota can regulate host homeostasis, a key function of the endocrine system. The synthesis and secretion of hormones in the gut and pancreas are accomplished by enteroendocrine cells. The hormones produced can act locally or at a distance through paracrine and endocrine action, where they regulate a multitude of essential processes, including digestion, satiety, insulin release, and weight gain. On the microbial end, much progress has been made toward understanding how SCFAs affect the regulation and production of known host-produced hormones. 68,69 However, more specialized microbe-derived mediators of host endocrine functions also exist.
For example, an untargeted MS-based comparison of plasma from human subjects with and without type II diabetes identified the microbial metabolite imidazole propionate as being upregulated in the type II diabetic group. 70 Imidazole propionate is produced from the microbial metabolism of histidine via an enzyme, urocanate reductase UrdA, found in bacteria associated with type II diabetics. 71 Moreover, when administered to mice, imidazole propionate induced glucose intolerance and impaired insulin signaling, suggesting a causal link between this metabolite and disease. 70 We propose that identifying endocrine-specific microbial effectors and understanding their mode of action is an important area of further research, along with characterizing how gut microbes metabolize host-derived hormones.
4.A.iii. Nervous System. The gut microbiota is linked to the progression of numerous neurological diseases including multiple sclerosis, Parkinson's disease, Alzheimer's disease, and autism spectrum disorders. 72−75 While it is largely inconclusive whether changes in community composition are consequences of or contributors to disease, evidence for the transmission of information between the gut and the brain, the gut−brain axis, is well established. For example, Bravo, Forsythe, et al. first showed that administration of the probiotic Lactobacillus rhamnosus to mice modulates host behavior and GABA receptor expression levels via the vagus nerve. 76 More recently, Strandwitz, Kim, et al. isolated GABA-metabolizing and GABA-producing gut bacteria. 77 In the brain, GABA levels are inversely associated with depression. Using functional magnetic resonance imaging (fMRI) of women with major depressive disorder, Strandwitz, Kim, et al. showed that activity in a region of the brain affected by depression correlates with the presence of potential GABA-producers (Bacteroides) in patients' stool. 77 While additional work is needed to explore this potential connection, we anticipate that the continued development and use of tools such as noninvasive imaging techniques that can be used on humans, MS-imaging methods, which preserve spatial information from a sample, and fluorescent probes that report on specific activities in live brains will catalyze these endeavors. 77−79
4.A.iv. Infant Gut Microbiota. The human gut is colonized by microbes at birth, and the composition of this community changes over the first three years of life before reaching a state that is thought to be largely stable into adulthood. 80,81 This assembly process, which varies starting with the mode of delivery, 82−85 may be critical for host health later in life, as administering antibiotics during this period is associated with increased risk for disease in adulthood.
However, the mechanisms by which the developing gut microbiota influences infant health are poorly understood. An important factor in this process is likely the source of infant nutrition, with breast- and bottle-feeding being associated with differences in early community composition. 86 We propose that distinct microbial metabolic activities enabled by the molecular components of these different diets may contribute to the developing infant gut microbiota's effects on host health. Along these lines, recent analyses of meconium and fecal samples from newborns have described different coordinated assembly processes in the development of the infant gut microbiota and its chemical environment. 87,88 The viral component of the gut (the virome), for example, develops in a stepwise manner, and its progression differs depending on whether the baby is exclusively formula-fed versus partially or exclusively breastfed. 88 Human milk, but typically not bovine milk or formula, contains high concentrations of human milk oligosaccharides (HMOs), compounds that are indigestible by host enzymes but metabolized by gut bacteria. 89 HMOs, along with other components in human milk, including lipids, glycoproteins, and immune modulating factors, could affect the trajectory of the developing infant microbiota. A better understanding of how our pioneering bacteria metabolize each of these components and, in turn, generate bioactive metabolites is an exciting area of research that necessitates a broad range of chemical expertise.
4.B. Tools and Approaches.
4.B.i. Discovering Metabolites Using Metabolomics. Multiple approaches exist for discovering metabolites. Germane to microbial chemists are activity-guided isolation, functional metagenomics, and untargeted metabolomics. Briefly, whereas activity-guided isolation seeks to identify an active factor present in a mixed sample based on its bioactivity in an assay, functional metagenomics aims to produce, purify, and identify an active product from DNA obtained from a microbiota. Distinct from both approaches, untargeted metabolomics uses global changes in metabolites to identify uncharacterized compounds for isolation and structural characterization. It is estimated that <2% of spectra in a given untargeted metabolomics experiment can be annotated. 90 In the past decade, the development of molecular networking tools has allowed researchers to classify and group multiple features from their metabolomics data (e.g., precursor ion count, LC-MS elution time, MS-MS fragmentation pattern) into a network. 91 While linking a novel spectrum to its corresponding metabolite can still be a laborious process, molecular networking has enabled spectra corresponding to known compounds to be clustered with unknowns, helping to guide structural characterization. For example, previously unknown amino acid-conjugated BAs were recently identified based on the clustering of their spectra with those of known BA-conjugates. 12
4.B.ii. Identifying Responsible Organisms, Genes, and Enzymes. Several obstacles exist in linking newly characterized molecules to producing organisms and biosynthetic genes. Automated gene annotations are frequently vague and often incorrect; genetic tractability and our ability to cultivate microbes are often limited, which hinders in vitro validation; and prolific microbes can be missed, despite their importance, if they are at low abundance.
The latter obstacles are currently being addressed with new experimental approaches to genetically modify gut microbes 92−96 and to identify, isolate, and characterize low-abundance or location-specific microbial members. 97−100 As for annotation, approaches to classify features based on secondary structure can have benefits similar to those of molecular networking strategies. For example, of the nearly 18 000 entries in Pfam v32, ∼4000 (∼22%) represent domains of unknown function (DUFs). While discovering that a protein-of-interest harbors a DUF is not necessarily immediately useful, the value of being "unknown together" is that a successful characterization of any one protein with the domain has the potential to inform all of the affiliated proteins.
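The grouping idea behind molecular networking (Section 4.B.i), and by analogy the "unknown together" logic described for DUFs, can be sketched in a few lines. The example below is a deliberately simplified illustration, not the algorithm used by any published tool: it assumes plain cosine similarity on unit-width m/z bins and an arbitrary 0.7 threshold, and the spectrum names and peak lists are hypothetical. Published molecular networking tools use more elaborate scoring.

```python
from math import sqrt


def bin_spectrum(peaks, bin_width=1.0):
    """Collapse (m/z, intensity) peaks into bins: {bin index: summed intensity}."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra (0.0 if either is empty)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def network_components(spectra, threshold=0.7):
    """Link spectra whose similarity meets the threshold; return connected groups."""
    names = sorted(spectra)
    binned = {name: bin_spectrum(spectra[name]) for name in names}
    neighbors = {name: set() for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if cosine(binned[first], binned[second]) >= threshold:
                neighbors[first].add(second)
                neighbors[second].add(first)
    seen, components = set(), []
    for name in names:
        if name in seen:
            continue
        seen.add(name)
        stack, group = [name], []
        while stack:  # depth-first walk over one connected component
            current = stack.pop()
            group.append(current)
            for other in neighbors[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        components.append(sorted(group))
    return components


# Hypothetical MS/MS peak lists (invented values for illustration only).
toy_spectra = {
    "known_conjugate": [(74.0, 30.0), (337.3, 100.0), (412.3, 50.0)],
    "unknown_1": [(120.1, 20.0), (337.2, 90.0), (412.4, 60.0)],
    "unrelated": [(55.0, 100.0), (91.1, 80.0)],
}

if __name__ == "__main__":
    for group in network_components(toy_spectra):
        print(group)
```

In this toy case the unannotated spectrum shares its two dominant fragments with the annotated one and so lands in the same group, which is the kind of clue that can prioritize an unknown for structural characterization.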
4.B.iii. Elucidating Effects on the Host. With a molecule in hand and the microbial source identified, addition of the bacterial producer or the compound itself to cells or animals would seem an obvious approach to study its activity. However, identifying a representative model is not trivial. For instance, the use of GF mice as a comparative tool is complicated by major differences in their immune systems (e.g., a lack of early immune education 61), anatomy (e.g., GF mice have enlarged ceca 101), and metabolism 102 compared to conventional animals. Furthermore, many human intestinal pathogens, including Vibrio cholerae and Campylobacter jejuni, neither easily colonize mice nor are considered naturally infective toward them, 103 raising related concerns about studies of human commensals. The development of additional tools including organ-on-a-chip platforms and organoids not only provides compelling alternatives but also offers unique opportunities to measure and manipulate the "host"-provided environment in ways that cannot be done in animals. 104
4.B.iv. Target Identification. It is also important to consider that, while animal models can link a given microbial metabolite to organism-level phenotypes, target identification is often required to fully understand the mechanism of action. Diverse approaches using cell-based reporter assays (e.g., G protein-coupled receptor (GPCR) reporter cell lines 15), genetics (e.g., CRISPRi 105), biochemistry (e.g., thermal proteome profiling 106,107), and chemical biology (e.g., chemoproteomics 108) can be applied to achieve this goal.

OUTLOOK AND CONCLUSIONS
The gut microbiota is represented by members from all three domains of life (bacteria, eukarya, archaea) as well as viruses. While most microbiome studies focus on bacteria, compelling research on human-associated fungi (the mycobiome), archaea (the archaeome), and the virome suggests that microbial metabolites are produced by and affect all lifeforms present in ways that remain to be fully appreciated. 88,109,110 Moreover, humans have distinct microbiotas in different body sites (gut, oral, skin, etc.). Understanding the community dynamics and chemical ecology at each one could reveal a set of unifying principles as well as location-specific determinants that underlie each system. While this Outlook provides a glimpse into the exciting research and opportunities in the gut microbiota, we hope it also highlights the critical role that chemists and chemical biologists can play in understanding enzymes and pathways encoded by individual microbes from all body habitats and ascertaining the metabolic potential and physiologic implications of the collective microbiome for human health.