Identifying and Filling the Chemobiological Gaps of Gut Microbial Metabolites

12 June 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Human gut microbial metabolites are the subject of intense research because of their involvement in multiple biological processes important for health, including immunity, metabolism, nutrition, and nervous system function. These metabolites exert their effects by interacting with host and bacterial proteins, which suggests that “metabolite-mimetic” molecules could be used as drugs and nutraceuticals. In the present work, we retrieve and analyze the full set of published interactions of these compounds with human and microbiome-relevant proteins, and find patterns in their structure, chemical class, target class, and biological origin. In addition, we use virtual screening to expand the set of interactions more than 4-fold, validate the predictions with retrospective analyses, and use bioinformatic tools to prioritize them by biological relevance. In this way, we fill many of the chemobiological gaps observed in the published data. By providing these interactions, we expect to accelerate the full characterization of the chemobiological space of these compounds by supplying many reliable predictions for fast, focused experimental testing.


Metabolite mimetic
gut microbiome
gut metabolome
new drug modalities
drug design
nutraceutic design
drug target

Supplementary materials

Supporting Information
Table S1: set of microbial genera typical of human microbial metagenomics analyses and used in this work.
Table S2: distribution by target class of target sharing between gut microbial metabolites and drugs.
Table S3: set of published and predicted metabolite-target interactions. For each interaction, the following data are provided: HMDB identifier (“hmdb_id”); InChI string; chemical class (“chem_cl”); compound set (“cset”: Metabolites vs Drugs); specific compound set (“comp_set”: Drugs vs GutFL vs GutnoFL vs Gut/Serum); UniProt accession number of the target (“uniport_id”); target name (“tar_name”); target class (“tar_cl”); target biological group (“tar_biolgr”: “b” for bacterial, “h” for human); biological species (“organism”); source of data (“src”); pChEMBL-like affinity data (“pbind”); maximum Tanimoto coefficient for the SEA prediction (“maxTc”); compound name (“comp_name”); aggregated source of data (“src2”); further aggregated source of data (“src3”); high-priority target (“hpr”: empty vs “hum” for high-priority human vs “bac” for high-priority bacterial).
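As an illustration of how the Table S3 columns described above might be used, the following minimal Python sketch selects high-priority interactions through the “hpr” flag and ranks them by “pbind”. It assumes the table has been exported as tab-delimited text with the column names quoted above; the file format, the helper name `high_priority`, and every row in `SAMPLE_TSV` are hypothetical placeholders, not data from the actual table.

```python
import csv
import io

# Synthetic placeholder rows, NOT real entries from Table S3.
# Only a subset of the documented columns is shown, and the
# tab-delimited layout is an assumption about how the table is exported.
SAMPLE_TSV = (
    "hmdb_id\tcomp_name\tuniport_id\ttar_biolgr\tpbind\thpr\n"
    "HMDB_PLACEHOLDER_1\tcompound_a\tACC_PLACEHOLDER_1\th\t6.2\thum\n"
    "HMDB_PLACEHOLDER_2\tcompound_b\tACC_PLACEHOLDER_2\tb\t5.1\tbac\n"
    "HMDB_PLACEHOLDER_3\tcompound_c\tACC_PLACEHOLDER_3\th\t7.4\t\n"
    "HMDB_PLACEHOLDER_4\tcompound_d\tACC_PLACEHOLDER_4\th\t8.0\thum\n"
)


def high_priority(rows, group):
    """Return rows whose 'hpr' flag equals group ('hum' or 'bac'),
    sorted by 'pbind' from highest to lowest."""
    selected = [r for r in rows if r["hpr"] == group]
    return sorted(selected, key=lambda r: float(r["pbind"]), reverse=True)


rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
human_hits = high_priority(rows, "hum")
for r in human_hits:
    print(r["comp_name"], r["uniport_id"], r["pbind"])
```

To work on the real table, `io.StringIO(SAMPLE_TSV)` would be replaced by an open file handle. The same helper called with `"bac"` would return the high-priority bacterial targets.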

