ChemRxiv

Assigning the Origin of Microbial Natural Products by Chemical Space Map and Machine Learning

preprint
submitted on 01.09.2020 and posted on 02.09.2020 by Alice Capecchi, Jean-Louis Reymond

Microbial natural products (NPs) are an important source of drugs, yet their structural diversity remains poorly understood. Here we used our recently reported MinHashed Atom Pair fingerprint with a diameter of four bonds (MAP4), a fingerprint suitable for molecules of very different sizes, to analyze the Natural Products Atlas (NPAtlas), a database of 25,523 NPs of bacterial or fungal origin downloaded from https://www.npatlas.org/joomla/. To visualize NPAtlas by MAP4 similarity, we used the dimensionality reduction method tree map (TMAP) (http://tmap.gdb.tools). The resulting interactive map (https://tm.gdb.tools/map4/npatlas_map_tmap/) organizes molecules by physico-chemical properties and by compound families such as peptides, glycosides, polyphenols, or terpenoids. Remarkably, the map separates bacterial and fungal NPs from one another, revealing that these two groups of compounds are intrinsically different despite their related biosynthetic pathways. We used these differences to train a machine learning model capable of distinguishing between NPs of bacterial and fungal origin.
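
To make the "MinHashed" part of the fingerprint name concrete, the following is a minimal sketch of the general MinHash idea: a set of substructure strings ("shingles") is compressed into a fixed-length signature, and the fraction of matching signature positions estimates the Jaccard similarity of the underlying sets. It is not the authors' MAP4 implementation. The function names, the use of seeded BLAKE2b hashes, and the placeholder shingle strings are all assumptions made for illustration; real MAP4 shingles are derived from atom pairs of the molecule.

```python
import hashlib


def minhash_signature(shingles, num_perm=128):
    """Return a MinHash signature for a non-empty set of strings.

    For each of `num_perm` seeded hash functions, keep the smallest hash
    value observed over all shingles. (Illustrative helper, not MAP4 code.)
    """
    if not shingles:
        raise ValueError("shingles must be a non-empty collection")
    signature = []
    for seed in range(num_perm):
        # A different salt per seed gives a family of independent-looking hashes.
        salt = seed.to_bytes(8, "little")
        smallest = min(
            int.from_bytes(
                hashlib.blake2b(
                    s.encode("utf-8"), digest_size=8, salt=salt
                ).digest(),
                "little",
            )
            for s in shingles
        )
        signature.append(smallest)
    return signature


def estimated_jaccard(sig_a, sig_b):
    """Estimate Jaccard similarity as the fraction of matching positions."""
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


if __name__ == "__main__":
    # Placeholder shingles, made up for this example only.
    mol_a = {"C|1|C", "C|2|O", "C|3|N", "O|1|C"}
    mol_b = {"C|1|C", "C|2|O", "C|4|S", "O|1|C"}
    sig_a = minhash_signature(mol_a)
    sig_b = minhash_signature(mol_b)
    print("estimated similarity:", estimated_jaccard(sig_a, sig_b))
```

Because every signature has the same length regardless of how many shingles a molecule produces, signatures of this kind can be compared across molecules of very different sizes, which is the property the abstract highlights for MAP4.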


Email Address of Submitting Author

jean-louis.reymond@dcb.unibe.ch

Institution

University of Bern

Country

Switzerland

ORCID For Submitting Author

0000-0003-2724-2942

Declaration of Conflict of Interest

no conflict of interest

Version Notes

version number 1
