Navigating the Chemical Space and Chemical Multiverse of a Unified Latin American Natural Product Database: LANaPDB

24 August 2023, Version 1

Abstract

The number of natural product (NP) databases has increased substantially. Latin America is extraordinarily rich in biodiversity, which enables the identification of novel NPs and has encouraged both the development of NP databases and the implementation of those that are under development. In a collective effort from several Latin American countries, we introduce herein the first version of the Latin American Natural Products Database (LANaPDB), a public compound collection that gathers the chemical information of NPs contained in diverse databases from this geographical region. The current version of LANaPDB unifies the information from six countries and contains 12,959 chemical structures. The structural classification showed that the most abundant compounds are terpenoids (63.2%), phenylpropanoids (18%), and alkaloids (11.8%). The analysis of the distribution of properties of pharmaceutical interest showed that many LANaPDB compounds satisfy some drug-like rules of thumb for physicochemical properties. The concept of the chemical multiverse was employed to generate multiple chemical spaces from two different fingerprints and two dimensionality reduction techniques. Comparing LANaPDB with FDA-approved drugs and with COCONUT, the major open-access repository of NPs, it was concluded that the chemical space covered by LANaPDB completely overlaps with that of COCONUT and, in some regions, with that of FDA-approved drugs. LANaPDB will be updated by adding more compounds from each database and by incorporating databases from other Latin American countries. The database is freely available at https://github.com/alexgoga21/LaNaPDB.
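
To illustrate what checking a compound against a drug-like rule of thumb can look like, the following is a minimal sketch that counts violations of Lipinski's rule of five from precomputed descriptors. It is an assumption that this rule is among those the authors applied (the rules actually used are listed in Table S1); the `Descriptors` container, the function names, and the example values are hypothetical and are not taken from LANaPDB.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed physicochemical descriptors for one compound (hypothetical container)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # number of hydrogen-bond donors
    h_acceptors: int    # number of hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria the compound violates."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def fraction_drug_like(compounds, max_violations=1):
    """Fraction of compounds with at most `max_violations` violations."""
    if not compounds:
        return 0.0
    passing = sum(1 for d in compounds if lipinski_violations(d) <= max_violations)
    return passing / len(compounds)


# Made-up descriptor values for illustration only (not real database entries).
examples = [
    Descriptors(mol_weight=300.0, logp=2.5, h_donors=2, h_acceptors=4),
    Descriptors(mol_weight=820.0, logp=6.3, h_donors=7, h_acceptors=14),
]

for d in examples:
    print(d, "violations:", lipinski_violations(d))
print("fraction drug-like:", fraction_drug_like(examples))
```

In practice the descriptors would be computed from the chemical structures with a cheminformatics toolkit rather than entered by hand; the sketch only shows the thresholding step.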

Keywords

chemical multiverse
chemical space
chemoinformatics
databases
diversity
drug discovery
Latin America
natural products
virtual screening

Supplementary materials

Supplementary Tables
Table S1: Rules of thumb (guides) associated with drug-likeness. Table S2: Analysis metrics of the principal component analysis. Table S3: Websites of the natural product databases of Latin America.
