Accurate Structure Prediction of Natural Products with NatGen

19 May 2025, Version 1

Abstract

Natural products (NPs) are a vital source of pharmaceutical agents and have contributed to the development of 60% of marketed small-molecule drugs. However, NP-based drug discovery faces a major challenge: the configurational space of NPs expands combinatorially, and their 3D structures are complex, because atomic chirality is dictated by stereospecific biosynthetic enzymes. To date, over 20% of known NPs lack complete chiral configuration annotations, and only 1–2% have fully resolved crystal structures. To address this bottleneck, we present NatGen, a deep learning framework for predicting the chiral configurations and 3D conformations of natural products. NatGen combines structure augmentation with generative modeling and achieves near-perfect accuracy in chiral configuration prediction: 96.87% on a benchmark NP structural dataset and 100% in a prospective study involving 17 recently resolved plant-derived natural products. The average root-mean-square deviation (RMSD) of the predicted 3D structures is below 1 Å, smaller than the radius of a single atom. Using NatGen, we predicted the 3D structures of 684,619 NPs from COCONUT, the largest open NP repository to date, and made the full dataset publicly available at https://www.lilab-ecust.cn/natgen/. We believe this resource significantly expands the structural landscape of natural products and will help researchers cross-validate findings and accelerate progress in diverse fields, including natural product chemistry, enzymatic biosynthesis, physical, organic, and analytical chemistry, phytochemistry, and NP and NP-derived drug discovery.
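
For readers unfamiliar with the RMSD metric quoted in the abstract, the following is a minimal sketch of how a coordinate RMSD between a predicted and a reference structure can be computed. It assumes both structures list the same atoms in the same order and are already superimposed; the abstract does not state the alignment or atom-selection protocol NatGen uses, so this is illustrative only. The function name `rmsd` and the three-atom coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between two 3D coordinate lists.

    Assumes identical atom ordering and pre-aligned structures;
    no optimal superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared per-atom distances between matching atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom example: predicted vs. reference positions in Å.
predicted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
reference = [(0.0, 0.0, 0.1), (1.5, 0.0, -0.1), (1.5, 1.4, 0.0)]
value = rmsd(predicted, reference)
```

In this toy case each atom is displaced by 0.1 Å, so the RMSD is also about 0.1 Å. In practice, structures are usually superimposed first (for example with a Kabsch-style alignment) so that the value reflects shape differences rather than overall position or orientation.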

Keywords

Artificial Intelligence
natural product

Supplementary materials

Title: Supporting Information for Accurate Structure Prediction of Natural Products with NatGen
Description: A PDF document containing detailed experimental data from the NatGen study, along with crystallographic analysis results.
