TCM-Navigator, a Deep Learning-based Workflow for Generation and Evaluation of Traditional Chinese Medicine-like Compounds for Drug Development

Feiying Chen; Victor Jun Yu Lim; Mingyu Li; Hao Fan

doi:10.26434/chemrxiv-2025-xrcfk

Biological and Medicinal Chemistry

06 June 2025, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Traditional Chinese Medicine (TCM) has long been regarded as a valuable resource for modern drug discovery. However, the limited availability of recorded entities and information, the complexity and sparsity of the herb–ingredient–target–disease network, and inconsistencies in data representation hinder the effectiveness of high-throughput screening approaches. While some therapeutically valuable compounds from TCM have been discovered through manual experimental screening, such methods are time-consuming and labor-intensive. To address these challenges, we developed TCM-Navigator, a data-driven, deep learning-based workflow that enables the in silico generation, quality control, and physics-based evaluation of TCM-like molecules. Generation is handled by TCM-Generator, a chemical language model based on transfer learning and a long short-term memory (LSTM) network, which produces standardized, hierarchically structured, and high-throughput-friendly datasets of TCM-like molecules. In this study, we generated a target-nonspecific dataset comprising 3.7 million TCM-like molecules, expanding the number of entities in existing TCM datasets by more than 100-fold. The workflow also enables flexible, goal-driven molecule generation customized for specific targets, yielding three target-specific datasets and multiple high-potential target-ligand pairs. Quality control is handled by TCM-Identifier, the first quantitative model specifically designed to capture unique characteristics of TCM, built on the AttentiveFP framework with message passing neural networks (MPNNs). TCM-Identifier is expected to serve as an essential evaluation and guidance tool for TCM-related drug development. Our workflow bridges data science, including deep learning, with biomedical research to tackle longstanding challenges in target identification and molecular design. Its adaptable framework is also transferable to interdisciplinary innovation beyond drug development.
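
As a rough illustration of the input side of a chemical language model such as the one the abstract describes, the sketch below tokenizes molecule strings and maps them to integer ids that a sequence model (for example an LSTM) could consume. It rests on assumptions not stated in the abstract: that molecules are written as SMILES strings, that a simple regular expression is an adequate tokenizer, and that `<pad>`, `<bos>`, and `<eos>` are used as special tokens. The function names are hypothetical and are not taken from TCM-Generator.

```python
import re

# Assumed token pattern (not from the paper): bracket atoms such as [C@@H],
# the two-letter halogens Br and Cl, two-digit ring closures such as %12,
# and otherwise any single character.
SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")

# Hypothetical special tokens for padding and sequence boundaries.
SPECIALS = ["<pad>", "<bos>", "<eos>"]


def tokenize(smiles):
    """Split a SMILES string into model tokens."""
    return SMILES_TOKEN.findall(smiles)


def build_vocab(corpus):
    """Map every token seen in the corpus to an integer id."""
    tokens = sorted({tok for smi in corpus for tok in tokenize(smi)})
    return {tok: idx for idx, tok in enumerate(SPECIALS + tokens)}


def encode(smiles, vocab):
    """Turn a SMILES string into ids, wrapped in <bos> ... <eos>."""
    body = [vocab[tok] for tok in tokenize(smiles)]
    return [vocab["<bos>"]] + body + [vocab["<eos>"]]


def decode(ids, vocab):
    """Turn ids back into a SMILES string, dropping special tokens."""
    inverse = {idx: tok for tok, idx in vocab.items()}
    return "".join(inverse[i] for i in ids if inverse[i] not in SPECIALS)


if __name__ == "__main__":
    corpus = ["CC(=O)Oc1ccccc1C(=O)O", "ClC[C@@H](Br)N"]
    vocab = build_vocab(corpus)
    ids = encode(corpus[1], vocab)
    print(tokenize(corpus[1]))
    print(decode(ids, vocab) == corpus[1])
```

In a transfer-learning setup of the kind the abstract mentions, a vocabulary like this would typically be fixed during pretraining on a large general corpus and reused unchanged during fine-tuning on the smaller TCM set, so that learned token embeddings stay aligned. Whether TCM-Generator does this is not stated here.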

Keywords

Traditional Chinese Medicine

chemical language model

Supplementary materials

Supplementary figures: Supplementary Figures 1 to 5 referenced in the main manuscript.

Supplementary Methods: Supplementary Methods mentioned in the main text, including Datasets, Compound Generation with TCM-Generator, Evaluation of Chemical Space, ADMET and Chemical Properties Analysis, TCM-Identifier, Molecular Docking, and Molecular Dynamics (MD) Simulation.

License

The content is available under CC BY-NC-ND 4.0.

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) have declared ethics committee/IRB approval is not relevant to this content
