Abstract
Asymmetric catalysis is an important research area in organic chemistry that has contributed greatly to the development of chemistry and related fields, and chiral ligands/catalysts are central to it. However, traditional experimental methods still have limitations, and machine learning (ML)-based computational methods are hampered by the lack of sufficient and accurate data on chiral ligands/catalysts. To address this challenge, we developed the Chiral Ligand and Catalyst Database (CLC-DB). To the best of our knowledge, CLC-DB is the first open-source and the largest specialized database for chiral ligands/catalysts. It contains 1861 molecules that cover several basic chiral types and belong to 32 different chiral ligand/catalyst types. Each data record comprises 19 items of information, including the 2D and 3D chemical structures, ligand/catalyst category, chiral type, chemical and physical properties, and an artificial intelligence (AI)-generated description. Each molecular record is linked to authoritative chemical databases and has been validated by chemical experts. CLC-DB is user-friendly, supporting two quick search methods as well as batch search. In addition, CLC-DB provides an efficient online molecular clustering tool for ML-based computational analyses. CLC-DB is accessible at https://compbio.sjtu.edu.cn/services/clc-db, and all data can be downloaded free of charge.