These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Synthetically Accessible Virtual Inventory (SAVI)

submitted on 24.04.2020, 01:21 and posted on 27.04.2020, 04:58 by Hitesh Patel, Wolf Ihlenfeldt, Philip Judson, Yurii S. Moroz, Yuri Pevzner, Megan Peach, Nadya Tarasova, Marc Nicklaus
We have made available a database of over 1 billion compounds predicted to be easily synthesizable. The compounds were generated by a set of transforms based on an adaptation and extension of the CHMTRN/PATRAN programming languages, which describe expert knowledge of chemical synthesis and originally stem from the LHASA project. The chemoinformatics toolkit CACTVS was used to apply a total of 53 transforms to about 150,000 readily available building blocks. Only single-step, two-reactant syntheses were calculated for this database, even though the technology can execute multi-step reactions. The possibility of incorporating scoring systems in CHMTRN allowed us to subdivide the database of 1.75 billion compounds into sets according to their predicted synthesizability, with the most-synthesizable class comprising 1.09 billion synthetic products. Properties calculated for all SAVI products show that the database should be well suited for drug discovery. It is being made publicly available for free download.
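The enumeration described in the abstract (single-step, two-reactant transforms applied to a pool of building blocks, followed by score-based binning) can be illustrated with a toy sketch in plain Python. This is not the CHMTRN/PATRAN or CACTVS implementation. The building-block identifiers, functional-group tags, transform names, per-transform scores, and the threshold of 80 are all hypothetical and chosen only to show the combinatorial pattern.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Transform:
    """Hypothetical two-reactant transform keyed on functional-group tags."""
    name: str
    role_a: str      # tag required of the first reactant
    role_b: str      # tag required of the second reactant
    base_score: int  # assumed synthesizability score, not a CHMTRN value


# Hypothetical building blocks: identifier -> functional-group tags.
BUILDING_BLOCKS = {
    "bb1": {"amine"},
    "bb2": {"carboxylic_acid"},
    "bb3": {"amine", "aryl_halide"},
    "bb4": {"boronic_acid"},
}

# Hypothetical transforms (the real set contains 53).
TRANSFORMS = [
    Transform("amide_coupling", "amine", "carboxylic_acid", 90),
    Transform("suzuki_coupling", "aryl_halide", "boronic_acid", 70),
]


def enumerate_products(blocks, transforms):
    """Yield (transform, reactant A, reactant B, score) for each matching pair."""
    for t in transforms:
        a_side = [b for b, tags in blocks.items() if t.role_a in tags]
        b_side = [b for b, tags in blocks.items() if t.role_b in tags]
        for a, b in product(a_side, b_side):
            if a != b:  # a building block does not react with itself here
                yield (t.name, a, b, t.base_score)


def bin_by_score(products, threshold=80):
    """Split products into two classes by the assumed score threshold."""
    bins = {"high": [], "lower": []}
    for p in products:
        bins["high" if p[3] >= threshold else "lower"].append(p)
    return bins


bins = bin_by_score(enumerate_products(BUILDING_BLOCKS, TRANSFORMS))
for label, items in bins.items():
    print(label, len(items))
```

Because every transform pairs all matching reactants on one side with all matching reactants on the other, the product count grows roughly with the product of the two pool sizes, which is how a pool of about 150,000 building blocks can yield on the order of a billion products.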


National Institutes of Health



National Cancer Institute, NIH


United States


Declaration of Conflict of Interest

The authors declare no competing interests.