ChemPlot, a Python library for chemical space visualization

22 October 2021, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Visualizing chemical spaces streamlines the analysis of molecular datasets by reducing the information to human perception level, hence it forms an integral piece of molecular engineering, including chemical library design, high-throughput screening, diversity analysis, and outlier detection. We present here ChemPlot, which enables users to visualize the chemical space of molecular datasets in both static and interactive ways. ChemPlot features structural and tailored similarity methods, together with three different dimensionality reduction methods: PCA, t-SNE, and UMAP. ChemPlot is the first visualization software that tackles the activity/property cliff problem by incorporating tailored similarity. With tailored similarity, the chemical space is constructed in a supervised manner considering target properties. Additionally, we propose a metric, the Distance Property Relationship score, to quantify the property difference of similar (i.e. close) molecules in the visualized chemical space. ChemPlot can be installed via Conda or PyPi (pip). Its web application is freely accessible at https://www.amdlab.nl/chemplot/.

Keywords

chemical space visualization
python library
cheminformatics
dimensionality reduction
molecular similarity

Supplementary weblinks

Comments

Comments are not moderated before they are posted, but they can be removed by the site moderators if they are found to be in contravention of our Commenting Policy [opens in a new tab] - please read this policy before you post. Comments should be used for scholarly discussion of the content in question. You can find more information about how to use the commenting feature here [opens in a new tab] .
This site is protected by reCAPTCHA and the Google Privacy Policy [opens in a new tab] and Terms of Service [opens in a new tab] apply.