AIMSim: An Accessible Cheminformatics Platform for Similarity Operations on Chemicals Datasets

14 October 2022, Version 5
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Recent advances in deep learning, generative modeling, and statistical learning have renewed interest in traditional cheminformatics tools and methods. Quantifying molecular similarity is essential in molecular generative modeling, exploratory molecular synthesis campaigns, and drug-discovery applications, where it is used to assess how new molecules differ from existing ones. Most existing tools target advanced users and lack general implementations accessible to the larger community. In this work, we introduce Artificial Intelligence Molecular Similarity (AIMSim), an accessible cheminformatics platform for performing similarity operations on collections of molecules, called molecular datasets. AIMSim provides a unified platform for similarity-based tasks on molecular datasets, such as diversity quantification, outlier and novelty analysis, clustering, dimensionality reduction, and inter-molecular comparisons. AIMSim implements all major binary similarity metrics and molecular fingerprints. It is provided as a Python package that supports command-line use and includes a fully functional Graphical User Interface for code-free use with fully interactive plots.
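
To make the idea of a binary similarity metric concrete, the following is a minimal sketch in plain Python. It does not use or represent the AIMSim API. It assumes each molecule is already encoded as a binary fingerprint, stored here as the set of its "on" bit positions, and it uses the Tanimoto (Jaccard) coefficient as one example of such a metric. The fingerprints, the molecule labels, and the function names `tanimoto` and `mean_pairwise_similarity` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


def mean_pairwise_similarity(fingerprints: list) -> float:
    """Average similarity over all unordered pairs (lower means more diverse)."""
    pairs = list(combinations(fingerprints, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints (invented bit positions, not derived from real molecules).
toy_dataset = {
    "mol_a": frozenset({1, 2, 3, 4}),
    "mol_b": frozenset({2, 3, 4, 5}),
    "mol_c": frozenset({7, 8, 9}),
}

sim_ab = tanimoto(toy_dataset["mol_a"], toy_dataset["mol_b"])
diversity_proxy = mean_pairwise_similarity(list(toy_dataset.values()))
print(f"similarity(mol_a, mol_b) = {sim_ab:.2f}")
print(f"mean pairwise similarity = {diversity_proxy:.2f}")
```

In this toy set, `mol_c` shares no bits with the other two fingerprints, so its similarity to each of them is zero. Low similarity to the rest of a dataset is the kind of signal an outlier or novelty analysis would look for.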

Keywords

Cheminformatics
Molecular Fingerprints
Similarity
Data Visualization
Open-Source Software
FOSS

Supplementary materials

Title: Supporting Information for AIMSim: An Accessible Cheminformatics Platform for Similarity Operations on Chemicals Datasets
Description: Tabulated Similarity Measures, Graphical User Interface Walkthrough, Cluster Analysis of Solvents in Use Case, Speedup and Efficiency Tables, Statement of Availability of Source Code
