Abstract
Solubility is a key property in organic chemistry and is of particular importance in medicinal chemistry. Estimating solubility with computational methods such as QSPR modeling is attractive because it reduces experimental costs. However, training these QSPR models requires high-quality experimental data. In this study, we compiled a dataset of 54,273 experimental solubility values measured in various organic solvents and water over a temperature range of 243.15 to 403.15 K. The dataset can serve as a reference for individual solubility values or as training data for solubility QSPR models. We conducted a statistical analysis of the data and identified prevalent patterns. Furthermore, we developed an interactive tool based on parametric t-SNE to explore the chemical space of solutes. Using this tool, we characterized common scaffolds in the dataset and demonstrated that the chemical space of solutes is extensive and diverse.
Supplementary weblinks
Title
The visualisation of BigSolDB chemical space
Description
This is an interactive web-based tool for visualizing the chemical space of solutes. With it, chemists can explore the dataset and view the solubility charts for each compound individually.