On-line OCHEM Multi Task Model for Solubility and Lipophilicity Prediction of Platinum Complexes

31 December 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Predicting the solubility of platinum(II, IV) complexes is essential for prioritizing potential anticancer candidates in drug discovery. This study presents the first publicly available on-line model for predicting the solubility of platinum complexes, addressing the scarcity of literature and the absence of models for this task. Using a time-split dataset, the consensus model that we developed had a Root Mean Squared Error (RMSE) of 0.62 in 5-fold cross-validation on a training set of 284 historical compounds (solubility data reported prior to 2017). However, the RMSE increased to 0.86 when the model was applied to a prospective test set of 108 compounds measured after 2017. Further analysis of the high prediction errors revealed that these inaccuracies are primarily attributable to the underrepresentation of novel chemical scaffolds, mainly Pt(IV) derivatives, in the training set. For instance, a series of eight phenanthroline-containing compounds, not covered by the training set's chemical space, had an RMSE of 1.3. When the model was redeveloped using a combined dataset, the RMSE of this series decreased substantially to 0.34 under the same validation protocol. Additionally, we developed an interpretable linear model to identify structural features and functional groups that influence the solubility of platinum complexes. We further validated the correlation between solubility and lipophilicity, consistent with the Yalkowsky General Solubility Equation. Building on these insights, we developed a final multitask model that simultaneously predicts solubility and lipophilicity as two endpoints, with RMSE = 0.62 and 0.44, respectively. The data and the final model are available at https://ochem.eu/article/31.
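The time-split evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' OCHEM workflow: the record layout (`year`, `measured`, `predicted`), the helper names `time_split` and `rmse`, the toy values, and the choice to place records from the cutoff year itself in the test set are all assumptions made for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def time_split(records, cutoff_year):
    """Split records into historical (before cutoff) and prospective sets.

    Assumption: records dated in the cutoff year go to the prospective set.
    """
    historical = [r for r in records if r["year"] < cutoff_year]
    prospective = [r for r in records if r["year"] >= cutoff_year]
    return historical, prospective


# Hypothetical toy records; the values are illustrative, not from the study.
records = [
    {"year": 2010, "measured": -2.1, "predicted": -2.4},
    {"year": 2014, "measured": -3.0, "predicted": -2.7},
    {"year": 2016, "measured": -1.5, "predicted": -1.9},
    {"year": 2019, "measured": -4.2, "predicted": -3.1},
    {"year": 2021, "measured": -3.8, "predicted": -2.9},
]

historical, prospective = time_split(records, cutoff_year=2017)

for name, subset in (("historical", historical), ("prospective", prospective)):
    error = rmse(
        [r["measured"] for r in subset],
        [r["predicted"] for r in subset],
    )
    print(f"{name}: n={len(subset)}, RMSE={error:.2f}")
```

Reporting the error separately for the historical and prospective subsets, and again for an individual chemical series, is what makes a gap such as the one seen for the phenanthroline-containing compounds visible.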

Keywords

Platinum Pt(II)/Pt(IV) complexes
Water solubility
Lipophilicity
Consensus model
Neural networks
Representation learning
