Critical Assessment of pH-Dependent Lipophilic Profiles of Small Molecules for Drug Design: Which One Should We Use and In Which Cases?

17 July 2023, Version 3
This content is a preprint and has not undergone peer review at the time of posting.


Lipophilicity is a physicochemical property with wide relevance in drug design and is applied in areas such as food chemistry, environmental chemistry, and computational biology. This descriptor strongly influences the absorption, distribution, permeability, bioaccumulation, protein binding, and biological activity of bioorganic compounds. Lipophilicity is commonly expressed as the n-octanol/water partition coefficient (PN) for neutral molecules, whereas for molecules with ionizable groups, the distribution coefficient (D) at a given pH is used. The logDpH is usually predicted by applying a pH correction to the logPN using the pKa of the ionizable molecule, while the apparent ionic partition (PIapp) is often ignored because of the challenge of predicting the partitioning of the charged species and/or related species (e.g., ion pairs, counterions, molecular aggregates). In this work, we studied the impact of PIapp on the prediction of both the experimental lipophilicity of small molecules and experimental lipophilicity-based applications and metrics such as lipophilic efficiency (LipE), distribution of spiked drugs in milk products, and pH-dependent partition of water contaminants in synthetic passive samples such as silicones. Our findings show that better predictions are obtained by considering the apparent ionic partition, whereas ignoring its contribution can lead to inadequate experimental simplifications and/or computational predictions. In this context, we developed machine learning algorithms to determine the cases in which PIapp should be considered. The results indicate that small, rigid, and unsaturated molecules with logPN close to zero, which present a significant proportion of ionic species in the aqueous phase, were better modeled using the apparent ionic partition (PIapp).
In addition, we validated our findings using a test set and two external sets, which included small molecules and amino acid analogs. For each molecule, logistic regression, random forest classification, and support vector machine models predicted the better formalism for determining the logDpH, with high accuracy, sensitivity, and specificity. Finally, our findings can guide researchers in early-stage drug design, food chemistry, and environmental chemistry who work with ionizable molecules in determining a priori which pH-dependent lipophilicity profile to use in their research and applications, depending on the structure of a substance.
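
The two pH-dependent formalisms contrasted in the abstract can be illustrated with a minimal Python sketch. It assumes a monoprotic acid or base and uses the standard textbook expressions for logD. These are not necessarily the exact equations used in the preprint, and the function names and the example values (logPN = 0.5, logPIapp = -2.0, pKa = 4.0) are hypothetical.

```python
import math


def log_d_neutral_only(log_p_n, pka, ph, acid=True):
    """logD at a given pH, assuming only the neutral species partitions.

    Assumption: monoprotic compound; the ionic partition is ignored.
    """
    # Exponent of the ionized/neutral ratio in the aqueous phase.
    delta = (ph - pka) if acid else (pka - ph)
    return log_p_n - math.log10(1.0 + 10.0 ** delta)


def log_d_with_ionic(log_p_n, log_p_i_app, pka, ph, acid=True):
    """logD at a given pH, including an apparent ionic partition term.

    Assumption: monoprotic compound; log_p_i_app is the log of the
    apparent partition coefficient of the ionized species.
    """
    delta = (ph - pka) if acid else (pka - ph)
    ratio = 10.0 ** delta  # [ionized] / [neutral] in water
    d = (10.0 ** log_p_n + 10.0 ** log_p_i_app * ratio) / (1.0 + ratio)
    return math.log10(d)


if __name__ == "__main__":
    # Hypothetical acid with logPN close to zero, evaluated at pH 7.4.
    without_ionic = log_d_neutral_only(0.5, 4.0, 7.4)
    with_ionic = log_d_with_ionic(0.5, -2.0, 4.0, 7.4)
    print(f"logD (neutral species only): {without_ionic:.2f}")
    print(f"logD (with ionic partition): {with_ionic:.2f}")
```

In this sketch the two expressions agree when the compound is mostly neutral and diverge as the ionized fraction grows, where the second one levels off near logPIapp instead of decreasing indefinitely. This is consistent with the abstract's observation that the ionic term matters most for molecules with low logPN and a significant proportion of ionic species in the aqueous phase.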


Partition coefficient
Lipophilic profiles
Machine learning
Drug Design

Supplementary materials

Supporting Information
Tables and Figures

Comment number 1, William J. Zamora Ramírez: Oct 30, 2023, 16:12

Now is published on