Abstract
Efficient prediction of the partition coefficient ($\log P$) between polar and non-polar phases could shorten the cycle of drug and materials design. In this work, a descriptor, named $\langle q-ACSFs \rangle_{conf}$, is proposed to account for both the explicit polarization effects in the polar phase and the energetically and entropically significant conformation ensemble in the non-polar phase. Polarization effects are included by embedding partial charges, derived directly from force fields or quantum chemistry calculations, into the atom-centered symmetry functions (ACSFs). Entropy effects are included by averaging over different conformations, taken from a similarity matrix, according to the Boltzmann distribution. The model was trained with high-dimensional neural networks (HDNNs) on the public PhysProp dataset ($41039$ samples). Satisfactory $\log P$ prediction performance was achieved on three other datasets, namely Martel ($707$ molecules), Star \& Non-Star ($266$) and Huuskonen ($1870$). The present $\langle q-ACSFs \rangle_{conf}$ model was also applicable to $n$-carboxylic acids with $2$ to $14$ carbon atoms and to $54$ organic solvents. The method can readily be applied to systems of arbitrary size and gives a transferable atom-based partition coefficient.
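The two ingredients of the descriptor, charge embedding and Boltzmann averaging over conformations, can be illustrated with a minimal sketch. The functional form below is an assumption made for illustration and not necessarily the one used in this work: it takes a standard radial symmetry function with a cosine cutoff, scales each neighbour term by that neighbour's partial charge, and keeps the charges fixed across conformers. The function names, the parameters `eta`, `r_s` and `r_c`, and the use of kcal/mol are all hypothetical choices, and the selection of conformers from a similarity matrix is not modelled.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def cosine_cutoff(r, r_c):
    """Behler-type cosine cutoff: decays smoothly to zero at r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def charge_weighted_radial_acsf(coords, charges, i, eta=0.5, r_s=0.0, r_c=6.0):
    """Radial symmetry function of atom i, with each neighbour term scaled
    by the neighbour's partial charge (assumed weighting, illustration only)."""
    total = 0.0
    for j, (xyz, q) in enumerate(zip(coords, charges)):
        if j == i:
            continue  # an atom is not its own neighbour
        r = math.dist(coords[i], xyz)
        total += q * math.exp(-eta * (r - r_s) ** 2) * cosine_cutoff(r, r_c)
    return total


def boltzmann_weights(energies, temperature=298.15):
    """Normalised Boltzmann weights for conformer energies in kcal/mol."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    factors = [math.exp(-(e - e_min) / (K_B * temperature)) for e in energies]
    z = sum(factors)
    return [f / z for f in factors]


def conformer_averaged_descriptor(conformers, charges, energies, i, **acsf_kwargs):
    """Boltzmann-weighted average of the charge-weighted ACSF of atom i
    over a list of conformers (each a list of xyz tuples)."""
    weights = boltzmann_weights(energies)
    values = [
        charge_weighted_radial_acsf(coords, charges, i, **acsf_kwargs)
        for coords in conformers
    ]
    return sum(w * v for w, v in zip(weights, values))


if __name__ == "__main__":
    # Toy diatomic with two conformers that differ only in bond length.
    conformers = [
        [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (1.2, 0.0, 0.0)],
    ]
    charges = [0.4, -0.4]
    energies = [0.0, 0.5]  # kcal/mol, made-up values
    print(conformer_averaged_descriptor(conformers, charges, energies, i=0))
```

In a full model one such value would be computed per atom and per parameter set, and the resulting vectors would serve as inputs to the element-specific networks of an HDNN.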
Supplementary materials
Title
Molecular Partition Coefficient from Machine Learning with Polarization and Entropy Embedded Atom-Centered Symmetry Functions
Description
Additional details on the collected datasets, generation of descriptors, computational methods for molecular dynamics (MD) simulations and quantum mechanics (QM) calculations, hyper-parameter optimization of the high-dimensional neural networks, and contributions from distinct elements in different environments