Theoretical and Computational Chemistry

Automated High Throughput pKa and Distribution Coefficient Measurements of Pharmaceutical Compounds for the SAMPL8 Blind Prediction Challenge



The goal of the SAMPL (Statistical Assessment of the Modeling of Proteins and Ligands) challenge is to improve the accuracy of current computational models for estimating free energies of binding, deprotonation, and distribution, along with other associated physical properties that are useful for the design of new pharmaceutical products. New experimental datasets of physicochemical properties provide opportunities for prospective evaluation of computational prediction methods. Here, aqueous pKa values and a range of biphasic logD values for a variety of pharmaceutical compounds were determined through a streamlined automated process for use in the SAMPL8 physical property challenge. This paper provides an in-depth review of the experimental methods used to create a comprehensive data set for the blind prediction challenge. The significance of this work lies in its use of high-throughput experimentation equipment and instrumentation to produce acid dissociation constants for twenty-three drug molecules, as well as distribution coefficients for eleven of those molecules.



Supplementary weblinks

Input Data and Measured Values
The datasets generated and/or analyzed during the current study, including both input data and measured values, are available in the GitHub repository.