Abstract
Protein pKa prediction is essential for investigating pH-dependent relationships between protein structure and function. In this work, we introduce DeepKa, a deep-learning-based protein pKa predictor that is trained and validated with pKa values derived from continuous constant pH molecular dynamics (CpHMD) simulations of 279 soluble proteins. The CpHMD implementation in the Amber molecular dynamics package was employed (Huang, Harris, and Shen, J. Chem. Inf. Model. 2018, 58, 1372-1383). Notably, to avoid discontinuities at the boundary, grid charges are proposed to represent protein electrostatics. We show that the prediction accuracy of DeepKa is close to that of the CpHMD benchmarking simulations, validating DeepKa as an efficient protein pKa predictor. In addition, the training and validation sets created in this study can be applied to the development of machine-learning-based protein pKa predictors in the future. Finally, the grid charge representation is general and applicable to other topics, such as protein-ligand binding affinity prediction.
Supplementary materials
Title
Supporting information
Description
Supplemental figures and tables, including the statistics of the training and test datasets, the distribution of solvent-accessible surface areas of residues in the training dataset, the correlation plots obtained with Propka, and the pKa convergence analysis of the CpHMD simulations.