Accurate prediction of the pKa values of protein residues is crucial to many applications in biological simulation and drug discovery. Here we present the use of free energy perturbation (FEP) calculations to predict the pKa values of individual protein residues. We begin with an initial set of 191 residues with experimentally determined pKa values. To isolate sampling limitations from force field inaccuracies, we develop an algorithm to classify residues whose environments are significantly affected by crystal packing effects. We then report an approach to identify buried histidines that require significant sampling beyond what is achieved in typical FEP calculations. We therefore define a clean dataset that does not require algorithms capable of predicting major conformational changes and on which other pKa prediction methods can be tested. On this dataset, we report an RMSE of 0.76 pKa units for 35 ASP residues, 0.51 pKa units for 44 GLU residues, and 0.67 pKa units for 76 HIS residues.
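As an illustration of the quantities reported above, the following is a minimal sketch of how a relative deprotonation free energy might be converted to a pKa and how the RMSE against experiment could be computed. It assumes the standard thermodynamic relation pKa(protein) = pKa(model) + ΔΔG / (RT ln 10); the function names, the choice of kcal/mol, and the default temperature are assumptions made for illustration and are not taken from the authors' workflow.

```python
import math

# Gas constant in kcal/(mol*K); the unit choice is an assumption for this sketch.
R_KCAL = 0.0019872041


def pka_from_ddg(pka_model, ddg_kcal, temperature=298.15):
    """Shift a reference (model compound) pKa by a relative free energy.

    ddg_kcal is the deprotonation free energy in the protein minus that of
    the model compound, in kcal/mol (hypothetical input for illustration).
    """
    return pka_model + ddg_kcal / (R_KCAL * temperature * math.log(10))


def rmse(predicted, experimental):
    """Root-mean-square error between predicted and experimental pKa values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))
```

At 298.15 K the factor RT ln 10 is about 1.36 kcal/mol, so a relative free energy of that size corresponds to a shift of roughly one pKa unit.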
Detailed description of the construction of the crystal packing severity scoring function, table of defined crystal packing true positives, table of final weights for the crystal packing scoring function, detailed description of the histidine force field parameterization and final set of force field parameters, table of all titratable residues, their experimental and predicted pKa values, and the experimental reference