Infinite Dilution Activity Coefficients as Constraints for Force Field Parameterization and Method Development

<p>Molecular simulations are widely used to calculate physical properties of interest, with predictive molecular design as a key goal. These simulations, including molecular dynamics (MD) simulations, begin with an underlying energy model, or force field, and use it to compute the properties of interest. One of the most significant challenges in molecular dynamics and modeling studies is ensuring that the force field approximates the underlying physics well enough for computed quantities to reproduce experimental properties with the desired level of accuracy. Parameterization of force fields depends on a variety of experimental properties that cover as much of the chemistry of interest as possible. Physicochemical properties that can be measured in a relatively straightforward manner are particularly interesting for force field developers, because they can be measured for a relatively diverse chemical set and used to expand the parameterization dataset as needed. Here, we examine infinite dilution activity coefficients (IDACs), which are experimental quantities that can play this role. We retrieved 237 experimental IDACs from NIST's ThermoML, a database of measured thermodynamic properties, and estimated the corresponding values using solvation free energy calculations. We found that calculated IDAC values correlate strongly with experiment: the natural logarithms of the calculated and experimental IDAC values show a Pearson correlation coefficient of 0.85 ± 0.02. The calculated IDAC values allow us to identify strengths and potential weaknesses of force field parameters for specific functional groups in solutes and solvents. This suggests that IDACs capture some of the same type of information as hydration and solvation free energies and may therefore be a useful new source of experimental data for force field parameterization.</p>
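The correlation statistic quoted above can be illustrated with a minimal sketch. It computes the Pearson correlation between the natural logarithms of calculated and experimental IDACs and estimates an uncertainty by bootstrap resampling. The abstract does not state how the ± 0.02 uncertainty was obtained, so the bootstrap is an assumption here, and the function names and input values are hypothetical placeholders, not data from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_pearson_with_bootstrap(calc, expt, n_boot=1000, seed=0):
    """Return (r, std) for ln(calc) vs ln(expt).

    std is the standard deviation of r over bootstrap resamples of the
    (calc, expt) pairs. Assumption: the source does not say which
    uncertainty estimator it used.
    """
    ln_calc = [math.log(v) for v in calc]
    ln_expt = [math.log(v) for v in expt]
    r = pearson(ln_calc, ln_expt)

    rng = random.Random(seed)
    n = len(ln_calc)
    samples = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [ln_calc[i] for i in idx]
        ys = [ln_expt[i] for i in idx]
        # Skip degenerate resamples in which one variable has zero variance.
        if len(set(xs)) < 2 or len(set(ys)) < 2:
            continue
        samples.append(pearson(xs, ys))

    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (len(samples) - 1))
    return r, std


# Made-up IDAC values for illustration only (not from ThermoML).
expt_idac = [1.2, 3.5, 10.0, 45.0, 150.0, 0.8]
calc_idac = [1.5, 2.9, 14.0, 30.0, 210.0, 1.1]

r, r_std = log_pearson_with_bootstrap(calc_idac, expt_idac)
print(f"Pearson r of ln(IDAC): {r:.2f} +/- {r_std:.2f}")
```

Working with the logarithm is natural because ln(IDAC) scales with a free energy difference, so the comparison weights compounds spanning several orders of magnitude in IDAC more evenly than a comparison of the raw values would.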