Increasing the Accuracy and Robustness of the CHARMM General Force Field with an Expanded Training Set

14 January 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Small molecule empirical force fields (FFs), including the CHARMM General Force Field (CGenFF), are designed to have wide coverage of organic molecules and to rapidly assign parameters to molecules not explicitly included in the FF. Assignment of parameters to new molecules in CGenFF is based on a trained bond-angle-dihedral charge increment linear interpolation scheme for the partial atomic charges, along with bonded parameters assigned by analogy using a rules-based penalty score scheme associated with atom types and chemical connectivity. Accordingly, the accuracy of CGenFF is related to the extent of the training set of available parameters. In the present study, that training set is extended by 1,390 molecules selected to represent connectivities new to CGenFF training compounds. Quantum mechanical (QM) data for optimized geometries; bond, valence angle, and dihedral angle potential energy scans; interactions with water; molecular dipole moments; and electrostatic potentials were used as target data. The resultant bonded parameters and partial atomic charges were used to train a new version of the CGenFF program, v5.0, which was used to generate parameters for a validation set of molecules, including drug-like molecules approved by the FDA, which were then benchmarked against both experimental and QM data. CGenFF v5.0 shows overall improvements with respect to QM intramolecular geometries, vibrations, dihedral potential energy scans, dipole moments, and interactions with water. Tests of pure solvent properties of 216 molecules show small improvements versus the previous release of CGenFF, v2.5.1, reflecting the high quality of the Lennard-Jones parameters that were explicitly optimized during the initial optimization of both the CGenFF and the CHARMM36 force fields. CGenFF v5.0 represents an improvement that is anticipated to more accurately model intramolecular geometries and strain energies as well as non-covalent interactions of drug-like and other organic molecules.
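
To illustrate the idea behind a charge increment scheme, the following is a minimal sketch and not the CGenFF implementation. It models only bond increments; the angle and dihedral increments named in the abstract are not covered. The function name assign_charges, the generic atom types "O" and "H", and the increment value are all hypothetical choices made for illustration, not CGenFF atom types or trained parameters.

```python
def assign_charges(atom_types, bonds, increments, formal_charges=None):
    """Toy bond charge increment assignment (illustrative only).

    atom_types:     list of atom type labels, one per atom
    bonds:          list of (i, j) atom index pairs
    increments:     dict mapping (type_a, type_b) to an increment in e;
                    the value is added to the type_a atom and subtracted
                    from the type_b atom, so total charge is conserved
    formal_charges: optional starting charges (defaults to all zero)
    """
    if formal_charges is None:
        charges = [0.0] * len(atom_types)
    else:
        charges = [float(q) for q in formal_charges]

    for i, j in bonds:
        key = (atom_types[i], atom_types[j])
        if key in increments:
            delta = increments[key]
        elif key[::-1] in increments:
            # Same bond seen in the opposite direction: flip the sign.
            delta = -increments[key[::-1]]
        else:
            raise KeyError(f"no increment available for bond types {key}")
        charges[i] += delta
        charges[j] -= delta
    return charges


# Hypothetical three-atom example with a made-up increment value.
toy_increments = {("O", "H"): -0.4}
toy_charges = assign_charges(["O", "H", "H"], [(0, 1), (0, 2)], toy_increments)
```

Because every increment moves charge between two bonded atoms, the molecular net charge always equals the sum of the starting charges. In this sketch a bond type with no entry simply raises an error; a scheme based on analogy would instead substitute a similar entry and record a penalty.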

Keywords

molecular dynamics simulations
molecular modeling
drug design
medicinal chemistry

Supplementary materials

Supporting Information
Additional figures and tables for: water-compound interactions; pure solvent MD simulations; partial charge distribution; penalty and charge correlation; example of equilibrium values for a molecule with optimized bonded terms; 4-membered ring validation molecules and corresponding comparison of MM vs QM internal coordinates; PES scans energy plots; vibrational analysis; additional data for FDA compounds validation and HFE calculations. Also see GitHub repository.
