ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

Building Machine Learning Force Fields of Proteins with Fragment-Based Approach and Transfer Learning

preprint
submitted on 05.04.2021, 02:59 and posted on 06.04.2021, 10:53 by Zheng Cheng, Jiahui Du, Lei Zhang, Jing Ma, Wei Li, Shuhua Li

Molecular dynamics (MD) simulation plays an essential role in understanding protein functions at the atomic level. At present, MD simulations of proteins are mainly based on classical force fields. However, classical force fields are still not accurate enough to describe the structures and dynamical properties of proteins reliably. Here we present a novel protocol to construct a machine learning force field (MLFF) for a given protein with full quantum mechanics (QM) accuracy. In this protocol, the energy of the target system is obtained by fitting the energies of its various subsystems, which are constructed with the generalized energy-based fragmentation (GEBF) approach. To facilitate the construction of MLFFs for various proteins, a protein data library is created to store all subsystem data generated from previously trained proteins. With this library, a new protein requires only its subsystems with new topological types to construct the corresponding MLFF. The protocol is illustrated with two polypeptides, 4ZNN and a segment of 1XQ8, as examples. The energies and forces predicted by this MLFF are in good agreement with those from density functional theory calculations, and the dihedral angle distributions from GEBF-MLFF MD simulations reproduce those from ab initio MD simulations well. Therefore, this GEBF-ML protocol is expected to be an efficient and systematic way to build force fields with QM accuracy for proteins and other biological systems.
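
The two ideas in the abstract, reusing stored subsystem data by topological type and assembling the target energy from subsystem energies, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Subsystem` class, the topology labels, the coefficients, and the energy values are all hypothetical, and the energy is assembled as a plain signed linear combination that ignores any electrostatic embedding corrections a full GEBF treatment may include.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subsystem:
    """Hypothetical record for one fragment-derived subsystem."""
    topology: str     # assumed label for the subsystem's topological type
    coefficient: int  # assumed signed weight in the linear combination
    energy: float     # stand-in for an ML-predicted subsystem energy


def missing_topologies(subsystems, library):
    """Return the topological types that are not yet in the data library."""
    return sorted({s.topology for s in subsystems} - set(library))


def total_energy(subsystems):
    """Simplified assembly: signed sum of subsystem energies."""
    return sum(s.coefficient * s.energy for s in subsystems)


# Hypothetical library built from previously trained proteins.
library = {"ALA-GLY": "stored data A", "GLY-SER": "stored data B"}

# Hypothetical subsystems of a new protein (illustrative values only).
subsystems = [
    Subsystem("ALA-GLY", 1, -10.0),
    Subsystem("GLY-SER", 1, -12.0),
    Subsystem("GLY", -1, -4.0),  # overlap counted twice, so subtracted once
]

new_types = missing_topologies(subsystems, library)
energy = total_energy(subsystems)
```

In this toy setup only the "GLY" type would need new reference data, which mirrors the abstract's point that a new protein requires calculations only for subsystems with topological types not already stored.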

Funding

21833002

22033004

22073043

21873046

History

Email Address of Submitting Author

2369561300@qq.com

Institution

Nanjing University

Country

China

ORCID For Submitting Author

0000-0003-2737-606X

Declaration of Conflict of Interest

The authors declare no competing financial interest.

Version Notes

GEBF-ML-2021.4.5.v1
