A Genetic Algorithm for Automated Parameterization of Network Hamiltonian Models of Amyloid Fibril Formation

06 July 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The timescales of long atomistic molecular dynamics simulations are typically reported in microseconds, while the timescales of experiments studying the kinetics of amyloid fibril formation are typically reported in minutes or hours. This timescale gap of roughly 9 orders of magnitude presents a major challenge in the design of computer simulation methods for studying protein aggregation events. Thus, coarse-grained molecular simulations of amyloid fibril formation are crucial for understanding the molecular mechanism behind the formation of these structures, which are implicated in diseases such as Alzheimer’s, Parkinson’s, and Type II diabetes. Network Hamiltonian simulations of aggregation are centered around a Hamiltonian function that returns the total energy of a system of aggregating proteins, given the graph structure of the system as input. In the graph, or network, representation of the system, each protein molecule is represented as a node, and non-covalent bonds between proteins are represented as edges. The parameters, i.e. the set of coefficients that determine the degree to which each topological degree of freedom is favored or disfavored, must be determined for each network Hamiltonian model, and doing so is a well-known technical challenge. Here, a type of artificial intelligence (AI) called a genetic algorithm is introduced for autonomously parameterizing network Hamiltonian models, whereby an initial set of randomly parameterized models, typically of low fibril yield (e.g. < 5 %), is used to initiate the evolution of subsequent model generations, ultimately leading to high fibril yield models (e.g. > 70 %).
The methodology is demonstrated by optimizing previously published network Hamiltonian models for the 5 key amyloid fibril topologies that have been reported in the Protein Data Bank (PDB), and by showing that the models generated by the AI produce fibril yields that match or surpass the previously published fibril yields in all cases. The authors also aim to encourage more widespread use of the network Hamiltonian methodology by introducing a free, open-source implementation of the genetic algorithm for fitting network Hamiltonian models to other self-assembling systems.
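
As a rough illustration of the two ideas in the abstract, the Python sketch below pairs a toy network Hamiltonian with a minimal genetic algorithm. It is not the authors' implementation. Several things here are assumptions made for illustration only: the Hamiltonian is taken to be a linear combination of graph statistics (suggested by the "ergm" keyword), the choice of statistics (edges, 2-stars, triangles) is arbitrary, and `toy_yield` is a hypothetical stand-in for the fibril yield that a real workflow would measure by simulating each candidate model. The function names and all numerical settings are likewise hypothetical.

```python
import random


def graph_statistics(edges, n_nodes):
    """Count edges, 2-stars and triangles in an undirected graph.

    Nodes stand for protein molecules, edges for non-covalent bonds.
    The choice of these three statistics is an assumption for illustration.
    """
    adj = {i: set() for i in range(n_nodes)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n_edges = sum(len(nb) for nb in adj.values()) // 2
    two_stars = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    triangles = sum(
        1
        for u in adj
        for v in adj[u] if v > u
        for w in adj[v] if w > v and w in adj[u]
    )
    return (n_edges, two_stars, triangles)


def hamiltonian(theta, stats):
    """Total energy as a weighted sum of graph statistics (assumed linear form)."""
    return sum(t * s for t, s in zip(theta, stats))


def toy_yield(theta, target=(-1.0, 0.5, 2.0)):
    """Hypothetical fitness in (0, 1]. A real run would simulate the model
    defined by theta and return the measured fibril yield instead."""
    dist2 = sum((t - g) ** 2 for t, g in zip(theta, target))
    return 1.0 / (1.0 + dist2)


def evolve(fitness, n_params=3, pop_size=30, generations=40,
           n_elite=2, sigma=0.3, seed=0):
    """Minimal genetic algorithm: elitism, tournament selection,
    uniform crossover and Gaussian mutation."""
    rng = random.Random(seed)
    # Generation 0: randomly parameterized models.
    population = [[rng.uniform(-5.0, 5.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Elites are copied unchanged, so the best fitness never decreases.
        next_gen = [list(p) for p in ranked[:n_elite]]
        while len(next_gen) < pop_size:
            mother = max(rng.sample(ranked, 3), key=fitness)
            father = max(rng.sample(ranked, 3), key=fitness)
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [c + rng.gauss(0.0, sigma) for c in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

Calling `evolve(toy_yield)` returns the best parameter vector found and the best fitness recorded in each generation. In a real application the placeholder fitness would be replaced by a routine that simulates the network Hamiltonian model and scores the resulting fibril yield, which is by far the most expensive step.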

Keywords

network Hamiltonian
artificial intelligence
amyloid fibril
biophysics
ergm
machine learning
genetic algorithm
automated discovery
network
statistical mechanics
supramolecular chemistry
