Abstract
State-of-the-art in silico ∆∆G predictions for antibody-antigen complexes achieve an accuracy of ±1 kcal/mol. While this is sufficient for high-throughput screening or affinity maturation, it is insufficient for assessing the criticality of post-translational modifications (PTMs) during clinical development. PTMs that impair binding by >50% pose a major risk to achieving the desired therapeutic bioactivity and must be controlled within defined limits to ensure product quality. A 50% loss in binding, expressed as a two-fold increase in the dissociation constant (Kd), corresponds to a ∆∆G of approximately +0.5 kcal/mol, so in silico predictions require an accuracy of ±0.5 kcal/mol to be practically actionable in clinical phases. In this work, we use conventional molecular dynamics thermodynamic integration (cMD-TI) to generate ∆∆G predictions and develop an error analysis approach based on random forest models and end-state Gaussian accelerated molecular dynamics (GaMD). Using only cMD-TI and end-state GaMD, this approach provides insight into inadequate sampling of key degrees of freedom (DOF). We identify undersampling of bulky side chains and violation of energetically relevant interatomic interactions as major sources of error, and our GaMD-based error corrections improve accuracy by >1 kcal/mol in our most erroneous cases. When applied to a set of 13 predictions, the GaMD-based error correction reduced the root mean square error (RMSE) from 1.01 kcal/mol to 0.69 kcal/mol. This work introduces the application of alchemical free energy predictions to estimating PTM impacts on bioactivity and addresses the current errors that limit their practical use in clinical development.
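The link between a fold change in Kd and ∆∆G quoted in the abstract can be checked with the standard relation ∆∆G = RT ln(Kd,modified / Kd,reference). The sketch below is illustrative only and is not taken from the authors' code repository: the function names are hypothetical, the 298.15 K temperature is an assumption (the abstract does not state one), and a 50% loss in binding is modeled as a two-fold increase in Kd.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def ddg_from_kd_ratio(kd_ratio: float, temperature_k: float = 298.15) -> float:
    """Return ddG = R*T*ln(Kd_modified / Kd_reference) in kcal/mol.

    kd_ratio is the fold change in Kd (hypothetical helper; 2.0 means the
    modified antibody needs twice the concentration for the same occupancy).
    temperature_k defaults to 298.15 K, an assumed value.
    """
    if kd_ratio <= 0:
        raise ValueError("kd_ratio must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_ratio)


def kd_ratio_from_ddg(ddg_kcal_mol: float, temperature_k: float = 298.15) -> float:
    """Invert the relation: fold change in Kd implied by a given ddG."""
    return math.exp(ddg_kcal_mol / (R_KCAL_PER_MOL_K * temperature_k))


if __name__ == "__main__":
    # Two-fold Kd increase (50% loss in binding under the stated assumption).
    print(f"ddG for 2x Kd: {ddg_from_kd_ratio(2.0):.2f} kcal/mol")
    # Fold changes in Kd implied by prediction errors of 0.5 and 1.0 kcal/mol.
    for err in (0.5, 1.0):
        print(f"{err:.1f} kcal/mol <-> {kd_ratio_from_ddg(err):.2f}x Kd")
```

Under these assumptions, a two-fold change in Kd comes out at roughly 0.4 kcal/mol, the same order as the ±0.5 kcal/mol threshold, whereas a 1 kcal/mol uncertainty spans more than a five-fold change in Kd. This is why ±1 kcal/mol accuracy cannot resolve a 50% change in binding.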
Supplementary materials
Title
Supplementary Materials
Description
Full empirical dataset, full TI predictions dataset, results from one-step corrections, explanation of key features referenced in the main text, barstar-barnase GaMD 1-D free energy profiles indicating possible salt bridges or hydrogen bonds (referenced in the Discussion section), random forest + GaMD error analysis applied to all 13 hu4D5-5 systems.
Supplementary weblinks
Title
Code repository
Description
All initial structures, TI scripts, and random forest model notebooks used for this study.