Data-Driven Design of Protein-Like Single-Chain Polymer Nanoparticles

13 September 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


The functional structure of proteins is heavily influenced by their folding behavior. AlphaFold, a powerful artificial intelligence (AI) program trained on information from the Protein Data Bank (PDB), was developed to predict the 3D structure of a protein from its amino acid sequence. Inspired by this, we aim to elucidate structural features of synthetic single-chain polymer nanoparticles (SCNPs) from compositional information (monomers, chain length, molecular weight, charge, and valency) using machine learning (ML). Specifically, we demonstrate that ML improves the efficiency of SCNP design and uncovers polymer design attributes that are important for mimicking protein-like structural features. To start, we randomly screened over 1000 synthesized SCNPs through a combination of high-throughput dynamic light scattering (DLS) and small-angle X-ray scattering (SAXS) and compared these results to simulated protein data from the PDB. Then, utilizing evidential neural networks (ENets), we predicted, synthesized, and characterized 30 novel compact SCNPs. Notably, 58% of the SCNPs predicted by this data-driven approach had a Porod exponent ≥ 3.5, compared with 5% of SCNPs from the random screen. Using Shapley additive explanation (SHAP) values, we further uncovered the contributions of monomer content to the Porod exponent and radius of gyration. This work shows that an ML-guided approach is effective for the challenging, unintuitive problem of nanoparticle design.
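The compactness criterion quoted above (Porod exponent ≥ 3.5) can be illustrated with a minimal sketch. It assumes the exponent is estimated as the negative slope of log I(q) versus log q over a chosen high-q window, which is a common convention and not necessarily the fitting procedure used in this work. The function names, the q window, and the synthetic data are all hypothetical.

```python
import numpy as np


def porod_exponent(q, intensity, q_min, q_max):
    """Estimate a Porod exponent p, assuming I(q) ~ q**(-p) on [q_min, q_max].

    The fitting window is an assumption supplied by the caller; it is not
    taken from the preprint.
    """
    q = np.asarray(q, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Keep only points inside the window with positive intensity (log is defined).
    mask = (q >= q_min) & (q <= q_max) & (intensity > 0)
    if mask.sum() < 2:
        raise ValueError("need at least two valid points in the fitting window")
    # Linear least-squares fit in log-log space; the slope equals -p.
    slope, _intercept = np.polyfit(np.log(q[mask]), np.log(intensity[mask]), 1)
    return -slope


def fraction_compact(exponents, threshold=3.5):
    """Fraction of samples whose Porod exponent meets the compactness threshold."""
    exponents = np.asarray(exponents, dtype=float)
    return float(np.mean(exponents >= threshold))


# Synthetic example: an ideal power law with exponent 4 (illustrative only).
q = np.logspace(-2, 0, 200)
intensity = 1e-3 * q ** -4.0
p = porod_exponent(q, intensity, q_min=0.1, q_max=1.0)
```

Applying `fraction_compact` to the exponents of a screened or predicted library gives the kind of hit rate reported in the abstract. Real SAXS curves would also need background subtraction and a justified choice of fitting window, both of which this sketch leaves out.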


small-angle X-ray scattering
dynamic light scattering
machine learning
high throughput

Supplementary materials

Supporting Information
Contains additional data and tables

