Modular Software for Generating and Modelling Diverse Polymer Databases

17 January 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Machine learning methods offer the opportunity to design new functional materials on an unprecedented scale; however, building the large, diverse databases of molecules on which to train such methods remains a daunting task. Automated computational chemistry modelling workflows are therefore becoming essential tools in this data-driven hunt for new materials with novel properties, since they allow molecular databases to be created and curated without significant user input. This mitigates well-founded concerns regarding data provenance, reproducibility and replicability. We have developed a versatile software package, PySoftK (Python Soft Matter at King's College London), that provides flexible, automated computational workflows to create, model, and curate libraries of polymers with minimal user intervention. PySoftK is available as an efficient, fully tested, and easily installed Python package. Key features of the software include the wide range of polymer topologies that can be generated automatically and its fully parallelized library generation tools. It is anticipated that PySoftK will support the generation, modelling and curation of large polymer libraries for functional materials discovery in nano- and biotechnology.


high-throughput calculations
molecular models

Supplementary materials

Supplementary Information
Further information about the benchmarking of the code and its torsion-finding function.
