Generative BigSMILES: An Extension for Polymer Informatics, Computer Simulations & ML/AI

09 August 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


The BigSMILES notation, a concise tool for representing polymer ensembles, is augmented here with an enhanced version called generative BigSMILES (G-BigSMILES). G-BigSMILES is designed for generative workflows and is complemented by tailored software tools for ease of use. The extension integrates additional data, including reactivity ratios (or connection probabilities among repeat units), molecular weight distributions, and ensemble size. An algorithm, interpretable as a generative graph, is devised that uses these data to generate molecules from defined polymer ensembles. Consequently, the G-BigSMILES notation allows complex molecular ensembles to be specified efficiently in a streamlined line notation, providing a foundational tool for automated design of polymeric materials. In addition, the graph interpretation of the G-BigSMILES notation sets the stage for robust machine learning methods capable of encapsulating intricate polymeric ensembles. Combining G-BigSMILES with advanced machine learning techniques will facilitate straightforward property determination and the automation of in-silico polymeric material synthesis. This integration has the potential to significantly accelerate materials design and advance the field of polymer science.
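The generative idea in the abstract can be illustrated with a minimal Python sketch. It is not the G-BigSMILES software or its notation. It only shows how connection probabilities between repeat units, a molecular weight distribution, and an ensemble size could together drive molecule generation. The monomer labels `A` and `B`, their masses, the probability table, the Gaussian molecular weight distribution, and the function names `sample_chain` and `sample_ensemble` are all assumptions introduced for illustration.

```python
import random

# Hypothetical repeat units with illustrative molar masses (g/mol).
MONOMER_MASS = {"A": 104.15, "B": 100.12}

# Hypothetical connection probabilities: TRANSITION[u][v] is the
# probability that unit v follows unit u along the chain.
TRANSITION = {
    "A": {"A": 0.7, "B": 0.3},
    "B": {"A": 0.4, "B": 0.6},
}


def sample_chain(target_mass, rng):
    """Grow one chain as a random walk over repeat units.

    Units are appended until the chain mass reaches target_mass.
    Returns the list of unit labels and the final chain mass.
    """
    unit = rng.choice(sorted(MONOMER_MASS))
    chain, mass = [unit], MONOMER_MASS[unit]
    while mass < target_mass:
        options = TRANSITION[unit]
        unit = rng.choices(list(options), weights=list(options.values()))[0]
        chain.append(unit)
        mass += MONOMER_MASS[unit]
    return chain, mass


def sample_ensemble(total_mass, mean_mass, sigma, seed=0):
    """Generate chains until their summed mass reaches total_mass.

    Each chain's target mass is drawn from a Gaussian, used here only
    as a stand-in for a molecular weight distribution.
    """
    rng = random.Random(seed)
    smallest = min(MONOMER_MASS.values())
    ensemble, accumulated = [], 0.0
    while accumulated < total_mass:
        target = max(rng.gauss(mean_mass, sigma), smallest)
        chain, mass = sample_chain(target, rng)
        ensemble.append(chain)
        accumulated += mass
    return ensemble


if __name__ == "__main__":
    chains = sample_ensemble(total_mass=50_000.0, mean_mass=5_000.0, sigma=500.0)
    print(len(chains), "chains; first 20 units of chain 0:", "".join(chains[0][:20]))
```

In this sketch the table of connection probabilities acts as a small generative graph: nodes are repeat units and weighted edges give the chance of each next unit, while the ensemble size is expressed as a total mass that stops generation.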


Keywords: line notation

Supplementary materials

Supporting Information for Generative BigSMILES
Jupyter Notebook that explains different scenarios of generative BigSMILES and how it is applied to real chemistries.
