Exploring and Mapping Chemical Space with Molecular Assembly Trees

05 May 2021, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


The mathematical search of chemical space can generate an almost infinite number of molecules, and it is hard to know which of them are experimentally relevant. A way to explore the chemical space of known molecules as a function of their relative complexity might help us understand biological processes and find new relationships. Assembly theory provides an approach to explore and compare the intrinsic complexity of molecules by the minimum number of steps needed to build up the target graphs. Here we show that assembly theory can be applied to networks of molecules to explore the assembly properties of common motifs and to use these to define a tree of assembly spaces. This approach allows us to explore the accessible molecules connected to the tree, rather than the entire space of possible molecules. We apply it to prebiotic chemistry, gene sequences, a family of plasticizers, and the well-known opiate class of natural products. This analysis allows us to quantify the amount of external information needed to assemble the tree and to identify and predict new components in this family of molecules. Finally, by developing a new reassembly system that uses the disassembly motifs, we found that, in the case of the opiates, a new set of opiate-like drug candidates could be generated that would not be accessible via conventional fragment-based drug design, demonstrating how this approach might find application in drug discovery.
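To make the idea of a "minimum number of steps needed to build up the target" concrete, the sketch below computes an assembly-style index for short strings (for example a toy gene sequence) rather than for molecular graphs. It is an illustrative assumption, not the authors' algorithm: the function name `assembly_index`, the choice of single characters as basic building blocks, and the use of concatenation as the only joining operation are all hypothetical simplifications. Each step joins two objects that are already available, and objects built earlier can be reused for free.

```python
from itertools import product


def assembly_index(target: str) -> int:
    """Minimum number of joining steps to build `target` from its characters.

    Toy string analogue (an assumption, not the paper's graph method):
    the starting pool is the set of single characters in `target`, and one
    step concatenates any two objects currently in the pool.
    """
    if len(target) <= 1:
        return 0  # a basic building block needs no joining step

    basics = frozenset(target)

    def reachable(pool: frozenset, steps_left: int) -> bool:
        # Can `target` be built from `pool` using at most `steps_left` joins?
        if target in pool:
            return True
        if steps_left == 0:
            return False
        for a, b in product(pool, repeat=2):
            new = a + b
            # Only substrings of the target can ever contribute to it,
            # so anything else is pruned; duplicates are skipped too.
            if new in pool or new not in target:
                continue
            if reachable(pool | {new}, steps_left - 1):
                return True
        return False

    # Iterative deepening: the first depth that succeeds is the minimum.
    depth = 0
    while not reachable(basics, depth):
        depth += 1
    return depth


if __name__ == "__main__":
    for s in ["AB", "AAAA", "ABABAB", "ABCD"]:
        print(s, assembly_index(s))
```

The search is exhaustive and therefore only practical for very short strings. It is meant to show why reuse of previously built motifs lowers the step count: "ABABAB" needs three joins (AB, then ABAB, then ABABAB), the same as "ABCD", which is shorter but contains no repeated motif.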


Assembly theory
Molecular Assembly
Molecular Complexity
Assembly Tree
Molecular information

