The COMPAS Project: A Computational Database of Polycyclic Aromatic Systems. Phase 1: cata-condensed Polybenzenoid Hydrocarbons

27 April 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Chemical databases are an essential tool for data-driven investigation of structure-property relationships and design of novel functional compounds. We introduce the first phase of the COMPAS Project – a COMputational database of Polycyclic Aromatic Systems. In this phase, we have developed two datasets containing the optimized ground-state structures and a selection of molecular properties of 34k and 9k cata-condensed polybenzenoid hydrocarbons (at the GFN2-xTB and B3LYP-D3BJ/def2-SVP levels, respectively), and have placed them in the public domain. Herein we describe the process of the dataset generation, detail the information available within the datasets, and show the fundamental features of the generated data. We analyze the correlation between the two types of computation as well as the structure-property relationships of the calculated species. The data and the insights gained from them can inform rational design of novel functional aromatic molecules for use in, e.g., organic electronics, and can provide a basis for additional data-driven machine- and deep-learning studies in chemistry.


polycyclic aromatic hydrocarbons
computational chemistry
structure-property relationships

Supplementary materials

Supporting Information for COMPAS_Phase1
General computational details, description of benchmarking procedure, histograms of data distribution, color-coded plots for all studied structural features, further analysis on D3 versus D4 corrections.
