COMPAS-3: a Data Set of peri-Condensed Polybenzenoid Hydrocarbons

26 February 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


We introduce the third installment of the COMPAS Project – a COMputational database of Polycyclic Aromatic Systems, focused on peri-condensed polybenzenoid hydrocarbons (PBHs). In this installment, we develop two data sets containing the optimized ground-state structures and a selection of molecular properties of ∼39k and ∼9k peri-condensed polybenzenoid hydrocarbons (at the GFN2-xTB and CAM-B3LYP-D3BJ/cc-pVDZ//CAM-B3LYP-D3BJ/def2-SVP levels, respectively). The manuscript details the enumeration and data generation processes and describes the information available within the data sets. An in-depth comparison between the two types of computation is performed, and it is found that the geometric disagreement is maximal for slightly distorted molecules. In addition, a data-driven analysis of the structure-property trends of peri-condensed PBHs is performed, highlighting the effect of the size of peri-condensed islands and linearly annulated rings on the HOMO-LUMO gap. The insights described herein are important for the rational design of novel functional aromatic molecules for use in, e.g., organic electronics. The generated data sets provide a basis for additional data-driven machine- and deep-learning studies in chemistry.
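As a minimal illustration of the kind of property comparison described above, the following Python sketch computes HOMO-LUMO gaps from orbital energies and the mean absolute deviation between two levels of theory. The function names and the orbital energies are hypothetical placeholders chosen for illustration; they are not taken from the COMPAS-3 data sets, and the abstract does not state which comparison metrics the authors use.

```python
from statistics import mean


def homo_lumo_gap(homo_ev, lumo_ev):
    """Return the HOMO-LUMO gap (eV) given orbital energies in eV."""
    if lumo_ev < homo_ev:
        raise ValueError("LUMO energy must not lie below the HOMO energy")
    return lumo_ev - homo_ev


def mean_abs_deviation(values_a, values_b):
    """Mean absolute deviation between two equally long sequences."""
    if len(values_a) != len(values_b):
        raise ValueError("sequences must have the same length")
    return mean(abs(a - b) for a, b in zip(values_a, values_b))


# Placeholder (HOMO, LUMO) energies in eV for two hypothetical molecules.
# These numbers are illustrative only and are NOT COMPAS-3 data.
xtb_orbitals = [(-10.0, -8.0), (-9.5, -8.0)]
dft_orbitals = [(-7.0, -1.0), (-6.5, -1.5)]

xtb_gaps = [homo_lumo_gap(h, l) for h, l in xtb_orbitals]
dft_gaps = [homo_lumo_gap(h, l) for h, l in dft_orbitals]

# Average absolute difference between the two sets of gaps.
gap_mad = mean_abs_deviation(xtb_gaps, dft_gaps)
print(f"MAD between gap sets: {gap_mad:.2f} eV")
```

With real data, the placeholder lists would be replaced by the orbital energies of the same molecules at each level of theory, in matching order.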


polycyclic aromatic hydrocarbons
computational chemistry
structure-property relationships
peri-condensed polybenzenoids

Supplementary materials

Supporting Information for COMPAS-3
Details of general computational methods, templates for xTB and DFT calculations, benchmarking procedure for choosing the DFT level of theory, comparison of the COMPAS-1 data set using the two levels of theory (this report versus the original publication), extended discussion of the outliers in the aIP and aEA plot, comparison of D3 and D4 dispersion corrections, and additional discussion of the relative energy and structure-property analyses.
