Abstract
We introduce MindlessGen, a Python-based generator for creating chemically diverse, “mindless” molecules through random atomic placement and subsequent geometry optimization. Using this framework, we constructed the MB2061 benchmark set, containing 2061 molecules with high-level PNO-LCCSD(T)-F12 reference data for dissociation reactions. This set provides a challenging benchmark for testing, validation, and training of density functional approximations (DFAs), semiempirical methods, force fields, and machine learning potentials using molecular structures beyond the conventional chemical space. For DFAs, we initially hypothesized that highly parameterized functionals might perform poorly on this set. However, no consistent relationship between fitting strategy and accuracy was observed. A clear Jacob’s ladder trend emerges, with ωB97X-2 achieving the lowest mean absolute error (MAE) of 8.4 kcal·mol⁻¹ and r²SCAN-3c offering a robust cost-efficient alternative (19.6 kcal·mol⁻¹). Furthermore, we discuss the performance of selected semiempirical methods and contemporary machine-learned interatomic potentials.
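The random-placement step mentioned in the abstract can be illustrated with a minimal sketch. This is not MindlessGen's actual implementation or API: the function names `place_atoms` and `to_xyz`, the cubic box, the 1.2 Å minimum-distance rejection criterion, and the default parameters are all assumptions chosen for illustration, and the subsequent geometry optimization is omitted.

```python
import math
import random


def place_atoms(elements, box=3.0, min_dist=1.2, seed=0, max_tries=10000):
    """Place each element at a random point in a cube of half-width `box`.

    Hypothetical helper: a trial position is rejected when it lies closer
    than `min_dist` to an already placed atom (units assumed to be angstrom).
    """
    rng = random.Random(seed)
    coords = []
    for _ in elements:
        for _ in range(max_tries):
            trial = tuple(rng.uniform(-box, box) for _ in range(3))
            if all(math.dist(trial, c) >= min_dist for c in coords):
                coords.append(trial)
                break
        else:
            # No acceptable position found within max_tries attempts.
            raise RuntimeError("could not place atom; enlarge the box")
    return list(zip(elements, coords))


def to_xyz(atoms):
    """Format (symbol, (x, y, z)) pairs as an XYZ-file string."""
    lines = [str(len(atoms)), "random starting geometry (not optimized)"]
    for sym, (x, y, z) in atoms:
        lines.append(f"{sym:2s} {x:12.6f} {y:12.6f} {z:12.6f}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_xyz(place_atoms(["C", "H", "O", "N", "F"], seed=42)))
```

A structure produced this way is only a starting guess; in the workflow described above it would then be passed to a geometry optimizer before any reference energies are computed.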
Supplementary materials
Title
ESI1
Description
Additional technical details and supporting figures
Title
ESI2
Description
Calculated reaction energies with SQM, DFT, and MLIPs
Title
Benchmark Set
Description
Geometries (XYZ files), reference energies, and reaction energy coefficients of the MB2061 benchmark set