ichor: A Python library for computational chemistry data management and machine learning force field development

10 June 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present ichor, an open-source Python library that simplifies data management in computational chemistry and streamlines machine learning force field development. Ichor implements many easily extendable file management tools, together with a lazy file reading system, allowing efficient management of hundreds of thousands of computational chemistry files. Data from calculations can be readily stored in databases for easy sharing and post-processing, and raw data can be processed directly by ichor to create machine learning-ready datasets. In addition to these data-related capabilities, ichor provides interfaces to popular workload management software used on High Performance Computing clusters, so that thousands of separate calculations can be submitted with a single line of Python code. A simple-to-use command line interface, implemented as a series of menus, further improves the accessibility and efficiency of common ichor tasks. Finally, ichor implements general tools for visualizing and analyzing datasets and for measuring machine learning model quality, both on test set data and in simulations. With its current functionality, ichor can serve as an end-to-end data procurement, data management, and analysis solution for machine learning force field development.
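
The lazy file reading idea mentioned in the abstract can be illustrated with a minimal, generic Python sketch. This is not ichor's API: the class name LazyFile, its attributes, and the demo function are hypothetical and chosen only for illustration. The assumption is simply that a file object defers reading from disk until its contents are first requested and then caches the result, so that creating objects for many files stays cheap.

```python
import tempfile
from pathlib import Path


class LazyFile:
    """Hypothetical illustration (not ichor's API): read a file only on first access."""

    def __init__(self, path):
        self.path = Path(path)
        self._lines = None   # nothing has been read from disk yet
        self.read_count = 0  # number of actual disk reads, for demonstration

    @property
    def lines(self):
        # Read and cache the contents the first time they are requested.
        if self._lines is None:
            self._lines = self.path.read_text().splitlines()
            self.read_count += 1
        return self._lines


def demo():
    """Create three small files, wrap them lazily, and access only one of them."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(3):
            p = Path(tmp) / f"point{i}.xyz"
            p.write_text(f"1\ncomment {i}\nH 0.0 0.0 {i}.0\n")
            paths.append(p)

        files = [LazyFile(p) for p in paths]  # no file is read here
        reads_before = sum(f.read_count for f in files)

        atom_line = files[1].lines[2]  # triggers a single read of one file
        _ = files[1].lines             # second access is served from the cache
        reads_after = sum(f.read_count for f in files)

        return reads_before, atom_line, reads_after
```

In this sketch, wrapping the three files performs no disk reads, and only the one file whose contents are requested is ever read. The same pattern, applied to much larger collections of files, is what makes deferred reading attractive.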

Keywords

ichor
Python library
force field development
data procurement

Supplementary materials

Supporting Information
Contents:
1. Example of Methanol Geometry
2. Example of a GAUSSIAN input file written by ichor
3. Example of an ORCA input file written by ichor
4. PointDirectory Raw Data Full Output
5. PointsDirectory Full Output for Two Points
6. JSON Database Example
7. SQLite3 Database Schema
8. Example Model File Used by ichor and FFLUX
