These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.
Preprints are manuscripts made publicly available before they have been submitted for formal peer review and publication. They might contain new research findings or data. Preprints can be a draft or final version of an author's research but must not have been accepted for publication at the time of submission.
Machine learning has been emerging as a promising tool in the chemical and materials domain. In this paper, we introduce a framework to automatically perform rational model selection and hyperparameter optimization that are important concerns for the efficient and successful use of machine learning, but have so far largely remained unexplored by this community. The framework features four variations of genetic algorithm and is implemented in the chemml program package. Its performance is benchmarked against popularly used algorithms and packages in the data science community and the results show that our implementation outperforms these methods both in terms of time and accuracy. The effectiveness of our implementation is further demonstrated via a scenario involving multi-objective optimization for model selection.