These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.
Introduction to Machine Learning for Chemists: An Undergraduate Course Using Python Notebooks for Visualization, Data Processing, Data Analysis, and Data Modeling

submitted on 08.02.2021, 18:29 and posted on 10.02.2021, 05:06 by Deborah Lafuente, Brenda Cohen, Guillermo Fiorini, Agustín García, Mauro Bringas, Ezequiel Morzan, Diego Onna
Machine learning, a subdomain of artificial intelligence, is a pervasive technology that will shape how chemists interact with data. It is therefore a relevant skill to add to the toolbox of any chemistry student. This work presents a course that introduces machine learning to chemistry students through a set of Python Notebooks and assignments. Python, one of the most popular programming languages, is free software with freely available resources, which keeps the material accessible. The course is designed for students with no previous programming experience, so it progresses incrementally in depth and complexity, covering both programming and machine learning concepts. The examples use real data from physicochemical characterizations of wines, which makes the material attractive and captures students' interest. The topics covered are Introduction to Python, Basic Statistics, Data Visualization and Dimension Reduction, Classification, and Regression.


INQUIMAE, DQIAQF, FCEN, Universidad de Buenos Aires



Declaration of Conflict of Interest

No conflict of interest

Version Notes

Manuscript Version 1