Introduction to Machine Learning for Chemists: An Undergraduate Course Using Python Notebooks for Visualization, Data Processing, Data Analysis, and Data Modeling

10 February 2021, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Machine learning, a subdomain of artificial intelligence, is a pervasive technology that will shape how chemists interact with data. It is therefore a relevant skill for any chemistry student's toolbox. This work presents a course that introduces machine learning to chemistry students through a set of Python notebooks and assignments. Python, one of the most popular programming languages, is free and supported by freely available software and resources, which ensures accessibility. The course is designed for students with no previous programming experience and progresses incrementally in depth and complexity, covering both programming and machine learning concepts. The examples use real data from physicochemical characterizations of wines, which makes the material engaging for students. The topics covered are Introduction to Python, Basic Statistics, Data Visualization and Dimension Reduction, Classification, and Regression.


Machine Learning
Introduction to Python
Basic Statistics
Data Visualization
Dimension Reduction

