Chemical Education

Principal Component Analysis (PCA) and Statistical Tests Using Factoshiny and R Commander.

Authors

Abstract

Spreadsheets are commonly used for data handling. However, for very large data sets, spreadsheets cannot readily perform statistical analyses such as one-way ANOVA, boxplots, plots of means, and principal component analysis (PCA). Most of the students had never worked with programming software such as MATLAB, Python, Octave, or R. Hence, in this lab experiment, students analyzed large data sets using the R Commander and Factoshiny plugins. R Commander and Factoshiny are packages that provide a graphical user interface (GUI). GUI plugins allow students with no programming knowledge to run statistical tests quickly and easily without having to type a single command. The class was divided into three parts. First, students analyzed a red wine data set (1599 samples, 11 physicochemical variables, and one qualitative variable) to find correlations between wine quality (the qualitative variable) and the physicochemical variables (the quantitative variables). Second, they analyzed a white wine data set (4898 samples, 11 physicochemical variables, and one qualitative variable) to find correlations between white wine quality and its physicochemical variables. Third, they analyzed a combined red and white wine data set to find correlations between the wines' physicochemical variables and their quality and type. Statistical tests and PCA were carried out using R Commander and Factoshiny, respectively. Owing to the graphical interface and simplicity of these two plugins, the class can be completed in 200 min.
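
The GUI hides the underlying computation. As a rough illustration of what a one-way ANOVA evaluates when a physicochemical variable is compared across quality classes, the following Python sketch computes the F statistic by hand. The function name `one_way_anova_f`, the three quality classes, and the alcohol values are made up for illustration; they are not taken from the wine data sets, and this is not the code that R Commander runs.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical alcohol values (% vol) for three made-up quality classes
alcohol_by_quality = {
    "low": [9.0, 10.0, 11.0],
    "medium": [10.0, 11.0, 12.0],
    "high": [12.0, 13.0, 14.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(alcohol_by_quality.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints: F(2, 6) = 7.00
```

A large F value relative to the F distribution with the stated degrees of freedom suggests that the mean of the variable differs between quality classes. The p-value lookup is omitted here to keep the sketch free of third-party libraries.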

Content

chemrxiv 24 10 2021.pdf

Supplementary material

Tables
Tables used in the examples given in the paper (ZIP)
