Using Machine Learning to Estimate Concentrations of Non-Targeted Chemicals Without Analytical Standards

We developed two in silico quantification methods for chemicals analyzed with capillary electrophoresis electrospray ionization-mass spectrometry (CE-ESI-MS), each based on a machine learning algorithm: a random forest (RF) and an artificial neural network (ANN). Both algorithms predict chemical concentrations from the chemicals’ relative response factors (RRFs) and physicochemical properties. For the testing set, the RF and ANN predicted the measured concentrations with a mean absolute error of 0.2 log units and a coefficient of determination (R²) of about 0.85.
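
To make the two reported metrics concrete, the following minimal sketch shows one way a mean absolute error in log units and a coefficient of determination could be computed for a set of predictions. It assumes both metrics are evaluated on log10-transformed concentrations, which the text above does not state explicitly. The function names and the example concentrations are hypothetical and are not taken from the study.

```python
import math


def log10_mae(measured, predicted):
    """Mean absolute error between log10-transformed concentrations."""
    errors = [
        abs(math.log10(p) - math.log10(m))
        for m, p in zip(measured, predicted)
    ]
    return sum(errors) / len(errors)


def r_squared(measured, predicted):
    """Coefficient of determination on log10-transformed concentrations.

    Assumption: R^2 is computed on the log scale, like the error metric.
    """
    log_m = [math.log10(m) for m in measured]
    log_p = [math.log10(p) for p in predicted]
    mean_m = sum(log_m) / len(log_m)
    ss_res = sum((m - p) ** 2 for m, p in zip(log_m, log_p))
    ss_tot = sum((m - mean_m) ** 2 for m in log_m)
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations (arbitrary units), not data from the study.
measured = [1.0, 10.0, 100.0, 1000.0]
predicted = [1.5, 8.0, 130.0, 700.0]

mae = log10_mae(measured, predicted)
r2 = r_squared(measured, predicted)

# An error in log10 units converts to a multiplicative (fold) error.
fold_error = 10 ** mae

print(f"MAE = {mae:.3f} log units, fold error = {fold_error:.2f}, R2 = {r2:.3f}")
```

Under this reading, a mean absolute error of 0.2 log units means predictions fall, on average, within a factor of 10^0.2 (about 1.6) of the measured concentration.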