Analytical Chemistry

Open-Source Chromatographic Data Analysis for Reaction Optimization and Screening



Automation and digitalization solutions in small-molecule synthesis face new challenges in chemical reaction analysis, especially in high-performance liquid chromatography (HPLC). Chromatographic data remain locked in vendors’ hardware and software components, which limits their potential in automated workflows and contradicts the FAIR data principles (findability, accessibility, interoperability, reusability) that enable chemometrics and data science applications. In this work, we present an open-source Python project called MOCCA (Multivariate Online Contextual Chromatographic Analysis) for the analysis of open-format HPLC–DAD (diode array detector) raw data. MOCCA provides a comprehensive set of data analysis features, including a peak deconvolution routine that automatically deconvolves known signals even when they overlap with signals of unexpected impurities or side products. We highlight the broad applicability of MOCCA in four studies: (i) a simulation study to validate MOCCA’s data analysis features; (ii) a reaction kinetics study on a Knoevenagel condensation reaction demonstrating MOCCA’s peak deconvolution feature; (iii) a closed-loop optimization study for the alkylation of 2-pyridone highlighting MOCCA’s potential to obviate the need for human control during data analysis; and (iv) a well plate screening of categorical reaction parameters for a novel palladium-catalyzed cyanation of aryl halides employing O-protected cyanohydrins, in which MOCCA tracks all known and unknown signals. These studies emphasize how MOCCA enables its users to make data-based decisions in synthesis workflows with different degrees of automation by providing actionable analytics. By publishing MOCCA as a Python package together with this work, we envision an open-source community project for chromatographic data analysis with the potential to further advance its scope and capabilities.



Supplementary material

Supplementary Information
Additional details on all presented case studies, a description of how to extract HPLC–DAD raw data from the control software of major vendors, technical details on MOCCA’s data analysis features, and NMR spectra of O-protected cyanohydrins.
Examples of MOCCA reports in HTML format
MOCCA reports for the data analysis of the well plate screening (cyanation of aryl halides).