Abstract
Ionic liquids, i.e., organic salts with a low melting point, can be used as liquid stationary phases in gas chromatography. These stationary phases offer advantages such as distinctive selectivity, high polarity, and thermal stability. Many previous works have been devoted to such stationary phases. However, sufficiently large retention data sets covering structurally diverse compounds are still lacking for them. Consequently, very few works address quantitative structure-retention relationships (QSRR) for ionic liquid-based stationary phases. This work aims to close this gap. Three ionic liquids with substituted pyridinium cations are considered. We provide data sets of 123 to 158 compounds, large enough to be used in further work on QSRR and related methods. We carry out a QSRR study using these data sets and demonstrate the following. The retention index for a polyethylene glycol stationary phase (denoted RIPEG), predicted using another model, can be used as a molecular descriptor. The use of this descriptor significantly improves the accuracy of the QSRR model. Both deep learning-based and linear models were considered for RIPEG prediction. We demonstrate that retention indices for ionic liquid-based stationary phases can be predicted with high accuracy. Particular attention is paid to the reproducibility and reliability of the QSRR study. We show that small perturbations of the data set, such as adding or removing several compounds, can considerably affect results such as descriptor importance and model accuracy. These facts have to be considered in order to avoid misleading conclusions. For the QSRR research, we developed a software tool with a graphical user interface, called CHERESHNYA. It is intended for selecting molecular descriptors and constructing linear equations that relate molecular descriptors to gas chromatographic retention indices for any stationary phase.
The software allows the user to generate several hundred molecular descriptors (one-dimensional and two-dimensional). These include predicted retention indices for popular stationary phases such as polydimethylsiloxane and polyethylene glycol. Various methods for selecting molecular descriptors and assessing their importance have been implemented, in particular the Boruta algorithm, partial least squares, genetic algorithms, L1-regularized regression (LASSO), and others. The software is free, open-source, and available online.
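The descriptor selection by L1-regularized regression (LASSO) and the sensitivity to small perturbations of the data set described above can be illustrated with a minimal sketch. It is not the CHERESHNYA implementation: the data are synthetic, the descriptor names (`RI_PEG`, `d1` to `d5`), the regularization strength `alpha`, and the number of removed compounds are all assumptions chosen for illustration, and LASSO is fitted with a plain coordinate-descent loop. The sketch repeatedly removes 10 random compounds, refits the model, and records how often each descriptor keeps a nonzero coefficient.

```python
import random

def standardize(columns):
    """Scale each descriptor column to zero mean and unit variance."""
    scaled = []
    for col in columns:
        mean = sum(col) / len(col)
        std = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        scaled.append([(v - mean) / std if std else 0.0 for v in col])
    return scaled

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly zero."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(columns, y, alpha, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xw||^2 + alpha*||w||_1 (y is centered here)."""
    n = len(y)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    w = [0.0] * len(columns)
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # constant descriptor carries no information
            # Correlation of descriptor j with the residual that excludes its own term.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, resid)) / n
            new_w = soft_threshold(rho, alpha) / norm
            delta = new_w - w[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, col)]
                w[j] = new_w
    return w

# Synthetic data (assumption): the target retention index depends on a predicted
# RI_PEG-like descriptor and on d1; d2..d5 are pure noise descriptors.
rng = random.Random(0)
names = ["RI_PEG", "d1", "d2", "d3", "d4", "d5"]
n_compounds = 120
X = [[rng.gauss(1200, 300) for _ in range(n_compounds)]]
X += [[rng.gauss(0, 1) for _ in range(n_compounds)] for _ in range(5)]
y = [1.1 * X[0][i] + 40 * X[1][i] + rng.gauss(0, 10) for i in range(n_compounds)]

# Perturbation check: drop 10 random compounds, refit, count selected descriptors.
n_runs = 20
counts = {name: 0 for name in names}
for _ in range(n_runs):
    keep = sorted(rng.sample(range(n_compounds), n_compounds - 10))
    cols = standardize([[col[i] for i in keep] for col in X])
    w = lasso(cols, [y[i] for i in keep], alpha=5.0)
    for name, coef in zip(names, w):
        if coef != 0.0:
            counts[name] += 1

# Fraction of perturbed data sets in which each descriptor was selected.
freq = {name: counts[name] / n_runs for name in names}
print(freq)
```

A descriptor whose selection frequency stays at 1.0 across perturbed data sets is a robust choice, whereas a frequency well below 1.0 signals that its apparent importance depends on which compounds happen to be included.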
Supplementary materials
Title
Supplementary Material
Description
A full list of the compounds used to create the retention data sets for ionic liquid-based stationary phases; the full equation linking the RI_PEG_LM molecular descriptor with other molecular descriptors; overlaps between the data sets for the three ionic liquid-based stationary phases; molecular descriptors selected by the SEQ_ADD and LASSO methods for the full data sets
Supplementary weblinks
Title
CHERESHNYA software
Description
Interactive software for quantitative structure-retention relationships in gas chromatography