Naive Bayes classification model for isotopologue detection in LC-HRMS data

03 November 2021, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Identifying or removing isotopologues is a necessary step in non-targeted analysis, because it reduces the number of features that need to be identified in a sample. Currently available approaches rely either on predicted isotopic patterns, which require information on the molecular formula, or on an arbitrary mass tolerance, which requires information on the instrumental error. Therefore, a Naive Bayes isotopologue classification model was developed that does not depend on any thresholds or molecular formula information. This classification model uses elemental mass defects of six elemental ratios and can successfully identify isotopologues in both theoretical isotopic patterns and wastewater influent samples, outperforming one of the most commonly used approaches (i.e., the 1.0033 Da mass difference method, CAMERA).
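To make the baseline concrete, the following is a minimal sketch of a 1.0033 Da mass difference check of the kind the abstract compares against. It is not CAMERA's implementation and not the authors' code: the function name `find_isotopologue_pairs`, the default tolerance `tol_da`, and the `charge` parameter are hypothetical choices made for illustration. The tolerance argument is exactly the kind of arbitrary, instrument-dependent threshold that the Naive Bayes model is meant to avoid.

```python
# Spacing quoted in the abstract: the rounded 13C - 12C mass difference.
C13_C12_DELTA = 1.0033  # Da


def find_isotopologue_pairs(mzs, tol_da=0.005, charge=1):
    """Return index pairs (i, j) with mzs[j] - mzs[i] close to 1.0033 / charge.

    tol_da is an assumed, user-chosen mass tolerance in Da; charge is the
    assumed charge state of the ions. Both are illustrative parameters.
    """
    # Work on indices sorted by m/z so the inner loop can stop early.
    order = sorted(range(len(mzs)), key=lambda k: mzs[k])
    spacing = C13_C12_DELTA / charge
    pairs = []
    for pos, i in enumerate(order):
        for j in order[pos + 1:]:
            diff = mzs[j] - mzs[i]
            if diff > spacing + tol_da:
                break  # every later feature is even further away
            if abs(diff - spacing) <= tol_da:
                pairs.append((i, j))
    return pairs


if __name__ == "__main__":
    features = [195.0877, 196.0910, 250.1000]  # made-up m/z values
    print(find_isotopologue_pairs(features))
```

Whether a pair is flagged depends entirely on `tol_da`: a feature 1.0060 Da above another is accepted at a tolerance of 0.005 Da and rejected at 0.001 Da, which illustrates why a threshold-free classifier is attractive.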


Isotopologue detection
Non-targeted analysis
Machine learning
Computational mass spectrometry

Supplementary materials

Supporting information for: Naive Bayes classification model for isotopologue detection in LC-HRMS data
Information on the presence of the elemental ratios for the chemicals in the DDS-Tox database; an overview of the correlation coefficients between the EMDmono and EMDiso values for the different elemental ratios, with scatter plots for the two most extreme correlations; receiver operating characteristic curves for the classification model and the mass difference method, used for the selection of the scoreEMD; a reference compound list used to assess the performance of the classification model and the mass difference method on wastewater influent samples; and an example of a false positive (FP) isotopologue detected by the classification model.

