Abstract
Fragment deconvolution is a crucial step in the componentization of non-targeted analysis (NTA) high-resolution mass spectrometry (HRMS) data. It aims to filter out false positive (FP) signals that do not belong to the component, since including FP fragments can lead, for example, to incorrect identification further down the workflow. Commonly used methods for deconvolution of fragment signals rely on the presence of a time domain (e.g., peak apex retention time difference and correlation analysis). However, when there is no or insufficient MS2 information in the time domain, these methods are unusable and only the mass domain remains. A probability-based cumulative neutral loss (CNL) model for fragment deconvolution using mass domain information was therefore developed to allow deconvolution in such cases. The optimized model, with a mass tolerance of 0.005 Da and a CNL score threshold of -0.95, achieved a true positive rate (TPr) of 95.0%, a false discovery rate (FDr) of 25.6%, and a reduction rate of 39.9%. Additionally, the CNL model was extensively tested on real samples containing predominantly pesticides at different concentration levels and with matrix effects. Overall, the model obtained a TPr above 95%, with FDr values between 45% and 77% and reduction rates between 10% and 24%. Finally, the CNL model was compared with the retention time difference method and peak shape correlation analysis, showing that a combination of correlation analysis and the CNL model was the most effective for fragment deconvolution, obtaining a TPr of 93.1%, an FDr of 57.2%, and a reduction rate of 42.6%.
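The three performance figures quoted above can be illustrated with a minimal sketch. It assumes the conventional definitions TPr = TP / (TP + FN) and FDr = FP / (TP + FP), and it assumes that the reduction rate is the fraction of all candidate fragments removed by the filter; the abstract does not define these quantities, so the formulas, the `FragmentCounts` type, and the counts used here are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FragmentCounts:
    """Confusion counts for a fragment filter (hypothetical container).

    tp: true fragments that were kept
    fp: false fragments that were kept
    fn: true fragments that were removed
    tn: false fragments that were removed
    """
    tp: int
    fp: int
    fn: int
    tn: int


def true_positive_rate(c: FragmentCounts) -> float:
    # Share of true fragments that survive the filter.
    return c.tp / (c.tp + c.fn)


def false_discovery_rate(c: FragmentCounts) -> float:
    # Share of kept fragments that are actually false.
    return c.fp / (c.tp + c.fp)


def reduction_rate(c: FragmentCounts) -> float:
    # ASSUMPTION: reduction rate = removed fragments / all candidate fragments.
    removed = c.fn + c.tn
    total = c.tp + c.fp + c.fn + c.tn
    return removed / total


# Illustrative counts only, not data from the study.
example = FragmentCounts(tp=95, fp=5, fn=5, tn=95)
tpr = true_positive_rate(example)
fdr = false_discovery_rate(example)
red = reduction_rate(example)
```

Under these assumed definitions, a lower score threshold keeps more fragments, which tends to raise the TPr at the cost of a higher FDr and a lower reduction rate.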
Supplementary materials
Title
Supporting information
Description
Overview of reference compounds and their corresponding sample, ROCs for the performance assessment of the CNL model using both the database and measured fragments, overview of high probability CNLs, and case figures for TP, FN, FP, and TN detected fragments.