Development and Comparison of Formula Assignment Algorithms for Ultrahigh-Resolution Mass Spectra of Natural Organic Matter

The increasing application of ultrahigh-resolution mass spectrometry (UHR-MS) to natural organic matter (NOM) characterization requires efficient and accurate formula assignment from large numbers of measured masses. Herein, we developed two automated batch codes (TRFu and FuJHA) and assessed their formula assignment accuracy alongside frequently used open-access algorithms (i.e., Formularity and WHOI). For 8,717 NOM-like chemicals with known molecular formulae (mass range 68 Da to 1,000 Da), the overall assignment accuracy was highest for TRFu (94%). A comparison using 35 NOM mass spectral data sets (78,482 peaks in total, m/z 69 to 999) showed that TRFu, FuJHA, and Formularity outperformed WHOI (e.g., higher formula assignment ratios and lower mass errors), although performance depended on mass values and molecular compositions. Moreover, among all methods, TRFu showed the smallest deviation from certified data in the 13C-formula assignment analysis. The automated batch code TRFu is therefore a reliable and practical tool for precisely characterizing UHR-MS spectra of various NOM, and it could be extended to non-target screening of NOM-like emerging chemicals in natural and engineered environments.
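To make the notion of formula assignment and mass error concrete, the following is a minimal illustrative sketch of a brute-force assignment for a single neutral monoisotopic mass. It is not the TRFu, FuJHA, Formularity, or WHOI algorithm. The element set (C, H, N, O), the limit of two nitrogen atoms, the 1 ppm tolerance, the two simple valence filters, and all function names are assumptions chosen for illustration only.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, rounded to 8 decimals.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def formula_mass(counts):
    """Theoretical monoisotopic mass of a formula given as {element: count}."""
    return sum(MASS[el] * n for el, n in counts.items())


def ppm_error(measured, theoretical):
    """Signed relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def to_string(counts):
    """Render {element: count} as a compact string such as 'C6H12O6'."""
    return "".join(el if n == 1 else f"{el}{n}" for el, n in counts.items() if n)


def assign_formulas(neutral_mass, tol_ppm=1.0, max_n=2):
    """Enumerate CcHhNnOo candidates within tol_ppm of a neutral mass.

    Hypothetical helper: C, N, and O counts are enumerated, and the H count
    is solved from the remaining mass. Candidates are sorted by |ppm error|.
    """
    hits = []
    max_c = int(neutral_mass // MASS["C"])
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            remaining = neutral_mass - c * MASS["C"] - n * MASS["N"]
            max_o = int(remaining // MASS["O"])  # negative -> empty range below
            for o in range(max_o + 1):
                h = round((remaining - o * MASS["O"]) / MASS["H"])
                # Assumed filters: H count within the saturation limit, and
                # H + N even (parity rule for neutral closed-shell molecules).
                if h < 0 or h > 2 * c + n + 2 or (h + n) % 2:
                    continue
                counts = {"C": c, "H": h, "N": n, "O": o}
                err = ppm_error(neutral_mass, formula_mass(counts))
                if abs(err) <= tol_ppm:
                    hits.append((counts, err))
    hits.sort(key=lambda hit: abs(hit[1]))
    return hits


# Usage: the monoisotopic mass of glucose, computed from the table above.
glucose_mass = formula_mass({"C": 6, "H": 12, "N": 0, "O": 6})
for counts, err in assign_formulas(glucose_mass):
    print(to_string(counts), f"{err:+.3f} ppm")
```

In this sketch the input is a neutral mass; converting a measured m/z to a neutral mass depends on the ionization mode and adduct and is left to the caller. Real NOM spectra also require additional elements, isotopologues such as 13C, and stricter chemical constraints, all of which widen the candidate space considerably.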