Abstract
Improvements in Fourier Transform Mass Spectrometry (FT-MS) enable increasingly complex experiments in the field of metabolomics. What is directly detected in FT-MS spectra are spectral features (peaks) that correspond to sets of adducted and charged forms of specific molecules in the sample. The robust assignment of these features is an essential step for MS-based metabolomics experiments, but the sheer complexity of what is detected, together with a variety of analytically introduced variance, errors, and artifacts, has hindered the systematic analysis of complex patterns of observed peaks with respect to isotope content. We have developed a method called SMIRFE that detects small biomolecules and determines their elemental molecular formula (EMF) using detected sets of isotopologue peaks sharing the same EMF. SMIRFE does not use a database of known metabolite formulas; instead, a nearly comprehensive search space of all isotopologues within a mass range is constructed and used for assignment. This search space can be tailored to the isotope labeling patterns expected in different stable isotope tracing experiments. Using consumer-level computing equipment, a large search space of 2000 daltons was constructed, and assignment performance was evaluated and validated using verified assignments on a pair of peak lists derived from spectra containing unlabeled and 15N-labeled versions of amino acids derivatized using ethylchloroformate. SMIRFE identified 18 of 18 predicted derivatized EMFs, and each assignment was evaluated statistically and assigned an e-value representing the probability that it occurred by chance.