Analytical Chemistry

Reproducible Untargeted Metabolomics Data Analysis Workflow for Exhaustive MS/MS Annotation



Motivation: Unknown features in untargeted metabolomics and non-targeted analysis (NTA) are identified by using fragment ions from MS/MS spectra to predict the structures of the unknown compounds. Precursor ions are commonly selected for fragmentation by data-dependent acquisition (DDA) or, following statistical analysis, by targeted MS/MS. However, the precursor ions selected by DDA cover only a biased subset of the peaks or features found in full scan data. In addition, different statistical analyses can select different precursor ions for MS/MS analysis, which makes post-hoc validation of ions selected by a new statistical method impossible when MS/MS data were collected only for the precursor ions selected by the original statistical method. By removing redundant peaks and performing pseudo-targeted MS/MS analysis on the independent peaks, we can comprehensively cover the unknown compounds found in full scan analysis using a “one peak for one compound” workflow that requires no a priori information about redundant peaks. Here we propose a reproducible, automated, exhaustive, statistical model-free workflow, paired mass distance-dependent analysis (PMDDA), for untargeted mass spectrometry identification of unknown compounds found in MS1 full scan data.

Results: PMDDA yielded more annotated compounds, molecular networks, and spectra than CAMERA and RAMClustR. PMDDA can also generate a preferred ion list for iterative DDA to cover more compounds when the instrument supports this function.

Availability and implementation: The whole workflow is fully reproducible as a Docker image, xcmsrocker, which includes both the original data and the data processing template. A related R package has been developed and released online. The R script, data files, and links to GNPS annotation results, including MS1 peak lists and MS2 MGF files, are provided in the supplementary information.
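The "one peak for one compound" idea can be illustrated with a minimal Python sketch. This is not the implementation in the authors' R package. It assumes a simplified reading of the approach: peaks that co-elute are grouped by retention time, mass distances that recur across several retention time groups are treated as redundant relationships (for example isotopologues or adducts) without any predefined list, and only the most intense peak of each linked cluster is kept as a candidate precursor for pseudo-targeted MS/MS. All function names, the peak dictionary layout, and the tolerance values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rt_groups(peaks, rt_tol=5.0):
    """Group peaks eluting within rt_tol seconds of the first peak in a group."""
    groups = []
    for peak in sorted(peaks, key=lambda p: p["rt"]):
        if groups and peak["rt"] - groups[-1][0]["rt"] <= rt_tol:
            groups[-1].append(peak)
        else:
            groups.append([peak])
    return groups


def frequent_pmds(groups, digits=2, min_count=2):
    """Return rounded paired mass distances seen in at least min_count RT groups."""
    counts = Counter()
    for group in groups:
        seen = {round(abs(a["mz"] - b["mz"]), digits)
                for a, b in combinations(group, 2)}
        counts.update(seen)  # count each distance once per RT group
    return {pmd for pmd, n in counts.items() if n >= min_count}


def independent_peaks(peaks, rt_tol=5.0, digits=2, min_count=2):
    """Keep the most intense peak of every cluster linked by frequent distances."""
    groups = rt_groups(peaks, rt_tol)
    pmds = frequent_pmds(groups, digits, min_count)
    kept = []
    for group in groups:
        parent = list(range(len(group)))  # union-find over peaks in this group

        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i

        for i, j in combinations(range(len(group)), 2):
            if round(abs(group[i]["mz"] - group[j]["mz"]), digits) in pmds:
                parent[find(i)] = find(j)

        best = {}
        for i, peak in enumerate(group):
            root = find(i)
            if root not in best or peak["intensity"] > best[root]["intensity"]:
                best[root] = peak
        kept.extend(best.values())
    return sorted(kept, key=lambda p: p["mz"])


# Made-up example data: two compounds, each with an isotopologue-like and an
# adduct-like peak at the same retention time, plus one isolated peak.
peaks = [
    {"mz": 180.0634, "rt": 100.0, "intensity": 1.0e6},
    {"mz": 181.0668, "rt": 100.5, "intensity": 6.0e4},
    {"mz": 202.0453, "rt": 101.0, "intensity": 2.0e5},
    {"mz": 300.1000, "rt": 200.0, "intensity": 9.0e5},
    {"mz": 301.1034, "rt": 200.2, "intensity": 8.0e4},
    {"mz": 322.0819, "rt": 200.4, "intensity": 5.0e5},
    {"mz": 150.0500, "rt": 300.0, "intensity": 3.0e5},
]
precursors = independent_peaks(peaks)
print([p["mz"] for p in precursors])
```

In this toy input, the mass distances near 1.00 and 21.98 recur in both co-eluting groups, so the satellite peaks are dropped and three precursors remain. Real data would need instrument-appropriate tolerances and a more careful frequency cutoff than the defaults assumed here.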

Version notes

- add iterative DDA results with PMDDA as preferred ions list
- remove DDA and posneg connection sections to make the workflow clear
- add more resources in supporting information



Supplementary material

Supporting information for Reproducible untargeted metabolomics data analysis workflow for exhaustive MS/MS annotation
Supporting information for the manuscript, with online links to the raw data, GNPS annotation results, and supporting figures
Raw data for this study
Check supporting information for the details of this file.

Supplementary weblinks

xcmsrocker image
Rocker image for metabolomics data analysis with workflow templates (peak picking, statistical analysis, annotation, etc.) for regular metabolomics data analysis, as well as PMDDA.