These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.
Reproducible Untargeted Metabolomics Data Analysis Workflow for Exhaustive MS/MS Annotation

submitted on 13.01.2021, 02:57 and posted on 18.01.2021, 04:36 by Miao Yu, Georgia Dolios, Lauren Petrick

Unknown features in untargeted metabolomics and non-targeted analysis (NTA) are identified using fragment ions from MS/MS spectra to predict the structures of the unknown compounds. Precursor ions are commonly selected for fragmentation either by data-dependent acquisition (DDA) strategies or, following statistical analysis, by targeted MS/MS approaches. However, the precursor ions selected by DDA cover only a biased subset of the peaks or features found in full scan data. In addition, different statistical analyses can select different precursor ions for MS/MS analysis, which makes post-hoc validation of ions selected by a new statistical method impossible when MS/MS data were collected only for the precursor ions selected by the original statistical method. Here we propose an automated, exhaustive, statistical model-free workflow, paired mass distance-dependent analysis (PMDDA), for untargeted mass spectrometry identification of unknown compounds. By removing redundant peaks and performing pseudo-targeted MS/MS analysis on independent peaks, we can comprehensively cover unknown compounds found in full scan analysis using a “one peak for one compound” workflow without a priori redundant peak information. We show that, compared to DDA, PMDDA is more comprehensive and more robust against sample matrix effects. Further, more compounds were identified by database annotation using PMDDA than with CAMERA or RAMClustR. Finally, compounds with signals in both positive and negative modes can be identified by the PMDDA workflow, further reducing redundancy. The whole workflow is fully reproducible as a Docker image, xcmsrocker, with both the original data and the data processing template.
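The “one peak for one compound” idea can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not spell out the algorithm, and it states that PMDDA needs no a priori redundant peak information, whereas this simplification assumes a small, fixed list of mass distances (`REDUNDANT_PMDS`). The function name `independent_peaks` and the tolerances `rt_tol` and `mz_tol` are also hypothetical. Peaks that co-elute and differ by one of the listed distances are linked, and the most intense peak of each linked group is kept as the candidate precursor.

```python
from itertools import combinations

# Assumed for illustration only: mass distances (Da) treated as signs of redundancy.
# 1.00336 is the 13C/12C isotope spacing; 21.98194 is [M+Na]+ minus [M+H]+.
REDUNDANT_PMDS = (1.00336, 21.98194)


def independent_peaks(peaks, rt_tol=2.0, mz_tol=0.002, pmds=REDUNDANT_PMDS):
    """Return one representative peak per group of redundant peaks.

    peaks: list of (mz, rt_seconds, intensity) tuples.
    Two peaks are linked when they co-elute (within rt_tol) and their
    m/z difference matches one of the listed distances (within mz_tol).
    """
    parent = list(range(len(peaks)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(peaks)), 2):
        mz_i, rt_i, _ = peaks[i]
        mz_j, rt_j, _ = peaks[j]
        if abs(rt_i - rt_j) > rt_tol:
            continue
        distance = abs(mz_i - mz_j)
        if any(abs(distance - pmd) <= mz_tol for pmd in pmds):
            parent[find(i)] = find(j)

    # Keep the most intense peak from each linked group.
    best = {}
    for idx, peak in enumerate(peaks):
        root = find(idx)
        if root not in best or peak[2] > best[root][2]:
            best[root] = peak
    return sorted(best.values())


if __name__ == "__main__":
    example = [
        (181.0707, 120.0, 5e5),  # candidate [M+H]+
        (182.0741, 120.3, 3e4),  # co-eluting 13C isotopologue
        (203.0526, 119.8, 1e5),  # co-eluting sodium adduct
        (255.2330, 300.0, 2e5),  # unrelated peak
    ]
    for mz, rt, intensity in independent_peaks(example):
        print(f"{mz:.4f} @ {rt:.1f} s, intensity {intensity:.0f}")
```

In this toy input the first three peaks collapse to the single most intense one, so only two precursors would be sent to pseudo-targeted MS/MS.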


Mount Sinai HHEAR Network Untargeted Lab Hub

National Institute of Environmental Health Sciences

Discovery of early life causes of autism spectrum disorder through retrospective metabolomics and proteome analysis of teeth

National Institute of Environmental Health Sciences

Environmental chemical mixtures and metabolomics in autism spectrum disorder

National Institute of Environmental Health Sciences




Icahn School of Medicine at Mount Sinai


United States


Declaration of Conflict of Interest

no conflict of interest