The Hitchhiker’s Guide to Statistical Analysis of Feature-based Molecular Networks from Non-Targeted Metabolomics Data

Abzer K. Pakkir Shah; Axel Walter; Filip Ottosson; Francesco Russo; Marcelo Navarro-Díaz; Judith Boldt; Jarmo-Charles Kalinski; Eftychia E. Kontou; James Elofson; Alexandros Polyzois; Carolina González-Marín; Shane Farrell; Marie R. Aggerbeck; Thapanee Pruksatrakul; Nathan Chan; Yunshu Wang; Magdalena Pöchhacker; Corinna Brungs; Beatriz Cámara; Andrés M. Caraballo-Rodríguez; Andres Cumsille; Fernanda de Oliveira; Kai Dührkop; Yasin El Abiead; Christian Geibel; Lana G. Graves; Martin Hansen; Steffen Heuckeroth; Simon Knoblauch; Anastasiia Kostenko; Mirte CM. Kuijpers; Kevin Mildau; Stilianos Papadopoulos Lambidis; Paulo Wender Portal Gomes; Tilman Schramm; Karoline Steuer-Lodd; Paolo Stincone; Sibgha Tayyab; Giovanni Andrea Vitale; Berenike C. Wagner; Shipei Xing; Marquis T. Yazzie; Simone Zuffa; Martinus de Kruijff; Christine Beemelmanns; Hannes Link; Christoph Mayer; Justin JJ van der Hooft; Tito Damiani; Tomáš Pluskal; Pieter C. Dorrestein; Jan Stanstrup; Robin Schmid; Mingxun Wang; Allegra T. Aron; Madeleine Ernst; Daniel Petras

doi:10.26434/chemrxiv-2023-wwbt0

Analytical Chemistry

01 November 2023, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Feature-Based Molecular Networking (FBMN) is a popular analysis approach for LC-MS/MS-based non-targeted metabolomics data. While processing LC-MS/MS data through FBMN is fairly streamlined, downstream data handling and statistical interrogation are often a key bottleneck. In particular, users new to statistical analysis struggle to handle and analyze complex data matrices effectively. In this protocol, we provide a comprehensive guide for the statistical analysis of FBMN results. We explain the data structure and the principles of data clean-up and normalization, as well as uni- and multivariate statistical analysis of FBMN results. We provide explanations and code in two scripting languages (R and Python), as well as in the QIIME2 framework, for all protocol steps, from data clean-up to statistical analysis. Additionally, the protocol is accompanied by a web application with a graphical user interface (https://fbmn-statsguide.gnps2.org/) to lower the barrier of entry for new users. Together, the protocol, code, and web app provide a complete guide and toolbox for FBMN data integration, clean-up, and advanced statistical analysis, enabling new users to uncover molecular insights from their non-targeted metabolomics data. Our protocol is tailored for the seamless analysis of FBMN results from Global Natural Products Social Molecular Networking (GNPS and GNPS2) and can be adapted to other MS feature detection, annotation, and networking tools.
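To make the clean-up and normalization steps mentioned in the abstract more concrete, the following is a minimal sketch in plain Python. It is not taken from the protocol's notebooks. The toy feature table, the function names, the blank-to-sample cutoff of 0.3, and the choice of total-intensity normalization are all assumptions made for illustration only.

```python
from statistics import mean

# Toy feature table (hypothetical values): feature ID -> {sample name: peak area}
feature_table = {
    "F1": {"blank_1": 100.0, "blank_2": 120.0, "sample_1": 150.0, "sample_2": 130.0},
    "F2": {"blank_1": 0.0, "blank_2": 10.0, "sample_1": 900.0, "sample_2": 1100.0},
    "F3": {"blank_1": 5.0, "blank_2": 5.0, "sample_1": 300.0, "sample_2": 100.0},
}
blanks = ["blank_1", "blank_2"]
samples = ["sample_1", "sample_2"]


def remove_blank_features(table, blanks, samples, cutoff=0.3):
    """Keep features whose mean blank/sample intensity ratio is below `cutoff`.

    The cutoff value is an assumption; it should be chosen per study.
    Only the sample columns are returned for the kept features.
    """
    kept = {}
    for feature, row in table.items():
        blank_mean = mean(row[b] for b in blanks)
        sample_mean = mean(row[s] for s in samples)
        if sample_mean > 0 and blank_mean / sample_mean < cutoff:
            kept[feature] = {s: row[s] for s in samples}
    return kept


def normalize_by_total_intensity(table, samples):
    """Scale each sample so that its feature intensities sum to 1."""
    totals = {s: sum(row[s] for row in table.values()) for s in samples}
    return {
        feature: {s: (row[s] / totals[s] if totals[s] > 0 else 0.0) for s in samples}
        for feature, row in table.items()
    }


cleaned = remove_blank_features(feature_table, blanks, samples)
normalized = normalize_by_total_intensity(cleaned, samples)
```

In this toy table, F1 is dropped because its intensity in the blanks is close to its intensity in the samples, while F2 and F3 are kept and rescaled per sample. Real FBMN quantification tables are far larger and are usually handled with data-frame libraries, so this sketch only illustrates the logic.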

Keywords

Feature-Based Molecular Networking

Non-Targeted Metabolomics

Statistical Analysis

Univariate Analysis

Multivariate Analysis

Data Cleanup

Supplementary materials

Title

Supporting Information - The Hitchhiker’s Guide to Statistical Analysis of Feature-based Molecular Networks from Non-Targeted Metabolomics Data

Description

Contains information about the example data used for the protocol and step-by-step guides for Python Notebook, QIIME2, and Web app.

Supplementary weblinks

Title

FBMN-STATS GitHub Repository

Description

This repository contains the test data, the Jupyter notebooks, and the Web App for the paper 'A hitchhiker's guide to statistical analysis of Feature-based Molecular Networks'. Using the notebooks provided here, one can perform data merging, data cleanup, blank removal, batch correction, and univariate and multivariate statistical analyses on their non-targeted LC-MS/MS data and Feature-based Molecular Networks.

Version History

Nov 01, 2023 Version 1

Metrics

Views: 6,208

Downloads: 3,267

License

The content is available under CC BY-NC-ND 4.0.

Author’s competing interest statement

JJJvdH is currently a member of the Scientific Advisory Board of Naicons Srl., Milano, Italy, and is consulting for Corteva Agriscience, Indianapolis, IN, USA. PCD is a scientific advisor to and holds equity in Cybele; is a co-founder of, advisor to, and holds equity in Ometa, Arome, and Enveda, with prior approval by UC San Diego; and consulted for DSM Animal Health in 2023. MW is the founder of Ometa Labs.

Ethics

The author(s) have declared that ethics committee/IRB approval is not relevant to this content.
