Abstract
Open mass spectral libraries (OMSLs) are critical for metabolite annotation and machine learning, especially given the rising volume of untargeted metabolomic studies and the development of annotation pipelines. Despite their importance, the practical application of OMSLs is hampered by the lack of standardized file formats, metadata fields, and supporting ontologies. Current libraries are often restricted to specific topics or matrices, such as natural products, lipids, or the human metabolome, which may limit the discovery potential of untargeted studies. FragHub addresses these challenges by integrating multiple OMSLs into a single comprehensive database, supporting various data formats and harmonizing metadata. It also provides generic filters for mass spectra through a graphical user interface. Additionally, a workflow to generate in-house libraries compatible with FragHub is proposed. FragHub dynamically segregates libraries based on ionization mode and chromatography technique, thereby enhancing data utility in metabolomic research. The FragHub Python code is publicly available under an MIT license at the following repository: https://github.com/eMetaboHUB/FragHub. Generated data can be accessed at https://doi.org/10.5281/zenodo.11057687.
Supplementary materials
Title
Supplementary tables
Description
All supplementary tables cited in the main text
Title
FragHub tutorial
Description
A tutorial on installing and using the FragHub Python workflow
Title
In-house database
Description
A tutorial on creating an in-house database using MZmine