Application of the RDF framework to integrate heterogenous experimental data of a large chemo- and biodiverse collection from a research collaborative project

28 November 2023, Version 1

Abstract

Plants have a complex chemo-diversity and represent a reservoir of potential new therapeutic agents. Within a Swiss research project, six scientific research groups from different disciplines are collaborating to investigate a collection of more than 17,000 unique dried plant extracts. The project aims to find new bioactive molecules and their modes of action, with, for example, anti-infective or pro-metabolic activities. One of the main challenges of this enterprise is the management, integration and sharing of the highly heterogeneous data produced by the different research groups. These data include (i) large volumes of high-resolution mass spectrometry data, (ii) the numerical results of innovative chemo-informatics methods, (iii) bioassay results from experimental models of tuberculosis and obesity, and (iv) data from organic synthetic chemistry. In addition, the requirements of a data management plan and of open science following the FAIR principles must be met. We have established an agile pipeline to capture and structure these heterogeneous data into an RDF graph. The gradual expansion and evolution of the data content throughout the project presented considerable challenges, particularly in terms of data modeling. Although many collaborators were not RDF experts, most were technically adept at producing the RDF triples relevant to their contributions. We have deployed multiple instances of a triplestore and developed an in-house tool, KGSteward, to synchronize their content based on a configuration file that is centrally managed and version-controlled using Git. This strategy gave us the flexibility required to address the project-wide challenges of shared data management effectively.
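As an illustration of what "producing RDF triples" can look like for a contributor who is not an RDF expert, the following is a minimal sketch that serializes tabular bioassay-style records as N-Triples using only the Python standard library. It is not the project's pipeline and does not reflect KGSteward: the `example.org` namespaces, the property name `inhibitionPercent`, the extract identifier, the numeric value and the function `to_ntriples` are all hypothetical and chosen only for illustration.

```python
# Hypothetical namespaces, for illustration only (not the project's vocabulary).
EX = "https://example.org/extract/"
PROP = "https://example.org/property/"
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def to_ntriples(records):
    """Turn (extract_id, property, numeric value) records into N-Triples lines."""
    lines = []
    for extract_id, prop, value in records:
        subject = f"<{EX}{extract_id}>"
        predicate = f"<{PROP}{prop}>"
        # Typed literal: the lexical form in quotes, followed by its datatype IRI.
        obj = f'"{float(value)}"^^<{XSD_DOUBLE}>'
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


if __name__ == "__main__":
    # Made-up sample record: one extract with one numeric measurement.
    sample = [("E0001", "inhibitionPercent", 42.5)]
    print("\n".join(to_ntriples(sample)))
```

Files written this way are line-oriented plain text, so they can be version-controlled in Git and loaded into a triplestore with its bulk-import facility.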

Keywords

Plant extracts
RDF framework
Heterogenous experimental data
Swiss research project
Bioactive molecules
FAIR principles
