Interoperable nanosafety data using semantic modeling and linked data knowledge graphs

25 October 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

The nanosafety domain has seen significant advancements in data generation and sharing, yet challenges remain in ensuring data interoperability and reuse. This article focuses on developing a semantic interoperability framework for nanosafety data to maximize the FAIRness (Findability, Accessibility, Interoperability, and Reusability) of existing and new datasets. The approach centers on creating the NanoLinks semantic model, which unifies the representation of diverse data modalities, such as cytotoxicity, transcriptomics, and physicochemical data. By leveraging established ontologies like the BioAssay Ontology (BAO), the NanoParticle Ontology (NPO), the Data Catalog Vocabulary (DCAT), and the PROV-O ontology, NanoLinks facilitates the conversion of semi-structured data into Resource Description Framework (RDF) format using the RDF Mapping Language (RML). This transformation produces machine-readable, interoperable datasets. Five datasets from the literature, spanning nanomaterial characteristics and biological assay data, were selected by the NanoSolveIT EU project partners for FAIRification. These datasets were converted into RDF format, hosted on Zenodo under a CC-BY 4.0 license, and integrated into a knowledge graph, NanoLinks-KG, following linked-data best practices. The knowledge graph was validated for consistency and adherence to the semantic model using Shape Expressions (ShEx). The presented applications of this graph showcase the potential of querying interconnected datasets to derive insights and to support integration with external resources such as AOP-Wiki and the NanoCommons knowledge base. One example use case is a cross-dataset comparison of dose-response curves for zinc oxide nanomaterials. The results demonstrate the successful application of semantic modeling and linked-data knowledge graphs to convert and integrate diverse nanosafety datasets, enhancing their interoperability and promoting their reuse. The developed framework advances the state of data sharing in the nanosafety community and demonstrates the potential of semantic technologies in facilitating comprehensive data analysis and novel discoveries in the field.

Keywords

data interoperability
FAIR
semantic web
linked data
knowledge graph
nanomaterial
bioassay
RDF
SPARQL

Supplementary materials

SPARQL queries used to summarize the content of the RDFied datasets
Results of a federated SPARQL query integrating NanoLinks-KG nanomaterial types with adverse outcomes from AOP-Wiki
