From descriptors to intrinsic fish toxicity of chemicals: an alternative approach to chemical prioritization

21 June 2022, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Most chemicals present in the human and environmental exposome are structurally unknown (i.e. only ≤ 1% have a known structure). The European Chemicals Agency (ECHA) and US Environmental Protection Agency (EPA) have listed approximately 800k chemicals that must be further investigated for their potential environmental and/or human health risk. A significant number of these chemicals have large enough global consumption volumes (e.g. industrial and agrochemical) to reach the limits of detection of our analytical chemistry methods, and they may be toxic. In this study we present a supervised classification model that directly connects the molecular descriptors of chemicals to their toxicity. As a proof of concept, we used 907 experimentally defined LC50 values for acute fish toxicity. Our classification model explained ≈ 90% of the variance in our data for the training set and ≈ 80% for the test set. In a direct comparison with the conventional strategy (i.e. QSAR regression models), our classification model produced five-fold fewer wrong chemical categorizations. This optimized model was employed to predict the toxicity categories of ≈ 32k chemicals (from the Norman SusDat). Finally, the model-based applicability domain (AD) was compared with the training-set-based AD; the comparison suggests that the training-set-based AD is the more adequate way to avoid extrapolation when using such models. The better performance of our direct classification model compared to conventionally employed QSAR methods makes this approach a viable tool for hazard identification and risk assessment of chemicals.
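To make the categorization step concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes four toxicity categories with cut-offs at 1, 10 and 100 mg/L (the GHS acute aquatic hazard bands); the preprint's own category definitions may differ. The function names are hypothetical, and the bounding-box check is only one simple form of a training-set-based AD.

```python
from typing import Sequence

# Assumed cut-offs in mg/L (GHS acute aquatic hazard bands).
# The categories used in the preprint may be defined differently.
THRESHOLDS = (1.0, 10.0, 100.0)


def lc50_to_category(lc50_mg_per_l: float) -> int:
    """Map an LC50 value to a category from 1 (most toxic) to 4 (least toxic)."""
    for category, upper in enumerate(THRESHOLDS, start=1):
        if lc50_mg_per_l <= upper:
            return category
    return len(THRESHOLDS) + 1


def wrong_category_fraction(
    measured_lc50: Sequence[float], predicted_categories: Sequence[int]
) -> float:
    """Fraction of chemicals whose predicted category differs from the measured one."""
    true_categories = [lc50_to_category(v) for v in measured_lc50]
    wrong = sum(t != p for t, p in zip(true_categories, predicted_categories))
    return wrong / len(true_categories)


def in_training_domain(
    descriptors: Sequence[float], training_set: Sequence[Sequence[float]]
) -> bool:
    """Bounding-box AD (an assumption): each descriptor must lie within
    the minimum and maximum observed for it in the training set."""
    for j, value in enumerate(descriptors):
        column = [row[j] for row in training_set]
        if not (min(column) <= value <= max(column)):
            return False
    return True
```

With made-up toy values, a regression model that predicts LC50 and then bins the result can land in the wrong category even when its numerical error is small: measured values of 0.5 and 50 mg/L predicted as 1.2 and 120 mg/L both cross a cut-off. A classifier trained on the categories directly is optimized against exactly this kind of error, which is the comparison the abstract describes.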


Data science
Toxicity category

Supplementary materials

Supporting Information for: From descriptors to intrinsic fish toxicity of chemicals: an alternative approach to chemical prioritization
Supporting Information

