Classifying the toxicity of pesticides to honey bees via support vector machines with random walk graph kernels

Ping Yang; E. Adrian Henle; Xiaoli Fern; Cory M. Simon

doi:10.26434/chemrxiv-2022-q5zgx-v3

05 April 2022, Version 3

This is not the most recent version. There is a newer version of this content available.

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Pesticides benefit agriculture by increasing crop yield, quality, and security. However, pesticides may inadvertently harm bees, which are agriculturally and ecologically vital as pollinators. The development of new pesticides, driven by pest resistance to incumbent pesticides and by demands to reduce their negative environmental impacts, necessitates assessments of pesticide toxicity to bees. We leverage a data set of 382 molecules labeled from honey bee toxicity experiments to train a classifier that predicts the toxicity of a new pesticide molecule to honey bees. Traditionally, the first step of a molecular machine learning task is to explicitly convert molecules into feature vector representations for input to the classifier. Instead, we (i) adopt the fixed-length random walk graph kernel to express the similarity between any two molecular graphs and (ii) use the kernel trick to train a support vector machine (SVM) to classify the bee toxicity of pesticides represented as molecular graphs. We assess the performance of the graph-kernel-SVM classifier under different walk lengths used to describe the molecular graphs. The optimal classifier, with walk length 4, achieves a mean (over 100 runs) accuracy, precision, recall, and F1 score of 0.82, 0.69, 0.74, and 0.71, respectively, on the test data set.
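To make the kernel idea concrete, here is a minimal Python sketch of one common formulation of a fixed-length random walk graph kernel: it counts pairs of walks with a given number of edges, one in each graph, whose atom and bond label sequences agree. This is an illustration only, not the authors' Julia code. The graph encoding (a list of atom labels plus `(i, j, bond)` triples), the function name `walk_kernel`, and the toy graphs are all assumptions made for this example.

```python
from itertools import product


def walk_kernel(g1, g2, length):
    """Count pairs of label-matching walks with `length` edges in g1 and g2.

    Each graph is a tuple (labels, edges): `labels` is a list of node labels
    and `edges` is a list of undirected (i, j, bond_label) triples.
    """
    labels1, edges1 = g1
    labels2, edges2 = g2

    def neighbors(labels, edges):
        # Adjacency lists that also record the label of each edge.
        nbrs = {i: [] for i in range(len(labels))}
        for i, j, bond in edges:
            nbrs[i].append((j, bond))
            nbrs[j].append((i, bond))
        return nbrs

    n1 = neighbors(labels1, edges1)
    n2 = neighbors(labels2, edges2)

    # Nodes of the direct product graph: node pairs with equal labels.
    nodes = {
        (u, v)
        for u, v in product(range(len(labels1)), range(len(labels2)))
        if labels1[u] == labels2[v]
    }

    # counts[(u, v)]: number of matching walk pairs ending at u in g1, v in g2.
    counts = {pair: 1 for pair in nodes}
    for _ in range(length):
        extended = {pair: 0 for pair in nodes}
        for (u, v), c in counts.items():
            for u2, bond1 in n1[u]:
                for v2, bond2 in n2[v]:
                    # Extend both walks only if bond and atom labels agree.
                    if bond1 == bond2 and (u2, v2) in nodes:
                        extended[(u2, v2)] += c
        counts = extended
    return sum(counts.values())


# Hypothetical toy graphs: a C-C-O chain and a C-O fragment, single bonds.
chain = (["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
fragment = (["C", "O"], [(0, 1, 1)])

# Gram matrix over the toy graphs for walks with one edge.
graphs = [chain, fragment]
gram = [[walk_kernel(a, b, 1) for b in graphs] for a in graphs]
print(gram)
```

A Gram matrix built this way for a set of training molecules is what the kernel trick needs: an SVM implementation that accepts precomputed kernels (for instance, scikit-learn's `SVC(kernel="precomputed")`) could be trained on it without ever forming explicit feature vectors. Whether the paper normalizes the kernel or handles labels exactly as above is not stated in the abstract.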

Keywords

random walk graph kernels

graph kernels

toxicity prediction

pesticide toxicity to honey bees

Supplementary weblinks

code to reproduce: Julia code to reproduce

Version History

May 23, 2022 Version 4

Apr 05, 2022 Version 3

Mar 10, 2022 Version 2

Mar 09, 2022 Version 1

Version Notes

Use the F1 score to select the optimal model in the cross-validation procedure, and report the F1 score. This is more standard than precision * recall, and it varies between 0 and 1.
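For reference, the F1 score is the harmonic mean of precision and recall. The short Python sketch below compares it with the precision * recall product mentioned in the note. The function name `f1_score` is chosen for this example, and plugging in the rounded mean precision and recall from the abstract is only illustrative, since the reported F1 is itself a mean over 100 runs.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded mean precision and recall reported in the abstract.
precision, recall = 0.69, 0.74

f1 = f1_score(precision, recall)
product_metric = precision * recall  # the selection metric F1 replaces

print(round(f1, 2), round(product_metric, 2))
```

With these inputs the harmonic mean rounds to 0.71, consistent with the F1 score quoted in the abstract, while the product is noticeably smaller because it is never larger than either factor.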

Metrics

Views: 2,465

Downloads: 946

License

The content is available under CC BY 4.0

DOI

10.26434/chemrxiv-2022-q5zgx-v3

Funding

National Science Foundation

1920945

Author’s competing interest statement

The author(s) have declared they have no conflict of interest with regard to this content

Ethics

The author(s) declare that they have sought and gained approval from the relevant ethics committee/IRB for this research and its publication.
