Krein Support Vector Machine Classification of Antimicrobial Peptides

17 June 2022, Version 2
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Antimicrobial peptides (AMPs) represent a potential solution to the growing problem of antimicrobial resistance, yet their identification through wet-lab experiments is a costly and time-consuming process. Accurate computational predictions would allow rapid in silico screening of candidate AMPs, thereby accelerating the discovery process. Kernel methods are a class of machine learning algorithms that utilise a kernel function to transform input data into a new representation. When appropriately normalised, the kernel function can be regarded as a notion of similarity between instances. However, many expressive notions of similarity are not valid kernel functions, meaning they cannot be used with standard kernel methods such as the support-vector machine (SVM). The Kreın-SVM represents a generalisation of the standard SVM that admits a much larger class of similarity functions. In this study, we propose and develop Kreın-SVM models for AMP classification and prediction by employing the Levenshtein distance and local alignment score as sequence similarity functions. Utilising two datasets from the literature, each containing more than 3000 peptides, we train models to predict general antimicrobial activity. Our best models achieve an AUC of 0.967 and 0.863 on the test sets of each respective dataset, outperforming the in-house and literature baselines in both cases. We also curate a dataset of experimentally validated peptides, measured against Staphylococcus aureus and Pseudomonas aeruginosa, in order to evaluate the applicability of our methodology in predicting microbe-specific activity. In this case, our best models achieve an AUC of 0.933 and 0.917, respectively. Models to predict both general and microbe-specific activities are made available as web applications.

Keywords

indefinite kernel
antimicrobial peptide
support vector machine

Supplementary weblinks

Comments

Comments are not moderated before they are posted, but they can be removed by the site moderators if they are found to be in contravention of our Commenting Policy [opens in a new tab] - please read this policy before you post. Comments should be used for scholarly discussion of the content in question. You can find more information about how to use the commenting feature here [opens in a new tab] .
This site is protected by reCAPTCHA and the Google Privacy Policy [opens in a new tab] and Terms of Service [opens in a new tab] apply.