DEL+ML paradigm for actionable hit discovery – a cross DEL and cross ML model assessment.

24 July 2024, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

DNA-Encoded Library (DEL) technology allows millions, or even billions, of encoded compounds to be screened in a pooled fashion, which is faster and cheaper than traditional approaches. The resulting large datasets of DEL binders and non-binders for a target of interest enable Machine Learning (ML) model development and the screening of large, readily accessible, drug-like libraries in an ultra-high-throughput fashion. Here, we report a comparative assessment of the DEL+ML pipeline for hit discovery using three DELs and five ML models (fifteen DEL+ML combinations using two different feature representations). Each ML model was used to screen a diverse set of drug-like compound collections to identify orthosteric binders of two therapeutic targets, casein kinase 1α and 1δ (CK1α/δ). Overall, 10% of the predicted binders and 94% of the predicted non-binders were confirmed in biophysical assays, including two nanomolar binders (187 and 69.6 nM affinity for CK1α and CK1δ, respectively). Our study provides insights into the DEL+ML paradigm for hit discovery: the importance of an ensemble ML approach in identifying a diverse set of confirmed binders, the usefulness of large training data and chemical diversity in the DEL, and the significance of model generalizability over accuracy. We share our results via an open-source repository to support further use and the development of similar efforts.
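
As a rough illustration of the ensemble idea mentioned in the abstract, the sketch below enumerates the fifteen DEL+ML combinations and pools their predicted binders by vote count. It is not the authors' pipeline: the DEL and model names, the compound IDs, and the `min_votes` threshold are hypothetical placeholders, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product

# Hypothetical placeholder names; the abstract does not name the DELs or models.
DELS = ["DEL_A", "DEL_B", "DEL_C"]
MODELS = ["model_1", "model_2", "model_3", "model_4", "model_5"]


def del_ml_combinations():
    """Return every (DEL, model) pair: 3 DELs x 5 models."""
    return list(product(DELS, MODELS))


def ensemble_votes(predictions):
    """Count how many DEL+ML combinations predict each compound as a binder.

    predictions: dict mapping (del_name, model_name) -> set of compound IDs.
    """
    votes = Counter()
    for predicted_binders in predictions.values():
        votes.update(predicted_binders)
    return votes


def select_candidates(votes, min_votes=1):
    """Keep compounds predicted by at least `min_votes` combinations."""
    return sorted(c for c, n in votes.items() if n >= min_votes)


# Toy example with made-up compound IDs.
toy_predictions = {
    ("DEL_A", "model_1"): {"cmpd_1", "cmpd_2"},
    ("DEL_B", "model_2"): {"cmpd_2", "cmpd_3"},
}
toy_votes = ensemble_votes(toy_predictions)
union_hits = select_candidates(toy_votes, min_votes=1)
consensus_hits = select_candidates(toy_votes, min_votes=2)
```

With `min_votes=1` the selection is the union of all predictions, which favors diversity of candidates; raising the threshold trades diversity for agreement between combinations.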

Keywords

DNA-encoded library
Machine learning
Hit identification

Supplementary materials

Supplemental_file: Supplementary figures and supplementary table legends
Supplemental_tables: Five supplementary tables as separate sheets in one Excel file
