CACHE: Utilizing Ultra-Large Library Screening in Rosetta to Identify Novel Binders of the WD-Repeat Domain of Leucine-Rich Repeat Kinase 2

22 April 2025, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

In this study, we present a pipeline for identifying novel ligands targeting the Tryptophan-Aspartate-Repeat domain 40 (WDR40) of Leucine-Rich Repeat Kinase 2 (LRRK2), a protein associated with Parkinson's disease, as part of the first Critical Assessment of Computational Hit-Finding Experiments (CACHE) challenge, a blind benchmark experiment for drug discovery. Mutations in this protein are the most common genetic cause of familial Parkinson's disease, yet this target remains understudied. We conducted an ultra-large library screening (ULLS) of the Enamine REAL space using a newly developed evolutionary algorithm, RosettaEvolutionaryLigand (REvoLd), which allows for efficient screening of combinatorial compound libraries. The protocol involved refining the target structure with molecular dynamics simulations, identifying a binding site via blind docking, and optimizing compounds through REvoLd, culminating in a manual selection among the top-scoring REvoLd hits. In the first round, a single binder was identified, derived from the combination of two Enamine building blocks. In the second round, derivatives of the hit compound were used as input for REvoLd to further sample within the Enamine REAL space. Ultimately, five molecules were identified, showcasing the effectiveness of this approach. The study also highlighted shortcomings, such as the preference of the RosettaLigand scoring function for nitrogen-rich rings.
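To illustrate the general idea of evolutionary search over a combinatorial library, the following is a minimal sketch and not the REvoLd implementation. It assumes a hypothetical two-component library in which each molecule is a pair of building blocks, and a placeholder `toy_score` function standing in for a docking score (lower is better). The function names, parameters, and the integer building-block IDs are all illustrative assumptions.

```python
import random


def evolve(blocks_a, blocks_b, score, generations=20, pop_size=30, keep=10, seed=0):
    """Toy evolutionary search over a two-component combinatorial library.

    Each molecule is a pair (a, b) of building blocks. `score` maps a pair
    to a number, where lower is better. Returns the best scored molecule,
    its score, and the number of distinct molecules that were scored.
    """
    rng = random.Random(seed)
    # Start from randomly combined building blocks.
    population = [(rng.choice(blocks_a), rng.choice(blocks_b)) for _ in range(pop_size)]
    scored = {}  # cache so that no molecule is scored twice

    for _ in range(generations):
        for mol in population:
            if mol not in scored:
                scored[mol] = score(mol)

        # Select the best distinct molecules as parents (order-preserving dedupe).
        unique = list(dict.fromkeys(population))
        parents = sorted(unique, key=scored.get)[:keep]

        children = list(parents)  # elitism: parents survive unchanged
        while len(children) < pop_size:
            if rng.random() < 0.5:
                # Mutation: replace one building block of a parent.
                a, b = rng.choice(parents)
                if rng.random() < 0.5:
                    a = rng.choice(blocks_a)
                else:
                    b = rng.choice(blocks_b)
            else:
                # Crossover: take one building block from each of two parents.
                a = rng.choice(parents)[0]
                b = rng.choice(parents)[1]
            children.append((a, b))
        population = children

    best = min(scored, key=scored.get)
    return best, scored[best], len(scored)


def toy_score(mol):
    """Placeholder for a docking score (hypothetical, not a Rosetta call)."""
    a, b = mol
    return (a - 37) ** 2 + (b - 12) ** 2


if __name__ == "__main__":
    # Hypothetical library: 100 x 100 building blocks identified by integers.
    best, best_score, n_scored = evolve(list(range(100)), list(range(100)), toy_score)
    print(best, best_score, n_scored)
```

The point of the sketch is that only a small fraction of the combinatorial space is ever scored: here at most `generations * pop_size` molecules out of 10,000 possible pairs. A real screening run would replace `toy_score` with a docking calculation and the integer IDs with actual reagents and reaction rules.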

Keywords

drug discovery
ultra-large library screening
REvoLd
Rosetta
LRRK2
CACHE challenge

Supplementary materials

Title: All clustered PDB structures and clustering information from the MD simulation
Description: The protein structures after MD simulation (in PDB format), the selected cluster representatives of all eleven clusters, and the respective clustering information from DBSCAN.

Title: All selected and ordered compounds in both rounds of CACHE challenge #1
Description: The lists contain selected or ordered compounds with SMILES, catalog ID, rank, vendor and price as communicated by the CACHE organizers. Table S1: All initial 150 selected compounds and their SMILES code. Table S2: All initial 109 ordered compounds and their SMILES code. Table S3: All improved 72 selected compounds and their SMILES code. Table S4: All improved 45 ordered compounds and their SMILES code.
