The First CACHE Challenge – Identifying Binders of the WD-Repeat Domain of Leucine-Rich Repeat Kinase 2



In December 2021, Molecular Forecaster (MFI) applied to participate in the inaugural CACHE Challenge. Organized by the Structural Genomics Consortium (SGC), CACHE (Critical Assessment of Computational Hit-finding Experiments) is a public–private partnership and benchmarking initiative that supports the development of computational methods and aims “to compare and improve small-molecule hit-finding algorithms through cycles of prediction and experimental testing.” As a Research-as-a-Service (RaaS) start-up in computer-aided drug design, MFI has a clear focus: to help kickstart drug discovery programs around the world with our software and expertise. Social impact drives our mission: we are in business to help people, to further science, and to make drug discovery more efficient. However, because much of MFI’s work is bound by NDA, we cannot share it publicly. This is why the CACHE framework is energizing and exciting to our team: we have the opportunity to collaborate beyond typical confidentiality limitations and to support researchers by sharing what we do and how we do it in a broader context. The MFI team has decided to take multiple research-focused approaches to our predictions in this first CACHE challenge, aiming to learn from our successes and failures. We are putting MFI’s team, expertise, and algorithms to the test, using them as a foundation to push beyond our scientific and application successes to date. We’ve also decided to double down and share the details of our work with the community. Over the next several months, you will get a closer look at the way we approach scientific problems and push projects forward at MFI.

Version notes

This version primarily adds supplementary files with data output. We also made a few technical corrections to the text regarding Fitted.


Supplementary material

Supporting Protein Files
- SWISS-Model structure (pdb format) after rebuilding the missing regions
- MD simulation input files (mdp, top and ndx format) for minimization, equilibration, and production
- Equilibrated protein structure (gro format)
- LRRK2MD-WD40 structure (pdb format), along with the files created by PREPARE and PROCESS
Supporting documentation
- GNN scores for evaluated compounds
- QMSF scores for evaluated compounds
- List of the 150 selected compounds with SMILES, scores (docking, GNN, QMSF, or visualization), key physico-chemical properties (MW, HBD, HBA, etc.) and method of selection
- Final list of 96 compounds for testing

Supplementary weblinks

Corresponding blog post
An accompanying blog post on our website detailing our participation in the CACHE challenge.