Benchmarking Refined and Unrefined AlphaFold2 Structures for Hit Discovery



The recently developed AlphaFold2 (AF2) algorithm predicts proteins’ 3D structures from their amino acid sequences, and the open AlphaFold Protein Structure Database covers the complete human proteome. These predicted structures have great potential to supply the structural information needed to enable and enhance both existing and new drug discovery projects. Using an industry-leading molecular docking method (Glide), we benchmarked virtual screening performance on 28 common drug targets from the DUD-E dataset, each with an AF2 structure and known holo and apo structures. The AF2 structures show early enrichment of known active compounds (avg. EF 1%: 13.16) comparable to that of the apo structures (avg. EF 1%: 11.56), but fall behind the holo structures (avg. EF 1%: 24.81). We also demonstrate that refining the AF2 structures with the IFD-MD induced-fit docking approach, using a known binding ligand, improves their performance in structure-based virtual screening (avg. EF 1%: 19.25). Thus, with proper preparation and refinement, AF2 structures show considerable promise for in silico hit identification.
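
For readers unfamiliar with the EF 1% metric quoted above, the following is a minimal sketch of how an enrichment factor is commonly computed from a ranked screening list. It assumes the usual definition (the hit rate among the top-ranked fraction divided by the hit rate in the whole library) and assumes that lower docking scores are better; the function name, its arguments, and the rounding of the top-fraction size are illustrative choices, not the authors' evaluation code.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at a given top fraction of a ranked library.

    scores    : docking scores, one per compound (assumed: lower is better)
    is_active : booleans, True for known actives, False for decoys
    fraction  : top fraction of the library to inspect (0.01 gives EF 1%)
    """
    if len(scores) != len(is_active):
        raise ValueError("scores and is_active must have the same length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need at least one compound and one active")

    # Size of the top-ranked subset; at least one compound is inspected.
    n_top = max(1, round(fraction * n_total))

    # Rank compounds from best (most negative) to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)

    # Hit rate in the top subset relative to the hit rate in the full library.
    return (hits_in_top / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Toy library: 20 actives and 980 decoys with made-up scores.
    toy_scores = [-12.0] * 5 + [-1.0] * 15 + [-10.0 + 0.001 * i for i in range(980)]
    toy_labels = [True] * 20 + [False] * 980
    print(enrichment_factor(toy_scores, toy_labels, fraction=0.01))
```

Under this definition, a value of 1 corresponds to random ranking, and the maximum attainable value depends on the ratio of actives to decoys in the library, so averages are most meaningful when compared across structures screened against the same compound sets, as in the benchmark described here.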

Version notes

Corrected author affiliation.


Supplementary material

Supporting Information
Target overview and additional analysis of enrichment results and binding sites