Abstract
The software macHine leArning booSTEd dockiNg (HASTEN) was developed to accelerate
structure-based virtual screening using machine learning models. It has been validated using
datasets from both the literature (12 datasets, each containing three million molecules docked
with FRED) and in-house sources (one dataset of four million compounds docked with
Glide). On the literature data, HASTEN showed reasonable performance, with a mean recall
of 0.78 for the top-scoring one percent of molecules after docking 10% of each dataset,
whereas an excellent recall of 0.95 was achieved on the in-house data. The program can be
used with any docking and machine learning methodology and is freely available from
https://github.com/TuomoKalliokoski/HASTEN.
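
The recall values quoted above measure how many of the truly top-scoring molecules were among those actually docked. The following is a minimal sketch of that calculation, not code taken from HASTEN: the function name, the toy data, and the convention that a lower docking score is better are all assumptions made for illustration.

```python
def recall_of_top_fraction(true_scores, docked_ids, top_fraction=0.01):
    """Return the fraction of the true top-scoring molecules that were docked.

    true_scores: dict mapping molecule id -> docking score from an exhaustive
                 run (assumption: lower score means a better pose).
    docked_ids:  set of molecule ids selected for docking by the screen.
    """
    n_top = max(1, int(len(true_scores) * top_fraction))
    # Rank all molecules by their reference score, best first.
    ranked = sorted(true_scores, key=true_scores.get)
    top_ids = set(ranked[:n_top])
    return len(top_ids & docked_ids) / n_top


# Toy data (hypothetical): 1000 molecules, molecule i has score i.
scores = {f"mol{i}": float(i) for i in range(1000)}

# Dock 10% of the library (100 molecules), 8 of which are in the true top 1%.
docked = {f"mol{i}" for i in range(8)} | {f"mol{i}" for i in range(100, 192)}

recall = recall_of_top_fraction(scores, docked)
print(f"docked {len(docked)} molecules, recall of top 1%: {recall:.2f}")
```

In this toy case, 8 of the 10 best molecules are found after docking a tenth of the library, giving a recall of 0.80.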