Abstract
Molecular docking is a pragmatic approach to exploiting protein structure for new ligand discovery, but the rapidly growing size of accessible chemical space is increasingly difficult to screen on in-house computer clusters. We have therefore developed AWS-DOCK, a protocol for running UCSF DOCK in the AWS cloud. Our approach combines the low cost and scalability of cloud resources with a docking engine that has a low cost per molecule, so that billions of molecules can be screened efficiently. We benchmarked our system by screening 50 million molecules with a heavy atom count (HAC) of 22 against the DRD4 receptor. We saw up to 3-fold variations in cost between AWS availability zones. Docking 4.5 billion lead-like molecules, a 7-week calculation on our 1000-core lab cluster, runs in less than a week in AWS for around $25,000, less than the cost of two new cluster nodes. The cloud docking protocol is described in easy-to-follow steps and may be sufficiently general to be used for other docking programs. All the tools that enable AWS-DOCK are freely available to everyone, while DOCK 3.8 is free for academic research.
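The figures quoted in the abstract can be turned into rough per-molecule estimates. The sketch below is a back-of-envelope calculation only: it uses just the numbers stated above (4.5 billion molecules, 7 weeks on 1000 cores, around $25,000), and the function names are illustrative, not part of AWS-DOCK. The estimate of cores needed for a one-week run assumes, purely for illustration, that a cloud core is as fast as a lab-cluster core.

```python
# Back-of-envelope estimates derived only from figures quoted in the abstract.
# Function and constant names are hypothetical and are not part of AWS-DOCK.

MOLECULES = 4.5e9        # lead-like molecules docked
LAB_CORES = 1000         # cores in the lab cluster
LAB_WEEKS = 7            # wall-clock weeks on the lab cluster
AWS_COST_USD = 25_000    # approximate AWS cost ("around $25,000")
HOURS_PER_WEEK = 7 * 24


def lab_core_hours(cores=LAB_CORES, weeks=LAB_WEEKS):
    """Total core-hours consumed by the lab-cluster run."""
    return cores * weeks * HOURS_PER_WEEK


def core_seconds_per_molecule(molecules=MOLECULES):
    """Average core-seconds spent per docked molecule on the lab cluster."""
    return lab_core_hours() * 3600 / molecules


def cost_per_million(cost_usd=AWS_COST_USD, molecules=MOLECULES):
    """Approximate AWS cost in USD per million molecules docked."""
    return cost_usd / (molecules / 1e6)


def cores_for_deadline(weeks_target=1):
    """Cores needed to finish in weeks_target weeks.

    Assumption: a cloud core has the same throughput as a lab core.
    """
    return lab_core_hours() / (weeks_target * HOURS_PER_WEEK)


if __name__ == "__main__":
    print(f"Lab core-hours:            {lab_core_hours():,.0f}")
    print(f"Core-seconds per molecule: {core_seconds_per_molecule():.2f}")
    print(f"USD per million molecules: {cost_per_million():.2f}")
    print(f"Cores for a 1-week run:    {cores_for_deadline(1):,.0f}")
```

Under these assumptions the lab run corresponds to roughly 0.94 core-seconds per molecule, and the AWS run to about $5.6 per million molecules. Because the abstract reports "less than a week", the one-week core count is only an indicative figure.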
Supplementary materials
Title
S1 - Set up account
Description
Instructions on how to set up an AWS account for docking. This only needs to be done once.
Title
S2 - Upload files for docking
Description
What you need to do for each job before launching docking.
Title
S3 - Submit docking job in the cloud
Description
Instructions on how to run the job itself.
Title
S4 - Merge and download results
Description
Instructions on post-processing docking results in preparation for hit picking.
Title
S5 - Cleanup
Description
Without cleanup, you will pay for storage of data you no longer need. This tutorial describes when, where, and what you can delete to keep your storage costs under control.