ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

Supercomputer-Based Ensemble Docking Drug Discovery Pipeline with Application to Covid-19

preprint
submitted on 28.07.2020 and posted on 29.07.2020 by Atanu Acharya, Rupesh Agarwal, Matthew Baker, Jerome Baudry, Debsindhu Bhowmik, Swen Boehm, Kendall Byler, Leighton Coates, Sam Yen-Chi Chen, Connor J. Cooper, Omar Demerdash, Isabella Daidone, John Eblen, Sally R. Ellingson, Stefano Forli, Jens Glaser, James C. Gumbart, John Gunnels, Oscar Hernandez, Stephan Irle, Jeffery Larkin, Travis J Lawrence, Scott LeGrand, Shih-Hsien Liu, Julie C. Mitchell, Gilchan Park, Jerry M. Parks, Anna Pavlova, Loukas Petridis, Duncan Poole, Line Pouchard, Arvind Ramanathan, David Rogers, Diogo Santos-Martins, Aaron Scheinberg, Ada Sedova, Shawn Shen, Jeremy C. Smith, Micholas Smith, Carlos Soto, Aristides Tsaris, Mathialakan Thavappiragasam, Andreas F. Tillack, Josh V Vermaas, Van Quan Vuong, Junqi Yin, Shinjae Yoo, Mai Zahran, Laura Zanetti-Polzi
We present a supercomputer-driven pipeline for in silico drug discovery using enhanced-sampling molecular dynamics (MD) and ensemble docking. We also describe preliminary results obtained for 23 systems involving eight protein targets from the SARS-CoV-2 proteome. The MD is performed with temperature replica-exchange enhanced sampling, exploiting the massive parallelism of the SUMMIT supercomputer at Oak Ridge National Laboratory, on which more than 1 ms of enhanced-sampling MD can be generated per day. We have ensemble-docked repurposing databases to ten configurations of each of the 23 SARS-CoV-2 systems using AutoDock Vina. We also demonstrate that, using AutoDock-GPU on SUMMIT, it is possible to perform exhaustive docking of one billion compounds in under 24 hours. Finally, we discuss preliminary results and planned improvements to the pipeline, including the use of quantum mechanical (QM), machine learning, and AI methods to cluster MD trajectories and rescore docking poses.
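
As a rough illustration of the ensemble-docking idea summarized above, the following minimal Python sketch ranks compounds by their most favorable (lowest) docking score across several receptor configurations, and computes the average throughput implied by docking one billion compounds in 24 hours. The function names, the example scores, and the "best score across configurations" ranking rule are assumptions made for illustration; the abstract does not state how scores are aggregated in the actual pipeline.

```python
from typing import Dict, List, Tuple


def best_ensemble_scores(
    scores: Dict[str, List[float]]
) -> List[Tuple[str, float, int]]:
    """Rank compounds by their best score over all receptor configurations.

    `scores` maps a compound ID to one docking score per configuration
    (each list must be non-empty). Lower scores are treated as more
    favorable, as with AutoDock Vina binding energies in kcal/mol.
    Returns (compound, best_score, configuration_index), best first.
    """
    ranked = []
    for compound, per_conf in scores.items():
        # Index of the configuration giving the lowest (best) score.
        best_idx = min(range(len(per_conf)), key=per_conf.__getitem__)
        ranked.append((compound, per_conf[best_idx], best_idx))
    ranked.sort(key=lambda entry: entry[1])
    return ranked


def required_rate(n_compounds: int, hours: float) -> float:
    """Average number of compounds per second needed to finish in `hours`."""
    return n_compounds / (hours * 3600.0)


if __name__ == "__main__":
    # Hypothetical scores for two compounds against three configurations.
    example = {
        "compound_a": [-5.0, -7.2, -6.1],
        "compound_b": [-8.0, -4.0, -3.0],
    }
    for compound, score, conf in best_ensemble_scores(example):
        print(compound, score, conf)
    print(round(required_rate(10**9, 24.0)), "compounds per second")
```

Taking the minimum over configurations is only one possible choice; averaging or Boltzmann-weighting the per-configuration scores would slot into the same structure.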

Funding

This work was made possible in part by a grant of high-performance computing resources and technical support from the Alabama Supercomputer Authority to JB and KB.

CJC was supported by a National Science Foundation Graduate Research Fellowship under Grant No. 2017219379.

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725, and of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231.

This research was supported by the Cancer Research Informatics Shared Resource Facility of the University of Kentucky Markey Cancer Center (P30CA177558) and the University of Kentucky’s Center for Computational Sciences (CCS) high-performance computing resources.

Email Address of Submitting Author

msmit316@utk.edu

Institution

The University of Tennessee, Knoxville

Country

USA

ORCID For Submitting Author

0000-0002-0777-7539

Declaration of Conflict of Interest

The authors declare no conflicts of interest.

Version Notes

Version 1.0
