Phytochemical Drug Discovery for COVID-19 Using High-resolution Computational Docking and Machine Learning Assisted Binder Prediction

13 July 2022, Version 2
This content is a preprint and has not undergone peer review at the time of posting.


The COVID-19 pandemic has resulted in millions of deaths around the world. Multiple vaccines are in use, but many underserved locations do not have adequate access to them. Variants that are highly resistant to existing vaccines may also emerge, so cheap and readily obtainable therapeutics are needed. Phytochemicals, or plant chemicals, are candidate therapeutics of this kind. They can be used in a polypharmacological approach, in which multiple viral proteins are inhibited and escape mutations are made less likely. Finding the right phytochemicals for viral protein inhibition is challenging, but in silico screening methods can make this a more tractable problem. In this study, we screen a wide range of natural drug products against a comprehensive set of SARS-CoV-2 proteins using a high-resolution computational workflow. The workflow begins with structure-based virtual screening (SBVS), in which an initial phytochemical library was docked against all selected protein structures. Ligand-based virtual screening (LBVS) was then employed: chemical features of the 34 lead compounds obtained from the SBVS were used to predict 53 lead compounds from a larger phytochemical library via supervised learning. Docking validation of the 53 leads predicted by LBVS revealed that 28 of them form strong binding interactions with SARS-CoV-2 proteins. Thus, the inclusion of LBVS resulted in a 4-fold increase in the lead discovery rate. Of the 62 total leads (34 from SBVS and 28 from LBVS), 18 showed promising pharmacokinetic properties in a computational ADME screening. Collectively, this study demonstrates the advantage of incorporating machine learning elements into a virtual screening workflow.
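
The lead counts quoted in the abstract can be tallied with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative and are not taken from the manuscript. It does not attempt to reproduce the reported 4-fold increase in lead discovery rate, because that figure depends on the sizes of the two phytochemical libraries, which the abstract does not state.

```python
# Counts taken directly from the abstract; variable names are illustrative.
sbvs_leads = 34        # leads from structure-based virtual screening (SBVS)
lbvs_predicted = 53    # compounds predicted as leads by the supervised model (LBVS)
lbvs_confirmed = 28    # predicted compounds confirmed by docking validation
adme_passed = 18       # leads with promising computed pharmacokinetic (ADME) properties

# Total leads = SBVS leads plus the LBVS predictions that docking confirmed.
total_leads = sbvs_leads + lbvs_confirmed

# Fraction of LBVS predictions that survived docking validation.
docking_confirmation_rate = lbvs_confirmed / lbvs_predicted

# Fraction of all leads that passed the computational ADME screen.
adme_pass_rate = adme_passed / total_leads

print(f"total leads: {total_leads}")
print(f"LBVS predictions confirmed by docking: {docking_confirmation_rate:.1%}")
print(f"leads passing ADME screen: {adme_pass_rate:.1%}")
```

Under these counts, the total of 62 leads is the 34 SBVS leads plus the 28 docking-confirmed LBVS leads, roughly half of the LBVS predictions are confirmed by docking, and a little under a third of all leads pass the ADME screen.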


Machine Learning
Ligand Docking
Drug Discovery
Natural Products
Virtual Screening

Supplementary materials

Supporting Information
Figures S1-S3 and Tables S1-S9. A description of the contents can be found in the Associated Content section of the main manuscript.

