ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information. For more information, please see our FAQs.

REDIAL-2020: A Suite of Machine Learning Models to Estimate Anti-SARS-CoV-2 Activities

preprint
revised on 15.09.2020 and posted on 16.09.2020 by Govinda KC, Giovanni Bocci, Srijan Verma, Mahmudulla Hassan, Jayme Holmes, Jeremy Yang, Suman Sirimulla, Tudor I. Oprea

Strategies for drug discovery and repositioning are urgently needed for COVID-19. We developed "REDIAL-2020", a suite of machine learning models that estimate small-molecule activity from molecular structure across a range of SARS-CoV-2-related assays. Classifiers are developed in parallel on three distinct types of descriptors (fingerprint, physicochemical, and pharmacophore). These models were trained on high-throughput screening data from the NCATS COVID19 portal (https://opendata.ncats.nih.gov/covid19/index.html) using multiple machine learning classification algorithms. The “best models” are combined in an ensemble consensus predictor that outperforms single models where external validation is available. This suite of machine learning models is available through the DrugCentral web portal (http://drugcentral.org/Redial). Accepted inputs are a drug name, a PubChem CID, or a SMILES string; the output is an estimate of anti-SARS-CoV-2 activity. The web application reports estimated activity across three areas (viral entry, viral replication, and live virus infectivity) spanning six independent models, followed by a similarity search that displays the molecules most similar to the query among those with experimentally determined data. The ML models have 60% to 74% external predictivity, based on three separate datasets. Complementing the NCATS COVID19 portal, REDIAL-2020 can serve as a rapid online tool for identifying active molecules for COVID-19 treatment. The source code and specific models are available through GitHub (https://github.com/sirimullalab/redial-2020), or via Docker Hub (https://hub.docker.com/r/sirimullalab/redial-2020) for users preferring a containerized version.
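
The consensus and similarity-search steps described above can be illustrated with a minimal Python sketch. It is not taken from the REDIAL-2020 code base: the abstract does not state the consensus rule or the similarity metric, so strict majority voting and Tanimoto similarity over fingerprint bit sets are assumptions here, and the function names (`consensus_vote`, `tanimoto`, `most_similar`) and the toy data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def consensus_vote(predictions: Dict[str, int]) -> int:
    """Combine binary predictions (1 = active, 0 = inactive) from several
    models by strict majority vote. Assumed rule, not the published one."""
    votes = list(predictions.values())
    return 1 if sum(votes) * 2 > len(votes) else 0


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def most_similar(
    query: Set[int], library: Dict[str, Set[int]], top_n: int = 3
) -> List[Tuple[str, float]]:
    """Rank library molecules by Tanimoto similarity to the query."""
    scored = [(name, tanimoto(query, bits)) for name, bits in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_n]


# Toy example: one model per descriptor type votes on a single assay.
per_descriptor = {"fingerprint": 1, "physicochemical": 0, "pharmacophore": 1}
consensus = consensus_vote(per_descriptor)

# Toy fingerprints (sets of on-bit indices), purely illustrative.
library = {"mol_a": {1, 2, 3, 4}, "mol_b": {2, 3}, "mol_c": {7, 8}}
neighbours = most_similar({1, 2, 3}, library, top_n=2)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written bit sets, and the consensus could equally be built from averaged probabilities instead of hard votes.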

Funding

NSF-PREM grant #DMR-1827745

Email Address of Submitting Author

ssirimulla@utep.edu

Institution

The University of Texas at El Paso

Country

United States

ORCID For Submitting Author

0000-0003-4665-6665

Declaration of Conflict of Interest

The authors did not declare any conflicts of interest.

Version Notes

This is the first version of the REDIAL-2020 manuscript.
