An Integrative Drug Repurposing Pipeline Using KNIME and Programmatic Data Access: A Case Study on COVID-19 Data

22 July 2020, Version 1
This content is a preprint and has not undergone peer review at the time of posting.


Biomedical information mining is increasingly recognized as a promising technique to accelerate drug discovery and development. In particular, integrative approaches that mine data from several (open) data sources have become more attractive as the possibilities for programmatic data access through Application Programming Interfaces (APIs) have grown. Using open data in conjunction with free, platform-independent analytic tools provides the additional advantages of flexibility, reusability, and transparency. Here, we present a strategy for performing in silico drug repurposing with the analytics platform KNIME, using data for 38 suggested COVID-19 drug targets as a timely use case. The workflow includes a targeted download of data through web services, data curation (including chemical structure standardization), detection of enriched structural patterns, and substructure searches in DrugBank and in a recently deposited dataset of antiviral drugs provided by Chemical Abstracts Service. The developed workflows, tutorials with detailed step-by-step instructions, and the information gained from the analysis of COVID-19 data are made freely available to the scientific community. The provided framework can be reused by researchers for other in silico drug repurposing projects, and it should serve as a valuable teaching resource for conveying integrative data mining strategies.
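
As a rough illustration of the first two stages named in the abstract (targeted download through a web service, followed by data curation), the following Python sketch may help. It is not the authors' KNIME workflow. The abstract does not name the web services used, so the ChEMBL activity endpoint and its `target_chembl_id`, `limit`, and `offset` parameters are assumptions chosen only as an example of a public API. The helper names `build_activity_url` and `curate` and the example records are hypothetical. The curation step only removes missing and duplicate SMILES strings; genuine structure standardization would require a cheminformatics toolkit.

```python
from urllib.parse import urlencode

# Assumption: ChEMBL's public REST endpoint stands in for "a web service";
# the abstract does not say which services the workflow actually calls.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def build_activity_url(target_id, limit=100, offset=0):
    """Return a paginated query URL for the bioactivities of one target."""
    params = {"target_chembl_id": target_id, "limit": limit, "offset": offset}
    return f"{BASE_URL}?{urlencode(params)}"


def curate(records):
    """Drop records without a structure and keep one record per SMILES string.

    This is plain string de-duplication, not chemical standardization.
    """
    seen = set()
    kept = []
    for record in records:
        smiles = (record.get("canonical_smiles") or "").strip()
        if not smiles or smiles in seen:
            continue
        seen.add(smiles)
        kept.append({**record, "canonical_smiles": smiles})
    return kept


# Hypothetical records shaped like a parsed JSON response (not real data).
example = [
    {"molecule_chembl_id": "MOL_A", "canonical_smiles": "CCO"},
    {"molecule_chembl_id": "MOL_B", "canonical_smiles": None},
    {"molecule_chembl_id": "MOL_C", "canonical_smiles": "CCO "},
]

print(build_activity_url("CHEMBL_TARGET_ID", limit=50))
print([r["molecule_chembl_id"] for r in curate(example)])
```

The sketch makes no network request; fetching and paging through results would be layered on top of the URL builder, and `CHEMBL_TARGET_ID` is a placeholder rather than a real identifier.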


drug repurposing
data integration
data mining
data access
application programming interface
structure standardization
maximum common substructure
substructure search
KNIME workflow

Supplementary materials

Supporting information JCheminf

