Abstract
The diversity of RNA structural elements and their documented role in human diseases make RNA an attractive therapeutic target. However, progress in drug discovery and development has been hindered by a limited understanding of the parameters that drive RNA recognition by small molecules, including a lack of experimentally validated quantitative structure-activity relationships (QSAR). We developed an adaptable ensemble learning-based method that quantitatively predicts both affinity- and kinetics-based binding parameters of small molecules against the HIV-1 TAR model RNA system. A training set of small molecules was screened against the HIV-1 TAR construct using surface plasmon resonance (SPR), which provided the binding kinetics and affinities. Applying ensemble learning to these data, combined with structure-based molecular descriptors, afforded predictive models as well as explicit interpretation of the contributing parameters. The accuracy of the model was tested by external validation, in which the binding properties of additional molecules outside the training set were correctly predicted. The ensemble model presented herein is the first application of predictive and experimentally validated 2D-QSAR against an RNA target, in this case HIV-1 TAR RNA, and provides a platform to guide future synthetic efforts. Furthermore, we expect the workflow described herein to be applicable to other RNA structures, ultimately providing essential insight into the small molecule descriptors that drive selective binding interactions and, consequently, substantially increasing the efficiency of ligand design and optimization without the need for high-resolution structures.
Supplementary materials
Title
Supplementary Information for Ensemble learning-based quantitative structure-activity relationship platform predicts binding behavior of RNA-targeted small molecules
Description
Supporting Information document containing additional figures and tables referenced within the manuscript, as well as detailed experimental, computational, and synthetic procedures; materials and methods; spectroscopic and chromatographic data; code; and SPR binding curves.