De Novo Generation of Hit-like Molecules from Gene Expression Signatures Using Artificial Intelligence

Finding new molecules with a desired biological activity is an extremely difficult task. In this context, artificial intelligence and generative models have been used for de novo molecular design and compound optimization. Herein, we report the first generative model that bridges systems biology and molecular design by conditioning a generative adversarial network on transcriptomic data. With this approach, we can generate molecules that have a high probability of producing a desired biological effect at the cellular level. We show that the model is able to design active-like molecules for desired targets without any prior target annotation of the training compounds, as long as the gene expression signature of the desired state is provided. The molecules generated by this model are more similar to known active compounds than those identified by similarity of gene expression signatures, which is the state-of-the-art method for navigating compound-induced gene expression data. Overall, this method represents a novel way to bridge chemistry and biology and to advance along the long and difficult road of drug discovery.
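To make the baseline mentioned above concrete, the following is a minimal sketch of retrieving compounds by similarity of gene expression signatures. It is an illustration only, not the procedure used in this work: the choice of cosine similarity, the function names (`cosine_similarity`, `rank_by_signature_similarity`), the compound names, and the four-gene toy signatures are all assumptions introduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length expression vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero signature carries no directional information
    return dot / (norm_a * norm_b)


def rank_by_signature_similarity(query, reference_signatures, top_k=3):
    """Return the top_k (name, score) pairs most similar to the query signature."""
    scored = [
        (name, cosine_similarity(query, signature))
        for name, signature in reference_signatures.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical compound-induced signatures (toy values, not real data).
reference = {
    "compound_A": [1.2, -0.8, 0.1, 2.0],
    "compound_B": [-1.0, 0.9, 0.0, -1.8],
    "compound_C": [0.9, -0.5, 0.3, 1.5],
}

# Hypothetical signature of the desired cellular state.
desired_state = [1.0, -1.0, 0.0, 2.0]

ranking = rank_by_signature_similarity(desired_state, reference, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A retrieval of this kind can only return compounds that already have a measured signature, whereas a generative model conditioned on the desired signature proposes new structures.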