Abstract
Approximately 40% of marketed drugs exhibit suboptimal pharmacokinetic profiles. Co-crystallization, in which pairs of molecules form a multicomponent crystal, is a promising strategy to enhance physicochemical properties without compromising pharmacological activity. However, finding suitable co-crystal pairs is resource-intensive because of the vast number of possible combinations. We present DeepCocrystal, a novel deep learning approach designed to predict co-crystal formation by processing the 'chemical language' from a supramolecular vantage point. Rigorous validation of DeepCocrystal showed a balanced accuracy of 78% in realistic scenarios, outperforming existing models. By leveraging the properties of molecular string representations, DeepCocrystal can also estimate the uncertainty of its predictions. We harnessed this capability in a challenging prospective study and discovered two novel co-crystals of diflunisal, an anti-inflammatory drug. This study underscores the potential of deep learning, and in particular of chemical language processing, to accelerate co-crystallization, and ultimately drug development, in both academic and industrial contexts.