Multi-Instance Learning Approach to the Modeling of Enantioselectivity of Conformationally Flexible Organic Catalysts


The computational design of chiral organic catalysts for asymmetric synthesis is a promising technology that may significantly reduce the material and human resources required for the preparation of enantiopure compounds. Herein, for the modelling of catalyst enantioselectivity, we propose the Multi-Instance Learning (MIL) approach, which accounts for multiple catalyst conformers and requires neither conformer selection nor spatial alignment. A catalyst was represented by an ensemble of conformers, each encoded by 3D pmapper descriptors. A catalyzed chemical transformation was converted into a single molecular graph, the Condensed Graph of Reaction (CGR), encoded by 2D fragment descriptors. The whole chemical reaction was finally encoded by concatenating the 3D catalyst descriptors with the 2D transformation descriptors. The performance of the proposed method was demonstrated in the modelling of enantioselectivity of homogeneous and phase-transfer reactions and compared with state-of-the-art approaches.
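The encoding described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_bag` and `pool_bag` are hypothetical, the descriptor values are toy numbers rather than real pmapper or CGR fragment descriptors, and both the per-conformer concatenation and the mean pooling are assumptions chosen as the simplest way to turn a bag of conformers into one vector.

```python
import numpy as np

def build_bag(conformer_descriptors, reaction_descriptor):
    """Build a MIL bag: one row per conformer.

    Assumption: the same 2D reaction vector is appended to every
    3D conformer vector (the abstract only says the two are concatenated).
    """
    conformers = np.asarray(conformer_descriptors, dtype=float)  # (n_conformers, d3)
    reaction = np.asarray(reaction_descriptor, dtype=float)      # (d2,)
    tiled = np.tile(reaction, (conformers.shape[0], 1))          # (n_conformers, d2)
    return np.hstack([conformers, tiled])                        # (n_conformers, d3 + d2)

def pool_bag(bag):
    """Mean-pool the instances of a bag into a single vector.

    Assumption: mean pooling is a simple stand-in for a MIL aggregation
    step; it needs no conformer selection and no spatial alignment.
    """
    return bag.mean(axis=0)

# Toy data: two conformers with three "3D" features, one reaction with two "2D" features.
conformers = [[1.0, 0.0, 2.0],
              [3.0, 0.0, 0.0]]
reaction = [1.0, 1.0]

bag = build_bag(conformers, reaction)
pooled = pool_bag(bag)
print(bag.shape)  # (2, 5)
print(pooled)
```

The pooled vector could then be passed to any standard regressor to predict enantioselectivity; a learned aggregation (for example, attention over instances) would replace `pool_bag` in a more elaborate model.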