Abstract
Chemists have traditionally relied on heuristic approaches to qualitatively assess chemical structure-property relationships and interpret experimental outcomes. However, these methods are inherently limited in handling large volumes of data and integrating them effectively into experimental planning. Understanding the interrelationships among different substitution patterns of organic molecular materials is crucial for optimizing synthetic conditions and expanding their applicability. In this study, we developed a machine learning (ML) algorithm incorporating latent variables to predict unobservable reactions and synthetic conditions for a class of organic materials, perfluoro-iodinated naphthalene derivatives. The algorithm accurately estimated substitution pattern relationships and reaction yields, and its predictions were experimentally validated with high-yield outcomes. Our findings reveal that latent variables effectively capture underlying physicochemical relationships, achieving an R² value >0.99. This approach establishes an ML-guided framework that complements heuristic decision-making in chemistry and optimizes synthetic processes for the target molecule in an extrapolative manner. Further applications of this algorithm will focus on synthetic design and physicochemical property prediction, particularly for catalyst discovery and organic semiconductor optimization.
Supplementary materials
Title
Supporting Information
Description
1. General information
2. Synthesis and characterization of substrate
3. Preparation of magnesium amide bases
4. Iodination reaction of polyfluoronaphthalenes
5. Computational studies
6. References
Appendix 1. Cartesian coordinates
Appendix 2. Details of predicted yields
Appendix 3. List of algorithms
Appendix 4. List of descriptors