Abstract
The selection of optimal reaction conditions is a critical challenge in synthetic chemistry, influencing
the efficiency, sustainability, and scalability of chemical processes. While machine learning (ML) has
emerged as a promising tool for predicting reaction conditions in computer-aided synthesis planning
(CASP), existing approaches face significant challenges, including data quality, data sparsity, choice
of reaction representation, and method evaluation. Recent studies suggest that these models
may fail to surpass literature-derived popularity baselines, underscoring these problems. In this
work, we provide a critical review of state-of-the-art ML techniques, identifying innovations that
address the key challenges researchers face when modelling reaction conditions. To illustrate how
relevant reaction representations can improve existing models, we perform a case study of heteroaromatic
Suzuki-Miyaura reactions derived from US patent data (USPTO). Using inputs based on the Condensed Graph
of Reaction (CGR), we demonstrate how this alternative representation can enhance the predictive
power of a model beyond popularity baselines. Finally, we propose future directions for the
field beyond improving data quality, suggesting options to mitigate the issues prevalent in
existing literature data. This perspective aims to guide researchers in understanding and overcoming
current limitations in computational reaction condition prediction.