Abstract
The suitability of a synthetic protocol for a target molecule can be evaluated by analyzing the reaction scopes of candidate literature procedures. Although trial-and-error screening of reaction conditions is standard practice, it can be expensive and frustrating. Despite growing demand for extensive scopes, many synthetic protocols report low substrate diversity or rely on conditions that were fine-tuned for specific substrates within the scope, making it difficult to predict whether a new substrate will react under a known set of conditions. Herein, we disclose a method for classifying substrates according to different reaction conditions, with the aim of encouraging the design of informative scopes. This method estimates the probability that specific reaction conditions will suit a target substrate by comparing its molecular features against those of reported substrates, which is especially useful when the target is at a late stage of a stepwise synthesis. Our approach was designed for very small datasets, making it applicable across a broad reaction space.
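To make the idea of comparing a target's molecular features against those of reported substrates concrete, the following is a minimal illustrative sketch and not the method disclosed here. It assumes that each substrate is described by a set of hypothetical feature tags, that similarity is measured with the Tanimoto coefficient, and that each condition label receives a smoothed, similarity-weighted vote. The names `tanimoto`, `condition_probabilities`, the feature tags, and the labels `conditions_A` and `conditions_B` are all invented for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple

Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def condition_probabilities(
    target: Features,
    reported: List[Tuple[Features, str]],
    prior: float = 1.0,
) -> Dict[str, float]:
    """Similarity-weighted vote over condition labels.

    Assumption: `prior` is a pseudo-count added to every label so that
    estimates stay defined and conservative on very small datasets.
    """
    labels = sorted({label for _, label in reported})
    weights = {label: prior for label in labels}
    for features, label in reported:
        weights[label] += tanimoto(target, features)
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


# Hypothetical reported scope: (feature tags, conditions that worked).
reported = [
    (frozenset({"aryl_bromide", "electron_poor"}), "conditions_A"),
    (frozenset({"aryl_bromide", "ortho_substituent"}), "conditions_A"),
    (frozenset({"aryl_chloride", "electron_rich"}), "conditions_B"),
]

# Hypothetical late-stage target substrate.
target = frozenset({"aryl_bromide", "electron_poor", "heteroaryl"})

probs = condition_probabilities(target, reported)
for label, p in probs.items():
    print(f"{label}: {p:.2f}")
```

In this toy example the target shares most of its features with substrates reported under `conditions_A`, so that label receives the larger estimated probability. The pseudo-count keeps the estimate from collapsing to 0 or 1 when only a handful of substrates have been reported.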