ChemRxiv
These are preliminary reports that have not been peer-reviewed. They should not be regarded as conclusive, guide clinical practice/health-related behavior, or be reported in news media as established information.

Completion of Partial Reaction Equations

preprint
submitted on 23.11.2020, 13:11 and posted on 24.11.2020, 04:55 by Alain C. Vaucher, Philippe Schwaller, Teodoro Laino
We present a deep-learning model for inferring missing molecules in reaction equations. Such a model serves three purposes. First, it can infer the necessary reagents and solvents in chemical transformations specified only in terms of their main compounds, as is often the case for the output of retrosynthetic analyses. Completing reaction equations with the necessary reagents makes them compatible with deep-learning models that rely on a complete reaction specification. Second, it can curate existing datasets by detecting missing compounds, such as reagents that are essential for given classes of reactions. Finally, the model generalizes models for forward reaction prediction and retrosynthetic analysis, as both tasks can be formulated in terms of incomplete reaction equations. We illustrate that a single trained model, based on the transformer architecture and acting on reaction SMILES strings, can address all three points.
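To make the idea of an incomplete reaction equation concrete, the following minimal Python sketch shows how forward prediction, retrosynthesis, and reagent completion can each be written as one reaction SMILES string with a missing part. It is an illustration only, not the authors' implementation: the `?` placeholder, the `Reaction` type, and the helper names `parse_reaction_smiles` and `to_partial` are assumptions made for this example, and the abstract does not specify how the model marks a missing part. The sketch relies only on the standard reaction SMILES layout `reactants>agents>products`, with molecules separated by `.`.

```python
from typing import NamedTuple, Tuple

MASK = "?"  # hypothetical placeholder for the missing part (an assumption)


class Reaction(NamedTuple):
    reactants: Tuple[str, ...]
    agents: Tuple[str, ...]
    products: Tuple[str, ...]


def parse_reaction_smiles(rxn: str) -> Reaction:
    """Split 'reactants>agents>products' into three tuples of SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError(f"expected exactly two '>' separators in {rxn!r}")
    # An empty field (for example no agents in 'A>>B') becomes an empty tuple.
    return Reaction(*(tuple(p.split(".")) if p else () for p in parts))


def to_partial(reaction: Reaction, hide: str) -> str:
    """Return the reaction SMILES with one field replaced by MASK."""
    if hide not in Reaction._fields:
        raise ValueError(f"unknown field {hide!r}")
    fields = []
    for name in Reaction._fields:
        molecules = getattr(reaction, name)
        fields.append(MASK if name == hide else ".".join(molecules))
    return ">".join(fields)


# Esterification of acetic acid with ethanol, sulfuric acid as the agent.
rxn = parse_reaction_smiles("CC(=O)O.OCC>OS(=O)(=O)O>CC(=O)OCC")

print(to_partial(rxn, "products"))   # forward prediction:  CC(=O)O.OCC>OS(=O)(=O)O>?
print(to_partial(rxn, "reactants"))  # retrosynthesis:      ?>OS(=O)(=O)O>CC(=O)OCC
print(to_partial(rxn, "agents"))     # reagent completion:  CC(=O)O.OCC>?>CC(=O)OCC
```

All three tasks share one input format here, which is the property that allows a single sequence-to-sequence model to cover them. Splitting on `.` treats every fragment as a separate molecule, so multi-fragment species such as salts would need extra handling in practice.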

Workshop paper at the Machine Learning for Molecules Workshop at NeurIPS 2020.


Email Address of Submitting Author

ava@zurich.ibm.com

Institution

IBM Research Europe

Country

Switzerland

ORCID For Submitting Author

0000-0001-7554-0288

Declaration of Conflict of Interest

No conflict of interest.
