Unsupervised Attention-Guided Atom-Mapping

Philippe Schwaller; Benjamin Hoover; Jean-Louis Reymond; Hendrik Strobelt; Teodoro Laino

doi:10.26434/chemrxiv.12298559.v1

Theoretical and Computational Chemistry

14 May 2020, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Knowing how atoms rearrange during a chemical transformation is fundamental to numerous applications aiming to accelerate organic synthesis and molecular discovery. Labelling which reactant atom becomes which product atom is known as atom-mapping, and it is an NP-hard problem. Current solutions use a combination of graph-theoretical approaches, heuristics, and rule-based systems. Unfortunately, the existing mappings and algorithms are often prone to errors and quality issues, which limit the effectiveness of supervised approaches. Self-supervised neural networks called Transformers, on the other hand, have recently shown tremendous potential when applied to textual representations of different domain-specific data, such as chemical reactions. Here we demonstrate that attention weights learned by a Transformer, without supervision or human labelling, encode atom rearrangement information between products and reactants. We build a chemically agnostic attention-guided reaction mapper that shows remarkable performance in terms of accuracy and speed, even for strongly imbalanced reactions. Our work suggests that unannotated collections of chemical reactions contain all the relevant information to construct coherent sets of reaction rules. This finding provides the missing link between data-driven and rule-based approaches and will stimulate machine-assisted discovery in the chemical domain.

Code is available at: https://github.com/rxn4chemistry/rxnmapper
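
As an illustration of the idea in the abstract, the following is a minimal sketch of how a product-to-reactant attention matrix could be turned into an atom-mapping. It is not the authors' implementation: the function name `greedy_atom_map`, the same-element constraint, the greedy one-to-one assignment, and the toy attention values are all assumptions made for illustration. The example reaction is the esterification of acetic acid with methanol, written as atom lists in SMILES order.

```python
from typing import Dict, List


def greedy_atom_map(
    product_atoms: List[str],
    reactant_atoms: List[str],
    attention: List[List[float]],
) -> Dict[int, int]:
    """Map each product atom index to one reactant atom index.

    attention[p][r] is a (hypothetical) attention weight from product
    atom p to reactant atom r. Only atoms with the same element symbol
    may be paired, and each reactant atom is used at most once.
    """
    # Collect every element-compatible (weight, product, reactant) pair.
    candidates = [
        (attention[p][r], p, r)
        for p, p_symbol in enumerate(product_atoms)
        for r, r_symbol in enumerate(reactant_atoms)
        if p_symbol == r_symbol
    ]
    # Visit pairs from highest to lowest attention weight.
    candidates.sort(key=lambda c: c[0], reverse=True)

    mapping: Dict[int, int] = {}
    used_reactants = set()
    for _weight, p, r in candidates:
        if p in mapping or r in used_reactants:
            continue  # this product or reactant atom is already assigned
        mapping[p] = r
        used_reactants.add(r)
    return mapping


# Reactants: CC(=O)O . CO   (acetic acid + methanol), atoms in SMILES order
reactant_atoms = ["C", "C", "O", "O", "C", "O"]
# Product:   CC(=O)OC       (methyl acetate)
product_atoms = ["C", "C", "O", "O", "C"]

# Toy attention weights, invented for this example (rows: product atoms).
attention = [
    [0.80, 0.10, 0.02, 0.02, 0.05, 0.01],
    [0.10, 0.75, 0.05, 0.03, 0.05, 0.02],
    [0.02, 0.08, 0.70, 0.15, 0.02, 0.03],
    [0.02, 0.05, 0.08, 0.25, 0.10, 0.50],
    [0.03, 0.04, 0.02, 0.03, 0.78, 0.10],
]

mapping = greedy_atom_map(product_atoms, reactant_atoms, attention)
for p, r in sorted(mapping.items()):
    print(f"product atom {p} ({product_atoms[p]}) <- reactant atom {r}")
```

With these invented weights, the ester oxygen of the product (index 3) is assigned to the methanol oxygen (index 5) rather than to the acid hydroxyl oxygen (index 3), which is left unmapped. An imbalanced reaction would be handled the same way, since reactant atoms that receive no assignment are simply omitted from the mapping.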

Keywords

Organic Synthesis

Organic Chemistry

SMILES-Encoded Molecular Structures

Now Published

Extraction of organic chemistry grammar from unsupervised learning of chemical reactions

Philippe Schwaller, Benjamin Hoover, Jean-Louis Reymond, Hendrik Strobelt, Teodoro Laino

Journal article: Science Advances, Volume 7, Issue 15

Print publication date: Apr 09, 2021

Metrics

Views: 14,035

Downloads: 3,767

License

The content is available under CC BY-NC-ND 4.0.

Author’s competing interest statement

No conflicts of interest
