Img2Mol - Accurate SMILES Recognition from Molecular Graphical Depictions

Djork-Arné Clevert; Tuan Le; Robin Winter; Floriane Montanari

doi:10.26434/chemrxiv.14320907.v1

Theoretical and Computational Chemistry

29 March 2021, Version 1

Working Paper

This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Automatic recognition of the molecular content of a molecule’s graphical depiction is an extremely challenging problem that remains largely unsolved despite decades of research. Recent advances in neural machine translation enable the auto-encoding of molecular structures in a continuous vector space of fixed size (latent representation) with low reconstruction errors. In this paper, we present a fast and accurate model that combines a deep convolutional neural network, which learns from molecule depictions, with a pre-trained decoder that translates the latent representation into the SMILES representation of the molecule. This combination allows a molecular structure to be inferred precisely from an image. Our rigorous evaluation shows that Img2Mol correctly translates up to 88% of the molecular depictions into their SMILES representation. A pre-trained version of Img2Mol is made publicly available on GitHub for non-commercial users.
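The two-stage design described in the abstract, and the kind of accuracy figure it reports, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `recognize` and `exact_match_accuracy`, the type aliases, and the use of plain string equality as the correctness criterion are all assumptions made for illustration. The encoder and decoder are passed in as arbitrary callables standing in for the convolutional network and the pre-trained decoder.

```python
from typing import Callable, Sequence

# Hypothetical type aliases, not taken from the paper or its code.
Image = Sequence[Sequence[float]]   # e.g. a grayscale pixel grid
Latent = Sequence[float]            # fixed-size latent representation


def recognize(
    image: Image,
    encoder: Callable[[Image], Latent],
    decoder: Callable[[Latent], str],
) -> str:
    """Map a depiction to a SMILES string in two stages.

    The encoder stands in for the image network and the decoder for the
    pre-trained latent-to-SMILES translator; both are supplied by the caller.
    """
    latent = encoder(image)
    return decoder(latent)


def exact_match_accuracy(
    predicted: Sequence[str], reference: Sequence[str]
) -> float:
    """Fraction of predictions identical to their reference SMILES.

    Assumption: both lists already hold SMILES in the same canonical form,
    so plain string equality is a meaningful comparison.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)
```

Because one molecule can be written as many different SMILES strings, a real evaluation would first canonicalize both sides with a cheminformatics toolkit (for instance RDKit) before comparing. The sketch leaves that step to the caller.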

Keywords

Image recognition, algorithms and filters

Supplementary materials

Title: img2mol task

Now Published

Img2Mol – accurate SMILES recognition from molecular graphical depictions

Djork-Arné Clevert, Tuan Le, Robin Winter, Floriane Montanari

Journal article

Chemical Science, Volume 12, Issue 42

Online publication date: 2021

Version History

Mar 29, 2021 Version 1

Version Notes

Version 1.0

Metrics

Views: 4,860

Downloads: 1,681

License

The content is available under CC BY-NC-ND 4.0

Author’s competing interest statement

No conflict of interest
