Toward a Comprehensive Treatment of Tautomerism in Chemoinformatics Including in InChI V2

29 November 2019, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We have collected 86 different transforms of tautomeric interconversions. Of these, 54 are for prototropic (non-ring-chain) tautomerism, 21 for ring-chain tautomerism, and 11 for valence tautomerism. The majority of these rules have been extracted from the experimental literature. Twenty rules, covering the most well-known types of tautomerism such as keto-enol tautomerism, were taken from the default handling of tautomerism by the chemoinformatics toolkit CACTVS. The rules were analyzed against nine different databases totaling over 400 million (non-unique) structures with respect to their occurrence rates, their mutual overlap in coverage, and the recapitulation of the rules’ enumerated tautomer sets by InChI V.1.05, both in InChI’s Standard version and in a Non-Standard version with the increased tautomer-handling options 15T and KET turned on. These results and the background of this study are discussed in the context of the IUPAC InChI Project tasked with redesigning the handling of tautomerism for an InChI version 2. Applying the rules presented in this paper would approximately triple the number of compounds in typical small-molecule databases that InChI V2 would treat as affected by tautomeric interconversion. A web tool for testing these rules has been created at https://cactus.nci.nih.gov/tautomerizer.
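As a quick consistency check on the rule counts quoted above, the following minimal Python sketch tallies the three classes and reports each class's share of the total. The dictionary name `RULE_COUNTS` and the helper `summarize` are illustrative names chosen here, not part of the authors' CACTVS scripts; only the three counts come from the abstract.

```python
# Rule counts per tautomerism class, as stated in the abstract.
# The names RULE_COUNTS and summarize are hypothetical, for illustration only.
RULE_COUNTS = {
    "prototropic (non-ring-chain)": 54,
    "ring-chain": 21,
    "valence": 11,
}


def summarize(counts):
    """Return the total rule count and each class's percentage share."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return total, shares


total, shares = summarize(RULE_COUNTS)
print(f"total rules: {total}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The three class counts sum to the 86 transforms reported, with prototropic rules making up roughly five eighths of the set.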

Keywords

Chemoinformatics Approach
Tautomerism
InChI

Supplementary materials

CACTVS Tcl scripts 2019-11-03
S1
S2
S3
S4
S5
Tauto Rules for InChI V2 2019-11-20 Supp Inf
