Accurate and Automated de novo Identification of Molecular Functional Groups Using Deep Learning Architectures

06 May 2019, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

We present a deep learning method that identifies all functional groups of an unknown compound from a combination of FTIR and MS spectra, without using any database, pre-established rules, procedures, or peak-matching methods. We derive patterns and correlations directly from spectral data, treating the identification of multiple functional groups as a multi-class, multi-label problem. For practical usability, we introduce two new metrics, Molecular F1 score and Molecular Perfection rate, which measure how well the model identifies all functional groups on a molecule. Our optimized model has a Molecular F1 score of 0.92 and a Molecular Perfection rate of 72%. Backpropagation through our model reveals IR patterns typically used by human chemists, suggesting “learning” of known spectral features. We show that the introduction of new functional groups does not decrease model performance. Finally, we show redundancy in FTIR and MS data by encoding the combined data in a latent space that retains the accuracy of the original model. Our results reveal the importance of deep learning for rapid identification of functional groups to realize autonomous analytical processes in the future.
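
The abstract names the two per-molecule metrics without spelling out their formulas. As a rough illustration only, the Python sketch below assumes that the Molecular F1 score is the F1 score computed over the functional-group labels of each molecule and then averaged across molecules, and that the Molecular Perfection rate is the fraction of molecules for which every functional-group label is predicted correctly. These definitions, the function names, and the toy labels are assumptions made for illustration; they are not taken from the authors' code or data.

```python
from typing import Sequence

Labels = Sequence[Sequence[int]]  # one row per molecule, one 0/1 entry per functional group


def molecular_f1(y_true: Labels, y_pred: Labels) -> float:
    """Mean per-molecule F1 over functional-group labels (assumed definition)."""
    scores = []
    for truth, pred in zip(y_true, y_pred):
        tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
        fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
        fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
        denom = 2 * tp + fp + fn
        # Convention chosen here: a molecule with no true and no predicted
        # groups counts as a perfect prediction.
        scores.append(1.0 if denom == 0 else 2 * tp / denom)
    return sum(scores) / len(scores)


def molecular_perfection_rate(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of molecules whose full label vector is predicted exactly (assumed definition)."""
    perfect = sum(1 for truth, pred in zip(y_true, y_pred) if list(truth) == list(pred))
    return perfect / len(y_true)


# Hypothetical toy labels: 3 molecules, 4 functional groups each.
Y_TRUE = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
Y_PRED = [[1, 0, 1, 0], [0, 1, 1, 0], [1, 0, 0, 1]]

if __name__ == "__main__":
    print(f"Molecular F1 score: {molecular_f1(Y_TRUE, Y_PRED):.3f}")
    print(f"Molecular Perfection rate: {molecular_perfection_rate(Y_TRUE, Y_PRED):.3f}")
```

Under these assumed definitions, a single wrong label lowers a molecule's F1 only partially but removes it from the perfect count entirely, which would be consistent with the reported Perfection rate (72%) being lower than the reported Molecular F1 score (0.92).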

Keywords

deep learning
spectral data analysis
functional group
machine learning predictions
chemical modeling
chemistry
mathematics
inverse design
information and computing sciences
combining data sets
spectral database
instrumentation
autonomous

Supplementary materials

2019 Fine et al. Deep Learning Spectra (Supporting)
ListingS1
