Improving Deep Neural Network in Predicting Electron Ionization Mass Spectra by Molecular Similarity-wise Sampling

18 April 2023, Version 1
This content is a preprint and has not undergone peer review at the time of posting.

Abstract

Mass spectrometry is a widely used technique for identifying molecules in a variety of applications, including organic synthesis and metabolomics. Recently, a deep neural network model for mass spectrometry has been developed and numerically assessed. However, we confirmed that the model performs poorly for specific targets such as highly fluorinated compounds. To address this, this study introduces a simple dataset undersampling scheme based on molecular similarity. With the model trained on the undersampled dataset, predictive performance improved for fluorinated compounds and was relatively well maintained for non-fluorinated compounds. This improvement is probably attributable to a reduction in bit collisions of extended-connectivity fingerprints (ECFPs). The undersampling approach is general and applicable to any specific target.
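
To make the idea concrete, the following is a minimal Python sketch of similarity-based undersampling, not the authors' implementation. It assumes that fingerprints are represented as sets of on-bit indices, that similarity is measured with the Tanimoto coefficient, and that a training molecule is kept when it is sufficiently similar to at least one target molecule; the abstract does not specify any of these details. The function names, the threshold, and the toy data are hypothetical. The `fold` helper illustrates how a bit collision can arise when hashed substructure identifiers are folded into a fixed-length fingerprint.

```python
from typing import FrozenSet, Iterable, List, Sequence

# Assumption: a fingerprint is the set of indices of its "on" bits.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: |a & b| / |a | b| (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def undersample_by_similarity(
    pool: Sequence[Fingerprint],
    targets: Sequence[Fingerprint],
    threshold: float,
) -> List[int]:
    """Return indices of pool molecules similar to at least one target.

    Hypothetical selection rule: keep a molecule if its Tanimoto
    similarity to any target fingerprint reaches the threshold.
    """
    return [
        i
        for i, fp in enumerate(pool)
        if any(tanimoto(fp, t) >= threshold for t in targets)
    ]


def fold(identifiers: Iterable[int], n_bits: int) -> Fingerprint:
    """Fold hashed substructure identifiers into n_bits positions.

    Two different identifiers that map to the same position collide,
    so the folded fingerprint can no longer tell them apart.
    """
    return frozenset(x % n_bits for x in identifiers)


# Toy data (invented for illustration only).
targets = [frozenset({1, 2, 3, 4})]
pool = [
    frozenset({1, 2, 3, 5}),     # similarity 3/5 to the target
    frozenset({7, 8, 9}),        # similarity 0 to the target
    frozenset({1, 2, 3, 4, 6}),  # similarity 4/5 to the target
]
kept = undersample_by_similarity(pool, targets, threshold=0.5)

# Identifiers 5 and 1029 collide when folded into 1024 bits.
folded = fold([5, 1029, 77], n_bits=1024)
```

In this toy case `kept` holds the indices of the first and third pool entries, and `folded` has two on bits for three input identifiers. In practice, the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.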

Keywords

neural network
undersampling
EI-MS
