Abstract
Raman spectroscopy is a powerful technique for probing molecular vibrations, yet the computational
prediction of Raman spectra remains challenging due to the high cost of quantum chemical methods
and the complexity of structure-spectrum relationships. Here, we introduce Mol2Raman, a deep-learning
framework that predicts spontaneous Raman spectra directly from SMILES representations
of molecules. The model leverages Graph Isomorphism Networks with edge features (GINE) to encode
molecular topology and bond characteristics, enabling accurate prediction of both peak positions and
intensities across diverse chemical structures. Trained on a novel dataset of over 31,000 molecules
with state-of-the-art Density Functional Theory (DFT)-calculated Raman spectra, Mol2Raman outperforms
both fingerprint-based similarity models and Chemprop-based neural networks. It achieves
high fidelity in reproducing spectral features, including for molecules with low structural similarity
to the training set and for enantiomeric inversion. The model offers fast inference times (22
ms per molecule), making it suitable for high-throughput molecular screening. We further deploy
Mol2Raman as an open-access web application, enabling real-time predictions without specialized
hardware. This work establishes a scalable, accurate, and interpretable platform for Raman spectral
prediction, opening new opportunities in molecular design, materials discovery, and spectroscopic
diagnostics.
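To illustrate how a GINE layer combines atom and bond information, the following is a minimal sketch of a single GINE-style aggregation step in plain Python. It is not the Mol2Raman implementation: the function name `gine_aggregate`, the toy ethanol graph, and the two-dimensional feature vectors are all hypothetical choices made for illustration, and the learned MLP that a full GINE layer applies after aggregation is omitted.

```python
# Illustrative sketch of one GINE-style aggregation step (not the Mol2Raman code).
# Pre-MLP update rule: a_i = (1 + eps) * h_i + sum_j ReLU(h_j + e_ji),
# where j runs over the neighbours of atom i and e_ji is the bond feature.

def relu(vec):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, v) for v in vec]

def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def gine_aggregate(node_feats, edges, edge_feats, eps=0.0):
    """Return the pre-MLP GINE aggregate for every node.

    node_feats: list of feature vectors, one per atom
    edges:      list of directed (source, target) index pairs
    edge_feats: list of feature vectors, one per directed edge,
                with the same dimension as the node features
    """
    out = [[(1.0 + eps) * v for v in h] for h in node_feats]
    for (src, dst), e in zip(edges, edge_feats):
        message = relu(add(node_feats[src], e))  # neighbour + bond feature
        out[dst] = add(out[dst], message)
    return out

# Hypothetical toy graph: heavy atoms of ethanol (SMILES "CCO"), C0-C1-O2.
# Made-up features: [is_carbon, is_oxygen] per atom, [is_single_bond, 0] per bond.
node_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]  # each bond listed in both directions
edge_feats = [[1.0, 0.0]] * 4

aggregated = gine_aggregate(node_feats, edges, edge_feats)
```

Because bond features enter every message, two atoms with identical neighbours but different bond types receive different aggregates, which is the property that lets such an encoder reflect bond characteristics as well as topology.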
Supplementary materials
Title
Supplementary Information
Description
This file contains supplementary information, including complementary analyses that are also cited in the main paper.
Supplementary weblinks
Title
Mol2Raman web app
Description
This URL hosts the web app, which provides user-friendly access to Mol2Raman predictions.