Abstract
Fluorescent molecules, or fluorophores, play essential roles in bioimaging. Attachment
of fluorophores to proteins enables observation of the detailed structure and dynamics
of biological reactions occurring in the cell. Effective bioimaging requires fluorophores
with high quantum yields to detect weak signals. In addition, fluorophores with a range
of emission frequencies are needed to extract richer information. Predicting molecular
properties is an essential computational step in discovering novel functional
molecules. Here, we present statistical machines that predict excitation energies and
associated oscillator strengths of a given molecule using a random forest algorithm.
Excitation energies and oscillator strengths are directly related to the emission spectrum
and the quantum yields of fluorophores, respectively. From the feature importance
analysis of our random forest machine, we identified specific molecular substructures
and fragments that determine the oscillator strengths of molecules. This finding is
expected to serve as a new design principle for novel fluorophores.
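
As a rough illustration of how a random forest regressor can yield feature importances, the sketch below grows a small forest of regression trees on synthetic data using only the Python standard library. It is not the implementation used in this work. The binary "fragment" features, the toy target standing in for an oscillator strength, and all function names and hyperparameters are assumptions made for the example.

```python
import random
from statistics import mean

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, rng, importance, depth=0, max_depth=4, min_leaf=3):
    """Grow one regression tree; add each split's SSE reduction to importance."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return mean(y)  # leaf: predict the mean target
    n_features = len(X[0])
    # Random forests consider only a random subset of features at each split.
    candidates = rng.sample(range(n_features), max(1, n_features // 2))
    parent_sse, best = sse(y), None
    for j in candidates:
        for t in sorted({row[j] for row in X}):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent_sse - sse([y[i] for i in left]) - sse([y[i] for i in right])
            if best is None or gain > best[0]:
                best = (gain, j, t, left, right)
    if best is None or best[0] <= 0:
        return mean(y)
    gain, j, t, left, right = best
    importance[j] += gain
    children = [
        build_tree([X[i] for i in part], [y[i] for i in part],
                   rng, importance, depth + 1, max_depth, min_leaf)
        for part in (left, right)
    ]
    return (j, t, children[0], children[1])

def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are floats."""
    while isinstance(node, tuple):
        j, t, low, high = node
        node = low if row[j] <= t else high
    return node

def fit_forest(X, y, n_trees=30, seed=0):
    """Fit trees on bootstrap resamples; return trees and normalized importances."""
    rng = random.Random(seed)
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap sample with replacement
        trees.append(build_tree([X[i] for i in idx], [y[i] for i in idx], rng, importance))
    total = sum(importance) or 1.0
    return trees, [v / total for v in importance]

def predict_forest(trees, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(tree, row) for tree in trees)

# Synthetic data (assumption): four hypothetical substructure indicators, where
# only fragment_A strongly affects the toy target and fragment_C affects it weakly.
data_rng = random.Random(1)
feature_names = ["fragment_A", "fragment_B", "fragment_C", "fragment_D"]
X = [[data_rng.randint(0, 1) for _ in feature_names] for _ in range(200)]
y = [1.0 * row[0] + 0.1 * row[2] + data_rng.gauss(0, 0.05) for row in X]

trees, importances = fit_forest(X, y)
for name, score in sorted(zip(feature_names, importances), key=lambda p: -p[1]):
    print(f"{name}: {score:.3f}")
```

In this toy setting the importance of a feature is the total reduction in squared error from splits on that feature, normalized over all features. A feature that dominates the ranking marks the substructure with the most influence on the predicted property, which is the sense in which such an analysis can suggest design rules.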