A Computationally-Realizable Rigorous Canonical Numbering Algorithm for Chemical Graphs with its Open-Source Implementation in Rust


Canonical numbering of the vertices of a graph has been a challenging open problem for decades, both in graph theory and in cheminformatics applications. This paper presents an efficient and rigorous approach to canonical numbering and symmetry perception as the first workable solution with theoretical completeness. The methodology comprises a set of algorithms, including an extendable vertex representation, high-performance sorting, and graph reduction. The novel vertex representation allows the canonical numbering of vertices to be generated in a short time. Furthermore, a new concept of graph reduction brings the amount of computation needed to determine the constitutional symmetry of complex graphs within the range of hardware capability. An open-source implementation of the algorithms is written in Rust, chosen for the safety, performance, and robust abstractions of this modern programming language. Experimental results on more than 2 million molecules from the ChEMBL database are reported at the end of the paper.
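To make the terms "canonical numbering" and "symmetry perception" concrete, the following is a minimal sketch of classical iterative class refinement (in the spirit of Morgan-style extended connectivity). It is not the algorithm proposed in this paper: the function names `refine_classes` and `rank`, the adjacency-list input, and the use of atomic numbers as initial labels are all assumptions made for illustration. Refinement of this kind is known to be incomplete (for example, it cannot split the vertices of a regular graph with uniform labels), which is the kind of gap a rigorous method has to close.

```rust
/// Rank each key by its position among the sorted distinct keys.
/// (Hypothetical helper, introduced only for this sketch.)
fn rank(keys: Vec<(usize, Vec<usize>)>) -> Vec<usize> {
    let mut sorted: Vec<&(usize, Vec<usize>)> = keys.iter().collect();
    sorted.sort();
    sorted.dedup();
    keys.iter()
        .map(|k| sorted.binary_search(&k).unwrap())
        .collect()
}

/// Refine vertex classes until no class can be split further.
/// `adj[v]` lists the neighbours of vertex `v`; `labels[v]` is an initial
/// invariant (assumed here to be the atomic number).
/// Vertices that end with the same class index are not distinguished by
/// this refinement and are therefore only *candidates* for symmetry.
pub fn refine_classes(adj: &[Vec<usize>], labels: &[u32]) -> Vec<usize> {
    let n = adj.len();
    // Initial classes: rank of the label alone.
    let mut classes = rank(labels.iter().map(|&l| (l as usize, Vec::new())).collect());
    loop {
        // New key of a vertex: its own class plus the sorted classes of its neighbours.
        let keys: Vec<(usize, Vec<usize>)> = (0..n)
            .map(|v| {
                let mut nb: Vec<usize> = adj[v].iter().map(|&u| classes[u]).collect();
                nb.sort_unstable();
                (classes[v], nb)
            })
            .collect();
        let next = rank(keys);
        if next == classes {
            return classes; // stable partition reached
        }
        classes = next;
    }
}
```

For a three-carbon chain, the two terminal atoms end in the same class and the central atom in another; replacing one terminal carbon with oxygen separates all three atoms. A complete canonical numbering additionally requires a rule for breaking the remaining ties, which this sketch deliberately leaves out.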