Sanitize It Yourself: human-based sanitization checker against machine-generated chemical structures

Many computer-aided drug design (CADD) methods using deep learning have recently been proposed to explore the chemical space toward novel scaffolds efficiently. However, there is a tradeoff between the ease of generating novel structures and the chemical feasibility of structural formulas. To overcome the limitations of computational filtering, we have implemented a web application that allows easy compound sanitization by humans. The application is available at https://sanitizer.chemical.space/.


Introduction
Computer-aided drug design (CADD) has become an even more active research field with the rise of deep learning. 1 Designing feasible new compounds requires the cooperation of researchers from backgrounds ranging from organic chemistry to computer science; however, it is not always easy to combine multidisciplinary insights. In recent years, many studies on molecular generative models have been published. Still, some of these studies report only numerical performance evaluations and do not discuss the generated compounds from a chemistry perspective.
In fact, the medicinal chemist Derek Lowe, also known as an influential blogger, posted a critical article about the usefulness of molecular generative models. In the post, he pointed out that even algorithms that claim high performance scores often generate chemically infeasible molecules, and he called such molecules "crazy structures". 2 To identify such inappropriate structures, it is necessary to define and calculate the appropriateness of generated molecules. Some reports attempt to quantify appropriateness in terms of synthetic feasibility, using automatic retrosynthesis tools. 3,4 If the tools find a reasonable synthetic route for a generated compound, the compound is considered synthetically feasible. These approaches can screen millions of generated compounds, but their reliability is controversial. Automatic retrosynthesis tools sometimes give incorrect synthetic routes even for simple molecules, 5 and there is room for further improvement.
Given this situation, human-based sanitization is still necessary to ensure the reliability of computer-generated molecules.
We have been advocating social drug discovery, in which the wisdom of the crowd is put to best use for drug discovery. The power of the crowd in structure-based drug discovery was evaluated in our previous study. 6 In that study, Twitter users were asked to vote on which docking pose seemed the most reasonable. The most-voted pose matched the actual docking pose in 3 out of 3 questions. This result suggests that the majority opinion of chemists can be a useful source of information for drug discovery.
In this paper, we introduce a visualization tool for generated molecules on chemical.space 7 to support human-based sanitization of molecules. We first describe the problems of molecules generated by popular algorithms from the perspective of medicinal chemists, then introduce the new visualization tool, and finally describe future perspectives.

Problematic structures generated by molecular generation algorithms
Early research on generative models for molecules focused mainly on increasing the ratio of valid molecules among all generated ones. RDKit 8 has been used to assess validity, 9 and validity is now regarded as one of the most important benchmarks for evaluating generative models. 10 After numerous efforts to increase the validity ratio, the generative models in the most recent reports achieve very high validity. However, it has been pointed out that benchmark metrics such as validity cannot properly evaluate the generated molecules. Renz and coworkers demonstrated failure modes of generative models and showed that generated molecules can contain unstable, synthetically infeasible, or highly uncommon substructures. 11 Examples of such unwanted molecules are shown in Figure 1. To discard unwanted molecules, various quality filters have been proposed. For example, PAINS and MCF filters are implemented in the MOSES package. 12 Although these filters are useful to some extent, some unwanted molecules remain unfiltered, because which substructures are unwanted depends on each user's situation. Therefore, users must prepare their own custom filters to obtain meaningful generated molecules. Indeed, REINVENT, 13 one of the most cited and widely used generative models, provides a Custom Alerts (CA) component that enables users to define their own unwanted substructures. Preparing custom filters is laborious, however, and tools that support visual inspection are essential for finding unwanted molecules and substructures among generated molecules.
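To make the validity metric discussed above concrete, the following is a minimal sketch of how a validity ratio could be computed. It assumes that "valid" means RDKit can parse the SMILES string (Chem.MolFromSmiles returns a molecule rather than None); the function names rdkit_is_valid and validity_ratio are hypothetical and are not taken from any of the cited benchmarks.

```python
from typing import Callable, Iterable


def rdkit_is_valid(smiles: str) -> bool:
    """Return True if RDKit can parse the SMILES string (assumed validity criterion)."""
    # Imported lazily so that the rest of the sketch runs without RDKit installed.
    from rdkit import Chem

    # Chem.MolFromSmiles returns None when parsing or sanitization fails.
    return Chem.MolFromSmiles(smiles) is not None


def validity_ratio(
    smiles_list: Iterable[str],
    is_valid: Callable[[str], bool] = rdkit_is_valid,
) -> float:
    """Fraction of generated SMILES strings accepted by the validity check."""
    smiles = list(smiles_list)
    if not smiles:
        # No molecules were generated, so report zero instead of dividing by zero.
        return 0.0
    return sum(1 for s in smiles if is_valid(s)) / len(smiles)
```

With RDKit installed, validity_ratio(["CCO", "c1ccccc1", "C1CC"]) counts the third string as invalid because its ring is never closed. A molecule that passes this check is only parseable; as the paragraph above argues, it may still be unstable or synthetically infeasible.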

Molecule sanitization checker
We developed a web-based molecule visualization tool to support molecule sanitization through visual inspection. An organized visualization of the chemical space is necessary for checking computer-generated molecules. The molecular scaffold is one of the important characteristics that medicinal chemists pay attention to when checking molecules. Therefore, we adopted a substructure-based classification of molecules for visualization. The web service is available at https://sanitizer.chemical.space/. Users who want to work with private data can build their own server using the source code of the service.
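The idea of a substructure-based classification can be sketched as an inverted index from substructures to the molecules that contain them. This is an illustration only, not the implementation used by the web service: it assumes that each molecule has already been annotated with a set of substructure labels by some upstream step that is not shown, and the name group_by_substructure and the labels in the test data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def group_by_substructure(mol_substructs: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Invert a molecule -> substructures mapping into substructure -> molecules.

    Groups are returned largest first (ties broken by label), so the most
    common substructures can be shown at the top of a listing.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for mol_id, substructs in mol_substructs.items():
        for label in substructs:
            # A molecule appears in every group whose substructure it contains.
            groups[label].append(mol_id)
    ordered = sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))
    return dict(ordered)
```

Grouping molecules this way lets a reviewer inspect all molecules sharing one substructure side by side and judge the substructure once, instead of assessing each molecule in isolation.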

Case study
The web application was evaluated by medicinal chemists. The goal of this case study was to use the visualization tool to find invalid molecules among those generated in a previous work by one of the authors. 16 According to the users, unwanted molecules could easily be found in the list of generated molecules in the app. Examples of unwanted molecules are shown in Figure 3.
The users highlighted how easy it was to judge whether a selected substructure is useful or unwanted by looking at similar molecules that share a common structural moiety, as grouped by the substructure-based classification function. We will continue to develop the application, reflecting the users' opinions. Currently, our tool supports only automatically generated substructures. Additional functionality, such as a user-defined SMARTS filter, would be useful for specifying unwanted substructures more accurately.
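A user-defined SMARTS filter of the kind mentioned above could look roughly like the following sketch. It is not part of the current tool. The matching relies on RDKit (Chem.MolFromSmiles, Chem.MolFromSmarts, and Mol.HasSubstructMatch); the function names and the two entries in EXAMPLE_ALERTS are hypothetical illustrations, and a real alert list would be chosen by the user.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def rdkit_matches(smiles: str, smarts: str) -> bool:
    """Return True if the molecule contains the substructure described by the SMARTS."""
    # Imported lazily so that the rest of the sketch runs without RDKit installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    pattern = Chem.MolFromSmarts(smarts)
    if pattern is None:
        raise ValueError(f"could not parse SMARTS: {smarts!r}")
    return mol.HasSubstructMatch(pattern)


# Hypothetical alert list for illustration only.
EXAMPLE_ALERTS: Dict[str, str] = {
    "peroxide": "[OX2][OX2]",
    "acyl_halide": "[CX3](=O)[Cl,Br,I]",
}


def split_by_alerts(
    smiles_list: Iterable[str],
    alerts: Dict[str, str],
    matches: Callable[[str, str], bool] = rdkit_matches,
) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split molecules into those with no alert hits and those flagged by name."""
    kept: List[str] = []
    flagged: Dict[str, List[str]] = {}
    for smiles in smiles_list:
        hits = [name for name, smarts in alerts.items() if matches(smiles, smarts)]
        if hits:
            # Record which alerts fired so that a reviewer can see the reason.
            flagged[smiles] = hits
        else:
            kept.append(smiles)
    return kept, flagged
```

With RDKit installed, split_by_alerts(generated_smiles, EXAMPLE_ALERTS) returns the molecules that pass together with the names of the alerts that fired for each flagged molecule. Keeping the alert names, instead of silently dropping molecules, fits the visual-inspection workflow described here.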

Conclusion
In this study, we pointed out problems with current generative models. Generated molecules are very likely to contain some unwanted compounds. Benchmark metrics are not sufficient to prioritize and select compounds appropriately from the generated ones. It would be very useful if molecules with chemically unstable or synthetically infeasible substructures could be captured effectively and automatically. Although there have been attempts to filter out unwanted structures, visual inspection by experts is still necessary. We implemented a web-based visual inspection tool that takes advantage of common substructures found by subgraph mining. This will make it easier for a wide range of people to validate generated molecules. In the future, we will expand this tool for crowd-based drug discovery by implementing voting functions.