New inhibitors of p38 mitogen-activated protein kinase: Repurposing of existing drugs with deep learning

The p38-alpha (MAPK14) is a protein kinase that is implicated in the pathological mechanisms of BAG3 P209L myofibrillar myopathy, cancers, Alzheimer’s disease, and other diseases such as rheumatoid arthritis. Inhibition of p38 has shown promise as a treatment for these diseases. Traditional drug discovery methods have been unable to produce small-molecule inhibitors that are both effective and safe, so we used machine learning to identify potential p38 blockers among existing FDA-approved drugs. Using available bioactivity data, we determined the best existing p38 inhibitors and applied fingerprint clustering to isolate the compounds with similar structures. Descriptors were calculated for these clustered compounds, and the most important of these descriptors were determined through a machine-learning-based feature selection algorithm. These data served as the training set for a deep neural network that was fine-tuned to a 92% validation accuracy. The neural network model was applied to a database of FDA-approved drugs, revealing 149 potential p38 inhibitors, whose efficacy was confirmed by docking simulations to be statistically significantly higher than that of random FDA drugs and slightly higher than that of known inhibitors. Our study not only reveals potential treatments for p38-mediated diseases but also demonstrates the capability of integrating various machine-learning techniques and computational algorithms to predict novel functions of existing pharmaceuticals.

| INTRODUCTION
The p38-alpha, or MAPK14, is a mitogen-activated protein kinase that is activated by dual phosphorylation of a tripeptide motif (Thr-Xaa-Tyr) located in its activation loop. The p38 kinase is regulated by stress-activated MAPK Kinase 3 (MKK3), MKK6, and MKK4. Its signaling cascade is ultimately responsible for the release of proinflammatory cytokines such as TNF, IL1, and IL6. [1][2][3] The p38 kinase has been found to play a large role in the inflammatory mechanisms of rheumatoid arthritis, Alzheimer's disease, Parkinson's disease, Crohn's disease, and several types of myopathies and lung diseases. It has also shown tumor-promoting properties in different types of cancer, including breast, liver, and colorectal cancer. 4,5 The contribution of p38 to the severe phenotype of so many degenerative diseases has made it an attractive target for treatment via molecular inhibitors.
Myofibrillar myopathy (MFM) is another disease in which p38 plays a significant role. including cancers, myopathies, and neurodegeneration. 6 A BAG3 missense mutation of P209 into leucine and several other mutations at the same site have been shown to result in a severe childhood MFM phenotype, characterized by progressive limb and axial muscle weakness, respiratory insufficiency, and cardiomyopathy. Three of 53 random MFM patients showed the P209 mutation. 7,8 Further investigation of the P209L mutation in zebrafish found that the mutation leads to a toxic aggregation of mutated BAG3, ultimately causing a deficiency of functional BAG3 and triggering myofibrillar disintegration. 9 Research on BAG3 P209L mice found significantly increased activation of p38. The mechanism of p38 activation is similar to that seen in Alzheimer's disease and other tauopathies and neurodegenerative diseases, in which oxidative cell stress activates the MAPK signaling pathway, causing an inflammatory response. 8,10,11 In these brain diseases, cellular stress is caused by misfolded amyloid-beta and tau proteins, while in BAG3 P209L MFM, cellular stress results from a lack of functional selective autophagy. 10,12 Targeted activation of p38 in vivo has been shown to induce heart failure and cardiac hypertrophy similar to that seen in the mutant mice. The mutant mice also show increases in inflammatory infiltrates and activation of NF-κB, a prototypical proinflammatory signaling pathway, which is characteristic of p38 activation. 8,[11][12][13] All of this evidence points to targeted inhibition of p38 as an effective treatment for MFM caused by BAG3 mutations. To our knowledge, our research is the first attempt to discover p38 inhibitors as a treatment for a genetic myopathy.
Similar to other protein kinase inhibitors, MAPK14 inhibitors mostly function through competitive inhibition at the ATP-binding site. There are two types of these competitive inhibitors: Type 1, which bind in the active DFG-in conformation (e.g., compound SB 203580), and Type 2, which bind in the inactive DFG-out conformation (e.g., compound Birb 796). 14 The two binding modes differ in the orientation of the DFG motif within the ATP pocket. 15 Many different p38 inhibitors have been, or are currently being, investigated in phase I or II clinical trials, including Birb 796 and VX-745, but none has been recommended for use, mostly because of high toxicity or a lack of significant efficacy. Because existing p38 inhibitors are newly synthesized compounds, their side effects remain unknown until human clinical trials; as a result, candidates that perform well in vitro or even in in vivo animal models often fail at the clinical stage. Development of new drugs is a very long and expensive process, so in this project we propose to repurpose existing FDA-approved drugs to find effective, non-toxic p38 inhibitors with already known side effects. Repurposing FDA-approved drugs significantly expedites the drug discovery process, delivering safer yet effective treatments to patients in a much timelier manner.
There have been a few attempts to use computational techniques to find p38 kinase inhibitors, using structure-based virtual screening or structure-based design of novel inhibitors. [16][17][18][19] However, to our knowledge, deep learning (DL) has not been used to repurpose FDA-approved drugs as p38 inhibitors. Deep learning is a form of machine learning loosely modeled on the human brain and its networks of neurons. It has applications in a variety of fields, such as healthcare, cybersecurity, and even video games, and it has recently emerged as one of the most effective machine-learning techniques in all aspects of drug discovery, from target prediction to synthesis to identification of prognostic biomarkers. 20 Furthermore, data-filtration techniques such as fingerprint clustering and feature selection have both been shown to significantly increase deep-learning accuracy. 21,22 Therefore, in this study we used these data preparation algorithms along with a deep neural network to discover which FDA-approved drugs could also function as p38 inhibitors, revealing 149 candidates.
These predictions were then ranked through ligand docking, a structure-based virtual screening method that shows sufficient accuracy in determining how thermodynamically favorable the binding of a ligand to a protein is. 23 With corroboration from experimental trials, the most promising of these drugs could be used as treatments for BAG3 P209L MFM or other diseases with similar p38-mediated pathological mechanisms.

| METHODS
All research was completed in silico. The programs, tools, and websites used were the

| Clustering
Activity values and structure files for 12 456 compounds tested against the p38 kinase were retrieved from PubChem. 24 To limit the tested compounds to the strongest inhibitors, only those with an activity value (IC50, Kd, Ki) less than 100 nM were considered. These compounds Since clustering is based on the molecule's 3D structure, the structure files input to the software had to include 3D coordinates. We therefore converted the SMILES strings, which are one-dimensional ASCII representations of molecular structure, into a 3D Structure Data File (SDF) format using the CACTUS Structure Files Generator. 26 The SDF files were then imported into the MOE database.
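The potency filter described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the `ActivityRecord` type and its field names are hypothetical, and keeping a compound when at least one of its measurements falls below the cutoff is an assumption, since the text does not say how multiple measurements per compound were handled.

```python
from dataclasses import dataclass

POTENCY_CUTOFF_NM = 100.0  # threshold stated in the text


@dataclass
class ActivityRecord:
    """One bioactivity measurement (hypothetical layout, for illustration only)."""
    compound_id: str   # e.g. a PubChem compound identifier
    measure: str       # "IC50", "Kd", or "Ki"
    value_nm: float    # activity value in nanomolar


def strongest_inhibitors(records, cutoff_nm=POTENCY_CUTOFF_NM):
    """Return IDs of compounds with at least one accepted measurement below the cutoff.

    Assumption: a single qualifying measurement is enough to keep a compound.
    """
    accepted_measures = {"IC50", "Kd", "Ki"}
    kept, seen = [], set()
    for rec in records:
        is_potent = rec.measure in accepted_measures and rec.value_nm < cutoff_nm
        if is_potent and rec.compound_id not in seen:
            seen.add(rec.compound_id)
            kept.append(rec.compound_id)
    return kept
```

The comparison is strict (`<`), matching "less than 100 nM", so a compound measured at exactly 100 nM is excluded.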

| Descriptor Calculation
PaDEL-Descriptor 29 is a descriptor calculator with graphical and command-line interfaces that relies mostly on the Chemistry Development Kit, supplemented by a few additional descriptor categories. It was employed to calculate 1875 one-, two-, and three-dimensional descriptors. These descriptors, which are the data that the deep-learning model analyzes to train and make predictions, were calculated from three-dimensional SDFs for the largest cluster of known inhibitors, for random molecules (control) pulled from PubChem, and for FDA-approved drugs retrieved from DrugBank. 25

| Feature Selection
To narrow down the calculated descriptors to only the most significant ones, we employed attribute selection from WEKA, an open-source machine-learning software package. 28 The descriptors were ranked by the Information Gain Attribute Evaluation (InfoGain) function, a supervised feature-ranking method that measures how informative each descriptor is in determining whether a given molecule is an inhibitor or not. To reduce noise, only the most significant descriptors were passed to the machine-learning algorithm. Figure 1 is a histogram of the most informative descriptor, nAtomP. The vertical bars represent the number of molecules for each value of nAtomP, the length of the longest pi chain in the molecule. Inhibitors (red) tend to have larger values of nAtomP than non-inhibitors (blue). This clear separation between the two classes is what makes the descriptor useful for deep-learning classification.
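To make the ranking criterion concrete, the sketch below computes the information gain of a single numeric descriptor with respect to the inhibitor/non-inhibitor label. It is an illustration under simplifying assumptions rather than WEKA's implementation: the descriptor is split at one caller-supplied threshold, whereas WEKA discretizes numeric attributes internally, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Information gain of splitting a descriptor at `threshold`.

    IG = H(labels) - sum over the two bins of (bin fraction) * H(labels in bin).
    Assumption: one binary split stands in for a full discretization.
    """
    low = [y for x, y in zip(values, labels) if x <= threshold]
    high = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (low, high) if part)
    return entropy(labels) - conditional


def rank_descriptors(table, labels, thresholds):
    """Sort descriptor names by information gain, most informative first."""
    scores = {
        name: information_gain(column, labels, thresholds[name])
        for name, column in table.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A descriptor whose values separate the two classes perfectly reaches the maximum gain of 1 bit for a balanced data set, while a descriptor distributed identically in both classes scores 0.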

FIGURE 1
Histogram of the length of the largest pi chain in the molecule for all molecules in the training set. This descriptor, nAtomP, was ranked as most informative by an information gain algorithm.

| Neural Network Construction
The deep-learning algorithm was created in Python and works by first converting the training data of known inhibitors' and random molecules' most important descriptors into a shuffled

FIGURE 2
A visual representation of the hyperparameter tuning process. The parameters being considered here are batch size, learning rate, and number of epochs. The red lines represent effective combinations of parameters, while the blue lines are ineffective.

FIGURE 3
Neural network visualizations. (a) A snippet of a complex representation of the deep neural network showing how the input data of molecular descriptors connects to the different nodes of the first two hidden layers. 31 (b) This more simplified representation shows how the input data with the 160 descriptors travels through 7 different hidden layers with 336 nodes, a dropout layer (reduces overfitting to the training data by randomly setting input units to 0), and then a final output layer. 32
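The architecture in Figure 3b can be sketched as a plain-Python forward pass. Several details are assumptions because the text does not state them: 336 nodes are placed in each of the 7 hidden layers (the caption could also be read as 336 nodes in total), hidden layers use ReLU, the output uses a sigmoid so that it lies between 0 and 1, the dropout rate is 0.5, and weights are randomly initialized instead of trained. A deep-learning framework would normally replace these loops.

```python
import math
import random


def init_network(n_inputs=160, n_hidden_layers=7, nodes_per_layer=336, seed=0):
    """Create untrained weights for a fully connected binary classifier."""
    rng = random.Random(seed)
    sizes = [n_inputs] + [nodes_per_layer] * n_hidden_layers + [1]
    layers = []
    for fan_in, fan_out in zip(sizes, sizes[1:]):
        limit = math.sqrt(6.0 / (fan_in + fan_out))  # Glorot-style uniform range
        weights = [[rng.uniform(-limit, limit) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers


def dropout(activations, rate, rng):
    """Training-time dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def predict_probability(layers, descriptors, training=False, dropout_rate=0.5, rng=None):
    """Forward pass: hidden layers, dropout before the output (training only), sigmoid."""
    activations = list(descriptors)
    last = len(layers) - 1
    for index, (weights, biases) in enumerate(layers):
        if index == last and training:
            activations = dropout(activations, dropout_rate, rng or random.Random())
        sums = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        if index < last:
            activations = [max(0.0, s) for s in sums]                 # ReLU (assumed)
        else:
            activations = [1.0 / (1.0 + math.exp(-s)) for s in sums]  # sigmoid (assumed)
    return activations[0]
```

Dropout is applied only when `training=True`; at prediction time every unit is kept, so repeated calls on the same descriptors return the same value.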

| RESULTS
To present a comprehensive picture of all important compounds for repurposing, we used two ranking measures: first, the docking scores of the compounds to p38 (Table 2); second, the likelihood that the compounds inhibit p38 according to the deep-learning model (Table 5). We also combined the two rankings in Table 6 to provide a more comprehensive ranking system.
However, the first method, ranking by docking score, is likely the most reliable because it ranks the compounds directly by their simulated interactions with p38, which matter most in determining effective inhibitors.

| Docking
Our conformational search resulted in 8890 conformers of the p38 inhibitors predicted by our neural network and 6616 conformers of already known p38 inhibitors, along with 5473 conformers of random FDA drugs added as negative controls. These conformers were then docked with p38 and given a final GBVI/WSA score. The random control compounds had an average docking score of −5.92 kcal/mol, the known inhibitors −7.30 kcal/mol, and the predicted inhibitors −7.45 kcal/mol (Figure 4). Because the docking results measure binding energy, a more negative value means stronger binding. A t-test (Table 1) confirmed that the predicted inhibitors were statistically significantly better at binding to p38 at the ATP-binding site (and therefore at inhibiting p38's function) than random FDA-approved drugs. The t-test also shows that the performance of the predicted inhibitors is much more consistent, with a variance less than half that of the random control group. The compounds exhibiting the best docking energies are listed in Table 3 and displayed with the p38 binding pocket in Figure 5. Table 3 shows the 10 residues that interacted most frequently with the predicted compounds.
Lys161 appears to be the most reactive residue, interacting through hydrogen bonds with about 80% of the ligands.
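The comparison of docking scores between two groups can be sketched as follows. The text does not state which variant of the t-test was used; because the two groups are reported to have clearly different variances, this sketch assumes Welch's unequal-variance form. The function name is hypothetical, and any scores passed to it in an example are placeholders, not data from this study.

```python
import math
from statistics import mean, variance


def welch_t_test(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: unequal-variance (Welch) t-test. With docking energies,
    a negative t means sample_a binds more strongly on average than sample_b.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    se_a = variance(sample_a) / n_a  # squared standard error of each mean
    se_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se_a + se_b)
    dof = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t_stat, dof
```

The resulting t statistic and degrees of freedom would then be converted to a p-value with a Student's t distribution, for example through a statistics library.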

| Deep-Learning Model
After the deep-learning algorithm was built and fine-tuned, it was run 5 times on a dataset of 2151 FDA-approved molecules with the same 160 descriptors as the training set. Figure 7 shows (Table 3). predictions as a percentage of the total predictions. AUC is a slightly different but nonetheless important metric that demonstrates the model's ability to discriminate between two cases, which in this study are being an inhibitor and being a non-inhibitor. Our model's average AUC of 0.969 means that 96.9% of the time, the model will output a higher probability of inhibition for a randomly selected inhibitor than for a randomly selected non-inhibitor, even if the overall classification is incorrect. Note that Figure 7 and Table 4 do not show the same metrics: Figure 7 shows the metrics during training, while Table 4 shows the metrics after training, when the model is applied to the validation data, a data set randomly set aside from the training data.
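The pairwise reading of AUC given above can be checked directly: the AUC equals the fraction of (inhibitor, non-inhibitor) pairs in which the inhibitor receives the higher score, with ties counted as half. The sketch below is a brute-force illustration with a hypothetical function name; the example scores in the comments are invented for demonstration.

```python
def pairwise_auc(inhibitor_scores, non_inhibitor_scores):
    """AUC as the probability that a random inhibitor outscores a random non-inhibitor."""
    wins = 0.0
    for pos in inhibitor_scores:
        for neg in non_inhibitor_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties count as half a correct ordering
    return wins / (len(inhibitor_scores) * len(non_inhibitor_scores))


# Invented scores: with a 0.5 cutoff, the inhibitor scored 0.4 and the
# non-inhibitor scored 0.7 are both misclassified, yet 8 of the 9 pairs
# are still ordered correctly.
example_auc = pairwise_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

This is why AUC and accuracy can disagree: AUC depends only on how the scores are ordered, not on where the classification cutoff is placed.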
The validation metrics are a more accurate way to evaluate a deep-learning model because they better simulate the model's performance on a separate dataset and confirm that a model is not overfitting to the training data. Table 5 shows the top drug candidates ranked by the deep neural network score. For each compound, the algorithm outputs a number between 0 and 1. The closer the value is to 0, the more likely the compound is a non-inhibitor; the closer the value is to 1, the more likely it is a p38 inhibitor.
Thus, the output value can be interpreted as the probability of inhibition according to the deep-learning model. The compound with the highest probability of inhibition (0.99) is doxorubicin, which also has a strong docking score of −7.91, making it a promising candidate. Table 6 lists the top 100 drug candidates ranked by a combination of their docking score rank and their deep-learning-predicted rank. Although ranking by docking scores (Table 2) is the most reliable, combining these two ranking methods can present a more comprehensive picture of the best candidates. For example, ibrutinib, acalabrutinib, and hesperidin are the top three candidates by this joint ranking measure, with combined ranks of 10, 13, and 15, respectively. The fact that these compounds' strong docking scores are backed by the deep-learning model makes them some of the most promising candidates.
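One way to combine the two rankings is to add each compound's docking rank to its deep-learning rank and sort by the total. The text does not specify how the combined rank in Table 6 was computed, so the rank sum used here is an assumption, and the function name and compound labels in any example are hypothetical.

```python
def combined_ranking(docking_scores, model_probabilities):
    """Rank compounds by the sum of their docking rank and model rank (lower is better).

    Assumption: the combined measure is a simple rank sum.
    docking_scores: compound -> docking energy (more negative = better).
    model_probabilities: compound -> predicted probability of inhibition.
    """
    compounds = sorted(docking_scores.keys() & model_probabilities.keys())
    by_docking = sorted(compounds, key=lambda c: docking_scores[c])
    by_model = sorted(compounds, key=lambda c: model_probabilities[c], reverse=True)
    docking_rank = {c: i + 1 for i, c in enumerate(by_docking)}
    model_rank = {c: i + 1 for i, c in enumerate(by_model)}
    combined = {c: docking_rank[c] + model_rank[c] for c in compounds}
    # Ties in the combined rank are broken alphabetically for reproducibility.
    return sorted(combined.items(), key=lambda item: (item[1], item[0]))
```

A rank sum rewards compounds that score well on both measures: a compound ranked first by docking but near the bottom by the model falls below one that is moderately strong on both.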

| DISCUSSION
We identified 149 potential p38 inhibitors that can be tested through in vitro and in vivo experimental trials. With an average validation accuracy of 92% and an area under the ROC curve of 0.97, the deep-learning model has shown significant efficacy in predicting the ability of a compound to inhibit p38. Furthermore, protein docking scores indicate that our predicted inhibitors bind statistically significantly better (Table 1) than random FDA-approved molecules, and even slightly better than the top known p38 inhibitors. Following experimental trials, these compounds could be used as treatments for various p38-mediated diseases, including not just MFM, but also cancers and inflammatory diseases such as rheumatoid arthritis and Alzheimer's disease. Furthermore, the procedure described in this study can be applied to repurpose existing drugs as protein-inhibiting or protein-activating treatments for a wide variety of other diseases.
One significant difference between our project and other deep-learning drug discovery projects is its application to FDA-approved drugs. The repurposing of FDA-approved drugs has many practical benefits, especially for p38 inhibitors. Many of the most common and potent p38 inhibitors do well in in vitro experiments but fall short during in vivo or human trials because of unexpected side effects or toxicity (Hammaker & Firestein, 2010). When approved drugs are repurposed, however, the safety of the compound has already been studied extensively by the FDA and often by other organizations such as the European Medicines Agency (EMA). 33 This not only ensures that a drug is safe and that its side effects are well documented, but also significantly cheapens and shortens the researcher-to-patient pipeline, which normally takes an average of 10-15 years, because doctors can provide off-label prescriptions before official approval. 34 Thus, if clinical trials go well, these inhibitors could very soon be available to patients suffering from potentially life-threatening p38-mediated diseases.
Traditional drug research and development is a long and tedious process that has become increasingly inefficient relative to the amount of money invested. In fact, "the number of new drugs approved per billion US dollars spent on R&D has halved roughly every 9 years since 1950." 35 Although in silico research cannot replace empirical trials, it allows us to identify promising drug candidates efficiently and cheaply and so expedite the creation of new treatments, especially when multiple computational procedures are used in tandem. As new technologies emerge and the amount of accessible drug data continues to grow, machine learning and other computational algorithms will continue to improve in efficacy and allow for unprecedented advancements in the pharmaceutical field.

ACKNOWLEDGMENTS
We thank CCG (Montreal) for their support.

AUTHOR CONTRIBUTIONS
IT and VK proposed the research described in the article, outlined the possible tools and discussed the strategies, results, and conclusions. AV developed a program for computation, trained the machine learning model and investigated the results of machine learning selection.