Abstract
Renal secretion plays an important role in drug excretion by the kidney. Two major transporter systems involved in renal secretion are MATE1/2-K and OCT2, the former of which is closely associated with drug-drug interactions. Among published in silico models for MATE inhibitors, a previous model achieved an ROC-AUC of 0.78 using high-throughput percentage inhibition data [J Med Chem. 2013;56(3): 781–795], which we aimed to improve upon here using a combined fingerprint- and physics-based approach. To this end, we collected 225 publicly available compounds with pIC50 values against MATE1. On the one hand, we applied a physics-based approach using an AlphaFold protein structure, from which we obtained MM-GB/SA scores for those compounds. On the other hand, we built Random Forest (RF) and Message Passing Neural Network (MPNN) models using Extended-Connectivity Fingerprints with diameter 4 (ECFP4) and chemical structures as graphs, respectively, with MM-GB/SA scores as additional input variables. In five-fold cross-validation with a separate hold-out test set, the best predictivity on the hold-out set was observed for the RF model (using ECFP4 and MM-GB/SA data), with an ROC-AUC of 0.833±0.036, whereas the MM-GB/SA regression model reached 0.742. However, the MM-GB/SA model extrapolated better to novel chemical space. Additionally, Structural Interaction Fingerprint analysis showed that the residues interacting with inhibitors (such as Gln49, Trp274, Tyr277, Tyr299, Ile303 and Tyr306) were identical to those interacting with non-inhibitors, including substrates. These similar binding modes are consistent with the similar IC50 values observed experimentally for an inhibitor when different substrates are used, and in practice this relieves experimental scientists of the need to select a particular substrate for MATE1 assays.
Hence, we were able to build highly predictive classification models for MATE1 inhibitory activity with both ECFP4 and MM-GB/SA scores as input features, which are fit for purpose in the drug discovery process.
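As a minimal illustration of the ROC-AUC metric used to compare the models, the sketch below computes it as the probability that a randomly chosen inhibitor is ranked above a randomly chosen non-inhibitor. It is not the pipeline used in this work: the function name `roc_auc`, the toy labels and the toy MM-GB/SA values are hypothetical, and the sign convention (more negative MM-GB/SA score taken as stronger predicted binding, so scores are negated before ranking) is an assumption of this example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC via pairwise comparison of positives and negatives.

    labels: 1 for inhibitor, 0 for non-inhibitor.
    scores: higher value means "more likely an inhibitor".
    Tied positive/negative pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from this study): class labels and
# MM-GB/SA scores in kcal/mol for five compounds.
toy_labels = [1, 1, 0, 0, 1]
toy_mmgbsa = [-62.1, -48.3, -50.2, -35.7, -55.0]

# Assumption: a more negative score means stronger predicted binding,
# so the score is negated to obtain a "higher is more active" ranking.
toy_auc = roc_auc(toy_labels, [-s for s in toy_mmgbsa])
print(f"toy ROC-AUC: {toy_auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of inhibitors from non-inhibitors, which is the scale on which the values of 0.78, 0.742 and 0.833 quoted above should be read.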