Abstract
A computational approach for predicting the metabolites of tobacco-specific nitrosamines (TSNAs) produced by cytochrome P450s (CYPs) has been developed. It currently predicts all of the known CYP2A13 metabolites of nicotine-derived nitrosamine ketone (NNK), N-nitrosonornicotine (NNN), and 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanol (NNAL) that result from hydroxylations and heteroatom oxidations reported in the metabolomics literature. The approach integrates 1) machine learning models, trained on quantum-mechanically derived molecular surface properties for a set of CYP substrates with known metabolites, that identify sites of metabolism across CYP isoforms and 2) validation of the machine learning predictions by ensemble docking of the TSNA parent molecules into the CYP2A13 binding site to identify the most likely reactive atoms of each TSNA. The method is generalizable to any CYP isoform for which structural information is available, opening the door to P450-based metabolite prediction as well as the prediction and rationalization of metabolomics data.