Computational Prediction of Metabolic alpha-Carbon Hydroxylation Potential of N-Nitrosamines: Overcoming Data Limitations for Carcinogenicity Assessment


The recent withdrawal of several drugs from the market due to N-nitrosamine impurities has highlighted the need for computational approaches to assess the carcinogenicity risk of these impurities. However, current approaches are limited because robust animal carcinogenicity data are available for only a few simple nitrosamines, which do not represent the structural diversity of the many possible nitrosamine drug substance-related impurities (NDSRIs). In this paper, we present a novel method that uses data on CYP-mediated metabolic hydroxylation of CH2 groups in non-nitrosamine xenobiotics to identify structural features that may also help predict the likelihood of metabolic alpha-carbon hydroxylation in N-nitrosamines. Our approach offers a new avenue for tapping into potentially large experimental datasets on xenobiotic metabolism to improve the risk assessment of nitrosamines. Alpha-carbon hydroxylation is believed to be the key rate-limiting step in the metabolic activation of nitrosamines, so identifying structural features that influence this process may be valuable in evaluating their carcinogenic potential. This is particularly significant because information on the factors that influence the metabolic activation of NDSRIs is practically non-existent. We identified hundreds of structural features that either promote or hinder hydroxylation, in contrast to the very few that have been identified so far from the small nitrosamine carcinogenicity dataset. While alpha-carbon hydroxylation prediction alone is insufficient for forecasting carcinogenic potency, the identified features can help in selecting relevant structural analogs for read-across studies and can assist domain experts who, after considering other factors such as the reactivity of the resulting electrophilic diazonium species, establish acceptable intake limits for nitrosamine impurities.


Supplementary material

SI_1: Datasets used
The datasets used in this study: xenobiotic substrate-metabolite pairs, small carcinogenic nitrosamines, small nitrosamines with Lhasa TD50 values, and the list of potential NDSRIs.
SI_2: Molecular fragments and their regression coefficients
Molecular fragments and their regression coefficients from the model for predicting CH2 hydroxylation in xenobiotics.
SI_3: t-SNE coordinates
t-SNE coordinates, SMILES, predicted α-CH2 hydroxylation probabilities, and clustering information for the acyclic and cyclic nitrosamine motifs.