RT-Transformer: Retention Time Prediction for Metabolite Annotation to Assist in Metabolite Identification



Predicting liquid chromatography retention times (RTs) can assist in metabolite identification, a critical and challenging task in non-targeted metabolomics. However, different chromatographic methods (CMs) may yield different RTs for the same metabolite, and current RT prediction methods lack the scalability to transfer from one chromatographic method to another. Therefore, we present RT-Transformer, a novel deep neural network model that couples a 1D-Transformer with a graph attention network (GAT) and can predict RTs under any chromatographic method. First, we obtain a pre-trained model by training RT-Transformer on the large small molecule retention time (SMRT) dataset containing 80038 molecules, and then adapt the resulting model to different chromatographic methods through transfer learning. When tested on the METLIN dataset as in previous studies, the mean absolute error reached 27.3 after removing samples with retention times shorter than five minutes, and 33.5 when no samples were removed. The pre-trained RT-Transformer was further transferred to 5 datasets corresponding to different chromatographic conditions and fine-tuned. According to the experimental results, RT-Transformer achieves competitive performance compared to state-of-the-art methods. In addition, RT-Transformer was applied to 30 external molecular RT datasets. Extensive evaluations indicate that RT-Transformer has excellent scalability in predicting RTs for liquid chromatography and improves the accuracy of metabolite identification.
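The two error figures above differ only in which samples enter the evaluation. A minimal sketch of that calculation follows, assuming RTs are expressed in seconds so that the five-minute cutoff is 300. The function names, the cutoff constant, and the toy values are hypothetical illustrations, not the authors' code or data.

```python
from typing import Sequence

# Assumption: RTs are in seconds, so the five-minute cutoff is 300.0.
MIN_RETAINED_RT = 300.0


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean of |observed RT - predicted RT| over all samples."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mae_retained_only(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    min_rt: float = MIN_RETAINED_RT,
) -> float:
    """MAE after dropping samples whose observed RT is shorter than min_rt."""
    kept = [(t, p) for t, p in zip(y_true, y_pred) if t >= min_rt]
    if not kept:
        raise ValueError("no samples remain after filtering")
    return mean_absolute_error([t for t, _ in kept], [p for _, p in kept])


# Made-up toy values, for illustration only.
observed = [60.0, 400.0, 700.0, 900.0]
predicted = [200.0, 420.0, 690.0, 930.0]

mae_all = mean_absolute_error(observed, predicted)
mae_retained = mae_retained_only(observed, predicted)
```

In this toy case the single early-eluting sample carries the largest error, so filtering it out lowers the reported error. This is the same direction as the gap between the two figures reported above.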