
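Research progress and application of retention time prediction method based on deep learning
Cite this article: DU Zhuokun, SHAO Wei, QIN Weijie. Research progress and application of retention time prediction method based on deep learning[J]. Chinese Journal of Chromatography (色谱), 2021, 39(3): 211-218.
Authors: DU Zhuokun  SHAO Wei  QIN Weijie
Affiliations: 1. School of Basic Medicine, Anhui Medical University, Hefei, Anhui 230032; 2. State Key Laboratory of Proteomics, Beijing Proteome Research Center, Institute of Lifeomics, Academy of Military Medical Sciences, Academy of Military Sciences, Beijing 102206
Funding: National Key Research and Development Program of China (2017YFA0505002); National Key Research and Development Program of China (2018YFC0910302); National Key Research and Development Program of China (2016YFA0501403)
Abstract (translated from the Chinese): In liquid chromatography-mass spectrometry-based proteomics, the retention time of a peptide is a characteristic parameter that effectively distinguishes different peptides, and it can be predicted from information such as the peptide's own sequence. Using predicted retention times to assist the identification of peptide sequences from mass spectrometry data can improve identification accuracy, so retention time prediction has long received wide attention in the field. Traditional retention time prediction methods usually compute the physicochemical properties of a peptide from its amino acid sequence and then calculate...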

Keywords (translated from the Chinese): liquid chromatography-tandem mass spectrometry  retention time  deep learning  proteome
Received: 2020-08-20

Research progress and application of retention time prediction method based on deep learning
DU Zhuokun,SHAO Wei,QIN Weijie.Research progress and application of retention time prediction method based on deep learning[J].Chinese Journal of Chromatography,2021,39(3):211-218.
Authors:DU Zhuokun  SHAO Wei  QIN Weijie
Institution:1. School of Basic Medicine, Anhui Medical University, Hefei 230032, China; 2. State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Lifeomics, Beijing 102206, China
Abstract:In the “shotgun” proteomics strategy, the proteome is characterized by analyzing tryptic peptides using liquid chromatography-mass spectrometry. In this strategy, the retention time of a peptide in liquid chromatography separation can be predicted from the peptide sequence, which makes it a useful feature for peptide identification. Retention time prediction has therefore attracted much research attention. Traditional methods calculate the physical and chemical properties of peptides from their amino acid sequences to obtain the retention time under specific chromatography conditions; however, these methods cannot be directly applied to other chromatography conditions, nor can they be used across laboratories or instrument platforms. To solve this problem, deep learning has been introduced into proteomics research for retention time prediction in recent years. Deep learning is a machine-learning method with a strong capability to learn complex relationships from large-scale data. By stacking multiple hidden layers, deep neural networks can ingest raw data without manually designed features. Transfer learning is an important method in deep learning. It improves the learning of a new task by transferring knowledge from a related task that has already been learned. Transfer learning allows models trained on large datasets to be used across conditions by fine-tuning on smaller datasets, instead of retraining the whole model. Many deep learning-based retention time prediction methods have been developed. During model training, peptide sequences are encoded to represent peptide information. Deep learning then learns the relationship between the characteristics of the peptides and their corresponding retention times without the need for manual input of the physical and chemical properties of the peptides.
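To make the sequence-encoding step concrete, the following is a minimal sketch of one common scheme, one-hot encoding with zero padding. The abstract does not say which encoding the reviewed methods use, so the scheme, the function name `one_hot_encode`, and the `max_len` parameter are all assumptions made for illustration; modified residues are not handled.

```python
# Illustrative only: the encoding scheme and all names here are assumptions,
# not the method of any specific tool discussed in the review.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(peptide, max_len=30):
    """Encode a peptide as a max_len x 20 matrix (list of lists).

    Row i has a single 1 in the column of the i-th residue.
    Rows beyond the end of the peptide remain all-zero (padding).
    """
    if len(peptide) > max_len:
        raise ValueError("peptide is longer than max_len")
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, aa in enumerate(peptide):
        if aa not in AA_INDEX:
            raise ValueError(f"unsupported residue: {aa!r}")
        matrix[pos][AA_INDEX[aa]] = 1
    return matrix


if __name__ == "__main__":
    encoded = one_hot_encode("PEPTIDEK", max_len=10)
    # Number of rows, and number of non-padding rows
    print(len(encoded), sum(sum(row) for row in encoded))
```

A fixed-size matrix like this can be fed to a neural network directly, which is what allows the model to work from the raw sequence instead of hand-computed physicochemical descriptors.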
Compared with traditional methods, deep learning methods have higher accuracy and can be readily applied to different chromatography conditions through transfer learning. If there is not enough data to train a new model, a model trained on other datasets can be used instead after calibration with a small dataset obtained under the target chromatography conditions. While the retention times of modified peptides can also be predicted, the predictions are inadequate for complex modifications such as glycosylation, and this is one of the main problems to be solved. Predicted retention times can be used to control the quality of peptide identification. When prediction accuracy is high, the predicted retention times can be treated as proxies for the actual retention times. Therefore, the difference between predicted and observed retention times can serve as an effective and unbiased quantitative metric for evaluating the quality of peptide-spectrum matches (PSMs) reported by different peptide identification methods. Combined with fragment ion intensity prediction, retention time prediction is also used to generate spectral libraries for data-independent acquisition (DIA)-based mass spectrometry analysis. Generally, DIA methods identify peptides using specific spectral libraries obtained from data-dependent acquisition (DDA) experiments. As a result, only peptides detected in the DDA experiments can be present in the libraries and detected in DIA. Furthermore, building libraries from DDA experiments takes considerable time and effort, and such libraries typically cannot be transferred across different laboratories or instrument platforms. In contrast, pseudo spectral libraries generated from predicted retention times and fragment ion intensities can overcome these shortcomings. They provide theoretical spectra of all possible peptides without the need for DDA experiments.
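The use of the predicted-versus-observed retention time difference as a PSM quality metric can be sketched as follows. This is a simplified illustration under stated assumptions: a straight-line least-squares fit stands in for the calibration step (it is not the fine-tuning of a neural network that transfer learning refers to), and the function names and the tolerance value are hypothetical.

```python
# Illustrative only: linear calibration is a simplifying assumption, and
# all function names and the tolerance are hypothetical.

def fit_linear_calibration(predicted, observed):
    """Least-squares fit of observed = slope * predicted + intercept."""
    n = len(predicted)
    if n < 2 or n != len(observed):
        raise ValueError("need two or more paired retention times")
    mean_x = sum(predicted) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predicted)
    if sxx == 0:
        raise ValueError("predicted retention times must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predicted, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def delta_rt(predicted, observed, slope, intercept):
    """Observed minus calibrated predicted retention time, per PSM."""
    return [obs - (slope * pred + intercept)
            for pred, obs in zip(predicted, observed)]


def flag_suspect_psms(deltas, tolerance):
    """True where the absolute deviation exceeds the tolerance."""
    return [abs(d) > tolerance for d in deltas]


if __name__ == "__main__":
    # Small calibration set (made-up numbers, in minutes)
    slope, intercept = fit_linear_calibration([10.0, 20.0, 30.0],
                                              [25.0, 45.0, 65.0])
    deltas = delta_rt([15.0, 25.0], [35.0, 60.0], slope, intercept)
    print(deltas, flag_suspect_psms(deltas, tolerance=2.0))
```

In practice the tolerance would be chosen from the spread of deviations among confidently identified peptides, and the distribution of deviations, not a single cutoff, is what makes the metric useful for comparing identification methods.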
This paper reviews the research progress of deep learning methods for retention time prediction and their related applications, in order to provide a reference for retention time prediction and protein identification. The future development directions and application trends of deep learning-based retention time prediction methods are also discussed.
Keywords:liquid chromatography-tandem mass spectrometry(LC-MS/MS)  retention time  deep learning  proteomics  