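Predicting domain/peptide recognition and interaction by Gaussian process modeling
Cite this article: REN YanRong, CHEN ShaoCheng, ZOU XiaoChuan, TIAN FeiFei, ZHOU Peng. Predicting domain/peptide recognition and interaction by Gaussian process modeling[J]. Scientia Sinica Chimica, 2012, 0(8): 1179-1189
Authors: REN YanRong  CHEN ShaoCheng  ZOU XiaoChuan  TIAN FeiFei  ZHOU Peng
Affiliations: [1] Department of Biological and Chemical Engineering, Chongqing University of Education, Chongqing 400067 [2] School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031 [3] Center of Bioinformatics, Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China, Chengdu 610054
Funding: This work was supported by the National High Technology Research and Development Program of China (863 Program, 2006AA02231) and a project of the Chongqing Municipal Education Commission (KJl01507), which the authors gratefully acknowledge.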
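Abstract: Protein-protein interactions in cell signaling networks are often realized through coupled binding and folding: a rigid peptide-recognition domain of one protein recognizes and binds a flexible oligopeptide segment on the surface of the partner protein, thereby mediating the interaction between the parent proteins. In-depth analysis of the physicochemical properties of domain/peptide recognition and interaction, together with accurate prediction of the binding behavior, can effectively reveal the molecular basis of cell signal transduction. This study applies a new nonlinear machine learning method, the Gaussian process (GP), to predict and analyze the affinity values and sequence-structure features of several thousand samples from four classes of domain/peptide systems, and compares it systematically with conventional partial least squares (PLS) regression and support vector machine (SVM) techniques. The results show that GP modeling performs no worse than the widely used SVM and significantly better than classical PLS. In addition, GP handles mixed linear and nonlinear problems well, determines the model structure automatically, accounts for the noise in the system and the contribution of each variable through its hyperparameters, and gives a confidence estimate for each prediction; conventional methods lack these features. GP can therefore be regarded as a machine learning strategy with development potential, useful not only for analyzing domain/peptide recognition and interaction but also for addressing other biology-related problems.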

Keywords: Gaussian process  statistical modeling  machine learning  domain/peptide interaction

Use of Gaussian process to model and predict domain-peptide recognition and interaction
REN YanRong,CHEN ShaoCheng,ZOU XiaoChuan,TIAN FeiFei,ZHOU Peng. Use of Gaussian process to model and predict domain-peptide recognition and interaction[J]. Scientia Sinica Chimica, 2012, 0(8): 1179-1189
Authors:REN YanRong  CHEN ShaoCheng  ZOU XiaoChuan  TIAN FeiFei  ZHOU Peng
Affiliation:1 Department of Biological and Chemical Engineering, Chongqing University of Education, Chongqing 400067, China 2 School of Life Science and Engineering, Southwest Jiaotong University, Chengdu 610031, China 3 Center of Bioinformatics (COBI), Key Laboratory for Neuroinformation of Ministry of Education, University of Electronic Science and Technology of China (UESTC), Chengdu 610054, China
Abstract:Many protein-protein interactions in cell signaling networks proceed in a so-called folding-on-binding manner, mediated by the binding of a globular domain in one protein to a short peptide stretch in another. Systematic analysis and reliable prediction of domain-peptide recognition and interaction are therefore fundamentally important for understanding the molecular mechanism and biological implications underlying cell signaling. Herein, we report the use of a new and powerful machine learning technique, the Gaussian process (GP), to carry out statistical modeling and structural analysis for four categories of domain-peptide systems: the SH3, PDZ, 14-3-3, and GYF domains. The modeling results are compared systematically with those derived from classical partial least squares (PLS) regression and the sophisticated support vector machine (SVM). We demonstrate that GP is comparable with or even better than nonlinear SVM, and is much better than linear PLS. In addition, GP has further merits: it handles hybrid linear and nonlinear problems effectively, determines algorithmic parameters automatically, yields models that can be interpreted straightforwardly, and provides an additional quantitative evaluation of its predictions. Taken together, these results suggest that GP is a promising tool not only for exploring domain-peptide interaction behavior but also for solving other chemistry- and biology-related problems.
Keywords:Gaussian process   statistical modeling   machine learning   domain-peptide interaction
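
The abstract credits GP with two properties that are easy to see in a small example: each prediction comes with a quantitative uncertainty, and the fitted hyperparameters (a noise level and per-variable weights) can be read directly. The following is a minimal sketch of standard GP regression, not the authors' implementation. The squared-exponential kernel with one length scale per variable, the toy two-descriptor data, and all function names are assumptions made for illustration. The hyperparameters are fixed by hand here, whereas in practice they are usually tuned by maximizing the marginal likelihood.

```python
import math

def ard_se_kernel(x, y, signal_var, length_scales):
    """Squared-exponential kernel with one length scale per input variable.
    A large length scale means the variable contributes little to the model."""
    sq = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return signal_var * math.exp(-0.5 * sq)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L z = b."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_lower_transposed(L, z):
    """Back substitution: solve L^T x = z."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(X_train, y_train, x_new, signal_var, length_scales, noise_var):
    """Posterior mean and variance of the latent function at x_new."""
    n = len(X_train)
    # Training covariance, with the noise variance added on the diagonal.
    K = [[ard_se_kernel(X_train[i], X_train[j], signal_var, length_scales)
          + (noise_var if i == j else 0.0) for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))
    k_star = [ard_se_kernel(x, x_new, signal_var, length_scales) for x in X_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = signal_var - sum(t * t for t in v)
    return mean, max(var, 0.0)  # clip tiny negative values from round-off

# Hypothetical toy data: two descriptors per sample; only the first matters.
X_train = [[0.0, 5.0], [1.0, -3.0], [2.0, 1.0], [3.0, 0.0]]
y_train = [math.sin(x[0]) for x in X_train]
params = dict(signal_var=1.0, length_scales=[1.0, 100.0], noise_var=1e-4)

mean_near, var_near = gp_predict(X_train, y_train, [1.0, -3.0], **params)
mean_far, var_far = gp_predict(X_train, y_train, [10.0, 0.0], **params)
print("near training data:", mean_near, var_near)
print("far from training data:", mean_far, var_far)
```

The predictive variance is small at a query that coincides with a training sample and approaches the prior variance far from the data, which is the kind of per-prediction confidence measure the abstract refers to. The linear algebra is written out in plain Python only to keep the sketch free of dependencies; a numerical library would normally be used instead.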
This article is indexed in VIP (维普) and other databases.