Laplacian normalization and random walk on heterogeneous networks for disease-gene prioritization
Affiliation: 1. Hunan Key Laboratory for Computation and Simulation in Science and Engineering and Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, Hunan 411105, China; 2. School of Mathematical Sciences, Queensland University of Technology, GPO Box 2434, Brisbane Q4001, Australia; 3. Advanced Analytics Institute & Centre for Health Technologies, University of Technology Sydney, Broadway, NSW 2007, Australia
Abstract: Random walk on heterogeneous networks is a recently emerging approach to effective disease-gene prioritization. Laplacian normalization is a technique for normalizing the edge weights of a network. We use this technique to normalize the gene matrix and the phenotype matrix before constructing the heterogeneous network, and we use the same idea to define the transition matrices of the heterogeneous network. Our method performs remarkably better than existing state-of-the-art methods at recovering known gene–phenotype relationships. The Shannon information entropy of the distribution of transition probabilities in our networks is found to be smaller than that of the networks constructed by existing methods, implying that a higher number of top-ranked genes can be verified as disease genes. In fact, in many cases the most probable gene–phenotype relationships, ranked within the top 3 or top 5 of our gene lists, can be confirmed by the OMIM database. All MATLAB code is available upon email request.
Keywords: Random walk with restart; Laplacian normalization; Disease genes and phenotypes; Heterogeneous network; Leave-one-out cross-validation; Shannon information entropy
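
The three ingredients named in the abstract (Laplacian normalization, random walk with restart, and Shannon entropy) can be illustrated on a toy example. The following is a minimal Python sketch, not the authors' MATLAB code. It assumes the common symmetric form of Laplacian normalization, S[i][j] = W[i][j] / sqrt(d_i * d_j) with d_i the i-th row sum, and it runs on a single small network rather than the gene–phenotype heterogeneous network. The function names, the toy weights, and the restart value 0.7 are all hypothetical choices made for illustration.

```python
import math


def laplacian_normalize(W):
    """Assumed symmetric form: S[i][j] = W[i][j] / sqrt(d_i * d_j)."""
    deg = [sum(row) for row in W]
    n = len(W)
    return [
        [
            W[i][j] / math.sqrt(deg[i] * deg[j])
            if deg[i] > 0 and deg[j] > 0
            else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


def random_walk_with_restart(S, p0, restart=0.7, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * S^T p + restart * p0 until the L1 change < tol."""
    n = len(p0)
    p = list(p0)
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(S[j][i] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


def shannon_entropy(values):
    """Entropy in bits of the values after rescaling them to sum to 1."""
    total = sum(values)
    if total <= 0:
        return 0.0
    return -sum((v / total) * math.log2(v / total) for v in values if v > 0)


# Toy network (hypothetical): a path 0 - 1 - 2 - 3 with unit edge weights.
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
S = laplacian_normalize(W)
seed = [1.0, 0.0, 0.0, 0.0]  # the walk restarts at node 0
scores = random_walk_with_restart(S, seed)
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
row_entropies = [shannon_entropy(row) for row in S]
```

Here `ranking` orders the nodes by their steady-state score relative to the seed, which is the role the ranked gene list plays in the abstract, and `row_entropies` measures how evenly each node spreads its normalized weight over its neighbours. Note that a symmetrically normalized matrix is not row-stochastic, so the scores are relevance values and need not sum to 1.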
This article is indexed in ScienceDirect and other databases.