

An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data
Institution: 1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; 2. School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; 3. Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29208, USA; 1. School of Information Engineering, East China Jiaotong University, Nanchang, China; 2. College of Computer Science and Electronic Engineering, Hunan University, Changsha, China; 3. College of Information Science and Engineering, Hunan Normal University, Changsha, China; 4. School of Information Science and Engineering, Shandong Normal University, Jinan, China; 1. School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong 518055, China; 2. School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; 3. Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China; 1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410082, China; 2. College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan 410003, China; 1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China; 2. Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; 3. College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
Abstract: To explore the pathogenic mechanisms of microRNA (miRNA) in diverse diseases, many researchers have concentrated on discovering potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated uncorrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised Kmeans (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Keywords: Semi-supervised Kmeans (SS-Kmeans); Random vector functional link (RVFL); Subagging; Ensemble learning; MiRNA-disease association
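
The two-step pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract does not specify SS-Kmeans, so the sketch substitutes a much simpler heuristic (keep the unlabeled samples farthest from the positive centroid) as a stand-in for reliable-negative extraction. The function and class names (`reliable_negatives`, `RVFL`, `subagging_fit`, `ensemble_score`), the hyperparameters, and the synthetic feature vectors are all assumptions made for illustration.

```python
import numpy as np

def reliable_negatives(X_pos, X_unl, n_neg):
    """Stand-in for SS-Kmeans (assumption): keep the n_neg unlabeled
    samples that lie farthest from the centroid of the known positives."""
    centroid = X_pos.mean(axis=0)
    dist = np.linalg.norm(X_unl - centroid, axis=1)
    return X_unl[np.argsort(dist)[-n_neg:]]

class RVFL:
    """Minimal random vector functional link network: a random, fixed
    hidden layer plus direct input-output links, with output weights
    solved in closed form by ridge regression."""
    def __init__(self, n_hidden=50, ridge=1e-2, rng=None):
        self.n_hidden = n_hidden
        self.ridge = ridge
        self.rng = rng if rng is not None else np.random.default_rng()

    def _features(self, X):
        H = np.tanh(X @ self.W + self.b)           # random hidden features
        ones = np.ones((X.shape[0], 1))            # bias column
        return np.hstack([X, H, ones])             # direct links + hidden + bias

    def fit(self, X, y):
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        D = self._features(X)
        A = D.T @ D + self.ridge * np.eye(D.shape[1])
        self.beta = np.linalg.solve(A, D.T @ y)    # ridge solution
        return self

    def predict(self, X):
        return self._features(X) @ self.beta

def subagging_fit(X_pos, X_neg, n_models=10, seed=0):
    """Subagging (subsample aggregating): each base learner sees all
    positives plus a subsample of reliable negatives drawn WITHOUT
    replacement. Balancing the two classes is an assumption here."""
    rng = np.random.default_rng(seed)
    k = min(len(X_pos), len(X_neg))
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X_neg), size=k, replace=False)
        X = np.vstack([X_pos, X_neg[idx]])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(k)])
        models.append(RVFL(rng=rng).fit(X, y))
    return models

def ensemble_score(models, X):
    """Average the base learners' outputs into one association score."""
    return np.mean([m.predict(X) for m in models], axis=0)

# Synthetic demo: feature vectors standing in for miRNA-disease pairs.
rng = np.random.default_rng(42)
X_pos = rng.normal(loc=2.0, size=(40, 5))              # known associations
X_unl = np.vstack([rng.normal(loc=2.0, size=(20, 5)),  # hidden positives
                   rng.normal(loc=-2.0, size=(100, 5))])
X_neg = reliable_negatives(X_pos, X_unl, n_neg=60)
models = subagging_fit(X_pos, X_neg, n_models=5, seed=0)
queries = np.array([[2.0] * 5, [-2.0] * 5])
scores = ensemble_score(models, queries)
```

In this sketch, unlabeled pairs are ranked by `ensemble_score`, and higher scores suggest a more likely association. A real pipeline would replace the synthetic vectors with features derived from miRNA and disease similarity data, and the centroid heuristic with the paper's SS-Kmeans.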
This article is indexed in ScienceDirect and other databases.