

A semi-supervised learning approach for RNA secondary structure prediction
Institutions: 1. Institute of Drug Research, School of Pharmacy, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 91120, Israel; 2. University of Oslo, Faculty of Medicine, Institute of Clinical Medicine, N-0316 Oslo, Norway; 3. Department of Oncology, Oslo University Hospital, Norwegian Radium Hospital, N-0310 Oslo, Norway; 4. Genomic Data Analysis Unit, Hadassah Medical School, The Hebrew University of Jerusalem, Jerusalem 91120, Israel; 5. Department of Pathology, Oslo University Hospital, Norwegian Radium Hospital, N-0310 Oslo, Norway
Abstract: RNA secondary structure prediction is a key technology in RNA bioinformatics. Most algorithms for RNA secondary structure prediction use probabilistic models whose parameters are trained on reliable RNA secondary structures. Because secondary structures are difficult to determine by experimental procedures such as NMR or X-ray crystal structure analysis, many RNA sequences that could be useful for training still have no experimentally determined secondary structure. In this paper, we introduce a novel semi-supervised learning approach for training the parameters of a probabilistic model of RNA secondary structures, in which we employ not only RNA sequences with annotated secondary structures but also sequences with unknown secondary structures. Our model is based on a hybrid of generative models (stochastic context-free grammars) and discriminative models (conditional random fields), an approach that has been successfully applied to natural language processing. Computational experiments indicate that the accuracy of secondary structure prediction is improved by incorporating RNA sequences with unknown secondary structures into training. To our knowledge, this is the first study of a semi-supervised learning approach for RNA secondary structure prediction. This technique will be useful when the number of reliable structures is limited.
Keywords: RNA secondary structure; Semi-supervised learning; Parameter learning
This article is indexed in ScienceDirect and other databases.