A comparison of classifiers for predicting the class color of fluorescent proteins
Institution:1. Department of Cell Biology and Genetics, Chongqing Medical University, Chongqing 400016, China;2. Molecular Medicine and Cancer Research Center, Chongqing Medical University, Chongqing 400016, China;3. Experimental Teaching Center, Chongqing Medical University, Chongqing 400016, China;4. Department of Bioinformatics, Chongqing Medical University, Chongqing 400016, China;1. School of Natural Sciences, Far Eastern State University, Suhanova, 8, 690091, Vladivostok, Russian Federation;2. Institute of Chemistry, Far Eastern Branch of Russian Academy of Sciences, pr. 100-letiya Vladivostoka, 159, Vladivostok, Russian Federation;1. Department of Allergy, The Third Affiliated Hospital of Shenzhen University, Shenzhen, 518020, China;2. Musculoskeletal Research Laboratory, Department of Orthopaedics & Traumatology, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, 999077, Hong Kong Special Administrative Region;3. College of Life Sciences, Northwest University, Shaanxi, 712100, China
Abstract: Fluorescent proteins have been applied in a wide variety of fields ranging from basic science to industrial applications. Apart from the naturally occurring fluorescent proteins, there is growing interest in genetically modified variants that emit light at a specific wavelength. Genetically modifying a protein is not an easy task, especially because exchanging one residue for another has to achieve the desired property while maintaining protein stability. To help in the choice of residue exchange, computational methods are applied to predict the function and stability of proteins. In this work we prepared a dataset composed of 109 fluorescent proteins and tested four classical supervised classification algorithms: artificial neural networks (ANNs), decision trees (DTs), support vector machines (SVMs) and random forests (RFs). This is the first time these algorithms have been compared on this task. A comparison of the algorithms' performance shows that DTs, SVMs and RFs were significantly better than ANNs, and that RF was the best method in all scenarios. However, the interpretability of DTs is highly relevant and can provide important clues about the mechanisms involved in protein color emission. The results are promising and indicate that the use of in silico methods can greatly reduce the time and cost of in vitro experiments.
Keywords: Data mining; Classification; Fluorescent proteins; Structural biology
This article is indexed in ScienceDirect and other databases.