Factors analysis of protein O-glycosylation site prediction |
| |
Affiliation: | 1. School of Mathematics and Information Science, Xianyang Normal University, Wenlin Road, Xianyang, 712000, China; 2. Department of Computer and Information Science, Fordham University, Lincoln Center, New York, NY, 10023, USA |
| |
Abstract: | To improve the prediction accuracy of O-glycosylation sites and to analyze their structure, this study proposes factor analysis based prediction. Our studies show that factor analysis strongly boosts the performance of machine learning algorithms in glycosylation site prediction and demonstrates advantages over principal component analysis and nonnegative matrix factorization. In addition, we find that factor analysis based linear discriminant analysis appears to be a desirable method for O-glycosylation site prediction because of its advantages in both accuracy and time complexity over other machine learning methods. To the best of our knowledge, this is the first work to employ factor analysis in glycosylation site prediction, and it will inspire further work on this topic. |
| |
Keywords: | Protein; Machine learning; Factors analysis; Correspondence analysis |
This article is indexed in ScienceDirect and other databases. |
|
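The approach named in the abstract, factor analysis followed by linear discriminant analysis, can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic data, the helper names `make_synthetic_sites` and `fa_lda_accuracy`, the number of factors, and the use of scikit-learn are all assumptions made for illustration, and real O-glycosylation data would first need a sequence encoding that this record does not describe.

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline


def make_synthetic_sites(n_samples=200, n_features=20, n_factors=3, seed=0):
    """Hypothetical stand-in for encoded sites: features driven by a few latent factors."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)            # 1 = glycosylated, 0 = not (assumed labels)
    latent = rng.normal(size=(n_samples, n_factors))
    latent[:, 0] += 3.0 * y                           # class signal lives in the first latent factor
    loadings = rng.normal(size=(n_factors, n_features))
    X = latent @ loadings + 0.5 * rng.normal(size=(n_samples, n_features))
    return X, y


def fa_lda_accuracy(X, y, n_factors=3, folds=5):
    """Mean cross-validated accuracy of factor analysis followed by LDA."""
    model = make_pipeline(
        FactorAnalysis(n_components=n_factors, random_state=0),  # reduce to factor scores
        LinearDiscriminantAnalysis(),                            # classify in factor space
    )
    return cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()


X, y = make_synthetic_sites()
accuracy = fa_lda_accuracy(X, y)
print(f"mean cross-validated accuracy: {accuracy:.3f}")
```

Replacing `FactorAnalysis` with `PCA` or `NMF` from `sklearn.decomposition` in the same pipeline would give a rough way to reproduce the kind of comparison the abstract mentions, although NMF requires nonnegative input features.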