Prediction of protein structural class based on symmetrical recurrence quantification analysis |
| |
Institution: | 1. Research Team in Intelligent Machines, National School of Engineers of Gabes, B.P. W, 6072 Gabes, Tunisia; 2. GSII ESEO – LAUM UMR CNRS 6613, 49000 Angers, France; 1. Department of Computer Science & Engineering, Dr. Sudhir Chandra Sur Degree Engineering College, 540, Dum Dum Road, Near Dum Dum Jn. Station, Surermath, Kolkata, 700074, India; 2. Department of Computer Science & Engineering, University of Calcutta, Saltlake City, Kolkata, 700073, India; 3. Department of Computer Science & Engineering, Netaji Subhash Engineering College, Techno City, Panchpota, Garia, Kolkata, 700152, India; 1. Hainan Key Laboratory for Computational Science and Application, Hainan Normal University, Haikou 571158, China; 2. Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; 3. Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China; 1. School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, 518000, China; 2. School of Management, Shenzhen Polytechnic, Shenzhen, 518000, China |
| |
Abstract: | Predicting the structural class of proteins with low sequence similarity is a significant and widely studied challenge. Such prediction plays an important role in drug design, protein fold recognition, functional analysis and several other biological applications. In this paper, we apply our proposed method for predicting protein structural class to two benchmark datasets from the literature: (1) 25PDB and (2) 1189. We first transform protein sequences into DNA sequences and then into binary sequences. We then apply symmetrical recurrence quantification analysis (the new approach), which yields 8 features from each symmetry plot computation. Three machine learning algorithms, Linear Discriminant Analysis (LDA), Random Forest (RF) and Support Vector Machine (SVM), are used for classification and compared to find the best classifier for protein structural class prediction. The results show that symmetrical recurrence quantification analysis as the feature extraction method, combined with the RF classifier, outperformed existing methods with an overall accuracy of 100% without overfitting. |
| |
Keywords: | Protein structural classes; Symmetry; Symmetrical recurrence quantification analysis; Recurrence plot; Machine learning; SVM; LDA; Random Forest |
This article is indexed in ScienceDirect and other databases. |
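As a rough illustration of the feature-extraction idea mentioned in the abstract, the sketch below builds a plain recurrence plot from a binary sequence and computes two standard recurrence quantification measures (recurrence rate and determinism). This is not the authors' symmetrical variant, and it does not reproduce their 8 features or their protein-to-DNA-to-binary encoding, none of which are specified in this record. The function names, the exact-match recurrence rule and the `lmin` parameter are assumptions made for illustration only.

```python
def recurrence_matrix(bits):
    """Exact-match recurrence: R[i][j] is 1 when bits[i] == bits[j], else 0.

    Assumption: no embedding and no distance threshold, which is the
    simplest choice for a binary sequence.
    """
    n = len(bits)
    return [[1 if bits[i] == bits[j] else 0 for j in range(n)] for i in range(n)]


def recurrence_rate(R):
    """Fraction of recurrent points, excluding the main diagonal."""
    n = len(R)
    if n < 2:
        return 0.0
    total = sum(R[i][j] for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def determinism(R, lmin=2):
    """Share of off-diagonal recurrent points on diagonal lines of length >= lmin.

    Only the upper diagonals are scanned, because R is symmetric.
    """
    n = len(R)
    on_lines = 0
    recurrent = 0
    for k in range(1, n):
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                recurrent += 1
                run += 1
            else:
                if run >= lmin:
                    on_lines += run
                run = 0
        if run >= lmin:  # close a run that reaches the end of the diagonal
            on_lines += run
    return on_lines / recurrent if recurrent else 0.0


def rqa_features(bits):
    """Hypothetical helper: collect the two measures into one feature vector."""
    R = recurrence_matrix(bits)
    return [recurrence_rate(R), determinism(R)]
```

A feature vector of this kind, computed per sequence, is what a classifier such as LDA, RF or SVM would then be trained on.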
|