Effective prediction of bacterial type IV secreted effectors by combined features of both C-termini and N-termini |
| |
Authors: | Yu Wang Yanzhi Guo Xuemei Pu Menglong Li |
| |
Institution: | 1.College of Chemistry,Sichuan University,Chengdu,China;2.College of Materials and Chemistry & Chemical Engineering,Chengdu University of Technology,Chengdu,China |
| |
Abstract: | Various bacterial pathogens can deliver their secreted substrates also called as effectors through type IV secretion systems (T4SSs) into host cells and cause diseases. Since T4SS secreted effectors (T4SEs) play important roles in pathogen-host interactions, identifying them is crucial to our understanding of the pathogenic mechanisms of T4SSs. A few computational methods using machine learning algorithms for T4SEs prediction have been developed by using features of C-terminal residues. However, recent studies have shown that targeting information can also be encoded in the N-terminal region of at least some T4SEs. In this study, we present an effective method for T4SEs prediction by novelly integrating both N-terminal and C-terminal sequence information. First, we collected a comprehensive dataset across multiple bacterial species of known T4SEs and non-T4SEs from literatures. Then, three types of distinctive features, namely amino acid composition, composition, transition and distribution and position-specific scoring matrices were calculated for 50 N-terminal and 100 C-terminal residues. After that, we employed information gain represent to rank the importance score of the 150 different position residues for T4SE secretion signaling. At last, 125 distinctive position residues were singled out for the prediction model to classify T4SEs and non-T4SEs. The support vector machine model yields a high receiver operating curve of 0.916 in the fivefold cross-validation and an accuracy of 85.29% for the independent test set. |
| |
Keywords: | |
本文献已被 SpringerLink 等数据库收录! |
|