Prediction of human-Streptococcus pneumoniae protein-protein interactions using logistic regression |
| |
Affiliation: | 1. Department of Computer Science, Abdul Wali Khan University Mardan, Pakistan; 2. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China |
| |
Abstract: | Streptococcus pneumoniae is a major cause of mortality in children under five years old. In recent years, the emergence of antibiotic-resistant strains of S. pneumoniae has increased the threat posed by this pathogen. For that reason, S. pneumoniae protein virulence factors should be explored when developing new drugs or vaccines, for instance through the analysis of host-pathogen protein-protein interactions (HP-PPIs). In this research, protein-protein interactions were predicted with a logistic regression model that uses the number of protein domain occurrences as features. With HP-PPIs of three different pathogens as training data, the model achieved 57–77% precision, 64–75% recall, and 96–98% specificity. Applying the model to human-S. pneumoniae protein pairs yielded 5823 predicted interactions involving 30 S. pneumoniae proteins and 324 human proteins. Pathway enrichment analysis showed that most of the pathways involved in the predicted interactions are immune system pathways. Network topology analysis revealed β-galactosidase (BgaA) as the most central S. pneumoniae protein in the predicted HP-PPI networks, with a degree centrality of 1.0 and a betweenness centrality of 0.451853. Further experimental studies are required to validate the predicted interactions and examine their roles in S. pneumoniae infection. |
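For readers less familiar with the three evaluation measures quoted in the abstract, the following is a minimal sketch of how precision, recall, and specificity are conventionally derived from confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' actual evaluation code is not shown in this record.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, and specificity from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives, and false negatives. A ratio with a zero denominator
    is reported as 0.0 (an assumption made here for simplicity).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # predicted interactions that are real
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # real interactions that were found
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # non-interactions correctly rejected
    return {"precision": precision, "recall": recall, "specificity": specificity}


# Hypothetical counts for illustration only (not results from the paper).
example = classification_metrics(tp=70, fp=30, tn=970, fn=30)
print(example)
```

When non-interacting pairs greatly outnumber interacting ones, specificity can be very high while precision remains moderate, because even a small false-positive rate produces many false positives relative to the true positives.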
| |
Keywords: | Host-pathogen protein-protein interactions; Logistic regression; Network centrality; Pathway enrichment |
This article is indexed in ScienceDirect and other databases. |
|