Prediction of 2-hydroxyisobutyrylation sites by integrating multiple sequence features with ensemble support vector machine |
| |
Affiliation: | College of Science, Shenyang Aerospace University, 110136, People’s Republic of China |
| |
Abstract: | Lysine 2-hydroxyisobutyrylation (Khib) is a new type of histone mark that has been found to affect the association between histones and DNA. To better understand the molecular mechanism of Khib, it is important to accurately identify 2-hydroxyisobutyrylated substrates and their corresponding Khib sites. In this study, a novel bioinformatics tool named KhibPred is proposed to predict Khib sites in human HeLa cells. Three kinds of features are combined to encode Khib sites: the composition of k-spaced amino acid pairs, binary encoding, and amino acid factors. Moreover, an ensemble support vector machine is employed to address the class imbalance problem in the prediction. In 10-fold cross-validation, KhibPred achieves an area under the receiver operating characteristic curve of 0.7937, which suggests that it can be a useful tool for predicting protein Khib sites. Feature analysis shows that the polarity factor features play significant roles in the prediction of Khib sites. The conclusions derived from this study might provide useful insights for in-depth investigation into the molecular mechanisms of Khib. |
| |
Keywords: | Post-translational modification; 2-Hydroxyisobutyrylation; Feature extraction; Ensemble support vector machine |
This article is indexed in ScienceDirect and other databases. |
|
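Two of the sequence encodings named in the abstract, the composition of k-spaced amino acid pairs and binary encoding, can be illustrated with a minimal sketch. This is not the KhibPred implementation: the function names `cksaap` and `binary_encoding`, the default `k_max=2`, and the handling of non-standard residues are assumptions made only for illustration, and the record does not state the window length or the range of k used by the authors.

```python
from itertools import product

# The 20 standard amino acids; any other character (e.g. a gap) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order so feature positions are stable.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(peptide, k_max=2):
    """Composition of k-spaced amino acid pairs for k = 0..k_max (assumed range).

    For each k, counts every pair of residues separated by k positions and
    divides by the number of such pairs in the peptide, giving 400 values per k.
    """
    features = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_windows = len(peptide) - k - 1
        for i in range(max(n_windows, 0)):
            pair = peptide[i] + peptide[i + k + 1]
            if pair in counts:  # skip pairs containing non-standard characters
                counts[pair] += 1
        denom = n_windows if n_windows > 0 else 1
        features.extend(counts[p] / denom for p in PAIRS)
    return features


def binary_encoding(peptide):
    """One-hot encoding: 20 values per position, all zeros for unknown residues."""
    features = []
    for residue in peptide:
        features.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return features
```

In a setup like the one the abstract describes, vectors of this kind would be concatenated with the amino acid factor features and passed to the classifier; the factor values themselves are not given in this record, so they are left out of the sketch.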