Sequence‐based prediction of protein–peptide binding sites using support vector machine |
| |
Authors: | Ghazaleh Taherzadeh, Yuedong Yang, Tuo Zhang, Alan Wee‐Chung Liew, Yaoqi Zhou |
| |
Affiliation: | 1. School of Information and Communication Technology, Griffith University, Southport, Queensland, Australia; 2. Institute for Glycomics, Griffith University, Southport, Queensland, Australia; 3. Weill Cornell Medical College, New York, New York |
| |
Abstract: | Protein–peptide interactions are essential for all cellular processes, including DNA repair, replication, gene expression, and metabolism. As most protein–peptide interactions are uncharacterized, it is cost‐effective to investigate them computationally as a first step. All existing approaches for predicting protein–peptide binding sites, however, are based on protein structures, even though the structures of most proteins have not yet been solved. This article proposes SPRINT, the first machine‐learning method for Sequence‐based prediction of Protein–peptide Residue‐level Interactions. SPRINT performs robustly and consistently in 10‐fold cross‐validation and on an independent test set. The most important feature is evolution‐generated sequence profiles. On the test set (1056 binding and non‐binding residues), SPRINT yields a Matthews’ Correlation Coefficient of 0.326 with a sensitivity of 64% and a specificity of 68%. This sequence‐based technique is comparable to or more accurate than structure‐based methods for peptide‐binding site prediction. SPRINT is available as an online server at: http://sparks-lab.org/ . © 2016 Wiley Periodicals, Inc. |
| |
Keywords: | protein–peptide binding site, sequence‐based prediction, features, machine learning, support vector machine |
|
|
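The abstract reports performance as sensitivity, specificity, and Matthews' Correlation Coefficient (MCC). As a reading aid, the following minimal Python sketch shows how these three quantities are conventionally computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the paper and do not reproduce its test set.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, MCC) for a binary classifier.

    tp/fn count binding residues predicted as binding/non-binding;
    tn/fp count non-binding residues predicted as non-binding/binding.
    """
    sensitivity = tp / (tp + fn)  # fraction of true binding residues recovered
    specificity = tn / (tn + fp)  # fraction of non-binding residues rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts chosen only for illustration (NOT the paper's data).
sens, spec, mcc = binary_metrics(tp=64, fp=32, tn=68, fn=36)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} MCC={mcc:.3f}")
```

Unlike sensitivity or specificity taken alone, MCC uses all four cells of the confusion matrix, which is why it is often preferred when binding and non-binding residues are unevenly represented.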