Incorporating support vector machine for identifying protein tyrosine sulfation sites |
| |
Authors: | Wen‐Chi Chang Tzong‐Yi Lee Dray‐Ming Shien Justin Bo‐Kai Hsu Jorng‐Tzong Horng Po‐Chiang Hsu Ting‐Yuan Wang Hsien‐Da Huang Rong‐Long Pan |
| |
Institution: | 1. Department of Biological Science and Technology, National Chiao Tung University, Hsin‐Chu, Taiwan; 2. Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsin‐Chu, Taiwan; 3. Both the authors contributed equally to this work.; 4. Department of Computer Science and Information Engineering, National Central University, Chung‐Li 320, Taiwan; 5. Department of Electronic Engineering, Chin Min Institute of Technology, Miao‐Li, Taiwan; 6. Department of Bioinformatics, Asia University, Taichung, Taiwan; 7. Institute of Bioinformatics and Structural Biology, College of Life Sciences, National Tsing Hua University, Hsin‐Chu, Taiwan |
| |
Abstract: | Tyrosine sulfation is a post‐translational modification of many secreted and membrane‐bound proteins. It governs protein‐protein interactions that are involved in leukocyte adhesion, hemostasis, and chemokine signaling. However, the intrinsic features of sulfated proteins remain elusive. This investigation presents SulfoSite, a computational method based on a support vector machine (SVM) for predicting protein sulfotyrosine sites. The approach considers structural information such as the secondary structure and solvent accessibility of the amino acids that surround the sulfotyrosine sites. One hundred sixty‐two experimentally verified tyrosine sulfation sites were identified in UniProtKB/SwissProt release 53.0. The results of a five‐fold cross‐validation evaluation suggest that solvent accessibility around the sulfotyrosine sites contributes substantially to predictive accuracy. The SVM classifier achieves an accuracy of 94.2% in five‐fold cross‐validation when a sequence positional weighted matrix (PWM) is coupled with values of the accessible surface area (ASA). The proposed method significantly outperforms previous methods in predicting the location of tyrosine sulfation sites. © 2009 Wiley Periodicals, Inc. J Comput Chem, 2009 |
| |
Keywords: | protein sulfation prediction |
|
|
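The abstract describes coupling a sequence positional weighted matrix (PWM) with accessible surface area (ASA) values as input to the SVM. The sketch below illustrates one plausible way to build such a feature vector; it is not the authors' implementation. The window length, the pseudocount, the uniform background frequency, the log-odds weighting, and the names `build_pwm` and `encode` are all assumptions made for illustration, and the sequences and ASA values are invented toy data.

```python
from collections import Counter
from math import log2

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(windows, pseudocount=1.0):
    """Return one dict per window position mapping residue -> log-odds weight.

    Assumption: uniform background frequency and additive pseudocounts;
    the abstract does not state how the PWM was computed.
    """
    length = len(windows[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        pwm.append({
            aa: log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def encode(window, asa, pwm):
    """Concatenate per-position PWM weights with per-residue ASA values."""
    weights = [pwm[i].get(aa, 0.0) for i, aa in enumerate(window)]
    return weights + list(asa)


# Toy 5-residue windows centered on tyrosine (invented, not real sulfation sites).
positives = ["DEYDD", "EDYEE", "DDYEA"]
pwm = build_pwm(positives)

# Hypothetical relative ASA values, one per residue in the query window.
features = encode("EEYDD", [0.8, 0.7, 0.9, 0.6, 0.5], pwm)
```

A vector such as `features` could then be passed, together with vectors for non-sulfated tyrosines, to any SVM implementation and evaluated with five‐fold cross‐validation as the abstract describes.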