Prediction of protein phosphorylation sites by using the composition of k-spaced amino acid pairs.

As one of the most widespread protein post-translational modifications, phosphorylation is involved in many biological processes, such as the cell cycle and apoptosis. Identification of phosphorylated substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of ph...

Bibliographic Details
Main Authors: Xiaowei Zhao, Wenyi Zhang, Xin Xu, Zhiqiang Ma, Minghao Yin
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2012-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3478286?pdf=render
id doaj-de8b277b303748659873dc63d439e009
record_format Article
spelling doaj-de8b277b303748659873dc63d439e009 2020-11-25T00:12:14Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2012-01-01 7 10 e46302 10.1371/journal.pone.0046302
collection DOAJ
language English
format Article
sources DOAJ
author Xiaowei Zhao
Wenyi Zhang
Xin Xu
Zhiqiang Ma
Minghao Yin
author_sort Xiaowei Zhao
title Prediction of protein phosphorylation sites by using the composition of k-spaced amino acid pairs.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2012-01-01
description As one of the most widespread protein post-translational modifications, phosphorylation is involved in many biological processes, such as the cell cycle and apoptosis. Identification of phosphorylated substrates and their corresponding sites will facilitate the understanding of the molecular mechanism of phosphorylation. Compared with labor-intensive and time-consuming experimental approaches, computational prediction of phosphorylation sites is highly desirable because of its convenience and speed. In this paper, a new bioinformatics tool named CKSAAP_PhSite was developed that ignores kinase information and uses only primary sequence information to predict protein phosphorylation sites. The key feature of CKSAAP_PhSite is its use of the composition of k-spaced amino acid pairs as the encoding scheme, with a support vector machine as the predictor. CKSAAP_PhSite achieved a sensitivity of 84.81%, a specificity of 86.07% and an accuracy of 85.43% for serine; a sensitivity of 78.59%, a specificity of 82.26% and an accuracy of 80.31% for threonine; and a sensitivity of 74.44%, a specificity of 78.03% and an accuracy of 76.21% for tyrosine. Results obtained from cross-validation and an independent benchmark suggest that our method is promising for predicting phosphorylation sites and can serve as a useful supplementary tool for the community. For public access, CKSAAP_PhSite is available at http://59.73.198.144/cksaap_phsite/.
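The encoding scheme named in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The 20-letter alphabet, the function name `cksaap_encode`, the default `k_max`, and the normalization by the number of pair positions are all assumptions chosen for illustration; the record does not state the tool's window size, alphabet, or range of k.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids (the actual tool may differ).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def cksaap_encode(fragment, k_max=2):
    """Return the k-spaced amino acid pair composition of a sequence fragment.

    For each spacing k in 0..k_max, count every residue pair separated by
    k intervening residues and divide by the number of pair positions at
    that spacing. The result has 400 * (k_max + 1) features.
    """
    features = []
    for k in range(k_max + 1):
        counts = [0] * len(PAIRS)
        n_positions = len(fragment) - k - 1  # pair positions at spacing k
        for i in range(max(n_positions, 0)):
            pair = fragment[i] + fragment[i + k + 1]
            idx = PAIR_INDEX.get(pair)  # pairs with non-standard letters are skipped
            if idx is not None:
                counts[idx] += 1
        denom = n_positions if n_positions > 0 else 1
        features.extend(c / denom for c in counts)
    return features


# Hypothetical fragment centred on a candidate serine (S).
vector = cksaap_encode("AKRRLSSLRASTS", k_max=2)
print(len(vector))  # 1200 features: 400 pairs x 3 spacings
```

A feature vector of this kind could then be passed to a support vector machine classifier, for example `sklearn.svm.SVC`, with one vector per candidate serine, threonine, or tyrosine site.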
url http://europepmc.org/articles/PMC3478286?pdf=render