Q-Motif: Phosphorylation Motif Finding by Subtractive Clustering

Bibliographic Details
Main Authors: Ying-Chen Huang, 黃盈禎
Other Authors: I-Fang Chung
Format: Others
Language: zh-TW
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/49083541375728612917
Description
Summary: Master's === National Yang-Ming University === Institute of Biomedical Informatics === 99 === Developing computational algorithms that discover biologically relevant phosphorylation motifs is becoming increasingly important as the volume of proteomic sequence data grows rapidly. Here we present a novel unsupervised method, called Quick Motif Finder (Q-Motif for short), to extract phosphorylation motifs. Q-Motif adopts the subtractive clustering algorithm to discover motifs by exploiting statistical information hidden in phosphorylated data. The phosphorylated data are clustered into homogeneous groups, which are analyzed to identify candidate motifs. These candidate motifs are then filtered to retain the actual motifs, those with statistically significant motif scores. We applied Q-Motif to several new and existing data sets and compared its performance with a well-known state-of-the-art method, Motif-X (Schwartz, 2005). In 80% of cases, Q-Motif identified all statistically significant motifs extracted by the state-of-the-art method. In addition, Q-Motif uncovers several novel motifs. For most of these additional motifs, we verified their existence using evidence from the literature. Because these motif scores were calculated with the Motif-X scoring method, we consider that they may not represent the statistical significance of every motif extracted from the original phosphorylated data. Therefore, we redefine the scoring method and compare the results obtained under the different scoring schemes. These results clearly establish the excellent motif discovery ability of our algorithm. The iterative algorithm proposed here uses exploratory data analysis to discover motifs from phosphorylated data. The effectiveness of Q-Motif has been demonstrated using several real data sets as well as a synthetic data set. The method is quite general in nature and can also be used to find other types of motifs.
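
Note: the summary outlines a pipeline of clustering, candidate extraction, and filtering, but this record does not give the thesis's actual formulation. The Python sketch below is only an illustration of the first two steps under stated assumptions: it uses the textbook form of subtractive clustering (a density "potential" per point, with the potential around each chosen center subtracted before picking the next), Hamming distance between equal-length peptide windows centered on the phosphorylated residue, and a simplified stopping rule. The function names, the radii `r_a` and `r_b`, the `stop_ratio` threshold, the `min_freq` cutoff, and the toy peptides are all hypothetical and are not taken from Q-Motif. The statistical filtering step (motif scoring) is not shown.

```python
import math
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length peptides differ."""
    return sum(x != y for x, y in zip(a, b))


def subtractive_clustering(seqs, r_a=3.0, r_b=None, stop_ratio=0.15, max_centers=10):
    """Return indices of cluster centers (textbook subtractive clustering).

    Assumptions (not from the thesis): Hamming distance, r_b = 1.5 * r_a,
    and a single stop_ratio threshold as the stopping rule.
    """
    if r_b is None:
        r_b = 1.5 * r_a
    alpha = 4.0 / r_a ** 2
    beta = 4.0 / r_b ** 2
    n = len(seqs)
    dist2 = [[hamming(seqs[i], seqs[j]) ** 2 for j in range(n)] for i in range(n)]
    # Initial potential: how densely each peptide is surrounded by similar ones.
    potential = [sum(math.exp(-alpha * d) for d in row) for row in dist2]
    centers = []
    first_peak = None
    while len(centers) < max_centers:
        k = max(range(n), key=potential.__getitem__)
        peak = potential[k]
        if first_peak is None:
            first_peak = peak
        elif peak < stop_ratio * first_peak:
            break  # remaining density is too low to justify another cluster
        centers.append(k)
        # Subtract the influence of the new center from every potential.
        for i in range(n):
            potential[i] -= peak * math.exp(-beta * dist2[i][k])
    return centers


def assign_to_centers(seqs, centers):
    """Group each peptide with its nearest center (by Hamming distance)."""
    groups = {c: [] for c in centers}
    for s in seqs:
        nearest = min(centers, key=lambda c: hamming(s, seqs[c]))
        groups[nearest].append(s)
    return groups


def consensus(group, min_freq=0.8):
    """Candidate motif: keep a residue where it is conserved, else '.'."""
    motif = []
    for column in zip(*group):
        residue, count = Counter(column).most_common(1)[0]
        motif.append(residue if count / len(group) >= min_freq else ".")
    return "".join(motif)


# Toy 7-residue windows centered on a phosphorylated serine (made-up data).
peptides = [
    "RRRSPAA", "RRKSPAA", "RRRSPAG", "RRRSPVA",
    "EDDSDEE", "EDDSEEE", "EDDSDED", "DDDSDEE",
]
centers = subtractive_clustering(peptides)
groups = assign_to_centers(peptides, centers)
candidates = {consensus(members) for members in groups.values()}
print(sorted(candidates))
```

In this sketch the number of clusters is not fixed in advance; it falls out of the density of the data and the stopping threshold, which is one reason subtractive clustering suits an unsupervised setting like the one the summary describes.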