Improving Protein Disorder Prediction by Secondary Structure Information


Bibliographic Details
Main Authors: Tong-Ming Xu, 許通明
Other Authors: Yen-Jen Oyang
Format: Others
Language: zh-TW
Published: 2006
Online Access: http://ndltd.ncl.edu.tw/handle/40456759127091505010
Description
Summary: Master's thesis === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 94 === An increasing number of proteins have been found to contain regions that do not form stable tertiary structures in their native states. Sequence fragments with no propensity to form specific structures are regarded as “disordered regions”. Some disordered regions have been shown to be functionally significant, so a reliable predictor for them is important for further understanding of protein function. Most recent studies use the amino acid composition and/or a number of biochemical properties within a sliding window around the target residue as the feature set for predicting protein disorder. This thesis conducts a comprehensive study of the performance of a recently proposed feature set that considers both physicochemical properties and amino acid propensity for order/disorder, and demonstrates how a two-stage framework improves the accuracy of the classifier. Furthermore, we propose a novel feature based on protein secondary structure to reduce potential false positives. The thesis tries several ways of extracting information from the local secondary structure. The experimental results show that the feature set based on the distance from the target residue to the nearest secondary structure element (SSE) outperforms the others. In particular, employing the proposed feature set in the second stage delivers better accuracy than using it together with the original feature sets.
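
As a rough illustration of the distance-to-nearest-SSE idea mentioned in the summary, the sketch below computes, for each residue, the number of positions to the closest residue that lies inside a secondary structure element. It is not the thesis's implementation: the function name, the three-state string encoding (`H` helix, `E` strand, `C` coil), the choice of helix and strand as the SSE states, and the handling of sequences with no SSE are all assumptions made for this example.

```python
from typing import List, Optional

# Assumption: helix ("H") and strand ("E") residues count as SSE members;
# every other character is treated as non-SSE (e.g. coil "C").
SSE_STATES = {"H", "E"}


def distance_to_nearest_sse(
    ss: str, no_sse_value: Optional[int] = None
) -> List[Optional[int]]:
    """Return, per residue, the distance in residues to the nearest SSE residue.

    Residues inside an SSE get 0. If the string contains no SSE residue at
    all, every position gets `no_sse_value` (hypothetical placeholder).
    """
    n = len(ss)
    dist: List[Optional[int]] = [None] * n

    # Forward pass: distance to the closest SSE residue at or before i.
    last = None
    for i, state in enumerate(ss):
        if state in SSE_STATES:
            last = i
        if last is not None:
            dist[i] = i - last

    # Backward pass: keep the smaller of the left and right distances.
    nxt = None
    for i in range(n - 1, -1, -1):
        if ss[i] in SSE_STATES:
            nxt = i
        if nxt is not None:
            d = nxt - i
            if dist[i] is None or d < dist[i]:
                dist[i] = d

    return [no_sse_value if d is None else d for d in dist]


if __name__ == "__main__":
    # Hypothetical predicted secondary structure for a 10-residue fragment.
    print(distance_to_nearest_sse("CCHHCCCEEC"))
```

In a sliding-window setting, a per-residue value like this could be appended to the other window features for the target residue; how the thesis actually encodes or normalizes the distance is not stated in this record.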