Protein-Protein Recognition Classification by Genetic Programming


Bibliographic Details
Main Authors: Kuan-Yu Su, 蘇冠宇
Other Authors: Huang-Cheng Kuo
Format: Others
Language: en_US
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/46469651675379281342
Description
Summary: Master's === National Chiayi University === Graduate Institute of Computer Science and Information Engineering === 98 === With the development of bioinformatics, discovering amino acid patterns at protein binding sites has recently become a popular research topic. A protein interacts with another protein through its binding sites, which carry a great deal of information about physicochemical properties, and most of these properties derive from amino acid composition. We are therefore interested in which amino acid compositions or characteristics lead proteins to interact with each other. Protein-protein interactions represent the relationships among proteins, and the interaction network can reflect the functions and roles to which proteins belong. Among these interactions, the case in which a protein interacts with another protein transiently and the two then separate is called protein-protein recognition, or a transient protein complex. Genetic programming is a prominent technique in evolutionary computation; it mimics the mechanism of biological evolution to find optimal solutions. Classification problems play an important role in the development of knowledge engineering, and many machine learning algorithms have arisen to solve them. In this thesis, we propose a genetic programming method for feature selection and feature construction and combine it with SVM and neural network classifiers to solve the classification problem of protein-protein recognition. Experimental results show that the proposed method is accurate and effective: it predicts recognition proteins with an average accuracy of 80% under ten-fold cross-validation. We found that the constructed features can significantly improve prediction accuracy in both the SVM and the neural network. This can save biologists time and money.
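
The summary describes constructing features by genetic programming and scoring them with ten-fold cross-validation. The following is only a minimal sketch of that idea, not the thesis's implementation. It assumes a constructed feature is an arithmetic expression tree over the base features, and it swaps in a simple midpoint-threshold classifier where the thesis uses SVM and neural networks. All names (`evaluate`, `random_tree`, `cv_fitness`, `best_random_feature`) are hypothetical, and crossover and mutation are omitted, so candidate selection here is plain random search rather than full genetic programming.

```python
import random
from statistics import mean

# Hypothetical operator set; the record does not list the operators used.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def evaluate(tree, row):
    """Evaluate an expression tree on one feature vector.
    A tree is either an int (base feature index) or (op, left, right)."""
    if isinstance(tree, int):
        return row[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, row), evaluate(right, row))

def random_tree(n_features, depth, rng):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(sorted(OPS))
    return (op,
            random_tree(n_features, depth - 1, rng),
            random_tree(n_features, depth - 1, rng))

def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def threshold_accuracy(values, labels, train, test):
    """Stand-in classifier (assumption, not SVM/NN): cut midway between
    the two class means of the training fold, then score the test fold."""
    pos = [values[i] for i in train if labels[i] == 1]
    neg = [values[i] for i in train if labels[i] == 0]
    if not pos or not neg:
        return 0.0
    mean_pos, mean_neg = mean(pos), mean(neg)
    cut = (mean_pos + mean_neg) / 2
    sign = 1 if mean_pos >= mean_neg else -1
    hits = sum(
        1 for i in test
        if (1 if sign * (values[i] - cut) > 0 else 0) == labels[i]
    )
    return hits / len(test)

def cv_fitness(tree, X, y, k=10, seed=0):
    """Fitness of a constructed feature: mean accuracy over k folds."""
    rng = random.Random(seed)
    values = [evaluate(tree, row) for row in X]
    scores = []
    for fold in k_fold_indices(len(X), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        scores.append(threshold_accuracy(values, y, train, fold))
    return sum(scores) / k

def best_random_feature(X, y, n_candidates=50, depth=3, seed=0):
    """Random search over candidate trees (no crossover or mutation)."""
    rng = random.Random(seed)
    candidates = [random_tree(len(X[0]), depth, rng)
                  for _ in range(n_candidates)]
    return max(candidates, key=lambda t: cv_fitness(t, X, y))
```

In a fuller pipeline, the value of the selected tree would be appended to each feature vector before training the actual classifier, and the cross-validated accuracy with and without the constructed feature would be compared.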