Computational Structural Biology Studies on: I. Conformational entropy II. Protein dynamics

Bibliographic Details
Main Authors: Shao-Wei Huang, 黃少偉
Other Authors: Jenn-Kang Hwang
Format: Others
Language: en_US
Online Access: http://ndltd.ncl.edu.tw/handle/45439283616294954037
Description
Summary: Ph.D. === National Chiao Tung University === Institute of Bioinformatics === 97 === A complete protein sequence usually determines a unique structure; however, the situation is different for shorter subsequences. Studies have found that both designed and naturally occurring subsequences may adopt different secondary structures in different contexts. Such short sequences are called "chameleon" sequences, a phenomenon first reported by Kabsch and Sander when they used sequence homology to predict protein structures. They found several pentapeptides with identical sequences that adopt different secondary structures in different proteins. For naturally occurring proteins, a systematic search of the PDB shows that identical subsequences can have very different conformations. Here we developed a method to compute structure conservation from protein sequence. During protein folding, some structured regions already resemble the folded conformation. The hydrogen isotope exchange (HX) rate is commonly used to identify those structured regions. We applied our method to a set of proteins with known HX rate data and found a strong correlation between structure conservation and slow HX rates.

One of the most important goals in biological science is to understand protein function. It is well known that protein dynamics is closely related to protein function, and several computational methods have been developed to obtain protein dynamics. Molecular dynamics (MD) simulation has been widely used in the study of protein function and dynamics. It models the interactions among atoms, including bonded forces, van der Waals forces, and charge-charge interactions. The computation time is extremely long when the protein is large, and selecting appropriate force-field parameters is itself a complicated problem. The Gaussian network model (GNM) converts the protein structure into a network in which each pair of Cα atoms is connected if their distance is smaller than a given cutoff value.
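The network construction described above can be sketched in a few lines of NumPy. This is only an illustrative sketch and not the thesis code: the function names, the 7.0 Å default cutoff, and the toy coordinates are assumptions. It builds the connectivity (Kirchhoff) matrix of the contact network and, as in the standard GNM formulation, takes relative fluctuations from the diagonal of its pseudo-inverse, computed here from the non-zero eigenmodes.

```python
import numpy as np

def kirchhoff_matrix(coords, cutoff=7.0):
    """Connectivity matrix of the contact network.

    Off-diagonal entries are -1 for pairs closer than `cutoff`, and each
    diagonal entry is that node's number of contacts.
    The 7.0 default is an assumed value, not one taken from the thesis.
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (dist < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

def gnm_fluctuations(gamma, tol=1e-8):
    """Relative mean-square fluctuations: diagonal of the pseudo-inverse.

    Modes with (numerically) zero eigenvalue are skipped, since they
    correspond to rigid-body motion of the network.
    """
    evals, evecs = np.linalg.eigh(gamma)
    keep = evals > tol
    return (evecs[:, keep] ** 2 / evals[keep]).sum(axis=1)

# Toy input (hypothetical): five pseudo-atoms on a line, 3.8 apart,
# with a cutoff chosen so that only sequential neighbours are in contact.
toy_coords = np.array([[3.8 * i, 0.0, 0.0] for i in range(5)])
gamma = kirchhoff_matrix(toy_coords, cutoff=4.0)
fluct = gnm_fluctuations(gamma)
```

In this toy chain the two end atoms have a single contact each and come out as the most mobile, while the central atom is the least mobile. For a real protein, `coords` would be the Cα coordinates read from a structure file.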
Using this network, GNM can compute the theoretical thermal fluctuation of each atom and the correlation of motions between each atom pair. Recently we developed a model that predicts thermal fluctuations from protein structure, called the protein-fixed-point (PFP) model. The PFP model uses only the coordinates of Cα atoms and simply determines the center of mass of the protein. We found that the thermal fluctuation of an atom is proportional to the squared distance from that atom to the center of mass of the structure. Another model, the weighted contact number (WCN) model, computes the number of neighboring atoms weighted by the inverse distance between each atom pair. The PFP and WCN models show that protein dynamics can be extracted directly from the intrinsic properties of the protein structure without the use of any mechanical model. The order parameter obtained from NMR experiments is widely used to study dynamics-related protein functions. Here, we use the PFP and WCN models to predict the N-H S² order parameter directly from the protein structure. Our results show that the WCN model reproduces the experimental order parameters more accurately than previously reported. The biological function of proteins is closely related to cooperative motions and correlated fluctuations that involve large portions of the structure. Normal Mode Analysis (NMA) has been used to study biomolecules since the early 1980s. It decomposes protein dynamics into a collection of motions ranging from large-scale, low-frequency motions to small-scale, high-frequency motions. Biologists usually focus on the large-scale, low-frequency motions, which are relevant to protein function. The major contribution of NMA to biological research is its ability to provide information on large, domain-scale protein motions, which are hard to compute by other methods. The classical approach of NMA is to diagonalize the Hessian matrix, i.e.,
the second derivative of the potential function of a molecular dynamics (MD) simulation. The major shortcoming of classical NMA is that the sampling time increases dramatically with the size of the protein. The Elastic Network Model (ENM), which can describe protein dynamics without the amino acid sequence or full atomic coordinates, has been widely used in studies of protein dynamics and structure-function relationships. The ENM views the protein structure as an elastic network whose nodes are the Cα atoms of individual residues. Residue pairs within a cutoff distance are connected by springs that share a uniform force constant. Based on the ENM, a coarse-grained version of NMA has been developed and is widely used because of its low computational cost and its ability to extend the dynamics to longer timescales and larger motions. Coarse-grained NMA has been applied to various topics, for example, protein function and catalytic residues. One of the most widely used ENM-based methods is the Gaussian network model (GNM). The protein-fixed-point (PFP) model is a simple method that computes protein dynamics using only the coordinates of Cα atoms. Despite its simplicity, it has been shown to accurately predict the B-factors for a dataset of 972 proteins. Here, we compared the results of NMA based on the PFP model with those of the Gaussian network model (GNM).
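The PFP and WCN profiles described in this summary are simple enough to sketch directly. The following is a minimal illustration under stated assumptions, not the authors' implementation: the function names and toy coordinates are hypothetical, all Cα atoms are given equal mass so the center of mass reduces to the centroid, and the WCN exponent is exposed as a `power` parameter. The default of 1 follows the "inverse distance" wording of the summary; other exponents, such as 2, can be tried by changing the parameter.

```python
import numpy as np

def pfp_profile(coords):
    """Squared distance of each atom from the centroid.

    The summary states that thermal fluctuation is proportional to this
    quantity; equal atomic masses are assumed here.
    """
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    return ((coords - center) ** 2).sum(axis=1)

def wcn_profile(coords, power=1.0):
    """Contact number of each atom, weighted by 1 / distance**power.

    Every other atom counts as a neighbour; self-pairs are excluded by
    setting the diagonal distances to infinity (weight 0).
    """
    coords = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    return (1.0 / dist ** power).sum(axis=1)

# Toy input (hypothetical): a compact cluster of four points plus one outlier.
toy_coords = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 2.0],
    [6.0, 6.0, 6.0],
])
pfp = pfp_profile(toy_coords)
wcn = wcn_profile(toy_coords)
```

In this toy set the outlying point has the largest PFP value and the smallest WCN value, which matches the intuition that a loosely packed atom far from the center is the most mobile. Comparing either profile with experimental B-factors or order parameters would require real Cα coordinates and measured data, which are not reproduced here.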