Protein structure similarity from principle component correlation analysis

Abstract. Background: Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their...

Bibliographic Details
Main Authors: Chou James, Zhou Xiaobo, Wong Stephen TC
Format: Article
Language: English
Published: BMC 2006-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/7/40
id doaj-11c56a8050714d0d9a42a2abe78fd8e6
record_format Article
spelling doaj-11c56a8050714d0d9a42a2abe78fd8e6 2020-11-25T00:15:22Z eng BMC BMC Bioinformatics 1471-2105 2006-01-01 7 1 40 10.1186/1471-2105-7-40 Protein structure similarity from principle component correlation analysis Chou James Zhou Xiaobo Wong Stephen TC http://www.biomedcentral.com/1471-2105/7/40
collection DOAJ
language English
format Article
sources DOAJ
author Chou James
Zhou Xiaobo
Wong Stephen TC
title Protein structure similarity from principle component correlation analysis
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2006-01-01
description <p>Abstract</p> <p>Background</p> <p>Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square deviation (RMSD) of their best-superimposed atomic coordinates. RMSD is the gold standard for measuring structural similarity when the structures are nearly identical; however, it fails to detect higher-order topological similarities in proteins that have evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities.</p> <p>Results</p> <p>We measure structural similarity between proteins by correlating the principal components of their secondary structure interaction matrices. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary structure elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N-to-C-terminal sense, there are strong correlations between the principal components of the interaction matrices of structurally or topologically similar proteins.</p> <p>Conclusion</p> <p>The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of the interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.</p>
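The idea summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and it rests on several assumptions that do not come from the record: each secondary structure element is reduced to a single centroid coordinate, the interaction matrix holds plain Euclidean distances between centroids, only the leading principal component is compared, and both proteins are assumed to have the same number of elements listed in the same order. All function names (`interaction_matrix`, `leading_eigenpair`, `pcc_score`) and all coordinates are hypothetical.

```python
import math


def interaction_matrix(centroids):
    """Symmetric matrix of Euclidean distances between element centroids (assumed form)."""
    n = len(centroids)
    return [[math.dist(centroids[i], centroids[j]) for j in range(n)]
            for i in range(n)]


def _matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, vec)) for row in matrix]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = _matvec(matrix, vec)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            raise ValueError("matrix has no nonzero leading component")
        vec = [x / norm for x in product]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(v * p for v, p in zip(vec, _matvec(matrix, vec)))
    return eigenvalue, vec


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if spread_a == 0.0 or spread_b == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (spread_a * spread_b)


def pcc_score(centroids_a, centroids_b):
    """Correlate the leading principal components of two distance matrices."""
    if len(centroids_a) != len(centroids_b):
        raise ValueError("this sketch assumes equal numbers of elements")
    _, vec_a = leading_eigenpair(interaction_matrix(centroids_a))
    _, vec_b = leading_eigenpair(interaction_matrix(centroids_b))
    return pearson(vec_a, vec_b)


# Hypothetical centroids for three toy "proteins" with four elements each.
protein_a = [(0, 0, 0), (4, 0, 0), (4, 3, 0), (10, 3, 2)]
# Rigid-body copy of protein_a (90 degree rotation about z, then a shift),
# so its distance matrix matches and the score should be close to 1.
protein_b = [(1 - y, 2 + x, 3 + z) for x, y, z in protein_a]
# A differently arranged set of elements.
protein_c = [(0, 0, 0), (4, 0, 0), (0, 4, 0), (0, 0, 4)]

print(pcc_score(protein_a, protein_b))
print(pcc_score(protein_a, protein_c))
```

Because the matrix depends only on pairwise distances, the score does not change under rotation or translation and no superposition step is needed, which is where this kind of comparison differs from RMSD. The eigenvalue returned by `leading_eigenpair` could serve as the maximum-eigenvalue feature mentioned in the Conclusion, again only under the assumptions stated above.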
url http://www.biomedcentral.com/1471-2105/7/40