Selective prediction of interaction sites in protein structures with THEMATICS

Abstract. Background: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precis...

Bibliographic Details
Main Authors: Murga Leonel F, Ko Jaeju, Wei Ying, Ondrechen Mary Jo
Format: Article
Language: English
Published: BMC 2007-04-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/119
id doaj-7d8d5b419c6145208594a5ce15636a99
record_format Article
spelling doaj-7d8d5b419c6145208594a5ce15636a99 2020-11-25T00:26:47Z eng BMC BMC Bioinformatics 1471-2105 2007-04-01 8 1 119 10.1186/1471-2105-8-119 Selective prediction of interaction sites in protein structures with THEMATICS Murga Leonel F; Ko Jaeju; Wei Ying; Ondrechen Mary Jo http://www.biomedcentral.com/1471-2105/8/119
collection DOAJ
language English
format Article
sources DOAJ
author Murga Leonel F
Ko Jaeju
Wei Ying
Ondrechen Mary Jo
title Selective prediction of interaction sites in protein structures with THEMATICS
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2007-04-01
description Abstract
Background: Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.
Results: Using a test set of 169 enzymes from the original Catalytic Residue Dataset (CatRes) it is shown that THEMATICS can deliver precise, localised site predictions. Furthermore, adjustment of the cut-off criteria can improve the recall rates for catalytic residues with only a small sacrifice in precision. Recall rates for CatRes/CSA annotated catalytic residues are 41.1%, 50.4%, and 54.2% for Z score cut-off values of 1.00, 0.99, and 0.98, respectively. The corresponding precision rates are 19.4%, 17.9%, and 16.4%. The success rate for catalytic sites is higher, with correct or partially correct predictions for 77.5%, 85.8%, and 88.2% of the enzymes in the test set, corresponding to the same respective Z score cut-offs, if only the CatRes annotations are used as the reference set. Incorporation of additional literature annotations into the reference set gives total success rates of 89.9%, 92.9%, and 94.1%, again for corresponding cut-off values of 1.00, 0.99, and 0.98. False positive rates for a 75-protein test set are 1.95%, 2.60%, and 3.12% for Z score cut-offs of 1.00, 0.99, and 0.98, respectively.
Conclusion: With a preferred cut-off value of 0.99, THEMATICS achieves a high success rate of interaction site prediction, about 86% correct or partially correct using CatRes/CSA annotations only and about 93% with an expanded reference set. Success rates for catalytic residue prediction are similar to those of other structure-based methods, but with substantially better precision and lower false positive rates. THEMATICS performs well across the spectrum of E.C. classes. The method requires only the structure of the query protein as input. THEMATICS predictions may be obtained via the web from structures in PDB format at: http://pfweb.chem.neu.edu/thematics/submit.html
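The abstract reports recall, precision, and false positive rates at the residue level. As a reading aid, the sketch below shows how such rates are conventionally computed from a set of predicted residues and a set of annotated catalytic residues. It assumes the standard textbook definitions; the paper's exact counting rules may differ. The names `ResidueCounts` and `confusion_counts` and the residue labels are hypothetical illustrations, and the toy numbers are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResidueCounts:
    """Residue-level confusion counts for one protein (hypothetical helper)."""
    true_positives: int
    false_positives: int
    false_negatives: int
    true_negatives: int


def confusion_counts(predicted: set[str], annotated: set[str],
                     all_residues: set[str]) -> ResidueCounts:
    """Compare predicted residues with annotated catalytic residues."""
    return ResidueCounts(
        true_positives=len(predicted & annotated),
        false_positives=len(predicted - annotated),
        false_negatives=len(annotated - predicted),
        true_negatives=len(all_residues - predicted - annotated),
    )


def recall(c: ResidueCounts) -> float:
    """Fraction of annotated residues that were predicted."""
    denom = c.true_positives + c.false_negatives
    return c.true_positives / denom if denom else 0.0


def precision(c: ResidueCounts) -> float:
    """Fraction of predicted residues that are annotated."""
    denom = c.true_positives + c.false_positives
    return c.true_positives / denom if denom else 0.0


def false_positive_rate(c: ResidueCounts) -> float:
    """Fraction of non-annotated residues that were predicted."""
    denom = c.false_positives + c.true_negatives
    return c.false_positives / denom if denom else 0.0


# Toy data, invented for illustration only: a 100-residue protein.
all_residues = {f"R{i}" for i in range(1, 101)}
annotated = {"R10", "R20", "R30", "R40"}
predicted = {"R10", "R20", "R55", "R56", "R57"}

counts = confusion_counts(predicted, annotated, all_residues)
print(f"recall={recall(counts):.3f}")
print(f"precision={precision(counts):.3f}")
print(f"false positive rate={false_positive_rate(counts):.5f}")
```

In this toy case, loosening a cut-off would typically enlarge `predicted`, which tends to raise recall while lowering precision, the trade-off the abstract describes across the three Z score cut-offs.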
url http://www.biomedcentral.com/1471-2105/8/119