Improved mutation tagging with gene identifiers applied to membrane protein stability prediction

Background: The automated retrieval and integration of information about protein point mutations in combination with structure, domain and interaction data from literature and databases promises to be a valuable approach to study structure-function relationships in biomedical data sets. Results: We d...


Bibliographic Details
Main Authors: Schröder, Michael, Winnenburg, Rainer, Plake, Conrad
Other Authors: BioMed Central
Format: Article
Language: English
Published: Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden 2015
Subjects:
Online Access: http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-177379
http://www.qucosa.de/fileadmin/data/qucosa/documents/17737/1471-2105-10-S8-S3.pdf
id ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-177379
record_format oai_dc
spelling ndltd-DRESDEN-oai-qucosa.de-bsz-14-qucosa-177379 2016-07-16T03:30:20Z Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden BioMed Central, 2015-10-27 doc-type:article application/pdf urn:nbn:de:bsz:14-qucosa-177379 issn:1471-2105 PPN474236560 BMC Bioinformatics 2009, 10(Suppl 8):S3, ISSN 1471-2105 eng
collection NDLTD
language English
format Article
sources NDLTD
topic Biotechnologie
Bioinformatik
biotechnology
bioinformatics
ddc:610
ddc:004
ddc:570
rvk:XA 10000
description Background: The automated retrieval and integration of information about protein point mutations, in combination with structure, domain and interaction data from literature and databases, promises to be a valuable approach to studying structure-function relationships in biomedical data sets. Results: We developed a rule- and regular-expression-based protein point mutation retrieval pipeline for PubMed abstracts, which achieves an F-measure of 87% for the mutation retrieval task on a benchmark dataset. To link mutations to their proteins, we use a named entity recognition algorithm to identify gene names co-occurring in the abstract, and establish links based on sequence checks. Conversely, we show that gene recognition improves from 77% to 91% F-measure when the mutation information given in the text is taken into account. To demonstrate practical relevance, we use mutation information from text to evaluate a novel solvation-energy-based model for the prediction of stabilizing regions in membrane proteins. For five G protein-coupled receptors, we identified 35 relevant single mutations and associated phenotypes, none of which had been annotated in the UniProt or PDB databases. In 71% of cases, the reported phenotypes were in compliance with the model predictions, supporting a relation between mutations and stability issues in membrane proteins. Conclusion: We present a reliable approach for the retrieval of protein mutations from PubMed abstracts for any set of genes or proteins of interest. We further demonstrate how amino acid substitution information from text can be used for protein structure stability studies on the basis of a novel energy model.
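note The abstract above describes a regular-expression-based mutation retrieval step followed by sequence checks that link each mutation to a co-occurring gene. The record does not give the authors' actual rules, so the following is only a minimal sketch of the general idea. The patterns (only "A123T" and "Ala123Thr" style mentions), the function names find_mutations, sequence_check and link_mutations, and the toy sequences are all assumptions made for illustration.

```python
import re

# Standard 20 amino acids, three-letter to one-letter code.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE = "ACDEFGHIKLMNPQRSTVWY"
THREE = "|".join(THREE_TO_ONE)

# Assumed patterns (not the authors' rules): "A123T" or "Ala123Thr".
PATTERN = re.compile(
    rf"\b(?:([{ONE}])(\d+)([{ONE}])|({THREE})(\d+)({THREE}))\b"
)


def find_mutations(text):
    """Return (wild_type, position, mutant) tuples in one-letter code."""
    hits = []
    for m in PATTERN.finditer(text):
        if m.group(1):  # one-letter form matched
            wt, pos, mut = m.group(1), m.group(2), m.group(3)
        else:  # three-letter form matched
            wt = THREE_TO_ONE[m.group(4)]
            pos = m.group(5)
            mut = THREE_TO_ONE[m.group(6)]
        hits.append((wt, int(pos), mut))
    return hits


def sequence_check(mutation, sequence):
    """True if the wild-type residue sits at the 1-based position."""
    wt, pos, _ = mutation
    return 1 <= pos <= len(sequence) and sequence[pos - 1] == wt


def link_mutations(text, candidates):
    """Map each mutation to the candidate gene IDs whose sequence fits.

    `candidates` is a hypothetical dict of gene ID -> protein sequence,
    standing in for the genes recognized in the same abstract.
    """
    return {
        mut: [gid for gid, seq in candidates.items() if sequence_check(mut, seq)]
        for mut in find_mutations(text)
    }


if __name__ == "__main__":
    abstract = "The A3T and Gly5Asp substitutions reduced binding."
    genes = {"GENE1": "MKAVGL", "GENE2": "MKTVGL"}  # toy sequences
    print(link_mutations(abstract, genes))
```

In this sketch a mutation is linked to a gene only if the stated wild-type residue is found at the stated position of that gene's sequence. A pattern this simple will also match non-mutation tokens of the same shape (for example some cell line names), which is one reason a real pipeline would need additional rules.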
author2 BioMed Central
author Schröder, Michael
Winnenburg, Rainer
Plake, Conrad
author_sort Schröder, Michael
title Improved mutation tagging with gene identifiers applied to membrane protein stability prediction
publisher Saechsische Landesbibliothek- Staats- und Universitaetsbibliothek Dresden
publishDate 2015
url http://nbn-resolving.de/urn:nbn:de:bsz:14-qucosa-177379
http://www.qucosa.de/fileadmin/data/qucosa/documents/17737/1471-2105-10-S8-S3.pdf