Classification methods for finding articles describing protein-protein interactions in PubMed

With the rapid expansion in the number of published papers in the biomedical field, finding relevant articles has become a demanding task for researchers. This has led to increasing interest in the use of text mining tools that help search the literature and identify the most relevant documents or information.

Bibliographic Details
Main Authors: Matos Sérgio, Oliveira José Luis
Format: Article
Language: English
Published: De Gruyter 2011-12-01
Series: Journal of Integrative Bioinformatics
Online Access: https://doi.org/10.1515/jib-2011-178
id doaj-d89e74b3ed34424abf3a89dc2eb7faf7
record_format Article
spelling doaj-d89e74b3ed34424abf3a89dc2eb7faf7 | 2021-09-06T19:40:31Z | eng | De Gruyter | Journal of Integrative Bioinformatics | 1613-4516 | 2011-12-01 | 8 | 3 | 118 | 129 | 10.1515/jib-2011-178 | biecoll-jib-2011-178 | Classification methods for finding articles describing protein-protein interactions in PubMed | Matos Sérgio 0 | Oliveira José Luis 1 | University of Aveiro, DETI/IEETA, Campus Universitário de Santiago, 3810-193, Aveiro, http://bioinformatics.ua.pt, Portugal | University of Aveiro, DETI/IEETA, Campus Universitário de Santiago, 3810-193, Aveiro, http://bioinformatics.ua.pt, Portugal | With the rapid expansion in the number of published papers in the biomedical field, finding relevant articles has become a demanding task for researchers. This has led to increasing interest in the use of text mining tools that help search the literature and identify the most relevant documents or information. One specific topic of interest is related to the identification of articles that might be used for extracting protein-protein interactions. Using the BioCreative III Article Classification Task dataset, composed of PubMed abstracts classified as relevant or non-relevant for describing protein-protein interactions, we compare different classification methods with different sets of features. The best results - area under the interpolated precision-recall curve of 0.654 - indicate that the proposed classification strategy could be incorporated in the database curation workflows in order to prioritize articles for extraction of protein-protein interactions. Furthermore, we also analysed the use of this method for ranking documents resulting from general PubMed queries, and propose that this approach could be useful for general researchers looking for publications describing protein-protein interactions within a particular topic of interest. | https://doi.org/10.1515/jib-2011-178
collection DOAJ
language English
format Article
sources DOAJ
author Matos Sérgio
Oliveira José Luis
spellingShingle Matos Sérgio
Oliveira José Luis
Classification methods for finding articles describing protein-protein interactions in PubMed
Journal of Integrative Bioinformatics
author_facet Matos Sérgio
Oliveira José Luis
author_sort Matos Sérgio
title Classification methods for finding articles describing protein-protein interactions in PubMed
title_short Classification methods for finding articles describing protein-protein interactions in PubMed
title_full Classification methods for finding articles describing protein-protein interactions in PubMed
title_fullStr Classification methods for finding articles describing protein-protein interactions in PubMed
title_full_unstemmed Classification methods for finding articles describing protein-protein interactions in PubMed
title_sort classification methods for finding articles describing protein-protein interactions in pubmed
publisher De Gruyter
series Journal of Integrative Bioinformatics
issn 1613-4516
publishDate 2011-12-01
description With the rapid expansion in the number of published papers in the biomedical field, finding relevant articles has become a demanding task for researchers. This has led to increasing interest in the use of text mining tools that help search the literature and identify the most relevant documents or information. One specific topic of interest is related to the identification of articles that might be used for extracting protein-protein interactions. Using the BioCreative III Article Classification Task dataset, composed of PubMed abstracts classified as relevant or non-relevant for describing protein-protein interactions, we compare different classification methods with different sets of features. The best results - area under the interpolated precision-recall curve of 0.654 - indicate that the proposed classification strategy could be incorporated in the database curation workflows in order to prioritize articles for extraction of protein-protein interactions. Furthermore, we also analysed the use of this method for ranking documents resulting from general PubMed queries, and propose that this approach could be useful for general researchers looking for publications describing protein-protein interactions within a particular topic of interest.
url https://doi.org/10.1515/jib-2011-178
work_keys_str_mv AT matossergio classificationmethodsforfindingarticlesdescribingproteinproteininteractionsinpubmed
AT oliveirajoseluis classificationmethodsforfindingarticlesdescribingproteinproteininteractionsinpubmed
_version_ 1717768290408857600