A supervised approach to quantifying sentence similarity: with application to evidence based medicine.

Bibliographic Details
Main Authors: Hamed Hassanzadeh, Tudor Groza, Anthony Nguyen, Jane Hunter
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4454558?pdf=render
ISSN: 1932-6203
Volume: 10
Issue: 6
Article Number: e0129392
DOI: 10.1371/journal.pone.0129392
Collection: DOAJ
Record ID: doaj-8e7415a59de64e698ee3c231e775b1db
Description: Following the Evidence Based Medicine (EBM) practice, practitioners make use of the existing evidence to make therapeutic decisions. This evidence, in the form of scientific statements, is usually found in scholarly publications such as randomised control trials and systematic reviews. However, finding such information in the overwhelming amount of published material is particularly challenging. Approaches have been proposed to automatically extract scientific artefacts in EBM using standardised schemas. Our work takes this stream a step forward and looks into consolidating extracted artefacts, i.e., quantifying their degree of similarity based on the assumption that they carry the same rhetorical role. By semantically connecting key statements in the literature of EBM, practitioners are not only able to find available evidence more easily, but also can track the effects of different treatments/outcomes in a number of related studies. We devise a regression model based on a varied set of features and evaluate it both on a general English corpus (the SICK corpus), as well as on an EBM corpus (the NICTA-PIBOSO corpus). Experimental results show that our approach performs on par with the state of the art on the general English and achieves encouraging results on the biomedical text when compared against human judgement.