A supervised approach to quantifying sentence similarity: with application to evidence based medicine.
Following the Evidence Based Medicine (EBM) practice, practitioners make use of the existing evidence to make therapeutic decisions. This evidence, in the form of scientific statements, is usually found in scholarly publications such as randomised controlled trials and systematic reviews. However, find...
Main Authors: | Hamed Hassanzadeh, Tudor Groza, Anthony Nguyen, Jane Hunter |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2015-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4454558?pdf=render |
id | doaj-8e7415a59de64e698ee3c231e775b1db |
---|---|
record_format | Article |
spelling | doaj-8e7415a59de64e698ee3c231e775b1db; 2020-11-24T21:11:27Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2015-01-01; 10; 6; e0129392; 10.1371/journal.pone.0129392; A supervised approach to quantifying sentence similarity: with application to evidence based medicine.; Hamed Hassanzadeh, Tudor Groza, Anthony Nguyen, Jane Hunter; http://europepmc.org/articles/PMC4454558?pdf=render |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Hamed Hassanzadeh, Tudor Groza, Anthony Nguyen, Jane Hunter |
title | A supervised approach to quantifying sentence similarity: with application to evidence based medicine. |
publisher | Public Library of Science (PLoS) |
series | PLoS ONE |
issn | 1932-6203 |
publishDate | 2015-01-01 |
description | Following the Evidence Based Medicine (EBM) practice, practitioners make use of the existing evidence to make therapeutic decisions. This evidence, in the form of scientific statements, is usually found in scholarly publications such as randomised controlled trials and systematic reviews. However, finding such information in the overwhelming amount of published material is particularly challenging. Approaches have been proposed to automatically extract scientific artefacts in EBM using standardised schemas. Our work takes this stream a step forward and looks into consolidating extracted artefacts, i.e., quantifying their degree of similarity based on the assumption that they carry the same rhetorical role. By semantically connecting key statements in the EBM literature, practitioners can not only find available evidence more easily but also track the effects of different treatments/outcomes across a number of related studies. We devise a regression model based on a varied set of features and evaluate it both on a general English corpus (the SICK corpus) and on an EBM corpus (the NICTA-PIBOSO corpus). Experimental results show that our approach performs on par with the state of the art on general English text and achieves encouraging results on biomedical text when compared against human judgement. |
url | http://europepmc.org/articles/PMC4454558?pdf=render |