Unsupervised Method for Disease Named Entity Recognition

Diseases play a central role in biomedical research; many studies aim to enable access to disease information by designing named entity recognition models that make use of the available information. Disease recognition is a problem that has been tackled by various approaches, of which the most famous...

Bibliographic Details
Main Author: Almutairi, Abeer N.
Other Authors: Hoehndorf, Robert
Language: en
Published: 2019
Subjects:
NER
Online Access: Almutairi, A. N. (2019). Unsupervised Method for Disease Named Entity Recognition. KAUST Research Repository. https://doi.org/10.25781/KAUST-K5387
http://hdl.handle.net/10754/659966
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-659966
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-659966 2021-02-20T05:10:56Z Unsupervised Method for Disease Named Entity Recognition Almutairi, Abeer N. Hoehndorf, Robert Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division Moshkov, Mikhail Laleg-Kirati, Taous-Meriem Text Mining NER Name Entity Recognition Disease Name Diseases play a central role in biomedical research; many studies aim to enable access to disease information by designing named entity recognition models that make use of the available information. Disease recognition is a problem that has been tackled by various approaches, of which the most famous are the lexical and supervised approaches. However, these approaches have many drawbacks, as their performance is affected by the amount of human-annotated data available. Moreover, lexical approaches cannot distinguish between real mentions of diseases and mentions of other entities that share the same name or acronym. The challenge of this project is to find a model that can combine the strengths of the lexical and supervised approaches to design a named entity recognizer. We demonstrate that our model can accurately identify disease name mentions in text by using word embeddings to capture the context of each mention, which enables the model to distinguish whether it is a real disease mention or not. We evaluate our model using a gold standard data set, on which it showed a high precision of 84% and an accuracy of 96%. Finally, we compare the performance of our model to different statistical named entity recognition models, and the results show that our model outperforms the unsupervised lexical approaches. 2019-11-11T10:42:51Z 2019-11-11T10:42:51Z 2019-11-06 Thesis Almutairi, A. N. (2019). Unsupervised Method for Disease Named Entity Recognition. KAUST Research Repository. https://doi.org/10.25781/KAUST-K5387 10.25781/KAUST-K5387 http://hdl.handle.net/10754/659966 en
collection NDLTD
language en
sources NDLTD
topic Text Mining
NER
Name Entity Recognition
Disease Name
description Diseases play a central role in biomedical research; many studies aim to enable access to disease information by designing named entity recognition models that make use of the available information. Disease recognition is a problem that has been tackled by various approaches, of which the most famous are the lexical and supervised approaches. However, these approaches have many drawbacks, as their performance is affected by the amount of human-annotated data available. Moreover, lexical approaches cannot distinguish between real mentions of diseases and mentions of other entities that share the same name or acronym. The challenge of this project is to find a model that can combine the strengths of the lexical and supervised approaches to design a named entity recognizer. We demonstrate that our model can accurately identify disease name mentions in text by using word embeddings to capture the context of each mention, which enables the model to distinguish whether it is a real disease mention or not. We evaluate our model using a gold standard data set, on which it showed a high precision of 84% and an accuracy of 96%. Finally, we compare the performance of our model to different statistical named entity recognition models, and the results show that our model outperforms the unsupervised lexical approaches.
author2 Hoehndorf, Robert
author_facet Hoehndorf, Robert
Almutairi, Abeer N.
author Almutairi, Abeer N.
author_sort Almutairi, Abeer N.
title Unsupervised Method for Disease Named Entity Recognition
publishDate 2019