An Improved Method for Named Entity Recognition and Its Application to CEMR
Named Entity Recognition (NER) on Clinical Electronic Medical Records (CEMR) is a fundamental step in extracting disease knowledge by identifying specific entity terms such as diseases, symptoms, etc. However, the state-of-the-art NER methods based on Long Short-Term Memory (LSTM) fail to exploit GP...
Main Authors: | Ming Gao, Qifeng Xiao, Shaochun Wu, Kun Deng |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2019-08-01 |
Series: | Future Internet |
Subjects: | clinical electronic records, named entity recognition, convolutional neural network |
Online Access: | https://www.mdpi.com/1999-5903/11/9/185 |
id |
doaj-8728ed63a0e743418a87891d72b59427 |
---|---|
record_format |
Article |
spelling |
doaj-8728ed63a0e743418a87891d72b59427; 2020-11-25T01:35:11Z; eng; MDPI AG; Future Internet; 1999-5903; 2019-08-01; 11; 9; 185; 10.3390/fi11090185; fi11090185; An Improved Method for Named Entity Recognition and Its Application to CEMR; Ming Gao; Qifeng Xiao; Shaochun Wu; Kun Deng; Department of Intelligent Information Processing, Shanghai University, Shanghai 200444, China; Department of Intelligent Information Processing, Shanghai University, Shanghai 200444, China; Department of Intelligent Information Processing, Shanghai University, Shanghai 200444, China; Department of Intelligent Information Processing, Shanghai University, Shanghai 200444, China; Named Entity Recognition (NER) on Clinical Electronic Medical Records (CEMR) is a fundamental step in extracting disease knowledge by identifying specific entity terms such as diseases, symptoms, etc. However, the state-of-the-art NER methods based on Long Short-Term Memory (LSTM) fail to exploit GPU parallelism fully under the massive medical records. Although a novel NER method based on Iterated Dilated CNNs (ID-CNNs) can accelerate network computing, it tends to ignore the word-order feature and semantic information of the current word. In order to enhance the performance of ID-CNNs-based models on NER tasks, an attention-based ID-CNNs-CRF model, which combines the word-order feature and local context, is proposed. Firstly, position embedding is utilized to fuse word-order information. Secondly, the ID-CNNs architecture is used to extract global semantic information rapidly. Simultaneously, the attention mechanism is employed to pay attention to the local context. Finally, we apply the CRF to obtain the optimal tag sequence. Experiments conducted on two CEMR datasets show that our model outperforms traditional ones. The F1-scores of 94.55% and 91.17% are obtained respectively on these two datasets, and both are better than LSTM-based models.; https://www.mdpi.com/1999-5903/11/9/185; clinical electronic records; named entity recognition; convolutional neural network |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ming Gao, Qifeng Xiao, Shaochun Wu, Kun Deng |
spellingShingle |
Ming Gao Qifeng Xiao Shaochun Wu Kun Deng An Improved Method for Named Entity Recognition and Its Application to CEMR Future Internet clinical electronic records named entity recognition convolutional neural network |
author_facet |
Ming Gao, Qifeng Xiao, Shaochun Wu, Kun Deng |
author_sort |
Ming Gao |
title |
An Improved Method for Named Entity Recognition and Its Application to CEMR |
title_short |
An Improved Method for Named Entity Recognition and Its Application to CEMR |
title_full |
An Improved Method for Named Entity Recognition and Its Application to CEMR |
title_fullStr |
An Improved Method for Named Entity Recognition and Its Application to CEMR |
title_full_unstemmed |
An Improved Method for Named Entity Recognition and Its Application to CEMR |
title_sort |
improved method for named entity recognition and its application to cemr |
publisher |
MDPI AG |
series |
Future Internet |
issn |
1999-5903 |
publishDate |
2019-08-01 |
description |
Named Entity Recognition (NER) on Clinical Electronic Medical Records (CEMR) is a fundamental step in extracting disease knowledge by identifying specific entity terms such as diseases, symptoms, etc. However, the state-of-the-art NER methods based on Long Short-Term Memory (LSTM) fail to exploit GPU parallelism fully under the massive medical records. Although a novel NER method based on Iterated Dilated CNNs (ID-CNNs) can accelerate network computing, it tends to ignore the word-order feature and semantic information of the current word. In order to enhance the performance of ID-CNNs-based models on NER tasks, an attention-based ID-CNNs-CRF model, which combines the word-order feature and local context, is proposed. Firstly, position embedding is utilized to fuse word-order information. Secondly, the ID-CNNs architecture is used to extract global semantic information rapidly. Simultaneously, the attention mechanism is employed to pay attention to the local context. Finally, we apply the CRF to obtain the optimal tag sequence. Experiments conducted on two CEMR datasets show that our model outperforms traditional ones. The F1-scores of 94.55% and 91.17% are obtained respectively on these two datasets, and both are better than LSTM-based models. |
topic |
clinical electronic records, named entity recognition, convolutional neural network |
url |
https://www.mdpi.com/1999-5903/11/9/185 |
work_keys_str_mv |
AT minggao animprovedmethodfornamedentityrecognitionanditsapplicationtocemr AT qifengxiao animprovedmethodfornamedentityrecognitionanditsapplicationtocemr AT shaochunwu animprovedmethodfornamedentityrecognitionanditsapplicationtocemr AT kundeng animprovedmethodfornamedentityrecognitionanditsapplicationtocemr AT minggao improvedmethodfornamedentityrecognitionanditsapplicationtocemr AT qifengxiao improvedmethodfornamedentityrecognitionanditsapplicationtocemr AT shaochunwu improvedmethodfornamedentityrecognitionanditsapplicationtocemr AT kundeng improvedmethodfornamedentityrecognitionanditsapplicationtocemr |
_version_ |
1725067988769439744 |
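
Illustrative note: the description above says a CRF is applied "to obtain the optimal tag sequence". The record gives no implementation details, so the following is only a minimal sketch of the standard Viterbi decoding step commonly used with linear-chain CRFs, not the authors' code. The function name `viterbi_decode`, the three-tag scheme (B, I, O), and all score values are hypothetical and chosen purely for illustration; in a model like the one described, the emission scores would come from the encoder.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence for one sentence.

    emissions[t][j]   -- score of tag j at token t (hypothetical encoder output)
    transitions[i][j] -- score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] is the best score of any path that ends in tag j at the current token.
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the path score into tag j.
            prev = max(range(num_tags), key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the back-pointers to the first token.
    tag = max(range(num_tags), key=lambda j: best[j])
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    path.reverse()
    return path


# Hypothetical tag set and scores, for illustration only.
TAGS = ["O", "B", "I"]
emissions = [[2.0, 1.0, 0.0],
             [0.0, 0.5, 1.0],
             [1.5, 0.0, 0.5]]
transitions = [[0.0, 0.0, -10.0],   # O -> I is heavily penalized
               [0.0, -1.0, 1.0],    # B -> I is encouraged
               [0.0, 0.0, 0.5]]

decoded = viterbi_decode(emissions, transitions)
print([TAGS[i] for i in decoded])  # ['B', 'I', 'O']
```

In this toy input, picking the best tag per token independently would give O, I, O, which starts an entity with an inside tag. The transition scores steer the decoder to B, I, O instead, which is the kind of sequence-level consistency a CRF output layer is typically used for.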