Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks

Abstract. Background: Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks. The advent of electronic medical records (EMRs) has...

Bibliographic Details
Main Authors: Xiaozheng Li, Huazhen Wang, Huixin He, Jixiang Du, Jian Chen, Jinzhun Wu
Format: Article
Language: English
Published: BMC 2019-02-01
Series: BMC Bioinformatics
Subjects: Chinese electronic medical records; Convolutional neural networks; Natural language processing
Online Access: http://link.springer.com/article/10.1186/s12859-019-2617-8
id doaj-005945522e2d4febb0b4a67ed11517fa
record_format Article
spelling doaj-005945522e2d4febb0b4a67ed11517fa
2020-11-25T01:59:04Z
eng
BMC
BMC Bioinformatics
1471-2105
2019-02-01
20 1 1 12
10.1186/s12859-019-2617-8
Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks
Xiaozheng Li, College of Computer Science and Technology, Huaqiao University
Huazhen Wang, College of Computer Science and Technology, Huaqiao University
Huixin He, College of Computer Science and Technology, Huaqiao University
Jixiang Du, College of Computer Science and Technology, Huaqiao University
Jian Chen, Research Department, Zhiye software
Jinzhun Wu, Pediatric Department, The First Affiliated Hospital of Xiamen University
Abstract
Background: Benefiting from big data, powerful computation and new algorithmic techniques, we have been witnessing the renaissance of deep learning, particularly the combination of natural language processing (NLP) and deep neural networks. The advent of electronic medical records (EMRs) has not only changed the format of medical records but also helped users to obtain information faster. However, there are many challenges regarding researching directly using Chinese EMRs, such as low quality, huge quantity, imbalance, semi-structure and non-structure, particularly the high density of the Chinese language compared with English. Therefore, effective word segmentation, word representation and model architecture are the core technologies in the literature on Chinese EMRs.
Results: In this paper, we propose a deep learning framework to study intelligent diagnosis using Chinese EMR data, which incorporates a convolutional neural network (CNN) into an EMR classification application. The novelty of this paper is reflected in the following: (1) We construct a pediatric medical dictionary based on Chinese EMRs. (2) Word2vec adopted in word embedding is used to achieve the semantic description of the content of Chinese EMRs. (3) A fine-tuning CNN model is constructed to feed the pediatric diagnosis with Chinese EMR data. Our results on real-world pediatric Chinese EMRs demonstrate that the average accuracy and F1-score of the CNN models are up to 81%, which indicates the effectiveness of the CNN model for the classification of EMRs. Particularly, a fine-tuning one-layer CNN performs best among all CNNs, recurrent neural network (RNN) (long short-term memory, gated recurrent unit) and CNN-RNN models, and the average accuracy and F1-score are both up to 83%.
Conclusion: The CNN framework that includes word segmentation, word embedding and model training can serve as an intelligent auxiliary diagnosis tool for pediatricians. Particularly, a fine-tuning one-layer CNN performs well, which indicates that word order does not appear to have a useful effect on our Chinese EMRs.
http://link.springer.com/article/10.1186/s12859-019-2617-8
Chinese electronic medical records
Convolutional neural networks
Natural language processing
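
The abstract names a one-layer CNN over word embeddings as the best-performing model but gives no architectural details in this record. As a rough illustration only, the forward pass of a generic one-layer text CNN might look like the sketch below. Everything specific here is an assumption rather than the authors' configuration: the filter widths (2, 3, 4), the embedding size, the vocabulary size, the three output classes, and the random weights, which stand in for the pretrained word2vec vectors and trained filters. The function names are hypothetical.

```python
import math
import random


def conv1d_max_pool(embedded, filt, bias):
    """Slide one filter over consecutive word windows, apply ReLU, keep the max."""
    width = len(filt)  # number of consecutive words the filter spans
    best = 0.0  # ReLU outputs are never below zero; also the result for short inputs
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            word_vec = embedded[start + offset]
            total += sum(w * x for w, x in zip(filt[offset], word_vec))
        best = max(best, max(0.0, total))
    return best


def softmax(scores):
    """Convert raw class scores into probabilities (shifted for numerical stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def text_cnn_forward(token_ids, params):
    """Embedding lookup -> one convolution layer with max pooling -> dense -> softmax."""
    embedded = [params["embedding"][t] for t in token_ids]
    features = [
        conv1d_max_pool(embedded, filt, bias)
        for filt, bias in zip(params["filters"], params["filter_bias"])
    ]
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(params["dense"], params["dense_bias"])
    ]
    return softmax(scores)


def random_params(vocab_size, embed_dim, filter_widths, num_classes, seed=0):
    """Random weights for illustration; a real model would load trained values."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    filters = [mat(width, embed_dim) for width in filter_widths]
    return {
        "embedding": mat(vocab_size, embed_dim),
        "filters": filters,
        "filter_bias": [0.0] * len(filters),
        "dense": mat(num_classes, len(filters)),
        "dense_bias": [0.0] * num_classes,
    }


# Hypothetical sizes and token ids for one segmented record.
params = random_params(vocab_size=20, embed_dim=4, filter_widths=[2, 3, 4], num_classes=3)
record = [3, 7, 7, 1, 12, 5]
probs = text_cnn_forward(record, params)
print(probs)
```

Max-over-time pooling keeps only the strongest response of each filter, wherever in the record it occurs, so a model of this shape is sensitive to word order only inside each filter window. That property is at least compatible with the abstract's remark about word order, though the record does not say this is the authors' explanation.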
collection DOAJ
language English
format Article
sources DOAJ
author Xiaozheng Li
Huazhen Wang
Huixin He
Jixiang Du
Jian Chen
Jinzhun Wu
title Intelligent diagnosis with Chinese electronic medical records based on convolutional neural networks
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2019-02-01
topic Chinese electronic medical records
Convolutional neural networks
Natural language processing
url http://link.springer.com/article/10.1186/s12859-019-2617-8