Data Processing and Text Mining Technologies on Electronic Medical Records: A Review

Currently, medical institutes generally use EMR to record patient’s condition, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR has the characteristics of diversity, incompleteness, re...


Bibliographic Details
Main Authors: Wencheng Sun, Zhiping Cai, Yangyang Li, Fang Liu, Shengqun Fang, Guoyan Wang
Format: Article
Language: English
Published: Hindawi Limited 2018-01-01
Series: Journal of Healthcare Engineering
Online Access: http://dx.doi.org/10.1155/2018/4302425
id doaj-29d38e0c378c462592c2e14f4ed8ebb9
record_format Article
affiliations Wencheng Sun: College of Computer, National University of Defense Technology, Changsha 410073, China
Zhiping Cai: College of Computer, National University of Defense Technology, Changsha 410073, China
Yangyang Li: Innovation Center, China Academy of Electronics and Information Technology, Beijing 100041, China
Fang Liu: School of Data and Computer Science, Sun Yat-sen University, Guangzhou 510006, China
Shengqun Fang: College of Computer, National University of Defense Technology, Changsha 410073, China
Guoyan Wang: Xuzhou University of Technology, Xuzhou 221002, China
collection DOAJ
language English
format Article
sources DOAJ
publisher Hindawi Limited
series Journal of Healthcare Engineering
issn 2040-2295
2040-2309
publishDate 2018-01-01
description Currently, medical institutes generally use EMR to record patient’s condition, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR has the characteristics of diversity, incompleteness, redundancy, and privacy, which make it difficult to carry out data mining and analysis directly. Therefore, it is necessary to preprocess the source data in order to improve data quality and improve the data mining results. Different types of data require different processing technologies. Most structured data commonly needs classic preprocessing technologies, including data cleansing, data integration, data transformation, and data reduction. For semistructured or unstructured data, such as medical text, containing more health information, it requires more complex and challenging processing methods. The task of information extraction for medical texts mainly includes NER (named-entity recognition) and RE (relation extraction). This paper focuses on the process of EMR processing and emphatically analyzes the key techniques. In addition, we make an in-depth study on the applications developed based on text mining together with the open challenges and research issues for future work.
url http://dx.doi.org/10.1155/2018/4302425