Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records

Recent advances in EHR-based (Electronic Health Record) systems have resulted in data being produced at an unprecedented rate. The complex, growing, and high-dimensional data available in EHRs create great opportunities for machine learning techniques such as clustering. Cluster analysis often require...

Bibliographic Details
Main Authors: Sheikh S. Abdullah, Neda Rostamzadeh, Kamran Sedig, Amit X. Garg, Eric McArthur
Format: Article
Language: English
Published: MDPI AG 2020-05-01
Series: Informatics
Subjects:
Online Access: https://www.mdpi.com/2227-9709/7/2/17
id doaj-7653556f0f6942f5aef077627c88c3a5
record_format Article
spelling doaj-7653556f0f6942f5aef077627c88c3a5
2020-11-25T02:54:02Z
eng
MDPI AG
Informatics
2227-9709
2020-05-01
71717
10.3390/informatics7020017
Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
Sheikh S. Abdullah (0)
Neda Rostamzadeh (1)
Kamran Sedig (2)
Amit X. Garg (3)
Eric McArthur (4)
Insight Lab, Western University, 1151 Richmond Street, London, ON N6A 3K7, Canada
Insight Lab, Western University, 1151 Richmond Street, London, ON N6A 3K7, Canada
Insight Lab, Western University, 1151 Richmond Street, London, ON N6A 3K7, Canada
Department of Medicine, Epidemiology and Biostatistics, Western University, 1151 Richmond Street, London, ON N6A 3K7, Canada
ICES, London, ON N6A 3K7, Canada
Recent advances in EHR-based (Electronic Health Record) systems have resulted in data being produced at an unprecedented rate. The complex, growing, and high-dimensional data available in EHRs create great opportunities for machine learning techniques such as clustering. Cluster analysis often requires dimension reduction to achieve efficient processing time and mitigate the curse of dimensionality. Given the wide range of techniques for dimension reduction and cluster analysis, it is not straightforward to identify which combination of techniques from both families leads to the desired result. The ability to derive useful and precise insights from EHRs requires a deeper understanding of the data, intermediary results, configuration parameters, and analysis processes. Although these tasks are often tackled separately in existing studies, we present a visual analytics (VA) system, called Visual Analytics for Cluster Analysis and Dimension Reduction of High Dimensional Electronic Health Records (VALENCIA), to address the challenges of high-dimensional EHRs in a single system. VALENCIA brings together a wide range of cluster analysis and dimension reduction techniques, integrates them seamlessly, and makes them accessible to users through interactive visualizations. It offers a balanced distribution of processing load between users and the system to facilitate the performance of high-level cognitive tasks that would be difficult without the aid of a VA system. Through a real case study, we have demonstrated how VALENCIA can be used to analyze the healthcare administrative dataset stored at ICES. This research also highlights what needs to be considered in the future when developing VA systems that are designed to derive deep and novel insights into EHRs.
https://www.mdpi.com/2227-9709/7/2/17
visual analytics
dimension reduction
cluster analysis
electronic health records
high-dimensional data
interactive visualization
collection DOAJ
language English
format Article
sources DOAJ
author Sheikh S. Abdullah
Neda Rostamzadeh
Kamran Sedig
Amit X. Garg
Eric McArthur
spellingShingle Sheikh S. Abdullah
Neda Rostamzadeh
Kamran Sedig
Amit X. Garg
Eric McArthur
Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
Informatics
visual analytics
dimension reduction
cluster analysis
electronic health records
high-dimensional data
interactive visualization
author_facet Sheikh S. Abdullah
Neda Rostamzadeh
Kamran Sedig
Amit X. Garg
Eric McArthur
author_sort Sheikh S. Abdullah
title Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
title_short Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
title_full Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
title_fullStr Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
title_full_unstemmed Visual Analytics for Dimension Reduction and Cluster Analysis of High Dimensional Electronic Health Records
title_sort visual analytics for dimension reduction and cluster analysis of high dimensional electronic health records
publisher MDPI AG
series Informatics
issn 2227-9709
publishDate 2020-05-01
description Recent advances in EHR-based (Electronic Health Record) systems have resulted in data being produced at an unprecedented rate. The complex, growing, and high-dimensional data available in EHRs create great opportunities for machine learning techniques such as clustering. Cluster analysis often requires dimension reduction to achieve efficient processing time and mitigate the curse of dimensionality. Given the wide range of techniques for dimension reduction and cluster analysis, it is not straightforward to identify which combination of techniques from both families leads to the desired result. The ability to derive useful and precise insights from EHRs requires a deeper understanding of the data, intermediary results, configuration parameters, and analysis processes. Although these tasks are often tackled separately in existing studies, we present a visual analytics (VA) system, called Visual Analytics for Cluster Analysis and Dimension Reduction of High Dimensional Electronic Health Records (VALENCIA), to address the challenges of high-dimensional EHRs in a single system. VALENCIA brings together a wide range of cluster analysis and dimension reduction techniques, integrates them seamlessly, and makes them accessible to users through interactive visualizations. It offers a balanced distribution of processing load between users and the system to facilitate the performance of high-level cognitive tasks that would be difficult without the aid of a VA system. Through a real case study, we have demonstrated how VALENCIA can be used to analyze the healthcare administrative dataset stored at ICES. This research also highlights what needs to be considered in the future when developing VA systems that are designed to derive deep and novel insights into EHRs.
topic visual analytics
dimension reduction
cluster analysis
electronic health records
high-dimensional data
interactive visualization
url https://www.mdpi.com/2227-9709/7/2/17
work_keys_str_mv AT sheikhsabdullah visualanalyticsfordimensionreductionandclusteranalysisofhighdimensionalelectronichealthrecords
AT nedarostamzadeh visualanalyticsfordimensionreductionandclusteranalysisofhighdimensionalelectronichealthrecords
AT kamransedig visualanalyticsfordimensionreductionandclusteranalysisofhighdimensionalelectronichealthrecords
AT amitxgarg visualanalyticsfordimensionreductionandclusteranalysisofhighdimensionalelectronichealthrecords
AT ericmcarthur visualanalyticsfordimensionreductionandclusteranalysisofhighdimensionalelectronichealthrecords
_version_ 1724722951947812864