pyDRMetrics - A Python toolkit for dimensionality reduction quality assessment

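High-dimensional data are pervasive in the big data era. To mitigate the curse of dimensionality, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic comparison and assessment of DR quality, this paper reviews related metrics and develops an open-source Python package, pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package's functions. A web GUI wrapper is also published to support user-friendly browser/server (B/S) applications.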


Bibliographic Details
Main Authors: Yinsheng Zhang, Qian Shang, Guoming Zhang
Format: Article
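Language: English
Published: Elsevier 2021-02-01
Series: Heliyon
Subjects: Dimensionality reduction; Reconstruction error; Distance matrix; Co-ranking matrix; Co-k-nearest neighbor
Online Access: http://www.sciencedirect.com/science/article/pii/S2405844021003042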
id doaj-d914d05357e948958afaeffb7bb8a613
record_format Article
spelling doaj-d914d05357e948958afaeffb7bb8a613
2021-03-03T04:24:03Z
eng
Elsevier
Heliyon, ISSN 2405-8440, 2021-02-01, vol. 7, no. 2, e06199
pyDRMetrics - A Python toolkit for dimensionality reduction quality assessment
Yinsheng Zhang (0): School of Management and E-Business, Zhejiang Gongshang University, Hangzhou 310018, China; School of Information Sciences, University of Illinois at Urbana Champaign, Champaign, IL 61820-6211, USA; Corresponding author.
Qian Shang (1): School of Management, Hangzhou Dianzi University, Hangzhou 310018, China; School of Information Sciences, University of Illinois at Urbana Champaign, Champaign, IL 61820-6211, USA
Guoming Zhang (2): Pediatric Retinal Surgery Department, Shenzhen Eye Hospital, Shenzhen 518040, China; Shenzhen Key Ophthalmic Laboratory, The Second Affiliated Hospital of Jinan University, Shenzhen 518040, China; Corresponding author.
High-dimensional data are pervasive in the big data era. To mitigate the curse of dimensionality, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic comparison and assessment of DR quality, this paper reviews related metrics and develops an open-source Python package, pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package's functions. A web GUI wrapper is also published to support user-friendly browser/server (B/S) applications.
http://www.sciencedirect.com/science/article/pii/S2405844021003042
Dimensionality reduction
Reconstruction error
Distance matrix
Co-ranking matrix
Co-k-nearest neighbor
collection DOAJ
language English
format Article
sources DOAJ
author Yinsheng Zhang
Qian Shang
Guoming Zhang
title pyDRMetrics - A Python toolkit for dimensionality reduction quality assessment
publisher Elsevier
series Heliyon
issn 2405-8440
publishDate 2021-02-01
description High-dimensional data are pervasive in the big data era. To mitigate the curse of dimensionality, various dimensionality reduction (DR) algorithms have been proposed. To facilitate systematic comparison and assessment of DR quality, this paper reviews related metrics and develops an open-source Python package, pyDRMetrics. Supported metrics include reconstruction error, distance matrix, residual variance, ranking matrix, co-ranking matrix, trustworthiness, continuity, co-k-nearest neighbor size, LCMC (local continuity meta criterion), and rank-based local/global properties. pyDRMetrics provides a native Python class and a web-oriented API. A case study of mass spectra is conducted to demonstrate the package's functions. A web GUI wrapper is also published to support user-friendly browser/server (B/S) applications.
topic Dimensionality reduction
Reconstruction error
Distance matrix
Co-ranking matrix
Co-k-nearest neighbor
url http://www.sciencedirect.com/science/article/pii/S2405844021003042