Upsampling for Improved Multidimensional Attribute Space Clustering of Multifield Data

Clustering algorithms in high-dimensional spaces require many data points to perform reliably and robustly. For multivariate volume data, it is possible to interpolate between the data points in the high-dimensional attribute space based on their spatial relationship in the volumetric domain (or physica...

Bibliographic Details
Main Authors:	Vladimir Molchanov, Lars Linsen
Format:	Article
Language:	English
Published:	MDPI AG 2018-06-01
Series:	Information
Subjects:	multi-dimensional data visualization; multi-field data; clustering
Online Access:	http://www.mdpi.com/2078-2489/9/7/156

id	doaj-c925cced37de4291bf5b099baedd392e
record_format	Article
spelling	doaj-c925cced37de4291bf5b099baedd392e; 2020-11-24T23:14:19Z; eng; MDPI AG; Information; ISSN 2078-2489; 2018-06-01; volume 9, issue 7, article 156; DOI 10.3390/info9070156; info9070156; Upsampling for Improved Multidimensional Attribute Space Clustering of Multifield Data; Vladimir Molchanov, Lars Linsen (both: Department of Mathematics and Informatics, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany); http://www.mdpi.com/2078-2489/9/7/156; multi-dimensional data visualization; multi-field data; clustering
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Vladimir Molchanov, Lars Linsen
author_facet	Vladimir Molchanov, Lars Linsen
author_sort	Vladimir Molchanov
title	Upsampling for Improved Multidimensional Attribute Space Clustering of Multifield Data
publisher	MDPI AG
series	Information
issn	2078-2489
publishDate	2018-06-01
description	Clustering algorithms in high-dimensional spaces require many data points to perform reliably and robustly. For multivariate volume data, it is possible to interpolate between the data points in the high-dimensional attribute space based on their spatial relationship in the volumetric domain (or physical space). Thus, a sufficiently high number of data points can be generated, overcoming the curse of dimensionality for this particular type of multidimensional data. We applied this idea to a histogram-based clustering algorithm. We created a uniform partition of the attribute space into multidimensional bins and computed a histogram indicating the number of data samples belonging to each bin. Without interpolation, the analysis was highly sensitive to the histogram cell sizes, yielding inaccurate clustering for improper choices: large histogram cells result in no cluster separation, while clusters fall apart for small cells. Using interpolation in physical space, we could refine the data by generating additional samples. The depth of the refinement scheme was chosen according to the local data point distribution in attribute space and the histogram’s bin size. In the case of field discontinuities representing sharp material boundaries in the volume data, the interpolation can be adapted to locally use a nearest-neighbor interpolation scheme that avoids averaging values across the sharp boundary. Consequently, we could generate a density computation in which clusters stay connected even when using very small bin sizes. We exploited this result to create a robust hierarchical cluster tree, applied our technique to several datasets, and compared the cluster trees before and after interpolation.
topic	multi-dimensional data visualization; multi-field data; clustering
url	http://www.mdpi.com/2078-2489/9/7/156
_version_	1725595069431414784
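
The idea summarized in the description field can be illustrated with a small sketch. This is not the authors' implementation: it assumes a 1D physical domain, two synthetic smooth fields, a fixed upsampling factor (the article describes an adaptive refinement depth), and plain linear interpolation with no nearest-neighbor handling of discontinuities. All function names and parameter values are hypothetical. It counts connected groups of occupied histogram bins in attribute space before and after upsampling in physical space.

```python
from collections import deque

import numpy as np


def upsample_linear(fields, factor):
    """Insert factor - 1 linearly interpolated samples between physical neighbors.

    fields: (n, d) array, one row per grid point of a 1D physical domain
    (an assumption of this sketch), one column per attribute.
    """
    n = fields.shape[0]
    t_old = np.arange(n)
    t_new = np.linspace(0, n - 1, (n - 1) * factor + 1)
    return np.column_stack(
        [np.interp(t_new, t_old, fields[:, k]) for k in range(fields.shape[1])]
    )


def occupied_bins(samples, bins, lo, hi):
    """Uniform partition of attribute space; True where a bin holds any sample."""
    hist, _ = np.histogramdd(samples, bins=bins, range=list(zip(lo, hi)))
    return hist > 0


def count_components(occ):
    """Count connected groups of occupied bins in a 2D grid (8-neighborhood)."""
    seen = np.zeros_like(occ, dtype=bool)
    count = 0
    for start in zip(*np.nonzero(occ)):
        if seen[start]:
            continue
        count += 1
        seen[start] = True
        queue = deque([start])
        while queue:
            i, j = queue.popleft()
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < occ.shape[0] and 0 <= nj < occ.shape[1]
                    if inside and occ[ni, nj] and not seen[ni, nj]:
                        seen[ni, nj] = True
                        queue.append((ni, nj))
    return count


# Synthetic two-field data on 32 grid points (hypothetical example values).
x = np.linspace(0.0, 1.0, 32)
fields = np.column_stack([x, x**2])

bins = 128  # deliberately small bins relative to the sample spacing
lo, hi = (0.0, 0.0), (1.0, 1.0)

raw_components = count_components(occupied_bins(fields, bins, lo, hi))
refined = upsample_linear(fields, factor=16)
refined_components = count_components(occupied_bins(refined, bins, lo, hi))

print("components without upsampling:", raw_components)
print("components with upsampling:", refined_components)
```

In this toy setting the original samples land in isolated bins, so a single smooth structure falls apart, while the upsampled samples fill the bins between them. A fixed factor is used only for brevity; choosing the depth from the local sample spacing and the bin size, as the description states, would avoid generating unnecessary samples.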

Similar Items