Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.
Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering pr...
Main Authors: | Romeo Rizzi, Pritha Mahata, Luke Mathieson, Pablo Moscato |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2010-12-01 |
Series: | PLoS ONE |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21151943/?tool=EBI |
id |
doaj-3d1c06c6ca0b43f78298ded6d8916924 |
---|---|
record_format |
Article |
spelling |
doaj-3d1c06c6ca0b43f78298ded6d8916924 2021-03-04T02:13:00Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2010-12-01 5 12 e14067 10.1371/journal.pone.0014067 Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments. Romeo Rizzi, Pritha Mahata, Luke Mathieson, Pablo Moscato. Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcomes difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities. https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21151943/?tool=EBI |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Romeo Rizzi Pritha Mahata Luke Mathieson Pablo Moscato |
spellingShingle |
Romeo Rizzi Pritha Mahata Luke Mathieson Pablo Moscato Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments. PLoS ONE |
author_facet |
Romeo Rizzi Pritha Mahata Luke Mathieson Pablo Moscato |
author_sort |
Romeo Rizzi |
title |
Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments. |
title_sort |
hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2010-12-01 |
description |
Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains, with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial time algorithm (even for approximation), exact parameterized and local search based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcomes difficulties other hierarchical methods have in representing both intercluster differences and intracluster similarities. |
url |
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21151943/?tool=EBI |
work_keys_str_mv |
AT romeorizzi hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments AT prithamahata hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments AT lukemathieson hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments AT pablomoscato hierarchicalclusteringusingthearithmeticharmoniccutcomplexityandexperiments |
_version_ |
1714808869853593600 |