Hierarchical clustering using the arithmetic-harmonic cut: complexity and experiments.

Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering pr...

Bibliographic Details
Main Authors: Romeo Rizzi, Pritha Mahata, Luke Mathieson, Pablo Moscato
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2010-12-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21151943/?tool=EBI
id doaj-3d1c06c6ca0b43f78298ded6d8916924
record_format Article
spelling doaj-3d1c06c6ca0b43f78298ded6d8916924; 2021-03-04T02:13:00Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2010-12-01; 5(12): e14067; 10.1371/journal.pone.0014067
collection DOAJ
issn 1932-6203
description Clustering, particularly hierarchical clustering, is an important method for understanding and analysing data across a wide variety of knowledge domains, with notable utility in systems where the data can be classified in an evolutionary context. This paper introduces a new hierarchical clustering problem defined by a novel objective function we call the arithmetic-harmonic cut. We show that the problem of finding such a cut is NP-hard and APX-hard but is fixed-parameter tractable, which indicates that although the problem is unlikely to have a polynomial-time algorithm (even for approximation), exact parameterized and local-search-based techniques may produce workable algorithms. To this end, we implement a memetic algorithm for the problem and demonstrate the effectiveness of the arithmetic-harmonic cut on a number of datasets, including a cancer type dataset and a corona virus dataset. We show favorable performance compared to currently used hierarchical clustering techniques such as k-Means, Graclus and Normalized-Cut. The arithmetic-harmonic cut metric overcomes difficulties that other hierarchical methods have in representing both intercluster differences and intracluster similarities.