Efficiency of random swap clustering
Abstract The random swap algorithm aims at solving clustering by a sequence of prototype swaps, and by fine-tuning their exact location by k-means. This randomized search strategy is simple to implement and efficient. It reaches good quality clustering relatively fast, and if iterated longer, it finds t...
Main Author: | Pasi Fränti |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen 2018-03-01 |
Series: | Journal of Big Data |
Subjects: | Clustering; Random swap; K-means; Local search; Efficiency |
Online Access: | http://link.springer.com/article/10.1186/s40537-018-0122-y |
id |
doaj-d0761c817d8f436fbc4a5d5b2e0dbc0a |
---|---|
record_format |
Article |
spelling |
doaj-d0761c817d8f436fbc4a5d5b2e0dbc0a; 2020-11-25T00:01:32Z; eng; SpringerOpen; Journal of Big Data; 2196-1115; 2018-03-01; 51129; 10.1186/s40537-018-0122-y; Efficiency of random swap clustering; Pasi Fränti (Machine Learning Group, School of Computing, University of Eastern Finland); Abstract The random swap algorithm aims at solving clustering by a sequence of prototype swaps, and by fine-tuning their exact location by k-means. This randomized search strategy is simple to implement and efficient. It reaches good quality clustering relatively fast, and if iterated longer, it finds the correct clustering with high probability. In this paper, we analyze the expected number of iterations needed to find the correct clustering. Using this result, we derive the expected time complexity of the random swap algorithm. The main results are that the expected time complexity has (1) linear dependency on the number of data vectors, (2) quadratic dependency on the number of clusters, and (3) inverse dependency on the size of the neighborhood. Experiments also show that the algorithm is clearly more efficient than k-means and almost never gets stuck in an inferior local minimum.; http://link.springer.com/article/10.1186/s40537-018-0122-y; Clustering; Random swap; K-means; Local search; Efficiency |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Pasi Fränti |
title |
Efficiency of random swap clustering |
publisher |
SpringerOpen |
series |
Journal of Big Data |
issn |
2196-1115 |
publishDate |
2018-03-01 |
description |
Abstract The random swap algorithm aims at solving clustering by a sequence of prototype swaps, and by fine-tuning their exact location by k-means. This randomized search strategy is simple to implement and efficient. It reaches good quality clustering relatively fast, and if iterated longer, it finds the correct clustering with high probability. In this paper, we analyze the expected number of iterations needed to find the correct clustering. Using this result, we derive the expected time complexity of the random swap algorithm. The main results are that the expected time complexity has (1) linear dependency on the number of data vectors, (2) quadratic dependency on the number of clusters, and (3) inverse dependency on the size of the neighborhood. Experiments also show that the algorithm is clearly more efficient than k-means and almost never gets stuck in an inferior local minimum. |
topic |
Clustering Random swap K-means Local search Efficiency |
url |
http://link.springer.com/article/10.1186/s40537-018-0122-y |
work_keys_str_mv |
AT pasifranti efficiencyofrandomswapclustering |
_version_ |
1725441589635973120 |
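
The search strategy summarized in the abstract (swap one prototype at random, fine-tune with k-means, keep the result only if it improves) can be sketched roughly as follows. This is an illustrative sketch, not the author's implementation: the function names, the sum-of-squared-errors objective, the default of two k-means iterations per swap, and the fixed swap budget are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(data, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in data]


def update(data, labels, centroids):
    """Move each centroid to the mean of its points; empty clusters stay put."""
    dim = len(data[0])
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(data, labels) if lab == j]
        if members:
            new_centroids.append(
                tuple(sum(p[d] for p in members) / len(members) for d in range(dim)))
        else:
            new_centroids.append(old)
    return new_centroids


def sse(data, labels, centroids):
    """Sum of squared errors (assumed objective for this sketch)."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(data, labels))


def kmeans(data, centroids, iters):
    """Run a fixed number of k-means iterations, then re-assign labels."""
    for _ in range(iters):
        labels = assign(data, centroids)
        centroids = update(data, labels, centroids)
    return centroids, assign(data, centroids)


def random_swap(data, k, swaps=200, kmeans_iters=2, seed=0):
    """Hypothetical random swap loop: trial swap, k-means fine-tuning, accept if better."""
    rng = random.Random(seed)
    centroids, labels = kmeans(data, rng.sample(data, k), kmeans_iters)
    best = sse(data, labels, centroids)
    for _ in range(swaps):
        trial = list(centroids)
        # Replace a randomly chosen prototype with a randomly chosen data point.
        trial[rng.randrange(k)] = rng.choice(data)
        trial, trial_labels = kmeans(data, trial, kmeans_iters)
        cost = sse(data, trial_labels, trial)
        if cost < best:  # keep the swap only when the objective improves
            centroids, labels, best = trial, trial_labels, cost
    return centroids, labels, best
```

Calling `random_swap(points, k=3)` on a list of coordinate tuples returns the final prototypes, the point labels, and the objective value. Because a swap is accepted only when it lowers the objective, the result is never worse than the k-means solution it starts from; how many swaps are needed in practice is the question the article analyzes.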