Distributed Similarity Queries in Metric Spaces
Abstract Similarity queries, including range queries and k nearest neighbor (kNN) queries, in metric spaces have applications in many areas such as multimedia retrieval, computational biology and location-based services. With the growing volumes of data, a distributed method is required. In this pap...
Main Authors: | Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao |
---|---|
Format: | Article |
Language: | English |
Published: | SpringerOpen, 2019-06-01 |
Series: | Data Science and Engineering |
Subjects: | Similarity query; Range query; kNN query; Metric space; Distributed processing; Algorithm |
Online Access: | http://link.springer.com/article/10.1007/s41019-019-0095-7 |
id |
doaj-8db8cd1e603d4a2c86497df3624fe126 |
---|---|
record_format |
Article |
spelling |
doaj-8db8cd1e603d4a2c86497df3624fe126; 2021-04-02T17:03:21Z; eng; SpringerOpen; Data Science and Engineering; ISSN 2364-1185, 2364-1541; 2019-06-01; vol. 4, no. 2, pp. 93-108; 10.1007/s41019-019-0095-7; Distributed Similarity Queries in Metric Spaces; Keyu Yang (College of Computer Science, Zhejiang University); Xin Ding (College of Computer Science, Zhejiang University); Yuanliang Zhang (College of Computer Science, Zhejiang University); Lu Chen (Department of Computer Science, Aalborg University); Baihua Zheng (School of Information Systems, Singapore Management University); Yunjun Gao (College of Computer Science, Zhejiang University); Abstract: Similarity queries in metric spaces, including range queries and k-nearest-neighbor (kNN) queries, have applications in many areas such as multimedia retrieval, computational biology and location-based services. With growing data volumes, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop efficient similarity search algorithms using AMDS. Extensive experiments using real and synthetic data demonstrate the performance of metric similarity queries using AMDS. Moreover, AMDS scales sublinearly with growing data size.; http://link.springer.com/article/10.1007/s41019-019-0095-7; Similarity query; Range query; kNN query; Metric space; Distributed processing; Algorithm |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Keyu Yang Xin Ding Yuanliang Zhang Lu Chen Baihua Zheng Yunjun Gao |
title |
Distributed Similarity Queries in Metric Spaces |
publisher |
SpringerOpen |
series |
Data Science and Engineering |
issn |
2364-1185 2364-1541 |
publishDate |
2019-06-01 |
description |
Abstract: Similarity queries in metric spaces, including range queries and k-nearest-neighbor (kNN) queries, have applications in many areas such as multimedia retrieval, computational biology and location-based services. With growing data volumes, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop efficient similarity search algorithms using AMDS. Extensive experiments using real and synthetic data demonstrate the performance of metric similarity queries using AMDS. Moreover, AMDS scales sublinearly with growing data size. |
topic |
Similarity query Range query kNN query Metric space Distributed processing Algorithm |
url |
http://link.springer.com/article/10.1007/s41019-019-0095-7 |
_version_ |
1721554801023516672 |
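
The abstract in this record mentions pivot mapping, range queries and kNN queries without giving details. As a reader's aid, the following is a minimal single-machine sketch of the general pivot-mapping idea, not the AMDS implementation: each object is mapped to its vector of distances to a few pivots, and the triangle inequality gives a lower bound on the true distance that lets a query skip objects without computing the metric. The function names (`pivot_map`, `range_query`, `knn_query`), the Euclidean metric and the sample data are all assumptions made for illustration; partitioning across nodes and publish/subscribe messaging are not modelled.

```python
import heapq
import math


def euclidean(a, b):
    """Example metric; any function satisfying the triangle inequality works."""
    return math.dist(a, b)


def pivot_map(obj, pivots, dist):
    """Map an object to its vector of distances to the pivots (needs >= 1 pivot)."""
    return tuple(dist(obj, p) for p in pivots)


def lower_bound(q_vec, o_vec):
    """Triangle inequality: max_i |d(q, p_i) - d(o, p_i)| <= d(q, o)."""
    return max(abs(qd - od) for qd, od in zip(q_vec, o_vec))


def range_query(query, radius, data, mapped, pivots, dist):
    """Return every object within `radius` of `query`, pruning via pivots."""
    q_vec = pivot_map(query, pivots, dist)
    result = []
    for obj, o_vec in zip(data, mapped):
        if lower_bound(q_vec, o_vec) > radius:
            continue  # cannot qualify, so the real distance is never computed
        if dist(query, obj) <= radius:
            result.append(obj)
    return result


def knn_query(query, k, data, mapped, pivots, dist):
    """Return the k nearest objects, visiting candidates by ascending lower bound."""
    q_vec = pivot_map(query, pivots, dist)
    order = sorted((lower_bound(q_vec, v), i) for i, v in enumerate(mapped))
    heap = []  # max-heap on distance, stored as (-distance, index)
    for bound, i in order:
        if len(heap) == k and bound > -heap[0][0]:
            break  # every remaining candidate is at least this far away
        d = dist(query, data[i])
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return [data[i] for _, i in sorted(heap, reverse=True)]


# Hypothetical sample data and pivots, chosen only for the demonstration.
data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 0.0), (9.0, 9.0)]
pivots = [(0.0, 0.0), (10.0, 10.0)]
mapped = [pivot_map(o, pivots, euclidean) for o in data]

in_range = range_query((1.0, 0.0), 1.5, data, mapped, pivots, euclidean)
nearest = knn_query((0.9, 0.0), 2, data, mapped, pivots, euclidean)
```

In this sketch the mapped vectors are computed once up front, so each query pays for pivot distances plus only the candidates that survive the lower-bound filter. A distributed system could, in principle, partition objects by their mapped vectors, but how AMDS does so is not stated in this record.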