Distributed Similarity Queries in Metric Spaces

Bibliographic Details
Main Authors: Keyu Yang, Xin Ding, Yuanliang Zhang, Lu Chen, Baihua Zheng, Yunjun Gao
Format: Article
Language: English
Published: SpringerOpen 2019-06-01
Series: Data Science and Engineering
Subjects: Similarity query; Range query; kNN query; Metric space; Distributed processing; Algorithm
Online Access: http://link.springer.com/article/10.1007/s41019-019-0095-7
id doaj-8db8cd1e603d4a2c86497df3624fe126
record_format Article
spelling doaj-8db8cd1e603d4a2c86497df3624fe126
2021-04-02T17:03:21Z
eng
SpringerOpen
Data Science and Engineering
2364-1185
2364-1541
2019-06-01
Volume 4, issue 2, pages 93-108
10.1007/s41019-019-0095-7
Distributed Similarity Queries in Metric Spaces
Keyu Yang (College of Computer Science, Zhejiang University)
Xin Ding (College of Computer Science, Zhejiang University)
Yuanliang Zhang (College of Computer Science, Zhejiang University)
Lu Chen (Department of Computer Science, Aalborg University)
Baihua Zheng (School of Information Systems, Singapore Management University)
Yunjun Gao (College of Computer Science, Zhejiang University)
collection DOAJ
language English
format Article
sources DOAJ
author Keyu Yang
Xin Ding
Yuanliang Zhang
Lu Chen
Baihua Zheng
Yunjun Gao
title Distributed Similarity Queries in Metric Spaces
publisher SpringerOpen
series Data Science and Engineering
issn 2364-1185
2364-1541
publishDate 2019-06-01
description Abstract: Similarity queries in metric spaces, including range queries and k nearest neighbor (kNN) queries, have applications in many areas such as multimedia retrieval, computational biology and location-based services. With growing data volumes, a distributed method is required. In this paper, we propose an Asynchronous Metric Distributed System (AMDS) to support efficient metric similarity queries in a distributed environment. AMDS uniformly partitions the data with the pivot-mapping technique to ensure load balancing, and employs a publish/subscribe communication model to asynchronously process large numbers of queries. The asynchronous processing model also improves the robustness and efficiency of AMDS. In addition, we develop efficient similarity search algorithms using AMDS. Extensive experiments using real and synthetic data demonstrate the performance of metric similarity queries using AMDS. Moreover, AMDS scales sublinearly with growing data size.
topic Similarity query
Range query
kNN query
Metric space
Distributed processing
Algorithm
url http://link.springer.com/article/10.1007/s41019-019-0095-7
_version_ 1721554801023516672
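
Illustrative note: the description above mentions pivot mapping and range queries in metric spaces. The following is a minimal, single-machine sketch of generic pivot-based filtering for a range query, assuming Euclidean distance and arbitrarily chosen pivots. It is not the AMDS algorithm from the article; the function names (`pivot_map`, `range_query`) and the pivot choice are hypothetical and only show how the triangle inequality lets a query skip distance computations.

```python
import math


def euclidean(a, b):
    """Euclidean distance, used here as an example metric."""
    return math.dist(a, b)


def pivot_map(objects, pivots, dist):
    """Map each object to its vector of distances to the pivots."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(objects, mapped, pivots, dist, q, r):
    """Return (objects within distance r of q, number of verified candidates).

    By the triangle inequality, |d(q, p) - d(o, p)| <= d(q, o) for any pivot p.
    If that lower bound exceeds r for some pivot, o cannot be a result and
    the exact distance d(q, o) is never computed.
    """
    q_map = [dist(q, p) for p in pivots]
    results = []
    verified = 0
    for o, o_map in zip(objects, mapped):
        if any(abs(qd - od) > r for qd, od in zip(q_map, o_map)):
            continue  # pruned using precomputed pivot distances only
        verified += 1
        if dist(q, o) <= r:
            results.append(o)
    return results, verified


# Example data: an 11 x 11 grid of points in the unit square (assumed data).
points = [(i / 10, j / 10) for i in range(11) for j in range(11)]
pivots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical pivot choice
mapped = pivot_map(points, pivots, euclidean)
hits, verified = range_query(points, mapped, pivots, euclidean, (0.5, 0.5), 0.25)
print(len(hits), "results;", verified, "of", len(points), "objects verified")
```

In a distributed setting, the pivot distance vectors could also serve as a key for assigning objects to nodes, which is the general idea behind partitioning by pivot mapping; how AMDS does this is described in the article itself.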