A Distributed Storage and Computation k-Nearest Neighbor Algorithm Based Cloud-Edge Computing for Cyber-Physical-Social Systems

The k-nearest neighbor (kNN) algorithm is a classic supervised machine learning algorithm. It is widely used in cyber-physical-social systems (CPSS) to analyze and mine data. However, in practical CPSS applications, the standard linear kNN algorithm struggles to efficiently process massive data sets...

Bibliographic Details
Main Authors: Wei Zhang, Xiaohui Chen, Yueqi Liu, Qian Xi
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
kNN
Online Access: https://ieeexplore.ieee.org/document/9001024/
id doaj-012dc430600740d7872bde360b77fcaf
record_format Article
spelling doaj-012dc430600740d7872bde360b77fcaf
2021-03-30T01:23:07Z
eng
IEEE
IEEE Access
2169-3536
2020-01-01
8
50118
50130
10.1109/ACCESS.2020.2974764
9001024
A Distributed Storage and Computation k-Nearest Neighbor Algorithm Based Cloud-Edge Computing for Cyber-Physical-Social Systems
Wei Zhang (https://orcid.org/0000-0003-3694-2246)
Xiaohui Chen
Yueqi Liu
Qian Xi
School of Computer Science and Technology, Huaiyin Normal University, Huai’an, China (all four authors)
collection DOAJ
language English
format Article
sources DOAJ
author Wei Zhang
Xiaohui Chen
Yueqi Liu
Qian Xi
title A Distributed Storage and Computation k-Nearest Neighbor Algorithm Based Cloud-Edge Computing for Cyber-Physical-Social Systems
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The k-nearest neighbor (kNN) algorithm is a classic supervised machine learning algorithm. It is widely used in cyber-physical-social systems (CPSS) to analyze and mine data. However, in practical CPSS applications, the standard linear kNN algorithm struggles to process massive data sets efficiently. This paper proposes a distributed storage and computation k-nearest neighbor (D-kNN) algorithm. The D-kNN algorithm has the following advantages. First, it introduces the concept of k-nearest neighbor boundaries; restricting the k-nearest neighbor search to these boundaries reduces the time complexity of kNN. Second, based on the k-nearest neighbor boundaries, massive data sets that exceed the main storage space are stored on distributed storage nodes. Third, the algorithm performs the k-nearest neighbor search efficiently by carrying out distributed computation at each storage node. Finally, a series of experiments was performed to verify the effectiveness of the D-kNN algorithm. The experimental results show that the D-kNN algorithm, based on distributed storage and computation, improves the efficiency of k-nearest neighbor search. The algorithm can be easily and flexibly deployed in a cloud-edge computing environment to process massive data sets in CPSS.
topic kNN
k-nearest neighbor boundary
distributed storage and computation
cloud-edge computing
CPSS
url https://ieeexplore.ieee.org/document/9001024/
_version_ 1724187092491173888
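
Illustrative note: the description above mentions storing the data set on distributed nodes and running the k-nearest neighbor search at each node. This record does not define the paper's k-nearest neighbor boundaries, so the sketch below does not implement D-kNN itself. It only shows the generic partition-then-merge pattern under stated assumptions: each "storage node" is simulated as an in-memory list (a shard), distance is Euclidean, and the names local_knn, distributed_knn, and classify are hypothetical.

```python
import heapq
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Labeled = Tuple[Point, str]          # (feature vector, class label)
Scored = Tuple[float, str]           # (distance to query, class label)


def local_knn(shard: Sequence[Labeled], query: Point, k: int) -> List[Scored]:
    """Work done on one simulated storage node: its own k nearest points."""
    scored = ((math.dist(point, query), label) for point, label in shard)
    return heapq.nsmallest(k, scored)


def distributed_knn(shards: Sequence[Sequence[Labeled]], query: Point, k: int) -> List[Scored]:
    """Coordinator step: gather each shard's local top-k, then keep the global top-k."""
    candidates: List[Scored] = []
    for shard in shards:             # in a real deployment these calls would run in parallel
        candidates.extend(local_knn(shard, query, k))
    return heapq.nsmallest(k, candidates)


def classify(shards: Sequence[Sequence[Labeled]], query: Point, k: int) -> str:
    """Majority vote over the labels of the k nearest neighbors."""
    neighbors = distributed_knn(shards, query, k)
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data spread over three simulated nodes (assumed values, not from the paper).
    shards = [
        [((0.0, 0.0), "a"), ((1.0, 0.0), "a")],
        [((5.0, 5.0), "b"), ((0.0, 1.0), "a")],
        [((6.0, 5.0), "b")],
    ]
    print(classify(shards, (0.2, 0.1), k=3))
```

Merging the per-shard results is exact for this pattern: any point among the global k nearest must also be among the k nearest within its own shard, so the coordinator only needs to compare at most k candidates per node instead of the full data set.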