Optimizing Distance Computation in Distributed Graph Systems

Given a large graph, such as a social network or a knowledge graph, a fundamental query is how to find the distance from a source vertex to another vertex in the graph. As real graphs become very large and many distributed graph systems, such as Pregel, Pregel+, Giraph, and GraphX, are proposed, how...

Bibliographic Details
Main Authors: Qing Wang, Shengyi Ji, Peng Peng, Mingdao Li, Ping Huang, Zheng Qin
Format: Article
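Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Distributed information systems; graph theory; distributed computing; distance computation; distributed graph systems; landmark
Online Access: https://ieeexplore.ieee.org/document/9234445/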
id doaj-c504a62aea054b898e34b0347908f03d
record_format Article
spelling doaj-c504a62aea054b898e34b0347908f03d
2021-03-30T04:08:01Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2020-01-01, vol. 8, pp. 191673-191682
DOI 10.1109/ACCESS.2020.3032727 (IEEE document 9234445)
Optimizing Distance Computation in Distributed Graph Systems
Qing Wang, Hunan University, Changsha, China
Shengyi Ji, Hunan University, Changsha, China
Peng Peng (https://orcid.org/0000-0002-8095-8061), Hunan University, Changsha, China
Mingdao Li, Hunan University, Changsha, China
Ping Huang, Hunan University, Changsha, China
Zheng Qin (https://orcid.org/0000-0003-0877-3887), Hunan University, Changsha, China
collection DOAJ
language English
format Article
sources DOAJ
author Qing Wang
Shengyi Ji
Peng Peng
Mingdao Li
Ping Huang
Zheng Qin
title Optimizing Distance Computation in Distributed Graph Systems
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Given a large graph, such as a social network or a knowledge graph, a fundamental query is how to find the distance from a source vertex to another vertex in the graph. As real graphs become very large and many distributed graph systems, such as Pregel, Pregel+, Giraph, and GraphX, are proposed, how to employ distributed graph systems to process single-source distance queries should attract more attention. In this paper, we propose a landmark-based framework to optimize the distance computation over distributed graph systems. We also use a measure called set betweenness to select the optimal set of landmarks for distance computation. Although we can prove that selecting the optimal set of landmarks is NP-hard, we propose a heuristic distributed algorithm that can guarantee the approximation ratio. Experiments on large real graphs confirm the superiority of our methods.
topic Distributed information systems
graph theory
distributed computing
distance computation
distributed graph systems
landmark
url https://ieeexplore.ieee.org/document/9234445/