Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication

As data volumes grow explosively, subgraph matching over massive graph data struggles to remain efficient. Meanwhile, the graph indexes used by existing subgraph matching algorithms are difficult to update and maintain on dynamic graphs. We propose a distributed subg...

Bibliographic Details
Main Authors:	Ling Yuan, Jiali Bin, Peng Pan
Format:	Article
Language:	English
Published:	MDPI AG 2020-01-01
Series:	Electronics
Subjects:	subgraph matching, graph indexing, distributed computing, graph partition
Online Access:	https://www.mdpi.com/2079-9292/9/1/184

id	doaj-f4b33789d4784385891c3f1a3cd7722f
record_format	Article
spelling	doaj-f4b33789d4784385891c3f1a3cd7722f | 2020-11-25T01:38:06Z | eng | MDPI AG | Electronics | 2079-9292 | 2020-01-01 | 9 | 1 | 184 | 10.3390/electronics9010184 | electronics9010184 | Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication | Ling Yuan [0] | Jiali Bin [1] | Peng Pan [2] | School of Computer Science, Huazhong University of Science and Technology, Wuhan 430074, China (all three authors) | https://www.mdpi.com/2079-9292/9/1/184 | subgraph matching, graph indexing, distributed computing, graph partition
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Ling Yuan, Jiali Bin, Peng Pan
author_sort	Ling Yuan
title	Optimized Distributed Subgraph Matching Algorithm Based on Partition Replication
publisher	MDPI AG
series	Electronics
issn	2079-9292
publishDate	2020-01-01
description	As data volumes grow explosively, subgraph matching over massive graph data struggles to remain efficient. Meanwhile, the graph indexes used by existing subgraph matching algorithms are difficult to update and maintain on dynamic graphs. We propose a distributed subgraph matching algorithm based on partition replicas (denoted PR-Match) that handles the partitioning and storage of large-scale data graphs. PR-Match first splits the query graph into sub-queries, then assigns the sub-queries to the nodes for subgraph matching, and finally merges the matching results. Within PR-Match, we propose a heuristic rule based on predicted cost to select the optimal merging plan, which greatly reduces the cost of merging. To speed up sub-query matching, we propose a vertex code based on the vertex neighbor label signature, which greatly reduces the search space for each sub-query. Because the vertex code is maintained incrementally, it avoids the difficulty of maintaining a feature-based graph index on dynamic graphs. Extensive experiments on real and synthetic datasets demonstrate the high efficiency and strong scalability of PR-Match on large-scale data graphs.
topic	subgraph matching, graph indexing, distributed computing, graph partition
url	https://www.mdpi.com/2079-9292/9/1/184

Similar Items