Algorithms for evolving graph analysis

In many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing...

Full description

Bibliographic Details
Main Authors:	Ren, Chenghui, 任成會
Other Authors:	Kao, CM
Language:	English
Published:	The University of Hong Kong (Pokfulam, Hong Kong) 2014
Subjects:	Graph theory - Data processing
Online Access:	http://hdl.handle.net/10722/197105

id	ndltd-HKU-oai-hub.hku.hk-10722-197105
record_format	oai_dc
spelling	ndltd-HKU-oai-hub.hku.hk-10722-1971052015-07-29T04:02:34Z Algorithms for evolving graph analysis Ren, Chenghui 任成會 Kao, CM Cheung, DWL Graph theory - Data processing In many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence. published_or_final_version Computer Science Doctoral Doctor of Philosophy 2014-05-07T23:15:27Z 2014-05-07T23:15:27Z 2014 PG_Thesis 10.5353/th_b5185948 b5185948 http://hdl.handle.net/10722/197105 eng HKU Theses Online (HKUTO) Creative Commons: Attribution 3.0 Hong Kong License The author retains all proprietary rights, (such as patent rights) and the right to use in future works. The University of Hong Kong (Pokfulam, Hong Kong)
collection	NDLTD
language	English
sources	NDLTD
topic	Graph theory - Data processing
spellingShingle	Graph theory - Data processing Ren, Chenghui 任成會 Algorithms for evolving graph analysis
description	In many applications, entities and their relationships are represented by graphs. Examples include social networks (users and friendship), the WWW (web pages and hyperlinks) and bibliographic networks (authors and co-authorship). In a dynamic world, information changes and so the graphs representing the information evolve with time. For example, a Facebook link between two friends is established, or a hyperlink is added to a web page. We propose that historical graph-structured data be archived for analytical processing. We call a historical evolving graph sequence an EGS. We study the problem of efficient query processing on an EGS, which finds many applications that lead to interesting evolving graph analysis. To solve the problem, we propose a solution framework called FVF and a cluster-based LU decomposition algorithm called CLUDE, which can evaluate queries efficiently to support EGS analysis. The Find-Verify-and-Fix (FVF) framework applies to a wide range of queries. We demonstrate how some important graph measures, including shortest-path distance, closeness centrality and graph centrality, can be efficiently computed from EGSs using FVF. Since an EGS generally contains numerous large graphs, we also discuss several compact storage models that support our FVF framework. Through extensive experiments on both real and synthetic datasets, we show that our FVF framework is highly efficient in EGS query processing. A graph can be conveniently modeled by a matrix from which various quantitative measures are derived like PageRank and SALSA and Personalized PageRank and Random Walk with Restart. To compute these measures, linear systems of the form Ax = b, where A is a matrix that captures a graph's structure, need to be solved. To facilitate solving the linear system, the matrix A is often decomposed into two triangular matrices (L and U). In a dynamic world, the graph that models it changes with time and thus is the matrix A that represents the graph. We consider a sequence of evolving graphs and its associated sequence of evolving matrices. We study how LU-decomposition should be done over the sequence so that (1) the decomposition is efficient and (2) the resulting LU matrices best preserve the sparsity of the matrices A's (i.e., the number of extra non-zero entries introduced in L and U are minimized). We propose a cluster-based algorithm CLUDE for solving the problem. Through an experimental study, we show that CLUDE is about an order of magnitude faster than the traditional incremental update algorithm. The number of extra non-zero entries introduced by CLUDE is also about an order of magnitude fewer than that of the traditional algorithm. CLUDE is thus an efficient algorithm for LU decomposition that produces high-quality LU matrices over an evolving matrix sequence. === published_or_final_version === Computer Science === Doctoral === Doctor of Philosophy
author2	Kao, CM
author_facet	Kao, CM Ren, Chenghui 任成會
author	Ren, Chenghui 任成會
author_sort	Ren, Chenghui
title	Algorithms for evolving graph analysis
title_short	Algorithms for evolving graph analysis
title_full	Algorithms for evolving graph analysis
title_fullStr	Algorithms for evolving graph analysis
title_full_unstemmed	Algorithms for evolving graph analysis
title_sort	algorithms for evolving graph analysis
publisher	The University of Hong Kong (Pokfulam, Hong Kong)
publishDate	2014
url	http://hdl.handle.net/10722/197105
work_keys_str_mv	AT renchenghui algorithmsforevolvinggraphanalysis AT rènchénghuì algorithmsforevolvinggraphanalysis
_version_	1716814237486022656

Algorithms for evolving graph analysis

Similar Items