Enumerating Approximate Maximal Cliques in a Distributed Framework
Main Author: | Dhanasetty, Abhishek |
---|---|
Language: | English |
Published: | University of Cincinnati / OhioLINK, 2021 |
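Subjects: | Computer Science; Approximate Maximal Cliques; A-star search algorithm; Triangle Enumeration; Heuristics; Degree Distribution; Clustering Coefficient |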
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617104719399743 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1617104719399743 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1617104719399743 2021-10-16T05:25:16Z Enumerating Approximate Maximal Cliques in a Distributed Framework Dhanasetty, Abhishek Computer Science Approximate Maximal Cliques A-star search algorithm Triangle Enumeration Heuristics Degree Distribution Clustering Coefficient This thesis presents an algorithm for finding approximate maximal cliques in a very large graph using the distributed computational frameworks Hadoop and Spark. The primary motivation for enumerating approximate maximal cliques is that many candidate maximal cliques in graphs have almost all the needed edges and are missing only a few. In our approach, a heuristic search algorithm is used to list all strongly connected components of an undirected graph network. Our algorithm produces all approximate maximal cliques with connectivity higher than a threshold. Starting from a triangle as a seed (the smallest available maximal clique), we expand it using the lists of all vertices in the graph connected to the seed triangle's vertices. One or more combinations of vertices from these lists can form approximate maximal cliques. From the lists of nodes, we eliminate the least promising nodes one at a time. An A* search algorithm performs the elimination of nodes to find the strongly connected components. Because A* search works well with heuristics, three different heuristics are considered for this project: 1) approximate maximal clique connectivity based on degree distribution, 2) small-tail distribution of the degree of nodes connected to the triangle, and 3) approximate maximal clique connectivity based on clustering coefficient. These heuristics are tested on three separate datasets, and values for different metrics are compared.
2021-10-05 English text University of Cincinnati / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617104719399743 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
sources |
NDLTD |
topic |
Computer Science; Approximate Maximal Cliques; A-star search algorithm; Triangle Enumeration; Heuristics; Degree Distribution; Clustering Coefficient |
author |
Dhanasetty, Abhishek |
title |
Enumerating Approximate Maximal Cliques in a Distributed Framework |
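
The seed-and-prune procedure described in the abstract can be illustrated with a minimal single-machine sketch. This is not the thesis's Hadoop/Spark implementation: the function names are hypothetical, "connectivity" is assumed here to mean edge density (present edges divided by possible edges), and a simple greedy rule (drop the candidate with the fewest links into the current set) stands in for the A* search and the three heuristics, whose details the record does not give.

```python
from itertools import combinations


def density(adj, nodes):
    """Fraction of possible edges present among `nodes` (1.0 for a clique)."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return 1.0
    present = sum(1 for u, v in combinations(nodes, 2) if v in adj[u])
    return present / (n * (n - 1) / 2)


def triangles(adj):
    """Yield each triangle exactly once as an ordered tuple (u, v, w)."""
    for u in adj:
        for v in adj[u]:
            if v > u:
                for w in adj[u] & adj[v]:
                    if w > v:
                        yield (u, v, w)


def expand_seed(adj, seed, threshold):
    """Grow a seed triangle, then drop weak nodes until dense enough."""
    seed = set(seed)
    # Candidates: the seed plus every vertex adjacent to a seed vertex.
    members = seed.union(*(adj[s] for s in seed))
    while density(adj, members) < threshold:
        removable = members - seed
        if not removable:
            break
        # Greedy stand-in for the A* step (assumption): remove the node
        # with the fewest neighbours inside the current candidate set.
        worst = min(removable, key=lambda x: (len(adj[x] & members), x))
        members.remove(worst)
    return frozenset(members)


def approximate_cliques(edges, threshold=0.8):
    """Return the distinct dense vertex sets grown from every triangle."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return {expand_seed(adj, t, threshold) for t in triangles(adj)}
```

For example, on a four-vertex clique with one edge missing plus a pendant vertex, `approximate_cliques([(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (1, 5)], 0.8)` keeps vertices 1 to 4 together as one approximate clique, while a threshold of 1.0 falls back to the two exact triangles.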