id ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1617104719399743
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-ucin1617104719399743 2021-10-16T05:25:16Z Enumerating Approximate Maximal Cliques in a Distributed Framework Dhanasetty, Abhishek Computer Science Approximate Maximal Cliques A-star search algorithm Triangle Enumeration Heuristics Degree Distribution Clustering Coefficient This thesis presents an algorithm for finding approximate maximal cliques in very large graphs using the distributed computing frameworks Hadoop and Spark. The primary motivation for enumerating approximate maximal cliques is that many candidate maximal cliques in a graph have almost all of the required edges, with only a few missing. In our approach, a heuristic search algorithm is used to list all strongly connected components of an undirected graph network. Our algorithm produces all approximate maximal cliques whose connectivity exceeds a threshold. Starting from a triangle as a seed (the smallest available maximal clique), we expand it using the lists of all vertices in the graph that are connected to the seed triangle's vertices. One or more combinations of vertices from these lists can form approximate maximal cliques. From the lists of nodes, we eliminate the least promising nodes one at a time. The A* search algorithm performs this elimination to find the strongly connected components. Because A* search works well with heuristics, three different heuristics are considered in this project: 1) approximate maximal clique connectivity based on degree distribution, 2) small-tail distribution of the degrees of the nodes connected to the triangle, and 3) approximate maximal clique connectivity based on the clustering coefficient. These heuristics are tested on three separate datasets, and the values of several metrics are compared.
2021-10-05 English text University of Cincinnati / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617104719399743 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic Computer Science
Approximate Maximal Cliques
A-star search algorithm
Triangle Enumeration
Heuristics
Degree Distribution
Clustering Coefficient
spellingShingle Computer Science
Approximate Maximal Cliques
A-star search algorithm
Triangle Enumeration
Heuristics
Degree Distribution
Clustering Coefficient
Dhanasetty, Abhishek
Enumerating Approximate Maximal Cliques in a Distributed Framework
author Dhanasetty, Abhishek
author_facet Dhanasetty, Abhishek
author_sort Dhanasetty, Abhishek
title Enumerating Approximate Maximal Cliques in a Distributed Framework
title_short Enumerating Approximate Maximal Cliques in a Distributed Framework
title_full Enumerating Approximate Maximal Cliques in a Distributed Framework
title_fullStr Enumerating Approximate Maximal Cliques in a Distributed Framework
title_full_unstemmed Enumerating Approximate Maximal Cliques in a Distributed Framework
title_sort enumerating approximate maximal cliques in a distributed framework
publisher University of Cincinnati / OhioLINK
publishDate 2021
url http://rave.ohiolink.edu/etdc/view?acc_num=ucin1617104719399743
work_keys_str_mv AT dhanasettyabhishek enumeratingapproximatemaximalcliquesinadistributedframework
_version_ 1719490059713904640