Regular Decomposition of Large Graphs: Foundation of a Sampling Approach to Stochastic Block Model Fitting

Abstract: We analyze the performance of regular decomposition, a method for compressing large and dense graphs. The method is inspired by Szemerédi's regularity lemma (SRL), a generic structural result about large and dense graphs. In our method, a stochastic block model (SBM) is used in maximum likelihood fitting to find a regular structure similar to the one predicted by SRL. Another ingredient of our method is Rissanen's minimum description length (MDL) principle. We consider scaling the algorithms to extremely large graphs by sampling a small subgraph. We continue our previous work on the subject by proving several experimentally observed claims. Our theoretical setting does not assume that the graph is generated by an SBM; the task is to find the SBM that best models the given graph in the MDL sense. This setting matches real-life situations in which no random generative model is appropriate. Our aim is to show that regular decomposition is a viable and robust method for the large graphs emerging, for instance, in the Big Data area.
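The abstract describes fitting an SBM to a given graph by maximum likelihood. As a minimal illustrative sketch only (not the authors' regular decomposition algorithm, and with all function names and the greedy move scheme assumed for illustration), the following toy code computes the Bernoulli SBM log-likelihood of a partition and improves it by greedy single-node moves:

```python
import numpy as np

def sbm_log_likelihood(adj, labels, k):
    """Bernoulli SBM log-likelihood of a graph partition.

    adj: symmetric 0/1 adjacency matrix; labels: block index per node.
    Each block pair's edge probability is set to its empirical density
    (the maximum likelihood estimate).
    """
    ll = 0.0
    for a in range(k):
        for b in range(a, k):
            ia = np.where(labels == a)[0]
            ib = np.where(labels == b)[0]
            if a == b:
                pairs = len(ia) * (len(ia) - 1) / 2
                edges = np.triu(adj[np.ix_(ia, ia)], 1).sum()
            else:
                pairs = len(ia) * len(ib)
                edges = adj[np.ix_(ia, ib)].sum()
            if pairs == 0:
                continue
            p = edges / pairs
            if 0 < p < 1:  # p in {0, 1} contributes zero log-likelihood
                ll += edges * np.log(p) + (pairs - edges) * np.log(1 - p)
    return ll

def greedy_sbm_fit(adj, k, iters=20, seed=0):
    """Greedy maximum likelihood fitting: move one node at a time to the
    block that maximizes the likelihood, until no move helps."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    labels = rng.integers(0, k, size=n)
    for _ in range(iters):
        changed = False
        for v in range(n):
            best, best_ll = labels[v], None
            for c in range(k):
                labels[v] = c
                ll = sbm_log_likelihood(adj, labels, k)
                if best_ll is None or ll > best_ll:
                    best, best_ll = c, ll
            if labels[v] != best:
                changed = True
            labels[v] = best
        if not changed:
            break
    return labels
```

The sampling idea in the paper corresponds to running such a fit on a small induced subgraph and then classifying the remaining nodes against the estimated blocks; the paper's contribution is proving that this is consistent, which this sketch does not attempt to show.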

Full description

Bibliographic Details
Main Authors: Hannu Reittu (VTT Technical Research Centre of Finland Ltd.), Ilkka Norros (Department of Mathematics and Statistics, University of Helsinki), Tomi Räty (VTT Technical Research Centre of Finland Ltd.), Marianna Bolla (Department of Stochastics, Institute of Mathematics, Technical University of Budapest), Fülöp Bazsó (Department of Computational Sciences, Institute for Particle and Nuclear Physics, Wigner Research Centre for Physics, Hungarian Academy of Sciences)
Format: Article
Language: English
Published: SpringerOpen, 2019-03-01
Series: Data Science and Engineering
ISSN: 2364-1185, 2364-1541
Subjects: Community detection; Sampling; Consistency; Martingales
Online Access: http://link.springer.com/article/10.1007/s41019-019-0084-x