Regular Decomposition of Large Graphs: Foundation of a Sampling Approach to Stochastic Block Model Fitting
Abstract: We analyze the performance of regular decomposition, a method for compressing large and dense graphs. The method is inspired by Szemerédi’s regularity lemma (SRL), a generic structural result about large and dense graphs. In our method, the stochastic block model (SBM) is used as the model in maximum-likelihood fitting to find a regular structure similar to the one predicted by SRL. Another ingredient of our method is Rissanen’s minimum description length (MDL) principle. We consider scaling the algorithm to extremely large graphs by sampling a small subgraph. We continue our previous work on the subject by proving some experimentally observed claims. Our theoretical setting does not assume that the graph is generated from an SBM; the task is to find an SBM that is optimal for modeling the given graph in the MDL sense. This setting matches real-life situations in which no random generative model is appropriate. Our aim is to show that regular decomposition is a viable and robust method for the large graphs emerging, say, in the Big Data area.
Main Authors: Hannu Reittu (VTT Technical Research Centre of Finland Ltd.), Ilkka Norros (Department of Mathematics and Statistics, University of Helsinki), Tomi Räty (VTT Technical Research Centre of Finland Ltd.), Marianna Bolla (Department of Stochastics, Institute of Mathematics, Technical University of Budapest), Fülöp Bazsó (Department of Computational Sciences, Institute for Particle and Nuclear Physics, Wigner Research Centre for Physics, Hungarian Academy of Sciences)
Format: Article
Language: English
Published: SpringerOpen, 2019-03-01
Series: Data Science and Engineering, Volume 4, Issue 1, pp. 44–60
ISSN: 2364-1185, 2364-1541
DOI: 10.1007/s41019-019-0084-x
Subjects: Community detection; Sampling; Consistency; Martingales
Online Access: http://link.springer.com/article/10.1007/s41019-019-0084-x
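
The method described in the abstract combines maximum-likelihood fitting of a stochastic block model with MDL-based selection of the number of blocks. As a concrete, heavily simplified illustration, the Python sketch below fits a k-block SBM to a dense adjacency matrix by greedy node reassignment and compares an MDL-style two-part cost across values of k. This is not the authors' implementation: the greedy search, the `mdl_cost` coding, and all function names are assumptions made for illustration, and the paper's actual regular decomposition algorithm and its subgraph-sampling scheme differ in detail.

```python
import numpy as np

def sbm_log_likelihood(A, z, k, eps=1e-9):
    """Bernoulli log-likelihood of adjacency matrix A under a k-block SBM
    with block assignment z, using plug-in ML estimates of block densities."""
    ll = 0.0
    for a in range(k):
        for b in range(a, k):
            ia, ib = np.where(z == a)[0], np.where(z == b)[0]
            if a == b:
                if len(ia) < 2:
                    continue
                m = np.triu(A[np.ix_(ia, ia)], 1).sum()   # edges inside block a
                pairs = len(ia) * (len(ia) - 1) / 2
            else:
                m = A[np.ix_(ia, ib)].sum()               # edges between blocks a and b
                pairs = len(ia) * len(ib)
            if pairs == 0:
                continue
            p = min(max(m / pairs, eps), 1 - eps)         # ML estimate of the block-pair density
            ll += m * np.log(p) + (pairs - m) * np.log(1 - p)
    return ll

def greedy_sbm_fit(A, k, n_iter=20, seed=0):
    """Greedy local search: move one node at a time to the block that
    increases the SBM log-likelihood, until no move improves it."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    z = rng.integers(0, k, size=n)
    best = sbm_log_likelihood(A, z, k)
    for _ in range(n_iter):
        improved = False
        for v in rng.permutation(n):
            current = z[v]
            for b in range(k):
                if b == current:
                    continue
                z[v] = b
                ll = sbm_log_likelihood(A, z, k)
                if ll > best:
                    best, current, improved = ll, b, True
            z[v] = current
        if not improved:
            break
    return z, best

def mdl_cost(A, z, k):
    """Illustrative two-part MDL cost: bits for the k(k+1)/2 block densities and
    the n block labels, plus the negative log-likelihood of the graph in bits."""
    n = A.shape[0]
    model_bits = (k * (k + 1) / 2) * np.log2(n) + n * np.log2(k)
    data_bits = -sbm_log_likelihood(A, z, k) / np.log(2)
    return model_bits + data_bits

if __name__ == "__main__":
    # Toy example: a planted two-block graph with dense blocks and sparse cross-links.
    rng = np.random.default_rng(1)
    n = 60
    truth = np.repeat([0, 1], n // 2)
    P = np.array([[0.7, 0.1], [0.1, 0.6]])
    A = (rng.random((n, n)) < P[np.ix_(truth, truth)]).astype(int)
    A = np.triu(A, 1)
    A = A + A.T
    costs = {}
    for k in (1, 2, 3):
        z, _ = greedy_sbm_fit(A, k)
        costs[k] = mdl_cost(A, z, k)
    print("MDL cost per k:", costs)   # k = 2 should typically have the smallest cost
```

On the toy planted two-block graph in the `__main__` section, the cost is typically minimized at k = 2, mirroring the idea that MDL selects the block structure that best compresses the graph; the paper's contribution is to make this kind of fit robust and scalable via sampling, with consistency proofs.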