Bootstrap quantification of estimation uncertainties in network degree distributions

Abstract: We propose a new method of nonparametric bootstrap to quantify estimation uncertainties in functions of network degree distribution in large ultra sparse networks. Both network degree distribution and network order are assumed to be unknown. The key idea is based on adaptation of the “block...

Bibliographic Details
Main Authors: Yulia R. Gel, Vyacheslav Lyubchich, L. Leticia Ramirez Ramirez
Format: Article
Language: English
Published: Nature Publishing Group 2017-07-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-017-05885-x
id doaj-fbaa778d226d40e5bc096a82d7dd2355
record_format Article
spelling doaj-fbaa778d226d40e5bc096a82d7dd2355; 2020-12-08T00:09:24Z; eng; Nature Publishing Group; Scientific Reports; 2045-2322; 2017-07-01; 7; 1; 1; 12; 10.1038/s41598-017-05885-x; Bootstrap quantification of estimation uncertainties in network degree distributions; Yulia R. Gel (0); Vyacheslav Lyubchich (1); L. Leticia Ramirez Ramirez (2); Department of Mathematical Sciences, University of Texas at Dallas; Chesapeake Biological Laboratory, University of Maryland Center for Environmental Science; Centro de Investigación en Matemáticas; Abstract: We propose a new method of nonparametric bootstrap to quantify estimation uncertainties in functions of network degree distribution in large ultra sparse networks. Both network degree distribution and network order are assumed to be unknown. The key idea is based on adaptation of the “blocking” argument, developed for bootstrapping of time series and re-tiling of spatial data, to random networks. We first sample a set of multiple ego networks of varying orders that form a patch, or a network block analogue, and then resample the data within patches. To select an optimal patch size, we develop a new computationally efficient and data-driven cross-validation algorithm. The proposed fast patchwork bootstrap (FPB) methodology further extends the ideas for a case of network mean degree, to inference on a degree distribution. In addition, the FPB is substantially less computationally expensive, requires less information on a graph, and is free from nuisance parameters. In our simulation study, we show that the new bootstrap method outperforms competing approaches by providing sharper and better-calibrated confidence intervals for functions of a network degree distribution than other available approaches, including the cases of networks in an ultra sparse regime. We illustrate the FPB in application to collaboration networks in statistics and computer science and to Wikipedia networks.; https://doi.org/10.1038/s41598-017-05885-x
collection DOAJ
language English
format Article
sources DOAJ
author Yulia R. Gel
Vyacheslav Lyubchich
L. Leticia Ramirez Ramirez
spellingShingle Yulia R. Gel
Vyacheslav Lyubchich
L. Leticia Ramirez Ramirez
Bootstrap quantification of estimation uncertainties in network degree distributions
Scientific Reports
author_facet Yulia R. Gel
Vyacheslav Lyubchich
L. Leticia Ramirez Ramirez
author_sort Yulia R. Gel
title Bootstrap quantification of estimation uncertainties in network degree distributions
title_short Bootstrap quantification of estimation uncertainties in network degree distributions
title_full Bootstrap quantification of estimation uncertainties in network degree distributions
title_fullStr Bootstrap quantification of estimation uncertainties in network degree distributions
title_full_unstemmed Bootstrap quantification of estimation uncertainties in network degree distributions
title_sort bootstrap quantification of estimation uncertainties in network degree distributions
publisher Nature Publishing Group
series Scientific Reports
issn 2045-2322
publishDate 2017-07-01
description Abstract: We propose a new method of nonparametric bootstrap to quantify estimation uncertainties in functions of network degree distribution in large ultra sparse networks. Both network degree distribution and network order are assumed to be unknown. The key idea is based on adaptation of the “blocking” argument, developed for bootstrapping of time series and re-tiling of spatial data, to random networks. We first sample a set of multiple ego networks of varying orders that form a patch, or a network block analogue, and then resample the data within patches. To select an optimal patch size, we develop a new computationally efficient and data-driven cross-validation algorithm. The proposed fast patchwork bootstrap (FPB) methodology further extends the ideas for a case of network mean degree, to inference on a degree distribution. In addition, the FPB is substantially less computationally expensive, requires less information on a graph, and is free from nuisance parameters. In our simulation study, we show that the new bootstrap method outperforms competing approaches by providing sharper and better-calibrated confidence intervals for functions of a network degree distribution than other available approaches, including the cases of networks in an ultra sparse regime. We illustrate the FPB in application to collaboration networks in statistics and computer science and to Wikipedia networks.
url https://doi.org/10.1038/s41598-017-05885-x
work_keys_str_mv AT yuliargel bootstrapquantificationofestimationuncertaintiesinnetworkdegreedistributions
AT vyacheslavlyubchich bootstrapquantificationofestimationuncertaintiesinnetworkdegreedistributions
AT lleticiaramirezramirez bootstrapquantificationofestimationuncertaintiesinnetworkdegreedistributions
_version_ 1724396766734843904
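
Illustrative note: the abstract in this record describes sampling several ego networks that together form a patch and then resampling within the patch. The Python sketch below is a loose, simplified illustration of that idea for the mean degree only; it is not the authors' FPB algorithm and omits their cross-validation step for choosing the patch size. All function names (random_graph, sample_patch, bootstrap_mean_degree), the toy graph, the one-wave ego networks, and the inverse-degree weighting of neighbours (a common size-bias correction) are assumptions made for this example and are not taken from the record.

```python
import random

def random_graph(n, n_edges, rng):
    """Toy sparse graph as an adjacency dict (hypothetical test data)."""
    adj = {v: set() for v in range(n)}
    while n_edges > 0:
        u, v = rng.sample(range(n), 2)      # two distinct nodes
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            n_edges -= 1
    return adj

def sample_patch(adj, n_seeds, rng):
    """Pick seeds uniformly; each ego network is a seed plus its neighbours.

    Returns the degrees of the seeds and of the distinct non-seed
    neighbours that together make up the patch.
    """
    seeds = rng.sample(sorted(adj), n_seeds)
    seed_set = set(seeds)
    non_seeds = set()
    for s in seeds:
        non_seeds |= adj[s] - seed_set
    seed_deg = [len(adj[s]) for s in seeds]
    non_seed_deg = [len(adj[v]) for v in sorted(non_seeds)]
    return seed_deg, non_seed_deg

def bootstrap_mean_degree(seed_deg, non_seed_deg, n_boot, rng, alpha=0.05):
    """Percentile bootstrap interval for the mean degree within one patch.

    Seeds are resampled uniformly. Neighbours are reached with probability
    roughly proportional to their degree, so they are resampled with
    weights 1/degree (an assumption of this sketch).
    """
    weights = [1.0 / d for d in non_seed_deg]   # neighbours have degree >= 1
    means = []
    for _ in range(n_boot):
        draw = rng.choices(seed_deg, k=len(seed_deg))
        if non_seed_deg:
            draw += rng.choices(non_seed_deg, weights=weights,
                                k=len(non_seed_deg))
        means.append(sum(draw) / len(draw))
    means.sort()
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lo, hi

rng = random.Random(0)
adj = random_graph(2000, 3000, rng)             # true mean degree is 3
seed_deg, non_seed_deg = sample_patch(adj, 40, rng)
lo, hi = bootstrap_mean_degree(seed_deg, non_seed_deg, 500, rng)
print(f"95% bootstrap interval for mean degree: [{lo:.2f}, {hi:.2f}]")
```

Because isolated nodes can never appear as neighbours, the weighted neighbour sample only reflects nodes of degree one or more, so this sketch can slightly overstate the mean degree in very sparse graphs.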