A distributed quantile estimation algorithm of heavy-tailed distribution with massive datasets

Quantile estimation with big data is still a challenging problem in statistics. In this paper we introduce a distributed algorithm for estimating high quantiles of heavy-tailed distributions with massive datasets. The key idea of the algorithm is to apply the alternating direction method of multipli...
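
The Peak Over Threshold step named in the full abstract can be illustrated with a small single-machine sketch. It is not the authors' distributed ADMM procedure: as a stand-in, it fits the generalized Pareto distribution to threshold excesses by the method of moments (usable only when the shape parameter is below 1/2) and plugs the estimates into the standard POT tail-quantile formula. The function names, the 95% threshold choice, and the simulated Pareto sample are all assumptions made for illustration.

```python
import math
import random
import statistics


def gpd_moment_fit(excesses):
    """Method-of-moments estimates (xi, sigma) of a generalized Pareto
    distribution fitted to threshold excesses. Assumes shape xi < 1/2.
    This is a simple stand-in, not the ADMM-based estimator of the paper."""
    mean = statistics.fmean(excesses)
    var = statistics.variance(excesses)
    ratio = mean * mean / var          # equals 1 - 2*xi for an exact GPD
    xi = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (1.0 + ratio)
    return xi, sigma


def pot_quantile_from_params(threshold, xi, sigma, n, n_u, p):
    """POT tail quantile: solve 1 - (n_u/n) * (1 + xi*(q-u)/sigma)^(-1/xi) = p."""
    tail = n * (1.0 - p) / n_u
    if abs(xi) < 1e-12:                # exponential-tail limit xi -> 0
        return threshold - sigma * math.log(tail)
    return threshold + sigma / xi * (tail ** (-xi) - 1.0)


def pot_quantile(data, p, threshold_level=0.95):
    """Estimate the p-quantile of `data` from the excesses over an empirical
    threshold (hypothetical default: the 95% sample quantile)."""
    ordered = sorted(data)
    threshold = ordered[int(threshold_level * len(ordered))]
    excesses = [x - threshold for x in ordered if x > threshold]
    xi, sigma = gpd_moment_fit(excesses)
    return pot_quantile_from_params(threshold, xi, sigma,
                                    len(ordered), len(excesses), p)


def demo(seed=0, n=100_000, alpha=5.0, p=0.999):
    """Simulated heavy-tailed sample: Pareto with tail index alpha (xi = 1/alpha).
    Returns (POT estimate, true quantile)."""
    rng = random.Random(seed)
    data = [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]
    truth = (1.0 - p) ** (-1.0 / alpha)
    return pot_quantile(data, p), truth


if __name__ == "__main__":
    estimate, truth = demo()
    print(f"POT estimate: {estimate:.3f}, true quantile: {truth:.3f}")
```

In a distributed setting of the kind the abstract describes, the fitting step would be replaced by an estimator computed across machines, while the final quantile formula would stay the same.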


Bibliographic Details
Main Authors: Xiaoyue Xie, Jian Shi
Format: Article
Language: English
Published: AIMS Press 2021-04-01
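Series: Mathematical Biosciences and Engineering
Subjects: distributed algorithm; big data; high quantile estimation; heavy-tailed distribution; peak over threshold method
Online Access: http://www.aimspress.com/article/doi/10.3934/mbe.2021011?viewType=HTML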
id doaj-07a609bbe6e14ff0b7d6626b4f4475c9
record_format Article
spelling doaj-07a609bbe6e14ff0b7d6626b4f4475c9
2021-04-06T00:45:29Z
eng
AIMS Press
Mathematical Biosciences and Engineering
1551-0018
2021-04-01
18
1
214
230
10.3934/mbe.2021011
A distributed quantile estimation algorithm of heavy-tailed distribution with massive datasets
Xiaoyue Xie (0)
Jian Shi (1)
(0) 1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 2. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
(1) 1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China 2. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
Quantile estimation with big data is still a challenging problem in statistics. In this paper we introduce a distributed algorithm for estimating high quantiles of heavy-tailed distributions with massive datasets. The key idea of the algorithm is to apply the alternating direction method of multipliers to parameter estimation of the generalized Pareto distribution in a distributed structure and to compute high quantiles from the parameter estimates by the Peak Over Threshold method. This paper proves that the proposed algorithm converges to a stationary solution when the step size is properly chosen. The numerical study and real data analysis also show that the algorithm is feasible and efficient for estimating high quantiles of heavy-tailed distributions with massive datasets and there is a clear-cut winner for the extreme quantiles.
http://www.aimspress.com/article/doi/10.3934/mbe.2021011?viewType=HTML
distributed algorithm
big data
high quantile estimation
heavy-tailed distribution
peak over threshold method
collection DOAJ
language English
format Article
sources DOAJ
author Xiaoyue Xie
Jian Shi
author_facet Xiaoyue Xie
Jian Shi
author_sort Xiaoyue Xie
title A distributed quantile estimation algorithm of heavy-tailed distribution with massive datasets
publisher AIMS Press
series Mathematical Biosciences and Engineering
issn 1551-0018
publishDate 2021-04-01
description Quantile estimation with big data is still a challenging problem in statistics. In this paper we introduce a distributed algorithm for estimating high quantiles of heavy-tailed distributions with massive datasets. The key idea of the algorithm is to apply the alternating direction method of multipliers to parameter estimation of the generalized Pareto distribution in a distributed structure and to compute high quantiles from the parameter estimates by the Peak Over Threshold method. This paper proves that the proposed algorithm converges to a stationary solution when the step size is properly chosen. The numerical study and real data analysis also show that the algorithm is feasible and efficient for estimating high quantiles of heavy-tailed distributions with massive datasets and there is a clear-cut winner for the extreme quantiles.
topic distributed algorithm
big data
high quantile estimation
heavy-tailed distribution
peak over threshold method
url http://www.aimspress.com/article/doi/10.3934/mbe.2021011?viewType=HTML