UDDSketch: Accurate Tracking of Quantiles in Data Streams

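We present UDDSketch (Uniform DDSketch), a novel sketch for fast and accurate tracking of quantiles in data streams. This sketch is heavily inspired by the recently introduced DDSketch, and is based on a novel bucket collapsing procedure that allows overcoming the intrinsic limits of the corresponding DDSketch procedures. Indeed, the DDSketch bucket collapsing procedure does not allow the derivation of formal guarantees on the accuracy of quantile estimation for data which does not follow a sub-exponential distribution. On the contrary, UDDSketch is designed so that accuracy guarantees can be given over the full range of quantiles and for arbitrary distribution in input. Moreover, our algorithm fully exploits the budgeted memory adaptively in order to guarantee the best possible accuracy over the full range of quantiles. Extensive experimental results on both synthetic and real datasets confirm the validity of our approach.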
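
The abstract names a bucket collapsing procedure without giving its details. The following is a minimal Python sketch of one way a uniform collapse over logarithmic buckets can work. It is not the authors' implementation. The class name UniformCollapseSketch, the parameters alpha and max_buckets, and the restriction to positive values are all assumptions made for illustration. When the bucket budget is exceeded, adjacent bucket pairs are merged everywhere, which squares the bucket base gamma and relaxes the relative-error bound alpha in a predictable way.

```python
import math
from collections import defaultdict


class UniformCollapseSketch:
    """Illustrative log-bucket quantile sketch for positive values.

    Hypothetical class written for this note; not the authors' code.
    Bucket i holds values in the interval (gamma**(i-1), gamma**i].
    """

    def __init__(self, alpha=0.01, max_buckets=128):
        self.alpha = alpha                      # current relative-error bound
        self.gamma = (1 + alpha) / (1 - alpha)  # bucket base
        self.max_buckets = max_buckets          # memory budget (assumed parameter)
        self.buckets = defaultdict(int)         # bucket index -> item count
        self.count = 0

    def add(self, x):
        if x <= 0:
            raise ValueError("this sketch handles positive values only")
        i = math.ceil(math.log(x) / math.log(self.gamma))
        self.buckets[i] += 1
        self.count += 1
        # Collapse until the number of non-empty buckets fits the budget.
        while len(self.buckets) > self.max_buckets:
            self._uniform_collapse()

    def _uniform_collapse(self):
        # Old buckets 2j-1 and 2j both map to new bucket j = ceil(i / 2).
        merged = defaultdict(int)
        for i, c in self.buckets.items():
            merged[math.ceil(i / 2)] += c
        self.buckets = merged
        # Merging pairs squares the base; the error bound follows from
        # alpha = (gamma - 1) / (gamma + 1) applied to the new base.
        self.gamma = self.gamma ** 2
        self.alpha = 2 * self.alpha / (1 + self.alpha ** 2)

    def quantile(self, q):
        rank = q * (self.count - 1)
        seen = 0
        for i in sorted(self.buckets):
            seen += self.buckets[i]
            if seen > rank:
                # Representative value with relative error at most alpha
                # for anything stored in bucket i.
                return 2 * self.gamma ** i / (self.gamma + 1)
        raise ValueError("empty sketch")
```

Because every pair of buckets is merged rather than only those at one end of the range, the error bound in this sketch degrades by the same factor for all quantiles, which matches the abstract's stated goal of guarantees over the full range of quantiles.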


Bibliographic Details
Main Authors: Italo Epicoco, Catiuscia Melle, Massimo Cafaro, Marco Pulimeno, Giuseppe Morleo
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects: Sketches, quantiles, streaming algorithms
Online Access: https://ieeexplore.ieee.org/document/9163358/
id doaj-2068829fd6864062b4a4a04e2fa84153
record_format Article
spelling doaj-2068829fd6864062b4a4a04e2fa84153
record date 2021-03-30T01:55:38Z
language eng
publisher IEEE
journal IEEE Access
issn 2169-3536
date 2020-01-01
volume 8
pages 147604-147617
doi 10.1109/ACCESS.2020.3015599
document number 9163358
authors Italo Epicoco (https://orcid.org/0000-0002-6408-1335)
Catiuscia Melle (https://orcid.org/0000-0003-4463-0672)
Massimo Cafaro (https://orcid.org/0000-0003-1118-7109)
Marco Pulimeno (https://orcid.org/0000-0002-4201-1504)
Giuseppe Morleo
affiliation (all authors) Department of Engineering for Innovation, University of Salento, Lecce, Italy
collection DOAJ
language English
format Article
sources DOAJ
author Italo Epicoco
Catiuscia Melle
Massimo Cafaro
Marco Pulimeno
Giuseppe Morleo
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description We present UDDSketch (Uniform DDSketch), a novel sketch for fast and accurate tracking of quantiles in data streams. This sketch is heavily inspired by the recently introduced DDSketch, and is based on a novel bucket collapsing procedure that allows overcoming the intrinsic limits of the corresponding DDSketch procedures. Indeed, the DDSketch bucket collapsing procedure does not allow the derivation of formal guarantees on the accuracy of quantile estimation for data which does not follow a sub-exponential distribution. On the contrary, UDDSketch is designed so that accuracy guarantees can be given over the full range of quantiles and for arbitrary distribution in input. Moreover, our algorithm fully exploits the budgeted memory adaptively in order to guarantee the best possible accuracy over the full range of quantiles. Extensive experimental results on both synthetic and real datasets confirm the validity of our approach.
topic Sketches
quantiles
streaming algorithms
url https://ieeexplore.ieee.org/document/9163358/