Whole Time Series Data Streams Clustering: Dynamic Profiling of the Electricity Consumption

Data from smart grids are challenging to analyze because of their very large size, high dimensionality, skewness, sparsity, and multiple seasonal fluctuations, including daily and weekly effects. With the data arriving in sequential form, the underlying distribution is subject to changes over the time...


Bibliographic Details
Main Authors: Krzysztof Gajowniczek, Marcin Bator, Tomasz Ząbkowski
Format: Article
Language: English
Published: MDPI AG 2020-12-01
Series: Entropy
Subjects: clustering; data stream; machine learning; smart metering; time series
Online Access: https://www.mdpi.com/1099-4300/22/12/1414
id doaj-95394ae0f5084726be915753b0ab6659
record_format Article
spelling doaj-95394ae0f5084726be915753b0ab6659 2020-12-16T00:03:35Z eng MDPI AG Entropy 1099-4300 2020-12-01 22 1414 1414
DOI: 10.3390/e22121414
Whole Time Series Data Streams Clustering: Dynamic Profiling of the Electricity Consumption
Krzysztof Gajowniczek, Marcin Bator, Tomasz Ząbkowski
Affiliation (all three authors): Department of Artificial Intelligence, Institute of Information Technology, Warsaw University of Life Sciences‑SGGW, 02-776 Warsaw, Poland
https://www.mdpi.com/1099-4300/22/12/1414
clustering; data stream; machine learning; smart metering; time series
collection DOAJ
language English
format Article
sources DOAJ
author Krzysztof Gajowniczek
Marcin Bator
Tomasz Ząbkowski
title Whole Time Series Data Streams Clustering: Dynamic Profiling of the Electricity Consumption
publisher MDPI AG
series Entropy
issn 1099-4300
publishDate 2020-12-01
description Data from smart grids are challenging to analyze because of their very large size, high dimensionality, skewness, sparsity, and multiple seasonal fluctuations, including daily and weekly effects. Because the data arrive sequentially, the underlying distribution can change from one time interval to the next. Time series data streams have their own processing and analysis requirements: the data are generated quickly and in large volumes, so it is usually not possible to hold the whole data set in memory, and processing and analysis should be done incrementally using sliding windows. Although many clustering techniques have been proposed for grouping the observations of a single data stream, only a few focus on splitting whole data streams into clusters. In this article we aim to explore individual characteristics of electricity usage and recommend the most suitable tariff to each customer so that they can benefit from lower prices. This work investigates various algorithms (and their improvements) that allow us to form the clusters, in real time, based on smart meter data.
topic clustering
data stream
machine learning
smart metering
time series
url https://www.mdpi.com/1099-4300/22/12/1414