Embedded Dimension and Time Series Length. Practical Influence on Permutation Entropy and Its Applications

Bibliographic Details
Main Authors: David Cuesta-Frau, Juan Pablo Murillo-Escobar, Diana Alexandra Orrego, Edilson Delgado-Trejos
Format: Article
Language:English
Published: MDPI AG 2019-04-01
Series:Entropy
Online Access:https://www.mdpi.com/1099-4300/21/4/385
Description
Summary:Permutation Entropy (PE) is a time series complexity measure commonly used in a variety of contexts, with medicine being the prime example. In its general form, it requires three input parameters for its calculation: time series length <i>N</i>, embedded dimension <i>m</i>, and embedded delay <i>τ</i>. Inappropriate choices of these parameters may potentially lead to incorrect interpretations. However, there are no specific guidelines for an optimal selection of <i>N</i>, <i>m</i>, or <i>τ</i>, only general recommendations such as <i>N</i> ≫ <i>m</i>!, <i>τ</i> = 1, or <i>m</i> = 3, …, 7. This paper deals specifically with the study of the practical implications of <i>N</i> ≫ <i>m</i>!, since long time series are often not available, or non-stationary, and other preliminary results suggest that low <i>N</i> values do not necessarily invalidate PE usefulness.
Our study analyses the PE variation as a function of the series length <i>N</i> and embedded dimension <i>m</i> in the context of a diverse experimental set, both synthetic (random, spikes, or logistic model time series) and real-world (climatology, seismic, financial, or biomedical time series), and the classification performance achieved with varying <i>N</i> and <i>m</i>. The results seem to indicate that shorter lengths than those suggested by <i>N</i> ≫ <i>m</i>! are sufficient for a stable PE calculation, and even very short time series can be robustly classified based on PE measurements before the stability point is reached. This may be due to the fact that there are forbidden patterns in chaotic time series, not all the patterns are equally informative, and differences among classes are already apparent at very short lengths.
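
The quantities named in the summary can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (ordinal_patterns, permutation_entropy, logistic_map), the logistic parameter r = 4, the initial value, the burn-in length, and the tie-breaking rule are all assumptions chosen for illustration. It counts ordinal patterns of order m with delay τ, computes the Shannon entropy of their relative frequencies, and optionally normalizes by log(m!).

```python
import math
import random
from collections import Counter


def ordinal_patterns(series, m=3, tau=1):
    """Count the ordinal patterns of order m (delay tau) found in series."""
    n_windows = len(series) - (m - 1) * tau
    if n_windows <= 0:
        raise ValueError("series too short for the chosen m and tau")
    counts = Counter()
    for start in range(n_windows):
        window = [series[start + k * tau] for k in range(m)]
        # Indices that sort the window in ascending order.
        # Assumption: ties are broken by position (sorted() is stable).
        pattern = tuple(sorted(range(m), key=lambda k: window[k]))
        counts[pattern] += 1
    return counts


def permutation_entropy(series, m=3, tau=1, normalize=True):
    """Shannon entropy of the ordinal pattern distribution."""
    counts = ordinal_patterns(series, m, tau)
    total = sum(counts.values())
    pe = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Normalizing by log(m!) maps PE into [0, 1].
    return pe / math.log(math.factorial(m)) if normalize else pe


def logistic_map(n, r=4.0, x0=0.4, burn_in=100):
    """Hypothetical test signal: x -> r * x * (1 - x), with r = 4 assumed."""
    x, out = x0, []
    for i in range(n + burn_in):
        x = r * x * (1.0 - x)
        if i >= burn_in:
            out.append(x)
    return out


if __name__ == "__main__":
    random.seed(0)
    noise = [random.random() for _ in range(2000)]
    chaos = logistic_map(2000)
    for n in (50, 200, 2000):
        print(n,
              round(permutation_entropy(noise[:n], m=3), 3),
              round(permutation_entropy(chaos[:n], m=3), 3))
    # Patterns that never appear in the chaotic series ("forbidden" patterns).
    print(math.factorial(3) - len(ordinal_patterns(chaos, m=3)))
```

Under the r = 4 assumption, the strictly decreasing pattern of order 3 cannot occur in the logistic series, because a decrease requires a value above 3/4 and the next value is then at most 3/4. This is one concrete instance of the forbidden patterns mentioned in the summary. Truncating the input to the first <i>N</i> samples, as in the loop above, gives a rough way to watch how the estimate changes with series length.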
ISSN:1099-4300