LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS

The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures, such as hash tables, binary search trees, and suffix trees have bee...
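As a rough illustration of the repeated substring search the abstract refers to, the following Python sketch encodes a string into LZ77-style tokens, using a suffix array over the current window to find the longest match. It is not the authors' algorithm: it rebuilds the suffix array from scratch for every token rather than updating it, and it does not toggle between two suffix arrays. All function names, the token layout, and the default window and lookahead sizes are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical token layout: (offset inside window, match length, next symbol)
Token = Tuple[int, int, str]


def build_suffix_array(s: str) -> List[int]:
    """Naive suffix array: suffix start positions in lexicographic order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def common_prefix_len(a: str, b: str) -> int:
    """Length of the longest common prefix of a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def longest_match(window: str, sa: List[int], pattern: str) -> Tuple[int, int]:
    """Return (position, length) of the longest prefix of pattern found in window.

    Binary search locates where pattern would be inserted among the sorted
    suffixes; the best match is one of the two neighbours of that slot.
    """
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if window[sa[mid]:] < pattern:
            lo = mid + 1
        else:
            hi = mid
    best_pos, best_len = 0, 0
    for idx in (lo - 1, lo):
        if 0 <= idx < len(sa):
            n = common_prefix_len(window[sa[idx]:], pattern)
            if n > best_len:
                best_pos, best_len = sa[idx], n
    return best_pos, best_len


def lz77_encode(data: str, window_size: int = 16, lookahead_size: int = 8) -> List[Token]:
    """Encode data as tokens; matches are confined to the window (no overlap)."""
    tokens: List[Token] = []
    i = 0
    while i < len(data):
        start = max(0, i - window_size)
        window = data[start:i]
        sa = build_suffix_array(window)  # rebuilt per token: simple, not efficient
        lookahead = data[i:i + lookahead_size]
        # Keep the last lookahead symbol out of the search so a literal always remains.
        pos, length = longest_match(window, sa, lookahead[:-1])
        tokens.append((pos, length, lookahead[length]))
        i += length + 1
    return tokens


def lz77_decode(tokens: List[Token], window_size: int = 16) -> str:
    """Invert lz77_encode; window_size must match the encoder's value."""
    out: List[str] = []
    for pos, length, symbol in tokens:
        start = max(0, len(out) - window_size)
        out.extend(out[start + pos:start + pos + length])
        out.append(symbol)
    return "".join(out)
```

Rebuilding the suffix array for every token is the costly step in this sketch; according to the abstract, the paper's contribution is an efficient way to update the window instead.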

Bibliographic Details
Main Authors:	Artur Ferreira, Arlindo Oliveira, Mario Figueiredo
Format:	Article
Language:	English
Published:	Instituto Superior de Engenharia de Lisboa (ISEL) 2013-06-01
Series:	ISEL Academic Journal of Electronics, Telecommunications and Computers
Subjects:	Lempel-Ziv compression; suffix arrays; sliding window update; substring search
Online Access:	http://journals.isel.pt/index.php/i-ETC/article/view/6

id	doaj-9ca6c1cce13e4a318486b5931d64bac2
record_format	Article
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Artur Ferreira; Arlindo Oliveira; Mario Figueiredo
author_facet	Artur Ferreira; Arlindo Oliveira; Mario Figueiredo
author_sort	Artur Ferreira
title	LEMPEL-ZIV SLIDING WINDOW UPDATE WITH SUFFIX ARRAYS
title_sort	lempel-ziv sliding window update with suffix arrays
publisher	Instituto Superior de Engenharia de Lisboa (ISEL)
series	ISEL Academic Journal of Electronics, Telecommunications and Computers
issn	2182-4010
publishDate	2013-06-01
description	The sliding window dictionary-based algorithms of the Lempel-Ziv (LZ) 77 family are widely used for universal lossless data compression. The encoding component of these algorithms performs repeated substring search. Data structures such as hash tables, binary search trees, and suffix trees have been used to speed up these searches, at the expense of memory usage. Previous work has shown how suffix arrays (SAs) can be used for dictionary representation and LZ77 decomposition. In this paper, we improve on that work by proposing a new, efficient algorithm to update the sliding window each time a token is produced at the output. The proposed algorithm toggles between two SAs on consecutive tokens. The resulting SA-based encoder requires less memory than conventional tree-based encoders. Comparing our SA-based technique against tree-based encoders on a large set of benchmark files, we find that, in some compression settings, our encoder is also faster.
topic	Lempel-Ziv compression; suffix arrays; sliding window update; substring search
url	http://journals.isel.pt/index.php/i-ETC/article/view/6

Similar Items