Higher Compression from the Burrows-Wheeler Transform with New Algorithms for the List Update Problem

Burrows-Wheeler compression is a three-stage process: the data is transformed with the Burrows-Wheeler Transform, then transformed with Move-To-Front, and finally encoded with an entropy coder. Move-To-Front, Transpose, and Frequency Count are some of the many algorithms used on the List Upd...

Bibliographic Details
Main Author: Chapin, Brenton
Other Authors: Tate, Stephen R.
Format: Others
Language: English
Published: University of North Texas 2001
Subjects: Data compression (Computer science); Burrows-Wheeler; data compression; list update
Online Access: https://digital.library.unt.edu/ark:/67531/metadc2909/
id ndltd-unt.edu-info-ark-67531-metadc2909
record_format oai_dc
spelling ndltd-unt.edu-info-ark-67531-metadc2909 2017-03-17T08:35:50Z
University of North Texas
Tate, Stephen R.; Fisher, Paul S.; Renka, Robert J.; Jacob, Roy T.
2001-08
Thesis or Dissertation
Text
oclc: 50988945
https://digital.library.unt.edu/ark:/67531/metadc2909/
ark: ark:/67531/metadc2909
English
Public
Copyright Chapin, Brenton
Copyright is held by the author, unless otherwise noted. All rights reserved.
collection NDLTD
language English
format Others
sources NDLTD
topic Data compression (Computer science)
Burrows-Wheeler
data compression
list update
description Burrows-Wheeler compression is a three-stage process: the data is transformed with the Burrows-Wheeler Transform, then transformed with Move-To-Front, and finally encoded with an entropy coder. Move-To-Front, Transpose, and Frequency Count are some of the many algorithms used on the List Update problem. In 1985, competitive analysis first showed the superiority of Move-To-Front over Transpose and Frequency Count for the List Update problem with arbitrary data. Earlier studies due to Bitner assumed independent, identically distributed data and showed that while Move-To-Front adapts to a distribution faster, incurring less overwork, the asymptotic costs of Frequency Count and Transpose are lower. The improvements to Burrows-Wheeler compression covered in this work increase the amount, not the speed, of compression. Best x of 2x - 1 is a new family of algorithms created to improve on Move-To-Front's processing of the output of the Burrows-Wheeler Transform, which resembles piecewise independent, identically distributed data. Several variations of Move One From Front and part of the randomized algorithm Timestamp are also analyzed, in terms of overwork, asymptotic cost, and competitive ratio, both as the middle stage of Burrows-Wheeler compression and as List Update algorithms. The Best x of 2x - 1 family includes Move-To-Front, the part of Timestamp of interest, and Frequency Count. Lastly, a greedy choosing scheme, Snake, switches back and forth between two List Update algorithms as the amount of compression each achieves fluctuates, to increase overall compression.
The Burrows-Wheeler Transform is based on sorting of contexts. The other improvements are better sorting orders, such as “aeioubcdf...” instead of the standard alphabetical “abcdefghi...” on English text data, together with an algorithm for computing orders for any data, and Gray code sorting instead of standard sorting. Both techniques lessen the overwork incurred by whatever List Update algorithm is used by reducing the difference between adjacent sorted contexts.
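The first two stages named in the description, and the three classic List Update rules it mentions, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the naive rotation sort, and the "\0" end-of-input sentinel are all assumptions made for illustration, and the sketch does not cover Best x of 2x - 1, Move One From Front, Timestamp, Snake, or the alternative sorting orders.

```python
def bwt(text: str, sentinel: str = "\0") -> str:
    """Naive BWT: sort every rotation of text + sentinel, keep the last column.

    The sentinel (an assumption of this sketch) marks the end of the input
    and must not occur in it. Sorting full rotations costs O(n^2 log n).
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def move_to_front(data: str, alphabet: list[str]) -> list[int]:
    """Emit each symbol's current list position, then move it to the front."""
    table = list(alphabet)
    out = []
    for ch in data:
        idx = table.index(ch)
        out.append(idx)
        table.insert(0, table.pop(idx))
    return out


def transpose(data: str, alphabet: list[str]) -> list[int]:
    """Emit each symbol's position, then swap it with its predecessor."""
    table = list(alphabet)
    out = []
    for ch in data:
        idx = table.index(ch)
        out.append(idx)
        if idx > 0:
            table[idx - 1], table[idx] = table[idx], table[idx - 1]
    return out


def frequency_count(data: str, alphabet: list[str]) -> list[int]:
    """Emit each symbol's position, then keep the list ordered by access count."""
    table = list(alphabet)
    counts = {symbol: 0 for symbol in table}
    out = []
    for ch in data:
        idx = table.index(ch)
        out.append(idx)
        counts[ch] += 1
        # Move forward past every predecessor with a strictly smaller count.
        while idx > 0 and counts[table[idx - 1]] < counts[ch]:
            table[idx - 1], table[idx] = table[idx], table[idx - 1]
            idx -= 1
    return out


if __name__ == "__main__":
    transformed = bwt("banana")
    alphabet = sorted(set(transformed))
    print(move_to_front(transformed, alphabet))  # [1, 3, 0, 3, 3, 3, 0]
```

In this sketch, a repeated symbol in the transformed string is encoded as 0 by Move-To-Front, which is what makes the output cheap for the final entropy coder. Swapping in `transpose` or `frequency_count` for `move_to_front` gives a rough way to compare the position sequences the three rules produce on the same input.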
author2 Tate, Stephen R.
author Chapin, Brenton
title Higher Compression from the Burrows-Wheeler Transform with New Algorithms for the List Update Problem
publisher University of North Texas
publishDate 2001
url https://digital.library.unt.edu/ark:/67531/metadc2909/