Approximation Ratios of RePair, LongestMatch and Greedy on Unary Strings

A grammar-based compressor is an algorithm that receives a word and outputs a context-free grammar that only produces this word. The approximation ratio for a single input word is the size of the grammar produced for this word divided by the size of a smallest grammar for this word. The worst-case a...
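
To make "the size of the grammar produced for this word" concrete, the following is a minimal Python sketch of a RePair-style compressor run on unary strings. It is a simplified illustration and not the authors' implementation. It makes several assumptions: grammar size is taken to be the total length of all right-hand sides, ties between equally frequent pairs are broken by first occurrence, and the function names (`repair`, `grammar_size`, `expand`) are hypothetical. The sketch computes only the numerator of the approximation ratio. The size of a smallest grammar, which is the denominator, is not computed here.

```python
from collections import Counter


def count_pairs(seq):
    """Count non-overlapping occurrences of each adjacent pair, left to right."""
    counts = Counter()
    last_start = {}  # pair -> start index of its most recently counted occurrence
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        # Skip an occurrence that overlaps the previously counted one
        # (only possible for pairs of equal symbols, e.g. "aa" inside "aaa").
        if last_start.get(pair) == i - 1:
            continue
        counts[pair] += 1
        last_start[pair] = i
    return counts


def replace_pair(seq, pair, symbol):
    """Replace non-overlapping occurrences of `pair` by `symbol`, left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def repair(word):
    """Return a grammar as a dict: nonterminal (int) -> right-hand side (list).

    Terminals are one-character strings, nonterminals are the integers
    1, 2, ..., and the start rule is stored under key 0 (a convention
    chosen for this sketch).
    """
    seq = list(word)
    rules = {}
    while True:
        counts = count_pairs(seq)
        if not counts:
            break
        pair, freq = max(counts.items(), key=lambda kv: kv[1])
        if freq < 2:  # no pair occurs at least twice: stop
            break
        symbol = len(rules) + 1
        rules[symbol] = list(pair)
        seq = replace_pair(seq, pair, symbol)
    rules[0] = seq
    return rules


def grammar_size(rules):
    """Total length of all right-hand sides (assumed size measure)."""
    return sum(len(rhs) for rhs in rules.values())


def expand(rules, symbol=0):
    """Derive the unique word produced from `symbol`."""
    if isinstance(symbol, str):
        return symbol
    return "".join(expand(rules, s) for s in rules[symbol])


if __name__ == "__main__":
    for n in (7, 8, 100):
        g = repair("a" * n)
        assert expand(g) == "a" * n  # the grammar produces only the input word
        print(n, grammar_size(g))
```

For the unary word of length 8, the sketch first replaces the pair aa by a nonterminal, then replaces the resulting pair of nonterminals, and stops with three rules of length 2 each, for a grammar size of 6.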


Bibliographic Details
Main Authors: Danny Hucke, Carl Philipp Reh
Format: Article
Language: English
Published: MDPI AG 2021-02-01
Series: Algorithms
Subjects: data compression, grammar-based compression, approximation algorithm, addition chain
Online Access: https://www.mdpi.com/1999-4893/14/2/65
id doaj-4008000827fd4605befdf45daf149383
record_format Article
spelling doaj-4008000827fd4605befdf45daf149383
2021-02-21T00:01:54Z
eng
MDPI AG
Algorithms
1999-4893
2021-02-01
14
65
65
10.3390/a14020065
Approximation Ratios of RePair, LongestMatch and Greedy on Unary Strings
Danny Hucke (0)
Carl Philipp Reh (1)
(0) Department Elektrotechnik und Informatik, Universität Siegen, D-57068 Siegen, Germany
(1) Department Elektrotechnik und Informatik, Universität Siegen, D-57068 Siegen, Germany
A grammar-based compressor is an algorithm that receives a word and outputs a context-free grammar that only produces this word. The approximation ratio for a single input word is the size of the grammar produced for this word divided by the size of a smallest grammar for this word. The worst-case approximation ratio of a grammar-based compressor for a given word length is the largest approximation ratio over all input words of that length. In this work, we study the worst-case approximation ratio of the algorithms Greedy, RePair and LongestMatch on unary strings, i.e., strings that only make use of a single symbol. Our main contribution is to show the improved upper bound of $\mathcal{O}((\log n)^8 \cdot (\log\log n)^3)$ for the worst-case approximation ratio of Greedy. In addition, we also show the lower bound of $1.34847194\cdots$ for the worst-case approximation ratio of Greedy, and that RePair and LongestMatch have a worst-case approximation ratio of $\log_2(3)$.
https://www.mdpi.com/1999-4893/14/2/65
data compression
grammar-based compression
approximation algorithm
addition chain
collection DOAJ
language English
format Article
sources DOAJ
author Danny Hucke
Carl Philipp Reh
title Approximation Ratios of RePair, LongestMatch and Greedy on Unary Strings
publisher MDPI AG
series Algorithms
issn 1999-4893
publishDate 2021-02-01
description A grammar-based compressor is an algorithm that receives a word and outputs a context-free grammar that only produces this word. The approximation ratio for a single input word is the size of the grammar produced for this word divided by the size of a smallest grammar for this word. The worst-case approximation ratio of a grammar-based compressor for a given word length is the largest approximation ratio over all input words of that length. In this work, we study the worst-case approximation ratio of the algorithms Greedy, RePair and LongestMatch on unary strings, i.e., strings that only make use of a single symbol. Our main contribution is to show the improved upper bound of $\mathcal{O}((\log n)^8 \cdot (\log\log n)^3)$ for the worst-case approximation ratio of Greedy. In addition, we also show the lower bound of $1.34847194\cdots$ for the worst-case approximation ratio of Greedy, and that RePair and LongestMatch have a worst-case approximation ratio of $\log_2(3)$.
topic data compression
grammar-based compression
approximation algorithm
addition chain
url https://www.mdpi.com/1999-4893/14/2/65
work_keys_str_mv AT dannyhucke approximationratiosofrepairlongestmatchandgreedyonunarystrings
AT carlphilippreh approximationratiosofrepairlongestmatchandgreedyonunarystrings
_version_ 1724258956167086080