Improved N-Best Extraction with an Evaluation on Language Data


Bibliographic Details
Main Authors: Björklund, J. (Author), Drewes, F. (Author), Jonsson, A. (Author)
Format: Article
Language: English
Published: MIT Press Journals 2022
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02061nam a2200325Ia 4500
001 10.1162-COLI_a_00427
008 220425s2022 CNT 000 0 und d
020 |a 08912017 (ISSN) 
245 1 0 |a Improved N-Best Extraction with an Evaluation on Language Data 
260 0 |b MIT Press Journals  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1162/COLI_a_00427 
520 3 |a We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. Compared with the original algorithm, the modifications increase the laziness of the evaluation strategy, which makes the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software BETTY, and compared to the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit TIBURON. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that BETTY outperforms TIBURON on all tested data sets with respect to running time, while TIBURON seems to be the more memory-efficient choice. © 2022 Association for Computational Linguistics 
650 0 4 |a Computational linguistics 
650 0 4 |a Data set 
650 0 4 |a Evaluation strategies 
650 0 4 |a Forestry 
650 0 4 |a Integer-N 
650 0 4 |a Natural language processing systems 
650 0 4 |a Original algorithms 
650 0 4 |a Search spaces 
650 0 4 |a Software toolkits 
650 0 4 |a State-of-the-art algorithms 
650 0 4 |a Tree automata 
650 0 4 |a Trees (mathematics) 
650 0 4 |a Tropical semiring 
650 0 4 |a Weighted tree 
700 1 |a Björklund, J.  |e author 
700 1 |a Drewes, F.  |e author 
700 1 |a Jonsson, A.  |e author 
773 |t Computational Linguistics
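
To make the problem statement in the abstract concrete, the following is a minimal brute-force sketch of N-best tree extraction for a toy wta over the tropical semiring (min, +). It is not the BETTY algorithm described in the article, nor the TIBURON one: it enumerates every tree up to an assumed height bound and sorts by weight, which is only adequate for tiny examples. The automaton (`TRANSITIONS`, `FINAL`), the tuple encoding of trees, and all function names are hypothetical and chosen for illustration.

```python
from itertools import product
import math

# Hypothetical toy wta. Keys are (symbol, child states); values list the
# (target state, weight) pairs of the matching transitions.
TRANSITIONS = {
    ("a", ()): [("q", 1.0), ("p", 0.25)],
    ("b", ()): [("q", 3.0)],
    ("f", ("q", "q")): [("q", 0.5)],
    ("f", ("p", "q")): [("q", 0.25)],
}
FINAL = {"q": 0.0}  # final weights; states not listed are non-final

# Rank (arity) of each symbol, read off the transition table.
RANKS = {symbol: len(states) for (symbol, states) in TRANSITIONS}


def state_weights(tree):
    """Map each state to the minimal weight of a run on `tree` ending in it.

    A tree is a tuple (symbol, child_1, ..., child_k).
    """
    symbol, children = tree[0], tree[1:]
    child_maps = [state_weights(child) for child in children]
    best = {}
    for combo in product(*(m.items() for m in child_maps)):
        states = tuple(state for state, _ in combo)
        below = sum(weight for _, weight in combo)
        for target, weight in TRANSITIONS.get((symbol, states), []):
            total = below + weight  # tropical "multiplication" is +
            if total < best.get(target, math.inf):  # tropical "addition" is min
                best[target] = total
    return best


def tree_weight(tree):
    """Weight of `tree`: the minimum over all runs ending in a final state."""
    weights = state_weights(tree)
    return min((w + FINAL[q] for q, w in weights.items() if q in FINAL),
               default=math.inf)


def trees_up_to_height(height):
    """All trees over the ranked alphabet with height at most `height`."""
    leaves = [(s,) for s, rank in RANKS.items() if rank == 0]
    if height == 0:
        return leaves
    smaller = trees_up_to_height(height - 1)
    inner = [(s,) + kids
             for s, rank in RANKS.items() if rank > 0
             for kids in product(smaller, repeat=rank)]
    return leaves + inner


def n_best(n, max_height):
    """Brute force: score every bounded-height tree, keep the n lightest."""
    scored = [(tree_weight(t), t) for t in trees_up_to_height(max_height)]
    return sorted(pair for pair in scored if pair[0] < math.inf)[:n]


for weight, tree in n_best(3, max_height=2):
    print(weight, tree)
```

The height bound is the main assumption here: the result is only the true N-best list if no taller tree is lighter than the trees returned, which holds for this toy automaton because every transition weight is positive. A lazy best-first strategy of the kind the abstract describes avoids both the bound and the exhaustive enumeration.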