Scalable Reordering Models for SMT based on Multiclass SVM

In state-of-the-art phrase-based statistical machine translation systems, modelling phrase reorderings is an important need to enhance naturalness of the translated outputs, particularly when the grammatical structures of the language pairs differ significantly. Posing phrase movements as a classifi...


Bibliographic Details
Main Authors: Alrajeh, Abdullah; Niranjan, Mahesan
Format: Article
Language: English
Published: Sciendo 2015-04-01
Series: Prague Bulletin of Mathematical Linguistics
Online Access: https://doi.org/10.1515/pralin-2015-0004
id doaj-a3f1687bb4274cfbaf89d27de5eceb6b
record_format Article
spelling doaj-a3f1687bb4274cfbaf89d27de5eceb6b
2021-09-05T13:59:53Z
eng
Sciendo
Prague Bulletin of Mathematical Linguistics, ISSN 1804-0462
2015-04-01, vol. 103, no. 1, pp. 65-84
doi:10.1515/pralin-2015-0004 (pralin-2015-0004)
Scalable Reordering Models for SMT based on Multiclass SVM
Alrajeh Abdullah (0), Niranjan Mahesan (1)
(0) School of Electronics and Computer Science, University of Southampton/Computer Research Institute, King Abdulaziz City for Science and Technology (KACST)
(1) Computer Research Institute, King Abdulaziz City for Science and Technology (KACST)
collection DOAJ
issn 1804-0462
description In state-of-the-art phrase-based statistical machine translation systems, modelling phrase reorderings is an important need to enhance naturalness of the translated outputs, particularly when the grammatical structures of the language pairs differ significantly. Posing phrase movements as a classification problem, we exploit recent developments in solving large-scale multiclass support vector machines. Using dual coordinate descent methods for learning, we provide a mechanism to shrink the amount of training data required for each iteration. Hence, we produce significant computational savings while preserving the accuracy of the models. Our approach is a couple of times faster than the maximum entropy approach and more memory-efficient (50% reduction). Experiments were carried out on an Arabic-English corpus with more than a quarter of a billion words. We achieve BLEU score improvements on top of a strong baseline system with sparse reordering features.
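
The abstract treats phrase reordering as a classification problem solved with dual coordinate descent. The sketch below illustrates only that general idea and is not the authors' implementation. It substitutes one-vs-rest binary hinge-loss SVMs for the direct multiclass formulation and omits the shrinking mechanism the abstract mentions. The class names (monotone, swap, discontinuous), the four indicator features, and all function names are assumptions made for illustration.

```python
import random


def dot(w, x):
    """Inner product of two equal-length dense vectors."""
    return sum(wj * xj for wj, xj in zip(w, x))


def train_binary_svm_dcd(X, y, C=10.0, epochs=200, seed=0):
    """Dual coordinate descent for a linear hinge-loss SVM (labels in {-1, +1}).

    The bias is handled as a constant feature inside X, so no separate
    intercept is learned. Returns the primal weight vector w.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    alpha = [0.0] * n                      # one dual variable per example
    q_diag = [dot(x, x) for x in X]        # diagonal of the dual Hessian
    order = list(range(n))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            if q_diag[i] == 0.0:
                continue                   # all-zero example: nothing to update
            grad = y[i] * dot(w, X[i]) - 1.0
            # Newton step on one coordinate, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] - grad / q_diag[i], 0.0), C)
            delta = new_alpha - alpha[i]
            if delta != 0.0:
                alpha[i] = new_alpha
                for j in range(d):
                    w[j] += delta * y[i] * X[i][j]
    return w


def train_one_vs_rest(X, labels, classes):
    """Train one binary SVM per class (a stand-in for a true multiclass SVM)."""
    return {
        c: train_binary_svm_dcd(X, [1 if lab == c else -1 for lab in labels])
        for c in classes
    }


def predict(models, x):
    """Pick the class whose model gives the highest score."""
    return max(models, key=lambda c: dot(models[c], x))


# Hypothetical toy data: [bias, feature_1, feature_2, feature_3].
CLASSES = ["monotone", "swap", "discontinuous"]
X = [
    [1, 1, 0, 0], [1, 1, 0, 1],   # monotone
    [1, 0, 1, 0], [1, 0, 1, 1],   # swap
    [1, 0, 0, 0], [1, 0, 0, 1],   # discontinuous
]
labels = ["monotone", "monotone", "swap", "swap",
          "discontinuous", "discontinuous"]

models = train_one_vs_rest(X, labels, CLASSES)
predictions = [predict(models, x) for x in X]
print(predictions)
```

Each update touches a single training example and adjusts the weight vector in place, which is what makes coordinate methods of this kind attractive for very large corpora. A shrinking scheme would additionally skip examples whose dual variables are stuck at a bound.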