OxLM: A Neural Language Modelling Framework for Machine Translation
This paper presents an open source implementation of a neural language model for machine translation. Neural language models deal with the problem of data sparsity by learning distributed representations for words in a continuous vector space. The language modelling probabilities are estimated by projecting a word's context in the same space as the word representations and by assigning probabilities proportional to the distance between the words and the context's projection. Neural language models are notoriously slow to train and test. Our framework is designed with scalability in mind and provides two optional techniques for reducing the computational cost: the so-called class decomposition trick and a training algorithm based on noise contrastive estimation. Our models may be extended to incorporate direct n-gram features to learn weights for every n-gram in the training data. Our framework comes with wrappers for the cdec and Moses translation toolkits, allowing our language models to be incorporated as normalized features in their decoders (inside the beam search).
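The abstract's description of probability estimation (project the context into the word-representation space, then score every vocabulary word against that projection) can be sketched with toy, randomly initialized parameters. The dimensions, parameter names, and per-position projection matrices below are illustrative assumptions, not OxLM's actual code or API:

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, N = 1000, 50, 4  # vocabulary size, embedding dim, context length (toy values)

R = rng.normal(scale=0.1, size=(V, D))     # target-word representations
Q = rng.normal(scale=0.1, size=(V, D))     # context-word representations
C = rng.normal(scale=0.1, size=(N, D, D))  # one projection matrix per context position

def next_word_probs(context_ids):
    # Project the context into the same space as the word representations.
    p = sum(C[i] @ Q[w] for i, w in enumerate(context_ids))
    # Score every word against the projection, then normalize.
    scores = R @ p
    e = np.exp(scores - scores.max())  # numerically stable softmax over the full vocabulary
    return e / e.sum()

probs = next_word_probs([3, 17, 42, 7])
```

The softmax over the full vocabulary on the last line is exactly the cost that the class decomposition trick and noise contrastive estimation are meant to reduce.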
Main Authors: | Paul Baltescu, Phil Blunsom, Hieu Hoang
---|---
Format: | Article
Language: | English
Published: | Sciendo, 2014-09-01
Series: | Prague Bulletin of Mathematical Linguistics
Online Access: | https://doi.org/10.2478/pralin-2014-0016
id | doaj-ae1313f82b654c5aa28e5447cd195b01
---|---
record_format | Article
spelling | Paul Baltescu (University of Oxford, Department of Computer Science); Phil Blunsom (University of Oxford, Department of Computer Science); Hieu Hoang (University of Edinburgh, School of Informatics). Prague Bulletin of Mathematical Linguistics, ISSN 1804-0462, No. 102 (2014-09-01), pp. 81–92. DOI: 10.2478/pralin-2014-0016
collection | DOAJ
language | English
format | Article
sources | DOAJ
author | Paul Baltescu; Phil Blunsom; Hieu Hoang
title | OxLM: A Neural Language Modelling Framework for Machine Translation
publisher | Sciendo
series | Prague Bulletin of Mathematical Linguistics
issn | 1804-0462
publishDate | 2014-09-01
description | This paper presents an open source implementation of a neural language model for machine translation. Neural language models deal with the problem of data sparsity by learning distributed representations for words in a continuous vector space. The language modelling probabilities are estimated by projecting a word's context in the same space as the word representations and by assigning probabilities proportional to the distance between the words and the context's projection. Neural language models are notoriously slow to train and test. Our framework is designed with scalability in mind and provides two optional techniques for reducing the computational cost: the so-called class decomposition trick and a training algorithm based on noise contrastive estimation. Our models may be extended to incorporate direct n-gram features to learn weights for every n-gram in the training data. Our framework comes with wrappers for the cdec and Moses translation toolkits, allowing our language models to be incorporated as normalized features in their decoders (inside the beam search).
url | https://doi.org/10.2478/pralin-2014-0016
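The class decomposition trick named in the abstract replaces the flat softmax over the vocabulary with two smaller softmaxes: one over word classes, then one over the words inside the predicted word's class. A minimal sketch, assuming a toy class assignment and made-up parameter names (real systems typically cluster words by frequency):

```python
import numpy as np

rng = np.random.default_rng(1)
V, D, K = 1000, 50, 32  # vocabulary size, embedding dim, number of classes (toy values)

word_class = np.arange(V) % K  # toy class assignment; every class is non-empty
members = [np.flatnonzero(word_class == k) for k in range(K)]

R = rng.normal(scale=0.1, size=(V, D))  # word representations
S = rng.normal(scale=0.1, size=(K, D))  # class representations

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def word_prob(w, p):
    # p(w | h) = p(class(w) | h) * p(w | class(w), h):
    # one softmax over K classes plus one over ~V/K words, instead of one over V.
    k = word_class[w]
    p_class = softmax(S @ p)[k]
    in_class = members[k]
    p_word = softmax(R[in_class] @ p)[np.searchsorted(in_class, w)]
    return float(p_class * p_word)

p = rng.normal(size=D)  # a projected context vector (toy stand-in)
total = sum(word_prob(w, p) for w in range(V))  # still a proper distribution
```

With K on the order of the square root of the vocabulary size, each probability costs roughly O(√V) rather than O(V) dot products, which is the kind of speed-up this decomposition targets.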