On the Softmax Bottleneck of Word-Level Recurrent Language Models

To predict the next word for a given input context (a sequence of previous words), a neural word-level language model outputs a probability distribution over all the words in the vocabulary using a softmax function. When the log-probability outputs for all such contexts are stacked together, th...
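
The setup described in the abstract can be illustrated with a minimal sketch (the dimensions, variable names, and NumPy implementation below are illustrative assumptions, not taken from the thesis): each context is encoded as a hidden vector, a softmax layer turns its scores over the vocabulary into a probability distribution, and stacking the log-probabilities of many contexts gives a contexts-by-vocabulary matrix whose rank is bounded by the hidden size rather than the vocabulary size, which is the limitation the title refers to as the softmax bottleneck.

```python
import numpy as np

# Toy dimensions (illustrative only; not from the thesis).
d, V, N = 8, 50, 20            # hidden size, vocabulary size, number of contexts

rng = np.random.default_rng(0)
W = rng.normal(size=(d, V))    # softmax (output embedding) weights
b = rng.normal(size=V)         # output bias
H = rng.normal(size=(N, d))    # one hidden vector per input context

def log_softmax(logits):
    # Numerically stable log-softmax over the vocabulary axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

logits = H @ W + b                 # (N, V) scores, one row per context
log_probs = log_softmax(logits)    # stacked log-probability matrix, shape (N, V)

# Each row is a probability distribution over the vocabulary.
assert np.allclose(np.exp(log_probs).sum(axis=1), 1.0)

# The rank of the stacked matrix is limited by the hidden size d
# (here at most d + 2 = 10), not by the vocabulary size V = 50:
# this rank limitation is what "softmax bottleneck" refers to.
print(log_probs.shape, np.linalg.matrix_rank(log_probs))
```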

Bibliographic Details
Main Author: Parthiban, Dwarak Govind
Other Authors: Inkpen, Diana
Format: Others
Language: en
Published: Université d'Ottawa / University of Ottawa 2020
Subjects:
Online Access:http://hdl.handle.net/10393/41412
http://dx.doi.org/10.20381/ruor-25636