Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from the publicly released checkpoints, NLP practitioners have pushed the state-of-the-art on multiple benchmarks while saving significant amounts of compute time.
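
The sketch below is not from the article itself; it is a minimal illustration, assuming the Hugging Face transformers library and its EncoderDecoderModel wrapper, of how an encoder-decoder model can be warm-started from a public BERT checkpoint for a sequence generation task. The checkpoint name "bert-base-uncased" and the example input are illustrative placeholders.

# Minimal sketch (assumption, not the paper's released code): warm-start an
# encoder-decoder model from a public BERT checkpoint with Hugging Face transformers.
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Initialize both encoder and decoder from the same public checkpoint;
# the decoder's cross-attention weights are newly (randomly) initialized.
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Special tokens required for generation.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# After fine-tuning on a generation task (e.g. summarization), the model
# can be used for inference:
inputs = tokenizer("An example input document to summarize.", return_tensors="pt")
output_ids = model.generate(inputs.input_ids, max_length=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))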


Bibliographic Details
Main Authors: Rothe, Sascha; Narayan, Shashi; Severyn, Aliaksei
Format: Article
Language: English
Published: The MIT Press, 2020-07-01
Series: Transactions of the Association for Computational Linguistics
Online Access: https://www.mitpressjournals.org/doi/abs/10.1162/tacl_a_00313