Named entity recognition in Slovene text

This paper presents an approach to, and an implementation of, a machine-learning-based named entity extractor for the Slovene language. It is designed as a supervised algorithm based on Conditional Random Fields and is trained on the ssj500k annotated corpus of Slovene. The corpus, which is av...

Bibliographic Details
Main Authors: Tadej Štajner, Tomaž Erjavec, Simon Krek
Format: Article
Language: English
Published: Znanstvena založba Filozofske fakultete Univerze v Ljubljani (Ljubljana University Press, Faculty of Arts) 2013-12-01
Series: Slovenščina 2.0: Empirične, aplikativne in interdisciplinarne raziskave
Online Access: http://www.trojina.org/slovenscina2.0/arhiv/2013/2/Slo2.0_2013_2_04.pdf