Named Entity Recognition for Spanish language and applications in technology forecasting

Named Entity Recognition (NER) is a core task in Natural Language Processing. On the one hand, it supports the extraction of information from unstructured data. On the other hand, NER can be cast as a probabilistic graphical model that lets us represent the conditional independence assumptions in...

Bibliographic Details
Main Authors: Raúl Gutiérrez, Andrés Castillo, Víctor Bucheli, Oswaldo Solarte
Format: Article
Language: Spanish
Published: Instituto Antioqueño de Investigación (IAI) 2015-12-01
Series: Revista Antioqueña de las Ciencias Computacionales y la Ingeniería de Software (RACCIS)
Subjects:
Online Access: http://fundacioniai.org//raccis/v5n2/n9a6.pdf
id doaj-5fad1891e84449109f93c64e1a6ecf0f
record_format Article
collection DOAJ
language Spanish
format Article
sources DOAJ
author Raúl Gutiérrez
Andrés Castillo
Víctor Bucheli
Oswaldo Solarte
title Named Entity Recognition for Spanish language and applications in technology forecasting
publisher Instituto Antioqueño de Investigación (IAI)
series Revista Antioqueña de las Ciencias Computacionales y la Ingeniería de Software (RACCIS)
issn 2248-7441
publishDate 2015-12-01
description Named Entity Recognition (NER) is a core task in Natural Language Processing. On the one hand, it supports the extraction of information from unstructured data. On the other hand, NER can be cast as a probabilistic graphical model that lets us represent the conditional independence assumptions in sequence labelling. In this paper, we propose a discriminative graphical model based on linear-chain Conditional Random Fields (CRFs). We report experiments on the CoNLL-2002 shared task and the AnCora corpus, evaluated by recall, precision, and F-score. Our contributions are the following. First, we tested our baseline on the CoNLL-2002 shared task and on the AnCora corpus, obtaining F1-measures of 80% and 59%, respectively. Finally, the Vigtech application allows us to identify information and patterns on the topic of cancer; we discuss the results in terms of model performance and the information useful for supporting the forecasting process.
topic Named Entity Recognition
natural language processing
artificial intelligence
url http://fundacioniai.org//raccis/v5n2/n9a6.pdf