Named Entity Recognition for Spanish language and applications in technology forecasting
Named Entity Recognition (NER) is a core task in Natural Language Processing. On the one hand, it supports the extraction of information from unstructured data. On the other hand, NER can be approached with a probabilistic graphical model that represents the conditional independence assumptions in...
Main Authors: | Raúl Gutiérrez, Andrés Castillo, Víctor Bucheli, Oswaldo Solarte |
---|---|
Format: | Article |
Language: | Spanish |
Published: | Instituto Antioqueño de Investigación (IAI), 2015-12-01 |
Series: | Revista Antioqueña de las Ciencias Computacionales y la Ingeniería de Software (RACCIS) |
Subjects: | Named Entity Recognition; natural language processing; artificial intelligence |
Online Access: | http://fundacioniai.org//raccis/v5n2/n9a6.pdf |
id |
doaj-5fad1891e84449109f93c64e1a6ecf0f |
---|---|
record_format |
Article |
spelling |
doaj-5fad1891e84449109f93c64e1a6ecf0f; 2020-11-25T00:24:00Z; spa; Instituto Antioqueño de Investigación (IAI); Revista Antioqueña de las Ciencias Computacionales y la Ingeniería de Software (RACCIS); 2248-7441; 2015-12-01; 5(2), 43-47; Named Entity Recognition for Spanish language and applications in technology forecasting; Raúl Gutiérrez (0), Andrés Castillo (1), Víctor Bucheli (2), Oswaldo Solarte (3); Universidad del Valle (all four authors); http://fundacioniai.org//raccis/v5n2/n9a6.pdf; Named Entity Recognition; natural language processing; artificial intelligence |
collection |
DOAJ |
language |
Spanish |
format |
Article |
sources |
DOAJ |
author |
Raúl Gutiérrez, Andrés Castillo, Víctor Bucheli, Oswaldo Solarte |
author_sort |
Raúl Gutiérrez |
title |
Named Entity Recognition for Spanish language and applications in technology forecasting |
publisher |
Instituto Antioqueño de Investigación (IAI) |
series |
Revista Antioqueña de las Ciencias Computacionales y la Ingeniería de Software (RACCIS) |
issn |
2248-7441 |
publishDate |
2015-12-01 |
description |
Named Entity Recognition (NER) is a core task in Natural Language Processing. On the one hand, it supports the extraction of information from unstructured data. On the other hand, NER can be approached with a probabilistic graphical model that represents the conditional independence assumptions in sequence labelling. In this paper, we propose a discriminative graphical model based on linear-chain Conditional Random Fields (CRFs). We present experiments on the CoNLL-2002 shared task and the AnCora corpus, evaluated by recall, precision and F-score. Our contributions are the following: first, we tested our baseline on the CoNLL-2002 shared task, obtaining an F1-measure of 80%, and on the AnCora corpus, obtaining an F1-measure of 59%. Finally, the Vigtech application allows us to identify information and patterns on the topic of cancer; we discuss the results in terms of model performance and the information useful for supporting the forecasting process. |
topic |
Named Entity Recognition; natural language processing; artificial intelligence |
url |
http://fundacioniai.org//raccis/v5n2/n9a6.pdf |
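The description above reports recall, precision and F-score for the recognized entities. As a rough illustration of how such figures are commonly obtained for NER, the sketch below scores predictions at the entity level: a predicted entity counts as correct only if its boundaries and type both match the gold annotation exactly. This is an assumption about the scoring scheme, not the authors' evaluation script, which this record does not include. The BIO tag format, the function names `extract_spans` and `precision_recall_f1`, and the sample sentence are all hypothetical.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from one BIO-tagged sentence (assumed tag scheme)."""
    spans: Set[Span] = set()
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
            start, label = None, None
        if tag.startswith("B-") or tag.startswith("I-"):
            # A stray I- tag with no matching B- is treated as a new span.
            start, label = i, tag[2:]
    return spans


def precision_recall_f1(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Exact-match, entity-level precision, recall and F1 over a corpus."""
    gold_spans, pred_spans = set(), set()
    for sent_id, (g, p) in enumerate(zip(gold, pred)):
        # Prefix each span with its sentence id so spans stay distinct.
        gold_spans |= {(sent_id, *s) for s in extract_spans(g)}
        pred_spans |= {(sent_id, *s) for s in extract_spans(p)}
    correct = len(gold_spans & pred_spans)
    precision = correct / len(pred_spans) if pred_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical sentence: "Víctor Bucheli trabaja en Cali"
    gold = [["B-PER", "I-PER", "O", "O", "B-LOC"]]
    pred = [["B-PER", "I-PER", "O", "O", "O"]]  # the location is missed
    p, r, f = precision_recall_f1(gold, pred)
    print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Scoring whole entities rather than individual tokens penalizes partially recognized names, which is one reason F1 can differ sharply between corpora with different annotation conventions.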