Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text
The automatic extraction of geospatial information is an important aspect of data mining. Computer systems capable of discovering geographic information from natural language involve a complex process called geoparsing, which includes two important tasks: geographic entity recognition and toponym re...
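The two tasks named above, recognition followed by resolution, can be illustrated with a minimal Python sketch. It is not the authors' method: the record describes a neural recognizer and a grammar-based disambiguation, while this sketch substitutes a dictionary lookup and a simple nearest-to-context rule. The gazetteer, its approximate coordinates, and the function names `recognize`, `resolve`, and `haversine_km` are all hypothetical.

```python
import math
import re

# Hypothetical toy gazetteer (an assumption, not from the article):
# place name -> candidate (label, latitude, longitude), coordinates approximate.
GAZETTEER = {
    "Victoria": [
        ("Ciudad Victoria, Tamaulipas, Mexico", 23.74, -99.14),
        ("Victoria, British Columbia, Canada", 48.43, -123.37),
    ],
    "Monterrey": [("Monterrey, Nuevo León, Mexico", 25.69, -100.32)],
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def recognize(text):
    """Stand-in for entity recognition: a dictionary lookup, not a trained model."""
    return [name for name in GAZETTEER
            if re.search(r"\b" + re.escape(name) + r"\b", text)]


def resolve(names):
    """Stand-in for toponym resolution: prefer the candidate closest to the
    unambiguous places mentioned in the same text (a simplifying assumption)."""
    anchors = [GAZETTEER[n][0] for n in names if len(GAZETTEER[n]) == 1]
    resolved = {}
    for name in names:
        candidates = GAZETTEER[name]
        if len(candidates) == 1 or not anchors:
            # Nothing to disambiguate, or no context to use: take the first entry.
            resolved[name] = candidates[0]
        else:
            resolved[name] = min(
                candidates,
                key=lambda c: sum(haversine_km(c[1:], a[1:]) for a in anchors),
            )
    return resolved


if __name__ == "__main__":
    places = recognize("The bus from Monterrey arrived in Victoria.")
    for name, (label, lat, lon) in resolve(places).items():
        print(name, "->", label, (lat, lon))
```

In this toy example the ambiguous name "Victoria" is assigned to the candidate nearest to the unambiguous "Monterrey". A real system would replace both stand-ins with a trained recognizer and a full gazetteer.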
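Main Authors: | Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-09-01 |
Series: | Remote Sensing |
Subjects: | geoparsing; toponym resolution; geographic named entity recognition; named entity recognition in Spanish |
Online Access: | https://www.mdpi.com/2072-4292/12/18/3041 |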
id |
doaj-2a94e8c05d16492f856aa3ed81fb4916 |
---|---|
record_format |
Article |
spelling |
doaj-2a94e8c05d16492f856aa3ed81fb4916; 2020-11-25T03:07:35Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-09-01; 12; 3041; 3041; 10.3390/rs12183041; Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text; Edwin Aldana-Bobadilla (Conacyt-Centro de Investigación y de Estudios Avanzados del I.P.N. (Cinvestav), Victoria 87130, Mexico); Alejandro Molina-Villegas (Conacyt-Centro de Investigación en Ciencias de Información Geoespacial (Centrogeo), Mérida 97302, Mexico); Ivan Lopez-Arevalo (Centro de Investigación y de Estudios Avanzados del I.P.N. Unidad Tamaulipas (Cinvestav Tamaulipas), Victoria 87130, Mexico); Shanel Reyes-Palacios (Centro de Investigación y de Estudios Avanzados del I.P.N. Unidad Tamaulipas (Cinvestav Tamaulipas), Victoria 87130, Mexico); Victor Muñiz-Sanchez (Centro de Investigación en Matemáticas (Cimat), Monterrey 66628, Mexico); Jean Arreola-Trapala (Centro de Investigación en Matemáticas (Cimat), Monterrey 66628, Mexico); https://www.mdpi.com/2072-4292/12/18/3041; geoparsing; toponym resolution; geographic named entity recognition; named entity recognition in Spanish |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala |
spellingShingle |
Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala; Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text; Remote Sensing; geoparsing; toponym resolution; geographic named entity recognition; named entity recognition in Spanish |
author_facet |
Edwin Aldana-Bobadilla, Alejandro Molina-Villegas, Ivan Lopez-Arevalo, Shanel Reyes-Palacios, Victor Muñiz-Sanchez, Jean Arreola-Trapala |
author_sort |
Edwin Aldana-Bobadilla |
title |
Adaptive Geoparsing Method for Toponym Recognition and Resolution in Unstructured Text |
title_sort |
adaptive geoparsing method for toponym recognition and resolution in unstructured text |
publisher |
MDPI AG |
series |
Remote Sensing |
issn |
2072-4292 |
publishDate |
2020-09-01 |
description |
The automatic extraction of geospatial information is an important aspect of data mining. Computer systems capable of discovering geographic information from natural language involve a complex process called geoparsing, which includes two important tasks: geographic entity recognition and toponym resolution. The first task can be addressed with machine learning, in which case a model is trained to recognize sequences of characters (words) corresponding to geographic entities. The second task consists of assigning such entities to their most likely coordinates. Frequently, the latter process involves resolving referential ambiguities. In this paper, we propose an extensible geoparsing approach that combines geographic entity recognition based on a neural network model with disambiguation based on what we call dynamic context disambiguation. Once place names are recognized in an input text, they are resolved using a grammar whose rules specify how ambiguities can be resolved, much as a person would resolve them by considering the context. The result is an assignment of the most likely geographic properties to the recognized places. We propose an assessment measure based on a ranking of closeness between the predicted and actual locations of a place name. On this measure, our method outperforms OpenStreetMap Nominatim. We also include measures that assess the recognition of place names and the prediction of what we call geographic levels (the administrative jurisdiction of places). |
topic |
geoparsing; toponym resolution; geographic named entity recognition; named entity recognition in Spanish |
url |
https://www.mdpi.com/2072-4292/12/18/3041 |
work_keys_str_mv |
AT edwinaldanabobadilla adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext AT alejandromolinavillegas adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext AT ivanlopezarevalo adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext AT shanelreyespalacios adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext AT victormunizsanchez adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext AT jeanarreolatrapala adaptivegeoparsingmethodfortoponymrecognitionandresolutioninunstructuredtext |
_version_ |
1724669710615707648 |