Using Twitter Data to Monitor Natural Disaster Social Dynamics: A Recurrent Neural Network Approach with Word Embeddings and Kernel Density Estimation

In recent years, Online Social Networks (OSNs) have received a great deal of attention for their potential use in the spatial and temporal modeling of events owing to the information that can be extracted from these platforms. Within this context, one of the most promising applications is the monitorin...

Bibliographic Details
Main Authors: Aldo Hernandez-Suarez, Gabriel Sanchez-Perez, Karina Toscano-Medina, Hector Perez-Meana, Jose Portillo-Portillo, Victor Sanchez, Luis Javier García Villalba
Format: Article
Language: English
Published: MDPI AG 2019-04-01
Series: Sensors
Subjects: twitter, data mining, word2vec, CRF, LSTM, geocoding, geoparsing
Online Access: https://www.mdpi.com/1424-8220/19/7/1746
id doaj-81f46834e1db4c0db286e35b6356df8f
record_format Article
spelling doaj-81f46834e1db4c0db286e35b6356df8f 2020-11-24T21:44:24Z
doi 10.3390/s19071746
citation Sensors (MDPI AG), ISSN 1424-8220, 2019-04-01, vol. 19, no. 7, article 1746
affiliations 0 Aldo Hernandez-Suarez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
1 Gabriel Sanchez-Perez: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
2 Karina Toscano-Medina: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
3 Hector Perez-Meana: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
4 Jose Portillo-Portillo: Instituto Politecnico Nacional, ESIME Culhuacan, Mexico City 04440, Mexico
5 Victor Sanchez: Department of Computer Science, University of Warwick, Coventry CV4 7AL, UK
6 Luis Javier García Villalba: Group of Analysis, Security and Systems (GASS), Department of Software Engineering and Artificial Intelligence (DISIA), Faculty of Computer Science and Engineering, Office 431, Universidad Complutense de Madrid (UCM), Calle Profesor José García Santesmases, 9, Ciudad Universitaria, 28040 Madrid, Spain
collection DOAJ
language English
format Article
sources DOAJ
author Aldo Hernandez-Suarez
Gabriel Sanchez-Perez
Karina Toscano-Medina
Hector Perez-Meana
Jose Portillo-Portillo
Victor Sanchez
Luis Javier García Villalba
author_sort Aldo Hernandez-Suarez
title Using Twitter Data to Monitor Natural Disaster Social Dynamics: A Recurrent Neural Network Approach with Word Embeddings and Kernel Density Estimation
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2019-04-01
description In recent years, Online Social Networks (OSNs) have received a great deal of attention for their potential use in the spatial and temporal modeling of events, owing to the information that can be extracted from these platforms. Within this context, one of the most promising applications is the monitoring of natural disasters. Vital information posted by OSN users can contribute to relief efforts during and after a catastrophe. Although it is possible to retrieve data from OSNs using embedded geographic information provided by GPS systems, this feature is disabled by default in most cases. An alternative solution is to geoparse specific locations using language models based on Named Entity Recognition (NER) techniques. In this work, a sensor that uses Twitter is proposed to monitor natural disasters. The approach is intended to sense data by detecting toponyms (named places written within the text) in tweets with event-related information, e.g., a collapsed building on a specific avenue or the location at which a person was last seen. The proposed approach transforms tokenized tweets into word embeddings: a rich linguistic and contextual vector representation of textual corpora. Pre-labeled word embeddings are employed to train a Recurrent Neural Network variant, known as a Bidirectional Long Short-Term Memory (biLSTM) network, that is capable of dealing with sequential data by analyzing information in both directions of a word (past and future entries). Moreover, a Conditional Random Field (CRF) output layer, which models the transitions from one NER tag to another, is used to increase the classification accuracy. The resulting labeled words are joined to coherently form a toponym, which is geocoded and scored by a Kernel Density Estimation function. At the end of the process, the scored data are presented graphically to depict areas in which the majority of tweets reporting topics related to a natural disaster are concentrated. A case study on Mexico’s 2017 Earthquake is presented, and the data extracted during and after the event are reported.
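
The last two steps of the abstract (joining labeled words into a toponym, then scoring geocoded locations by kernel density) can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO labels `B-LOC`/`I-LOC`/`O`, the function names `join_toponyms` and `kde_scores`, the Gaussian kernel, the bandwidth, and the sample tokens and coordinates are all assumptions made for illustration, and the geocoding step between the two functions is omitted.

```python
import math


def join_toponyms(tokens, tags):
    """Merge BIO-tagged tokens into toponym strings.

    Assumes hypothetical labels "B-LOC", "I-LOC" and "O"; the tag set
    used in the paper may differ.
    """
    toponyms, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B-LOC":
            # A new toponym starts; flush the one in progress, if any.
            if current:
                toponyms.append(" ".join(current))
            current = [token]
        elif tag == "I-LOC" and current:
            current.append(token)
        else:
            # Any other tag (or a stray I-LOC) closes the current toponym.
            if current:
                toponyms.append(" ".join(current))
            current = []
    if current:
        toponyms.append(" ".join(current))
    return toponyms


def kde_scores(points, bandwidth=0.01):
    """Score each (lat, lon) point with a 2-D Gaussian kernel density estimate.

    Distances are taken directly in degrees, which ignores Earth curvature;
    this is an illustrative simplification for city-scale data only.
    """
    n = len(points)
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)
    scores = []
    for lat_i, lon_i in points:
        total = 0.0
        for lat_j, lon_j in points:
            d2 = (lat_i - lat_j) ** 2 + (lon_i - lon_j) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        scores.append(norm * total)
    return scores


# Invented example tweet and tags, for illustration only.
tokens = ["Edificio", "colapsado", "en", "Avenida", "Álvaro", "Obregón"]
tags = ["O", "O", "O", "B-LOC", "I-LOC", "I-LOC"]
print(join_toponyms(tokens, tags))  # ['Avenida Álvaro Obregón']

# Invented coordinates standing in for geocoded toponyms.
points = [(19.400, -99.170), (19.401, -99.171), (19.500, -99.000)]
print(kde_scores(points))
```

Points that lie close to many other geocoded reports receive higher scores, which is what lets a density map highlight the areas where disaster-related tweets concentrate.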
topic twitter
data mining
word2vec
CRF
LSTM
geocoding
geoparsing
url https://www.mdpi.com/1424-8220/19/7/1746