Summary: | Late research has established the critical environmental, health and social impacts of traffic in highly populated urban regions. Apart from traffic monitoring, textual analysis of geo-located social media responses can provide an intelligent means in detecting and classifying traffic related events. This paper deals with the content analysis of Twitter textual data using an ensemble of supervised and unsupervised Machine Learning methods in order to cluster and properly classify traffic related events. Voluminous textual data was gathered using innovative Twitter APIs and managed by Big Data cloud methodologies via an Apache Spark system. Events were detected using a traffic related typology and the clustering K-Means model, where related event classification was achieved applying Support Vector Machines (SVM), Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM) networks. We provide experimental results for 2-class and 3-class classification examples indicating that the ensemble performs with accuracy and F-score reaching 98.5%.
|