Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning

Bibliographic Details
Main Authors: Ebtesam Alomari, Iyad Katib, Aiiad Albeshri, Tan Yigitcanlar, Rashid Mehmood
Format: Article
Language: English
Published: MDPI AG 2021-04-01
Series: Sensors
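Subjects: smart cities, big data, event detection, road traffic, distributed machine learning, automatic labeling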
Online Access: https://www.mdpi.com/1424-8220/21/9/2993
id doaj-a7a2f4a6e81f4e2ba32bbf0420577300
record_format Article
spelling doaj-a7a2f4a6e81f4e2ba32bbf0420577300
2021-04-24T23:01:40Z
eng
MDPI AG
Sensors
1424-8220
2021-04-01
21
9
2993
2993
10.3390/s21092993
Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning
Ebtesam Alomari (0), Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Iyad Katib (1), Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Aiiad Albeshri (2), Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah 21589, Saudi Arabia
Tan Yigitcanlar (3), School of Architecture and Built Environment, Queensland University of Technology, 2 George Street, Brisbane, QLD 4000, Australia
Rashid Mehmood (4), High Performance Computing Center, King Abdulaziz University, Jeddah 21589, Saudi Arabia
collection DOAJ
language English
format Article
sources DOAJ
author Ebtesam Alomari
Iyad Katib
Aiiad Albeshri
Tan Yigitcanlar
Rashid Mehmood
author_sort Ebtesam Alomari
title Iktishaf+: A Big Data Tool with Automatic Labeling for Road Traffic Social Sensing and Event Detection Using Distributed Machine Learning
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2021-04-01
description Digital societies could be characterized by their increasing desire to express themselves and interact with others. This is being realized through digital platforms such as social media, which have increasingly become convenient and inexpensive sensors compared to physical sensors in many sectors of smart societies. One such major sector is road transportation, which is the backbone of modern economies and globally costs 1.25 million deaths and 50 million human injuries annually. The state of the art in big data-enabled social media analytics for transportation-related studies is limited. This paper brings a range of technologies together to detect road traffic-related events using big data and distributed machine learning. The most specific contribution of this research is an automatic labeling method for machine learning-based traffic-related event detection from Twitter data in the Arabic language. The proposed method has been implemented in a software tool called Iktishaf+ (an Arabic word meaning discovery) that is able to detect traffic events automatically from tweets in the Arabic language using distributed machine learning over Apache Spark. The tool is built using nine components and a range of technologies including Apache Spark, Parquet, and MongoDB. Iktishaf+ uses a light stemmer for the Arabic language developed by us. We also use a location extractor developed by us that allows us to extract and visualize spatio-temporal information about the detected events. The data used in this work comprise 33.5 million tweets collected from Saudi Arabia using the Twitter API. Using support vector machines, naïve Bayes, and logistic regression-based classifiers, we are able to detect and validate several real events in Saudi Arabia without prior knowledge, including a fire in Jeddah, rains in Makkah, and an accident in Riyadh. The findings show the effectiveness of Twitter media in detecting important events with no prior knowledge about them.
topic smart cities
big data
event detection
road traffic
distributed machine learning
automatic labeling
url https://www.mdpi.com/1424-8220/21/9/2993
_version_ 1721510902158589952