Piecing together the puzzle: Improving event content coverage for real-time sub-event detection using adaptive microblog crawling.

Bibliographic Details
Main Authors: Laurissa Tokarchuk, Xinyue Wang, Stefan Poslad
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5673163?pdf=render
id doaj-f9078b8911c246f6807666266da1af52
record_format Article
spelling doaj-f9078b8911c246f6807666266da1af52 2020-11-24T21:24:27Z eng Public Library of Science (PLoS) PLoS ONE 1932-6203 2017-01-01 12 11 e0187401 10.1371/journal.pone.0187401 http://europepmc.org/articles/PMC5673163?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Laurissa Tokarchuk
Xinyue Wang
Stefan Poslad
title Piecing together the puzzle: Improving event content coverage for real-time sub-event detection using adaptive microblog crawling.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2017-01-01
description In an age when people are predisposed to report real-world events through their social media accounts, many researchers value the benefits of mining user-generated content from social media. Compared with traditional news media, social media services such as Twitter can provide more complete and timely information about real-world events. However, events are often like a puzzle: to solve the puzzle and understand the event, we must identify all the sub-events, or pieces. Existing Twitter event monitoring systems for sub-event detection and summarization typically analyse events based on partial data, as conventional data collection methodologies are unable to collect comprehensive event data. As a result, existing systems are often unable to report sub-events in real time and often miss sub-events, or pieces of the broader event puzzle, completely. This paper proposes a Sub-event detection by real-TIme Microblog monitoring (STRIM) framework that leverages the temporal feature of an expanded set of news-worthy event content. To identify sub-events more comprehensively and accurately, the framework first proposes the use of adaptive microblog crawling. Our adaptive microblog crawler is capable of increasing the coverage of events while minimizing the amount of non-relevant content. We then propose a stream division methodology that can be accomplished in real time so that the temporal features of the expanded event streams can be analysed by a burst detection algorithm. In the final steps of the framework, the content features are extracted from each divided stream and recombined to provide a final summarization of the sub-events. The proposed framework is evaluated against traditional event detection using event recall and event precision metrics. Results show that improving the quality and coverage of event content contributes to better event detection by identifying additional valid sub-events. The novel combination of our proposed adaptive crawler and our stream division/recombination technique provides significant gains in event recall (44.44%) and event precision (9.57%). The addition of these sub-events, or pieces, allows us to get closer to solving the event puzzle.
url http://europepmc.org/articles/PMC5673163?pdf=render