Detecting Hotspot Information Using Multi-Attribute Based Topic Model.

Microblogging, a kind of social network, has become increasingly important in daily life. Enormous amounts of information are produced and shared every day. Detecting hot topics in this mass of information helps people reach the essential information more quickly. However, due...

Bibliographic Details
Main Authors: Jing Wang, Li Li, Feng Tan, Ying Zhu, Weisi Feng
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4619720?pdf=render
id doaj-405ee9a19ce74543b51748a72e9876c9
record_format Article
spelling doaj-405ee9a19ce74543b51748a72e9876c9; 2020-11-25T01:52:38Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2015-01-01; 10; 10; e0140539; 10.1371/journal.pone.0140539; Detecting Hotspot Information Using Multi-Attribute Based Topic Model.; Jing Wang; Li Li; Feng Tan; Ying Zhu; Weisi Feng; http://europepmc.org/articles/PMC4619720?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Jing Wang
Li Li
Feng Tan
Ying Zhu
Weisi Feng
title Detecting Hotspot Information Using Multi-Attribute Based Topic Model.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description Microblogging, a kind of social network, has become increasingly important in daily life. Enormous amounts of information are produced and shared every day. Detecting hot topics in this mass of information helps people reach the essential information more quickly. However, because microblog posts are short and sparse, include a large number of meaningless tweets, and have other distinctive characteristics, traditional topic detection methods are often ineffective at detecting hot topics. In this paper, we propose a new topic model named multi-attribute latent Dirichlet allocation (MA-LDA), in which the time and hashtag attributes of microblogs are incorporated into the LDA model. By introducing the time attribute, the MA-LDA model can decide whether or not a word should appear in hot topics. Meanwhile, compared with the traditional LDA model, applying the hashtag attribute in the MA-LDA model gives the core words an artificially high ranking in the results, which can improve the expressiveness of the outcomes. Empirical evaluations on real data sets demonstrate that our method detects hot topics more accurately and efficiently than several baselines. Our results provide strong evidence of the importance of the temporal factor in extracting hot topics.
url http://europepmc.org/articles/PMC4619720?pdf=render