A word embedding topic model for topic detection and summary in social networks

The aim of topic detection is to automatically identify the events and hot topics in social networks and continuously track known topics. Applying the traditional methods such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis is difficult given the high dimensionality of mass...

Bibliographic Details
Main Authors: Lei Shi, Gang Cheng, Shang-ru Xie, Gang Xie
Format: Article
Language: English
Published: SAGE Publishing 2019-11-01
Series: Measurement + Control
Online Access: https://doi.org/10.1177/0020294019865750
id doaj-3566c940ef5642ada0b127832b87d153
record_format Article
spelling doaj-3566c940ef5642ada0b127832b87d153
2020-11-25T03:49:38Z
eng
SAGE Publishing
Measurement + Control
0020-2940
2019-11-01
52
10.1177/0020294019865750
A word embedding topic model for topic detection and summary in social networks
Lei Shi (School of Computer Science, Beijing University of Posts and Telecommunications, Beijing, China)
Gang Cheng (School of Earth Sciences and Engineering, Nanjing University, Nanjing, China)
Shang-ru Xie (School of Computer Science, North China Institute of Science and Technology, Beijing, China)
Gang Xie (School of Big Data and Computer Science, Guizhou Normal University, Guiyang, China)
https://doi.org/10.1177/0020294019865750
collection DOAJ
language English
format Article
sources DOAJ
author Lei Shi
Gang Cheng
Shang-ru Xie
Gang Xie
author_sort Lei Shi
title A word embedding topic model for topic detection and summary in social networks
publisher SAGE Publishing
series Measurement + Control
issn 0020-2940
publishDate 2019-11-01
description The aim of topic detection is to automatically identify events and hot topics in social networks and to continuously track known topics. Traditional methods such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis are difficult to apply given the high dimensionality of massive event texts and the short-text sparsity of social networks. The sparse distribution of topics also leaves topics unclear. To address these challenges, we propose a novel word embedding topic model, named Cbow Topic Model (CTM), that combines a topic model with the continuous bag-of-words (Cbow) word embedding method for topic detection and summary in social networks. We cluster similar words in the target social network text dataset using the classic Cbow word vectorization method, which can effectively learn the internal relationships between words and reduce the dimensionality of the input texts. We employ the topic model to model short texts, which effectively weakens the sparsity problem of social network texts. To detect and summarize topics, we propose a topic detection method that leverages similarity computing for social networks. We collected a Sina microblog dataset to conduct various experiments. The experimental results demonstrate that the CTM method is superior to the existing topic model method.
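The dimensionality-reduction step mentioned in the description can be illustrated with a minimal sketch. It is not the authors' CTM implementation: the two-dimensional vectors are hand-written stand-ins for learned Cbow embeddings, and the greedy threshold clustering, the names `cluster_words` and `to_cluster_bag`, and the 0.9 threshold are all assumptions made for illustration. It only shows how grouping similar words by cosine similarity shrinks the feature space that a topic model would then receive.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cluster_words(vectors, threshold=0.9):
    """Greedy clustering (an assumed stand-in for the paper's word clustering):
    a word joins the first cluster whose seed vector is similar enough,
    otherwise it starts a new cluster."""
    seeds = []        # one (cluster_id, seed_vector) pair per cluster
    assignment = {}   # word -> cluster_id
    for word, vec in vectors.items():
        for cid, seed in seeds:
            if cosine(vec, seed) >= threshold:
                assignment[word] = cid
                break
        else:
            cid = len(seeds)
            seeds.append((cid, vec))
            assignment[word] = cid
    return assignment


def to_cluster_bag(tokens, assignment):
    """Represent a short text as counts over word clusters instead of words."""
    return Counter(assignment[t] for t in tokens if t in assignment)


# Hypothetical toy vectors; real ones would come from a trained Cbow model.
toy_vectors = {
    "goal": (0.9, 0.1),
    "match": (0.8, 0.2),
    "vote": (0.1, 0.9),
    "election": (0.2, 0.8),
}

assignment = cluster_words(toy_vectors)
bag = to_cluster_bag(["goal", "match", "match", "vote"], assignment)
print(assignment)
print(bag)
```

In this toy example four distinct words collapse into two cluster features, so a short post is described by fewer, denser counts, which is the effect the description attributes to the word clustering step.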
url https://doi.org/10.1177/0020294019865750
_version_ 1724494248031551488