A Novel Hierarchical Topic Model for Horizontal Topic Expansion With Observed Label Information

Hierarchical topic models, such as hierarchical Latent Dirichlet Allocation (hLDA) and its variations, can organize topics into a hierarchy automatically. Meanwhile, many documents are associated with hierarchical label information. Incorporating this information into the topic mode...

Bibliographic Details
Main Authors: Xi Zou, Yuelong Zhu, Jun Feng, Jiamin Lu, Xiaodong Li
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
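Subjects: Topic modeling; hierarchical topic model; hierarchical latent Dirichlet allocation; label information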
Online Access: https://ieeexplore.ieee.org/document/8936331/
id doaj-4b3e5c1c7c4b473cb64fdc3941f96698
record_format Article
spelling doaj-4b3e5c1c7c4b473cb64fdc3941f96698; 2021-03-29T23:14:12Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7; pp. 184242-184253; DOI 10.1109/ACCESS.2019.2960468; article no. 8936331
Xi Zou https://orcid.org/0000-0002-8793-5954
Yuelong Zhu https://orcid.org/0000-0001-7194-260X
Jun Feng https://orcid.org/0000-0002-2627-5403
Jiamin Lu https://orcid.org/0000-0002-0643-0736
Xiaodong Li https://orcid.org/0000-0001-6690-836X
Affiliation (all authors): School of Computer and Information, Hohai University, Nanjing, China
collection DOAJ
language English
format Article
sources DOAJ
author Xi Zou
Yuelong Zhu
Jun Feng
Jiamin Lu
Xiaodong Li
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Hierarchical topic models, such as hierarchical Latent Dirichlet Allocation (hLDA) and its variations, can organize topics into a hierarchy automatically. Meanwhile, many documents are associated with hierarchical label information. Incorporating this information into the topic modeling process can help users obtain a more reasonable hierarchical structure. However, after analyzing various real-world datasets, we find that these hierarchical labels are ambiguous and conflicting at some levels, which introduces errors and restrictions into the exploration of latent topics and the hierarchical structure. We call this the horizontal topic expansion problem. To address it, we propose a novel hierarchical topic model named the horizontal and vertical hierarchical topic model (HV-HTM), which aims to incorporate the observed hierarchical label information into the topic generation process while keeping the flexibility of horizontal and vertical expansion of the hierarchical structure during modeling. We conduct experiments on the BBC news and Yahoo! Answers datasets and evaluate the effectiveness of HV-HTM with three evaluation metrics. The experimental results show that HV-HTM significantly improves topic modeling compared to state-of-the-art models and also obtains a more interpretable hierarchical structure.
topic Topic modeling
hierarchical topic model
hierarchical latent Dirichlet allocation
label information
url https://ieeexplore.ieee.org/document/8936331/