Lifelong Learning Augmented Short Text Stream Clustering Method

Depending on the scanning mode, existing short text stream clustering methods can be divided into the following two kinds of methods: one-pass-based and batch-based. The one-pass-based method handles each text only one time, but cannot deal with the sparseness problem very well. The batch-based method obtains better results by allowing multiple iterations of each batch, but the efficiency is relatively low. To overcome these problems, this paper presents Lifelong learning Augmented Short Text stream clustering method (LAST), which incorporates the episodic memory module and sparse experience replay module of lifelong learning into the clustering process. Specifically, LAST processes each text one time, but at a certain interval it randomly samples some previously seen texts of the episodic memory to update cluster features by performing sparse experience replay. Empirical studies on two public datasets demonstrate that the performance of the LAST-based method is on a par with the batch-based method, and runs close to the speed of the one-pass-based method.
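The one-pass-plus-replay loop described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only: the class name, the overlap similarity measure, and all parameter values are assumptions for the sketch, not the paper's actual model.

```python
import random
from collections import defaultdict

class LastSketch:
    """Illustrative sketch of LAST-style control flow: one-pass cluster
    assignment plus periodic sparse experience replay from an episodic
    memory. Similarity measure and parameters are placeholder choices."""

    def __init__(self, sim_threshold=0.5, replay_interval=100, replay_size=10):
        self.clusters = []                 # each cluster: word -> weight map
        self.memory = []                   # episodic memory of seen texts
        self.sim_threshold = sim_threshold
        self.replay_interval = replay_interval
        self.replay_size = replay_size
        self.seen = 0

    def _similarity(self, text_words, cluster):
        # Simple word-overlap ratio (illustrative, not the paper's model).
        overlap = sum(cluster.get(w, 0.0) for w in text_words)
        total = sum(cluster.values()) or 1.0
        return overlap / total

    def _assign(self, text_words):
        # One-pass step: add the text to the best cluster, or start a new one.
        best, best_sim = None, 0.0
        for c in self.clusters:
            s = self._similarity(text_words, c)
            if s > best_sim:
                best, best_sim = c, s
        if best is None or best_sim < self.sim_threshold:
            best = defaultdict(float)
            self.clusters.append(best)
        for w in text_words:
            best[w] += 1.0

    def process(self, text):
        words = text.lower().split()
        self._assign(words)                # each text is handled once...
        self.memory.append(words)
        self.seen += 1
        # ...but at a fixed interval, replay a small random sample of
        # previously seen texts to refresh the cluster features.
        if self.seen % self.replay_interval == 0:
            sample = random.sample(self.memory,
                                   min(self.replay_size, len(self.memory)))
            for old_words in sample:
                self._assign(old_words)
```

The key design point the abstract highlights is that replay is *sparse*: only a small sample is revisited at intervals, so the cost stays close to one-pass processing while the replayed texts help refine clusters against sparseness.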


Bibliographic Details
Main Authors: Jipeng Qiang, Wanyin Xu, Yun Li, Yunhao Yuan, Yi Zhu
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Short text stream; text clustering; sparseness; lifelong learning
Online Access: https://ieeexplore.ieee.org/document/9424568/
id doaj-4d8b9618029e4bceb838bf7592e2e052
record_format Article
spelling doaj-4d8b9618029e4bceb838bf7592e2e052 (2021-05-27T23:01:38Z)
language: eng
publisher: IEEE
series: IEEE Access (ISSN 2169-3536)
published: 2021-01-01, vol. 9, pp. 70493-70501
doi: 10.1109/ACCESS.2021.3078096 (IEEE article 9424568)
title: Lifelong Learning Augmented Short Text Stream Clustering Method
authors: Jipeng Qiang (https://orcid.org/0000-0001-5721-0293), Wanyin Xu, Yun Li, Yunhao Yuan, Yi Zhu, all of the Department of Computer Science, Yangzhou University, Yangzhou, China
url: https://ieeexplore.ieee.org/document/9424568/
topics: Short text stream; text clustering; sparseness; lifelong learning
collection DOAJ
language English
format Article
sources DOAJ
author Jipeng Qiang
Wanyin Xu
Yun Li
Yunhao Yuan
Yi Zhu
title Lifelong Learning Augmented Short Text Stream Clustering Method
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Depending on the scanning mode, existing short text stream clustering methods can be divided into the following two kinds of methods: one-pass-based and batch-based. The one-pass-based method handles each text only one time, but cannot deal with the sparseness problem very well. The batch-based method obtains better results by allowing multiple iterations of each batch, but the efficiency is relatively low. To overcome these problems, this paper presents Lifelong learning Augmented Short Text stream clustering method (LAST), which incorporates the episodic memory module and sparse experience replay module of lifelong learning into the clustering process. Specifically, LAST processes each text one time, but at a certain interval it randomly samples some previously seen texts of the episodic memory to update cluster features by performing sparse experience replay. Empirical studies on two public datasets demonstrate that the performance of the LAST-based method is on a par with the batch-based method, and runs close to the speed of the one-pass-based method.
topic Short text stream
text clustering
sparseness
lifelong learning
url https://ieeexplore.ieee.org/document/9424568/