Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension.

The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction a...

Bibliographic Details
Main Authors: Yuanchao Liu, Ming Liu, Xin Wang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2015-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4367988?pdf=render
id doaj-ca490863734549da98b0afe5611cec96
record_format Article
spelling doaj-ca490863734549da98b0afe5611cec96 | 2020-11-25T02:11:56Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2015-01-01 | 10 | 3 | e0117390 | 10.1371/journal.pone.0117390 | Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension. | Yuanchao Liu | Ming Liu | Xin Wang | The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach. | http://europepmc.org/articles/PMC4367988?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Yuanchao Liu
Ming Liu
Xin Wang
title Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.
url http://europepmc.org/articles/PMC4367988?pdf=render