Feature Selection Based on Term Frequency Reordering of Document Level

Bibliographic Details
Main Authors: Hongfang Zhou, Yingjie Zhang, Hongjiang Liu, Yao Zhang
Format: Article
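Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Attribute selection; document frequency; feature selection; filtering method; term frequency weighting; text classification
Online Access: https://ieeexplore.ieee.org/document/8454728/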
id doaj-94c1b1c2c45142d2a5b5d392cd1643d4
record_format Article
spelling doaj-94c1b1c2c45142d2a5b5d392cd1643d4
2021-03-29T20:58:27Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
6
51655
51668
10.1109/ACCESS.2018.2868844
8454728
Feature Selection Based on Term Frequency Reordering of Document Level
Hongfang Zhou (0) https://orcid.org/0000-0002-5145-3830
Yingjie Zhang (1) https://orcid.org/0000-0003-3009-0054
Hongjiang Liu (2)
Yao Zhang (3)
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China
In this paper, we propose a new feature selection algorithm based on term frequency reordering at the document level. The proposed algorithm uses document frequency to weigh the unbalanced factors of the data sets and considers the effect of term frequency on the feature importance ordering. In the experiments, the proposed algorithm is compared with Normalized Difference Measure, Chi-squared, Odds Ratio, Gini Index, and Balanced Accuracy on the WAP, K1a, K1b, RE0, RE1, 20 Newsgroups, Reuters-21578, and RCV1-v2 data sets. The experimental results show that the proposed algorithm is superior to the other five algorithms.
https://ieeexplore.ieee.org/document/8454728/
Attribute selection
document frequency
feature selection
filtering method
term frequency weighting
text classification
collection DOAJ
language English
format Article
sources DOAJ
author Hongfang Zhou
Yingjie Zhang
Hongjiang Liu
Yao Zhang
spellingShingle Hongfang Zhou
Yingjie Zhang
Hongjiang Liu
Yao Zhang
Feature Selection Based on Term Frequency Reordering of Document Level
IEEE Access
Attribute selection
document frequency
feature selection
filtering method
term frequency weighting
text classification
author_facet Hongfang Zhou
Yingjie Zhang
Hongjiang Liu
Yao Zhang
author_sort Hongfang Zhou
title Feature Selection Based on Term Frequency Reordering of Document Level
title_short Feature Selection Based on Term Frequency Reordering of Document Level
title_full Feature Selection Based on Term Frequency Reordering of Document Level
title_fullStr Feature Selection Based on Term Frequency Reordering of Document Level
title_full_unstemmed Feature Selection Based on Term Frequency Reordering of Document Level
title_sort feature selection based on term frequency reordering of document level
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2018-01-01
description In this paper, we propose a new feature selection algorithm based on term frequency reordering at the document level. The proposed algorithm uses document frequency to weigh the unbalanced factors of the data sets and considers the effect of term frequency on the feature importance ordering. In the experiments, the proposed algorithm is compared with Normalized Difference Measure, Chi-squared, Odds Ratio, Gini Index, and Balanced Accuracy on the WAP, K1a, K1b, RE0, RE1, 20 Newsgroups, Reuters-21578, and RCV1-v2 data sets. The experimental results show that the proposed algorithm is superior to the other five algorithms.
topic Attribute selection
document frequency
feature selection
filtering method
term frequency weighting
text classification
url https://ieeexplore.ieee.org/document/8454728/
work_keys_str_mv AT hongfangzhou featureselectionbasedontermfrequencyreorderingofdocumentlevel
AT yingjiezhang featureselectionbasedontermfrequencyreorderingofdocumentlevel
AT hongjiangliu featureselectionbasedontermfrequencyreorderingofdocumentlevel
AT yaozhang featureselectionbasedontermfrequencyreorderingofdocumentlevel
_version_ 1724193752626495488