Feature Selection Based on Term Frequency Reordering of Document Level
In this paper, we propose a new feature selection algorithm based on term frequency reordering of document level. The proposed algorithm uses the document frequency to weigh the unbalanced factors of the data sets and considers the effect of the term frequency on the feature importance orderi...
Main Authors: | Hongfang Zhou, Yingjie Zhang, Hongjiang Liu, Yao Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2018-01-01 |
Series: | IEEE Access |
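Subjects: | Attribute selection; document frequency; feature selection; filtering method; term frequency weighting; text classification |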
Online Access: | https://ieeexplore.ieee.org/document/8454728/ |
id |
doaj-94c1b1c2c45142d2a5b5d392cd1643d4 |
---|---|
record_format |
Article |
spelling |
doaj-94c1b1c2c45142d2a5b5d392cd1643d4; 2021-03-29T20:58:27Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2018-01-01; vol. 6; pp. 51655-51668; DOI 10.1109/ACCESS.2018.2868844; article no. 8454728; Feature Selection Based on Term Frequency Reordering of Document Level; Hongfang Zhou (https://orcid.org/0000-0002-5145-3830), Yingjie Zhang (https://orcid.org/0000-0003-3009-0054), Hongjiang Liu, Yao Zhang; all authors: School of Computer Science and Engineering, Xi’an University of Technology, Xi’an, China; In this paper, we propose a new feature selection algorithm based on term frequency reordering of document level. The proposed algorithm uses the document frequency to weigh the unbalanced factors of the data sets and considers the effect of the term frequency on the feature importance ordering. In the experiments, the proposed algorithm is compared with Normalized Difference Measure, Chi-squared, Odds Ratio, Gini Index, and Balanced Accuracy on the WAP, K1a, K1b, RE0, RE1, 20 Newsgroups, Reuters-21578, and RCV1-v2 data sets. The experimental results show that the proposed algorithm is superior to the other five algorithms.; https://ieeexplore.ieee.org/document/8454728/; Attribute selection, document frequency, feature selection, filtering method, term frequency weighting, text classification |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hongfang Zhou, Yingjie Zhang, Hongjiang Liu, Yao Zhang |
title |
Feature Selection Based on Term Frequency Reordering of Document Level |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2018-01-01 |
description |
In this paper, we propose a new feature selection algorithm based on term frequency reordering of document level. The proposed algorithm uses the document frequency to weigh the unbalanced factors of the data sets and considers the effect of the term frequency on the feature importance ordering. In the experiments, the proposed algorithm is compared with Normalized Difference Measure, Chi-squared, Odds Ratio, Gini Index, and Balanced Accuracy on the WAP, K1a, K1b, RE0, RE1, 20 Newsgroups, Reuters-21578, and RCV1-v2 data sets. The experimental results show that the proposed algorithm is superior to the other five algorithms. |
topic |
Attribute selection; document frequency; feature selection; filtering method; term frequency weighting; text classification |
url |
https://ieeexplore.ieee.org/document/8454728/ |
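This record does not give the scoring formula of the proposed algorithm, so the sketch below does not attempt to reproduce it. It only illustrates how a filter method driven by document frequency ranks terms, using the standard Chi-squared statistic, one of the baselines named in the description. The function names (`document_frequency_table`, `chi_squared`, `rank_terms`) and the toy corpus are assumptions made for illustration.

```python
from collections import Counter


def document_frequency_table(docs, labels, positive):
    """Per term, count documents containing it inside (tp) and outside (fp) the positive class."""
    tp, fp = Counter(), Counter()
    n_pos = sum(1 for y in labels if y == positive)
    for tokens, y in zip(docs, labels):
        target = tp if y == positive else fp
        for term in set(tokens):  # set(): a document counts once per term
            target[term] += 1
    return tp, fp, n_pos, len(docs) - n_pos


def chi_squared(tp, fp, n_pos, n_neg):
    """Chi-squared statistic of the 2x2 term/class contingency table."""
    fn, tn = n_pos - tp, n_neg - fp
    n = n_pos + n_neg
    denom = (tp + fp) * (fn + tn) * n_pos * n_neg
    return 0.0 if denom == 0 else n * (tp * tn - fp * fn) ** 2 / denom


def rank_terms(docs, labels, positive):
    """Rank all terms by descending score; ties are broken alphabetically."""
    tp, fp, n_pos, n_neg = document_frequency_table(docs, labels, positive)
    vocab = set(tp) | set(fp)
    scores = {t: chi_squared(tp[t], fp[t], n_pos, n_neg) for t in vocab}
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus (hypothetical): two "sport" and two "finance" documents.
docs = [
    ["goal", "match", "team"],
    ["team", "goal", "goal"],
    ["stock", "market", "team"],
    ["market", "stock", "price"],
]
labels = ["sport", "sport", "finance", "finance"]
ranking = rank_terms(docs, labels, positive="sport")
```

In this toy corpus, "goal" appears twice in the second document but contributes only one document to its count. A purely document-frequency-based score ignores within-document term frequency, which is the information the description says the proposed algorithm takes into account when ordering features.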