FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES

Datasets with heterogeneous features can yield inappropriate feature selection results because heterogeneous features are difficult to evaluate concurrently. Feature transformation (FT) is one way to handle heterogeneous feature subset selection. The results of transformation from...


Bibliographic Details
Main Authors: Ratri Enggar Pawening, Tio Darmawan, Rizqa Raaiqa Bintana, Agus Zainal Arifin, Darlis Herumurti
Format: Article
Language: English
Published: Universitas Indonesia, 2016-06-01
Series: Jurnal Ilmu Komputer dan Informasi
Subjects: Feature selection, Heterogeneous features, Joint mutual information maximation, Support vector machine, Unsupervised feature transformation
Online Access: http://jiki.cs.ui.ac.id/index.php/jiki/article/view/384
id doaj-8e25bcc41e6c400598716f4ccc2cbd9a
record_format Article
spelling doaj-8e25bcc41e6c400598716f4ccc2cbd9a
2020-11-24T20:56:07Z
eng
Universitas Indonesia
Jurnal Ilmu Komputer dan Informasi
2088-7051
2502-9274
2016-06-01
9(2): 106-112
10.21609/jiki.v9i2.384
FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
Ratri Enggar Pawening (Department of Informatics, STT Nurul Jadid Paiton, Jl. Pondok Pesantren Nurul Jadid, Paiton)
Tio Darmawan (Department of Informatics, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember)
Rizqa Raaiqa Bintana (Department of Informatics, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, 60111, Indonesia; Department of Informatics, Faculty of Science and Technology, UIN Sultan Syarif Kasim Riau, Jl. H.R Soebrantas, Pekanbaru, 28293, Indonesia)
Agus Zainal Arifin (Department of Informatics, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, 60111, Indonesia)
Darlis Herumurti (Department of Informatics, Faculty of Information Technology, Institut Teknologi Sepuluh Nopember (ITS), Surabaya, 60111, Indonesia)
Datasets with heterogeneous features can yield inappropriate feature selection results because heterogeneous features are difficult to evaluate concurrently. Feature transformation (FT) is one way to handle heterogeneous feature subset selection. Transforming non-numerical features into numerical ones, however, may produce redundancy with the original numerical features. In this paper, we propose a method for selecting feature subsets based on mutual information (MI) to classify heterogeneous features. We use the unsupervised feature transformation (UFT) method and the joint mutual information maximation (JMIM) method. The UFT method transforms non-numerical features into numerical features, and the JMIM method selects a feature subset with consideration of the class label. The transformed and original features are combined, a feature subset is determined using the JMIM method, and the subset is classified using the support vector machine (SVM) algorithm. The classification accuracy is measured for each number of selected features and compared between the UFT-JMIM and Dummy-JMIM methods. Across all experiments in this study, the average classification accuracy achieved by the UFT-JMIM method is about 84.47% and by the Dummy-JMIM method about 84.24%. This result shows that the UFT-JMIM method can minimize information loss between transformed and original features and select a feature subset that avoids redundant and irrelevant features.
http://jiki.cs.ui.ac.id/index.php/jiki/article/view/384
Feature selection, Heterogeneous features, Joint mutual information maximation, Support vector machine, Unsupervised feature transformation
collection DOAJ
language English
format Article
sources DOAJ
author Ratri Enggar Pawening
Tio Darmawan
Rizqa Raaiqa Bintana
Agus Zainal Arifin
Darlis Herumurti
spellingShingle Ratri Enggar Pawening
Tio Darmawan
Rizqa Raaiqa Bintana
Agus Zainal Arifin
Darlis Herumurti
FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
Jurnal Ilmu Komputer dan Informasi
Feature selection, Heterogeneous features, Joint mutual information maximation, Support vector machine, Unsupervised feature transformation
author_facet Ratri Enggar Pawening
Tio Darmawan
Rizqa Raaiqa Bintana
Agus Zainal Arifin
Darlis Herumurti
author_sort Ratri Enggar Pawening
title FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
title_short FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
title_full FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
title_fullStr FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
title_full_unstemmed FEATURE SELECTION METHODS BASED ON MUTUAL INFORMATION FOR CLASSIFYING HETEROGENEOUS FEATURES
title_sort feature selection methods based on mutual information for classifying heterogeneous features
publisher Universitas Indonesia
series Jurnal Ilmu Komputer dan Informasi
issn 2088-7051
2502-9274
publishDate 2016-06-01
description Datasets with heterogeneous features can yield inappropriate feature selection results because heterogeneous features are difficult to evaluate concurrently. Feature transformation (FT) is one way to handle heterogeneous feature subset selection. Transforming non-numerical features into numerical ones, however, may produce redundancy with the original numerical features. In this paper, we propose a method for selecting feature subsets based on mutual information (MI) to classify heterogeneous features. We use the unsupervised feature transformation (UFT) method and the joint mutual information maximation (JMIM) method. The UFT method transforms non-numerical features into numerical features, and the JMIM method selects a feature subset with consideration of the class label. The transformed and original features are combined, a feature subset is determined using the JMIM method, and the subset is classified using the support vector machine (SVM) algorithm. The classification accuracy is measured for each number of selected features and compared between the UFT-JMIM and Dummy-JMIM methods. Across all experiments in this study, the average classification accuracy achieved by the UFT-JMIM method is about 84.47% and by the Dummy-JMIM method about 84.24%. This result shows that the UFT-JMIM method can minimize information loss between transformed and original features and select a feature subset that avoids redundant and irrelevant features.
topic Feature selection, Heterogeneous features, Joint mutual information maximation, Support vector machine, Unsupervised feature transformation
url http://jiki.cs.ui.ac.id/index.php/jiki/article/view/384
work_keys_str_mv AT ratrienggarpawening featureselectionmethodsbasedonmutualinformationforclassifyingheterogeneousfeatures
AT tiodarmawan featureselectionmethodsbasedonmutualinformationforclassifyingheterogeneousfeatures
AT rizqaraaiqabintana featureselectionmethodsbasedonmutualinformationforclassifyingheterogeneousfeatures
AT aguszainalarifin featureselectionmethodsbasedonmutualinformationforclassifyingheterogeneousfeatures
AT darlisherumurti featureselectionmethodsbasedonmutualinformationforclassifyingheterogeneousfeatures
_version_ 1716790691797925888