Parallel Distance-Based Instance Selection Algorithm for Feed-Forward Neural Network

Instance selection endeavors to decide which instances from the data set should be retained for further use during the learning process. It can result in better generalization of the learning model, a shorter learning process, or scaling up to large data sources. This paper presents a parallel distance-based instance selection approach for a feed-forward neural network (FFNN), which can utilize all available processing power to reduce the data set while obtaining levels of classification accuracy similar to those achieved with the original data set. The algorithm identifies the instances at the decision boundary between consecutive classes of data, which are essential for placing hyperplane decision surfaces, and retains these instances in the reduced data set (subset). Each identified instance, called a prototype, is one of the representatives of the decision boundary of its class that constitute the shape or distribution model of the data set. No feature or dimension is sacrificed in the reduction process. Regarding reduction capability, the algorithm obtains approximately 85% reduction on non-overlapping two-class synthetic data sets, 70% reduction on highly overlapping two-class synthetic data sets, and 77% reduction on multiclass real-world data sets. Regarding generalization, the reduced data sets yield classification accuracy similar to that of the original data set on both the FFNN and a support vector machine. Regarding execution time, the speedup of the parallel algorithm over the serial algorithm is proportional to the number of threads the processor can run concurrently.
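The record's abstract describes keeping only the instances near the decision boundary between classes (prototypes), with the scan over the data set split across threads. The paper's exact algorithm is not reproduced here; the following is a minimal sketch of that general idea, under the assumption that "near the boundary" can be approximated by the distance to the nearest instance of a different class (the names `boundary_mask`, `select_instances`, and the radius parameter are illustrative, not from the paper):

```python
# Hypothetical sketch of distance-based instance selection: keep an instance
# only if its nearest neighbor of a DIFFERENT class lies within a radius,
# i.e. it sits near the decision boundary. Chunks are scanned in parallel.
# This is not the paper's algorithm, only an illustration of the idea.
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def boundary_mask(X, y, chunk, radius):
    """Flag each instance in `chunk` whose nearest other-class neighbor
    is within `radius` of it."""
    keep = np.zeros(len(chunk), dtype=bool)
    for i, idx in enumerate(chunk):
        other = y != y[idx]                              # other-class instances
        d = np.linalg.norm(X[other] - X[idx], axis=1)    # Euclidean distances
        keep[i] = d.min() <= radius
    return keep

def select_instances(X, y, radius=2.5, n_workers=4):
    """Reduce (X, y) to boundary prototypes, one index chunk per worker."""
    chunks = np.array_split(np.arange(len(X)), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        masks = pool.map(lambda c: boundary_mask(X, y, c, radius), chunks)
    mask = np.concatenate(list(masks))
    return X[mask], y[mask]

# Two separated Gaussian clusters: mostly the points facing the gap survive.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (100, 2)), rng.normal(4, 0.5, (100, 2))])
y = np.array([0] * 100 + [1] * 100)
Xr, yr = select_instances(X, y, radius=2.5)
print(len(Xr), "of", len(X), "instances kept")
```

Threads suffice here because NumPy releases the GIL inside the distance computation; a process pool would be the analogous choice for pure-Python distance code.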

Bibliographic Details
Main Author: Fuangkhon Piyabute
Format: Article
Language: English
Published: De Gruyter, 2017-04-01
Series: Journal of Intelligent Systems
Online Access: https://doi.org/10.1515/jisys-2015-0039
Author Affiliation: Department of Business Information Systems, Assumption University, Samut Prakan 10540, Kingdom of Thailand
ISSN: 0334-1860 (print), 2191-026X (online)
Keywords: data mining; data reduction; neural network; parallel algorithm; support vector machine; 68T01